4489 Commits

Author SHA1 Message Date
Fiora
6f617c4175 JIT64: try enabling dcbz again
This time, check the address carefully beforehand, since apparently some games
do horrible things like running it on non-RAM addresses, or at the very least
virtual addresses.
2014-08-29 12:19:58 -07:00
Dolphin Bot
d159bc9998 Merge pull request #886 from RachelBryk/netplay-buffer
Change default netplay buffer to 5.
2014-08-29 06:51:56 +02:00
Fiora
88095a607a JIT: fix RAM check in load-from-constant-address
A bug that seems to have been uncovered by allowing immediate-address loads.
Super Monkey Ball 2 crashes without this change -- it's possible, however, that
the game actually requires the MMU hack, since it crashed due to accessing an
address in the 0x20000000-0x3fffffff range.
2014-08-28 12:54:23 -07:00
Ryan Houdek
ad8fe0fb52 Merge pull request #879 from FioraAeterna/frspx
JIT64: add frspx implementation
2014-08-28 14:12:21 -05:00
Fiora
c359d65dfe JIT64: add frspx implementation 2014-08-28 11:40:31 -07:00
Ryan Houdek
4a78a8a72a Merge pull request #876 from FioraAeterna/floatloadstore
JIT64: clean up and unify float load/store code
2014-08-28 13:37:27 -05:00
Dolphin Bot
359aa664e1 Merge pull request #898 from FioraAeterna/fprffix
JIT: make fprf conditional in fcmp, just like the other instructions
2014-08-28 20:25:26 +02:00
Dolphin Bot
5e514dcfbc Merge pull request #881 from FioraAeterna/mulhwx
JIT64: add mulhwx implementation
2014-08-28 20:25:13 +02:00
Fiora
7929f2f033 JIT: make fprf conditional in fcmp, just like the other instructions
Missed in the FPRF merge (it didn't break anything, but it's probably a bit
slower and not consistent with the others).
2014-08-28 11:19:09 -07:00
Ryan Houdek
0217fb2008 Merge pull request #843 from FioraAeterna/fprf
JIT: Initial FPRF support
2014-08-28 13:15:50 -05:00
Fiora
043256449e Jit64: some load/store optimizations
Avoid extra ops during address calculation in loads; use LEAs or immediates
whenever possible.
2014-08-28 10:12:55 -07:00
Rachel Bryk
a3cfc98f26 Allow system time to move forward during netplay. 2014-08-27 16:45:15 -04:00
Fiora
7e07acbf3f Fix another absent-minded typo in the fmul interpreter patch 2014-08-26 23:00:11 -07:00
Fiora
1a0a33518b Bugfixes for fmul rounding
Fix the places I forgot to add Force25Bit, and fix an incredibly silly typo bug
2014-08-26 21:37:45 -07:00
Rachel Bryk
31353573cb Change default netplay buffer to 5. 2014-08-26 21:50:30 -04:00
Fiora
7dbc623dc0 JIT: Initial FPRF support
Doesn't support all the FPSCR flags, just the FPRF ones.
Add PPCAnalyzer support to remove unnecessary FPRF calculations.

POV-ray benchmark with enableFPRF forced on for an extreme comparison:
Before: 1500s
After, fmul/fmadd only: 728s
After, all float: 753s

In real games that use FPRF, like F-Zero GX, FPRF previously cost a few percent
of total runtime.

Since FPRF is so much faster now, if enableFPRF is set, just do it for every
float instruction, not just fmul/fmadd like before. I don't know if this will
fix any games, but there's little good reason not to.
2014-08-26 10:57:03 -07:00
Fiora
288babf414 PPCFP: add comment 2014-08-26 09:08:22 -07:00
Fiora
90324f3809 JIT64: add mulhwx implementation 2014-08-26 01:09:04 -07:00
Fiora
aaca1b01e5 JIT64: clean up and unify float load/store code
While we're at it, support a bunch of float load/store variants that weren't
implemented in the JIT. Might not have a big speed impact on typical games but
they're used at least a bit in povray and luabench.

694 -> 644 seconds on povray.
2014-08-25 19:51:40 -07:00
Lioncash
44ee2f20b9 Merge pull request #874 from FioraAeterna/fixidiocy
JIT: fix incredibly silly mistake in fmul rounding patch
2014-08-25 13:22:33 -04:00
Fiora
f04e362721 JIT: fix incredibly silly mistake in fmul rounding patch 2014-08-25 10:10:28 -07:00
comex
6574682ff5 Remove unused variable m_zero. 2014-08-24 16:22:19 -04:00
comex
d128795594 Merge pull request #862 from comex/registersinuse
Reduce my idiocy in register saving code.
2014-08-24 16:16:32 -04:00
comex
a7752f49be Merge pull request #861 from comex/warnings
Fix warnings for OS X
2014-08-24 16:15:58 -04:00
comex
cf01f47b52 Fix bloody printf specifiers.
In particular, even in code that only runs on x86-64, you can't use
PRIx64 for size_t because, on OS X, one is unsigned long and the other
is unsigned long long and clang whines about the difference.  I guess
you could make a size_t specifier macro, but those are horribly ugly, so
I just used casting.

Anyone want to make a nice (and slow) template-based printf?

Now without bare 'unsigned'.
2014-08-24 15:56:41 -04:00
Pierre Bourdon
9ff7125786 Merge pull request #810 from lioncash/controller-interface
InputCommon: Don't base default radius of analog sticks off of their name
2014-08-24 19:58:25 +02:00
Pierre Bourdon
ebf1b98106 Merge pull request #834 from FioraAeterna/fixfmulrounding
JIT64: Fix fmul rounding issues
2014-08-24 19:49:56 +02:00
Fiora
4d7b1275c9 Interpreter: apply the same odd rounding to single multiplies as the JIT 2014-08-24 10:28:52 -07:00
Fiora
4f18f6078f JIT64: Fix fmul rounding issues
Thanks to magumagu's softfp experiments, we know a lot more about the Wii's
strange floating point unit than we used to. In particular, when doing a
single-precision floating point multiply (fmulsx), it rounds the right hand
side's mantissa so as to lose the low 28 bits (of the 53-bit mantissa).

Emulating this behavior in Dolphin fixes a bunch of issues with games that
require extremely precise emulation of floating point hardware, especially
game replays. Fortunately, we can do this with rather little CPU cost; just ~5
extra instructions per multiply, instead of the vast load of a pure-software
float implementation.

This doesn't make floating-point behavior at all perfect. I still suspect
fmadd rounding might not be quite right, since the Wii uses fused instructions
and Dolphin doesn't, and NaN/infinity/exception handling is probably off in
various ways... but it's definitely way better than before.

This appears to fix replays in Mario Kart Wii, Mario Kart Double Dash, and
Super Smash Brothers Brawl. I wouldn't be surprised if it fixes a bunch of
other stuff too.

The changes to instructions other than fmulsx may not be strictly necessary,
but I included them for completeness, since it feels wrong to fix some
instructions but not others, since some games we didn't test might rely on
them.
2014-08-24 10:28:52 -07:00
Pierre Bourdon
aaff5a0afb Merge pull request #856 from FioraAeterna/ppcfpopt
JIT: faster PPC_FP code
2014-08-24 19:25:56 +02:00
comex
d19ec35363 Reduce my idiocy in register saving code.
(1) Rename ABI_ALL_CALLEE_SAVED to ABI_ALL_CALLER_SAVED, because that's
what it was actually defined as (and used as).  Derp.

(2) RegistersInUse is always used for the purpose of saving registers
before calling a C++ function in the middle of a JIT block (without
flushing).  There is no need to save callee-saved registers in this
case.  Change the name to CallerSavedRegistersInUse and mask with
ABI_ALL_CALLER_SAVED.

Nothing obvious broke when starting up a Melee game.  (I added a test
for anything actually being masked out; it happens, but in this
particular case seemed to occur at most a few dozen times per second, so
the actual performance benefit is probably negligible.)
2014-08-23 15:46:10 -04:00
comex
e0f35e0e59 Remove unused declarations. 2014-08-23 15:26:59 -04:00
Dolphin Bot
43f890322f Merge pull request #845 from ChuckRozhon/switch_to_cstdint
Changed unsigned ints and chars to cstdint counterparts
2014-08-23 21:18:18 +02:00
Shawn Hoffman
af2405eefd Remove dsound audio backend.
There isn't any reason to use dsound over xaudio.
2014-08-23 11:19:19 -07:00
Shawn Hoffman
327d35377d windows: remove now-extraneous NOMINMAX and WIN32_LEAN_AND_MEAN #defines from dolphin code.
Wrap dinput.h in a header defining DIRECTINPUT_VERSION instead of repeating it multiple places.
2014-08-23 10:48:48 -07:00
Pierre Bourdon
f7102faae7 Merge pull request #853 from lioncash/memmap
Core: Simplify Memory::GetString
2014-08-23 19:19:09 +02:00
Fiora
59c1a46ab1 JIT: faster PPC_FP code
The PPC_FP conversion code can be made a lot simpler with the observation
that the only values that need to be sent through the slow x87 path are
denormals.

A whole bunch faster: 708->678 seconds on POV-RAY.
2014-08-23 07:44:42 -07:00
Ryan Houdek
db039d1440 Merge pull request #850 from lioncash/unused-headers
Core: Removed blank headers Boot_ELF.h and Boot_WiiWAD.h
2014-08-22 02:07:10 -05:00
Lioncash
130f57df91 Core: Simplify Memory::GetString 2014-08-22 01:22:58 -04:00
Ryan Houdek
fc5a73d62e Merge pull request #849 from FioraAeterna/fixppcanalyzer
PPCAnalyzer: move num_instructions initialization to correct place
2014-08-22 00:14:11 -05:00
Lioncash
de9edbeebf Core: Removed blank headers Boot_ELF.h and Boot_WiiWAD.h 2014-08-21 15:30:51 -04:00
Lioncash
25e29c323d Merge pull request #842 from lioncash/jit
Coding style clean up for the Jit, JitARM and JitIL
2014-08-21 15:25:43 -04:00
Fiora
5c0145f71b PPCAnalyzer: move num_instructions initialization to correct place
Much of the PPC Analyzer code (e.g. instruction reordering for merging
branches) wasn't actually being run.
2014-08-21 11:19:23 -07:00
Lioncash
f17dcd2019 Merge pull request #764 from magcius/new-nogui-2
Rewrite GLInterface
2014-08-21 14:14:54 -04:00
Charles Rozhon
965b1f1f2c Removed the typedefs for Elf32 word types 2014-08-20 15:21:41 -05:00
Charles Rozhon
d84b2f3be4 Changed unsigned ints and chars to cstdint counterparts 2014-08-20 14:52:01 -05:00
Lioncash
20f8ec9afa Core: Join a few if statements in IR.cpp 2014-08-20 14:26:16 -04:00
Lioncash
99ae79f7f9 Core: Better assert messages for stx op 2014-08-20 14:16:05 -04:00
Lioncash
d694637938 Core: Clean up brace/body placements for JitIL 2014-08-20 14:04:01 -04:00
Lioncash
b005ba2797 Core: Clean up body/brace placements in JitArm32 2014-08-20 14:03:46 -04:00