6290 Commits

Author SHA1 Message Date
Markus Wick
43d17cb360 Merge pull request #2904 from Sonicadvance1/aarch64_more_inst
[AArch64] Implement fdivx/fdivsx/mfcr/mtcrf.
2015-08-26 07:48:24 +02:00
Tillmann Karras
ee4a12ffe2 Jit64: some byte-swapping changes 2015-08-26 05:41:18 +02:00
Ryan Houdek
6729a36d8d [AArch64] Set BindToRegister's to_load correctly for double FP ops. 2015-08-25 21:29:27 -05:00
Lioncash
db4f692482 GCMemcard: Clean up memcard logging messages. 2015-08-25 21:55:52 -04:00
Tillmann Karras
ee50a2ef28 Jit64: fix bugs in the FPSCR instructions 2015-08-25 23:48:14 +02:00
Markus Wick
bd08c1b01a Merge pull request #2901 from Sonicadvance1/aarch64_stfiwx
[AArch64] Implement stfiwx
2015-08-25 22:47:39 +02:00
Markus Wick
24cb650078 Merge pull request #2663 from degasus/dcbx
Jit64: dcbf + dcbi
2015-08-25 12:16:56 +02:00
Ryan Houdek
0666c0750b [AArch64] Implement fdivx/fdivsx/mfcr/mtcrf.
Gets the povray bench to better times than the Wii.
2015-08-24 15:32:19 -05:00
Ryan Houdek
d96be9250c Merge pull request #2899 from Sonicadvance1/aarch64_fctiwzx
[AArch64] Implement fctiwzx
2015-08-24 13:22:27 -05:00
degasus
0d92c8fb89 Jit64: Optimize dcbx 2015-08-24 18:33:23 +02:00
Tillmann Karras
ac84d6d0fa Jit64: some cache flush changes
- dynamically allocate third scratch register instead of forcing ECX
- use LEA as 3 operand add if possible
- use BT,JC instead of SHR,TEST,JNZ
- merge MOV,TEST
- use appropriate ABI function (no asm change)
2015-08-24 18:33:23 +02:00
degasus
6f34b27323 Jit64: implement dcbf + dcbi 2015-08-24 18:33:19 +02:00
Markus Wick
0ad6fa8f62 Merge pull request #2903 from lioncash/cast
Memmap: Remove pointer casts
2015-08-24 15:42:56 +02:00
Lioncash
abd3b124be Memmap: Remove pointer casts 2015-08-24 09:07:09 -04:00
Tillmann Karras
33eefc2d86 Jit64: quickfix for mtfsfx 2015-08-24 12:12:31 +02:00
Ryan Houdek
d3176fe22a [AArch64] Implement stfiwx
Improves povray performance by ~4%
2015-08-24 01:10:55 -05:00
Ryan Houdek
80fa9af9b1 Merge pull request #2898 from degasus/linking
JitArm64: Faster linking of continuous blocks
2015-08-23 18:09:02 -05:00
degasus
7320d519b4 JitArm64: Implement srwx 2015-08-23 23:29:48 +02:00
degasus
4722a69fd0 JitArm64: Implement divwux 2015-08-23 23:29:18 +02:00
degasus
9e4366963c JitArm64: Implement subfic 2015-08-23 23:29:07 +02:00
degasus
95be17772f JitArm64: Implement addex 2015-08-23 23:29:02 +02:00
degasus
025e7c835a JitArm64: Implement subfcx 2015-08-23 23:28:28 +02:00
degasus
550a90e691 JitArm64: Implement subfex 2015-08-23 23:28:24 +02:00
Ryan Houdek
561744819e [AArch64] Implement fctiwzx
Improves the povray benchmark time by 5.6%
2015-08-23 15:35:18 -05:00
degasus
77a6798094 JitArm64: Faster linking of continuous blocks 2015-08-23 14:44:23 +02:00
Markus Wick
73067b1ef1 Merge pull request #2888 from degasus/jit64
Jit64: Faster linking of continuous blocks
2015-08-23 13:24:15 +02:00
Lioncash
2a1abf8dd6 Merge pull request #2896 from lioncash/using
Core: Minor CPU core typedef cleanup
2015-08-22 19:00:23 -04:00
Markus Wick
8b881a6c34 Merge pull request #2891 from Sonicadvance1/aarch64_implement_crxxx
[AArch64] Implement the cr instructions
2015-08-23 00:44:47 +02:00
Lioncash
fdafa5d063 Core: Move includes out of instruction table headers
These aren't necessary (and cause unnecessary indirect inclusions).
2015-08-22 14:15:02 -04:00
Lioncash
a248a4d2ce Jit64/JitIL: Relocate instruction typedefs 2015-08-22 14:15:00 -04:00
Lioncash
c56717e058 Core: Shorten the _interpreterInstruction typedef
The class itself already acts as a namespace trailer, so '_interpreter'
isn't necessary. This also gets rid of a duplicate typedef in the
Interpreter_Tables.
2015-08-22 14:14:49 -04:00
Markus Wick
a39c0910c4 Merge pull request #2893 from Sonicadvance1/aarch64_memory_base_register
[AArch64] Use a register as a constant for the memory base.
2015-08-22 15:41:57 +02:00
Ryan Houdek
dba579c52f [AArch64] Use a register as a constant for the memory base.
Removes a /lot/ of redundant movk operations in fastmem loadstores.
Improves performance of the povray bench by ~5%
2015-08-22 08:36:34 -05:00
Markus Wick
c2f38f1d16 Merge pull request #2892 from Sonicadvance1/aarch64_frsp
[AArch64] Implement frspx
2015-08-22 09:44:14 +02:00
Ryan Houdek
ce32b76be3 [AArch64] Implement frspx
Improves performance in povray bench by 2%
2015-08-22 00:35:30 -05:00
Ryan Houdek
d74eb0ea58 [AArch64] Fix the bugs in the cr instructions
Makes it a bit more efficient in the process.
2015-08-21 23:24:29 -05:00
degasus
e9ade0abe1 JitArm64: implement crXXX 2015-08-21 20:49:08 -05:00
flacs
95d958c03d Merge pull request #2889 from lioncash/interp
Interpreter: Use std::isnan instead of IsNAN
2015-08-21 21:43:08 +02:00
flacs
bb7f3d1822 Merge pull request #2867 from Tilka/mtspr_hid0
Jit64: implement HID0 case of mtspr
2015-08-21 21:04:35 +02:00
flacs
01aea965ba Merge pull request #2864 from Tilka/fpscr
Jit64: implement FPSCR related instructions
2015-08-21 21:04:20 +02:00
Lioncash
18d658df1f Interpreter_FloatingPoint: Use std::isnan instead of IsNAN
Same thing, except one is part of the stdlib.
2015-08-21 15:04:03 -04:00
degasus
78aa01e06e Jit64: Faster linking of continuous blocks
We compile the blocks as they are executed, so it's common
to link them continuously. We end with calling JMP after every
block, but often just with a distance of 0.
So just emitting NOPs instead also "calls" the next block, but is
easier for the CPU.
2015-08-21 17:41:53 +02:00
Ryan Houdek
5f628749ff Merge pull request #2886 from Sonicadvance1/aarch64_faster_lfd
[AArch64] Optimize lfd instructions if possible.
2015-08-21 05:38:53 -05:00
Ryan Houdek
df53b37253 [AArch64] Optimize lfd instructions if possible.
If we are going to be using lfd, then chances are it is going to be used in double heavy areas of code.
If we only need to load the lower register, then we also should not have to worry about inserting into the low 64 bits of the guest register.
So add a new flag to the backpatching to have lfd load directly into the destination register.
This gives ~3% performance improvement to Povray.
2015-08-21 04:31:54 -05:00
Markus Wick
4f45d71840 Merge pull request #2760 from Sonicadvance1/aarch64_fcmp
[AArch64] Implement fcmp{u,o}
2015-08-21 11:03:20 +02:00
Markus Wick
6cb87a9227 Merge pull request #2837 from Sonicadvance1/aarch64_faster_nonpaired
[AArch64] Optimize cases when an FPR is only used for non-paired ops.
2015-08-21 09:51:45 +02:00
Ryan Houdek
7ce4c3138e [AArch64] Optimize cases when an FPR is only used for non-paired ops. 2015-08-20 23:36:29 -05:00
Lioncash
95c57fcec1 Jit: Remove unnecessary namespace prefixes 2015-08-20 05:20:19 -04:00
degasus
896a02b3a8 DSP HLE: Remove timing informations from ucodes
On HLE, we don't emulate the timings, so there is also no need
to set up period callbacks.
2015-08-19 16:20:17 +02:00
degasus
7277eb0e6c AX-HLE: Call HLE on mailbox write
It was done on Update() which was called exactly every 5ms.
But the game is allowed to use the DSP more often, e.g. to generate 48kHz audio.
2015-08-19 16:19:06 +02:00