Tales of Symphonia and Skies of Arcadia Legends now have working musics with
DSPHLE. Some other games with the same symptoms (missing instruments) should
probably be fixed by that change too.
- In icon retrieving I removed the "format check" as it shouldn't really matter to have mixed icon formats. Also removed the "Time splitters hack" as there's no reason for it since we are only checking the last 3 bits and I'm pretty sure having bits 1 and 2 set is the same as having them unset.
- Icon retrieving uses AnimSpeed as stop signal (every icon must have an speed set, the first speed that is 0 means there are no more icons)
- Also, in icon retrieving I added support for "blank frames"(Luigi's Mansion and Pikmin that I know of). With this the base for icon animation is complete.
- Fixed PSOIII savegame patch which was wrong before.
Signed-off-by: LPFaint99 <lpfaint99@gmail.com>
* Capcom-Music-Loop:
Removed the fake DMA wait time as it is no longer needed after the aram-dma-fixes branch is merged. This fixes the Resident Evil 2/3 cutscene audio in DSP LLE mode. Fixes issue 2723.
Changed the loop end address detection to an exact match with the current address for ADPCM audio. Fixes the non-looping music in PN03.
So to compensate lets bring back some speed to the emulation.
change a little the way the vertex are send to the gpu,
This first implementation changes dx9 a lot and dx11 a little to increase the parallelism between the cpu and gpu.
ogl: is my next step in ogl is a little more trickier so i have to take a little more time.
the original concept is Marcos idea, with my little touch to make it even more faster.
what to look for: SPEEEEEDDD :).
please test it a lot and let me know if you see any problem.
in dx9 the code is prepared to fall back to the previous implementation if your card does not support the amount of buffers needed.
So if you did not experience any speed gains you know where is the problem :).
for the ones with more experience and compression of the code please test changing the amount and size of the buffers to tune this for your specific machine.
The current values are the sweet spot for my machine.
All must Thanks Marcos, I hate him for giving good ideas when I'm full of work.
Most of the InvalidateICache calls are for a 32 bytes block: this is the
number of bytes invalidated by PowerPC dcb*/icb* instructions. Profiling
shows that a lot of CPU time is spent checking if there are any JIT blocks
covered by these 32 bytes (using std::map::lower_bound).
This patch adds a bitset containing the state of every 32 bytes block in
RAM (JIT cached/not JIT cached). Using that, a 32 bytes InvalidateICache
can check in the bitset if any JIT block might be invalidated. A bitset
check is a lot faster than an std::map::lower_bound operation, improving
performance of JitCache::InvalidateICache by more than 100%.
Some practical numbers:
* Xenoblade Chronicles (PAL)
56.04FPS -> 59.28FPS (+5.78%)
* The Last Story (PAL)
30.9FPS -> 32.83FPS (+6.25%)
* Super Mario Galaxy (PAL)
59.76FPS -> 62.46FPS (+4.52%)
This function still takes more time than it should - more optimization in
this area might be possible (specializing for 32 bytes blocks to avoid
useless memcpy, for example).