25847 Commits

Author SHA1 Message Date
JosJuice
69c14d6ec3 JitArm64: Fix frspx with single precision source
I haven't observed this breaking any game, but it didn't match
the behavior of the interpreter as far as I could tell from
reading the code, in that denormals weren't being flushed.
2021-04-25 15:56:59 +02:00
JosJuice
54451ac731 JitArm64: Use ConvertSingleToDoubleLower in RW when faster 2021-04-25 15:56:59 +02:00
JosJuice
9d6263f306 JitArm64: Add unit tests for single/double conversion 2021-04-25 15:56:58 +02:00
JosJuice
2a9d88739c JitArm64: Skip accurate single/double conversion if store-safe 2021-04-25 15:56:58 +02:00
JosJuice
1d106ceaf5 JitArm64: Optimize ConvertSingleToDouble, part 2
If we can prove that FCVT will provide a correct conversion,
we can use FCVT. This makes the common case a bit faster
and the less likely cases (unfortunately including zero,
which FCVT actually can convert correctly) a bit slower.
2021-04-25 15:56:19 +02:00
JosJuice
018e247624 JitArm64: Optimize ConvertSingleToDouble, part 1 2021-04-25 15:56:19 +02:00
JosJuice
28e4869c43 JitArm64: Optimize ConvertDoubleToSingle 2021-04-25 15:56:19 +02:00
JosJuice
6e0a5876ef JitArm64: Use accurate single/double conversions
Our old conversion approach became a lot more inaccurate when
enabling flush-to-zero, to the point of obviously breaking games.
2021-04-25 15:56:19 +02:00
JosJuice
39eccf6603 JitArm64: Call RW before FCMPE in fselx
Needed because the next commit will make RW clobber flags.
2021-04-25 15:56:19 +02:00
JosJuice
949686bbe7 JitArm64: Factor out single/double conversion code to functions
Preparation for following commits.

This commit intentionally doesn't touch paired stores,
since paired stores are supposed to flush to zero.
(Consistent with Jit64.)
2021-04-25 15:56:19 +02:00
JosJuice
fdf7744a53 JitArm64: Move float conversion code out of EmitBackpatchRoutine
This simplifies some of the following commits. It does require
an extra register, but hey, we have 32 of them.

Something I think would be nice to add to the register cache
in the future is the ability to keep both the single and double
version of a guest register in two different host registers
when that is useful. That way, the extra register we write to
here can be read by a later instruction, saving us from
having to perform the same conversion again.
2021-04-25 15:56:19 +02:00
JosJuice
f96ee475e4 Implement ArmFPURoundMode.cpp
Fixes https://bugs.dolphin-emu.org/issues/12388. Might also fix
other games that have problems with float/paired instructions
in JitArm64, but I haven't tested any.
2021-04-25 15:56:19 +02:00
Filoppi
3492f51eaf OnScreenDisplay: a few fixes
-They might have never drawn if DrawMessages wasn't called before they actually expired
-Their fade was wrong if the duration of the message was less than the fade time

This makes them much more useful for debugging, I know there might be other means
of debugging like logs and imgui, but this was the simplest so that's what I used.
If you want to print the same message every frame, but with a slightly different value
to see the changes, it now work.

To compensate for the fact that they are now always rendered once,
so on start up a lot of old messages (printed while the emulation was off) could show up,
I've added a "drop" time, which means if a msg isn't rendered for the first
time within that time, it will be dropped and never rendered.
2021-04-25 15:45:30 +03:00
Léo Lam
aa3a96f048
Merge pull request #9644 from JosJuice/jit-fallback-discard
Jits: Fix interpreter fallback handling of discarded registers
2021-04-25 13:20:41 +02:00
JosJuice
b3b5016f54 Jits: Fix interpreter fallback handling of discarded registers
When the interpreter writes to a discarded register, its type
must be changed so that it is no longer considered discarded.

Fixes a 62ce1c7 regression.
2021-04-25 13:01:40 +02:00
Sintendo
47e16133e5 Jit64: divwx - Eliminate XOR for constant dividend
We normally check for division by zero to know if we should set the
destination register to zero with a XOR. However, when the divisor and
destination registers are the same the explicit zeroing can be omitted.
In addition, some of the surrounding branching can be simplified as
well.

Before:
45 85 FF             test        r15d,r15d
75 05                jne         normal_path
45 33 FF             xor         r15d,r15d
EB 0C                jmp         done
normal_path:
B8 5A 00 00 00       mov         eax,5Ah
99                   cdq
41 F7 FF             idiv        eax,r15d
44 8B F8             mov         r15d,eax
done:

After:
45 85 FF             test        r15d,r15d
74 0C                je          done
B8 5A 00 00 00       mov         eax,5Ah
99                   cdq
41 F7 FF             idiv        eax,r15d
44 8B F8             mov         r15d,eax
done:
2021-04-24 21:32:21 +02:00
Léo Lam
1c6232e95f
Merge pull request #9646 from PatrickFerry/sw-textureencoder-alignedwidth
SW: Fix alignedWidth in TextureEncoder
2021-04-24 20:13:10 +02:00
Léo Lam
18174d3ed6
Merge pull request #9649 from leoetlino/cmake-auto-update-track
Make it possible to enable auto-updates by default with CMake builds
2021-04-24 19:44:51 +02:00
Sintendo
abc4c8f601 Jit64: divwx - Eliminate MOV for division by power of 2
Division by a power of two can be slightly improved when the
destination and dividend registers are the same.

Before:
8B C6                mov         eax,esi
85 C0                test        eax,eax
8D 70 03             lea         esi,[rax+3]
0F 49 F0             cmovns      esi,eax
C1 FE 02             sar         esi,2

After:
85 F6                test        esi,esi
8D 46 03             lea         eax,[rsi+3]
0F 48 F0             cmovs       esi,eax
C1 FE 02             sar         esi,2
2021-04-24 19:28:23 +02:00
Sintendo
246adf0d6d Jit64: divwx - Eliminate MOV for division by 2
When destination and input registers match, a redundant MOV instruction
can be eliminated.

Before:
8B C7                mov         eax,edi
8B F8                mov         edi,eax
C1 EF 1F             shr         edi,1Fh
03 F8                add         edi,eax
D1 FF                sar         edi,1

After:
8B C7                mov         eax,edi
C1 EF 1F             shr         edi,1Fh
03 F8                add         edi,eax
D1 FF                sar         edi,1
2021-04-24 18:53:21 +02:00
Léo Lam
c812ab6a63
Jit: Optimize block link queries by using hash tables
Repeated erase() + iteration on a std::multimap is extremely slow.

Slow enough that it causes a 7 second long stutter during some
transitions in F-Zero X (a N64 VC game that triggers many, many icache
invalidations).

And slow enough that JitBaseBlockCache::DestroyBlock shows up on a
flame graph as taking >50% of total CPU time on the CPU-GPU thread:
https://i.imgur.com/vvqiFL6.png

This commit optimises those block link queries by replacing the
std::multimap (which is typically implemented with red-black trees)
with hash tables.

Master: https://i.imgur.com/vvqiFL6.png / 7s stutters
(starting from 5.0-2021 and with branch following disabled)

This commit: https://i.imgur.com/hAO74fy.png / ~0.7s stutters, which
is pretty close to 5.0 stable. (5.0-2021 introduced the performance
regression and it is especially noticeable when branch following
is disabled, which is the case for all N64 VC games since 5.0-8377.)
2021-04-24 17:20:59 +02:00
JMC47
18e84361d9
Merge pull request #9660 from ezio1900/master
VideoCommon: Fix scissorOffset, handle negative value correctly
2021-04-23 21:21:53 -04:00
ezio1900
97ea3a603e VideoCommon: Fix scissorOffset, handle negative value correctly
VideoCommon: Change the type of BPMemory.scissorOffset to 10bit signed: S32X10Y10
VideoBackends: Fix Software Clipper.PerspectiveDivide function, use BPMemory.scissorOffset instead of hard code 342
2021-04-24 08:46:21 +08:00
JosJuice
be5775614c
Merge pull request #9619 from leoetlino/scoped-fd
IOS/FS: Add a scoped FD class to make it harder to leak FDs
2021-04-23 21:53:25 +02:00
JosJuice
f0bd6b105f
Merge pull request #9663 from leoetlino/mios-hle-patch
Fix IPL crash when launching MIOS-patched games
2021-04-23 21:00:29 +02:00
JMC47
cfc4af76a9
Merge pull request #9321 from Pokechu22/sw-copyregion
Software: Fix out of bounds accesses in CopyRegion
2021-04-23 14:07:46 -04:00
JMC47
4ab92d4757
Merge pull request #9350 from Pokechu22/sw-viewport
Software: Invert backface test when viewport is positive
2021-04-23 14:06:02 -04:00
Léo Lam
1686b637df
MIOS: Fix SConfig::OnNewTitleLoad not being called
Oversight from #9545, which moved the "new game has been loaded" logic
to a separate OnNewTitleLoad function that has to be called explicitly
*after* a title has loaded.

Coupled with the commit that makes Dolphin not clobber 0x1800-0x3000
when using MIOS, this fixes Wind Waker and other MIOS-patched games
when they are launched from the System Menu.
2021-04-22 21:55:07 +02:00
Léo Lam
568428ca67
HLE: Do not clobber 0x1800-0x3000 when using MIOS to fix IPL crash
MIOS puts patch data in low MEM1 (0x1800-0x3000) for its own use.
Overwriting data in this range can cause the IPL to crash when
launching games that get patched by MIOS.
See https://bugs.dolphin-emu.org/issues/11952 for more info.

Not applying the Gecko HLE patches means that Gecko codes will not work
under MIOS, but this is better than the alternative of having specific
games crash.
2021-04-22 21:50:05 +02:00
Lioncash
adebc499f9 Jit64: Indicate explicit [[fallthrough]] within load helper 2021-04-19 17:37:44 -04:00
Lioncash
e1dfcda8a6 BlockingLoop: Add explicit [[fallthrough]] annotations 2021-04-19 17:34:46 -04:00
Lioncash
b21d62116d DataReport: Amend conditional test for data reports in IsValidMode
This particular range is kind of bizarre, and would only interpret
interleave mode 2 as a valid mode, while rejecting interleave mode 1 and
the extension byte mode.

As far as I know, based off the information on Wiibrew, we should be
considering all three values within this range as valid.
2021-04-19 16:43:29 -04:00
Patrick A. Ferry
f6a4368192 SW: Fix alignedWidth in TextureEncoder
This was causing issues in Software Renderer. Look at bug 11487
2021-04-19 02:06:52 +01:00
ash!!
43ceba4fef optimize TextureCacheBase::SerializeTexture, ::DeserializeTexture
texture serialization and deserialization used to involve many memory
allocations and deallocations, along with many copies to and from
those allocations. avoid those by reserving a memory region inside the
output and writing there directly, skipping the allocation and copy to
an intermediate buffer entirely.
2021-04-18 13:40:42 -07:00
JMC47
821e51cda4
Merge pull request #7214 from stenzek/cp-access-sync
Fifo: Run/sync with the GPU on command processor register access
2021-04-18 11:17:25 -04:00
JosJuice
dbd39ab2a0
Merge pull request #9642 from CrunchBite/xlink-bba-fix
Fix crash when stopping a game that does not use the BBA when XLink Kai BBA is selected in configuration
2021-04-18 10:43:32 +02:00
JosJuice
92308f5e34
Merge pull request #9645 from leoetlino/fifoplayer-optimization
FifoPlayer: Copy data with memcpy instead of one byte at a time
2021-04-18 10:43:26 +02:00
Léo Lam
5f355690e0
Make it possible to enable auto-updates by default with CMake builds
This adds a CMake option (DOLPHIN_DEFAULT_UPDATE_TRACK) to allow
configuring SCM_UPDATE_TRACK_STR. This is needed to enable auto-updates
in Windows CMake builds by default.
2021-04-17 19:45:43 +02:00
Stenzek
e3ac5dca32 Fifo: Run/sync with the GPU on command processor register access 2021-04-18 03:24:01 +10:00
Léo Lam
cc32fa91af
FifoPlayer: Copy data with memcpy instead of one byte at a time
Copying with memcpy/std::copy is a lot faster than writing one byte
at a time. And the behavior should be fully equivalent.
2021-04-17 16:02:43 +02:00
Connor McLaughlin
edeb6bcdb7
Merge pull request #9635 from stenzek/amd-exclusive-fullscreen
Vulkan: Work around AMD exclusive fullscreen bug (21.3+)
2021-04-17 16:09:09 +10:00
CrunchBite
d6b2fe2c0a Fix crash when stopping a game that does not use the BBA 2021-04-15 09:01:52 -04:00
JMC47
14959a1087
Merge pull request #9636 from sspacelynx/mali-broken-and
DriverDetails: Fix broken vector bitwise AND on Mali drivers
2021-04-14 07:04:16 -04:00
Dentomologist
e0a8d931fc Updater: Add code documentation Markdown file
Add docs/autoupdate_overview.md which gives an overview of the update
process, and comments pointing to it in autoupdate related files.
2021-04-13 15:37:31 -07:00
Léo Lam
336518049d
WiiUtils: Add helper functions to get emulated/real Bluetooth device
This adds a function to get the emulated or real Bluetooth device for
an active emulation instance. This lets us deduplicate all the
`ios->GetDeviceByName("/dev/usb/oh1/57e/305")` calls that are currently
scattered in the codebase and ensures Bluetooth passthrough is being
handled correctly.

This also fixes the broken check in WiimoteCommon::UpdateSource.
There was a confusion between "emulated Bluetooth" (as opposed to
"real Bluetooth" aka Bluetooth passthrough) and "emulated Wiimote".
2021-04-12 18:16:56 +02:00
Léo Lam
136f59b434
DolphinQt: Fix latent build error on Windows 2021-04-12 18:16:56 +02:00
Connor McLaughlin
b24e3f2f1a Vulkan: Work around AMD exclusive fullscreen bug (21.3+) 2021-04-12 12:41:17 +10:00
sspacelynx
aba9cae5ab
DriverDetails: Fix broken vector bitwise AND on Mali drivers 2021-04-11 15:15:02 +02:00
JMC47
5322256065
Merge pull request #9625 from leoetlino/mmu-sdr-update
MMU: Fix SDR updates being silently dropped in some cases
2021-04-06 20:23:13 -04:00
Léo Lam
3b6fdb74f6
Merge pull request #9628 from Dentomologist/wiiutils_fix_reference_to_temporary_subobject
WiiUtils: Remove reference qualifier
2021-04-07 01:46:45 +02:00