34659 Commits

Author SHA1 Message Date
Léo Lam
ae9ac510e2
CMake: Do not enable LTO by default for MSVC
LTO is supposed to be enabled by default for VS Release builds
according to the VS prop files but a build log from JMC reveals
that /GL and /LTCG are not actually passed to cl.exe/link.exe
for some reason...

LTO also leads to *extremely* and unacceptably slow build times
when using link.exe, so let's disable it by default to actually
match the project files.
2021-04-27 12:54:18 +02:00
Léo Lam
d0484a9ea9
CMake: Fix MSVC flags for Release/RelWithDebInfo
See https://gitlab.kitware.com/cmake/cmake/-/issues/20812

Manually redefine MSVC flags to match Visual Studio defaults
and ensure that Release builds generate debug info.
2021-04-27 12:54:18 +02:00
Léo Lam
20d00dfc79
CMake: Add missing MSVC optimization flags to match VS project props 2021-04-27 12:54:18 +02:00
Léo Lam
ae67a9382b
CMake: Put the *.mo files directly in the correct output dir
Avoids the need to copy the *.mo files manually *and* more importantly
this ensures that the mo files are always recreated if the build
output directory is cleared.
2021-04-27 12:54:18 +02:00
Léo Lam
e71aef6768
CMake: Ask windeployqt not to copy DLLs that are unnecessary
* no-system-d3d-compiler: d3dcompiler_47.dll
* no-angle, no-opengl-sw: libEGL.dll, libGLESv2.dll
2021-04-27 12:54:18 +02:00
Léo Lam
f6b8d4758e
CMake: Copy license.txt to output folder to match existing Win builds 2021-04-27 12:54:18 +02:00
Léo Lam
c69747c7fb
CMake: Fix CMAKE_RUNTIME_OUTPUT_DIRECTORY being ignored in UnitTests
CMAKE_RUNTIME_OUTPUT_DIRECTORY_<mode> overrode CMAKE_RUNTIME_OUTPUT_DIRECTORY.

It's just unnecessary and it broke UnitTests's custom output directory
2021-04-27 12:54:18 +02:00
Léo Lam
dcf3ca0f89
CMake: Force gtest to link CRT dynamically to avoid runtime mismatches
Required to fix unit test builds for Windows+MSVC+CMake.

For more information, see:

23ef29555e/googletest/README.md (visual-studio-dynamic-vs-static-runtimes)
2021-04-27 12:54:18 +02:00
iwubcode
626c686fee
DolphinQt: update device drop down size policy so that the input profile resizes properly
This also keeps the device profile at a minimum so that it doesn't
completely disappear (which was originally why it was changed to expanding)
2021-04-27 12:50:45 +02:00
Léo Lam
219f66c6e9
Merge pull request #9672 from JosJuice/jit-naming-scheme
DolphinQt/Android: Unify the JIT naming scheme
2021-04-27 12:00:23 +02:00
JMC47
4d10023727
Merge pull request #9552 from endrift/gba-timing
SI/DeviceGBA: Fix SI timings to actually closely match hardware
2021-04-26 21:20:06 -04:00
JosJuice
c09427ccdf
Merge pull request #9676 from leoetlino/colon
DolphinQt: Get rid of an extraneous colon in About dialog
2021-04-27 00:46:09 +02:00
Léo Lam
08215cc975
DolphinQt: Get rid of an extraneous colon in About dialog 2021-04-27 00:24:24 +02:00
Léo Lam
51bf2dca21
Merge pull request #9675 from JosJuice/jit64-div-80000000
Jit64: Fix UB/infinite loop when compiling division by 0x80000000
2021-04-26 23:50:27 +02:00
JosJuice
7d4b87e7ae Jit64: Fix UB/infinite loop when compiling division by 0x80000000 2021-04-26 23:42:03 +02:00
Filoppi
799a368a7c InputCommon: small hotkey threshold symmetry fix 2021-04-26 19:45:13 +03:00
Filoppi
ba2782e9d1 InputCommon: fix hotkey suppression crash if nullptr suppressions were added to the map
Update references was failing to update the references, causing input to stay nullptr and crashing.
I fixed the case that triggered that, though also added checks against nullptrs for safety.

(cherry picked from commit 4bdcf707555a5568eddff957fa3604975ffb6ed7)
2021-04-26 19:44:04 +03:00
Vicki Pfau
4ce3362bce SI/DeviceGBA: Fix SI timings to actually closely match hardware 2021-04-26 01:36:43 -07:00
JosJuice
ac679eb24d
Merge pull request #9666 from leoetlino/jit-block-hashtable
Jit: Optimize block link queries by using hash tables
2021-04-25 18:45:41 +02:00
JosJuice
a2c8050eba DolphinQt/Android: Unify the JIT naming scheme
I think the AArch64 JIT has come far enough that it doesn't have to
be called experimental anymore.

I'm also labeling the x86-64 JIT as x86-64 for consistence with the
AArch64 JIT. This will especially be helpful if we start supporting
AArch64 on macOS, as AArch64 macOS can run both the x86-64 JIT and
the AArch64 JIT depending on whether you enable Rosetta 2.
2021-04-25 17:19:50 +02:00
JMC47
5da85f3a25
Merge pull request #9458 from JosJuice/arm-fpu-round
JitArm64: Set flush-to-zero/rounding mode and improve float/double conversion accuracy
2021-04-25 10:23:19 -04:00
JosJuice
69c14d6ec3 JitArm64: Fix frspx with single precision source
I haven't observed this breaking any game, but it didn't match
the behavior of the interpreter as far as I could tell from
reading the code, in that denormals weren't being flushed.
2021-04-25 15:56:59 +02:00
JosJuice
54451ac731 JitArm64: Use ConvertSingleToDoubleLower in RW when faster 2021-04-25 15:56:59 +02:00
JosJuice
9d6263f306 JitArm64: Add unit tests for single/double conversion 2021-04-25 15:56:58 +02:00
JosJuice
2a9d88739c JitArm64: Skip accurate single/double conversion if store-safe 2021-04-25 15:56:58 +02:00
JosJuice
1d106ceaf5 JitArm64: Optimize ConvertSingleToDouble, part 2
If we can prove that FCVT will provide a correct conversion,
we can use FCVT. This makes the common case a bit faster
and the less likely cases (unfortunately including zero,
which FCVT actually can convert correctly) a bit slower.
2021-04-25 15:56:19 +02:00
JosJuice
018e247624 JitArm64: Optimize ConvertSingleToDouble, part 1 2021-04-25 15:56:19 +02:00
JosJuice
28e4869c43 JitArm64: Optimize ConvertDoubleToSingle 2021-04-25 15:56:19 +02:00
JosJuice
6e0a5876ef JitArm64: Use accurate single/double conversions
Our old conversion approach became a lot more inaccurate when
enabling flush-to-zero, to the point of obviously breaking games.
2021-04-25 15:56:19 +02:00
JosJuice
39eccf6603 JitArm64: Call RW before FCMPE in fselx
Needed because the next commit will make RW clobber flags.
2021-04-25 15:56:19 +02:00
JosJuice
949686bbe7 JitArm64: Factor out single/double conversion code to functions
Preparation for following commits.

This commit intentionally doesn't touch paired stores,
since paired stores are supposed to flush to zero.
(Consistent with Jit64.)
2021-04-25 15:56:19 +02:00
JosJuice
fdf7744a53 JitArm64: Move float conversion code out of EmitBackpatchRoutine
This simplifies some of the following commits. It does require
an extra register, but hey, we have 32 of them.

Something I think would be nice to add to the register cache
in the future is the ability to keep both the single and double
version of a guest register in two different host registers
when that is useful. That way, the extra register we write to
here can be read by a later instruction, saving us from
having to perform the same conversion again.
2021-04-25 15:56:19 +02:00
JosJuice
f96ee475e4 Implement ArmFPURoundMode.cpp
Fixes https://bugs.dolphin-emu.org/issues/12388. Might also fix
other games that have problems with float/paired instructions
in JitArm64, but I haven't tested any.
2021-04-25 15:56:19 +02:00
Filoppi
3492f51eaf OnScreenDisplay: a few fixes
-They might have never drawn if DrawMessages wasn't called before they actually expired
-Their fade was wrong if the duration of the message was less than the fade time

This makes them much more useful for debugging, I know there might be other means
of debugging like logs and imgui, but this was the simplest so that's what I used.
If you want to print the same message every frame, but with a slightly different value
to see the changes, it now work.

To compensate for the fact that they are now always rendered once,
so on start up a lot of old messages (printed while the emulation was off) could show up,
I've added a "drop" time, which means if a msg isn't rendered for the first
time within that time, it will be dropped and never rendered.
2021-04-25 15:45:30 +03:00
Léo Lam
aa3a96f048
Merge pull request #9644 from JosJuice/jit-fallback-discard
Jits: Fix interpreter fallback handling of discarded registers
2021-04-25 13:20:41 +02:00
JosJuice
b3b5016f54 Jits: Fix interpreter fallback handling of discarded registers
When the interpreter writes to a discarded register, its type
must be changed so that it is no longer considered discarded.

Fixes a 62ce1c7 regression.
2021-04-25 13:01:40 +02:00
Sintendo
47e16133e5 Jit64: divwx - Eliminate XOR for constant dividend
We normally check for division by zero to know if we should set the
destination register to zero with a XOR. However, when the divisor and
destination registers are the same the explicit zeroing can be omitted.
In addition, some of the surrounding branching can be simplified as
well.

Before:
45 85 FF             test        r15d,r15d
75 05                jne         normal_path
45 33 FF             xor         r15d,r15d
EB 0C                jmp         done
normal_path:
B8 5A 00 00 00       mov         eax,5Ah
99                   cdq
41 F7 FF             idiv        eax,r15d
44 8B F8             mov         r15d,eax
done:

After:
45 85 FF             test        r15d,r15d
74 0C                je          done
B8 5A 00 00 00       mov         eax,5Ah
99                   cdq
41 F7 FF             idiv        eax,r15d
44 8B F8             mov         r15d,eax
done:
2021-04-24 21:32:21 +02:00
JosJuice
0f563ffd59 Translation resources sync with Transifex 2021-04-24 21:26:12 +02:00
Léo Lam
1c6232e95f
Merge pull request #9646 from PatrickFerry/sw-textureencoder-alignedwidth
SW: Fix alignedWidth in TextureEncoder
2021-04-24 20:13:10 +02:00
Léo Lam
18174d3ed6
Merge pull request #9649 from leoetlino/cmake-auto-update-track
Make it possible to enable auto-updates by default with CMake builds
2021-04-24 19:44:51 +02:00
Sintendo
abc4c8f601 Jit64: divwx - Eliminate MOV for division by power of 2
Division by a power of two can be slightly improved when the
destination and dividend registers are the same.

Before:
8B C6                mov         eax,esi
85 C0                test        eax,eax
8D 70 03             lea         esi,[rax+3]
0F 49 F0             cmovns      esi,eax
C1 FE 02             sar         esi,2

After:
85 F6                test        esi,esi
8D 46 03             lea         eax,[rsi+3]
0F 48 F0             cmovs       esi,eax
C1 FE 02             sar         esi,2
2021-04-24 19:28:23 +02:00
JosJuice
91669c25fe
Merge pull request #9650 from leoetlino/consistent-build-binary-dirs
Put x86_64 Windows binaries in Binary/x64 for consistency with ARM64
2021-04-24 19:22:05 +02:00
Sintendo
246adf0d6d Jit64: divwx - Eliminate MOV for division by 2
When destination and input registers match, a redundant MOV instruction
can be eliminated.

Before:
8B C7                mov         eax,edi
8B F8                mov         edi,eax
C1 EF 1F             shr         edi,1Fh
03 F8                add         edi,eax
D1 FF                sar         edi,1

After:
8B C7                mov         eax,edi
C1 EF 1F             shr         edi,1Fh
03 F8                add         edi,eax
D1 FF                sar         edi,1
2021-04-24 18:53:21 +02:00
Léo Lam
302e8136a3
Merge pull request #5624 from Orphis/cmake_windows
cmake: Windows fixes
2021-04-24 18:11:23 +02:00
Florent Castelli
6910fab63f
cmake: Replace /Zi with /Z7 for sccache support
Also allows better parallelization since there's no contention on a PDB
file with compiling.
2021-04-24 17:59:34 +02:00
Florent Castelli
712b078a5b
cmake: Search for sccache too in CCache module
sccache is a ccache rewrite that supports GCC, Clang and MSVC.
2021-04-24 17:59:34 +02:00
Léo Lam
c812ab6a63
Jit: Optimize block link queries by using hash tables
Repeated erase() + iteration on a std::multimap is extremely slow.

Slow enough that it causes a 7 second long stutter during some
transitions in F-Zero X (a N64 VC game that triggers many, many icache
invalidations).

And slow enough that JitBaseBlockCache::DestroyBlock shows up on a
flame graph as taking >50% of total CPU time on the CPU-GPU thread:
https://i.imgur.com/vvqiFL6.png

This commit optimises those block link queries by replacing the
std::multimap (which is typically implemented with red-black trees)
with hash tables.

Master: https://i.imgur.com/vvqiFL6.png / 7s stutters
(starting from 5.0-2021 and with branch following disabled)

This commit: https://i.imgur.com/hAO74fy.png / ~0.7s stutters, which
is pretty close to 5.0 stable. (5.0-2021 introduced the performance
regression and it is especially noticeable when branch following
is disabled, which is the case for all N64 VC games since 5.0-8377.)
2021-04-24 17:20:59 +02:00
JMC47
18e84361d9
Merge pull request #9660 from ezio1900/master
VideoCommon: Fix scissorOffset, handle negative value correctly
2021-04-23 21:21:53 -04:00
ezio1900
97ea3a603e VideoCommon: Fix scissorOffset, handle negative value correctly
VideoCommon: Change the type of BPMemory.scissorOffset to 10bit signed: S32X10Y10
VideoBackends: Fix Software Clipper.PerspectiveDivide function, use BPMemory.scissorOffset instead of hard code 342
2021-04-24 08:46:21 +08:00
JosJuice
be5775614c
Merge pull request #9619 from leoetlino/scoped-fd
IOS/FS: Add a scoped FD class to make it harder to leak FDs
2021-04-23 21:53:25 +02:00