11805 Commits

Author SHA1 Message Date
Mat M
41befc21cd
Merge pull request #9708 from JosJuice/dsp-volatile
DSP: Change external_interrupt_waiting from volatile to atomic
2021-05-14 14:34:09 -04:00
Mat M
964fed77c5
Merge pull request #9707 from JosJuice/remove-atomic-header
Remove Atomic.h
2021-05-14 14:33:24 -04:00
Scott Mansell
9f91fb6447
Merge pull request #9688 from Filoppi/input_cleanup
Input cleanup
2021-05-14 20:51:33 +12:00
JosJuice
d17341572d DSP: Change external_interrupt_waiting from volatile to atomic
Making this volatile accomplishes... Well, nothing in practice.
Making it atomic, on the other hand, lets us enforce a memory ordering.
2021-05-14 09:28:10 +02:00
JosJuice
b93983b50a Remove Atomic.h
The STL has everything we need nowadays.

I have tried to not alter any behavior or semantics with this
change wherever possible. In particular, WriteLow and WriteHigh
in CommandProcessor retain the ability to accidentally undo
another thread's write to the upper half or lower half
respectively. If that should be fixed, it should be done in a
separate commit for clarity. One thing did change: The places
where we were using += on a volatile variable (not an atomic
operation) are now using fetch_add (actually an atomic operation).

Tested with single core and dual core on x86-64 and AArch64.
2021-05-13 18:56:27 +02:00
Mat M
0ef88d4ecb
Merge pull request #9705 from Leseratte10/master
Socket: Fix AF_INET6 on non-Windows systems
2021-05-13 06:42:44 -04:00
Mat M
24b9a64c11
Merge pull request #9690 from Sintendo/jit64divwux
Jit64: divwux - Prefer three-operand IMUL
2021-05-13 06:42:14 -04:00
Mat M
725ea3d9c1
Merge pull request #9637 from JosJuice/jitarm64-fprf
JitArm64: Implement FPRF updates
2021-05-13 06:39:28 -04:00
JosJuice
bfe8b1068d JitArm64: Implement FPRF updates 2021-05-13 11:51:00 +02:00
Jordan Woyak
bf16f77402
Merge pull request #9657 from lioncash/wiimote-mode
DataReport: Amend conditional test for data reports in IsValidMode
2021-05-12 17:23:17 -05:00
Filoppi
4625359a4f InputCommon: clamp the attachment setting max to its actual enum max
NumericSettings support a max, so let's use it.
It might not do much now, but the max and min values will be used to give visual feeback
in the UI in one of my upcoming input PRs
2021-05-12 18:27:24 +03:00
Filoppi
f4fec42165 Add mixed comments to input code, make some tooltip clearer 2021-05-12 18:27:23 +03:00
Florian Bach
c21e9909ab Socket: Fix AF_INET6 on non-Windows systems 2021-05-11 17:00:02 +02:00
Filoppi
e9e41b925b InputCommon: follow coding conventions and rename GetState() to UpdateState()
And remove useless include
2021-05-10 23:48:10 +03:00
Pokechu22
73f4e57006 Add name and description for primitives 2021-05-07 15:42:26 -07:00
Pokechu22
28b71c65af Fix same object count being used for all frames in the FIFO analyzer
If the number of objects varied, this would result in either missing objects on some frames, or too many objects on some frames; the latter case could cause crashes.  Since it used the current frame to get the count, if the FIFO is started before the FIFO analyzer is opened, then the current frame is effectively random, making it hard to reproduce consistently.

This issue has existed since the FIFO analyzer was implemented for Qt.
2021-05-07 15:42:18 -07:00
Pokechu22
ef75381a84 Fix occasional deadlock when stopping FIFO playback 2021-05-07 15:42:18 -07:00
Pokechu22
58333d6feb Make FIFO frame count inclusive
The 'zero frames in the range' check can be removed because now there is always at least 1 frame; of course that might be the same frame over and over again, but that's still useful for e.g. Free Look (and the 1 frame repeating effect already occurred when frame count was exclusive).
2021-05-07 15:42:18 -07:00
Pokechu22
263ca79aae Adjust FIFO player object ranges
A single object can be selected instead of 2 (it was already inclusive internally), and the maximum value is the highest number of objects in any frame (minus 1) to reduce jank when multiple frames are being played back.
2021-05-07 15:42:17 -07:00
Pokechu22
4cc442d7cd Use CP constants in FifoAnalyzer 2021-05-07 15:42:07 -07:00
Léo Lam
049b92b7ef
Merge pull request #9417 from Filoppi/input-1
Fix FPS counter and Game Window speed % breaking on pause/unpause
2021-05-07 15:08:01 +02:00
Sintendo
2cafa0a960 Jit64: divwux - Prefer three-operand IMUL
By taking advantage of three-operand IMUL, we can eliminate a MOV
instruction. This is a small code size win. However, due to IMUL sign
extending the immediate value to 64 bits, we can only apply this when
the magic number's most significant bit is zero.

To ensure this can actually happen, we also minimize the magic number by
checking for trailing zeroes.

Example (Unsigned division by 18)
Before:
41 BE E4 38 8E E3    mov         r14d,0E38E38E4h
4D 0F AF F5          imul        r14,r13
49 C1 EE 24          shr         r14,24h

After:
4D 69 F5 39 8E E3 38 imul        r14,r13,38E38E39h
49 C1 EE 22          shr         r14,22h
2021-05-06 19:54:33 +02:00
Filoppi
d586163e38 Wrap some more control expression around ``
This isn't entirely necessary, as they are interpreted as barewords expressions,
but it's still nicer to have by default. And my upcoming input changes will
always put `` around single letter inputs.
2021-05-06 01:32:03 +03:00
JosJuice
b305e4cfc1 JitArm64: Fix JitRegister::Register call for cstd
Seems like I made a little copy-paste error.
2021-05-06 00:20:47 +02:00
Filoppi
818672b585 Fix FPS counter and Game Window speed % breaking on pause/unpause
-Add pause state to FPSCounter.
-Add ability to have more than one "OnStateChanged" callback in core.
-Add GetActualEmulationSpeed() to Core. Returns 1 by default. It's used by my input PRs.
2021-05-06 01:10:04 +03:00
JosJuice
3397f49a0a IOS: Don't let Kernel initialize WiiRoot if already initialized
The SaveToSYSCONF call in BootManager.cpp was unintentionally
overriding the temporary NAND set by the preceding
InitializeWiiRoot call. Fixes
https://bugs.dolphin-emu.org/issues/12500.
2021-05-02 10:30:32 +02:00
JMC47
4d10023727
Merge pull request #9552 from endrift/gba-timing
SI/DeviceGBA: Fix SI timings to actually closely match hardware
2021-04-26 21:20:06 -04:00
Léo Lam
51bf2dca21
Merge pull request #9675 from JosJuice/jit64-div-80000000
Jit64: Fix UB/infinite loop when compiling division by 0x80000000
2021-04-26 23:50:27 +02:00
JosJuice
7d4b87e7ae Jit64: Fix UB/infinite loop when compiling division by 0x80000000 2021-04-26 23:42:03 +02:00
Vicki Pfau
4ce3362bce SI/DeviceGBA: Fix SI timings to actually closely match hardware 2021-04-26 01:36:43 -07:00
JosJuice
ac679eb24d
Merge pull request #9666 from leoetlino/jit-block-hashtable
Jit: Optimize block link queries by using hash tables
2021-04-25 18:45:41 +02:00
JosJuice
69c14d6ec3 JitArm64: Fix frspx with single precision source
I haven't observed this breaking any game, but it didn't match
the behavior of the interpreter as far as I could tell from
reading the code, in that denormals weren't being flushed.
2021-04-25 15:56:59 +02:00
JosJuice
54451ac731 JitArm64: Use ConvertSingleToDoubleLower in RW when faster 2021-04-25 15:56:59 +02:00
JosJuice
9d6263f306 JitArm64: Add unit tests for single/double conversion 2021-04-25 15:56:58 +02:00
JosJuice
2a9d88739c JitArm64: Skip accurate single/double conversion if store-safe 2021-04-25 15:56:58 +02:00
JosJuice
1d106ceaf5 JitArm64: Optimize ConvertSingleToDouble, part 2
If we can prove that FCVT will provide a correct conversion,
we can use FCVT. This makes the common case a bit faster
and the less likely cases (unfortunately including zero,
which FCVT actually can convert correctly) a bit slower.
2021-04-25 15:56:19 +02:00
JosJuice
018e247624 JitArm64: Optimize ConvertSingleToDouble, part 1 2021-04-25 15:56:19 +02:00
JosJuice
28e4869c43 JitArm64: Optimize ConvertDoubleToSingle 2021-04-25 15:56:19 +02:00
JosJuice
6e0a5876ef JitArm64: Use accurate single/double conversions
Our old conversion approach became a lot more inaccurate when
enabling flush-to-zero, to the point of obviously breaking games.
2021-04-25 15:56:19 +02:00
JosJuice
39eccf6603 JitArm64: Call RW before FCMPE in fselx
Needed because the next commit will make RW clobber flags.
2021-04-25 15:56:19 +02:00
JosJuice
949686bbe7 JitArm64: Factor out single/double conversion code to functions
Preparation for following commits.

This commit intentionally doesn't touch paired stores,
since paired stores are supposed to flush to zero.
(Consistent with Jit64.)
2021-04-25 15:56:19 +02:00
JosJuice
fdf7744a53 JitArm64: Move float conversion code out of EmitBackpatchRoutine
This simplifies some of the following commits. It does require
an extra register, but hey, we have 32 of them.

Something I think would be nice to add to the register cache
in the future is the ability to keep both the single and double
version of a guest register in two different host registers
when that is useful. That way, the extra register we write to
here can be read by a later instruction, saving us from
having to perform the same conversion again.
2021-04-25 15:56:19 +02:00
Léo Lam
aa3a96f048
Merge pull request #9644 from JosJuice/jit-fallback-discard
Jits: Fix interpreter fallback handling of discarded registers
2021-04-25 13:20:41 +02:00
JosJuice
b3b5016f54 Jits: Fix interpreter fallback handling of discarded registers
When the interpreter writes to a discarded register, its type
must be changed so that it is no longer considered discarded.

Fixes a 62ce1c7 regression.
2021-04-25 13:01:40 +02:00
Sintendo
47e16133e5 Jit64: divwx - Eliminate XOR for constant dividend
We normally check for division by zero to know if we should set the
destination register to zero with a XOR. However, when the divisor and
destination registers are the same the explicit zeroing can be omitted.
In addition, some of the surrounding branching can be simplified as
well.

Before:
45 85 FF             test        r15d,r15d
75 05                jne         normal_path
45 33 FF             xor         r15d,r15d
EB 0C                jmp         done
normal_path:
B8 5A 00 00 00       mov         eax,5Ah
99                   cdq
41 F7 FF             idiv        eax,r15d
44 8B F8             mov         r15d,eax
done:

After:
45 85 FF             test        r15d,r15d
74 0C                je          done
B8 5A 00 00 00       mov         eax,5Ah
99                   cdq
41 F7 FF             idiv        eax,r15d
44 8B F8             mov         r15d,eax
done:
2021-04-24 21:32:21 +02:00
Sintendo
abc4c8f601 Jit64: divwx - Eliminate MOV for division by power of 2
Division by a power of two can be slightly improved when the
destination and dividend registers are the same.

Before:
8B C6                mov         eax,esi
85 C0                test        eax,eax
8D 70 03             lea         esi,[rax+3]
0F 49 F0             cmovns      esi,eax
C1 FE 02             sar         esi,2

After:
85 F6                test        esi,esi
8D 46 03             lea         eax,[rsi+3]
0F 48 F0             cmovs       esi,eax
C1 FE 02             sar         esi,2
2021-04-24 19:28:23 +02:00
Sintendo
246adf0d6d Jit64: divwx - Eliminate MOV for division by 2
When destination and input registers match, a redundant MOV instruction
can be eliminated.

Before:
8B C7                mov         eax,edi
8B F8                mov         edi,eax
C1 EF 1F             shr         edi,1Fh
03 F8                add         edi,eax
D1 FF                sar         edi,1

After:
8B C7                mov         eax,edi
C1 EF 1F             shr         edi,1Fh
03 F8                add         edi,eax
D1 FF                sar         edi,1
2021-04-24 18:53:21 +02:00
Léo Lam
c812ab6a63
Jit: Optimize block link queries by using hash tables
Repeated erase() + iteration on a std::multimap is extremely slow.

Slow enough that it causes a 7 second long stutter during some
transitions in F-Zero X (a N64 VC game that triggers many, many icache
invalidations).

And slow enough that JitBaseBlockCache::DestroyBlock shows up on a
flame graph as taking >50% of total CPU time on the CPU-GPU thread:
https://i.imgur.com/vvqiFL6.png

This commit optimises those block link queries by replacing the
std::multimap (which is typically implemented with red-black trees)
with hash tables.

Master: https://i.imgur.com/vvqiFL6.png / 7s stutters
(starting from 5.0-2021 and with branch following disabled)

This commit: https://i.imgur.com/hAO74fy.png / ~0.7s stutters, which
is pretty close to 5.0 stable. (5.0-2021 introduced the performance
regression and it is especially noticeable when branch following
is disabled, which is the case for all N64 VC games since 5.0-8377.)
2021-04-24 17:20:59 +02:00
JosJuice
be5775614c
Merge pull request #9619 from leoetlino/scoped-fd
IOS/FS: Add a scoped FD class to make it harder to leak FDs
2021-04-23 21:53:25 +02:00
Léo Lam
1686b637df
MIOS: Fix SConfig::OnNewTitleLoad not being called
Oversight from #9545, which moved the "new game has been loaded" logic
to a separate OnNewTitleLoad function that has to be called explicitly
*after* a title has loaded.

Coupled with the commit that makes Dolphin not clobber 0x1800-0x3000
when using MIOS, this fixes Wind Waker and other MIOS-patched games
when they are launched from the System Menu.
2021-04-22 21:55:07 +02:00