41131 Commits

Author SHA1 Message Date
JosJuice
b1987d0187 JitArm64: Use ADDI2R for psq_lXX/psq_stXX immediate offsets
This simplifies the source code, and slightly improves the emitted code
in some cases.
2023-12-01 21:31:11 +01:00
JosJuice
67791d227c JitArm64: Add special zero case to ADDI2R
This normally doesn't reduce the instruction count, but is nonetheless
useful on CPUs that can do 0-cycle moves.
2023-12-01 21:31:11 +01:00
JosJuice
25ffb0dbfc JitArm64: Mask input to 32-bit ADDI2R
In case the input was a s32 that got sign extended as part of conversion
to u64.
2023-12-01 21:26:37 +01:00
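The sign-extension issue described in 25ffb0dbfc can be illustrated with a minimal sketch. The helper names `WidenSigned` and `MaskTo32` are hypothetical and are not taken from Dolphin's emitter; the sketch only shows why a negative s32 needs masking after it has been converted to u64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converting a negative s32 to u64 sign-extends it,
// so the upper 32 bits of the result are all ones.
constexpr std::uint64_t WidenSigned(std::int32_t imm)
{
  return static_cast<std::uint64_t>(imm);
}

// Hypothetical helper: keep only the low 32 bits, which is all that a
// 32-bit add should ever see.
constexpr std::uint64_t MaskTo32(std::uint64_t imm)
{
  return imm & 0xFFFFFFFFull;
}

// Small usage example: -4 widens to a value with the upper half set,
// and masking restores the intended 32-bit immediate.
inline void ExampleMaskImmediate()
{
  const std::uint64_t widened = WidenSigned(-4);
  assert((widened >> 32) == 0xFFFFFFFFull);
  assert(MaskTo32(widened) == 0xFFFFFFFCull);
}
```

Without the mask, code that tries to encode the immediate for a 32-bit operation would be working with a 64-bit value it cannot represent.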
Sketch
b46fcf9032 IOS/KD: Implement Send Mail
2023-12-01 19:55:32 +01:00
Sketch
2c3d05423d IOS/KD/VFF: Implement reading from VFF
2023-12-01 19:53:01 +01:00
Sketch
0d908a83e7 Common/HTTP: Implement Multiform
2023-12-01 19:52:21 +01:00
Tillmann Karras
6a0b17f8fe CMake: update required enet version
Dolphin now relies on ENET_SOCKOPT_TTL which was merged upstream but not
released yet. This fix assumes that enet continues its current
versioning scheme.
2023-12-01 14:13:07 +00:00
Mai
5f7e9d3bf1
Merge pull request #12320 from JosJuice/jitarm64-mmu-order
PowerPC: Unify "FromJit" MMU functions
2023-11-30 18:34:32 -05:00
Mai
d85cb749c0
Merge pull request #11382 from skyfloogle/traversal-fix-2
Traversal: Use low TTL for probe packet
2023-11-30 18:03:50 -05:00
Mai
d67f54b175
Merge pull request #12186 from TellowKrinkle/MultiTextureComputeMetal
VideoBackends:Metal: Support multiple compute textures
2023-11-30 17:46:02 -05:00
JosJuice
94b31eb4f4 Jit: Replace MSR/MMCR access with feature_flags access
This has the same effect in the end, but in my opinion, doing it this
way makes it more clear for the reader why we can read from ppcState at
JIT time, something that makes no sense for everything else in ppcState.
2023-11-30 22:40:36 +01:00
JosJuice
62787085e1 Jit: Add feature flag for performance monitor
By making the JIT cache check if the current state of MMCR0 and MMCR1
matches the state they had at the time the JIT block was compiled, we
solve a correctness issue (marked in a comment as a speed hack).

Not known to affect any games.
2023-11-30 22:40:36 +01:00
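The idea in 62787085e1 of validating a compiled block against the state it was compiled under could be sketched roughly as follows. The types `Block` and `BlockCache` and the constant `kFeatureFlagPerfmon` are assumptions made for illustration and do not reflect Dolphin's actual JIT cache layout.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical flag bit meaning "the performance monitor is enabled".
constexpr std::uint32_t kFeatureFlagPerfmon = 1u << 0;

// Hypothetical block record: where the block starts and which feature
// flags were in effect when it was compiled.
struct Block
{
  std::uint32_t address;
  std::uint32_t feature_flags;
  int id;
};

class BlockCache
{
public:
  void Insert(const Block& block) { m_blocks.insert_or_assign(block.address, block); }

  // Returns the block only if it was compiled under the same feature flags
  // that are active now; otherwise the caller has to recompile.
  const Block* Lookup(std::uint32_t address, std::uint32_t current_flags) const
  {
    const auto it = m_blocks.find(address);
    if (it == m_blocks.end())
      return nullptr;
    if (it->second.feature_flags != current_flags)
      return nullptr;
    return &it->second;
  }

private:
  std::unordered_map<std::uint32_t, Block> m_blocks;
};

// Small usage example: a block compiled with the monitor off is rejected
// once the monitor is switched on.
inline void ExampleFeatureFlagLookup()
{
  BlockCache cache;
  cache.Insert({0x80003100u, 0u, 1});
  assert(cache.Lookup(0x80003100u, 0u) != nullptr);
  assert(cache.Lookup(0x80003100u, kFeatureFlagPerfmon) == nullptr);
}
```

Because the flags are compared at lookup time, code emitted under one configuration is never run under another.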
JosJuice
ca7e05bbc4 Jit: Replace "msrBits" with "featureFlags"
Preparation for the next commit.
2023-11-30 22:40:32 +01:00
Admiral H. Curtiss
163acb5d2c
Merge pull request #12339 from Tilka/bruise
GameSettings: add patch to disable interlacing in Black & Bruised
2023-11-30 21:08:15 +01:00
Admiral H. Curtiss
529a51d653
Merge pull request #12341 from JosJuice/jitarm64-msr-pc-order
JitArm64: Fix JitAsm without entry points map
2023-11-30 20:44:33 +01:00
JosJuice
4b50a38cf6 JitArm64: Fix JitAsm without entry points map
This must have broken in a rebase of one of my recently merged PRs.

Dolphin still worked correctly with this bug, for two reasons:

1. Most AArch64 users are not on Windows, and therefore normally do have
   the entry points map.
2. When the bug was triggered, Dolphin would fall back to the slower
   path rather than crashing.
2023-11-30 20:11:02 +01:00
TellowKrinkle
394dd02d0a VideoBackends:Metal: Support multiple compute textures
2023-11-29 18:45:11 -06:00
TellowKrinkle
a399dc43a1 VideoBackends:Metal: Align utility uniform sizes
Prevents complaining from validation layers
2023-11-29 18:45:11 -06:00
Tillmann Karras
d12642b392 GameSettings: add patch to disable interlacing in Black & Bruised
2023-11-29 23:59:33 +00:00
Mai
89963c287c
Merge pull request #11958 from JosJuice/jitarm64-dispatcher-microopt
JitArm64: Dispatcher optimizations
2023-11-29 16:54:09 -05:00
Mai
2d0e577f8f
Merge pull request #12340 from JosJuice/jit-gp-check-discard-cr
PPCAnalyst: Don't discard CR before gather pipe interrupt check
2023-11-29 16:51:03 -05:00
JosJuice
bddcf60673 PPCAnalyst: Don't discard CR before gather pipe interrupt check
This fixes a frequently occurring JitArm64 assert caused by merging
6cc4f593e5 without adapting it to the changes made in 5902b5b113.
2023-11-29 21:53:13 +01:00
JosJuice
06c7862160 JitArm64: Rearrange dispatcher instructions to improve scheduling
Loads can take a little while to complete.
2023-11-29 19:13:09 +01:00
JosJuice
9e970bcb30 JitArm64: Optimize shifting and masking PC in slow dispatcher
Instead of shifting left by 1, we can first shift right by 2 and then
left by 3. This is both faster and smaller, because we get the right
shift for free with the masking and the left shift for free with the
address calculation. It also happens to match the pseudocode more
closely, which is always nice for readability.
2023-11-29 19:13:09 +01:00
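The equivalence behind 9e970bcb30 can be checked with a small sketch. The index width `kIndexBits` and the 8-byte table entry size are assumptions chosen for illustration, not values taken from the dispatcher. Both functions compute a byte offset into a table of 8-byte entries from a 4-byte-aligned PC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical table size: 2^16 entries of 8 bytes each.
constexpr std::uint32_t kIndexBits = 16;
constexpr std::uint32_t kIndexMask = (1u << kIndexBits) - 1;

// Older form: mask the PC in place (dropping the low 2 bits), then shift
// left by 1 to scale from 4-byte to 8-byte units.
constexpr std::uint64_t OffsetShiftLeft1(std::uint32_t pc)
{
  return static_cast<std::uint64_t>(pc & (kIndexMask << 2)) << 1;
}

// Newer form: shift right by 2 to get an entry index, mask it, then shift
// left by 3 to scale the index to bytes.
constexpr std::uint64_t OffsetShiftRight2Left3(std::uint32_t pc)
{
  const std::uint64_t index = (pc >> 2) & kIndexMask;
  return index << 3;
}

// Small usage example: both forms agree over a range of PCs.
inline void ExampleOffsetsAgree()
{
  for (std::uint32_t pc = 0x80003000u; pc < 0x80003100u; pc += 4)
    assert(OffsetShiftLeft1(pc) == OffsetShiftRight2Left3(pc));
}
```

In the second form, the shift-and-mask is a single bitfield extract and the final scale by 8 can be folded into a scaled-register load address, which is presumably what the commit message means by getting both shifts for free.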
JosJuice
c9347a2a19 JitArm64: Use LDP in slow dispatcher
With one LDP instruction, we can replace two LDR instructions.
2023-11-29 19:13:09 +01:00
JosJuice
4a4e7d9b8a Jit: Swap locations of effectiveAddress and msrBits
This slightly improves instruction-level parallelism in Jit64's slow
dispatcher by shifting the PC left instead of the MSR.

In the past, this also enabled an optimization in JitArm64's fast path
where we could use LDP to load normalEntry and msrBits in one
instruction, but this was superseded by fd9c970.
2023-11-29 19:13:09 +01:00
JosJuice
3df09f349d JitArm64: Prefer X8 and up for temporary registers in JitAsm
Just to make the code easier to understand at a glance. I especially
found it a bit annoying to reason about whether callee-saved registers
like W28 were being used because we needed a callee-saved register or
just for no reason in particular.

X8 and up is what compilers normally use when they're not register
starved.
2023-11-29 19:13:03 +01:00
Mai
0a62b30cd4
Merge pull request #11906 from noahpistilli/request-register-user-id
IOS/KD: Implement Request Register User ID
2023-11-29 03:31:59 -05:00
Mai
02de58eb2c
Merge pull request #12337 from Tilka/imm16
Jit64: fix invalid instruction encoding
2023-11-29 01:10:22 -05:00
Tillmann Karras
f6131e9703 Jit64: fix invalid instruction encoding
This is a recent regression introduced in
c70dcf99dd7b434d5196feb47f2a6e87bb34234b.
2023-11-29 05:49:02 +00:00
Mai
a7216a3035
Merge pull request #9857 from JosJuice/jitarm64-cr-analysis
PPCAnalyst: Allow more reordering of CR operations
2023-11-28 21:01:07 -05:00
Sketch
f2607cdd74 IOS/KD: Implement Request Register User ID
2023-11-28 20:40:15 -05:00
Mai
b7435be90a
Merge pull request #12298 from Shoegzer/master
Update default IP for HLE BBA
2023-11-28 22:45:17 +01:00
Mai
d095bddbe7
Merge pull request #12141 from JosJuice/jit-blr-msr
Jit: Check MSR state in BLR optimization
2023-11-28 22:35:35 +01:00
Mai
934418a289
Merge pull request #12092 from JosJuice/jitarm64-last-nan
JitArm64: Skip checking last input for NaN for non-SIMD operations
2023-11-28 22:30:50 +01:00
JosJuice
fc95d59805 JitArm64: Further optimize NaN handling in ps_sumX
So short that using farcode is pointless!
2023-11-28 21:45:44 +01:00
JosJuice
8274dcbfe4 JitArm64: Skip checking last input for NaN for non-SIMD operations
AArch64's handling of NaNs in arithmetic instructions matches PowerPC's
as long as no more than one of the operands is NaN. If we know that all
inputs except the last input are non-NaN, we can therefore skip checking
the last input. This is an optimization that in principle only works for
non-SIMD operations, but ps_sumX effectively is non-SIMD as far as the
arithmetic part of it is concerned, so we can use it there too.
2023-11-28 21:45:40 +01:00
Mai
95f06ef231
Merge pull request #12122 from JosJuice/jit-imm-msr
Jit: Handle imm msr in EmitStoreMembase
2023-11-28 21:34:23 +01:00
Mai
8cf0597d5f
Merge pull request #12091 from JosJuice/jitarm64-skip-quiet-bit
JitArm64: Use one instruction for making NaNs quiet
2023-11-28 21:33:25 +01:00
Mai
e99ead0a68
Merge pull request #12124 from JosJuice/jitarm64-mfsrin-mtsrin-addr
JitArm64: Optimize mfsrin/mtsrin address calculations
2023-11-28 21:30:30 +01:00
Mai
b53ecd73fb
Merge pull request #12143 from JosJuice/jitarm64-loadstore-pc
JitArm64: Write PC when calling MMU.cpp
2023-11-28 21:29:37 +01:00
Mai
1df685b2d7
Merge pull request #12123 from JosJuice/jit-mcrxr
Jit: Some mcrxr optimizations
2023-11-28 19:32:47 +01:00
Mai
20b13df507
Merge pull request #12179 from JosJuice/jitarm64-gp-deduplicate
JitArm64: Deduplicate the gather pipe exception check
2023-11-28 19:21:58 +01:00
Mai
ac53766058
Merge pull request #12215 from JosJuice/android-si-devices
Android: Add more GameCube controller types
2023-11-28 19:21:29 +01:00
Mai
bfc6bca583
Merge pull request #12235 from JosJuice/jitarm64-float-cls
JitArm64: Use LSL+CLS for classifying floats
2023-11-28 19:20:01 +01:00
JosJuice
80171adf1e PPCTables: Retire FL_EVIL
FL_EVIL is only used for blocking instructions from being reordered.
There are three types of instructions which have FL_EVIL set:

1. CR operations. The previous commits improved our CR analysis
   and removed FL_EVIL from these instructions.
2. Load/store operations. These are always blocked from reordering
   due to always having canCauseException set.
3. isync. I don't know if we actually need to prevent reordering
   around this one, since as far as I know we only do reorderings
   that are guaranteed to not change the behavior of the program.
   But just in case, I've renamed FL_EVIL to FL_NO_REORDER instead of
   removing it entirely, so that it can be used for this instruction.
2023-11-28 18:59:34 +01:00
JosJuice
f494a3d9e8 PPCAnalyst: Remove CanSwapAdjacentOps's OPCD check
Other than the CR instructions, which we now analyze properly,
all the covered instructions are not integer operations and also
have either FL_ENDBLOCK or FL_EVIL set, so there are two other
checks in CanSwapAdjacentOps that will reject them.
2023-11-28 18:59:34 +01:00
JosJuice
96d622bb61 PPCAnalyst: Run cror reordering after cmp reordering
We would rather have cror be close to the cmp than the branch.
2023-11-28 18:59:34 +01:00
JosJuice
40e0dd93be PPCAnalyst: Allow more reordering of CR operations
This is possible with the improved CR analysis implemented
in the previous commits.
2023-11-28 18:59:34 +01:00
JosJuice
da63cee711 PPCAnalyst: More strict a_flags checks in CanSwapAdjacentOps
If for instance instruction a sets OE and instruction b
reads it, we shouldn't permit reordering.
2023-11-28 18:59:34 +01:00
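The kind of check described in da63cee711 could look roughly like the sketch below. The `OpFlags` struct, the `kCA`/`kOE` bits and the function `FlagsAllowSwap` are hypothetical stand-ins and are not the flag names or logic used in PPCAnalyst; the sketch only shows the general rule that two adjacent instructions must not be swapped when one writes a flag that the other reads or writes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag resources that an instruction may set or read.
constexpr std::uint32_t kCA = 1u << 0;  // carry
constexpr std::uint32_t kOE = 1u << 1;  // overflow

// Hypothetical per-instruction summary: which flags it writes and reads.
struct OpFlags
{
  std::uint32_t sets;
  std::uint32_t reads;
};

// Returns true only if swapping a and b cannot change what either one
// observes: no read-after-write, write-after-read, or write-after-write
// dependency on any flag.
constexpr bool FlagsAllowSwap(const OpFlags& a, const OpFlags& b)
{
  return (a.sets & b.reads) == 0 && (b.sets & a.reads) == 0 && (a.sets & b.sets) == 0;
}

// Small usage example: a sets OE and b reads it, so the swap is rejected.
inline void ExampleFlagsCheck()
{
  const OpFlags a{kOE, 0};
  const OpFlags b{0, kOE};
  assert(!FlagsAllowSwap(a, b));
}
```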