Commit Graph

1570 Commits

Author SHA1 Message Date
Billy Laws
04f3fa4b7f Implement basic indirect draw macro HLE
This still requires usagetracker to avoid redundantly performing indirect draws when the memory isn't dirty, and to allow for using it with direct memory, but it's a start.
2023-03-19 13:52:15 +00:00
Billy Laws
2444f2e81d Fix HLE macro code to not hash all of macro memory + update args struct
We incorrectly hashed the entirety of macro memory starting from the macro base address, as opposed to just the macro itself.
2023-03-19 13:52:15 +00:00
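The fix described above amounts to bounding the hashed range by the macro's own length. A minimal sketch of that idea follows; it is not the project's code. `HashWords`, `HashMacro`, the FNV-1a hash and the word-indexed `offset`/`length` parameters are all assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: FNV-1a over `length` 32-bit words starting at `words`.
inline std::uint64_t HashWords(const std::uint32_t *words, std::size_t length) {
    std::uint64_t hash = 0xcbf29ce484222325ULL; // FNV-1a 64-bit offset basis
    for (std::size_t i = 0; i < length; i++) {
        for (int shift = 0; shift < 32; shift += 8) {
            hash ^= (words[i] >> shift) & 0xFF; // Mix in one byte at a time
            hash *= 0x100000001b3ULL;           // FNV-1a 64-bit prime
        }
    }
    return hash;
}

// Hypothetical: hash only the macro at [offset, offset + length) rather than
// everything from `offset` to the end of macro memory.
inline std::uint64_t HashMacro(const std::vector<std::uint32_t> &macroMemory, std::size_t offset, std::size_t length) {
    if (offset > macroMemory.size() || length > macroMemory.size() - offset)
        throw std::out_of_range("macro range exceeds macro memory");
    return HashWords(macroMemory.data() + offset, length);
}
```

With the range bounded this way, unrelated macros stored after the one being looked up no longer change its hash, so a cached HLE match stays valid when other macros are uploaded.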
Billy Laws
b313dcbdca Avoid dereferencing macro argument pointers in memory where possible
Indirect draws are implemented by having the macro arguments overflow into a separate GP Entry that points directly to the indirect argument buffer. To HLE indirect draws, a buffer needs to be created from this pointer, and it cannot be dereferenced on the CPU at any point to avoid hitting traps.
2023-03-19 13:52:15 +00:00
Billy Laws
2b93604da0 Use hades HLE replacement for constant buffer attributes
In the case of indirect draws, we don't know the vertex offset to write into the driver info constant buffer ahead of time, and to do it at draw time on the GPU would mean marking the constant buffer as GPU dirty (slow). HLE them in the shader instead using the host draw parameters extension.
2023-03-19 13:52:15 +00:00
Billy Laws
7e1c58accc Implement indirect draws in the Maxwell 3D interconnect
These will be used by the HLE indirect draw macro to perform indirect draws without waiting for GPU idle.
2023-03-19 13:52:15 +00:00
Billy Laws
49cd2a71cc Introduce GPU checkpoints for crash debugging
When GPU crashes aren't reproducible in renderdoc, it helps to have some way to figure out what exactly is going on when a crash happens or what operation caused it. Add a checkpoint system that reports the GPU execution state in perfetto in time with actual GPU execution, and use flow events to show the event's path through execution, vulkan record and executor record stages.
2023-03-19 13:52:15 +00:00
Billy Laws
d5b6c68ae4 Split out common parts of Maxwell 3D draws
These will be able to be shared between indirect and normal draws.
2023-03-19 13:52:15 +00:00
Billy Laws
779ba3de05 Commonise full pipeline barrier recording 2023-03-19 13:52:15 +00:00
Billy Laws
a65aa28df2 Avoid redundant GPU-dirty propagation for direct buffer recreation 2023-03-19 13:52:15 +00:00
Billy Laws
4a3a40aa40 Add more perfetto tracepoints 2023-03-19 13:52:15 +00:00
Billy Laws
c15b89975b Allocate a general purpose GPU-side debug tracing buffer
Can be used for checkpoints, etc.
2023-03-19 13:52:15 +00:00
Billy Laws
c36b8e843e Add index buffer size estimation via mapping size
This is useful for indirect draws, where we don't know the underlying index buffer size and also don't know the index count.
2023-03-19 13:52:15 +00:00
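One way to read "estimation via mapping size": when neither the index count nor the buffer size is known, the bytes remaining in the mapping that contains the index buffer address give an upper bound. The sketch below assumes a hypothetical `MappingTable` (base address to size) and a hypothetical `EstimateIndexBufferSize` function; neither is taken from the project.

```cpp
#include <cstdint>
#include <map>

// Hypothetical mapping table: base GPU virtual address -> mapping size in bytes.
using MappingTable = std::map<std::uint64_t, std::uint64_t>;

// Returns the number of bytes from `address` to the end of the mapping that
// contains it, or 0 if the address is not covered by any mapping.
inline std::uint64_t EstimateIndexBufferSize(const MappingTable &mappings, std::uint64_t address) {
    auto it = mappings.upper_bound(address); // First mapping starting after `address`
    if (it == mappings.begin())
        return 0; // No mapping starts at or before `address`
    --it;         // Last mapping starting at or before `address`
    std::uint64_t end = it->first + it->second;
    return address < end ? end - address : 0;
}
```

The result is an over-estimate by design: it is only meant to size a buffer view that is guaranteed to contain every index the indirect draw could read.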
Billy Laws
0deff5e37a Set a higher perfetto size hint to avoid packet loss 2023-03-19 13:52:15 +00:00
Billy Laws
4bb2a41594 Use usagetracker to determine if pushbuffers need to flush the GPU 2023-03-19 13:52:15 +00:00
Billy Laws
090151f0c3 Introduce usage tracker for dirty tracking within an execution
This is necessary as e.g. shaders can be updated through a mirror and never hit modification traps. By tracking which addresses have sequenced writes applied, the shader manager can then correctly detect if a given shader has been modified by the GPU.
2023-03-19 13:52:15 +00:00
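The tracking described above can be pictured as a sorted list of written address ranges that is queried for overlap and cleared at the end of an execution. The following is a rough sketch under that assumption; `IntervalList` and its methods are hypothetical names, not the project's API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a dirty-range tracker: a sorted list of disjoint
// half-open [start, end) address intervals.
class IntervalList {
  public:
    struct Interval {
        std::uint64_t start, end;
    };

    // Records a write to [start, end), merging with overlapping or adjacent intervals.
    void Insert(std::uint64_t start, std::uint64_t end) {
        if (start >= end)
            return;
        // First interval whose end is at or after `start` (overlapping or touching)
        auto first = std::lower_bound(intervals.begin(), intervals.end(), start,
                                      [](const Interval &i, std::uint64_t v) { return i.end < v; });
        auto last = first;
        while (last != intervals.end() && last->start <= end) {
            start = std::min(start, last->start);
            end = std::max(end, last->end);
            ++last;
        }
        first = intervals.erase(first, last);
        intervals.insert(first, Interval{start, end});
    }

    // Returns true if any recorded write overlaps [start, end).
    bool Intersects(std::uint64_t start, std::uint64_t end) const {
        if (start >= end)
            return false;
        // First interval that ends strictly after `start`
        auto it = std::upper_bound(intervals.begin(), intervals.end(), start,
                                   [](std::uint64_t v, const Interval &i) { return v < i.end; });
        return it != intervals.end() && it->start < end;
    }

    void Clear() { intervals.clear(); } // E.g. once an execution has been submitted

    std::size_t Size() const { return intervals.size(); }

  private:
    std::vector<Interval> intervals; // Sorted by address, non-overlapping
};
```

A shader cache could then call `Intersects` with a shader's address range to decide whether its cached copy may be stale, without relying on CPU-side modification traps.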
Billy Laws
f64860c93e Commonise buffer interval list code
This will be reused for usagetracker.
2023-03-19 13:52:15 +00:00
Billy Laws
179363a5e7 Fix depth-stencil formats 2023-03-14 23:22:12 +00:00
PabloG02
b4280a61ac Stub IApplicationFunctions::GetSaveDataSize 2023-03-11 18:27:36 +00:00
Billy Laws
6d582566f9 Use AdaptiveSingleWaiterConditionVariable for thread scheduling
Some games, for example PGLE, have heavy contention in code that locks mutexes for only a brief period of time. This heavy contention over multiple threads results in futex latency (often ~20us) impacting performance heavily. Using an adaptive condition variable helps to reduce this latency.
2023-03-11 18:26:02 +00:00
Billy Laws
b1e57bc7bc Introduce adapting condition variable class
By spin waiting for a small period before falling back to an actual condition variable, some of the overheads inherent to futexes can be avoided. The used constants were tuned for optimal performance on 8G1 on Skyrim and PGLE.
2023-03-11 18:26:02 +00:00
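The spin-then-block idea can be sketched with standard library primitives. This is only an illustration for a single waiter: `SpinThenBlockEvent`, its methods and the 50 microsecond default budget are assumptions, and the tuned constants mentioned above are not reproduced here.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical single-waiter event that spins briefly before blocking.
class SpinThenBlockEvent {
  public:
    void Signal() {
        {
            std::lock_guard<std::mutex> lock{mutex};
            signalled.store(true, std::memory_order_release);
        }
        cv.notify_one(); // Harmless if the waiter already consumed the flag while spinning
    }

    // Waits for a signal and consumes it. Returns true if the signal was
    // observed during the spin phase, false if the thread had to block.
    bool Wait(std::chrono::microseconds spinBudget = std::chrono::microseconds{50}) {
        auto deadline = std::chrono::steady_clock::now() + spinBudget;
        do {
            if (signalled.exchange(false, std::memory_order_acquire))
                return true; // Fast path: no sleep/wake round trip through the kernel
            std::this_thread::yield();
        } while (std::chrono::steady_clock::now() < deadline);

        // Slow path: fall back to a regular condition variable wait
        std::unique_lock<std::mutex> lock{mutex};
        cv.wait(lock, [this] { return signalled.load(std::memory_order_acquire); });
        signalled.store(false, std::memory_order_relaxed);
        return false;
    }

  private:
    std::mutex mutex;
    std::condition_variable cv;
    std::atomic<bool> signalled{false};
};
```

The trade-off is CPU time burnt while spinning against wake-up latency, which is why the spin budget needs tuning per device and workload.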
TheASVigilante
444e35e34f Fix swizzling regression + minor optimizations to swizzling 2023-03-06 21:56:31 +00:00
TheASVigilante
3e1db818cf Address review 2023-03-06 21:56:31 +00:00
TheASVigilante
caf1abbe31 Fix 3D swizzled copies & a small bug with DMA clears 2023-03-06 21:56:31 +00:00
TheASVigilante
70ee36e85c Add support for 1D remapped buffer clears 2023-03-06 21:56:31 +00:00
TheASVigilante
4c3fed6cd0 Hookup various DMA engine features
The DMA engine now supports these additional functions: pitch (to pitch) copies, subrect copies, split copies.
2023-03-06 21:56:31 +00:00
TheASVigilante
fd205ff0a9 Implement rest of I2M engine copies 2023-03-06 21:56:31 +00:00
TheASVigilante
72c2d94cbe Implement subrect copies 2023-03-06 21:56:31 +00:00
TheASVigilante
df0fd88991 Implement pitch swizzled copies 2023-03-06 21:56:31 +00:00
TheASVigilante
5c4bb1c44e Fix incorrect remapping register layout 2023-03-06 21:56:31 +00:00
Billy Laws
750dfb8f00 Disable extended dynamic state on <r42 mali drivers 2023-03-04 18:55:44 +00:00
Billy Laws
acf118155d Submit an execution on invalidate{Sampler,TextureHeader}Cache accesses 2023-03-04 18:55:44 +00:00
Billy Laws
6ce5202b8e Add exceptions for some more unimplemented maxwell draw regs 2023-03-04 18:55:44 +00:00
Billy Laws
7150ce0d1d Allow disabling the freeing of texture guest memory
This helps to prevent issues that result from the overlapping of buffer and texture data, by only ever syncing back textures if they are actually used as RTs, which are much less likely to overlap buffers.
2023-03-04 18:55:44 +00:00
Billy Laws
5e8cdfda92 Don't populate colour targets with an empty write mask
Avoids breaking VK spec in BOTW, as it has the same colour attachment bound twice, but the former is masked out entirely.
2023-03-04 18:55:44 +00:00
Billy Laws
8baf06c9ab Never free memory for GPU dirty buffers
Fixes Persona 5 textures in some cases due to overlaps with textures.
2023-03-04 18:55:44 +00:00
Billy Laws
1dd13e90b0 Use channel sequence number for TIC cache validity tracking
Fixes some OpenGL games which update a TIC with I2M but never end up triggering an execution otherwise.
2023-03-04 18:55:44 +00:00
Billy Laws
34fddfccba Only clear requested aspect for depth/stencil clears
Fixes water in Skyrim (depth was being cleared when only stencil should have been)
2023-03-04 18:55:44 +00:00
Billy Laws
330f402398 Clear chained fence cycles on the waiter thread
This avoids some sporadic random crashes that happen during fence cycle destruction in BOTW/SMO.
2023-03-04 18:55:44 +00:00
lynxnb
a1ca84f95e Move Kotlin settings to a dedicated package 2023-02-27 19:56:53 +01:00
lynxnb
180d1efd4d Revert "Toggle DisableFrameThrottling setting by clicking on FPS"
This commit reverts PR #2037. Passing `NativeSettings` to emulation code through a member reference, instead of a local variable, caused unpredictable crashes when using custom GPU drivers (v615+) on some Qualcomm SoCs.
The exact cause of the issue remains unknown; my best guess is that an incorrect optimization performed on the Kotlin bytecode in release mode caused an issue when reading memory that had been forked, as emulation runs in a separate process.
Runtime settings modification will be reimplemented in the future via an alternative method.
2023-02-27 19:00:52 +01:00
PixelyIon
22f8cc5970 Stub indirect layer VI service functions
Indirect layers are used by the game to render a layer on its own: the game allocates a buffer with the size from `GetIndirectLayerImageRequiredMemoryInfo` and uses `GetIndirectLayerImageMap` to draw the applet contents into the buffer.

As we don't LLE applet implementations nor do our HLE implementations draw equivalent applets, we cannot submit this to the guest. As a result, these functions are stubbed with the framebuffer being cleared to red. 

Stubbing these functions allows titles such as Dark Souls to not crash while initializing indirect layers.
2023-02-22 23:23:08 +05:30
lynxnb
dba191d2dc Fix deadlock on settings in PresentationEngine after callback
Accessing the settings class during the execution of the `OnChangedCallback` results in a deadlock, as accesses to values are protected by a mutex. Instead, we now keep a local copy of the relevant settings and update those with the new value.
2023-02-20 21:45:30 +00:00
lynxnb
fc9b34846c Fix KtSettings JNI usage
* Use a global ref for NativeSettings JNI instance
* Always use the JNI env from the JNI call to ensure it's safe to use in the current thread
2023-02-20 21:45:30 +00:00
Matesic Darko
f850271e2d toggle DisableFrameThrottling setting by clicking on FPS display
s/jSurface/vkSurface
2023-02-20 21:45:30 +00:00
TheASVigilante
3168e8efc0 Address review 2023-02-20 18:17:35 +00:00
TheASVigilante
b780e2b755 Deallocate Unmapped memory pages
Reduces memory usage buildup over time, may affect performance.
2023-02-20 18:17:35 +00:00
Billy Laws
7296f8503d Always attempt to enable robustness 2023-02-20 18:01:49 +00:00
Billy Laws
d45f9e4d26 Loosen some texture WaR sync when possible
By keeping track of the stages reading the image we can do more fine-grained WaR prevention, as opposed to waiting for all commands to complete.
2023-02-20 18:01:49 +00:00
Billy Laws
6ee8a919e5 Use pipeline barriers, as opposed to an ext dependency for RP barrier
Allows for waiting on compute shaders, which are not a graphics stage.
2023-02-20 18:01:49 +00:00
Billy Laws
ee68facc5d Apply RP barrier masks for every draw, rather than the 1st in RP
I missed that addSubpass was only called once-per-subpass, meaning that if a new barrier req was discovered several draws into the RP it wouldn't be applied. Split out barriers into a separate function to avoid this.
2023-02-20 18:01:49 +00:00