Commit Graph

1589 Commits

Author SHA1 Message Date
Billy Laws
7d0b7f0b71 Handle OOB blits by adding to the texture base offset
The previous method would cause OOB reads for the last row to clamp, and adding an extra row would potentially encounter unmapped memory. So use this technique based on how Ryu does it.
2023-04-16 16:31:42 +01:00
Dima
6aef7fdd1e Stub some services 2023-04-03 10:30:20 +01:00
Billy Laws
bbc8ccb823 Treat partially unmapped textures as unmapped 2023-04-02 17:35:12 +01:00
Billy Laws
bbe4872a95 Fix missed CachedMappedBufferView initialisation 2023-04-02 17:35:12 +01:00
Billy Laws
7bfe63f679 Enable adreno/mali LLVM misopt workaround
Fixes Animal Crossing character issues; see the shader compiler commit for details.
2023-04-02 17:35:12 +01:00
Billy Laws
737fb2207d Avoid submitting executions on semaphore incrs
This avoids breaking RPs which helps perf, and since we have our own sync logic we don't need to match the guest here.
2023-04-02 17:35:12 +01:00
Billy Laws
99a7b77948 Remove broken descriptor aliasing quirk
This can be fixed in the shader compiler by just naming cbufs differently.
2023-04-02 17:35:12 +01:00
Billy Laws
c440575a56 Add partial conditional rendering support
Currently disabled for cases where the results come from queries, but still functional for cases like AC which don't use it with queries.
2023-04-02 17:35:12 +01:00
Billy Laws
a2798a9184 Implement support for occlusion queries
These are mostly implemented how you would expect; however, rather than copying out query pool results immediately, the copy is delayed until the end of the RP in order to avoid splits.
2023-04-02 17:35:12 +01:00
Billy Laws
202c97a1eb Introduce several new node insertion functions for use with queries
Queries need the ability to insert commands at the beginning and end of RPs.
2023-04-02 17:35:12 +01:00
Billy Laws
7d573db80b Make GetTimeTicks return the time in guest ticks as opposed to host
This is required by audren.
2023-03-27 22:31:14 +01:00
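As a rough illustration of what returning guest ticks rather than host ticks involves, the sketch below rescales a host counter value to a guest tick rate. The helper name, the 19.2 MHz guest frequency, and passing the host frequency as a parameter are all assumptions made for this example; the commit itself does not spell them out.

```cpp
#include <cstdint>

// Assumed guest counter frequency (19.2 MHz); not stated in the commit.
constexpr std::uint64_t GuestTickFrequencyHz = 19200000;

// Hypothetical helper: rescales a host counter value to guest ticks.
// Whole seconds and the sub-second remainder are scaled separately so the
// intermediate product stays comfortably inside 64 bits.
constexpr std::uint64_t HostTicksToGuestTicks(std::uint64_t hostTicks, std::uint64_t hostFrequencyHz) {
    return (hostTicks / hostFrequencyHz) * GuestTickFrequencyHz +
           ((hostTicks % hostFrequencyHz) * GuestTickFrequencyHz) / hostFrequencyHz;
}
```

With a hypothetical 24 MHz host counter, one second of host ticks (24000000) maps to one second of guest ticks (19200000).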
Billy Laws
d5a15faab7 Greatly simplify circular buffer logic, fixing several bugs 2023-03-27 22:31:14 +01:00
Billy Laws
01febe75c4 Reimplement audout and audren using yuzu audio_core
The yuzu audio_core code is mostly untouched, with a set of wrappers used to bridge it with skyline kernel primitives. Huge thanks to maide for their advice, without which this wouldn't have been possible.
2023-03-27 22:31:14 +01:00
Billy Laws
1f0d297221 Fix infinite loop when reading dirty buffers with direct mem
The code didn't account for interval queries returning zero when reaching the end.
2023-03-19 13:52:15 +00:00
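The class of bug described in this fix can be sketched as follows. This is not the project's code: `DirtyInterval`, `QueryResult`, `Query` and `CountDirtyBytes` are hypothetical names, and the convention that a query returns a size of zero once no further intervals exist is taken from the commit message only in spirit. The point is that a loop advancing by the returned size must stop when that size is zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical dirty range, covering bytes [start, end).
struct DirtyInterval {
    std::size_t start;
    std::size_t end;
};

struct QueryResult {
    bool dirty;       // Whether the queried offset lies inside a dirty interval
    std::size_t size; // Bytes until the state changes; 0 means nothing is tracked past the offset
};

// Looks up the state at `offset` in a sorted, non-overlapping interval list.
QueryResult Query(const std::vector<DirtyInterval> &intervals, std::size_t offset) {
    for (const auto &interval : intervals) {
        if (offset < interval.start)
            return {false, interval.start - offset};
        if (offset < interval.end)
            return {true, interval.end - offset};
    }
    return {false, 0}; // Reached the end of the list
}

// Sums the dirty bytes in [offset, offset + size).
std::size_t CountDirtyBytes(const std::vector<DirtyInterval> &intervals, std::size_t offset, std::size_t size) {
    std::size_t total{0};
    const std::size_t end{offset + size};
    while (offset < end) {
        const QueryResult result{Query(intervals, offset)};
        if (result.size == 0)
            break; // Without this check `offset` never advances and the loop spins forever
        const std::size_t step{std::min(result.size, end - offset)};
        if (result.dirty)
            total += step;
        offset += step;
    }
    return total;
}
```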
Billy Laws
0b551e04db Return a null handle when reading from an unbound cbuf 2023-03-19 13:52:15 +00:00
Billy Laws
8d9b0041b4 Return a dummy buffer when encountering unbound SSBOs 2023-03-19 13:52:15 +00:00
Billy Laws
d893777262 Flush deferred draws before executing macro HLE and cleanup 2023-03-19 13:52:15 +00:00
Billy Laws
0f33055176 Fixup accidental change 2023-03-19 13:52:15 +00:00
Billy Laws
3901ecbf49 Hook up indirect draws into usagetracker
Now that usagetracker is properly in place, indirect draw HLE can be used without requiring any hacks. Dirtiness is now ignored when fetching macro arguments, and it's now the duty of the HLE impls themselves to perform flushing if they require it.
2023-03-19 13:52:15 +00:00
Billy Laws
04f3fa4b7f Implement basic indirect draw macro HLE
This still requires usagetracker to avoid redundantly performing indirect draws when the memory isn't dirty, and to allow for using it with direct memory, but it's a start.
2023-03-19 13:52:15 +00:00
Billy Laws
2444f2e81d Fix HLE macro code to not hash all of macro memory + update args struct
We incorrectly hashed the entirety of macro memory starting from the macro base address, as opposed to just the macro itself.
2023-03-19 13:52:15 +00:00
Billy Laws
b313dcbdca Avoid dereferencing macro argument pointers in memory where possible
Indirect draws are implemented by having the macro arguments overflow into a separate GP Entry that points directly to the indirect argument buffer. To HLE indirect draws, a buffer needs to be created from this pointer, and it cannot be dereferenced on the CPU at any point to avoid hitting traps.
2023-03-19 13:52:15 +00:00
Billy Laws
2b93604da0 Use hades HLE replacement for constant buffer attributes
In the case of indirect draws, we don't know the vertex offset to write into the driver info constant buffer ahead of time, and doing it at draw time on the GPU would mean marking the constant buffer as GPU dirty (slow). HLE them in the shader instead using the host draw parameters extension.
2023-03-19 13:52:15 +00:00
Billy Laws
7e1c58accc Implement indirect draws in the Maxwell 3D interconnect
These will be used by the HLE indirect draw macro to perform indirect draws without waiting for GPU idle.
2023-03-19 13:52:15 +00:00
Billy Laws
49cd2a71cc Introduce GPU checkpoints for crash debugging
When GPU crashes aren't reproducible in renderdoc, it helps to have some way to figure out what exactly is going on when a crash happens, or which operation caused it. Add a checkpoint system that reports the GPU execution state in perfetto in time with actual GPU execution, and use flow events to show the event's path through the execution, vulkan record and executor record stages.
2023-03-19 13:52:15 +00:00
Billy Laws
d5b6c68ae4 Split out common parts of Maxwell 3D draws
These will be able to be shared between indirect and normal draws.
2023-03-19 13:52:15 +00:00
Billy Laws
779ba3de05 Commonise full pipeline barrier recording 2023-03-19 13:52:15 +00:00
Billy Laws
a65aa28df2 Avoid redundant GPU-dirty propagation for direct buffer recreation 2023-03-19 13:52:15 +00:00
Billy Laws
4a3a40aa40 Add more perfetto tracepoints 2023-03-19 13:52:15 +00:00
Billy Laws
c15b89975b Allocate a general purpose GPU-side debug tracing buffer
Can be used for checkpoints, etc.
2023-03-19 13:52:15 +00:00
Billy Laws
c36b8e843e Add index buffer size estimation via mapping size
This is useful for indirect draws, where we don't know the underlying index buffer size and also don't know the index count.
2023-03-19 13:52:15 +00:00
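One way to read "size estimation via mapping size" is to take the distance from the index buffer's address to the end of the mapping that contains it as an upper bound. The sketch below shows that idea only; `Mapping` and `EstimateSizeFromMapping` are hypothetical names, and the flat list of mappings stands in for whatever address space structure is actually used.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of a mapped region of guest GPU address space.
struct Mapping {
    std::uint64_t base;
    std::uint64_t size;
};

// Returns the number of bytes between `address` and the end of the mapping
// that contains it, or 0 if the address is not mapped. The result is an
// upper bound on the buffer size, not its real size.
std::uint64_t EstimateSizeFromMapping(const std::vector<Mapping> &mappings, std::uint64_t address) {
    for (const auto &mapping : mappings)
        if (address >= mapping.base && address - mapping.base < mapping.size)
            return mapping.size - (address - mapping.base);
    return 0;
}
```

For a mapping at 0x1000 of size 0x2000, an index buffer starting at 0x1800 would be estimated at 0x1800 bytes.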
Billy Laws
0deff5e37a Set a higher perfetto size hint to avoid packet loss 2023-03-19 13:52:15 +00:00
Billy Laws
4bb2a41594 Use usagetracker to determine if pushbuffers need to flush the GPU 2023-03-19 13:52:15 +00:00
Billy Laws
090151f0c3 Introduce usage tracker for dirty tracking within an execution
This is necessary as e.g. shaders can be updated through a mirror and never hit modification traps. By tracking which addresses have sequenced writes applied, the shader manager can then correctly detect if a given shader has been modified by the GPU.
2023-03-19 13:52:15 +00:00
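A minimal sketch of the tracking idea described here, assuming writes are recorded as address ranges and consumers ask whether a range overlaps any of them. The class and method names are invented for illustration, and resetting the tracker at the end of an execution is an assumption based on the "within an execution" wording.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical record of one sequenced write, covering [start, end).
struct WriteRange {
    std::uint64_t start;
    std::uint64_t end;
};

// Hypothetical tracker; not the project's usagetracker implementation.
class SequencedWriteTracker {
  public:
    // Records that [start, end) had a sequenced write applied
    void AddSequencedWrite(std::uint64_t start, std::uint64_t end) {
        writes.push_back({start, end});
    }

    // Returns true if [start, end) overlaps any recorded write, e.g. to
    // decide whether a shader's backing memory may have changed
    bool Overlaps(std::uint64_t start, std::uint64_t end) const {
        for (const auto &write : writes)
            if (start < write.end && write.start < end)
                return true;
        return false;
    }

    // Assumed to be called once the execution finishes
    void Reset() {
        writes.clear();
    }

  private:
    std::vector<WriteRange> writes;
};
```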
Billy Laws
f64860c93e Commonise buffer interval list code
This will be reused for usagetracker.
2023-03-19 13:52:15 +00:00
Billy Laws
179363a5e7 Fix depth-stencil formats 2023-03-14 23:22:12 +00:00
PabloG02
b4280a61ac Stub IApplicationFunctions::GetSaveDataSize 2023-03-11 18:27:36 +00:00
Billy Laws
6d582566f9 Use AdaptiveSingleWaiterConditionVariable for thread scheduling
Some games, for example PGLE, have heavy contention in code that locks mutexes for only a brief period of time. This heavy contention over multiple threads results in futex latency (often ~20us) impacting performance heavily. Using an adaptive condition variable helps to reduce this latency.
2023-03-11 18:26:02 +00:00
Billy Laws
b1e57bc7bc Introduce adapting condition variable class
By spin waiting for a small period before falling back to an actual condition variable, some of the overheads inherent to futexes can be avoided. The constants used were tuned for optimal performance on 8G1 on Skyrim and PGLE.
2023-03-11 18:26:02 +00:00
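The spin-then-block approach described above can be sketched with standard library primitives. This is an illustration under assumptions, not the class introduced by the commit: `SpinThenBlockEvent`, its single-waiter restriction, and the 50 microsecond default spin budget are all made up for the example, and the real constants were tuned on hardware.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical single-waiter event that spins briefly before blocking.
class SpinThenBlockEvent {
  public:
    // The default spin budget is an arbitrary placeholder value
    explicit SpinThenBlockEvent(std::chrono::microseconds spinBudget = std::chrono::microseconds{50})
        : spinBudget(spinBudget) {}

    void Signal() {
        {
            // Setting the flag under the lock prevents a lost wakeup if the
            // waiter is between its predicate check and going to sleep
            std::lock_guard<std::mutex> lock{mutex};
            signalled.store(true, std::memory_order_release);
        }
        condition.notify_one();
    }

    // Returns true if the signal was caught while spinning, false if the
    // waiter had to fall back to the condition variable
    bool Wait() {
        const auto deadline = std::chrono::steady_clock::now() + spinBudget;
        do {
            if (signalled.exchange(false, std::memory_order_acquire))
                return true;
            std::this_thread::yield();
        } while (std::chrono::steady_clock::now() < deadline);

        std::unique_lock<std::mutex> lock{mutex};
        condition.wait(lock, [this] { return signalled.load(std::memory_order_acquire); });
        signalled.store(false, std::memory_order_relaxed);
        return false;
    }

  private:
    const std::chrono::microseconds spinBudget;
    std::atomic<bool> signalled{false};
    std::mutex mutex;
    std::condition_variable condition;
};
```

When the signal arrives within the spin budget, the waiter returns without ever sleeping on the condition variable, which is where the latency saving would come from.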
TheASVigilante
444e35e34f Fix swizzling regression + minor optimizations to swizzling 2023-03-06 21:56:31 +00:00
TheASVigilante
3e1db818cf Address review 2023-03-06 21:56:31 +00:00
TheASVigilante
caf1abbe31 Fix 3D swizzled copies & a small bug with DMA clears 2023-03-06 21:56:31 +00:00
TheASVigilante
70ee36e85c Add support for 1D remapped buffer clears 2023-03-06 21:56:31 +00:00
TheASVigilante
4c3fed6cd0 Hookup various DMA engine features
The DMA engine now supports these additional functions: pitch (to pitch) copies, subrect copies, split copies.
2023-03-06 21:56:31 +00:00
TheASVigilante
fd205ff0a9 Implement rest of I2M engine copies 2023-03-06 21:56:31 +00:00
TheASVigilante
72c2d94cbe Implement subrect copies 2023-03-06 21:56:31 +00:00
TheASVigilante
df0fd88991 Implement pitch swizzled copies 2023-03-06 21:56:31 +00:00
TheASVigilante
5c4bb1c44e Fix incorrect remapping register layout 2023-03-06 21:56:31 +00:00
Billy Laws
750dfb8f00 Disable extended dynamic state on <r42 mali drivers 2023-03-04 18:55:44 +00:00
Billy Laws
acf118155d Submit an execution on invalidate{Sampler,TextureHeader}Cache accesses 2023-03-04 18:55:44 +00:00