Despite the trampoline size being hardcoded, it was previously dynamic and could change based off of the value stored in the target register potentially leading to instructions being missed.
The previous method would cause OOB reads for the last row to clamp, and adding an extra row would potentially encounter unmapped memory. So use this technique based on how Ryu does it.
These are mostly implemented how you would expect, however as opposed to copying out query pool results immeditely, doing so is delayed until the RP end in order to avoid splits.
The yuzu audio_core code is mostly untouched, with a set of wrappers used to bridge it with skyline kernel primitives. Huge thanks to maide and their advice, whom without this wouldn't have been possible.
Now usagetracker is properly in place, indirect draw HLE can be used without requiring any hacks. Dirtiness is now ignored when fetching macro arguments, and it's now the duty of the HLE impls themselves to perform flushing if they require it.
This still requires usagetracker to avoid redundantly performing indirect draws when the memory isn't dirty, and to allow for using it with direct memory, but it's a start.
Indirect draws are implemented by having the macro arguments overflow into a seperate GP Entry that points directly to the indirect argument buffer. To HLE indirect draws a buffer needs to be created from this pointer, and it cannot be dereferenced on the CPU at any point to avoid hitting traps.
In the cases of indirect draws, we don't know the vertex offset to write into the driver info constant buffer ahead of time, and to do it at draw time on the GPU would mean marking the constant buffer as GPU dirty (slow). HLE them in the shader instead using the host draw parameters extension.
When GPU crashes aren't reproducable in renderdoc, it helps to have someway to figure out what exactly is going on when a crash happens or what operation caused it. Add a checkpoint system that reports the GPU execution state in perfetto in time with actual GPU execution, and use flow events to show the event's path through execution, vulkan record and executor record stages.
This is neccessary as e.g. shaders can be updated through a mirror and never hit modification traps. By tracking which addresses have sequenced writes applied, the shader manager can then correctly detect if a given shader has been modified by the GPU.