The yuzu audio_core code is mostly untouched, with a set of wrappers used to bridge it with skyline kernel primitives. Huge thanks to maide and their advice, whom without this wouldn't have been possible.
Now usagetracker is properly in place, indirect draw HLE can be used without requiring any hacks. Dirtiness is now ignored when fetching macro arguments, and it's now the duty of the HLE impls themselves to perform flushing if they require it.
This still requires usagetracker to avoid redundantly performing indirect draws when the memory isn't dirty, and to allow for using it with direct memory, but it's a start.
Indirect draws are implemented by having the macro arguments overflow into a seperate GP Entry that points directly to the indirect argument buffer. To HLE indirect draws a buffer needs to be created from this pointer, and it cannot be dereferenced on the CPU at any point to avoid hitting traps.
In the cases of indirect draws, we don't know the vertex offset to write into the driver info constant buffer ahead of time, and to do it at draw time on the GPU would mean marking the constant buffer as GPU dirty (slow). HLE them in the shader instead using the host draw parameters extension.
When GPU crashes aren't reproducable in renderdoc, it helps to have someway to figure out what exactly is going on when a crash happens or what operation caused it. Add a checkpoint system that reports the GPU execution state in perfetto in time with actual GPU execution, and use flow events to show the event's path through execution, vulkan record and executor record stages.
This is neccessary as e.g. shaders can be updated through a mirror and never hit modification traps. By tracking which addresses have sequenced writes applied, the shader manager can then correctly detect if a given shader has been modified by the GPU.
Some games, for example PGLE, have heavy contention in code that locks mutexes for only a brief period of time. This heavy contention over multiple threads results in futex latency (often ~20us) impacting performance heavily. Using an adaptive condition variable helps to reduce this latency.
By spin waiting for a small period before falling back to an actual condition variable, some of the overheads inherent to futex's can be avoided. The used constants were tuned for optimal performance on 8G1 on Skyrim and PGLE.
This helps to prevent issues that result from the overlapping of buffer and texture data, by only ever syncing back textures if they are actually used as RTs, which are much less likely to overlap buffers.
This commit reverts PR #2037. Passing `NativeSettings` to emulation code through a member reference, instead of a local variable, caused unpredictable crashes when using custom GPU drivers (v615+) on some Qualcomm SoCs.
The exact cause of the issue remains unknown, my best guess is that it was caused by an incorrect optimization performed on the Kotlin bytecode in release mode, which caused an issue when reading memory that had been forked, because of running emulation in a separate process.
Runtime settings modification will be reimplemented in the future via an alternative method.
Indirect layers are used by the game to render layer on its own, the game allocates a buffer with the size from `GetIndirectLayerImageRequiredMemoryInfo` and uses `GetIndirectLayerImageMap` to draw the applet contents into the buffer.
As we don't LLE applet implementations nor do our HLE implementations draw equivalent applets, we cannot submit this to the guest. As a result, these functions are stubbed with the framebuffer being cleared to red.
Stubbing these functions allows titles such as Dark Souls to not crash while initializing indirect layers.