Commit Graph

160 Commits

Author SHA1 Message Date
Billy Laws
01febe75c4 Reimplement audout and audren using yuzu audio_core
The yuzu audio_core code is mostly untouched, with a set of wrappers used to bridge it with skyline kernel primitives. Huge thanks to maide for their advice, without whom this wouldn't have been possible.
2023-03-27 22:31:14 +01:00
Billy Laws
f64860c93e Commonise buffer interval list code
This will be reused for usagetracker.
2023-03-19 13:52:15 +00:00
Billy Laws
b1e57bc7bc Introduce adapting condition variable class
By spin waiting for a short period before falling back to an actual condition variable, some of the overheads inherent to futexes can be avoided. The constants used were tuned for optimal performance on 8G1 on Skyrim and PGLE.
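
A minimal sketch of the idea, not the class from this commit: the name `AdaptiveEvent`, the yield inside the spin loop and the 50 microsecond default budget are all assumptions made for illustration, and the tuned constants mentioned above are not reproduced here.

```c++
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot event: spin on the flag for a short budget, then fall
// back to a blocking std::condition_variable wait.
class AdaptiveEvent {
  public:
    // Sets the flag under the mutex so a blocked waiter cannot miss the wakeup.
    void Signal() {
        {
            std::lock_guard lock{mutex};
            signalled.store(true, std::memory_order_release);
        }
        cv.notify_all();
    }

    // Returns true if the spin phase observed the signal, false if the call
    // fell through to the condition variable path.
    bool Wait(std::chrono::microseconds spinBudget = std::chrono::microseconds{50}) {
        const auto deadline = std::chrono::steady_clock::now() + spinBudget;
        do {
            if (signalled.load(std::memory_order_acquire))
                return true;
            std::this_thread::yield();
        } while (std::chrono::steady_clock::now() < deadline);

        std::unique_lock lock{mutex};
        cv.wait(lock, [this] { return signalled.load(std::memory_order_acquire); });
        return false;
    }

  private:
    std::mutex mutex;
    std::condition_variable cv;
    std::atomic<bool> signalled{false};
};
```

When the signal usually arrives within the spin budget, the waiter never touches the mutex or the condition variable at all, which is where the saving would come from.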
2023-03-11 18:26:02 +00:00
Billy Laws
7150ce0d1d Allow disabling the freeing of texture guest memory
This helps to prevent issues that result from the overlapping of buffer and texture data, by only ever syncing back textures if they are actually used as RTs, which are much less likely to overlap buffers.
2023-03-04 18:55:44 +00:00
lynxnb
180d1efd4d Revert "Toggle DisableFrameThrottling setting by clicking on FPS"
This commit reverts PR #2037. Passing `NativeSettings` to emulation code through a member reference, instead of a local variable, caused unpredictable crashes when using custom GPU drivers (v615+) on some Qualcomm SoCs.
The exact cause of the issue remains unknown. My best guess is that an incorrect optimization performed on the Kotlin bytecode in release mode caused an issue when reading memory that had been forked, since emulation runs in a separate process.
Runtime settings modification will be reimplemented in the future via an alternative method.
2023-02-27 19:00:52 +01:00
lynxnb
fc9b34846c Fix KtSettings JNI usage
* Use a global ref for NativeSettings JNI instance
* Always use the JNI env from the JNI call to ensure it's safe to use in the current thread
2023-02-20 21:45:30 +00:00
Billy Laws
a47f010653 Add an option to allow CPU writes when fast readback is used 2023-02-20 18:01:49 +00:00
Billy Laws
6ea1483c9a Fix a race with multiple threads pushing data into circular queue
If the 'end' changes while waiting for data to be consumed, our 'waitNext' would be invalid, leading to a deadlock.
2023-02-20 18:01:49 +00:00
Billy Laws
12c88babd0 Fix address space allocator slow path to avoid OOB 2023-02-04 23:10:45 +00:00
Billy Laws
bb3baa888d Add a hack to disable shader subgroup shuffles
These are about 100x more expensive on Adreno than on NVIDIA due to the lack of a dedicated instruction. Since some games work fine without them, add a hack to disable them.
2023-02-04 23:10:45 +00:00
PabloG02
8b9d6f79ab Add option to enable/disable shader cache 2023-01-28 11:57:19 +00:00
Billy Laws
6b9be2edd4 Add note about circular queue append contiguity guarantees 2023-01-20 21:19:04 +00:00
Billy Laws
85a23e73ba Implement a shared spinlock and use it for GPU VMM 2023-01-20 21:07:59 +00:00
Billy Laws
262f92900d Ensure unmapped VMM ranges return an invalid span 2023-01-20 21:07:59 +00:00
Billy Laws
2f6d27e8d7 Rework circular queue locking
Should now be (hopefully) race-free. Also switch to a spinlock to avoid any locking overhead.
2023-01-20 21:07:59 +00:00
lynxnb
5d527cb965 Add CNTFRQ_EL0 workaround value for Exynos 1280 2023-01-15 10:16:01 +00:00
Billy Laws
4e5141f879 Fix missed attempt increment in spinlock
Should hog CPU slightly less and correctly yield now
2023-01-08 19:30:52 +00:00
Billy Laws
35a46acbb1 Determine storage buffer alignment dynamically 2023-01-08 19:30:52 +00:00
Billy Laws
12d80fe6c2 Use a shared mutex for GPU VMM to avoid deadlocks
Two reads need to be able to occur simultaneously or deadlocks can occur (e.g. a read traps to wait on the GPU but the GPU needs to read).
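
A minimal sketch of the locking pattern, assuming a hypothetical `AddressMap` class in place of the real GPU VMM: translations take a shared lock so that several readers can hold it at once, while mapping takes an exclusive lock.

```c++
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <optional>
#include <shared_mutex>

// Hypothetical stand-in for a GPU virtual memory map guarded by a reader/writer lock.
class AddressMap {
  public:
    // Writers take the lock exclusively.
    void Map(std::uint64_t virt, std::uint64_t size, std::uint64_t phys) {
        assert(size != 0);
        std::unique_lock lock{mutex};
        ranges[virt] = Range{size, phys};
    }

    // Readers take the lock in shared mode, so two translations can proceed at
    // the same time. Unmapped addresses yield std::nullopt.
    std::optional<std::uint64_t> Translate(std::uint64_t virt) const {
        std::shared_lock lock{mutex};
        auto it = ranges.upper_bound(virt);
        if (it == ranges.begin())
            return std::nullopt;
        --it;
        const std::uint64_t offset = virt - it->first;
        if (offset >= it->second.size)
            return std::nullopt;
        return it->second.phys + offset;
    }

  private:
    struct Range {
        std::uint64_t size;
        std::uint64_t phys;
    };

    mutable std::shared_mutex mutex;
    std::map<std::uint64_t, Range> ranges; // keyed by virtual base address
};
```

With a plain `std::mutex`, a reader that blocks while holding the lock would also block every other reader, which is the deadlock shape described above.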
2023-01-08 19:30:52 +00:00
Billy Laws
28b2a7a8a1 Dynamically apply GPU turbo clocks only when GPU submissions are queued
Allows for the GPU to clock down in cases where it's idle for most of the time, while still forcing maximum clocks when we care.
2023-01-08 19:30:52 +00:00
Billy Laws
3d31ade35f Implement an alternative buffer path using direct memory importing
By importing guest memory directly onto the host GPU we can avoid many of the complexities that occur with memory tracking, as well as its heavy performance overhead in some situations. Since it's still desirable to support the traditional buffer method, as it's faster in some cases and more widely supported, most of the exposed buffer methods have been split into two variants with just a small amount of shared code. While in most cases the code is simpler, one area with more complexity is handling CPU accesses that need to be sequenced, since there is no place where writes can easily be applied on the GPFIFO thread that won't also impact the buffer on the GPU. To solve this, when the GPU is actively using a buffer's contents, an interval list is used to keep track of any GPFIFO-written regions on the CPU, and any CPU reads to them will instead be directed to a shadow of the buffer with just those writes applied. Once the GPU has finished using the buffer contents the shadow can then be removed, as all writes will have been done by the GPU.

The main caveat of this is that it requires tying host sync to guest sync. This can reduce performance in games which double buffer command buffers, as it prevents us from fully saturating the CPU with the GPFIFO thread.
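
The read redirection described above can be sketched roughly as follows. `IntervalList`, `ReadSource` and `SelectReadSource` are hypothetical names chosen for this example and do not mirror the actual buffer code; the sketch only shows how an interval list of written regions could decide whether a CPU read goes to the shadow.

```c++
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Hypothetical interval list holding disjoint half-open [start, end) ranges,
// merging overlapping or touching ranges on insert.
class IntervalList {
  public:
    void Insert(std::size_t start, std::size_t end) {
        assert(start < end);
        auto it = intervals.upper_bound(start);
        // Step back if the previous interval reaches 'start'
        if (it != intervals.begin() && std::prev(it)->second >= start)
            --it;
        // Absorb every interval that overlaps or touches the new one
        while (it != intervals.end() && it->first <= end) {
            start = std::min(start, it->first);
            end = std::max(end, it->second);
            it = intervals.erase(it);
        }
        intervals.emplace(start, end);
    }

    bool Intersects(std::size_t start, std::size_t end) const {
        auto it = intervals.lower_bound(end); // first interval starting at or after 'end'
        return it != intervals.begin() && std::prev(it)->second > start;
    }

    void Clear() { intervals.clear(); }

    std::size_t Count() const { return intervals.size(); }

  private:
    std::map<std::size_t, std::size_t> intervals; // start -> end
};

enum class ReadSource { GuestMemory, Shadow };

// While the GPU is using the buffer, reads that touch GPFIFO-written regions
// are served from the shadow; everything else reads guest memory directly.
inline ReadSource SelectReadSource(const IntervalList &gpfifoWrites, bool gpuInUse,
                                   std::size_t offset, std::size_t size) {
    if (gpuInUse && gpfifoWrites.Intersects(offset, offset + size))
        return ReadSource::Shadow;
    return ReadSource::GuestMemory;
}
```

Once the GPU has finished with the buffer, calling `Clear()` would correspond to dropping the shadow.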
2023-01-08 19:30:52 +00:00
Billy Laws
c67f27e914 Add a setting to control the maximum number of accumulated GPU cmds
This helps to keep the GPU fed when processing large command buffers which don't have any syncpoints to force a flush in between.
2023-01-08 19:30:52 +00:00
PabloG02
80c0f8f04d Implement full profile picture support
Extends the profile picture stub into a full-fledged implementation with the ability for users to set their profile picture in settings while having the Skyline icon as the default profile picture.
2022-12-27 22:53:41 +05:30
Billy Laws
bba07fb101 Update for new hades 2022-12-03 22:50:56 +00:00
Billy Laws
579a2d9337 Add dynamic executor slot growth 2022-12-03 22:50:56 +00:00
Billy Laws
bfae292fb0 Make executor slot count setting exponential 2022-12-03 22:50:56 +00:00
Billy Laws
281838fde1 Apply GPU readback hack to both buffers and textures
And rename as appropriate.
2022-12-03 22:50:56 +00:00
Dima
e8e1b910c3 Add possibility to disable audio output 2022-12-02 00:33:28 +01:00
lynxnb
70109f8fbd Work around invalid values in CNTFRQ_EL0 register
Exynos SoCs have a bug where the `CNTFRQ_EL0` register is either set to 0 or contains incoherent values. With this patch, the frequency value is loaded into a static variable and used instead of reading the register. The value will be initialised to the correct value for affected SoCs, while unaffected ones will use the value from the register.
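
A rough sketch of the approach, under stated assumptions: the function names, the `socAffected` flag and the `knownGoodValue` parameter are hypothetical, and how affected SoCs are detected is not shown. On non-AArch64 hosts the register read is stubbed out to return 0 so that the sketch still compiles.

```c++
#include <cassert>
#include <cstdint>

// Hypothetical register read: uses CNTFRQ_EL0 on AArch64, returns 0 elsewhere.
inline std::uint64_t ReadCntfrqRegister() {
#if defined(__aarch64__)
    std::uint64_t frequency;
    asm volatile("mrs %0, cntfrq_el0" : "=r"(frequency));
    return frequency;
#else
    return 0;
#endif
}

// Picks the value to cache: a known-good override on affected SoCs, otherwise
// whatever the register reports.
constexpr std::uint64_t SelectFrequency(std::uint64_t registerValue, bool socAffected,
                                        std::uint64_t knownGoodValue) {
    return socAffected ? knownGoodValue : registerValue;
}

// The frequency is computed once on first call and served from the static
// afterwards; later arguments are ignored.
inline std::uint64_t GetCounterFrequency(bool socAffected, std::uint64_t knownGoodValue) {
    static const std::uint64_t frequency{
        SelectFrequency(ReadCntfrqRegister(), socAffected, knownGoodValue)};
    return frequency;
}
```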
2022-12-02 00:23:28 +01:00
Billy Laws
e1bbd521d9 Fix potential circular queue submission race
If a producer thread was waiting for the queue to have free space and the consumer thread hadn't yet acquired the production mutex, a deadlock could occur.
2022-11-19 12:49:05 +00:00
Billy Laws
01e27bd2dd Implement ldr:ro LoadModule 2022-11-15 16:23:40 +00:00
PixelyIon
f4a8328cef Implement Symbol Hooking
Symbol hooking is required for future HLE implementations of certain features such as `nvdec`, and for more in-depth debugging of games, as we can inspect them at an SDK function level, which allows us to debug issues far more easily.
2022-11-07 23:56:22 +05:30
Billy Laws
9055c98e09 Only enable debug/verbose logs in (rel)debug builds 2022-11-02 17:46:07 +00:00
Billy Laws
6c0f084aae Introduce hack to ignore frequently read-back textures
Readback can be especially slow on mobile due to the varying load pattern it creates, which often prevents the CPU/GPU from clocking up. Since some games perform texture readback but don't actually use it for anything significant, implement a hack to skip it and significantly improve performance in such cases.
2022-11-02 17:46:07 +00:00
Billy Laws
576bc6f37e Add CommandExecutor slot count setting 2022-11-02 17:46:07 +00:00
Billy Laws
77d76ed05a Batch contiguous GMMU ranges into one 2022-11-02 17:46:07 +00:00
Billy Laws
f5a141a621 Add dirty resource operator* 2022-11-02 17:46:07 +00:00
Billy Laws
2cdf6c1fe6 Add branch hint attributes to a couple branches 2022-11-02 17:46:07 +00:00
Billy Laws
b310b99bdc Handle unmapped ranges in TranslateRange 2022-11-02 17:46:07 +00:00
Billy Laws
f1600f5ad0 Support allocating into spans in the linear allocator 2022-11-02 17:46:07 +00:00
Billy Laws
054d32567d Allow mutation of input data by callback in CircularQueue::AppendTranform 2022-11-02 17:46:07 +00:00
Billy Laws
379b4f163d Implement popping from CircularQueue 2022-11-02 17:46:07 +00:00
Billy Laws
6e22373b59 Add array support to AllocateUntracked 2022-11-02 17:46:07 +00:00
Billy Laws
302b2fcc3f Force flush when dirty refresh returns true 2022-11-02 17:46:07 +00:00
Billy Laws
19a75c3f65 Bind all pipeline states to main pipeline dirty state 2022-11-02 17:46:07 +00:00
Billy Laws
fe51db366b Mark all dirty resources as dirty initially 2022-11-02 17:46:07 +00:00
Billy Laws
d7eab40f1c Introduce resource based dirty tracking infrastructure
This will be heavily used by the upcoming GPU rework. It provides an intuitive way to track dirtiness based on using the underlying pointers of objects, as opposed to other methods which often need an enum entry per dirty state and don't support overlaps. Wrappers for dirty state objects are also provided to abstract as much of the dirty tracking as possible from user code. The pointer based mechanism also serves to avoid having to handle dirty bindings on the user side of the dirty resources, allowing them to bind things internally instead.
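
A heavily simplified sketch of pointer-keyed dirty tracking, assuming a hypothetical `DirtyManager` that maps the address of a piece of state to plain `bool` flags. The real infrastructure and its wrapper types are more involved; this only illustrates keying on pointers rather than on an enum entry per dirty state.

```c++
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical manager: dirty flags are keyed by the address of the state they
// depend on, so no enum entry per dirty state is needed.
class DirtyManager {
  public:
    // Binds 'flag' to 'resource'. The flag starts dirty so that the first use
    // refreshes it. Several flags may be bound to the same resource.
    void Bind(bool &flag, const void *resource) {
        flag = true;
        bindings[resource].push_back(&flag);
    }

    // Marks every flag bound to 'resource' as dirty.
    void MarkDirty(const void *resource) {
        auto it = bindings.find(resource);
        if (it == bindings.end())
            return;
        for (bool *flag : it->second)
            *flag = true;
    }

  private:
    std::unordered_map<const void *, std::vector<bool *>> bindings;
};
```

A register write handler would then call `MarkDirty` with the address of the register it just wrote, without needing to know which states depend on it.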
2022-11-02 17:46:07 +00:00
Billy Laws
8471ab754d Introduce a spin lock for resources locked at a very high frequency
Constant buffer updates result in a barrage of std::mutex calls that take a lot of time even under no contention (around 5%). Using a custom spinlock in cases like these allows the locking code to be inlined, reducing the cost of locks under no contention to almost 0.
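
A minimal sketch of such a lock, not the implementation from this commit: the class layout, the `SpinsBeforeYield` constant and the yield policy are assumptions. The uncontended acquire is a single inlined atomic exchange, and only the contended path enters the loop.

```c++
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical spinlock usable with std::lock_guard and std::scoped_lock.
class SpinLock {
  public:
    void lock() {
        // Fast path: a single atomic exchange when the lock is free
        if (!locked.exchange(true, std::memory_order_acquire))
            return;
        LockSlow();
    }

    bool try_lock() {
        return !locked.exchange(true, std::memory_order_acquire);
    }

    void unlock() {
        locked.store(false, std::memory_order_release);
    }

  private:
    // Contended path: spin, yielding to the scheduler every few attempts
    void LockSlow() {
        constexpr int SpinsBeforeYield{64}; // arbitrary illustrative value
        int attempt{};
        while (locked.exchange(true, std::memory_order_acquire)) {
            if (++attempt >= SpinsBeforeYield) {
                std::this_thread::yield();
                attempt = 0;
            }
        }
    }

    std::atomic<bool> locked{false};
};
```

Because the class provides `lock`, `try_lock` and `unlock`, it can be dropped in wherever a `std::mutex` was used with the standard lock guards.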
2022-11-02 17:46:07 +00:00
Billy Laws
09f376e500 Add const accessors to OffsetMember 2022-11-02 17:46:07 +00:00
Billy Laws
64a9db2e82 Introduce MergeInto helper for simplified construction of arrays of structs
In the upcoming GPU code each state member will hold a reference to its corresponding Maxwell 3D regs. This helper is needed to allow easy transformation from the main 3D register struct into them.

Example:
```c++
struct Regs {
    std::array<View, 10> viewRegs;
    u32 enable;
} regs;

struct ViewState {
    const View &view;
    const u32 &enable;
    size_t index;
};

std::array<ViewState, 10> viewStates{MergeInto<ViewState, 10>(regs.viewRegs, regs.enable, IncrementingT{})};
```
2022-11-02 17:46:07 +00:00