Commit Graph

1276 Commits

Author SHA1 Message Date
PixelyIon
38d3ff4300
Fix BufferManager::FindOrCreate Recreation Bugs
It was determined that `FindOrCreate` has several issues which this commit fixes:
* It wouldn't correctly handle locking of the newly created `Buffer` as the constructor would setup traps prior to being able to lock it which could lead to UB
* It wouldn't propagate the `usedByContext`/`everHadInlineUpdate` flags correctly
* It wouldn't correctly set the `dirtyState` of the buffer according to that of its source buffers
2022-08-06 22:20:54 +05:30
Billy Laws
d1a682eace
Fix setDirty behavior in Buffer::SynchronizeGuest
The condition for `setDirty` in the dirty state CAS was inverted from what it should've been resulting in synchronizing incorrectly, this commit fixes the condition to correct synchronization.
2022-08-06 22:20:54 +05:30
Billy Laws
00d434efdc
Remove Texture::CopyFrom format check
The formats of the textures involved in a texture were checked for equality, this broke certain copies as the presentation engine would invoke copies between textures of different yet compatible formats.

Co-authored-by: PixelyIon <pixelyion@protonmail.com>
2022-08-06 22:20:54 +05:30
Billy Laws
58174f255f
Improve ContextLock semantics
`ContextLock` had unoptimal semantics in the form of direct access to the `isFirst` member which wasn't clearly defined, it's now been broken up into function calls `IsFirstUsage` and `OwnsLock` with explicit move semantics and a function for releasing the lock.

Co-authored-by: PixelyIon <pixelyion@protonmail.com>
2022-08-06 22:20:54 +05:30
Billy Laws
561103d3da
Submit GPFIFO work prior to CircularQueue waiting
The position at which we call submit is a significant factor in performance and we did so at the end of PBs (PushBuffers), this isn't optimal as there could be multiple PBs queued up that would benefit from being in the same submission. We now delay the submission of the workload till we run out of PBs.
2022-08-06 22:20:54 +05:30
PixelyIon
3ac5ed8c06
Attach coalesced Buffer if any source Buffer is attached
A buffer that's attached to a context could be coalesced into a larger buffer which isn't attached, this would break as it wouldn't keep the buffer alive till the end of the associated context. To fix this if any source buffers are attached then the resulting coalesced buffer is also attached now.
2022-08-06 22:20:54 +05:30
PixelyIon
284ac53d88
Fix KThread Priority Inheritance CAS
The CAS condition for KThread PI was inverted which lead to entirely incorrect behavior for CAS conditions which while it might work in the vast majority of cases would lead to significantly inaccurate behavior.
2022-08-06 22:20:54 +05:30
PixelyIon
45cb8388cc
Fix NCE Trap API Lock Callback
The lock callback would `continue` which would end up skipping over the current item as it applied to the inner loop rather than the outer loop as intended. This has now been fixed by using `break` and a check instead.
2022-08-06 22:20:54 +05:30
PixelyIon
745d809e07
Fix Buffer::SynchronizeGuest Non-Blocking Behavior
The buffer's non-blocking behavior could lead to an invalid state where the dirty state doesn't adequately represent the buffer's true state, the check has now been moved inside the CAS loop as its behavior changes depending on the dirty state. In addition, `SynchronizeGuest` returns a boolean denoting if the synchronization was successful now to make code flows depending on non-blocking synchronization cleaner.
2022-08-06 22:20:54 +05:30
PixelyIon
c1f2445772
Set state to CpuDirty directly in SynchronizeGuest
`SynchronizeGuest` could only set the dirty state to `Clean` which was redundant since calls to it from inside the write trap handler would set it to `CpuDirty` directly after, this fixes that by doing it inside the function when necessary.
2022-08-06 22:20:54 +05:30
PixelyIon
4f6a67af36
Fix Texture Trap Data Race
The trap callbacks did not wait on the `Texture` to complete synchronization to the guest, this resulted in races where the contents written to the texture would be overwritten by the synced content. This commit fixes that by waiting on the fences at the end of the trap callback.
2022-08-06 22:20:54 +05:30
PixelyIon
cb7c3602e7
Attach TextureView to FenceCycle
The lifetime of `TextureView` objects wasn't correctly managed as they weren't being attached the the `FenceCycle` in `AttachTexture`, this led to them getting deleted and causing all sorts of UB.
2022-08-06 22:20:54 +05:30
PixelyIon
ffaefc82d3
Call all flush callbacks prior to CommandExecutor submission
The flush callbacks inside `CommandExecutor` weren't being called prior to submission as they should've been, this fixes that by calling them. It additionally removes the requirement to manually flush Maxwell3D at the end of `ChannelGpfifo` pushbuffers as it's a flush callback and will automatically be called by `Submit`.

Co-authored-by: Billy Laws <blaws05@gmail.com>
2022-08-06 22:20:54 +05:30
PixelyIon
e65707cd9d
Handle CommandExecutor submission at end of ChannelGpfifo PB
Any work that was done in a `ChannelGpfifo` pushbuffer needs to be submitted at the end of it, if it isn't done then the work might incorrectly be not done till the next submission. This commit fixes it by calling `CommandExecutor::Submit` at the end of a pushbuffer, submitting any buffers that would've been left over.

Co-authored-by: Billy Laws <blaws05@gmail.com>
2022-08-06 22:20:54 +05:30
PixelyIon
7b209c54a2
Only reallocate MegaBuffer on usage
Certain submissions might not utilize megabuffering but reserve a `MegaBuffer` regardless, this is not optimal since it can inflate the allocations and waste memory. This commit addresses the issue by eliding the allocation given the current submission doesn't utilize them.
2022-08-06 22:20:54 +05:30
PixelyIon
2366f81443
Fix Buffer::PollFence incorrectly handling null-FenceCycle
If a `FenceCycle` isn't attached then `PollFence` returned `false` while it should return if the buffer has any concurrent GPU usages in flight, this has now been fixed by returning `true` in those cases.
2022-08-06 22:20:54 +05:30
PixelyIon
34e1e39d1c
Always reset all attached resources on Submit
Certain resources can be attached to an empty `Submit` with no nodes, this can cause it to become a false dependency and not be removed till the next non-empty submission. This has now been fixed by doing a reset regardless of if any nodes exist.
2022-08-06 22:20:54 +05:30
PixelyIon
47db8e8cbc
Fix GPU inline copy callback for Buffer::Write
The GPU inline copy callback was broken for `Buffer::Write` as it wasn't always called when it needed to be and didn't handle attaching of the buffer to the executor which would cause it to be unlocked. This commit addresses both of these issues, it introduces a `AttachLockedBuffer` method to attach an already locked buffer to the executor.
2022-08-06 22:20:54 +05:30
PixelyIon
2636a37b31
Introduce alternative FPS measurement for disabled frame throttling
The FPS is implicitly bound to the refresh rate due to the timestamp being that of the presentation time, this leads to a misleading FPS figure for disabled frame throttling. It has now been fixed by using the frame submission time rather than the presentation time when frame throttling is disabled and to make this more apparent the color of the OSD FPS has been changed.
2022-08-06 22:20:54 +05:30
PixelyIon
0f56d01e58
Fix Packed format component ordering in IsAdrenoAliasCompatible
All `Packed` formats have their components stored in the opposite ordering to the label, this was not followed for `IsAdrenoAliasCompatible` prior and the ordering has now been flipped.
2022-08-06 22:18:42 +05:30
PixelyIon
3ca56ef578
Fix NCE Trapping API Deadlock
A deadlock was caused by holding `trapMutex` while waiting on the lock of a resource inside a callback while another thread holding the resource's mutex waits on `trapMutex`. This has been fixed by no longer allowing blocking locks inside the callbacks and introducing a separate callback for locking the resource which is done after unlocking the `trapMutex` which can then be locked by any contending threads.
2022-08-06 22:18:42 +05:30
PixelyIon
a6599c30b4
Correct IntervalMap insertion end calculation
The `end` pointer for `interval` was incorrectly calculated as `interval.data() + interval.size_bytes()` which would be incorrect when the interval span type is not `u8` as the pointer derived from `interval.data()` would be a pointer to the span type rather than a byte pointer and be subject to arithmetic of that object's size rather than in terms of a byte.
2022-08-06 22:18:42 +05:30
PixelyIon
b0910e7b1a
Avoid locking Texture/Buffer in trap handler
We generally don't need to lock the `Texture`/`Buffer` in the trap handler, this is particularly problematic now as we hold the lock for the duration of a submission of any workloads. This leads to a large amount of contention for the lock and stalling in the signal handler when the resource may be `Clean` and can simply be switched over to `CpuDirty` without locking and utilizing atomics which is what this commit addresses.
2022-08-06 22:18:42 +05:30
PixelyIon
a60d6ec58f
Replace host immutability FenceCycle with GPU usage tracking
We utilized a `FenceCycle` to keep track of if the buffer was mutable or not and introduced another cycle to track GPU-side requirements only on fulfillment of which could the buffer be utilized on the host but due to the recent change in the behavior this system ended up being unoptimal. 

This commit replaces the cycle with a boolean tracking if there are any usages of the resource on the GPU within the current context that may prevent it from being mutated on the CPU. The fence of the context is simply attached to the buffer based off this which was allowed as the new behavior of buffer fences matches all the requirements for this.
2022-08-06 22:18:42 +05:30
PixelyIon
217d484cba
Abstract TextureView/BufferDelegate locking into LockableSharedPtr
An atomic transactional loop was performed on the backing `std::shared_ptr` inside `BufferView`/`TextureView`'s `lock`/`LockWithTag`/`try_lock` functions, these locks utilized `std::atomic_load` for atomically loading the value from the `shared_ptr` recursively till it was the same value pre/post-locking. 

This commit abstracts the locking functionality of `TextureView`/`BufferDelegate` into `LockableSharedPtr` to avoid code duplication and removes the usage of `std::atomic_load` in either case as it is not necessary due to the implicit memory barrier provided by locking a mutex.
2022-08-06 22:18:42 +05:30
PixelyIon
2d08886e4e
Utilize TextureView rather than Texture for presentation
`PresentationEngine` and `GraphicBufferProducer` methods that utilized textures for the surface utilized the `Texture` type rather than the `TextureView` type, this was never correct but at the time of authoring this code `TextureView` was not finalized and in a major flux which is why it was not utilized and `Texture` was utilized instead. Now that is is far more stable, it has been replaced with `TextureView`.
2022-08-06 22:18:42 +05:30
PixelyIon
d7399e33c1
Avoid waiting on mutex in PresentationEngine::Present
We want to block on the host thread during presentation while the host surface isn't present to implicitly pause the game, this can end up being fairly costly as it involves locking the `PresentationEngine` mutex which can lead to a lot of contention with the presentation thread. This fixes the issue by polling if there is a surface and only if there isn't then doing the wait as it isn't mandatory to wait always, we'll eventually run into the guest thread stalling.
2022-08-06 22:18:42 +05:30
PixelyIon
30475ffc43
Fix queueBuffer GraphicBuffer Compatibility Check
Newer versions of the Deko3D homebrew were crashing due to this check and it was discovered that the check was incorrect and rather than comparing the `NvSurface` what had to be compared was the `GraphicBuffer` associated with the slot directly.

Co-authored-by: lynxnb <niccolo.betto@gmail.com>
2022-08-06 22:18:42 +05:30
PixelyIon
c2685d5f5c
Fix consistency issues with external project copyright headers
The copyright headers for external project such as yuzu/Ryujinx were inconsistent in ordering, Skyline should always be the first item in the list. In addition, they didn't always link to the project's GitHub which has also been fixed.
2022-08-06 22:18:42 +05:30
PixelyIon
0ac5f4ce27
Lock TextureManager/BufferManager during submission
Multiple threads concurrently accessing the `TextureManager`/`BufferManager` (Referred to as "resource managers") has a potential deadlock with a resource being locked while acquiring the resource manager lock while the thread owning it tries to acquire a lock on the resource resulting in a deadlock.

This has been fixed with locking of resource manager now being externally handled which ensures it can be locked prior to locking any resources, `CommandExecutor` provides accessors for retrieving the resource manager which automatically handles locking aside doing so on attachment of resources.
2022-08-06 22:18:42 +05:30
PixelyIon
1239907ce8
Rework Texture & Buffer for Context and FenceCycle Chaining
GPU resources have been designed with locking by fences in mind, fences were treated as implicit locks on a GPU, design paradigms such as `GraphicsContext` simply unlocking the texture mutex after attaching it which would set the fence cycle were considered fine prior but are unoptimal as it enforces that a `FenceCycle` effectively ensures exclusivity. This conflates the function of a mutex which is mutual exclusion and that of the fence which is to track GPU-side completion and led to tying if it was acceptable to use a GPU resource to GPU completion rather than simply if it was not currently being used by the CPU which is the function of the mutex.

This rework fixes this with the groundwork that has been laid with previous commits, as `Context` semantics are utilized to move back to using mutexes for locking of resources and tracking the usage on the GPU in a cleaner way rather than arbitrary fence comparisons. This also leads to cleaning up a lot of methods that involved usage of fences that no longer require it and therefore can be entirely removed, further cleaning up the codebase. It also opens the door for future improvements such as the removal of `hostImmutableCycle` and replacing them with better solutions, the implementation of which is broken at the moment regardless.

While moving to `Context`-based locking the question of multiple GPU workloads being in-flight while using overlapping resources came up which brought a fundamental limitation of `FenceCycle` to light which was that only one resource could be concurrently attached to a cycle and it could not adequately represent multi-cycle dependencies. `FenceCycle` chaining was designed to fix this inadequacy and allows for several different GPU workloads to be in-flight concurrently while utilizing the same resources as long as they can ensure GPU-GPU synchronization.
2022-08-06 22:18:42 +05:30
PixelyIon
07d45ee504
Introduce FenceCycle Chaining
If we want to allow submitting multiple pieces of work to the GPU at once while still requiring CPU synchronization, we'll need to track all past fence cycles associated with a resource alongside the current one. To solve this the concept of chaining fences has been introduced, fences from past usages can be chained to the latest fence which'll then recursively forward operations to chained fences.

This change also ends up mandating a move away from `FenceCycleDependency` as it would prevent fences from concurrently locking the same resources which is required for chaining to work as two fences being chained fundamentally means they're locking the same resources. The `AtomicForwardList` is therefore used as the new container.
2022-08-06 22:18:42 +05:30
PixelyIon
cf9e31c1eb
Implement Atomic Forward List
An implementation of a singly-linked list with atomic access to allow for lock-free access semantics, it eliminates the requirement for a mutex which can introduce additional consideration for synchronization.
2022-08-06 22:18:42 +05:30
PixelyIon
6b9269b88e
Introduce Context semantics to GPU resource locking
Resources on the GPU can be fairly convoluted and involve overlaps which can lead to the same GPU resources being utilized with different views, we previously utilized fences to lock resources to prevent concurrent access but this was overly harsh as it would block usage of resources till GPU completion of the commands associated with a resource.

Fences have now been replaced with locks but locks run into the issue of being per-view and therefore to add a common object for tracking usage the concept of "tags" was introduced to track a single context so locks can be skipped if they're from the same context. This is important to prevent a deadlock when locking a resource which has been already locked from the current context with a different view.
2022-08-06 22:18:42 +05:30
PixelyIon
d913f29662
Only set hasFragileUserData for signed builds
We do not want to allow saving of user data on unsigned builds as they don't have a stable signature and will not properly handle reinstallation. This can lead to a situation where the user has to resort to complex techniques to completely uninstall the package such as ADB or calling into PM directly.
2022-08-06 22:18:42 +05:30
PixelyIon
3139889a09
Implement Asynchronous Presentation
We currently present all frames synchronously on the thread that calls into SurfaceFlinger functions, this is unoptimal as it doesn't match guest behavior which can lead to delaying the guest from working on the next frame. This commit queuing up frames to non-blocking and handles all waiting then presenting the frame on a dedicated thread.
2022-08-06 22:18:42 +05:30
PixelyIon
6e09dc5204
Fix thread name setting
We utilize `pthread_setname_np` to set the thread names but didn't check for any errors which resulted in the `Skyline-Choreographer` and `ChannelCmdFifo` not having proper names as they exceeded the 16 character limit on thread names for the pthread function.  This has now been fixed by changing the names and introducing error checking to invocations of this function.
2022-08-06 22:18:42 +05:30
PixelyIon
7a0cfb484c
Add NPOT AlignUp utility
All our normal alignment functions are designed to only handle power of 2 (`POT`) multiples as we only align or check alignment to `POT` multiples but there are cases where this is not possible and we deal with `NPOT` multiples which is why this function is required.
2022-08-06 22:18:42 +05:30
PixelyIon
662ea532d8
Skip waiting on host GPU after command buffer submission
We waited on the host GPU after `Execute` but this isn't optimal as it causes a major stall on the CPU which can lead to several adverse effects such as downclocking by the governor and losing the opportunity to work in parallel with the GPU.

This has now been fixed by splitting `Execute`'s functionality into two functions: `Submit` and `SubmitWithFlush` which both execute all nodes and submit the resulting command buffer to the GPU but flushing will wait on the GPU to complete while the non-flush variant will not wait and work ahead of the GPU.
2022-08-06 22:18:42 +05:30
PixelyIon
5129d2ae78
Add move-assignment semantics to ActiveCommandBuffer/MegaBuffer
We need move-assignment semantics to viably utilize these objects as class members, they cannot be replaced without move-assign (or copy-assign but that is undesirable here). This commit fixes that by introducing a move assignment operator to them while making the `slot` a pointer which has the necessary nullability semantics.
2022-08-06 22:18:42 +05:30
lynxnb
8991ccac65 Pass ViewHolder on bind to RecyclerView items instead of ViewBinding
This change lets items get the updated position of their view holder in the adapter. Fixes an issue where the position of items was not updated after being removed from a `SelectableGenericAdapter`.
2022-08-06 22:00:19 +05:30
lynxnb
bb922100cb Improve rendering for Right-To-Left layouts 2022-08-06 22:00:19 +05:30
lynxnb
240e7033d7 Support loading a user-selected driver during vulkan initialization 2022-08-06 22:00:19 +05:30
lynxnb
c812de48ea Show an undo button after deleting a gpu driver
After a driver has been deleted, a snackbar will be shown confirming the deletion, with an button to undo it.
2022-08-06 22:00:19 +05:30
lynxnb
59c60df993 Add GPU Driver Configuration preference
This preference launches `GpuDriverActivity` for managing custom gpu drivers. When the device has an incompatible GPU, the preference will be disabled and greyed out.
2022-08-06 22:00:19 +05:30
lynxnb
48cf1263bc Add a custom GPU driver configuration activity
The activity adds the following functionalities:
* Lists installed drivers
* Allows the user to install new drivers, or remove installed ones
* Allows the user to select the driver that will be used by the emulator
2022-08-06 22:00:19 +05:30
lynxnb
e9f609b923 Add a gpuDriver preference setting
This setting represent the GPU driver selected by the user to be used by the emulator.
2022-08-06 22:00:19 +05:30
lynxnb
1815199d2b Add utilities for reading and installing gpu driver packages 2022-08-06 22:00:19 +05:30
lynxnb
f3dd3e53c1 Miscellaneous imports cleanup in preference package 2022-08-06 22:00:19 +05:30
lynxnb
1dfea9ef6f Create an ItemDecorations file for all RecyclerView item decorations
All item decorations are now placed in one file so that any `RecyclerView` in the app can use the same ones.
2022-08-06 22:00:19 +05:30
lynxnb
a59f2baa3a Add a SelectableGenericAdapter as subclass of GenericAdapter
`SelectableGenericAdapter` extends `GenericAdapter` with support for marking one item as selected.
2022-08-06 22:00:19 +05:30
lynxnb
e93fdce845 Add support for removal of items from GenericAdapter 2022-08-06 22:00:19 +05:30
lynxnb
0d1c7965df Add a ZipUtils class for unpacking zip files 2022-08-06 22:00:19 +05:30
lynxnb
b03f624191 Add kotlinx.serialization-json dependencies 2022-08-06 22:00:19 +05:30
Billy Laws
f52ea7bddb Make deferred draw and constant buffer updates reentrant-safe
At some point we will call Submit within draws or constant buffer updates, to avoid any infinite recursion mark draw/cbuf pending as false before performing any operation
2022-07-29 20:07:14 +01:00
Billy Laws
dbb684835f Fix depthClampDisable register offset in Maxwell 3D 2022-07-29 20:07:14 +01:00
Billy Laws
7fd9d347e3 Use per-RT blend enable registers even when independent blend is disabled
The common blend enable register seems to be used for something else. This is required for blending to work correctly in OpenGL games
2022-07-29 20:07:14 +01:00
Billy Laws
048c2fdd29 Fix Vulkan framebuffer dimensions calculations
The framebuffer needs to be large enough to contain both the render area extent and offset
2022-07-29 20:07:14 +01:00
Billy Laws
0e1aa765fc Prevent CNTVCT_EL0 reads from being optimised out by the compiler
Without this the compiler will assume the read always produces the same value, causing issues when the register is used to time function execution
2022-07-29 20:07:14 +01:00
Billy Laws
1df98ba57f Enable fwrapv for defined signed integer overflow behaviour
Nintendo enables this for HOS so we should do the same to avoid any cases where it's relied on.
2022-07-29 20:07:14 +01:00
lynxnb
d183d14e2a Make accesses to setting values thread-safe 2022-07-26 20:16:24 +05:30
lynxnb
30667a0899 Remove unused Compact Logs settings
Since we don't have a log viewer in the app anymore, the setting was left unused and can be safely removed.
2022-07-26 20:16:24 +05:30
lynxnb
5aa2a4cd1c Rename SettingsValues to NativeSettings
The previous name was chosen as an afterthought and didn't clearly indicate what the purpose of the class is. We needed a separate, simple class without delegates members (like PreferenceSettings), so that its fields can be easily accessed via JNI to get settings values from native code.
2022-07-26 20:16:24 +05:30
lynxnb
f734c4d145 Make log level setting changes immediately active 2022-07-26 20:16:24 +05:30
lynxnb
bb4937121f Remove settings from SharedPreference if they are of the wrong type 2022-07-26 20:16:24 +05:30
lynxnb
2840a126dd Introduce AndroidSettings class and use inheritance
The `Settings` class now has a pure virtual `Update` method, and uses inheritance over template specialization for platform-specific behavior override.
2022-07-26 20:16:24 +05:30
lynxnb
3905728447 Make every setting observable individually
A `Setting` delegate class has been introduced, holding the raw value of the setting and adding support for registering callbacks to that setting. Callbacks will then be called when the value of that setting changes.
As a result of this, raw setting values have been made accessible through pointer dereference semantics.
2022-07-26 20:16:24 +05:30
lynxnb
2d70be60d1 Remove PugiXML submodule
`PugiXML` was only used for parsing the SharedPreferences settings file, not needed anymore.
2022-07-26 20:16:24 +05:30
lynxnb
5b4ca79dc8 Rename Settings Kotlin class to PreferenceSettings
SharedPreferences will be partially swapped out in the future to support per-game settings. In the meantime, make it clear from which class settings are coming from.
2022-07-26 20:16:24 +05:30
lynxnb
3b27540250 Rename operationMode setting to isDocked 2022-07-26 20:16:24 +05:30
lynxnb
69cf25b1a7 Initial support for updating settings during emulation + observing settings changes 2022-07-26 20:16:24 +05:30
lynxnb
c5dde5953a Rework how settings are shared between Kotlin and native side
Settings are now shared to the native side by passing an instance of the Kotlin's `Settings` class. This way the C++ `Settings` class doesn't need to parse the SharedPreferences xml anymore.
2022-07-26 20:16:24 +05:30
lynxnb
4be8b4cf66 Add missing SPDX licence header 2022-07-26 20:16:24 +05:30
lynxnb
365ca66b1b Make integer settings use IntegerListPreference
Avoids unnecessary type casting of setting values and duplication in resource files.
2022-07-26 20:16:24 +05:30
lynxnb
cbc896c8f8 Fix waitForFences crash on Mali drivers
Mali GPU drivers utilize the `ppoll()` syscall inside `waitForFences` which isn't correctly restarted after a signal, which we can receive at any time on a guest thread. This commit fixes that by recursively calling the function on failure till it succeeds or returns an unexpected error.

Co-authored-by: PixelyIon <pixelyion@protonmail.com>
Co-authored-by: Billy Laws <blaws05@gmail.com>
2022-07-14 20:34:16 +02:00
MCredstoner2004
942e22f275 Write ApplicationErrorArg ErrorApplets to log
These applets are used by applications to display a custom error message to the user. Both the error message and the detailed error message are printed to the error log.


Co-authored-by: lynxnb <niccolo.betto@gmail.com>
2022-07-02 09:48:59 +05:30
MCredstoner2004
f9a0394577
Implement Software Keyboard applet
This implements the non-inline version of the Software Keyboard (swkbd) applet, which games use to get text input from the user.
2022-07-01 15:19:53 -05:00
MCredstoner2004
a9ee06914d Add ByteBufferSerializable
This allows sending C-like structs between Kotlin and C++ without struct-specific code
2022-06-30 01:17:32 +05:30
Billy Laws
a0275418d6 Add a single-header linear allocator implementation
This conforms to the C++ 'Allocator' named requirement allowing it to be used with any STL type and allows drastically reducing allocation times in cases which are suited for linear allocation.
2022-06-28 21:33:04 +01:00
Billy Laws
e816256220 Add blend, scissor, viewport and vertex state to shader hash
These caused a ton of additional comparisons in Zelda Link's Awakening as many shaders would have the same hash.
2022-06-28 21:32:59 +01:00
lynxnb
e6cfdeb06a Fix non-indexed quad draws
Certain non-indexed quad draws would mistakenly take the indexed quad path because of the assumption that they would not have a bound index buffer. This resulted in a crash for most games using quads due to a faulty exception `Indexed quad conversion is not supported`, when in fact they were not using indexed quads.

Co-authored-by: PixelyIon <pixelyion@protonmail.com>
Co-authored-by: Billy Laws <blaws05@gmail.com>
2022-06-23 10:57:11 +02:00
lynxnb
8fc3bc75f4 Allow providing an index type to calculate quad conversion buffer size 2022-06-23 00:15:44 +02:00
Billy Laws
7709dc8cf6 Rewrite buffer megabuffering to be per view and more efficient
This commit implements several key optimisations in megabuffering that are all inherently interlinked.
- Megabuffering is moved from per-buffer to per-view copies, this makes megabuffering possible for small views into larger underlying buffers which is often the case with even the simplest of games,
- Megabuffering is no longer the default option, it is only enabled for buffer views that have had inline GPU writes applied to them in the past as that is the only case where they are beneficial. In any other case the cost of copying, even with a 128KiB limit can be significant.
- With both of these changes, there is now possibility for overlapping views where one uses megabuffering and one does not. In order to allow GPU inline writes to work consistently in such cases a system of 'host immutability' has been implemented, when a buffer is marked as host immutable for a given cycle, all writes to the buffer from that point to the point the cycle is signalled will be performed on the GPU, ensuring that the backing contents are correctly sequenced
2022-06-11 17:05:39 +05:30
MCredstoner2004
2e356b8f0b Use spans instead of ptr and size in kernel memory 2022-06-11 17:05:39 +05:30
PixelyIon
e3e92ce1d4 Handle unsigned builds on CI
We don't always have access to CI secrets, such as, when a certain CI action is triggered by a PR from an external repository then it won't have access to secrets and be signed. While we likely will allow for this in the future as all workflows do have to be approved,  it is still important to not crash when keys are unavailable and have a graceful fallback for those situations.
2022-06-11 17:05:39 +05:30
Billy Laws
8689886bbb
Update build tools to 33.0.0 to fix CI
We're never switching to RC build tools again
2022-06-10 00:11:12 +01:00
Billy Laws
22039df301 Transition to std::unordered_set for buffer view tracking
Has the same guarantees of pointer stabilty while also being significantly faster in cases where a buffer has thousands of views. This is the case in RE4 and this change leads to an almost 1000% performance improvement in that game.
2022-06-09 23:52:13 +01:00
Billy Laws
b75a06af1b Support forcing 60Hz display on Xiaomi MIUI
Uses an API found through RE since none of the AOSP APIs work, additionaly the code for setting RR was consolidated to a single function that can be ran after all display updates.
2022-06-09 19:29:18 +01:00
Billy Laws
42c365fe70 Automatically exclude llvm and boost submodules in gradle project
There is a god in this world... his name is bylaws
2022-06-06 23:11:56 +01:00
PixelyIon
a5ca370c36 Implement thread-safe MegaBuffer pool
We currently have a global `MegaBuffer` instance that is shared across all channels, this is very problematic as `MegaBuffer` fundamentally works like a state machine with allocations (especially resetting/freeing) and is thread-specific. Therefore, we now have a pool of several `MegaBuffer`s which is allocated from by the `CommandExecutor` and kept channel specific as a result which also limits its usage to a single thread, this allows for individually resetting or freeing any allocations.
2022-06-05 13:04:40 +05:30
PixelyIon
3e08494146 Minor CommandScheduler refactor
There was a lot of redundant code in the `CommandScheduler` when the same functionality could be achieved with much shorter and cleaner code which this commit fixes. This includes no changes to the user-facing API and does not require any changes on the user side as a result.
2022-06-05 13:04:40 +05:30
Billy Laws
bd99d79b51 OsFileSystem: Close directory after file listing is finished 2022-06-04 21:46:23 +01:00
Billy Laws
4888919515 Stub GetFriendInvitationStorageChannelEvent (0x8C) 2022-06-04 21:45:53 +01:00
Billy Laws
d9f6540831 Fix VFS CreateFile directory creation 2022-06-04 19:19:30 +01:00
Billy Laws
f5bcb40c41 Return number of audio outs in ListAudioOuts 2022-06-04 19:12:37 +01:00
Billy Laws
5d6902b3f8 Stub audin:u 2022-06-04 19:11:57 +01:00
Billy Laws
54999957a2 Remove RGB565 format workaround
Will soon be redundant with new texture manager and is quite hacky so drop it.
2022-06-04 17:49:13 +01:00
Billy Laws
d79832091d Force append slash to directory path in OsFilesystem::CreateDirectory
The recursive path creation algorithm requires this to be the case
2022-06-04 17:44:49 +01:00
Billy Laws
616f7b7826 Correct instanced draw topology changed warning location
Before it would trigger even when the draw had the instanceNext flag set and thus wasn't part of the instanced draw at all.
2022-06-04 17:43:03 +01:00
Billy Laws
deb7a0e22a Implement 5x5 and 10x10 ASTC texture formats 2022-06-04 17:42:37 +01:00
Billy Laws
cc5a3f99c1 Reformat format description file 2022-06-04 17:42:13 +01:00
Billy Laws
a476bbaf4d Add 11_11_10 vertex buffer format 2022-06-04 17:41:10 +01:00
Billy Laws
71c37dd6c4 Add D24X8Unorm depth RT format support 2022-06-04 17:40:49 +01:00
Billy Laws
d3af629b83 Support R32G32B32A32 int RT formats 2022-06-04 17:38:57 +01:00
Billy Laws
0f5f04ade3 Set default surfaceflinger parameters based off of preallocated buffers
Required by resident evil 4 as otherwise Dequeue would fail due to it using BGRA buffers but the default being RGBA.
2022-06-04 16:55:08 +01:00
Billy Laws
106ad597db Support BGRA8888 surfaceflinger format
A swizzle is applied to R8G8B8A8 to transform it to BGRA since BGRA isn't a commonly supported swapchain format on Android.
2022-06-04 16:49:26 +01:00
Billy Laws
2bbeb6b08f Fix OsFileSystem initial directory creation
By passing basePath as an argument the CreateDirectory function did mkdir(basePath+basePath) which is obviously not the intended behaviour, fix this.
2022-06-03 19:33:31 +01:00
Billy Laws
84dec7561c Dont cache rendertarget mappings
Some games remap rendertargets or map them late which would lead to weird graphical bugs or crashes. Drop the caching since VMM lookup is fairly cheap anyway.
2022-06-03 19:31:52 +01:00
Billy Laws
581a016991 Add GuestTexture::GetSize helper function
This code was getting duplicated a bit so commonise into a helper function.
2022-06-03 19:30:54 +01:00
Billy Laws
31d418ad54 Fix 3D semaphore counter type 0 handling
Counter type 0 actually releases the semaphore payload rather than a constant zero as was previously thought. This is required by Skyrim.
2022-06-02 22:03:19 +01:00
Billy Laws
0202bf5531 Add semaphore release debug logs 2022-06-02 22:02:59 +01:00
Billy Laws
55cddc7a66 Update hades 2022-06-02 21:58:30 +01:00
Billy Laws
3736d36b75 Fix KPrivateMemory remap permissions 2022-06-02 18:10:35 +01:00
Billy Laws
389ab0fb50 Add {Map,Unmap}Physical memory debug logs 2022-06-02 18:10:10 +01:00
PixelyIon
2712b3276b Fix incorrect VkBufferImageCopy offset calculations
The `VkBufferImageCopy` offset calculations were wrong inside `CopyIntoStagingBuffer` as it multiplied the mip level's linear size by `levelCount` rather than `layerCount`. This led to substantial UB in games which called this function as it led to an overflow and resulted in writing to other areas of the buffer which caused major issues such as vertex/index buffer corruption and corresponding graphical glitches alongside likely being the cause of some crashes.
2022-06-02 22:14:22 +05:30
PixelyIon
06901ef22a Fix BC7 output swizzling from BGRA to RGBA
BC7 CPU decoding had the red and blue channels swapped around as it outputted a BGRA image after decoding while we expected an RGBA image to be produced. This should fix the colors of certain textures in titles such as Cuphead or Sonic Forces.
2022-06-02 19:48:55 +05:30
Billy Laws
9cb68c31e1 Stub nfp IUser::AttachAvailabilityChangeEvent 2022-06-02 00:04:01 +01:00
Billy Laws
33c9731eca Implement IFileSystem::CreateDirectory 2022-06-02 00:04:01 +01:00
Billy Laws
a09414424b Fix broken VFS directory creation 2022-06-02 00:04:01 +01:00
Billy Laws
3518e04a18 Correct Directory EntryType to be u8 rather than u32 2022-06-02 00:04:01 +01:00
Billy Laws
0c11d9e294 Implement IDirectory::GetEntryCount 2022-06-02 00:04:01 +01:00
MCredstoner2004
c15b3a8d40
Make Applet accesses to the data queues lock
Avoids potential races when the guest access the same applet from more than one thread.
2022-06-02 03:47:38 +05:30
Billy Laws
91b2c47991 Fix potential nvdrv submission race
The syncpoint maximum value represents the maximum possible syncpt value at a given time, however due to PBs being submitted before max was incremented, for a brief moment of time this is not the case which could lead to crashes or other such behaviour if a game waits on the fence at the right moment.
2022-06-01 17:15:25 +01:00
PixelyIon
37453ed7fa Use DocumentsProvider for log sharing
We used a `FileProvider` for log sharing prior, this is no longer necessary since it comes under the `DocumentsProvider` now which can be utilized to share the log document directly.
2022-06-01 21:41:14 +05:30
PixelyIon
8efa9298f9 Fix name conflict resolution for copyDocument
Any documents with the same name existing in a directory that is copied to would cause an exception due to existing already, this fixes that by handling conflict resolution in those cases and automatically determining a file name that would avoid a conflict.
2022-06-01 21:41:14 +05:30
Billy Laws
c4bd9c47e4 Stub NVGPU_GPU_IOCTL_ZBC_SET_TABLE nvdrv ioctl
This was missed in the original implementation and caused crashes in some games.
2022-06-01 16:59:14 +01:00
Billy Laws
c639fdcf06 Fixup NFP service stub state handling
Previously a broken state value was returned from GetState that caused crashes in games using newer SDKs and NFP, correctly handle state now by updating it after initialisation.
2022-06-01 15:00:26 +01:00
Billy Laws
c745e0e02b Move image type logic to GuestTexture, allowing 2D array views for 3D RTs
We can't render to a 3D texture through a 3D view, we instead have to create a 2D array view into it and render to that. The texture manager previously didn't support having a different view type/layer count between a guest texture view and the underlying storage texture that is required to support this so that was also implemented by reading the view layer count from the dimensions depth instead if the underlying texture is 3D (and the view type is 2D array). Additionally move away from our own view type enum to Vulkan, inline with other guest texture member types.
2022-05-31 22:09:53 +01:00
Billy Laws
22695c4feb Stub nim services used for eShop communication
We obviously don't need to implement these so add a simple set of stubs to satify games using them (mainly demos such as DQXII)
2022-05-31 22:07:01 +01:00
Billy Laws
ff12dc9c10 Add R32_SFLOAT to adreno validation layer format filtering 2022-05-31 22:03:53 +01:00
Billy Laws
6cc925c2d3 Reset RT mappings on dimension and format changes 2022-05-31 17:49:16 +01:00
Billy Laws
8180bf852e Lock textures before attaching in BlitContext 2022-05-31 16:54:13 +01:00
Billy Laws
cb2b36e3ab Allow providing a callback that's called after every CPU access in GMMU
Required for the planned implementation of GPU side memory trapping.
2022-05-31 16:04:27 +01:00
Billy Laws
46ee18c3e3 Require depthBiasClamp Vulkan device feature
Used in some UE4 games and supported by 95% of devices so skip implementing a fallback path.
2022-05-31 14:46:45 +01:00
PixelyIon
e592b11039 Drop samplerAnisotropy as a required GPU feature
Sampler anisotropy was made a required feature in an earlier commit due to its widespread availability but this was determined to be incorrect as certain Mali GPUs that can otherwise run 2D games in Skyline do not have this feature, while they are still not officially supported as this was the only roadblock to support them, it has now been made an optional feature.
2022-05-31 01:37:40 +05:30
PixelyIon
4336134b07 Reintroduce android:hasFragileUserData due to stable signature
`android:hasFragileUserData` was added in an earlier commit but then removed due to it not functioning because of signature checks. Now that signatures are consistent across builds, it has been readded and should now allow carrying data across CI and developer builds.
2022-05-31 01:37:07 +05:30
PixelyIon
b91ce939a2 Introduce CI build signing
We've done no signing of any Skyline APKs to date which causes issues regarding authenticity of any APKs as they could be entirely unofficial builds which have not been vetted by the team. Additionally, the different keys remove the ability to reinstall a different build successively as Android checks for matching signatures before installing an APK.
2022-05-31 01:25:18 +05:30
PixelyIon
e1cc8676cf Add option to view internal directory
With the Skyline document provider, easy access to the internal directory is required which may be hard to navigate to through the system file manager. This adds an option in settings to directly open up the directory in the system file manager.
2022-05-31 01:25:18 +05:30
PixelyIon
ba97985b55 Revise DocumentsProvider URI structure
The URIs (Document ID + Root) of the Skyline `DocumentsProvider` was unoptimal as it wasn't relative to a base directory. This is required for opening a root without knowledge of the full path in advance, it is therefore cleaner to provide a uniform `ROOT_ID` in a companion class.
2022-05-31 01:25:18 +05:30
Mylah Dee
ee7da31fc6 Add DocumentProvider for accessing internal files
On Android 12 and above, files from an application's external storage directory cannot be accessed by the user. The only proper SAF-compliant way to solve this is to create a `DocumentProvider` which proxies access to internal storage accordingly.
2022-05-30 15:09:30 +05:30
Narr the Reg
7aa6a5c4ca
Add HID touch attribute and index reporting
Adds missing parameter TouchAttribute and emulates correctly the touch point index. Both changes are necessary on Voez to keep track of each finger.
2022-05-29 10:28:51 +01:00
PixelyIon
80c8fb8791 Implement CPU BCn Texture Decoding
Certain GPU vendors such as ARM's Mali do not have support for BCn textures whatsoever while other vendors such as AMD only have partial support (BC1-BC3). Most titles on the guest utilize BC textures and to address this on host GPUs without support for BCn, we need to decompress the texture on the CPU. This commit implements a CPU BCn texture decoder based off Swiftshader's BC decoder, it also adds the necessary infrastructure to have different formats for the `GuestTexture` and `Texture` objects.
2022-05-28 21:22:24 +05:30
PixelyIon
fe615b1e03 Clarify texture swizzling inner-loop iteration count
The iterations of the inner loop for sector deswizzling was miscalculated as `SectorWidth * SectorHeight` while the result was correct at `32`, it should be determined by the amount of sector lines within a GOB i.e.: `(GobWidth / SectorWidth) * GobHeight`.
2022-05-28 21:22:24 +05:30
PixelyIon
7d4e0a7844 Implement Mipmapped Texture Support
Support for mipmapped textures was not implemented which is fairly crucial to proper rendering of games as the only level that would load is the first level (highest resolution), that might result in a lot more memory bandwidth being utilized. Mipmapping also has associated benefits regarding aliasing as it has a minor anti-aliasing effect on distant textures. 

This commit entirely implements mipmapping support but it does not extend to full support for views into specific mipmap levels due to the texture manager implemention being incomplete.
2022-05-28 21:22:24 +05:30
PixelyIon
da7e6a7df7 Replace Maxwell DMA GuestTexture usage with new swizzling API
Maxwell DMA requires swizzled copies to/from textures and earlier it had to construct an arbitrary `GuestTexture` to do so but with the introduction of the cleaner API, this has become redundant which this commit cleans up and replaces with direct calls to the API with all the necessary values.
2022-05-28 21:22:24 +05:30
PixelyIon
de300bfdbe Refactor Texture Swizzling
The API for texture swizzling is now more concrete and abstracted out from `GuestTexture`, this allows for neater usage in certain areas such as MaxwellDMA while having a `GuestTexture` wrapper as well allowing for neater usage in those cases. 

The code itself has also been cleaned up slightly with all usage of `u32`s being upgraded to `size_t` as this is simply more efficient due to the compiler not needing to emulate wraparound behavior for integer types smaller than the processor word size.
2022-05-19 17:13:55 +05:30
Billy Laws
72473369b6 Account for OOB copyOffsets in CircularBuffer::Read
Caused crashes in Pokemon
2022-05-14 15:30:59 +01:00
Robin Kertels
0a3cf25823 Implement the Fermi 2D blitting engine
The Fermi 2D engine implements both image blit and resolve operations, supporting subpixel sampling with both linear and point filtering.

Resolve operations are performed by sampling from the center of each pixel in order to resolve the final image from the MSAA samples
MSAA images are stored in memory like regular images but each pixels dimensions are scaled: e.g for 2x2 MSAA
```
112233
112233
445566
445566
```
These would be sampled with both duDx and duDy as 2 (integer part), resolving to the following:
```
123
456
```
Blit operations are performed by sampling from the corner of each pixel, scaling the image as one would expect.

This implementation isn't fully complete as Vulkan blit doesn't support some combinations which Fermi does, most notably between colour and depth stencil. These will be implemented properly at a later date, likely after the texture manager rework.
Out of Bounds Blit, used by some OpenGL games is also missing since supporting it requires texture aliasing, this will also be supported after the texture manager rework.

Co-authored-by: Billy Laws <blaws05@gmail.com>
2022-05-13 22:37:37 +01:00
Billy Laws
be2546138d Move IOVA class to GMMU so it can be used for other engines 2022-05-13 22:37:37 +01:00
Billy Laws
3ad640fcbc Fix accidental graphics context member/parameter duplication 2022-05-13 22:37:37 +01:00
PixelyIon
7a6f27a19a Fix texture swizzling OOB writes
Certain writes during swizzling went out of bounds due to incorrect `blockExtentY` calculation, the previous commit to fix this ended up breaking it further. This commit returns to the original commit's calculations with the proper addendum of a check for exact alignment with a GOB which is the case that was broken earlier.
2022-05-13 14:52:41 +05:30
PixelyIon
168e51e7ad Always use GetLayerStride for layer stride in Texture
The `GuestTexture::GetLayerStride` function was not always being utilized to retrieve the layer stride inside `Texture`, it would instead directly access the `guestTexture::layerStride` member. This is problematic as it may not be initialized and return `0` which would lead to a broken image copy.
2022-05-13 14:21:37 +05:30
Billy Laws
b81d5bc865 Implement and cleanup semaphore operations in all engines
Most engines have the capability to release a semaphore payload (or reduce in the case of GPFIFO) when a method is called or action is complete. Semaphores are used by games for both timing how long things take on GPU and waiting on resources so missing them can cause deadlocks or other related issues.
2022-05-12 19:40:24 +01:00
Billy Laws
bca88685bd Stub nvdrv {Get,Dump}Status 2022-05-12 17:38:22 +01:00
Billy Laws
97e740c986 Fix slight locking bug with nvmap handle duplication 2022-05-12 17:38:22 +01:00
Billy Laws
57378457dc Treat symbol file paths without slashes as filenames
Prevents crashes printing backtrace if this occurs
2022-05-12 17:38:22 +01:00
Billy Laws
d08ac63bbf Use TIC maximum index over TSC when tscIndexLinked is set 2022-05-12 17:38:22 +01:00
Billy Laws
8e021a9f1f Load custom drivers from app private data dir
Required since /sdcard doesn't have exec perm support
2022-05-12 17:38:21 +01:00
Billy Laws
dcef597345 Introduce TrivialObject concept and use where appropriate
Simplifies type checking and handles excluding container types that are trivially copyable but contain pointers
2022-05-12 17:38:21 +01:00
PixelyIon
f2cc25ee9f Implement Array Texture Swizzling
Textures can have more than one layer which we currently don't handle, all layers past the initial one will be filled with random data or 0s, leading to incorrect rendering. This has now been implemented now which fixes any titles which utilize array textures, such as "Super Mario Odyssey" or "Hatsune Miku: Project DIVA MegaMix".
2022-05-12 18:23:45 +05:30
PixelyIon
2a99e1784d Fix Maxwell3D RT Depth/Layer Count Logic
The Maxwell3D RT layer count wasn't being set correctly as it has the same register as the depth values and is toggled between the two based on another register value.
2022-05-12 18:23:05 +05:30
Billy Laws
543ac3042e Cleanup account services and stub StoreSaveDataThumbnail 2022-05-11 23:24:35 +01:00
Billy Laws
7d30ac0cd8 Add additional nifm stubs 2022-05-11 23:24:35 +01:00
Billy Laws
a164635f32 Stub LibraryAppletPlayerSelect 2022-05-11 23:24:35 +01:00
PixelyIon
4ec1cc7086 Update Build Tools to 33.0.0-rc4
Google has removed `33.0.0-rc3` from their servers and it can no longer be downloaded by the CI, the build tools version has been updated as a result.
2022-05-12 02:53:01 +05:30
Billy Laws
dd0004e208 Set Host1x log tag correctly 2022-05-11 22:11:16 +01:00
Billy Laws
f89bacf8ae Fixup Host1x syncpoint locking 2022-05-11 22:04:02 +01:00
Billy Laws
d8ff318a1a Prevent infinite VFS read loop on EOF 2022-05-11 22:03:39 +01:00
shutterbug2000
f078a5d1ec Stub bt and btm:u
Stub BT services which is required by titles such as Pokémon Let's GO Pikachu and Eevee (non-Demo versions).
2022-05-11 20:44:09 +05:30
PixelyIon
588b4529ee Implement 3D Texture Swizzling
The Maxwell GPU supports 3D textures which are tiled with the block-linear layout which didn't handle swizzling 3D textures correctly till now. This commit addresses that by implementing proper swizzling for 3D textures. Titles such as Cluster Truck and Super Mario Odyssey utilize 3D textures alongside a vast majority of other titles.
2022-05-11 14:06:04 +05:30
Billy Laws
601d67e369 Use resource size rather than allocation size for staging buffer size
As per VMA docs: 'Allocation size returned in this variable may be greater than the size requested for the resource e.g. as VkBufferCreateInfo::size. Whole size of the allocation is accessible for operations on memory e.g. using a pointer after mapping with vmaMapMemory(), but operations on the resource e.g. using vkCmdCopyBuffer must be limited to the size of the resource.'
2022-05-10 18:48:20 +01:00
Billy Laws
d2acec24f5 Handle VFS reads into trapped memory regions
pread will refuse to read into any trapped regions so implement a manual path with a staging buffer and memcpy for such cases
2022-05-10 18:33:55 +01:00
Billy Laws
1609fd2a32 Account for layerCount in SynchronizeGuestWithBuffer staging buffer size 2022-05-10 18:33:31 +01:00
Billy Laws
5b97b87503 Restore previous cullMode when cullFace is enabled 2022-05-10 18:31:32 +01:00
Billy Laws
15e9fa1c80 Fix FillRandomBytes
There were two issues here:
- If a skyline span was passed as a param then the 'T &object' version would be called, filling the span itself with random values rather than its contents
- Random numbers were repeated every call since independent_bits_engine copied generator state and thus it was never actually updated
2022-05-10 18:28:15 +01:00
Billy Laws
622ff2a8f1 Correctly track 5.1 audio channel sample count
Size needs to be adjusted for 5.1 buffers since they're downsampled to stereo.
2022-05-10 18:26:20 +01:00
PixelyIon
56c9b03843 Fix incorrect swizzling Y extent calculation
This calculation for the amount of lines on the Y axis relative to the start of the last block was wrong and would instead determine the amount of lines to the last Y-axis GOB which wasn't accurate when padding was considered, this resulted in titles like Celeste having broken texture decoding (on a 1922x1082 texture) for the last ROB as most pixels would be masked out.
2022-05-09 20:25:43 +05:30
Billy Laws
018df355f0 Replace some VFS exceptions with warnings
These errors aren't necessarily fatal so tone them down.
2022-05-08 19:37:10 +01:00
Billy Laws
e1c13bbc08 Update hades 2022-05-08 19:37:10 +01:00
PixelyIon
b307fca115 Fix attachment reuse within the same subpass
Certain titles such as BOTW trigger behavior to reuse an attachment within the same subpass, this caused an exception inside `RenderPassNode::AddAttachment` as it cannot find corresponding subpass for attachment. To fix this issue, we now assume that when it cannot find a subpass for an existing attachment, it is attached to the latest subpass and return the attachment.
2022-05-08 18:26:40 +05:30
PixelyIon
e027555796 Handle Y-axis GOB non-alignment for swizzling
Certain textures may be unaligned with a GOB's height of 8 lines, we already handle the case of being unaligned with a GOB's width of 64-bytes. This case occurs on titles such as SMO when going in-game.
2022-05-07 18:37:22 +05:30
PixelyIon
c910e29168 Extend HostSignalHandler's SIGSEGV debugger path
The function now returns from a segmentation fault when a debugger is present, this allows the entire context to be intact which can allow the debugger to correctly pick up variables from all stack frames while it could not extrapolate most variables when trapped inside the signal handler without the values of all registers.
2022-05-07 18:37:22 +05:30
Billy Laws
4149ab1067 Implement Maxwell 3D instanced draw support
In the Maxwell 3D engine, instanced draws are implemented by repeating the exact same draw in sequence with special flag set in vertexBeginGl. This flag allows either incrementing the instance counter or resetting it, since we need to supply an instance count to the host API we defer all draws until state changes occur. If there are no state changes between draws we can skip them and count the occurences to get the number of instances to draw.
2022-05-07 13:56:09 +01:00
Billy Laws
03594a081c Ensure correct flushing for batched constant buffer updates
Cbufs could be read by non-maxwell3D engines so force a flush when switching to them or before Execute.
2022-05-07 13:56:09 +01:00
PixelyIon
ad989750fc Implement Maxwell3D Point Sprite Size
Implements register state that corresponds to the size of a single point sprite in Maxwell 3D, this is emitted by the shader compiler in the preamble but needs to be only applied if the input topology is a point primitive and it is invalid to set the point size in any other case.
2022-05-07 03:46:25 +05:30
PixelyIon
874a6a2a6c Fix getTextureType enum conversion fomatting 2022-05-07 03:46:25 +05:30
PixelyIon
ae5bcbdb5c Fix Depth RT lock to be in scope
Earlier texture locking design required the lock to be retained but since the introduction of `AttachTexture`, this no longer needs to be done. This being done caused deadlocks when the depth texture is sampled by the fragment shader while being bound as an RT since it would attempt to lock the texture again.
2022-05-07 02:37:48 +05:30
shutterbug2000
1c8d994161 Basic bcat:u implementation
A basic `bcat:u` implementation to prevent titles such as "Kirby and the Forgotten Land" dependent on BCAT support from crashing due to the lack of an implementation.
2022-05-06 15:41:48 +05:30
PixelyIon
4fd64a53e0 Require Vulkan samplerAnisotropy feature
This is a widely supported feature that games may require conditionally but due to it being supported on effectively all target devices, it was made mandatory. This is used by titles such as ARMS.
2022-05-06 15:41:48 +05:30
PixelyIon
1d9b4a865a Add additional formats to Adreno filter
`VK_FORMAT_R32G32B32A32_SFLOAT` and `D32_SFLOAT` have their capabilities misreported as well, this spams the logs in titles such as ARMS.
2022-05-06 15:41:48 +05:30
PixelyIon
b87295374e Improve Controller Applet log
Improves the readability of the log and replaces the previously uninformative prefix of `operator()` due to being in a lambda with `Controller support`.
2022-05-06 15:41:48 +05:30
PixelyIon
98c730a644 Implement linked TIC/TSC handle in Maxwell3D
Maxwell3D has a register for linking the TIC/TSC index in bindless texture handles, this is used by games to implement bindless combined texture-sampler handles.
2022-05-06 14:58:20 +05:30
PixelyIon
23a091100d Implement ReadCbufValue + ReadTextureType
Implements `GraphicsEnvironment::ReadCbufValue` & `GraphicsEnvironment::ReadTextureType` with a framework of heterogeneous lookups for caching and callbacks for querying constant buffer or TIC values with validation checks for successive draws to ensure unique IR is generated.
2022-05-06 14:39:36 +05:30
PixelyIon
765c3f4e1f Allow draws with no descriptor set resources
The `descriptorSetWrites` being filled is now optional and the case of it being empty is handled correctly, this is done by certain titles such as ARMS and is entirely valid behavior. It should be noted that not doing this leads to errors in the guest due to invalid GPU state while working on the host GPU.
2022-05-06 10:33:47 +05:30
PixelyIon
37327f1955 Fix and refactor SVC SignalToAddress/WaitForAddress
SVC `SignalToAddress` had a bug with the behavior of `SignalAndModifyBasedOnWaitingThreadCountIfEqual` which was entirely incorrect and led to deadlocks in titles such as ARMS that were dependent on it. This commit corrects the behavior and refactors both SVCs and moves their arbitration/waiting to inside the corresponding `KProcess` function rather than the SVC to avoid redundancies and improve code readability.
2022-05-05 19:15:37 +05:30
PixelyIon
396979e897 Extend Adreno format-based filtering for Validation Layer
Filtering of validation logs is now extended beyond BCn formats and now covers other format which have their feature set misreported by the driver, this significantly drives down the amount of logs depending on the title.
2022-05-05 19:15:37 +05:30
PixelyIon
62ea2a6da5 Avoid format aliasing warnings on Adreno
Implements an algorithm to determine formats that can be aliased as views without needing `VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT`, this avoids spamming warning logs on view creation when the aliased formats will function in practice.
2022-05-05 19:15:37 +05:30
PixelyIon
7206ab4c67 Fix exclusiveSubpass by finishing render pass at end
There was an oversight with exclusive subpasses which could lead to RPs with more than one subpass could be created even though one pass was exclusive, this oversight was not finishing the render pass at the end of `AddSubpass`. This could lead to a future subpass adding to the end of that RP even though it was intended to exclusively have a single subpass.

This case occurs in titles such as Celeste (in-game) and breaks rendering on GPUs that may require exclusive subpasses for proper functionality.
2022-05-05 11:14:38 +05:30
PixelyIon
96fe5f0a0e Set initial subpassCount value to 1 rather than 0 2022-05-05 11:07:43 +05:30
PixelyIon
5d08d6e06f Disable unnecessary Khronos Validation Layer logs
The Khronos Validation Layer can often generate warning/error logs due to our intentional breakage from Vulkan specification, these can occur several times a frame resulting in the logs being spammed and making it difficult to extract useful information out of logs. The scope of these logs has now been reduced with more general filtering and the introduction of specialized filtering to handle complex cases such as BCn hacks with `libadrenotools` on Adreno devices.
2022-05-04 13:20:59 +05:30