skyline

mirror of https://github.com/skyline-emu/skyline.git synced 2024-11-05 07:15:06 +01:00

Author	SHA1	Message	Date
PixelyIon	7b209c54a2	Only reallocate `MegaBuffer` on usage Certain submissions might not utilize megabuffering but reserve a `MegaBuffer` regardless, this is not optimal since it can inflate the allocations and waste memory. This commit addresses the issue by eliding the allocation given the current submission doesn't utilize them.	2022-08-06 22:20:54 +05:30
PixelyIon	2366f81443	Fix `Buffer::PollFence` incorrectly handling null-`FenceCycle` If a `FenceCycle` isn't attached then `PollFence` returned `false` while it should return if the buffer has any concurrent GPU usages in flight, this has now been fixed by returning `true` in those cases.	2022-08-06 22:20:54 +05:30
PixelyIon	34e1e39d1c	Always reset all attached resources on `Submit` Certain resources can be attached to an empty `Submit` with no nodes, this can cause it to become a false dependency and not be removed till the next non-empty submission. This has now been fixed by doing a reset regardless of if any nodes exist.	2022-08-06 22:20:54 +05:30
PixelyIon	47db8e8cbc	Fix GPU inline copy callback for `Buffer::Write` The GPU inline copy callback was broken for `Buffer::Write` as it wasn't always called when it needed to be and didn't handle attaching of the buffer to the executor which would cause it to be unlocked. This commit addresses both of these issues, it introduces an `AttachLockedBuffer` method to attach an already locked buffer to the executor.	2022-08-06 22:20:54 +05:30
PixelyIon	2636a37b31	Introduce alternative FPS measurement for disabled frame throttling The FPS is implicitly bound to the refresh rate due to the timestamp being that of the presentation time, this leads to a misleading FPS figure for disabled frame throttling. It has now been fixed by using the frame submission time rather than the presentation time when frame throttling is disabled and to make this more apparent the color of the OSD FPS has been changed.	2022-08-06 22:20:54 +05:30
PixelyIon	0f56d01e58	Fix `Packed` format component ordering in `IsAdrenoAliasCompatible` All `Packed` formats have their components stored in the opposite ordering to the label, this was not followed for `IsAdrenoAliasCompatible` prior and the ordering has now been flipped.	2022-08-06 22:18:42 +05:30
PixelyIon	3ca56ef578	Fix NCE Trapping API Deadlock A deadlock was caused by holding `trapMutex` while waiting on the lock of a resource inside a callback while another thread holding the resource's mutex waits on `trapMutex`. This has been fixed by no longer allowing blocking locks inside the callbacks and introducing a separate callback for locking the resource which is done after unlocking the `trapMutex` which can then be locked by any contending threads.	2022-08-06 22:18:42 +05:30
PixelyIon	a6599c30b4	Correct `IntervalMap` insertion `end` calculation The `end` pointer for `interval` was incorrectly calculated as `interval.data() + interval.size_bytes()` which would be incorrect when the interval span type is not `u8` as the pointer derived from `interval.data()` would be a pointer to the span type rather than a byte pointer and be subject to arithmetic of that object's size rather than in terms of a byte.	2022-08-06 22:18:42 +05:30
PixelyIon	b0910e7b1a	Avoid locking `Texture`/`Buffer` in trap handler We generally don't need to lock the `Texture`/`Buffer` in the trap handler, this is particularly problematic now as we hold the lock for the duration of a submission of any workloads. This leads to a large amount of contention for the lock and stalling in the signal handler when the resource may be `Clean` and can simply be switched over to `CpuDirty` without locking and utilizing atomics which is what this commit addresses.	2022-08-06 22:18:42 +05:30
PixelyIon	a60d6ec58f	Replace host immutability `FenceCycle` with GPU usage tracking We utilized a `FenceCycle` to keep track of if the buffer was mutable or not and introduced another cycle to track GPU-side requirements only on fulfillment of which could the buffer be utilized on the host but due to the recent change in the behavior this system ended up being unoptimal. This commit replaces the cycle with a boolean tracking if there are any usages of the resource on the GPU within the current context that may prevent it from being mutated on the CPU. The fence of the context is simply attached to the buffer based off this which was allowed as the new behavior of buffer fences matches all the requirements for this.	2022-08-06 22:18:42 +05:30
PixelyIon	217d484cba	Abstract `TextureView`/`BufferDelegate` locking into `LockableSharedPtr` An atomic transactional loop was performed on the backing `std::shared_ptr` inside `BufferView`/`TextureView`'s `lock`/`LockWithTag`/`try_lock` functions, these locks utilized `std::atomic_load` for atomically loading the value from the `shared_ptr` recursively till it was the same value pre/post-locking. This commit abstracts the locking functionality of `TextureView`/`BufferDelegate` into `LockableSharedPtr` to avoid code duplication and removes the usage of `std::atomic_load` in either case as it is not necessary due to the implicit memory barrier provided by locking a mutex.	2022-08-06 22:18:42 +05:30
PixelyIon	2d08886e4e	Utilize `TextureView` rather than `Texture` for presentation `PresentationEngine` and `GraphicBufferProducer` methods that utilized textures for the surface utilized the `Texture` type rather than the `TextureView` type, this was never correct but at the time of authoring this code `TextureView` was not finalized and in a major flux which is why it was not utilized and `Texture` was utilized instead. Now that it is far more stable, it has been replaced with `TextureView`.	2022-08-06 22:18:42 +05:30
PixelyIon	d7399e33c1	Avoid waiting on mutex in `PresentationEngine::Present` We want to block on the host thread during presentation while the host surface isn't present to implicitly pause the game, this can end up being fairly costly as it involves locking the `PresentationEngine` mutex which can lead to a lot of contention with the presentation thread. This fixes the issue by polling if there is a surface and only if there isn't then doing the wait as it isn't mandatory to wait always, we'll eventually run into the guest thread stalling.	2022-08-06 22:18:42 +05:30
PixelyIon	30475ffc43	Fix `queueBuffer` `GraphicBuffer` Compatibility Check Newer versions of the Deko3D homebrew were crashing due to this check and it was discovered that the check was incorrect and rather than comparing the `NvSurface` what had to be compared was the `GraphicBuffer` associated with the slot directly. Co-authored-by: lynxnb <niccolo.betto@gmail.com>	2022-08-06 22:18:42 +05:30
PixelyIon	c2685d5f5c	Fix consistency issues with external project copyright headers The copyright headers for external project such as yuzu/Ryujinx were inconsistent in ordering, Skyline should always be the first item in the list. In addition, they didn't always link to the project's GitHub which has also been fixed.	2022-08-06 22:18:42 +05:30
PixelyIon	0ac5f4ce27	Lock `TextureManager`/`BufferManager` during submission Multiple threads concurrently accessing the `TextureManager`/`BufferManager` (Referred to as "resource managers") has a potential deadlock with a resource being locked while acquiring the resource manager lock while the thread owning it tries to acquire a lock on the resource resulting in a deadlock. This has been fixed with locking of resource manager now being externally handled which ensures it can be locked prior to locking any resources, `CommandExecutor` provides accessors for retrieving the resource manager which automatically handles locking aside doing so on attachment of resources.	2022-08-06 22:18:42 +05:30
PixelyIon	1239907ce8	Rework `Texture` & `Buffer` for `Context` and `FenceCycle` Chaining GPU resources have been designed with locking by fences in mind, fences were treated as implicit locks on a GPU, design paradigms such as `GraphicsContext` simply unlocking the texture mutex after attaching it which would set the fence cycle were considered fine prior but are unoptimal as it enforces that a `FenceCycle` effectively ensures exclusivity. This conflates the function of a mutex which is mutual exclusion and that of the fence which is to track GPU-side completion and led to tying if it was acceptable to use a GPU resource to GPU completion rather than simply if it was not currently being used by the CPU which is the function of the mutex. This rework fixes this with the groundwork that has been laid with previous commits, as `Context` semantics are utilized to move back to using mutexes for locking of resources and tracking the usage on the GPU in a cleaner way rather than arbitrary fence comparisons. This also leads to cleaning up a lot of methods that involved usage of fences that no longer require it and therefore can be entirely removed, further cleaning up the codebase. It also opens the door for future improvements such as the removal of `hostImmutableCycle` and replacing them with better solutions, the implementation of which is broken at the moment regardless. While moving to `Context`-based locking the question of multiple GPU workloads being in-flight while using overlapping resources came up which brought a fundamental limitation of `FenceCycle` to light which was that only one resource could be concurrently attached to a cycle and it could not adequately represent multi-cycle dependencies. `FenceCycle` chaining was designed to fix this inadequacy and allows for several different GPU workloads to be in-flight concurrently while utilizing the same resources as long as they can ensure GPU-GPU synchronization.	2022-08-06 22:18:42 +05:30
PixelyIon	07d45ee504	Introduce `FenceCycle` Chaining If we want to allow submitting multiple pieces of work to the GPU at once while still requiring CPU synchronization, we'll need to track all past fence cycles associated with a resource alongside the current one. To solve this the concept of chaining fences has been introduced, fences from past usages can be chained to the latest fence which'll then recursively forward operations to chained fences. This change also ends up mandating a move away from `FenceCycleDependency` as it would prevent fences from concurrently locking the same resources which is required for chaining to work as two fences being chained fundamentally means they're locking the same resources. The `AtomicForwardList` is therefore used as the new container.	2022-08-06 22:18:42 +05:30
PixelyIon	cf9e31c1eb	Implement Atomic Forward List An implementation of a singly-linked list with atomic access to allow for lock-free access semantics, it eliminates the requirement for a mutex which can introduce additional consideration for synchronization.	2022-08-06 22:18:42 +05:30
PixelyIon	6b9269b88e	Introduce `Context` semantics to GPU resource locking Resources on the GPU can be fairly convoluted and involve overlaps which can lead to the same GPU resources being utilized with different views, we previously utilized fences to lock resources to prevent concurrent access but this was overly harsh as it would block usage of resources till GPU completion of the commands associated with a resource. Fences have now been replaced with locks but locks run into the issue of being per-view and therefore to add a common object for tracking usage the concept of "tags" was introduced to track a single context so locks can be skipped if they're from the same context. This is important to prevent a deadlock when locking a resource which has been already locked from the current context with a different view.	2022-08-06 22:18:42 +05:30
PixelyIon	d913f29662	Only set `hasFragileUserData` for signed builds We do not want to allow saving of user data on unsigned builds as they don't have a stable signature and will not properly handle reinstallation. This can lead to a situation where the user has to resort to complex techniques to completely uninstall the package such as ADB or calling into PM directly.	2022-08-06 22:18:42 +05:30
PixelyIon	3139889a09	Implement Asynchronous Presentation We currently present all frames synchronously on the thread that calls into SurfaceFlinger functions, this is unoptimal as it doesn't match guest behavior which can lead to delaying the guest from working on the next frame. This commit makes queuing up frames non-blocking and handles all waiting and then presenting of the frame on a dedicated thread.	2022-08-06 22:18:42 +05:30
PixelyIon	6e09dc5204	Fix thread name setting We utilize `pthread_setname_np` to set the thread names but didn't check for any errors which resulted in the `Skyline-Choreographer` and `ChannelCmdFifo` not having proper names as they exceeded the 16 character limit on thread names for the pthread function. This has now been fixed by changing the names and introducing error checking to invocations of this function.	2022-08-06 22:18:42 +05:30
PixelyIon	7a0cfb484c	Add NPOT `AlignUp` utility All our normal alignment functions are designed to only handle power of 2 (`POT`) multiples as we only align or check alignment to `POT` multiples but there are cases where this is not possible and we deal with `NPOT` multiples which is why this function is required.	2022-08-06 22:18:42 +05:30
PixelyIon	662ea532d8	Skip waiting on host GPU after command buffer submission We waited on the host GPU after `Execute` but this isn't optimal as it causes a major stall on the CPU which can lead to several adverse effects such as downclocking by the governor and losing the opportunity to work in parallel with the GPU. This has now been fixed by splitting `Execute`'s functionality into two functions: `Submit` and `SubmitWithFlush` which both execute all nodes and submit the resulting command buffer to the GPU but flushing will wait on the GPU to complete while the non-flush variant will not wait and work ahead of the GPU.	2022-08-06 22:18:42 +05:30
PixelyIon	5129d2ae78	Add move-assignment semantics to `ActiveCommandBuffer`/`MegaBuffer` We need move-assignment semantics to viably utilize these objects as class members, they cannot be replaced without move-assign (or copy-assign but that is undesirable here). This commit fixes that by introducing a move assignment operator to them while making the `slot` a pointer which has the necessary nullability semantics.	2022-08-06 22:18:42 +05:30
lynxnb	8991ccac65	Pass `ViewHolder` on bind to RecyclerView items instead of `ViewBinding` This change lets items get the updated position of their view holder in the adapter. Fixes an issue where the position of items was not updated after being removed from a `SelectableGenericAdapter`.	2022-08-06 22:00:19 +05:30
lynxnb	bb922100cb	Improve rendering for Right-To-Left layouts	2022-08-06 22:00:19 +05:30
lynxnb	240e7033d7	Support loading a user-selected driver during vulkan initialization	2022-08-06 22:00:19 +05:30
lynxnb	c812de48ea	Show an undo button after deleting a gpu driver After a driver has been deleted, a snackbar will be shown confirming the deletion, with a button to undo it.	2022-08-06 22:00:19 +05:30
lynxnb	59c60df993	Add `GPU Driver Configuration` preference This preference launches `GpuDriverActivity` for managing custom gpu drivers. When the device has an incompatible GPU, the preference will be disabled and greyed out.	2022-08-06 22:00:19 +05:30
lynxnb	48cf1263bc	Add a custom GPU driver configuration activity The activity adds the following functionalities: * Lists installed drivers * Allows the user to install new drivers, or remove installed ones * Allows the user to select the driver that will be used by the emulator	2022-08-06 22:00:19 +05:30
lynxnb	e9f609b923	Add a `gpuDriver` preference setting This setting represents the GPU driver selected by the user to be used by the emulator.	2022-08-06 22:00:19 +05:30
lynxnb	1815199d2b	Add utilities for reading and installing gpu driver packages	2022-08-06 22:00:19 +05:30
lynxnb	f3dd3e53c1	Miscellaneous imports cleanup in `preference` package	2022-08-06 22:00:19 +05:30
lynxnb	1dfea9ef6f	Create an `ItemDecorations` file for all `RecyclerView` item decorations All item decorations are now placed in one file so that any `RecyclerView` in the app can use the same ones.	2022-08-06 22:00:19 +05:30
lynxnb	a59f2baa3a	Add a `SelectableGenericAdapter` as subclass of `GenericAdapter` `SelectableGenericAdapter` extends `GenericAdapter` with support for marking one item as selected.	2022-08-06 22:00:19 +05:30
lynxnb	e93fdce845	Add support for removal of items from `GenericAdapter`	2022-08-06 22:00:19 +05:30
lynxnb	0d1c7965df	Add a `ZipUtils` class for unpacking zip files	2022-08-06 22:00:19 +05:30
Billy Laws	f52ea7bddb	Make deferred draw and constant buffer updates reentrant-safe At some point we will call Submit within draws or constant buffer updates, to avoid any infinite recursion mark draw/cbuf pending as false before performing any operation	2022-07-29 20:07:14 +01:00
Billy Laws	dbb684835f	Fix depthClampDisable register offset in Maxwell 3D	2022-07-29 20:07:14 +01:00
Billy Laws	7fd9d347e3	Use per-RT blend enable registers even when independent blend is disabled The common blend enable register seems to be used for something else. This is required for blending to work correctly in OpenGL games	2022-07-29 20:07:14 +01:00
Billy Laws	048c2fdd29	Fix Vulkan framebuffer dimensions calculations The framebuffer needs to be large enough to contain both the render area extent and offset	2022-07-29 20:07:14 +01:00
Billy Laws	0e1aa765fc	Prevent CNTVCT_EL0 reads from being optimised out by the compiler Without this the compiler will assume the read always produces the same value, causing issues when the register is used to time function execution	2022-07-29 20:07:14 +01:00
lynxnb	d183d14e2a	Make accesses to setting values thread-safe	2022-07-26 20:16:24 +05:30
lynxnb	30667a0899	Remove unused `Compact Logs` settings Since we don't have a log viewer in the app anymore, the setting was left unused and can be safely removed.	2022-07-26 20:16:24 +05:30
lynxnb	5aa2a4cd1c	Rename `SettingsValues` to `NativeSettings` The previous name was chosen as an afterthought and didn't clearly indicate what the purpose of the class is. We needed a separate, simple class without delegate members (like PreferenceSettings), so that its fields can be easily accessed via JNI to get settings values from native code.	2022-07-26 20:16:24 +05:30
lynxnb	f734c4d145	Make log level setting changes immediately active	2022-07-26 20:16:24 +05:30
lynxnb	bb4937121f	Remove settings from SharedPreference if they are of the wrong type	2022-07-26 20:16:24 +05:30
lynxnb	2840a126dd	Introduce `AndroidSettings` class and use inheritance The `Settings` class now has a pure virtual `Update` method, and uses inheritance over template specialization for platform-specific behavior override.	2022-07-26 20:16:24 +05:30
lynxnb	3905728447	Make every setting observable individually A `Setting` delegate class has been introduced, holding the raw value of the setting and adding support for registering callbacks to that setting. Callbacks will then be called when the value of that setting changes. As a result of this, raw setting values have been made accessible through pointer dereference semantics.	2022-07-26 20:16:24 +05:30
lynxnb	5b4ca79dc8	Rename `Settings` Kotlin class to `PreferenceSettings` SharedPreferences will be partially swapped out in the future to support per-game settings. In the meantime, make it clear which class settings are coming from.	2022-07-26 20:16:24 +05:30
lynxnb	3b27540250	Rename `operationMode` setting to `isDocked`	2022-07-26 20:16:24 +05:30
lynxnb	69cf25b1a7	Initial support for updating settings during emulation + observing settings changes	2022-07-26 20:16:24 +05:30
lynxnb	c5dde5953a	Rework how settings are shared between Kotlin and native side Settings are now shared to the native side by passing an instance of the Kotlin's `Settings` class. This way the C++ `Settings` class doesn't need to parse the SharedPreferences xml anymore.	2022-07-26 20:16:24 +05:30
lynxnb	4be8b4cf66	Add missing SPDX licence header	2022-07-26 20:16:24 +05:30
lynxnb	365ca66b1b	Make integer settings use IntegerListPreference Avoids unnecessary type casting of setting values and duplication in resource files.	2022-07-26 20:16:24 +05:30
lynxnb	cbc896c8f8	Fix `waitForFences` crash on Mali drivers Mali GPU drivers utilize the `ppoll()` syscall inside `waitForFences` which isn't correctly restarted after a signal, which we can receive at any time on a guest thread. This commit fixes that by recursively calling the function on failure till it succeeds or returns an unexpected error. Co-authored-by: PixelyIon <pixelyion@protonmail.com> Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-07-14 20:34:16 +02:00
MCredstoner2004	942e22f275	Write `ApplicationErrorArg` `ErrorApplet`s to log These applets are used by applications to display a custom error message to the user. Both the error message and the detailed error message are printed to the error log. Co-authored-by: lynxnb <niccolo.betto@gmail.com>	2022-07-02 09:48:59 +05:30
MCredstoner2004	f9a0394577	Implement Software Keyboard applet This implements the non-inline version of the Software Keyboard (swkbd) applet, which games use to get text input from the user.	2022-07-01 15:19:53 -05:00
MCredstoner2004	a9ee06914d	Add ByteBufferSerializable This allows sending C-like structs between Kotlin and C++ without struct-specific code	2022-06-30 01:17:32 +05:30
Billy Laws	a0275418d6	Add a single-header linear allocator implementation This conforms to the C++ 'Allocator' named requirement allowing it to be used with any STL type and allows drastically reducing allocation times in cases which are suited for linear allocation.	2022-06-28 21:33:04 +01:00
Billy Laws	e816256220	Add blend, scissor, viewport and vertex state to shader hash These caused a ton of additional comparisons in Zelda Link's Awakening as many shaders would have the same hash.	2022-06-28 21:32:59 +01:00
lynxnb	e6cfdeb06a	Fix non-indexed quad draws Certain non-indexed quad draws would mistakenly take the indexed quad path because of the assumption that they would not have a bound index buffer. This resulted in a crash for most games using quads due to a faulty exception `Indexed quad conversion is not supported`, when in fact they were not using indexed quads. Co-authored-by: PixelyIon <pixelyion@protonmail.com> Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-06-23 10:57:11 +02:00
lynxnb	8fc3bc75f4	Allow providing an index type to calculate quad conversion buffer size	2022-06-23 00:15:44 +02:00
Billy Laws	7709dc8cf6	Rewrite buffer megabuffering to be per view and more efficient This commit implements several key optimisations in megabuffering that are all inherently interlinked. - Megabuffering is moved from per-buffer to per-view copies, this makes megabuffering possible for small views into larger underlying buffers which is often the case with even the simplest of games, - Megabuffering is no longer the default option, it is only enabled for buffer views that have had inline GPU writes applied to them in the past as that is the only case where they are beneficial. In any other case the cost of copying, even with a 128KiB limit can be significant. - With both of these changes, there is now possibility for overlapping views where one uses megabuffering and one does not. In order to allow GPU inline writes to work consistently in such cases a system of 'host immutability' has been implemented, when a buffer is marked as host immutable for a given cycle, all writes to the buffer from that point to the point the cycle is signalled will be performed on the GPU, ensuring that the backing contents are correctly sequenced	2022-06-11 17:05:39 +05:30
MCredstoner2004	2e356b8f0b	Use spans instead of ptr and size in kernel memory	2022-06-11 17:05:39 +05:30
Billy Laws	22039df301	Transition to std::unordered_set for buffer view tracking Has the same guarantees of pointer stability while also being significantly faster in cases where a buffer has thousands of views. This is the case in RE4 and this change leads to an almost 1000% performance improvement in that game.	2022-06-09 23:52:13 +01:00
Billy Laws	b75a06af1b	Support forcing 60Hz display on Xiaomi MIUI Uses an API found through RE since none of the AOSP APIs work, additionally the code for setting RR was consolidated to a single function that can be run after all display updates.	2022-06-09 19:29:18 +01:00
PixelyIon	a5ca370c36	Implement thread-safe MegaBuffer pool We currently have a global `MegaBuffer` instance that is shared across all channels, this is very problematic as `MegaBuffer` fundamentally works like a state machine with allocations (especially resetting/freeing) and is thread-specific. Therefore, we now have a pool of several `MegaBuffer`s which is allocated from by the `CommandExecutor` and kept channel specific as a result which also limits its usage to a single thread, this allows for individually resetting or freeing any allocations.	2022-06-05 13:04:40 +05:30
PixelyIon	3e08494146	Minor `CommandScheduler` refactor There was a lot of redundant code in the `CommandScheduler` when the same functionality could be achieved with much shorter and cleaner code which this commit fixes. This includes no changes to the user-facing API and does not require any changes on the user side as a result.	2022-06-05 13:04:40 +05:30
Billy Laws	bd99d79b51	OsFileSystem: Close directory after file listing is finished	2022-06-04 21:46:23 +01:00
Billy Laws	4888919515	Stub GetFriendInvitationStorageChannelEvent (0x8C)	2022-06-04 21:45:53 +01:00
Billy Laws	d9f6540831	Fix VFS CreateFile directory creation	2022-06-04 19:19:30 +01:00
Billy Laws	f5bcb40c41	Return number of audio outs in ListAudioOuts	2022-06-04 19:12:37 +01:00
Billy Laws	5d6902b3f8	Stub audin:u	2022-06-04 19:11:57 +01:00
Billy Laws	54999957a2	Remove RGB565 format workaround Will soon be redundant with new texture manager and is quite hacky so drop it.	2022-06-04 17:49:13 +01:00
Billy Laws	d79832091d	Force append slash to directory path in OsFilesystem::CreateDirectory The recursive path creation algorithm requires this to be the case	2022-06-04 17:44:49 +01:00
Billy Laws	616f7b7826	Correct instanced draw topology changed warning location Before it would trigger even when the draw had the instanceNext flag set and thus wasn't part of the instanced draw at all.	2022-06-04 17:43:03 +01:00
Billy Laws	deb7a0e22a	Implement 5x5 and 10x10 ASTC texture formats	2022-06-04 17:42:37 +01:00
Billy Laws	cc5a3f99c1	Reformat format description file	2022-06-04 17:42:13 +01:00
Billy Laws	a476bbaf4d	Add 11_11_10 vertex buffer format	2022-06-04 17:41:10 +01:00
Billy Laws	71c37dd6c4	Add D24X8Unorm depth RT format support	2022-06-04 17:40:49 +01:00
Billy Laws	d3af629b83	Support R32G32B32A32 int RT formats	2022-06-04 17:38:57 +01:00
Billy Laws	0f5f04ade3	Set default surfaceflinger parameters based off of preallocated buffers Required by resident evil 4 as otherwise Dequeue would fail due to it using BGRA buffers but the default being RGBA.	2022-06-04 16:55:08 +01:00
Billy Laws	106ad597db	Support BGRA8888 surfaceflinger format A swizzle is applied to R8G8B8A8 to transform it to BGRA since BGRA isn't a commonly supported swapchain format on Android.	2022-06-04 16:49:26 +01:00
Billy Laws	2bbeb6b08f	Fix OsFileSystem initial directory creation By passing basePath as an argument the CreateDirectory function did mkdir(basePath+basePath) which is obviously not the intended behaviour, fix this.	2022-06-03 19:33:31 +01:00
Billy Laws	84dec7561c	Don't cache rendertarget mappings Some games remap rendertargets or map them late which would lead to weird graphical bugs or crashes. Drop the caching since VMM lookup is fairly cheap anyway.	2022-06-03 19:31:52 +01:00
Billy Laws	581a016991	Add GuestTexture::GetSize helper function This code was getting duplicated a bit so commonise into a helper function.	2022-06-03 19:30:54 +01:00
Billy Laws	31d418ad54	Fix 3D semaphore counter type 0 handling Counter type 0 actually releases the semaphore payload rather than a constant zero as was previously thought. This is required by Skyrim.	2022-06-02 22:03:19 +01:00
Billy Laws	0202bf5531	Add semaphore release debug logs	2022-06-02 22:02:59 +01:00
Billy Laws	3736d36b75	Fix KPrivateMemory remap permissions	2022-06-02 18:10:35 +01:00
Billy Laws	389ab0fb50	Add {Map,Unmap}Physical memory debug logs	2022-06-02 18:10:10 +01:00
PixelyIon	2712b3276b	Fix incorrect `VkBufferImageCopy` offset calculations The `VkBufferImageCopy` offset calculations were wrong inside `CopyIntoStagingBuffer` as it multiplied the mip level's linear size by `levelCount` rather than `layerCount`. This led to substantial UB in games which called this function as it led to an overflow and resulted in writing to other areas of the buffer which caused major issues such as vertex/index buffer corruption and corresponding graphical glitches alongside likely being the cause of some crashes.	2022-06-02 22:14:22 +05:30
PixelyIon	06901ef22a	Fix BC7 output swizzling from BGRA to RGBA BC7 CPU decoding had the red and blue channels swapped around as it outputted a BGRA image after decoding while we expected an RGBA image to be produced. This should fix the colors of certain textures in titles such as Cuphead or Sonic Forces.	2022-06-02 19:48:55 +05:30
Billy Laws	9cb68c31e1	Stub nfp IUser::AttachAvailabilityChangeEvent	2022-06-02 00:04:01 +01:00
Billy Laws	33c9731eca	Implement IFileSystem::CreateDirectory	2022-06-02 00:04:01 +01:00
Billy Laws	a09414424b	Fix broken VFS directory creation	2022-06-02 00:04:01 +01:00
Billy Laws	3518e04a18	Correct Directory EntryType to be u8 rather than u32	2022-06-02 00:04:01 +01:00
Billy Laws	0c11d9e294	Implement IDirectory::GetEntryCount	2022-06-02 00:04:01 +01:00
MCredstoner2004	c15b3a8d40	Make Applet accesses to the data queues lock Avoids potential races when the guest accesses the same applet from more than one thread.	2022-06-02 03:47:38 +05:30
Billy Laws	91b2c47991	Fix potential nvdrv submission race The syncpoint maximum value represents the maximum possible syncpt value at a given time, however due to PBs being submitted before max was incremented, for a brief moment of time this is not the case which could lead to crashes or other such behaviour if a game waits on the fence at the right moment.	2022-06-01 17:15:25 +01:00
PixelyIon	37453ed7fa	Use `DocumentsProvider` for log sharing We used a `FileProvider` for log sharing prior, this is no longer necessary since it comes under the `DocumentsProvider` now which can be utilized to share the log document directly.	2022-06-01 21:41:14 +05:30
PixelyIon	8efa9298f9	Fix name conflict resolution for `copyDocument` Any documents with the same name existing in a directory that is copied to would cause an exception due to existing already, this fixes that by handling conflict resolution in those cases and automatically determining a file name that would avoid a conflict.	2022-06-01 21:41:14 +05:30
Billy Laws	c4bd9c47e4	Stub NVGPU_GPU_IOCTL_ZBC_SET_TABLE nvdrv ioctl This was missed in the original implementation and caused crashes in some games.	2022-06-01 16:59:14 +01:00
Billy Laws	c639fdcf06	Fixup NFP service stub state handling Previously a broken state value was returned from GetState that caused crashes in games using newer SDKs and NFP, correctly handle state now by updating it after initialisation.	2022-06-01 15:00:26 +01:00
Billy Laws	c745e0e02b	Move image type logic to GuestTexture, allowing 2D array views for 3D RTs We can't render to a 3D texture through a 3D view, we instead have to create a 2D array view into it and render to that. The texture manager previously didn't support having a different view type/layer count between a guest texture view and the underlying storage texture that is required to support this so that was also implemented by reading the view layer count from the dimensions depth instead if the underlying texture is 3D (and the view type is 2D array). Additionally move away from our own view type enum to Vulkan, in line with other guest texture member types.	2022-05-31 22:09:53 +01:00
Billy Laws	22695c4feb	Stub nim services used for eShop communication We obviously don't need to implement these so add a simple set of stubs to satisfy games using them (mainly demos such as DQXII)	2022-05-31 22:07:01 +01:00
Billy Laws	ff12dc9c10	Add R32_SFLOAT to adreno validation layer format filtering	2022-05-31 22:03:53 +01:00
Billy Laws	6cc925c2d3	Reset RT mappings on dimension and format changes	2022-05-31 17:49:16 +01:00
Billy Laws	8180bf852e	Lock textures before attaching in BlitContext	2022-05-31 16:54:13 +01:00
Billy Laws	cb2b36e3ab	Allow providing a callback that's called after every CPU access in GMMU Required for the planned implementation of GPU side memory trapping.	2022-05-31 16:04:27 +01:00
Billy Laws	46ee18c3e3	Require depthBiasClamp Vulkan device feature Used in some UE4 games and supported by 95% of devices so skip implementing a fallback path.	2022-05-31 14:46:45 +01:00
PixelyIon	e592b11039	Drop `samplerAnisotropy` as a required GPU feature Sampler anisotropy was made a required feature in an earlier commit due to its widespread availability but this was determined to be incorrect as certain Mali GPUs that can otherwise run 2D games in Skyline do not have this feature, while they are still not officially supported as this was the only roadblock to support them, it has now been made an optional feature.	2022-05-31 01:37:40 +05:30
PixelyIon	4336134b07	Reintroduce `android:hasFragileUserData` due to stable signature `android:hasFragileUserData` was added in an earlier commit but then removed due to it not functioning because of signature checks. Now that signatures are consistent across builds, it has been readded and should now allow carrying data across CI and developer builds.	2022-05-31 01:37:07 +05:30
PixelyIon	e1cc8676cf	Add option to view internal directory With the Skyline document provider, easy access to the internal directory is required which may be hard to navigate to through the system file manager. This adds an option in settings to directly open up the directory in the system file manager.	2022-05-31 01:25:18 +05:30
PixelyIon	ba97985b55	Revise `DocumentsProvider` URI structure The URIs (Document ID + Root) of the Skyline `DocumentsProvider` were suboptimal as they weren't relative to a base directory. This is required for opening a root without knowledge of the full path in advance, it is therefore cleaner to provide a uniform `ROOT_ID` in a companion class.	2022-05-31 01:25:18 +05:30
Mylah Dee	ee7da31fc6	Add `DocumentProvider` for accessing internal files On Android 12 and above, files from an application's external storage directory cannot be accessed by the user. The only proper SAF-compliant way to solve this is to create a `DocumentProvider` which proxies access to internal storage accordingly.	2022-05-30 15:09:30 +05:30
Narr the Reg	7aa6a5c4ca	Add HID touch attribute and index reporting Adds missing parameter TouchAttribute and emulates correctly the touch point index. Both changes are necessary on Voez to keep track of each finger.	2022-05-29 10:28:51 +01:00
PixelyIon	80c8fb8791	Implement CPU BCn Texture Decoding Certain GPU vendors such as ARM's Mali do not have support for BCn textures whatsoever while other vendors such as AMD only have partial support (BC1-BC3). Most titles on the guest utilize BC textures and to address this on host GPUs without support for BCn, we need to decompress the texture on the CPU. This commit implements a CPU BCn texture decoder based off Swiftshader's BC decoder, it also adds the necessary infrastructure to have different formats for the `GuestTexture` and `Texture` objects.	2022-05-28 21:22:24 +05:30
PixelyIon	fe615b1e03	Clarify texture swizzling inner-loop iteration count The iterations of the inner loop for sector deswizzling was miscalculated as `SectorWidth * SectorHeight` while the result was correct at `32`, it should be determined by the amount of sector lines within a GOB i.e.: `(GobWidth / SectorWidth) * GobHeight`.	2022-05-28 21:22:24 +05:30
PixelyIon	7d4e0a7844	Implement Mipmapped Texture Support Support for mipmapped textures was not implemented which is fairly crucial to proper rendering of games as the only level that would load is the first level (highest resolution), that might result in a lot more memory bandwidth being utilized. Mipmapping also has associated benefits regarding aliasing as it has a minor anti-aliasing effect on distant textures. This commit entirely implements mipmapping support but it does not extend to full support for views into specific mipmap levels due to the texture manager implementation being incomplete.	2022-05-28 21:22:24 +05:30
PixelyIon	da7e6a7df7	Replace Maxwell DMA `GuestTexture` usage with new swizzling API Maxwell DMA requires swizzled copies to/from textures and earlier it had to construct an arbitrary `GuestTexture` to do so but with the introduction of the cleaner API, this has become redundant which this commit cleans up and replaces with direct calls to the API with all the necessary values.	2022-05-28 21:22:24 +05:30
PixelyIon	de300bfdbe	Refactor Texture Swizzling The API for texture swizzling is now more concrete and abstracted out from `GuestTexture`, this allows for neater usage in certain areas such as MaxwellDMA while having a `GuestTexture` wrapper as well allowing for neater usage in those cases. The code itself has also been cleaned up slightly with all usage of `u32`s being upgraded to `size_t` as this is simply more efficient due to the compiler not needing to emulate wraparound behavior for integer types smaller than the processor word size.	2022-05-19 17:13:55 +05:30
Billy Laws	72473369b6	Account for OOB copyOffsets in CircularBuffer::Read Caused crashes in Pokemon	2022-05-14 15:30:59 +01:00
Robin Kertels	0a3cf25823	Implement the Fermi 2D blitting engine The Fermi 2D engine implements both image blit and resolve operations, supporting subpixel sampling with both linear and point filtering. Resolve operations are performed by sampling from the center of each pixel in order to resolve the final image from the MSAA samples. MSAA images are stored in memory like regular images but each pixel's dimensions are scaled: e.g. for 2x2 MSAA ``` 112233 112233 445566 445566 ``` These would be sampled with both duDx and duDy as 2 (integer part), resolving to the following: ``` 123 456 ``` Blit operations are performed by sampling from the corner of each pixel, scaling the image as one would expect. This implementation isn't fully complete as Vulkan blit doesn't support some combinations which Fermi does, most notably between colour and depth stencil. These will be implemented properly at a later date, likely after the texture manager rework. Out of Bounds Blit, used by some OpenGL games, is also missing since supporting it requires texture aliasing, this will also be supported after the texture manager rework. Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-05-13 22:37:37 +01:00
Billy Laws	be2546138d	Move IOVA class to GMMU so it can be used for other engines	2022-05-13 22:37:37 +01:00
Billy Laws	3ad640fcbc	Fix accidental graphics context member/parameter duplication	2022-05-13 22:37:37 +01:00
PixelyIon	7a6f27a19a	Fix texture swizzling OOB writes Certain writes during swizzling went out of bounds due to incorrect `blockExtentY` calculation, the previous commit to fix this ended up breaking it further. This commit returns to the original commit's calculations with the proper addendum of a check for exact alignment with a GOB which is the case that was broken earlier.	2022-05-13 14:52:41 +05:30
PixelyIon	168e51e7ad	Always use `GetLayerStride` for layer stride in Texture The `GuestTexture::GetLayerStride` function was not always being utilized to retrieve the layer stride inside `Texture`, it would instead directly access the `guestTexture::layerStride` member. This is problematic as it may not be initialized and return `0` which would lead to a broken image copy.	2022-05-13 14:21:37 +05:30
Billy Laws	b81d5bc865	Implement and cleanup semaphore operations in all engines Most engines have the capability to release a semaphore payload (or reduce in the case of GPFIFO) when a method is called or action is complete. Semaphores are used by games for both timing how long things take on GPU and waiting on resources so missing them can cause deadlocks or other related issues.	2022-05-12 19:40:24 +01:00
Billy Laws	bca88685bd	Stub nvdrv {Get,Dump}Status	2022-05-12 17:38:22 +01:00
Billy Laws	97e740c986	Fix slight locking bug with nvmap handle duplication	2022-05-12 17:38:22 +01:00
Billy Laws	57378457dc	Treat symbol file paths without slashes as filenames Prevents crashes printing backtrace if this occurs	2022-05-12 17:38:22 +01:00
Billy Laws	d08ac63bbf	Use TIC maximum index over TSC when tscIndexLinked is set	2022-05-12 17:38:22 +01:00
Billy Laws	8e021a9f1f	Load custom drivers from app private data dir Required since /sdcard doesn't have exec perm support	2022-05-12 17:38:21 +01:00
Billy Laws	dcef597345	Introduce TrivialObject concept and use where appropriate Simplifies type checking and handles excluding container types that are trivially copyable but contain pointers	2022-05-12 17:38:21 +01:00
PixelyIon	f2cc25ee9f	Implement Array Texture Swizzling Textures can have more than one layer which we currently don't handle, all layers past the initial one will be filled with random data or 0s, leading to incorrect rendering. This has now been implemented, which fixes any titles that utilize array textures, such as "Super Mario Odyssey" or "Hatsune Miku: Project DIVA MegaMix".	2022-05-12 18:23:45 +05:30
PixelyIon	2a99e1784d	Fix Maxwell3D RT Depth/Layer Count Logic The Maxwell3D RT layer count wasn't being set correctly as it has the same register as the depth values and is toggled between the two based on another register value.	2022-05-12 18:23:05 +05:30
Billy Laws	543ac3042e	Cleanup account services and stub StoreSaveDataThumbnail	2022-05-11 23:24:35 +01:00
Billy Laws	7d30ac0cd8	Add additional nifm stubs	2022-05-11 23:24:35 +01:00
Billy Laws	a164635f32	Stub LibraryAppletPlayerSelect	2022-05-11 23:24:35 +01:00
Billy Laws	dd0004e208	Set Host1x log tag correctly	2022-05-11 22:11:16 +01:00
Billy Laws	f89bacf8ae	Fixup Host1x syncpoint locking	2022-05-11 22:04:02 +01:00
Billy Laws	d8ff318a1a	Prevent infinite VFS read loop on EOF	2022-05-11 22:03:39 +01:00
shutterbug2000	f078a5d1ec	Stub `bt` and `btm:u` Stub BT services which is required by titles such as Pokémon Let's GO Pikachu and Eevee (non-Demo versions).	2022-05-11 20:44:09 +05:30
PixelyIon	588b4529ee	Implement 3D Texture Swizzling The Maxwell GPU supports 3D textures which are tiled with the block-linear layout, but swizzling of 3D textures wasn't handled correctly till now. This commit addresses that by implementing proper swizzling for 3D textures. Titles such as Cluster Truck and Super Mario Odyssey utilize 3D textures alongside a vast majority of other titles.	2022-05-11 14:06:04 +05:30
Billy Laws	601d67e369	Use resource size rather than allocation size for staging buffer size As per VMA docs: 'Allocation size returned in this variable may be greater than the size requested for the resource e.g. as VkBufferCreateInfo::size. Whole size of the allocation is accessible for operations on memory e.g. using a pointer after mapping with vmaMapMemory(), but operations on the resource e.g. using vkCmdCopyBuffer must be limited to the size of the resource.'	2022-05-10 18:48:20 +01:00
Billy Laws	d2acec24f5	Handle VFS reads into trapped memory regions pread will refuse to read into any trapped regions so implement a manual path with a staging buffer and memcpy for such cases	2022-05-10 18:33:55 +01:00
Billy Laws	1609fd2a32	Account for layerCount in SynchronizeGuestWithBuffer staging buffer size	2022-05-10 18:33:31 +01:00
Billy Laws	5b97b87503	Restore previous cullMode when cullFace is enabled	2022-05-10 18:31:32 +01:00
Billy Laws	15e9fa1c80	Fix FillRandomBytes There were two issues here: - If a skyline span was passed as a param then the 'T &object' version would be called, filling the span itself with random values rather than its contents - Random numbers were repeated every call since independent_bits_engine copied generator state and thus it was never actually updated	2022-05-10 18:28:15 +01:00
Billy Laws	622ff2a8f1	Correctly track 5.1 audio channel sample count Size needs to be adjusted for 5.1 buffers since they're downsampled to stereo.	2022-05-10 18:26:20 +01:00
PixelyIon	56c9b03843	Fix incorrect swizzling Y extent calculation This calculation for the amount of lines on the Y axis relative to the start of the last block was wrong and would instead determine the amount of lines to the last Y-axis GOB which wasn't accurate when padding was considered, this resulted in titles like Celeste having broken texture decoding (on a 1922x1082 texture) for the last ROB as most pixels would be masked out.	2022-05-09 20:25:43 +05:30
Billy Laws	018df355f0	Replace some VFS exceptions with warnings These errors aren't necessarily fatal so tone them down.	2022-05-08 19:37:10 +01:00
Billy Laws	e1c13bbc08	Update hades	2022-05-08 19:37:10 +01:00
PixelyIon	b307fca115	Fix attachment reuse within the same subpass Certain titles such as BOTW trigger behavior to reuse an attachment within the same subpass, this caused an exception inside `RenderPassNode::AddAttachment` as it cannot find corresponding subpass for attachment. To fix this issue, we now assume that when it cannot find a subpass for an existing attachment, it is attached to the latest subpass and return the attachment.	2022-05-08 18:26:40 +05:30
PixelyIon	e027555796	Handle Y-axis GOB non-alignment for swizzling Certain textures may be unaligned with a GOB's height of 8 lines, we already handle the case of being unaligned with a GOB's width of 64-bytes. This case occurs on titles such as SMO when going in-game.	2022-05-07 18:37:22 +05:30
PixelyIon	c910e29168	Extend `HostSignalHandler`'s `SIGSEGV` debugger path The function now returns from a segmentation fault when a debugger is present, this allows the entire context to be intact which can allow the debugger to correctly pick up variables from all stack frames while it could not extrapolate most variables when trapped inside the signal handler without the values of all registers.	2022-05-07 18:37:22 +05:30
Billy Laws	4149ab1067	Implement Maxwell 3D instanced draw support In the Maxwell 3D engine, instanced draws are implemented by repeating the exact same draw in sequence with a special flag set in vertexBeginGl. This flag allows either incrementing the instance counter or resetting it, since we need to supply an instance count to the host API we defer all draws until state changes occur. If there are no state changes between draws we can skip them and count the occurrences to get the number of instances to draw.	2022-05-07 13:56:09 +01:00
Billy Laws	03594a081c	Ensure correct flushing for batched constant buffer updates Cbufs could be read by non-maxwell3D engines so force a flush when switching to them or before Execute.	2022-05-07 13:56:09 +01:00
PixelyIon	ad989750fc	Implement Maxwell3D Point Sprite Size Implements register state that corresponds to the size of a single point sprite in Maxwell 3D, this is emitted by the shader compiler in the preamble but needs to be only applied if the input topology is a point primitive and it is invalid to set the point size in any other case.	2022-05-07 03:46:25 +05:30
PixelyIon	874a6a2a6c	Fix `getTextureType` enum conversion formatting	2022-05-07 03:46:25 +05:30
PixelyIon	ae5bcbdb5c	Fix Depth RT lock to be in scope Earlier texture locking design required the lock to be retained but since the introduction of `AttachTexture`, this no longer needs to be done. This being done caused deadlocks when the depth texture is sampled by the fragment shader while being bound as an RT since it would attempt to lock the texture again.	2022-05-07 02:37:48 +05:30
shutterbug2000	1c8d994161	Basic `bcat:u` implementation A basic `bcat:u` implementation to prevent titles such as "Kirby and the Forgotten Land" dependent on BCAT support from crashing due to the lack of an implementation.	2022-05-06 15:41:48 +05:30
PixelyIon	4fd64a53e0	Require Vulkan `samplerAnisotropy` feature This is a widely supported feature that games may require conditionally but due to it being supported on effectively all target devices, it was made mandatory. This is used by titles such as ARMS.	2022-05-06 15:41:48 +05:30
PixelyIon	1d9b4a865a	Add additional formats to Adreno filter `VK_FORMAT_R32G32B32A32_SFLOAT` and `D32_SFLOAT` have their capabilities misreported as well, this spams the logs in titles such as ARMS.	2022-05-06 15:41:48 +05:30
PixelyIon	b87295374e	Improve Controller Applet log Improves the readability of the log and replaces the previously uninformative prefix of `operator()` due to being in a lambda with `Controller support`.	2022-05-06 15:41:48 +05:30
PixelyIon	98c730a644	Implement linked TIC/TSC handle in Maxwell3D Maxwell3D has a register for linking the TIC/TSC index in bindless texture handles, this is used by games to implement bindless combined texture-sampler handles.	2022-05-06 14:58:20 +05:30
PixelyIon	23a091100d	Implement `ReadCbufValue` + `ReadTextureType` Implements `GraphicsEnvironment::ReadCbufValue` & `GraphicsEnvironment::ReadTextureType` with a framework of heterogeneous lookups for caching and callbacks for querying constant buffer or TIC values with validation checks for successive draws to ensure unique IR is generated.	2022-05-06 14:39:36 +05:30
PixelyIon	765c3f4e1f	Allow draws with no descriptor set resources The `descriptorSetWrites` being filled is now optional and the case of it being empty is handled correctly, this is done by certain titles such as ARMS and is entirely valid behavior. It should be noted that not doing this leads to errors in the guest due to invalid GPU state while working on the host GPU.	2022-05-06 10:33:47 +05:30
PixelyIon	37327f1955	Fix and refactor SVC `SignalToAddress`/`WaitForAddress` SVC `SignalToAddress` had a bug with the behavior of `SignalAndModifyBasedOnWaitingThreadCountIfEqual` which was entirely incorrect and led to deadlocks in titles such as ARMS that were dependent on it. This commit corrects the behavior and refactors both SVCs and moves their arbitration/waiting to inside the corresponding `KProcess` function rather than the SVC to avoid redundancies and improve code readability.	2022-05-05 19:15:37 +05:30
PixelyIon	396979e897	Extend Adreno format-based filtering for Validation Layer Filtering of validation logs is now extended beyond BCn formats and now covers other formats which have their feature set misreported by the driver, this significantly drives down the amount of logs depending on the title.	2022-05-05 19:15:37 +05:30
PixelyIon	62ea2a6da5	Avoid format aliasing warnings on Adreno Implements an algorithm to determine formats that can be aliased as views without needing `VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT`, this avoids spamming warning logs on view creation when the aliased formats will function in practice.	2022-05-05 19:15:37 +05:30
PixelyIon	7206ab4c67	Fix `exclusiveSubpass` by finishing render pass at end There was an oversight with exclusive subpasses which could lead to RPs with more than one subpass being created even though one pass was exclusive, this oversight was not finishing the render pass at the end of `AddSubpass`. This could lead to a future subpass adding to the end of that RP even though it was intended to exclusively have a single subpass. This case occurs in titles such as Celeste (in-game) and breaks rendering on GPUs that may require exclusive subpasses for proper functionality.	2022-05-05 11:14:38 +05:30
PixelyIon	96fe5f0a0e	Set initial `subpassCount` value to 1 rather than 0	2022-05-05 11:07:43 +05:30
PixelyIon	5d08d6e06f	Disable unnecessary Khronos Validation Layer logs The Khronos Validation Layer can often generate warning/error logs due to our intentional breakage from Vulkan specification, these can occur several times a frame resulting in the logs being spammed and making it difficult to extract useful information out of logs. The scope of these logs has now been reduced with more general filtering and the introduction of specialized filtering to handle complex cases such as BCn hacks with `libadrenotools` on Adreno devices.	2022-05-04 13:20:59 +05:30
PixelyIon	23c9388caf	Fix `VK_KHR_push_descriptor`-less path for descriptor set updates Descriptor set updates were broken on the non-push-descriptor path due to lifetime issues with VkDescriptorSetLayout's usage during the execution phase which entirely broke rendering on AMD/Mali GPUs due to them not supporting `VK_KHR_push_descriptor`. This commit addresses that by moving the allocation of a descriptor set to outside the lambda and into the recording phase, it also simplifies the semantics and resources passed into the lambda by removing redundancies.	2022-05-04 00:49:21 +05:30
PixelyIon	47bc3b4d99	Fix Render Pass Cache The Vulkan render pass cache was fundamentally broken since it was designed around the Render Pass Compatibility clause due to being designed for framebuffer compatibility initially. As this scope was extended to a general render pass cache, the amount of data in the key was not extended to include everything it should have. This commit introduces the missing pieces in the RP cache and simplifies the underlying code in the process.	2022-05-01 20:31:36 +05:30
PixelyIon	25a29f9044	Skip zero-initializing shader bytecode backing The backing for shader data would implicitly be zero-initialized due to a `resize` on every shader parse, this was entirely unnecessary as we would overwrite the entire range regardless. We avoid this by using statically allocated storage and a span over it containing the shader bytecode which avoids any unnecessary clear semantics without resorting to more complex solutions such as a custom allocator.	2022-05-01 18:27:27 +05:30
PixelyIon	42573170c6	Implement Framebuffer Cache Implements a cache for storing `VkFramebuffer` objects with a special path on devices with `VK_KHR_imageless_framebuffer` to allow for more cache hits due to an abstract image rather than a specific one. Caching framebuffers is a fairly crucial optimization due to the cost of creating framebuffers on TBDRs since it involves calculating tiling memory allocations and in the case of Adreno's proprietary driver involves several kernel calls for mapping and allocating the corresponding framebuffer memory.	2022-05-01 18:27:27 +05:30
PixelyIon	af7f0c301e	Avoid redundant `VkImageView` recreation There are a lot of cases of `VkImageView` being recreated arbitrarily due to it being tied to the ephemeral object `TextureView` rather than `Texture`, this commit flips that by storing all `VkImageView`s inside `Texture` with `TextureView` simply holding a copy of the handle to them. Additionally, this change results in stable `VkImageView` handles and helps in paving the path for framebuffer caching when `VK_KHR_imageless_framebuffer` is unavailable.	2022-05-01 18:27:27 +05:30
PixelyIon	41b2c2dc7b	Add `profileable` attribute to `AndroidManifest.xml` As we desire more accurate profiling data in certain circumstances, making the app explicitly profilable will allow for this, it will also remove the (annoying) prompt to do this in the Android Studio profiler.	2022-05-01 18:27:27 +05:30
PixelyIon	da931cf07b	Implement Render Pass Cache Implements a cache for storing `VkRenderPass` objects which are often reused, they are not extremely expensive to create generally but this is a required step to build up to a framebuffer cache which is an extremely expensive object to create on TBDRs generally since it involves calculating tiling memory allocations and in the case of Adreno's proprietary driver involves several kernel calls for mapping and allocating the corresponding memory.	2022-05-01 18:16:53 +05:30
Billy Laws	ae77bde171	Fixup audio device name writing in services Games expect the output buffer to be entirely zero filled past the device name.	2022-04-30 16:00:33 +01:00
Billy Laws	194cbe6c7c	Stub several HID functions	2022-04-30 16:00:33 +01:00
Billy Laws	112c20cef2	Stub QueryAudioDevice{Input,Output}Event Used in many 3.0.0+ games	2022-04-30 16:00:33 +01:00
Billy Laws	8d7dbe2c4e	Add a way to get a readonly span of Buffer contents Avoids the need to redundantly copy data when it is being directly processed on the CPU (e.g. quad conversion)	2022-04-30 16:00:33 +01:00
MK73DS	4c71ef5c31	Fix American English language code	2022-04-30 18:43:22 +05:30
PixelyIon	90c635bf78	Coalesce subpasses with compatible attachments together We run into a lot of successive subpasses with the exact same framebuffer configuration which we now exploit to avoid the creation of a new subpass due to the overhead involved with this. This provides significant performance boosts in certain cases due to the magnitude of difference in the amount of subpasses being created while providing next to no benefit in other cases.	2022-04-27 13:22:34 +05:30
PixelyIon	a947933bf0	Fix `Buffer` cycle check being inverted The check for the fence cycle being the same as the current cycle was incorrectly inverted to be the opposite of what it should have been, leading to bugs.	2022-04-27 13:07:36 +05:30
PixelyIon	54794f4b71	Move `Texture` locking and synchronization to `PresentationEngine` The responsibility for synchronizing a texture and locking it is now on the `PresentationEngine` rather than the API-user as this'll allow more fine grained locking and delay waiting until necessary.	2022-04-25 21:01:16 +05:30
Billy Laws	1dd230afde	Refactor all std::lock_guard usages to std::scoped_lock	2022-04-25 15:00:30 +01:00
PixelyIon	94e6f3cfa0	Add quirk for relaxed render pass compatibility As we require a relaxed version of the Vulkan render pass compatibility clause for caching multi-subpass render passes, we now utilize a quirk to determine if this is supported: it is on Nvidia/Adreno, while on AMD/Mali, where it isn't supported, we force single-subpass render passes.	2022-04-24 16:18:36 +05:30
PixelyIon	44615c8dd2	Implement per-vendor `VkQueue` maximum global priority We found out that certain vendors such as Nvidia had a limitation on the global priority of a queue and requesting `VK_QUEUE_GLOBAL_PRIORITY_HIGH_EXT` would result in `VK_ERROR_NOT_PERMITTED_EXT`. A quirk has been introduced to supply the maximum supported global priority which is currently set on a per-vendor basis to avoid future crashes.	2022-04-24 16:15:01 +05:30
PixelyIon	7ef4959060	Implement Graphics Pipeline Cache Implements a cache for storing `VkPipeline` objects which are fairly expensive to create and doing so on a per-frame basis was rather wasteful and consumed a significant part of frametime. It should be noted that this is not compliant with the Vulkan specification and will break unless the driver supports a relaxed version of the Vulkan specification's Render Pass Compatibility clause.	2022-04-24 14:31:00 +05:30
PixelyIon	50a8b69f7b	Optimize descriptor set writes using push descriptors We can use inline push descriptors for writing to descriptor rather than allocating a descriptor set for a one time write and freeing it as this is rather inefficient while an inline push descriptor generally ends up being a direct `memcpy` on the driver side designed for this use-case.	2022-04-24 13:45:09 +05:30
PixelyIon	5adafbff04	Set `VkQueue`'s global priority to high We want Skyline to have the most favorable GPU scheduling possible due to low latency and high throughput requirements, we request high priority scheduling due to this reason.	2022-04-24 13:34:09 +05:30
PixelyIon	f9c052d1b7	Implement Maxwell3D Tessellation State This implements all Maxwell3D registers and HLE Vulkan state for Tessellation including invalidation of the TCS (Tessellation Control Shader) state during state changes.	2022-04-24 13:23:00 +05:30
Billy Laws	de796cd2cd	Implement overhead-free sequenced buffer updates with megabuffers Previously constant buffer updates would be handled on the CPU and only the end result would be synced to the GPU before execute. This caused issues as if the constant buffer contents was changed between each draw in a renderpass (e.g. text rendering) the draws themselves would only see the final resulting constant buffer. We had earlier tried to fix this by using vkCmdUpdateBuffer however this caused significant performance loss due to an oversight in Adreno drivers. We could have worked around this simply by using vkCmdCopyBuffer however there would still be a performance loss due to renderpasses being split up with copies in between. To avoid this we introduce 'megabuffers', a brand new technique not done before in any other Switch emulators. Rather than replaying the copies in sequence on the GPU, we take advantage of the fact that buffers are generally small in order to replay buffers on the GPU instead. Each write and subsequent usage of a buffer will cause a copy of the buffer with that write, and all prior applied, to be pushed into the megabuffer, this way at the start of execute the megabuffer will hold all used states of the buffer simultaneously. Draws then reference these individual states in sequence to allow everything to work without any copies. In order to support this buffers have been moved to an immediate sync model, with synchronisation being done at usage-time rather than execute (in order to keep contents properly sequenced) and GPU-side writes now need to be explicitly marked (since they prevent megabuffering). It should also be noted that a fallback path using cmdCopyBuffer exists for the cases where buffers are too large or GPU dirty.	2022-04-23 22:48:28 +01:00
lynxnb	0d9992cb8e	Implement `QuadList` support for non-indexed draws	2022-04-20 18:17:10 +02:00
lynxnb	bcaf7dfe1c	Make `GetVertexBuffer` return a pointer to the requested buffer This avoids a redundancy in the `Draw` function and makes code easier to read	2022-04-20 18:16:45 +02:00
Billy Laws	5c3559e888	Revert "Implement support for GPU-side constant buffer updating" This reverts commit `d79635772f`.	2022-04-18 13:28:58 +01:00
Billy Laws	7bf3580031	Revert "Allow external synchronization for buffers" This reverts commit `372ab8befa`.	2022-04-18 13:28:58 +01:00
PixelyIon	ddc9622b90	Fix Shader Module Cache As bindings weren't correctly handled due to the fact that `EmitSPIRV` would change the bindings, the shader module cache would not correctly function and have no cache hits in `find` and rather have them in `try_emplace` which negated any performance benefit of it. This has now been fixed by retaining the initial cache key for insertion into the cache while also storing the post-emit bindings and restoring them during a cache hit.	2022-04-18 12:18:15 +05:30
Billy Laws	32fe01e145	Implement batch constant buffer updates Avoids spamming the driver with hundreds of cbuf updates per frame by batching all consecutive updates into one.	2022-04-17 00:35:00 +01:00
PixelyIon	02f99273ac	Implement Shader Module Cache Implements caching of the compiled shader module (`VkShaderModule`) in an associative map based on the supplied IR, bindings and runtime state to avoid constant recompilation of shaders. This doesn't entirely address shader compilation as an issue since host shader compilation is tied to Vulkan pipeline objects rather than Vulkan shader modules, they need to be cached to prevent costly host shader recompilation.	2022-04-16 18:45:56 +05:30
PixelyIon	76d8172a35	Implement Shader IR Cache This implements the first step of a full shader cache with caching any IR by treating the shared pointer as a handle and key for an associative map alongside hashing the Maxwell shader bytecode, it supports both single shader program and dual vertex program caching.	2022-04-16 18:45:56 +05:30
PixelyIon	0baa90d641	Implement `SpanEqual` and `SpanHash` We desire the ability to hash and check equality of data across spans to use associative containers such as `std::unordered_map` with spans. The implemented functions provide an easy way to do that.	2022-04-16 18:45:56 +05:30
Billy Laws	df5d1256c2	Implement an object backed IStorage backing This is more convenient and efficient to use when passing structured data out of applets	2022-04-16 18:45:56 +05:30
Billy Laws	d115ce3c05	Stub the controller applet Mostly based off of yuzu's implementation, this will need to be extended in the future to open up a UI for configuring controllers according to the application's requirements.	2022-04-16 18:45:56 +05:30
Billy Laws	9a8e39cba1	Slightly refactor controller code in HID Now uses ranges where possible and a function to get the number of connected controllers has been added.	2022-04-16 18:45:56 +05:30
Billy Laws	2873f11baa	Pass shared pointers by value in applet infrastructure This is more optimal than crefs when used together with std::move	2022-04-16 18:45:56 +05:30
PixelyIon	8ccef733ff	Fix UB with guest-less Texture/Buffers in `MarkGpuDirty` As there was no check for the lack of a `GuestTexture`/`GuestBuffer`, marking a texture/buffer that had no guest (such as the `zeroTexture` from `GraphicsContext`) as dirty would lead to UB: it would cause a call to `NCE::RetrapRegions` with a `nullptr` handle that would be dereferenced and cause a segmentation fault.	2022-04-16 18:45:56 +05:30
PixelyIon	372ab8befa	Allow external synchronization for buffers In certain situations such as constant buffer updates, we desire to use the guest buffer as a shadow buffer forwarding all writes directly to it while we update the host using inline buffer updates so they happen in-sequence. This requires special behavior as we cannot let any synchronization operations take place as they would break the shadow buffer, as a result, an external synchronization flag has been added to prevent this from happening. It should be noted that this flag is not respected for buffer recreation which will lead to UB, this can and will break updates in certain cases and this change isn't complete without buffer manager support.	2022-04-16 18:44:53 +05:30
PixelyIon	c0c4db68a8	Fix `BufferView` offset not being added in `vkCmdUpdateBuffer` The offset of the view wasn't added to the `vkCmdUpdateBuffer`, this would cause the offset to be incorrect given the buffer was a view of a larger buffer that wasn't the start of it. This commit fixes that by adding the offset of the view to the buffer update.	2022-04-14 18:06:15 +05:30
PixelyIon	a1c06e0401	Mark GPU resources as dirty before GPU usage We didn't call `MarkGpuDirty` on textures/buffers prior to GPU usage, this would cause them to not be R/W protected when they should be and provide outdated copies if there were any read accesses from the CPU (which are not currently possible since we assume all accesses are writes). This has now been fixed by calling it after synchronizing the resource.	2022-04-14 17:20:05 +05:30
PixelyIon	41a6afed01	Fix `GraphicsContext` code formatting for auto formatter	2022-04-14 15:27:22 +05:30
PixelyIon	624df92616	Change `AddNonGraphicsPass` to `AddOutsideRpCommand` The terminology "Non-Graphics pass" was deemed to be fairly inaccurate since it simply covered all Vulkan commands (not "passes") outside the render-pass scope, these may be graphical operations such as blits and therefore it is more accurate to use the new terminology of "Outside-RenderPass command" due to the lack of such an implication while being consistent with the Vulkan specification.	2022-04-14 15:20:22 +05:30
Billy Laws	a31332e35f	Align Maxwell 3D macro newline slashes	2022-04-14 14:14:52 +05:30
Billy Laws	d79635772f	Implement support for GPU-side constant buffer updating Previously constant buffer updates would be handled on the CPU and only the end result would be synced to the GPU before execute. This caused issues as if the constant buffer contents were changed between each draw in a renderpass (e.g. text rendering) the draws themselves would only see the final resulting constant buffer. Fix this by updating cbufs on the GPU/CPU separately, only ever syncing them back at the start or after a guest side CPU write, at the moment only a single word is updated at a time however this can be optimised in the future to batch all consecutive updates into one large one.	2022-04-14 14:14:52 +05:30
Robin Kertels	036faedabd	Implement a way to run non-graphics passes with command executor These commands will end the current renderpass and run on their own, this is useful for compute, blits etc.	2022-04-14 14:14:52 +05:30
Billy Laws	feb179fcff	Implement primitive restart support Maxwell3D also supports using an arbitrary restart index value however no games are known to use this so leave it for now.	2022-04-14 14:14:52 +05:30
Billy Laws	3f3acc31d8	Rework swizzle infrastructure to support arbitrary format swizzles This is required to support R4G4B4A4 which has no directly corresponding Vulkan format. Co-authored-by: Lunar-Pixel <lunarn452@gmail.com>	2022-04-14 14:14:52 +05:30
PixelyIon	6f85a66151	Implement host-only `Buffer`s We require certain buffers to only be on the host while being accessible through the same abstractions as a guest buffer as they must be interchangeable in usage.	2022-04-14 14:14:52 +05:30
Billy Laws	2c697ec36a	Determine depth/stencil texture aspect based off of image swizzle Required since we can't have a non-rt image with both a depth/stencil aspect at the same time according to vk spec.	2022-04-14 14:14:52 +05:30
PixelyIon	1878e582ad	Add `ScopedStackBlocker` to `RomFile.populate` We needed to block stack frame lookups past JNI code as Java doesn't follow the ARMv8 frame pointer ABI which leads to invalid pointer dereferences. Any JNI function that throws or handles exceptions must do this now or it may lead to a `SIGSEGV`.	2022-04-14 14:14:52 +05:30
Billy Laws	68e693d9f4	Fix DMA Engine debug logs to not crash emu Address causes some type issues when printing directly so explicitly cast to u64 first to prevent them.	2022-04-14 14:14:52 +05:30
Billy Laws	8eaca87de8	Use an empty host texture in place of invalid TIC entries on guest Some games may pass empty TICs as inputs to shaders while not actually using them within the shader. Create an empty texture and pass this in instead when we hit this case, the nullDescriptor feature could be used but it's not supported by all devices so we chose to do it this way instead.	2022-04-14 14:14:52 +05:30
PixelyIon	41b98c7daa	Add stack tracing to `skyline::exception` Skyline's `exception` class now stores a list of all stack frames during the invocation of the exception. These can later be parsed by the exception handler to generate a human-readable stack trace. To assist with more complete stack traces, `-fno-omit-frame-pointer` is now passed on debug builds which forces the inclusion of frames on function calls.	2022-04-14 14:14:52 +05:30
PixelyIon	cd8fa66326	Fix NCE Destruction NCE is implicitly depended on by the `GPU` class due to the NCE Memory Trapping API so the destruction of it must take place after the destruction of the `GPU` class. Additionally, to prevent bugs the NCE destructor must set `staticNce` to `nullptr` as the signal handler will potentially access a destroyed instance of NCE otherwise.	2022-04-14 14:14:52 +05:30
Billy Laws	815f1f4067	Add support for sRGB TIC textures Without this sRGB textures would be interpreted as RGB leading to colours being slightly off. The sRGB flag isn't stored as part of format word so we reuse the _pad_ field of it to store the flag for the switch case.	2022-04-14 14:14:52 +05:30
Billy Laws	1ba4abf950	Add Astc{6x6,8x8} and R4G4B4A4 image formats	2022-04-14 14:14:52 +05:30
MCredstoner2004	dec0571eee	Infrastructure for applets to be implemented This removes a stub for an applet and implements several applet related service calls.	2022-04-14 14:14:52 +05:30
PixelyIon	164d4852fa	Sleep-loop rather than abort during termination We don't want to actually exit the process as it'll automatically be restarted gracefully due to a timeout after being unable to exit within a fixed duration so we just want to infinite sleep during termination. This should fix issues where exiting any game would cause the app to force close after some time as exception signal handling would fail in the background, the app should stay open now and automatically restart itself when another game is loaded in.	2022-04-14 14:14:52 +05:30
PixelyIon	ea00f1bb82	Flush emulation logs after exceptions A lot of logs are incomplete due to being unable to flush inside the signal handler, now we flush after any exceptions so that there is a guarantee of any exceptions being logged as this is crucial for proper debugging.	2022-04-14 14:14:52 +05:30
PixelyIon	62ba180550	Use R5G6B5 as Vulkan swapchain format rather than B5G6R5 B5G6R5 isn't generally supported by the swapchain and the format is used for R5G6B5 with swapped R/B channels to avoid aliasing so we reverse that by using R5G6B5 as the underlying Vulkan format for the swapchain which should be automatically handled by the driver for any copies from B5G6R5 textures and the data representation should be the same as B5G6R5 with swapped R/B channels so not reporting the correct texture::Format should be fine.	2022-04-14 14:14:52 +05:30
MK73DS	e54f86e923	Fix IApplicationFunctions::GetDisplayVersion id (https://switchbrew.org/wiki/Applet_Manager_services#IApplicationFunctions)	2022-04-14 14:14:52 +05:30
Billy Laws	77cf33b643	Trigger command executor before DMA copies DMA copies can use textures currently in active use on the GPU as dst/src so Execute before to prevent a deadlock	2022-04-14 14:14:52 +05:30
Billy Laws	dbbc5704d2	Implement DMA engine Block Linear->Linear copies	2022-04-14 14:14:52 +05:30
Billy Laws	3e4e8de1d2	Implement primitive Linear->Block Linear DMA engine copies Slightly inaccurate and misses some features but good enough for most games, should be revisited later.	2022-04-14 14:14:52 +05:30
Billy Laws	3c26921d54	Implement the Maxwell DMA engine The DMA engine is used to perform DMA buffer/texture copies directly on the GPU. It can deswizzle arbitrary regions of input textures, perform component remapping and swizzle into output textures. This impl only supports 1D buffer copies, 2D ones will come later.	2022-04-14 14:14:52 +05:30
Billy Laws	3df76e84c3	Stub IRequest::GetAppletInfo in nifm	2022-04-14 14:14:52 +05:30
Billy Laws	6c5f9941ad	Stub additional IAddOnContentManager functions Used mainly by UE4 games	2022-04-14 14:14:52 +05:30
Billy Laws	486a835d0a	Use guest texture view type to determine the underlying image type If we have a Nx1x1 image then determining the type from dimensions will result in a 1D image being created thus preventing us from creating a 2D view. By using the image view type we can avoid this for textures from TICs since we know in advance how they will be used	2022-04-14 14:14:52 +05:30
Billy Laws	05966f34e5	Stub a pair of ISelfController functions Both used by SMO, SetScreenShotPermission and SetAlbumImageOrientation	2022-04-14 14:14:52 +05:30
Billy Laws	fe37d7c9be	Implement ICommonStateGetter::SetRequestExitToLibraryAppletAtExecuteNextProgramEnabled	2022-04-14 14:14:52 +05:30
Billy Laws	9813f9f8dc	Implement ICommonStateGetter::GetDefaultDisplayResolutionChangeEvent	2022-04-14 14:14:52 +05:30
Billy Laws	7e7c0252ca	Implement IApplicationFunctions::GetDisplayVersion	2022-04-14 14:14:52 +05:30
Billy Laws	b1f10865a0	Attach depth RT to command executor before draws This enforces that the depth RT outlives the draw, without this the depth RT could be freed while in active use by command executor leading to UAFs and crashes.	2022-04-14 14:14:52 +05:30


1267 Commits