skyline

mirror of https://github.com/skyline-emu/skyline.git synced 2024-12-25 19:11:51 +01:00

Author	SHA1	Message	Date
Billy Laws	a09414424b	Fix broken VFS directory creation	2022-06-02 00:04:01 +01:00
Billy Laws	3518e04a18	Correct Directory EntryType to be u8 rather than u32	2022-06-02 00:04:01 +01:00
Billy Laws	0c11d9e294	Implement IDirectory::GetEntryCount	2022-06-02 00:04:01 +01:00
MCredstoner2004	c15b3a8d40	Make Applet accesses to the data queues lock Avoids potential races when the guest accesses the same applet from more than one thread.	2022-06-02 03:47:38 +05:30
Billy Laws	91b2c47991	Fix potential nvdrv submission race The syncpoint maximum value represents the maximum possible syncpt value at a given time, however due to PBs being submitted before max was incremented, for a brief moment of time this is not the case which could lead to crashes or other such behaviour if a game waits on the fence at the right moment.	2022-06-01 17:15:25 +01:00
Billy Laws	c4bd9c47e4	Stub NVGPU_GPU_IOCTL_ZBC_SET_TABLE nvdrv ioctl This was missed in the original implementation and caused crashes in some games.	2022-06-01 16:59:14 +01:00
Billy Laws	c639fdcf06	Fixup NFP service stub state handling Previously a broken state value was returned from GetState that caused crashes in games using newer SDKs and NFP, correctly handle state now by updating it after initialisation.	2022-06-01 15:00:26 +01:00
Billy Laws	c745e0e02b	Move image type logic to GuestTexture, allowing 2D array views for 3D RTs We can't render to a 3D texture through a 3D view, we instead have to create a 2D array view into it and render to that. The texture manager previously didn't support having a different view type/layer count between a guest texture view and the underlying storage texture that is required to support this so that was also implemented by reading the view layer count from the dimensions depth instead if the underlying texture is 3D (and the view type is 2D array). Additionally move away from our own view type enum to Vulkan, inline with other guest texture member types.	2022-05-31 22:09:53 +01:00
Billy Laws	22695c4feb	Stub nim services used for eShop communication We obviously don't need to implement these so add a simple set of stubs to satisfy games using them (mainly demos such as DQXII)	2022-05-31 22:07:01 +01:00
Billy Laws	ff12dc9c10	Add R32_SFLOAT to adreno validation layer format filtering	2022-05-31 22:03:53 +01:00
Billy Laws	6cc925c2d3	Reset RT mappings on dimension and format changes	2022-05-31 17:49:16 +01:00
Billy Laws	8180bf852e	Lock textures before attaching in BlitContext	2022-05-31 16:54:13 +01:00
Billy Laws	cb2b36e3ab	Allow providing a callback that's called after every CPU access in GMMU Required for the planned implementation of GPU side memory trapping.	2022-05-31 16:04:27 +01:00
Billy Laws	46ee18c3e3	Require depthBiasClamp Vulkan device feature Used in some UE4 games and supported by 95% of devices so skip implementing a fallback path.	2022-05-31 14:46:45 +01:00
PixelyIon	e592b11039	Drop `samplerAnisotropy` as a required GPU feature Sampler anisotropy was made a required feature in an earlier commit due to its widespread availability but this was determined to be incorrect as certain Mali GPUs that can otherwise run 2D games in Skyline do not have this feature, while they are still not officially supported as this was the only roadblock to support them, it has now been made an optional feature.	2022-05-31 01:37:40 +05:30
Narr the Reg	7aa6a5c4ca	Add HID touch attribute and index reporting Adds missing parameter TouchAttribute and emulates correctly the touch point index. Both changes are necessary on Voez to keep track of each finger.	2022-05-29 10:28:51 +01:00
PixelyIon	80c8fb8791	Implement CPU BCn Texture Decoding Certain GPU vendors such as ARM's Mali do not have support for BCn textures whatsoever while other vendors such as AMD only have partial support (BC1-BC3). Most titles on the guest utilize BC textures and to address this on host GPUs without support for BCn, we need to decompress the texture on the CPU. This commit implements a CPU BCn texture decoder based off Swiftshader's BC decoder, it also adds the necessary infrastructure to have different formats for the `GuestTexture` and `Texture` objects.	2022-05-28 21:22:24 +05:30
PixelyIon	fe615b1e03	Clarify texture swizzling inner-loop iteration count The iterations of the inner loop for sector deswizzling was miscalculated as `SectorWidth * SectorHeight` while the result was correct at `32`, it should be determined by the amount of sector lines within a GOB i.e.: `(GobWidth / SectorWidth) * GobHeight`.	2022-05-28 21:22:24 +05:30
PixelyIon	7d4e0a7844	Implement Mipmapped Texture Support Support for mipmapped textures was not implemented which is fairly crucial to proper rendering of games as the only level that would load is the first level (highest resolution), that might result in a lot more memory bandwidth being utilized. Mipmapping also has associated benefits regarding aliasing as it has a minor anti-aliasing effect on distant textures. This commit entirely implements mipmapping support but it does not extend to full support for views into specific mipmap levels due to the texture manager implementation being incomplete.	2022-05-28 21:22:24 +05:30
PixelyIon	da7e6a7df7	Replace Maxwell DMA `GuestTexture` usage with new swizzling API Maxwell DMA requires swizzled copies to/from textures and earlier it had to construct an arbitrary `GuestTexture` to do so but with the introduction of the cleaner API, this has become redundant which this commit cleans up and replaces with direct calls to the API with all the necessary values.	2022-05-28 21:22:24 +05:30
PixelyIon	de300bfdbe	Refactor Texture Swizzling The API for texture swizzling is now more concrete and abstracted out from `GuestTexture`, this allows for neater usage in certain areas such as MaxwellDMA while having a `GuestTexture` wrapper as well allowing for neater usage in those cases. The code itself has also been cleaned up slightly with all usage of `u32`s being upgraded to `size_t` as this is simply more efficient due to the compiler not needing to emulate wraparound behavior for integer types smaller than the processor word size.	2022-05-19 17:13:55 +05:30
Billy Laws	72473369b6	Account for OOB copyOffsets in CircularBuffer::Read Caused crashes in Pokemon	2022-05-14 15:30:59 +01:00
Robin Kertels	0a3cf25823	Implement the Fermi 2D blitting engine The Fermi 2D engine implements both image blit and resolve operations, supporting subpixel sampling with both linear and point filtering. Resolve operations are performed by sampling from the center of each pixel in order to resolve the final image from the MSAA samples. MSAA images are stored in memory like regular images but each pixel's dimensions are scaled: e.g. for 2x2 MSAA ``` 112233 112233 445566 445566 ``` These would be sampled with both duDx and duDy as 2 (integer part), resolving to the following: ``` 123 456 ``` Blit operations are performed by sampling from the corner of each pixel, scaling the image as one would expect. This implementation isn't fully complete as Vulkan blit doesn't support some combinations which Fermi does, most notably between colour and depth stencil. These will be implemented properly at a later date, likely after the texture manager rework. Out of Bounds Blit, used by some OpenGL games, is also missing since supporting it requires texture aliasing, this will also be supported after the texture manager rework. Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-05-13 22:37:37 +01:00
Billy Laws	be2546138d	Move IOVA class to GMMU so it can be used for other engines	2022-05-13 22:37:37 +01:00
Billy Laws	3ad640fcbc	Fix accidental graphics context member/parameter duplication	2022-05-13 22:37:37 +01:00
PixelyIon	7a6f27a19a	Fix texture swizzling OOB writes Certain writes during swizzling went out of bounds due to incorrect `blockExtentY` calculation, the previous commit to fix this ended up breaking it further. This commit returns to the original commit's calculations with the proper addendum of a check for exact alignment with a GOB which is the case that was broken earlier.	2022-05-13 14:52:41 +05:30
PixelyIon	168e51e7ad	Always use `GetLayerStride` for layer stride in Texture The `GuestTexture::GetLayerStride` function was not always being utilized to retrieve the layer stride inside `Texture`, it would instead directly access the `guestTexture::layerStride` member. This is problematic as it may not be initialized and return `0` which would lead to a broken image copy.	2022-05-13 14:21:37 +05:30
Billy Laws	b81d5bc865	Implement and cleanup semaphore operations in all engines Most engines have the capability to release a semaphore payload (or reduce in the case of GPFIFO) when a method is called or action is complete. Semaphores are used by games for both timing how long things take on GPU and waiting on resources so missing them can cause deadlocks or other related issues.	2022-05-12 19:40:24 +01:00
Billy Laws	bca88685bd	Stub nvdrv {Get,Dump}Status	2022-05-12 17:38:22 +01:00
Billy Laws	97e740c986	Fix slight locking bug with nvmap handle duplication	2022-05-12 17:38:22 +01:00
Billy Laws	57378457dc	Treat symbol file paths without slashes as filenames Prevents crashes printing backtrace if this occurs	2022-05-12 17:38:22 +01:00
Billy Laws	d08ac63bbf	Use TIC maximum index over TSC when tscIndexLinked is set	2022-05-12 17:38:22 +01:00
Billy Laws	8e021a9f1f	Load custom drivers from app private data dir Required since /sdcard doesn't have exec perm support	2022-05-12 17:38:21 +01:00
Billy Laws	dcef597345	Introduce TrivialObject concept and use where appropriate Simplifies type checking and handles excluding container types that are trivially copyable but contain pointers	2022-05-12 17:38:21 +01:00
PixelyIon	f2cc25ee9f	Implement Array Texture Swizzling Textures can have more than one layer which we currently don't handle, all layers past the initial one will be filled with random data or 0s, leading to incorrect rendering. This has now been implemented which fixes any titles which utilize array textures, such as "Super Mario Odyssey" or "Hatsune Miku: Project DIVA MegaMix".	2022-05-12 18:23:45 +05:30
PixelyIon	2a99e1784d	Fix Maxwell3D RT Depth/Layer Count Logic The Maxwell3D RT layer count wasn't being set correctly as it has the same register as the depth values and is toggled between the two based on another register value.	2022-05-12 18:23:05 +05:30
Billy Laws	543ac3042e	Cleanup account services and stub StoreSaveDataThumbnail	2022-05-11 23:24:35 +01:00
Billy Laws	7d30ac0cd8	Add additional nifm stubs	2022-05-11 23:24:35 +01:00
Billy Laws	a164635f32	Stub LibraryAppletPlayerSelect	2022-05-11 23:24:35 +01:00
Billy Laws	dd0004e208	Set Host1x log tag correctly	2022-05-11 22:11:16 +01:00
Billy Laws	f89bacf8ae	Fixup Host1x syncpoint locking	2022-05-11 22:04:02 +01:00
Billy Laws	d8ff318a1a	Prevent infinite VFS read loop on EOF	2022-05-11 22:03:39 +01:00
shutterbug2000	f078a5d1ec	Stub `bt` and `btm:u` Stub BT services which is required by titles such as Pokémon Let's GO Pikachu and Eevee (non-Demo versions).	2022-05-11 20:44:09 +05:30
PixelyIon	588b4529ee	Implement 3D Texture Swizzling The Maxwell GPU supports 3D textures which are tiled with the block-linear layout which didn't handle swizzling 3D textures correctly till now. This commit addresses that by implementing proper swizzling for 3D textures. Titles such as Cluster Truck and Super Mario Odyssey utilize 3D textures alongside a vast majority of other titles.	2022-05-11 14:06:04 +05:30
Billy Laws	601d67e369	Use resource size rather than allocation size for staging buffer size As per VMA docs: 'Allocation size returned in this variable may be greater than the size requested for the resource e.g. as VkBufferCreateInfo::size. Whole size of the allocation is accessible for operations on memory e.g. using a pointer after mapping with vmaMapMemory(), but operations on the resource e.g. using vkCmdCopyBuffer must be limited to the size of the resource.'	2022-05-10 18:48:20 +01:00
Billy Laws	d2acec24f5	Handle VFS reads into trapped memory regions pread will refuse to read into any trapped regions so implement a manual path with a staging buffer and memcpy for such cases	2022-05-10 18:33:55 +01:00
Billy Laws	1609fd2a32	Account for layerCount in SynchronizeGuestWithBuffer staging buffer size	2022-05-10 18:33:31 +01:00
Billy Laws	5b97b87503	Restore previous cullMode when cullFace is enabled	2022-05-10 18:31:32 +01:00
Billy Laws	15e9fa1c80	Fix FillRandomBytes There were two issues here: - If a skyline span was passed as a param then the 'T &object' version would be called, filling the span itself with random values rather than its contents - Random numbers were repeated every call since independent_bits_engine copied generator state and thus it was never actually updated	2022-05-10 18:28:15 +01:00
Billy Laws	622ff2a8f1	Correctly track 5.1 audio channel sample count Size needs to be adjusted for 5.1 buffers since they're downsampled to stereo.	2022-05-10 18:26:20 +01:00
PixelyIon	56c9b03843	Fix incorrect swizzling Y extent calculation This calculation for the amount of lines on the Y axis relative to the start of the last block was wrong and would instead determine the amount of lines to the last Y-axis GOB which wasn't accurate when padding was considered, this resulted in titles like Celeste having broken texture decoding (on a 1922x1082 texture) for the last ROB as most pixels would be masked out.	2022-05-09 20:25:43 +05:30
Billy Laws	018df355f0	Replace some VFS exceptions with warnings These errors aren't necessarily fatal so tone them down.	2022-05-08 19:37:10 +01:00
Billy Laws	e1c13bbc08	Update hades	2022-05-08 19:37:10 +01:00
PixelyIon	b307fca115	Fix attachment reuse within the same subpass Certain titles such as BOTW trigger behavior to reuse an attachment within the same subpass, this caused an exception inside `RenderPassNode::AddAttachment` as it cannot find corresponding subpass for attachment. To fix this issue, we now assume that when it cannot find a subpass for an existing attachment, it is attached to the latest subpass and return the attachment.	2022-05-08 18:26:40 +05:30
PixelyIon	e027555796	Handle Y-axis GOB non-alignment for swizzling Certain textures may be unaligned with a GOB's height of 8 lines, we already handle the case of being unaligned with a GOB's width of 64-bytes. This case occurs on titles such as SMO when going in-game.	2022-05-07 18:37:22 +05:30
PixelyIon	c910e29168	Extend `HostSignalHandler`'s `SIGSEGV` debugger path The function now returns from a segmentation fault when a debugger is present, this allows the entire context to be intact which can allow the debugger to correctly pick up variables from all stack frames while it could not extrapolate most variables when trapped inside the signal handler without the values of all registers.	2022-05-07 18:37:22 +05:30
Billy Laws	4149ab1067	Implement Maxwell 3D instanced draw support In the Maxwell 3D engine, instanced draws are implemented by repeating the exact same draw in sequence with a special flag set in vertexBeginGl. This flag allows either incrementing the instance counter or resetting it, since we need to supply an instance count to the host API we defer all draws until state changes occur. If there are no state changes between draws we can skip them and count the occurrences to get the number of instances to draw.	2022-05-07 13:56:09 +01:00
Billy Laws	03594a081c	Ensure correct flushing for batched constant buffer updates Cbufs could be read by non-maxwell3D engines so force a flush when switching to them or before Execute.	2022-05-07 13:56:09 +01:00
PixelyIon	ad989750fc	Implement Maxwell3D Point Sprite Size Implements register state that corresponds to the size of a single point sprite in Maxwell 3D, this is emitted by the shader compiler in the preamble but needs to be only applied if the input topology is a point primitive and it is invalid to set the point size in any other case.	2022-05-07 03:46:25 +05:30
PixelyIon	874a6a2a6c	Fix `getTextureType` enum conversion formatting	2022-05-07 03:46:25 +05:30
PixelyIon	ae5bcbdb5c	Fix Depth RT lock to be in scope Earlier texture locking design required the lock to be retained but since the introduction of `AttachTexture`, this no longer needs to be done. This being done caused deadlocks when the depth texture is sampled by the fragment shader while being bound as an RT since it would attempt to lock the texture again.	2022-05-07 02:37:48 +05:30
shutterbug2000	1c8d994161	Basic `bcat:u` implementation A basic `bcat:u` implementation to prevent titles such as "Kirby and the Forgotten Land" dependent on BCAT support from crashing due to the lack of an implementation.	2022-05-06 15:41:48 +05:30
PixelyIon	4fd64a53e0	Require Vulkan `samplerAnisotropy` feature This is a widely supported feature that games may require conditionally but due to it being supported on effectively all target devices, it was made mandatory. This is used by titles such as ARMS.	2022-05-06 15:41:48 +05:30
PixelyIon	1d9b4a865a	Add additional formats to Adreno filter `VK_FORMAT_R32G32B32A32_SFLOAT` and `D32_SFLOAT` have their capabilities misreported as well, this spams the logs in titles such as ARMS.	2022-05-06 15:41:48 +05:30
PixelyIon	b87295374e	Improve Controller Applet log Improves the readability of the log and replaces the previously uninformative prefix of `operator()` due to being in a lambda with `Controller support`.	2022-05-06 15:41:48 +05:30
PixelyIon	98c730a644	Implement linked TIC/TSC handle in Maxwell3D Maxwell3D has a register for linking the TIC/TSC index in bindless texture handles, this is used by games to implement bindless combined texture-sampler handles.	2022-05-06 14:58:20 +05:30
PixelyIon	23a091100d	Implement `ReadCbufValue` + `ReadTextureType` Implements `GraphicsEnvironment::ReadCbufValue` & `GraphicsEnvironment::ReadTextureType` with a framework of heterogeneous lookups for caching and callbacks for querying constant buffer or TIC values with validation checks for successive draws to ensure unique IR is generated.	2022-05-06 14:39:36 +05:30
PixelyIon	765c3f4e1f	Allow draws with no descriptor set resources The `descriptorSetWrites` being filled is now optional and the case of it being empty is handled correctly, this is done by certain titles such as ARMS and is entirely valid behavior. It should be noted that not doing this leads to errors in the guest due to invalid GPU state while working on the host GPU.	2022-05-06 10:33:47 +05:30
PixelyIon	37327f1955	Fix and refactor SVC `SignalToAddress`/`WaitForAddress` SVC `SignalToAddress` had a bug with the behavior of `SignalAndModifyBasedOnWaitingThreadCountIfEqual` which was entirely incorrect and led to deadlocks in titles such as ARMS that were dependent on it. This commit corrects the behavior and refactors both SVCs and moves their arbitration/waiting to inside the corresponding `KProcess` function rather than the SVC to avoid redundancies and improve code readability.	2022-05-05 19:15:37 +05:30
PixelyIon	396979e897	Extend Adreno format-based filtering for Validation Layer Filtering of validation logs is now extended beyond BCn formats and now covers other format which have their feature set misreported by the driver, this significantly drives down the amount of logs depending on the title.	2022-05-05 19:15:37 +05:30
PixelyIon	62ea2a6da5	Avoid format aliasing warnings on Adreno Implements an algorithm to determine formats that can be aliased as views without needing `VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT`, this avoids spamming warning logs on view creation when the aliased formats will function in practice.	2022-05-05 19:15:37 +05:30
PixelyIon	7206ab4c67	Fix `exclusiveSubpass` by finishing render pass at end There was an oversight with exclusive subpasses which could lead to RPs with more than one subpass being created even though one pass was exclusive, this oversight was not finishing the render pass at the end of `AddSubpass`. This could lead to a future subpass adding to the end of that RP even though it was intended to exclusively have a single subpass. This case occurs in titles such as Celeste (in-game) and breaks rendering on GPUs that may require exclusive subpasses for proper functionality.	2022-05-05 11:14:38 +05:30
PixelyIon	96fe5f0a0e	Set initial `subpassCount` value to 1 rather than 0	2022-05-05 11:07:43 +05:30
PixelyIon	5d08d6e06f	Disable unnecessary Khronos Validation Layer logs The Khronos Validation Layer can often generate warning/error logs due to our intentional breakage from Vulkan specification, these can occur several times a frame resulting in the logs being spammed and making it difficult to extract useful information out of logs. The scope of these logs has now been reduced with more general filtering and the introduction of specialized filtering to handle complex cases such as BCn hacks with `libadrenotools` on Adreno devices.	2022-05-04 13:20:59 +05:30
PixelyIon	23c9388caf	Fix `VK_KHR_push_descriptor`-less path for descriptor set updates Descriptor set updates were broken on the non-push-descriptor path due to lifetime issues with VkDescriptorSetLayout's usage during the execution phase which entirely broke rendering on AMD/Mali GPUs due to them not supporting `VK_KHR_push_descriptor`. This commit addresses that by moving the allocation of a descriptor set to outside the lambda and into the recording phase, it also simplifies the semantics and resources passed into the lambda by removing redundancies.	2022-05-04 00:49:21 +05:30
PixelyIon	47bc3b4d99	Fix Render Pass Cache The Vulkan render pass cache was fundamentally broken since it was designed around the Render Pass Compatibility clause due to being designed for framebuffer compatibility initially. As this scope was extended to a general render pass cache, the amount of data in the key was not extended to include everything it should have. This commit introduces the missing pieces in the RP cache and simplifies the underlying code in the process.	2022-05-01 20:31:36 +05:30
PixelyIon	25a29f9044	Skip zero-initializing shader bytecode backing The backing for shader data would implicitly be zero-initialized due to a `resize` on every shader parse, this was entirely unnecessary as we would overwrite the entire range regardless. We avoid this by using statically allocated storage and a span over it containing the shader bytecode which avoids any unnecessary clear semantics without resorting to more complex solutions such as a custom allocator.	2022-05-01 18:27:27 +05:30
PixelyIon	42573170c6	Implement Framebuffer Cache Implements a cache for storing `VkFramebuffer` objects with a special path on devices with `VK_KHR_imageless_framebuffer` to allow for more cache hits due to an abstract image rather than a specific one. Caching framebuffers is a fairly crucial optimization due to the cost of creating framebuffers on TBDRs since it involves calculating tiling memory allocations and in the case of Adreno's proprietary driver involves several kernel calls for mapping and allocating the corresponding framebuffer memory.	2022-05-01 18:27:27 +05:30
PixelyIon	af7f0c301e	Avoid redundant `VkImageView` recreation There are a lot of cases of `VkImageView` being recreated arbitrarily due to it being tied to the ephemeral object `TextureView` rather than `Texture`, this commit flips that by storing all `VkImageView`s inside `Texture` with `TextureView` simply holding a copy of the handle to them. Additionally, this change results in stable `VkImageView` handles and helps in paving the path for framebuffer caching when `VK_KHR_imageless_framebuffer` is unavailable.	2022-05-01 18:27:27 +05:30
PixelyIon	da931cf07b	Implement Render Pass Cache Implements a cache for storing `VkRenderPass` objects which are often reused, they are not extremely expensive to create generally but this is a required step to build up to a framebuffer cache which is an extremely expensive object to create on TBDRs generally since it involves calculating tiling memory allocations and in the case of Adreno's proprietary driver involves several kernel calls for mapping and allocating the corresponding memory.	2022-05-01 18:16:53 +05:30
Billy Laws	ae77bde171	Fixup audio device name writing in services Games expect the output buffer to be entirely zero filled past the device name.	2022-04-30 16:00:33 +01:00
Billy Laws	194cbe6c7c	Stub several HID functions	2022-04-30 16:00:33 +01:00
Billy Laws	112c20cef2	Stub QueryAudioDevice{Input,Output}Event Used in many 3.0.0+ games	2022-04-30 16:00:33 +01:00
Billy Laws	8d7dbe2c4e	Add a way to get a readonly span of Buffer contents Avoids the need to redundantly copy data when it is being directly processed on the CPU (e.g. quad conversion)	2022-04-30 16:00:33 +01:00
MK73DS	4c71ef5c31	Fix American English language code	2022-04-30 18:43:22 +05:30
PixelyIon	90c635bf78	Coalesce subpasses with compatible attachments together We run into a lot of successive subpasses with the exact same framebuffer configuration which we now exploit to avoid the creation of a new subpass due to the overhead involved with this. This provides significant performance boosts in certain cases due to the magnitude of difference in the amount of subpasses being created while providing next to no benefit in other cases.	2022-04-27 13:22:34 +05:30
PixelyIon	a947933bf0	Fix `Buffer` cycle check being inverted The check for the fence cycle being the same as the current cycle was incorrectly inverted to be the opposite of what it should have been, leading to bugs.	2022-04-27 13:07:36 +05:30
PixelyIon	54794f4b71	Move `Texture` locking and synchronization to `PresentationEngine` The responsibility for synchronizing a texture and locking it is now on the `PresentationEngine` rather than the API-user as this'll allow more fine grained locking and delay waiting until necessary.	2022-04-25 21:01:16 +05:30
Billy Laws	1dd230afde	Refactor all std::lock_guard usages to std::scoped_lock	2022-04-25 15:00:30 +01:00
PixelyIon	94e6f3cfa0	Add quirk for relaxed render pass compatibility As we require a relaxed version of the Vulkan render pass compatibility clause for caching multi-subpass render passes, we now utilize a quirk to determine if this is supported which it is on Nvidia/Adreno while AMD/Mali where it isn't supported we force single-subpass render passes.	2022-04-24 16:18:36 +05:30
PixelyIon	44615c8dd2	Implement per-vendor `VkQueue` maximum global priority We found out that certain vendors such as Nvidia had a limitation on the global priority of a queue and requesting `VK_QUEUE_GLOBAL_PRIORITY_HIGH_EXT` would result in `VK_ERROR_NOT_PERMITTED_EXT`. A quirk has been introduced to supply the maximum supported global priority which is currently set on a per-vendor basis to avoid future crashes.	2022-04-24 16:15:01 +05:30
PixelyIon	7ef4959060	Implement Graphics Pipeline Cache Implements a cache for storing `VkPipeline` objects which are fairly expensive to create and doing so on a per-frame basis was rather wasteful and consumed a significant part of frametime. It should be noted that this is not compliant with the Vulkan specification and will break unless the driver supports a relaxed version of the Vulkan specification's Render Pass Compatibility clause.	2022-04-24 14:31:00 +05:30
PixelyIon	50a8b69f7b	Optimize descriptor set writes using push descriptors We can use inline push descriptors for writing to descriptor rather than allocating a descriptor set for a one time write and freeing it as this is rather inefficient while an inline push descriptor generally ends up being a direct `memcpy` on the driver side designed for this use-case.	2022-04-24 13:45:09 +05:30
PixelyIon	5adafbff04	Set `VkQueue`'s global priority to high We want Skyline to have the most favorable GPU scheduling possible due to low latency and high throughput requirements, we request high priority scheduling due to this reason.	2022-04-24 13:34:09 +05:30
PixelyIon	f9c052d1b7	Implement Maxwell3D Tessellation State This implements all Maxwell3D registers and HLE Vulkan state for Tessellation including invalidation of the TCS (Tessellation Control Shader) state during state changes.	2022-04-24 13:23:00 +05:30
Billy Laws	de796cd2cd	Implement overhead-free sequenced buffer updates with megabuffers Previously constant buffer updates would be handled on the CPU and only the end result would be synced to the GPU before execute. This caused issues as if the constant buffer contents were changed between each draw in a renderpass (e.g. text rendering) the draws themselves would only see the final resulting constant buffer. We had earlier tried to fix this by using vkCmdUpdateBuffer however this caused significant performance loss due to an oversight in Adreno drivers. We could have worked around this simply by using vkCmdCopyBuffer however there would still be a performance loss due to renderpasses being split up with copies in between. To avoid this we introduce 'megabuffers', a brand new technique not done before in any other Switch emulators. Rather than replaying the copies in sequence on the GPU, we take advantage of the fact that buffers are generally small in order to replay buffers on the GPU instead. Each write and subsequent usage of a buffer will cause a copy of the buffer with that write, and all prior writes applied, to be pushed into the megabuffer, this way at the start of execute the megabuffer will hold all used states of the buffer simultaneously. Draws then reference these individual states in sequence to allow everything to work without any copies. In order to support this buffers have been moved to an immediate sync model, with synchronisation being done at usage-time rather than execute (in order to keep contents properly sequenced) and GPU-side writes now need to be explicitly marked (since they prevent megabuffering). It should also be noted that a fallback path using cmdCopyBuffer exists for the cases where buffers are too large or GPU dirty.	2022-04-23 22:48:28 +01:00
lynxnb	0d9992cb8e	Implement `QuadList` support for non-indexed draws	2022-04-20 18:17:10 +02:00
lynxnb	bcaf7dfe1c	Make `GetVertexBuffer` return a pointer to the requested buffer This avoids a redundancy in the `Draw` function and makes code easier to read	2022-04-20 18:16:45 +02:00
Billy Laws	5c3559e888	Revert "Implement support for GPU-side constant buffer updating" This reverts commit `d79635772f`.	2022-04-18 13:28:58 +01:00
Billy Laws	7bf3580031	Revert "Allow external synchronization for buffers" This reverts commit `372ab8befa`.	2022-04-18 13:28:58 +01:00
PixelyIon	ddc9622b90	Fix Shader Module Cache As bindings weren't correctly handled due to the fact that `EmitSPIRV` would change the bindings, the shader module cache would not correctly function and have no cache hits in `find` and rather have them in `try_emplace` which negated any performance benefit of it. This has now been fixed by retaining the initial cache key for insertion into the cache while also storing the post-emit bindings and restoring them during a cache hit.	2022-04-18 12:18:15 +05:30
Billy Laws	32fe01e145	Implement batch constant buffer updates Avoids spamming the driver with hundreds of cbuf updates per frame by batching all consecutive updates into one.	2022-04-17 00:35:00 +01:00
PixelyIon	02f99273ac	Implement Shader Module Cache Implements caching of the compiled shader module (`VkShaderModule`) in an associative map based on the supplied IR, bindings and runtime state to avoid constant recompilation of shaders. This doesn't entirely address shader compilation as an issue since host shader compilation is tied to Vulkan pipeline objects rather than Vulkan shader modules, they need to be cached to prevent costly host shader recompilation.	2022-04-16 18:45:56 +05:30
PixelyIon	76d8172a35	Implement Shader IR Cache This implements the first step of a full shader cache with caching any IR by treating the shared pointer as a handle and key for an associative map alongside hashing the Maxwell shader bytecode, it supports both single shader program and dual vertex program caching.	2022-04-16 18:45:56 +05:30
PixelyIon	0baa90d641	Implement `SpanEqual` and `SpanHash` We desire the ability to hash and check equality of data across spans to use associative containers such as `std::unordered_map` with spans. The implemented functions provide an easy way to do that.	2022-04-16 18:45:56 +05:30
Billy Laws	df5d1256c2	Implement an object backed IStorage backing This is more convenient and efficient to use when passing structured data out of applets	2022-04-16 18:45:56 +05:30
Billy Laws	d115ce3c05	Stub the controller applet Mostly based off of yuzu's implementation, this will need to be extended in the future to open up a UI for configuring controllers according to the applications requirements.	2022-04-16 18:45:56 +05:30
Billy Laws	9a8e39cba1	Slightly refactor controller code in HID Now uses ranges where possible and a function to get the number of connected controllers has been added.	2022-04-16 18:45:56 +05:30
Billy Laws	2873f11baa	Pass shared pointers by value in applet infrastructure This is more optimal than crefs when used together with std::move	2022-04-16 18:45:56 +05:30
PixelyIon	8ccef733ff	Fix UB with guest-less Texture/Buffers in `MarkGpuDirty` As there was no check for the lack of a `GuestTexture`/`GuestBuffer`, it would lead to UB when a texture/buffer that had no guest such as the `zeroTexture` from `GraphicsContext` would be marked as dirty they would cause a call to `NCE::RetrapRegions` with a `nullptr` handle that would be dereferenced and cause a segmentation fault.	2022-04-16 18:45:56 +05:30
PixelyIon	372ab8befa	Allow external synchronization for buffers In certain situations such as constant buffer updates, we desire to use the guest buffer as a shadow buffer forwarding all writes directly to it while we update the host using inline buffer updates so they happen in-sequence. This requires special behavior as we cannot let any synchronization operations take place as they would break the shadow buffer, as a result, an external synchronization flag has been added to prevent this from happening. It should be noted that this flag is not respected for buffer recreation which will lead to UB, this can and will break updates in certain cases and this change isn't complete without buffer manager support.	2022-04-16 18:44:53 +05:30
PixelyIon	c0c4db68a8	Fix `BufferView` offset not being added in `vkCmdUpdateBuffer` The offset of the view wasn't added to the `vkCmdUpdateBuffer`, this would cause the offset to be incorrect given the buffer was a view of a larger buffer that wasn't the start of it. This commit fixes that by adding the offset of the view to the buffer update.	2022-04-14 18:06:15 +05:30
PixelyIon	a1c06e0401	Mark GPU resources as dirty before GPU usage We didn't call `MarkGpuDirty` on textures/buffers prior to GPU usage, this would cause them to not be R/W protected when they should be and provide outdated copies if there were any read accesses from the CPU (which are not possible at the moment since we assume all accesses are writes at the moment). This has now been fixed by calling it after synchronizing the resource.	2022-04-14 17:20:05 +05:30
PixelyIon	41a6afed01	Fix `GraphicsContext` code formatting for auto formatter	2022-04-14 15:27:22 +05:30
PixelyIon	624df92616	Change `AddNonGraphicsPass` to `AddOutsideRpCommand` The terminology "Non-Graphics pass" was deemed to be fairly inaccurate since it simply covered all Vulkan commands (not "passes") outside the render-pass scope, these may be graphical operations such as blits and therefore it is more accurate to use the new terminology of "Outside-RenderPass command" due to the lack of such an implication while being consistent with the Vulkan specification.	2022-04-14 15:20:22 +05:30
Billy Laws	a31332e35f	Align Maxwell 3D macro newline slashes	2022-04-14 14:14:52 +05:30
Billy Laws	d79635772f	Implement support for GPU-side constant buffer updating Previously constant buffer updates would be handled on the CPU and only the end result would be synced to the GPU before execute. This caused issues as if the constant buffer contents was changed between each draw in a renderpass (e.g. text rendering) the draws themselves would only see the final resulting constant buffer. Fix this by updating cbufs on the GPU/CPU separately, only ever syncing them back at the start or after a guest side CPU write, at the moment only a single word is updated at a time however this can be optimised in the future to batch all consecutive updates into one large one.	2022-04-14 14:14:52 +05:30
Robin Kertels	036faedabd	Implement a way to run non-graphics passes with command executor These commands will end the current renderpass and run on their own, this is useful for compute, blits etc.	2022-04-14 14:14:52 +05:30
Billy Laws	feb179fcff	Implement primitive restart support Maxwell3D also supports using an arbitrary restart index value however no games are known to use this so leave it for now.	2022-04-14 14:14:52 +05:30
Billy Laws	3f3acc31d8	Rework swizzle infrastructure to support arbitrary format swizzles This is required to support R4G4B4A4 which has no directly corresponding Vulkan format. Co-authored-by: Lunar-Pixel <lunarn452@gmail.com>	2022-04-14 14:14:52 +05:30
PixelyIon	6f85a66151	Implement host-only `Buffer`s We require certain buffers to only be on the host while being accessible through the same abstractions as a guest buffer as they must be interchangeable in usage.	2022-04-14 14:14:52 +05:30
Billy Laws	2c697ec36a	Determine depth/stencil texture aspect based off of image swizzle Required since we can't have a non-rt image with both a depth/stencil aspect at the same time according to vk spec.	2022-04-14 14:14:52 +05:30
PixelyIon	1878e582ad	Add `ScopedStackBlocker` to `RomFile.populate` We needed to block stack frame lookups past JNI code as Java doesn't follow the ARMv8 frame pointer ABI which leads to invalid pointer dereferences. Any JNI function that throws or handles exceptions must do this now or it may lead to a `SIGSEGV`.	2022-04-14 14:14:52 +05:30
Billy Laws	68e693d9f4	Fix DMA Engine debug logs to not crash emu Address causes some type issues when printing directly so explicitly cast to u64 first to prevent them.	2022-04-14 14:14:52 +05:30
Billy Laws	8eaca87de8	Use an empty host texture in place of invalid TIC entries on guest Some games may pass empty TICs as inputs to shaders while not actually using them within the shader. Create an empty texture and pass this in instead when we hit this case, the nullDescriptor feature could be used but it's not supported by all devices so we chose to do it this way instead.	2022-04-14 14:14:52 +05:30
PixelyIon	41b98c7daa	Add stack tracing to `skyline::exception` Skyline's `exception` class now stores a list of all stack frames during the invocation of the exception. These can later be parsed by the exception handler to generate a human-readable stack trace. To assist with more complete stack traces, `-fno-omit-frame-pointer` is now passed on debug builds which forces the inclusion of frames on function calls.	2022-04-14 14:14:52 +05:30
PixelyIon	cd8fa66326	Fix NCE Destruction NCE is implicitly depended on by the `GPU` class due to the NCE Memory Trapping API so the destruction of it must take place after the destruction of the `GPU` class. Additionally, to prevent bugs the NCE destructor must set `staticNce` to `nullptr` as the signal handler will potentially access a destroyed instance of NCE otherwise.	2022-04-14 14:14:52 +05:30
Billy Laws	815f1f4067	Add support for sRGB TIC textures Without this sRGB textures would be interpreted as RGB leading to colours being slightly off. The sRGB flag isn't stored as part of format word so we reuse the _pad_ field of it to store the flag for the switch case.	2022-04-14 14:14:52 +05:30
Billy Laws	1ba4abf950	Add Astc{6x6,8x8} and R4G4B4A4 image formats	2022-04-14 14:14:52 +05:30
MCredstoner2004	dec0571eee	Infrastructure for applets to be implemented This removes a stub for an applet and implements several applet related service calls.	2022-04-14 14:14:52 +05:30
PixelyIon	164d4852fa	Sleep-loop rather than abort during termination We don't want to actually exit the process as it'll automatically be restarted gracefully due to a timeout after being unable to exit within a fixed duration so we just want to infinite sleep during termination. This should fix issues where exiting any game would cause the app to force close after some time as exception signal handling would fail in the background, the app should stay open now and automatically restart itself when another game is loaded in.	2022-04-14 14:14:52 +05:30
PixelyIon	ea00f1bb82	Flush emulation logs after exceptions A lot of logs are incomplete due to being unable to flush inside the signal handler, now we flush after any exceptions so that there is a guarantee of any exceptions being logged as this is crucial for proper debugging.	2022-04-14 14:14:52 +05:30
PixelyIon	62ba180550	Use R5G6B5 as Vulkan swapchain format rather than B5G6R5 B5G6R5 isn't generally supported by the swapchain and the format is used for R5G6B5 with swapped R/B channels to avoid aliasing so we reverse that by using R5G6B5 as the underlying Vulkan format for the swapchain which should be automatically handled by the driver for any copies from B5G6R5 textures and the data representation should be the same as B5G6R5 with swapped R/B channels so not reporting the correct texture::Format should be fine.	2022-04-14 14:14:52 +05:30
MK73DS	e54f86e923	Fix IApplicationFunctions::GetDisplayVersion id (https://switchbrew.org/wiki/Applet_Manager_services#IApplicationFunctions)	2022-04-14 14:14:52 +05:30
Billy Laws	77cf33b643	Trigger command executor before DMA copies DMA copies can use textures currently in active use on the GPU as dst/src so Execute before to prevent a deadlock	2022-04-14 14:14:52 +05:30
Billy Laws	dbbc5704d2	Implement DMA engine Block Linear->Linear copies	2022-04-14 14:14:52 +05:30
Billy Laws	3e4e8de1d2	Implement primitive Linear->Block Linear DMA engine copies Slightly inaccurate and misses some features but good enough for most games, should be revisited later.	2022-04-14 14:14:52 +05:30
Billy Laws	3c26921d54	Implement the Maxwell DMA engine The DMA engine is used to perform DMA buffer/texture copies directly on the GPU. It can deswizzle arbitrary regions of input textures, perform component remapping and swizzle into output textures. This impl only supports 1D buffer copies, 2D ones will come later.	2022-04-14 14:14:52 +05:30
Billy Laws	3df76e84c3	Stub IRequest::GetAppletInfo in nifm	2022-04-14 14:14:52 +05:30
Billy Laws	6c5f9941ad	Stub additional IAddOnContentManager functions Used mainly by UE4 games	2022-04-14 14:14:52 +05:30
Billy Laws	486a835d0a	Use guest texture view type to determine the underlying image type If we have a Nx1x1 image then determining the type from dimensions will result in a 1D image being created thus preventing us from creating a 2D view. By using the image view type we can avoid this for textures from TICs since we know in advance how they will be used	2022-04-14 14:14:52 +05:30
Billy Laws	05966f34e5	Stub a pair of ISelfController functions Both used by SMO, SetScreenShotPermission and SetAlbumImageOrientation	2022-04-14 14:14:52 +05:30
Billy Laws	fe37d7c9be	Implement ICommonStateGetter::SetRequestExitToLibraryAppletAtExecuteNextProgramEnabled	2022-04-14 14:14:52 +05:30
Billy Laws	9813f9f8dc	Implement ICommonStateGetter::GetDefaultDisplayResolutionChangeEvent	2022-04-14 14:14:52 +05:30
Billy Laws	7e7c0252ca	Implement IApplicationFunctions::GetDisplayVersion	2022-04-14 14:14:52 +05:30
Billy Laws	b1f10865a0	Attach depth RT to command executor before draws This enforces that the depth RT outlives the draw, without this the depth RT could be freed while in active use by command executor leading to UAFs and crashes.	2022-04-14 14:14:52 +05:30
Billy Laws	0182fabc50	Stub {Set,Get}NpadHandheldActivationMode in HID	2022-04-14 14:14:52 +05:30
Billy Laws	2e197cead5	Support D32S8_Float_Uint_Unorm_Unorm depth/stencil format	2022-04-14 14:14:52 +05:30
Billy Laws	7717a86fb1	Implement VMM region->region copies Required by the DMA engine, a simple memcpy doesn't work since the buffers could span multiple blocks.	2022-04-14 14:14:52 +05:30
Billy Laws	af90d4f977	Implement audren Surround->Stereo downmixing	2022-04-14 14:14:52 +05:30

942 Commits