skyline

mirror of https://github.com/skyline-emu/skyline.git synced 2024-11-18 04:19:18 +01:00

Author	SHA1	Message	Date
Billy Laws	622ff2a8f1	Correctly track 5.1 audio channel sample count Size needs to be adjusted for 5.1 buffers since they're downsampled to stereo.	2022-05-10 18:26:20 +01:00
PixelyIon	56c9b03843	Fix incorrect swizzling Y extent calculation This calculation for the amount of lines on the Y axis relative to the start of the last block was wrong and would instead determine the amount of lines to the last Y-axis GOB which wasn't accurate when padding was considered, this resulted in titles like Celeste having broken texture decoding (on a 1922x1082 texture) for the last ROB as most pixels would be masked out.	2022-05-09 20:25:43 +05:30
Billy Laws	018df355f0	Replace some VFS exceptions with warnings These errors aren't necessarily fatal so tone them down.	2022-05-08 19:37:10 +01:00
Billy Laws	e1c13bbc08	Update hades	2022-05-08 19:37:10 +01:00
PixelyIon	b307fca115	Fix attachment reuse within the same subpass Certain titles such as BOTW trigger behavior to reuse an attachment within the same subpass, this caused an exception inside `RenderPassNode::AddAttachment` as it cannot find corresponding subpass for attachment. To fix this issue, we now assume that when it cannot find a subpass for an existing attachment, it is attached to the latest subpass and return the attachment.	2022-05-08 18:26:40 +05:30
PixelyIon	e027555796	Handle Y-axis GOB non-alignment for swizzling Certain textures may be unaligned with a GOB's height of 8 lines, we already handle the case of being unaligned with a GOB's width of 64-bytes. This case occurs on titles such as SMO when going in-game.	2022-05-07 18:37:22 +05:30
PixelyIon	c910e29168	Extend `HostSignalHandler`'s `SIGSEGV` debugger path The function now returns from a segmentation fault when a debugger is present, this allows the entire context to be intact which can allow the debugger to correctly pick up variables from all stack frames while it could not extrapolate most variables when trapped inside the signal handler without the values of all registers.	2022-05-07 18:37:22 +05:30
Billy Laws	4149ab1067	Implement Maxwell 3D instanced draw support In the Maxwell 3D engine, instanced draws are implemented by repeating the exact same draw in sequence with a special flag set in vertexBeginGl. This flag allows either incrementing the instance counter or resetting it, since we need to supply an instance count to the host API we defer all draws until state changes occur. If there are no state changes between draws we can skip them and count the occurrences to get the number of instances to draw.	2022-05-07 13:56:09 +01:00
Billy Laws	03594a081c	Ensure correct flushing for batched constant buffer updates Cbufs could be read by non-maxwell3D engines so force a flush when switching to them or before Execute.	2022-05-07 13:56:09 +01:00
PixelyIon	ad989750fc	Implement Maxwell3D Point Sprite Size Implements register state that corresponds to the size of a single point sprite in Maxwell 3D, this is emitted by the shader compiler in the preamble but needs to be only applied if the input topology is a point primitive and it is invalid to set the point size in any other case.	2022-05-07 03:46:25 +05:30
PixelyIon	874a6a2a6c	Fix `getTextureType` enum conversion formatting	2022-05-07 03:46:25 +05:30
PixelyIon	ae5bcbdb5c	Fix Depth RT lock to be in scope Earlier texture locking design required the lock to be retained but since the introduction of `AttachTexture`, this no longer needs to be done. This being done caused deadlocks when the depth texture is sampled by the fragment shader while being bound as an RT since it would attempt to lock the texture again.	2022-05-07 02:37:48 +05:30
shutterbug2000	1c8d994161	Basic `bcat:u` implementation A basic `bcat:u` implementation to prevent titles such as "Kirby and the Forgotten Land" dependent on BCAT support from crashing due to the lack of an implementation.	2022-05-06 15:41:48 +05:30
PixelyIon	4fd64a53e0	Require Vulkan `samplerAnisotropy` feature This is a widely supported feature that games may require conditionally but due to it being supported on effectively all target devices, it was made mandatory. This is used by titles such as ARMS.	2022-05-06 15:41:48 +05:30
PixelyIon	1d9b4a865a	Add additional formats to Adreno filter `VK_FORMAT_R32G32B32A32_SFLOAT` and `D32_SFLOAT` have their capabilities misreported as well, this spams the logs in titles such as ARMS.	2022-05-06 15:41:48 +05:30
PixelyIon	b87295374e	Improve Controller Applet log Improves the readability of the log and replaces the previously uninformative prefix of `operator()` due to being in a lambda with `Controller support`.	2022-05-06 15:41:48 +05:30
PixelyIon	98c730a644	Implement linked TIC/TSC handle in Maxwell3D Maxwell3D has a register for linking the TIC/TSC index in bindless texture handles, this is used by games to implement bindless combined texture-sampler handles.	2022-05-06 14:58:20 +05:30
PixelyIon	23a091100d	Implement `ReadCbufValue` + `ReadTextureType` Implements `GraphicsEnvironment::ReadCbufValue` & `GraphicsEnvironment::ReadTextureType` with a framework of heterogeneous lookups for caching and callbacks for querying constant buffer or TIC values with validation checks for successive draws to ensure unique IR is generated.	2022-05-06 14:39:36 +05:30
PixelyIon	765c3f4e1f	Allow draws with no descriptor set resources The `descriptorSetWrites` being filled is now optional and the case of it being empty is handled correctly, this is done by certain titles such as ARMS and is entirely valid behavior. It should be noted that not doing this leads to errors in the guest due to invalid GPU state while working on the host GPU.	2022-05-06 10:33:47 +05:30
PixelyIon	37327f1955	Fix and refactor SVC `SignalToAddress`/`WaitForAddress` SVC `SignalToAddress` had a bug with the behavior of `SignalAndModifyBasedOnWaitingThreadCountIfEqual` which was entirely incorrect and led to deadlocks in titles such as ARMS that were dependent on it. This commit corrects the behavior and refactors both SVCs and moves their arbitration/waiting to inside the corresponding `KProcess` function rather than the SVC to avoid redundancies and improve code readability.	2022-05-05 19:15:37 +05:30
PixelyIon	396979e897	Extend Adreno format-based filtering for Validation Layer Filtering of validation logs is now extended beyond BCn formats and now covers other formats which have their feature set misreported by the driver, this significantly drives down the amount of logs depending on the title.	2022-05-05 19:15:37 +05:30
PixelyIon	62ea2a6da5	Avoid format aliasing warnings on Adreno Implements an algorithm to determine formats that can be aliased as views without needing `VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT`, this avoids spamming warning logs on view creation when the aliased formats will function in practice.	2022-05-05 19:15:37 +05:30
PixelyIon	7206ab4c67	Fix `exclusiveSubpass` by finishing render pass at end There was an oversight with exclusive subpasses which could lead to RPs with more than one subpass being created even though one pass was exclusive, this oversight was not finishing the render pass at the end of `AddSubpass`. This could lead to a future subpass adding to the end of that RP even though it was intended to exclusively have a single subpass. This case occurs in titles such as Celeste (in-game) and breaks rendering on GPUs that may require exclusive subpasses for proper functionality.	2022-05-05 11:14:38 +05:30
PixelyIon	96fe5f0a0e	Set initial `subpassCount` value to 1 rather than 0	2022-05-05 11:07:43 +05:30
PixelyIon	5d08d6e06f	Disable unnecessary Khronos Validation Layer logs The Khronos Validation Layer can often generate warning/error logs due to our intentional breakage from Vulkan specification, these can occur several times a frame resulting in the logs being spammed and making it difficult to extract useful information out of logs. The scope of these logs has now been reduced with more general filtering and the introduction of specialized filtering to handle complex cases such as BCn hacks with `libadrenotools` on Adreno devices.	2022-05-04 13:20:59 +05:30
PixelyIon	23c9388caf	Fix `VK_KHR_push_descriptor`-less path for descriptor set updates Descriptor set updates were broken on the non-push-descriptor path due to lifetime issues with VkDescriptorSetLayout's usage during the execution phase which entirely broke rendering on AMD/Mali GPUs due to them not supporting `VK_KHR_push_descriptor`. This commit addresses that by moving the allocation of a descriptor set to outside the lambda and into the recording phase, it also simplifies the semantics and resources passed into the lambda by removing redundancies.	2022-05-04 00:49:21 +05:30
PixelyIon	47bc3b4d99	Fix Render Pass Cache The Vulkan render pass cache was fundamentally broken since it was designed around the Render Pass Compatibility clause due to being designed for framebuffer compatibility initially. As this scope was extended to a general render pass cache, the amount of data in the key was not extended to include everything it should have. This commit introduces the missing pieces in the RP cache and simplifies the underlying code in the process.	2022-05-01 20:31:36 +05:30
PixelyIon	25a29f9044	Skip zero-initializing shader bytecode backing The backing for shader data would implicitly be zero-initialized due to a `resize` on every shader parse, this was entirely unnecessary as we would overwrite the entire range regardless. We avoid this by using statically allocated storage and a span over it containing the shader bytecode which avoids any unnecessary clear semantics without resorting to more complex solutions such as a custom allocator.	2022-05-01 18:27:27 +05:30
PixelyIon	42573170c6	Implement Framebuffer Cache Implements a cache for storing `VkFramebuffer` objects with a special path on devices with `VK_KHR_imageless_framebuffer` to allow for more cache hits due to an abstract image rather than a specific one. Caching framebuffers is a fairly crucial optimization due to the cost of creating framebuffers on TBDRs since it involves calculating tiling memory allocations and in the case of Adreno's proprietary driver involves several kernel calls for mapping and allocating the corresponding framebuffer memory.	2022-05-01 18:27:27 +05:30
PixelyIon	af7f0c301e	Avoid redundant `VkImageView` recreation There are a lot of cases of `VkImageView` being recreated arbitrarily due to it being tied to the ephemeral object `TextureView` rather than `Texture`, this commit flips that by storing all `VkImageView`s inside `Texture` with `TextureView` simply holding a copy of the handle to them. Additionally, this change results in stable `VkImageView` handles and helps in paving the path for framebuffer caching when `VK_KHR_imageless_framebuffer` is unavailable.	2022-05-01 18:27:27 +05:30
PixelyIon	41b2c2dc7b	Add `profileable` attribute to `AndroidManifest.xml` As we desire more accurate profiling data in certain circumstances, making the app explicitly profilable will allow for this, it will also remove the (annoying) prompt to do this in the Android Studio profiler.	2022-05-01 18:27:27 +05:30
PixelyIon	da931cf07b	Implement Render Pass Cache Implements a cache for storing `VkRenderPass` objects which are often reused, they are not extremely expensive to create generally but this is a required step to build up to a framebuffer cache which is an extremely expensive object to create on TBDRs generally since it involves calculating tiling memory allocations and in the case of Adreno's proprietary driver involves several kernel calls for mapping and allocating the corresponding memory.	2022-05-01 18:16:53 +05:30
Billy Laws	ae77bde171	Fixup audio device name writing in services Games expect the output buffer to be entirely zero filled past the device name.	2022-04-30 16:00:33 +01:00
Billy Laws	194cbe6c7c	Stub several HID functions	2022-04-30 16:00:33 +01:00
Billy Laws	112c20cef2	Stub QueryAudioDevice{Input,Output}Event Used in many 3.0.0+ games	2022-04-30 16:00:33 +01:00
Billy Laws	8d7dbe2c4e	Add a way to get a readonly span of Buffer contents Avoids the need to redundantly copy data when it is being directly processed on the CPU (e.g. quad conversion)	2022-04-30 16:00:33 +01:00
MK73DS	4c71ef5c31	Fix American English language code	2022-04-30 18:43:22 +05:30
PixelyIon	4ec0f62e30	Update Kotlin, AGP, Gradle and Build Tools Kotlin was updated to 1.6.21, AGP to 7.1.3, Gradle to 7.4.2 and Build Tools to 33.0.0-rc3.	2022-04-27 14:00:36 +05:30
PixelyIon	90c635bf78	Coalesce subpasses with compatible attachments together We run into a lot of successive subpasses with the exact same framebuffer configuration which we now exploit to avoid the creation of a new subpass due to the overhead involved with this. This provides significant performance boosts in certain cases due to the magnitude of difference in the amount of subpasses being created while providing next to no benefit in other cases.	2022-04-27 13:22:34 +05:30
PixelyIon	a947933bf0	Fix `Buffer` cycle check being inverted The check for the fence cycle being the same as the current cycle was incorrectly inverted to be the opposite of what it should have been, leading to bugs.	2022-04-27 13:07:36 +05:30
PixelyIon	54794f4b71	Move `Texture` locking and synchronization to `PresentationEngine` The responsibility for synchronizing a texture and locking it is now on the `PresentationEngine` rather than the API-user as this'll allow more fine grained locking and delay waiting until necessary.	2022-04-25 21:01:16 +05:30
Billy Laws	1dd230afde	Refactor all std::lock_guard usages to std::scoped_lock	2022-04-25 15:00:30 +01:00
PixelyIon	94e6f3cfa0	Add quirk for relaxed render pass compatibility As we require a relaxed version of the Vulkan render pass compatibility clause for caching multi-subpass render passes, we now utilize a quirk to determine if this is supported. It is supported on Nvidia/Adreno, while on AMD/Mali, where it isn't supported, we force single-subpass render passes.	2022-04-24 16:18:36 +05:30
PixelyIon	44615c8dd2	Implement per-vendor `VkQueue` maximum global priority We found out that certain vendors such as Nvidia had a limitation on the global priority of a queue and requesting `VK_QUEUE_GLOBAL_PRIORITY_HIGH_EXT` would result in `VK_ERROR_NOT_PERMITTED_EXT`. A quirk has been introduced to supply the maximum supported global priority which is currently set on a per-vendor basis to avoid future crashes.	2022-04-24 16:15:01 +05:30
PixelyIon	7ef4959060	Implement Graphics Pipeline Cache Implements a cache for storing `VkPipeline` objects which are fairly expensive to create and doing so on a per-frame basis was rather wasteful and consumed a significant part of frametime. It should be noted that this is not compliant with the Vulkan specification and will break unless the driver supports a relaxed version of the Vulkan specification's Render Pass Compatibility clause.	2022-04-24 14:31:00 +05:30
PixelyIon	50a8b69f7b	Optimize descriptor set writes using push descriptors We can use inline push descriptors for writing to descriptor rather than allocating a descriptor set for a one time write and freeing it as this is rather inefficient while an inline push descriptor generally ends up being a direct `memcpy` on the driver side designed for this use-case.	2022-04-24 13:45:09 +05:30
PixelyIon	5adafbff04	Set `VkQueue`'s global priority to high We want Skyline to have the most favorable GPU scheduling possible due to low latency and high throughput requirements, we request high priority scheduling due to this reason.	2022-04-24 13:34:09 +05:30
PixelyIon	f9c052d1b7	Implement Maxwell3D Tessellation State This implements all Maxwell3D registers and HLE Vulkan state for Tessellation including invalidation of the TCS (Tessellation Control Shader) state during state changes.	2022-04-24 13:23:00 +05:30
Billy Laws	de796cd2cd	Implement overhead-free sequenced buffer updates with megabuffers Previously constant buffer updates would be handled on the CPU and only the end result would be synced to the GPU before execute. This caused issues as if the constant buffer contents was changed between each draw in a renderpass (e.g. text rendering) the draws themselves would only see the final resulting constant buffer. We had earlier tried to fix this by using vkCmdUpdateBuffer however this caused significant performance loss due to an oversight in Adreno drivers. We could have worked around this simply by using vkCmdCopyBuffer however there would still be a performance loss due to renderpasses being split up with copies in between. To avoid this we introduce 'megabuffers', a brand new technique not done before in any other Switch emulators. Rather than replaying the copies in sequence on the GPU, we take advantage of the fact that buffers are generally small in order to replay buffers on the GPU instead. Each write and subsequent usage of a buffer will cause a copy of the buffer with that write, and all prior applied to be pushed into the megabuffer, this way at the start of execute the megabuffer will hold all used states of the buffer simultaneously. Draws then reference these individual states in sequence to allow everything to work without any copies. In order to support this buffers have been moved to an immediate sync model, with synchronisation being done at usage-time rather than execute (in order to keep contents properly sequenced) and GPU-side writes now need to be explicitly marked (since they prevent megabuffering). It should also be noted that a fallback path using cmdCopyBuffer exists for the cases where buffers are too large or GPU dirty.	2022-04-23 22:48:28 +01:00
lynxnb	0d9992cb8e	Implement `QuadList` support for non-indexed draws	2022-04-20 18:17:10 +02:00

951 Commits