skyline

mirror of https://github.com/skyline-emu/skyline.git synced 2024-06-26 15:26:05 +02:00

Author	SHA1	Message	Date
PixelyIon	e294fa8c91	Add subpass limit quirk to fix Adreno driver bug Older Adreno proprietary drivers (5xx and below) will segfault while destroying the renderpass and associated objects if more than 64 subpasses are within a renderpass due to internal driver implementation details. This commit introduces checks to automatically break up a renderpass when that limit is hit.	2022-04-14 14:14:52 +05:30
PixelyIon	cb1ec9a7f4	Rework `BufferManager`, `Buffer` and `BufferView` This commit encapsulates a complex sequence of cascading changes in the process of supporting overlaps for buffers: * We determined that it is impossible to resolve overlaps with multiple intervals per buffer within the constraints of each overlap being a contiguous view, so support for multiple intervals was dropped. The older buffer manager code was entirely reworked to be simpler since it only handles one interval per buffer, with the code now being based on `IntervalMap` but tailored specifically for buffers. * During overlap resolution, we faced the problem of how existing views into the buffer being recreated would be updated: the buffer had to be replaced with a larger buffer that could contain all overlaps, and all existing views would need to be repointed to it. This was addressed by a buffer owning all views to itself, so we could automatically recalculate the offset of all views and update the buffers with it. * We still needed to update usage of existing views, which was done by handling all access (such as inside a recorded draw) to buffer view properties via `BufferView::RegisterUsage`, which dispatches a callback with the view and the corresponding backing buffer. This callback can be stored and called during overlap resolution with the new buffer. * We had issues with the lifetime of the buffer given the handle-like semantics of `BufferView` introduced in the last buffer-related commit: if we updated the view to be owned by a new buffer, we'd need to extend the lifetime of the new buffer rather than the older one, and the only way to do this was a proxy owner object, `BufferDelegate`, which holds a shared pointer to the real `Buffer`, which in turn holds a pointer to all `BufferDelegate` objects to update on repointing. A `BufferView` is effectively just a wrapper around `std::shared_ptr<BufferDelegate>` with more favorable semantics but generally just forwarding calls.
It should be additionally noted that to support usage of `RegisterUsage` the code around buffers in `GraphicsContext` was refactored to defer truly binding till the recording phase.	2022-04-14 14:14:52 +05:30
PixelyIon	a6781b38f4	Clear `syncBuffers` after `CommandExecutor` execution Due to an oversight, we weren't clearing the list of buffers that needed to be synced after every execution, which led to them building up. Because buffer synchronization is relatively cheap and only happens on faults, this wasn't caught until now, but it does depress the framerate significantly over time as the list grows into the range of 100k buffer views depending on the title.	2022-04-14 14:14:52 +05:30
Robin Kertels	594f061b21	Implement SSBOs Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-04-14 14:14:52 +05:30
Billy Laws	5c387f5c5a	Fixup depth mode init value to allow ignoring redundant calls	2022-04-14 14:14:52 +05:30
PixelyIon	7a5c771f44	Rework GPU BufferView to have handle-like semantics We wanted views to extend the lifetime of the underlying buffers and at the same time preserve all views until the destruction of the buffer to prevent recreation which might be costly in the future when we need `VkBufferView`s of the buffer but also require a centralized list of all views for recreation of the buffer. It also removes the inconsistency between `BufferView*` being returned in `GetXView` in `GraphicsContext`.	2022-04-14 14:14:52 +05:30
Billy Laws	fc2c123ae2	Implement GPU depthMode register This controls the depth range used by the shader, hades already has support for the necessary patching so we only need to pass the current mode over to it and it'll do the necessary work.	2022-04-14 14:14:52 +05:30
Billy Laws	7e088ca465	Fix constbuf updates to actually increment the write offset Uses the register directly now as when we modify it we want the changes to be visible from macros too.	2022-04-14 14:14:52 +05:30
PixelyIon	730bf504f8	Correct Adreno texture binding quirk We incorrectly determined an Adreno driver bug to require padding between binding slots but the real issue was not supporting consecutive binding writes for `VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER` and was fixed by the padding slot unintentionally requiring individual writes. The quirk has now been corrected to explicitly specify this as the bug and the solution is more apt.	2022-04-14 14:14:52 +05:30
PixelyIon	881bb969c4	Implement access-driven Buffer synchronization Similar to constant redundant synchronization for textures, there is a lot of redundant synchronization of buffers. Although buffer synchronization is far cheaper than texture synchronization, it still has associated costs, which have now been reduced by only synchronizing on access.	2022-04-14 14:14:52 +05:30
PixelyIon	3268b3779a	Implement access-driven Texture synchronization There was a lot of redundant synchronization of textures to and from host constantly as we were not aware of guest memory access, this has now been averted by tracking any memory accesses to the texture memory using the NCE Memory Trapping API and synchronizing only when required.	2022-04-14 14:14:52 +05:30
PixelyIon	5c9e42e384	Use mirror mappings for Textures and Buffers This is a prerequisite to memory trapping as we need to write to the mirror to avoid a race condition with external threads writing to a texture/buffer while we do so ourselves for the sync on a read/write, it also avoids an additional `mprotect` to `-WX`/`RWX` on a read access. An additional advantage for textures especially is that we now support split-mapping textures due to laying them out in a contiguous mirror and they will not require costly algorithmic changes. Buffers should also benefit from not needing to iterate over every region when they are split into multiple mappings.	2022-04-14 14:14:52 +05:30
Billy Laws	011de98940	Rework formats to support passing through guest swizzle values Almost every Maxwell format now directly corresponds to a Vulkan format. This allows formats to be passed through and the swizzle used directly from guest (with some extra swizzle handling for edge cases) thus saving the need to explicitly support each swizzle combination, which adds a lot of code bloat. The format header is additionally reordered with line breaks to separate formats by their bits-per-block.	2022-04-14 14:14:52 +05:30
PixelyIon	727f83e969	Fix Incorrect Vertex Binding Divisor State Submission We always submitted pipeline divisor descriptions regardless of the binding input rate being vertex rather than instance. This is invalid behavior and has been fixed by only submitting binding descriptors when the input rate is per-instance.	2022-04-14 14:14:52 +05:30
PixelyIon	9f7e80cf8f	Fix Adreno Texture Sampler Binding Bug Adreno proprietary drivers suffer from a bug where `VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER` requires 2 descriptor slots rather than one, we add a padding slot to fix this issue. `QuirkManager` was introduced to handle per-vendor/per-device errata and allow enabling this on Adreno proprietary drivers specifically as to not affect the performance of other devices.	2022-04-14 14:14:52 +05:30
PixelyIon	ddb2ba8a1b	Rename `QuirkManager` to `TraitManager` Quirk terminology was deemed to be inappropriate for describing the features/extensions of a device. It has been replaced with traits which is far more fitting but quirks will be used as a terminology for errata in devices.	2022-04-14 14:14:52 +05:30
PixelyIon	0b2ce6a8f3	Fix Texture Handle Offset Calculation The texture handle offset calculation involved an incorrect shift by descriptor size which was found to be unnecessary and would result in an invalid handle that had the wrong TIC/TSC index and caused broken rendering.	2022-04-14 14:14:52 +05:30
PixelyIon	aa57ec6d55	Destroy `CommandExecutor` Nodes Before Waiting on Execution `nodes` and `syncTextures` were cleared after waiting on the `CommandExecutor` fence rather than before, this wasted execution time after the wait for something that could be performed prior to the wait.	2022-04-14 14:14:52 +05:30
PixelyIon	90a1b3348c	Implement D24S8 + R11G11B10 Formats	2022-04-14 14:14:52 +05:30
PixelyIon	22ce531e6f	Force Memory Barrier at `VkRenderPass` Start We depend on past commands to have completed execution in a renderpass, a subpass dependency on all graphics stages from `VK_SUBPASS_EXTERNAL` to subpass #0 is used to enforce this. Nvidia and Adreno proprietary drivers implicitly do this but Turnip or Mali drivers require this or they execute out of order.	2022-04-14 14:14:52 +05:30
PixelyIon	043be4d8f7	Implement Maxwell3D Two-Side Stencil Toggle Stencil operations are configurable to be the same for both sides or have independent stencil state for both sides. It is controlled via the previously unimplemented `stencilTwoSideEnable`.	2022-04-14 14:14:52 +05:30
PixelyIon	80ae7b255a	Implement Maxwell3D Front Face Flip	2022-04-14 14:14:52 +05:30
PixelyIon	40a3887695	Implement Maxwell3D Viewport Y Swizzle & Lower-Left Origin	2022-04-14 14:14:52 +05:30
Billy Laws	3be30e68c3	Add D16 depth format and ZF32 TIC format Used by One Piece Unlimited World Red	2022-04-14 14:14:52 +05:30
Billy Laws	6e48460c0d	Add BC2/3 format support	2022-04-14 14:14:52 +05:30
PixelyIon	41aad83c33	Tie Shader `ObjectPool` Lifetime to Shader `Program` Shader programs allocate instructions and blocks within an `ObjectPool`, there was a global pool prior that was never reaped aside from on destruction. This led to a leak where the pool would contain resources from shader programs that had been deleted, to avert this the pools are now tied to shader programs.	2022-04-14 14:14:52 +05:30
PixelyIon	e747de37cf	Implement Blocklinear TIC Type	2022-04-14 14:14:52 +05:30
PixelyIon	723189a948	Calculate Blocklinear Texture Aligned Size Correctly The size of blocklinear textures did not consider alignment to Block/ROB boundaries before, it is aligned to them now. Incorrect sizes led to textures not being aliased correctly due to different size calculations for GraphicBufferProducer surfaces and Maxwell3D color RTs.	2022-04-14 14:14:52 +05:30
Billy Laws	e7bfd93541	Implement BC7 format support Used by ARMS	2022-04-14 14:14:52 +05:30
Billy Laws	99652c5eda	Support partially mapped cbufs Buggy games sometimes supply an incorrect cbuf size so limit buffers to the first unmapped region.	2022-04-14 14:14:52 +05:30
PixelyIon	6a6f51ea84	Implement Maxwell3D Depth/Stencil State Implements the entirety of Maxwell3D Depth/Stencil state for both faces including compare/write masks and reference value. Maxwell3D register `stencilTwoSideEnable` is ignored as its behavior is unknown: it could mean the same behavior for both stencils or the back-facing stencil being disabled. As a result, it is unimplemented.	2022-04-14 14:14:52 +05:30
Billy Laws	ab4962c4e4	Implement additional texture formats, including BCn BCeNabler is required for BCn textures; the pre-swizzled formats will be removed when arbitrary swizzle support is added later.	2022-04-14 14:14:52 +05:30
Billy Laws	600b94505c	Fix A2R10G10B10 render target format This was wrongly described as R10G10B10A2 in the enum when it's actually A2R10G10B10, a format natively supported in Vulkan with just a swizzle.	2022-04-14 14:14:52 +05:30
PixelyIon	edd51c3dfa	Fix Color RT Disabling Bug Color RTs are disabled by setting their format as `None`, it was removed while transitioning to macros and resulted in a missing format exception. It has been readded as several applications depend on this behavior.	2022-04-14 14:14:52 +05:30
PixelyIon	a2285669b3	Use static vector for shader bytecode to prevent constant reallocation Using `std::vector` for shader bytecode led to a lot of reallocation due to constant resizing, switching over the static vector allows for a single static allocation of the maximum possible guest shader size (1 MiB) to be done for every stage resulting in a 6 MiB preallocation which is unnoticeable given the total memory overhead of running a Switch application.	2022-04-14 14:14:52 +05:30
PixelyIon	21a6866def	Fix Maxwell3D Blend Enum Conversion Bugs The `OneMinusSourceAlpha` blending factor was converted to `eOneMinusSrcColor` rather than `eOneMinusSrcAlpha`, leading to incorrect blending behavior in certain titles. A similar issue existed with the order of `MinimumGL`/`MaximumGL` and `SubtractGL`/`ReverseSubtractGL` being the opposite of what it should've been. Both of these issues have been fixed.	2022-04-14 14:14:52 +05:30
PixelyIon	0a506088f4	Fix `NextSubpassNode` Subpass Index Bug `NextSubpassNode` didn't increment `subpassIndex` which runs commands with the wrong subpass index resulting in them accessing invalid attachments or other bugs that may arise from using the wrong subpass.	2022-04-14 14:14:52 +05:30
PixelyIon	defbfe8f78	Serialize Maxwell3D Draw State for Subpass All Maxwell3D state was passed by reference to the draw command lambda, this would break if there was more than one pass or the state was changed in any way before execution. All state has now been serialized by value into the draw command lambda capture, retaining state regardless of mutations of the class state.	2022-04-14 14:14:52 +05:30
PixelyIon	934130b3e6	Remove Implicit Command Executor Resource Attachment Any usage of a resource in a command now requires attaching that resource externally and will not be implicitly attached on usage, this makes attaching of resources consistent and allows for more lax locking requirements on resources as they can be locked while attaching and don't need to be for any commands, it also avoids redundantly attaching a resource in certain cases.	2022-04-14 14:14:52 +05:30
Billy Laws	3ff8075151	Move vertex and RT format conv to macros and fill them fully in Makes the format conversions easier to read and shorter, and adds in some new formats needed to complete the RT table properly.	2022-04-14 14:14:52 +05:30
Billy Laws	68f31c3688	Use macros for defining texture formats and their conversions Avoids the need to repeat all the possible component types for each texture format while also making them simpler to add and easier to read.	2022-04-14 14:14:52 +05:30
PixelyIon	bc29b23972	Implement CPU-only Maxwell3D Inline Constant Buffer Updates Implements inline constant buffer updates that are written to the CPU copy of the buffer rather than generating an actual inline buffer write, this works for TIC/TSC index updates but won't work when the buffer is expected to actually be updated inline with regard to sequence rather than just as a buffer upload prior to rendering. GPU-sided constant buffer updates will be implemented later with optimizations for updating an entire range by handling GPFIFO `Inc`/`NonInc` directly and submitting it as a host inline buffer update.	2022-04-14 14:14:52 +05:30
PixelyIon	bb14af4f7a	Implement Maxwell3D Sampled Textures The descriptor sets should now contain a combined image and sampler handle for any sampled textures in the guest shader from the supplied offset into the texture constant buffer. Note: Games tend to rely on inline constant buffer updates for writing the texture constant buffer and due to it not being implemented, the value will be read as 0 which is incorrect.	2022-04-14 14:14:52 +05:30
PixelyIon	d9a9e52350	Use `ConstantBuffer` instead of `BufferView` for Shader Constant Buffers We want read semantics inside the constant buffer object via the mappings to avoid a pointless GPU VMM mapping lookup. It is a fairly frequent operation so this is necessary, the ability to write directly will be added in the future as well.	2022-04-14 14:14:52 +05:30
PixelyIon	adb0a16873	Implement Maxwell 3D Textures Implements parsing for the Maxwell 3D TIC pool and conversion of a TIC into a `GuestTexture`, support is limited to pitch-linear RGB565/A8R8G8B8 textures at the moment but will be extended as games utilize more formats and layouts. Support for 1D buffers is also omitted at the moment since they need special handling with them effectively being treated as buffers in Vulkan rather than images.	2022-04-14 14:14:52 +05:30
PixelyIon	a9aa16798f	Add `-fsigned-bitfields` for defined bitfield `int` behavior We want consistent behavior between signed `int`s in bitfields and outside of bitfields, the `-fsigned-bitfields` flag enforces this behavior.	2022-04-14 14:14:52 +05:30
PixelyIon	87c8dc94d2	Implement Maxwell3D Samplers Maxwell3D `TextureSamplerControl` (TSC) are fully converted into Vulkan samplers with extension backing for all aspects that require them (border color/reduction mode) and approximations where Vulkan doesn't support certain functionality (sampler address mode) alongside cases where extensions may not be present (border color).	2022-04-14 14:14:52 +05:30
PixelyIon	e48a7d7009	Fix Mapping Caching For Maxwell 3D Buffers Code involving caching of mappings was copied from `RenderTarget` without much consideration for applicability in buffers, the reason for caching mappings in RTs was that the view may be invalidated by more than the IOVA/Size being changed but this doesn't hold true for buffers generally so invalidation can only be on the view level with the mappings being looked up every time since the invalidation would likely change them.	2022-04-14 14:14:52 +05:30
PixelyIon	c11962e8e4	Implement Maxwell3D Bindless Texture Constant Buffer Index The index of the constant buffer with bindless texture descriptors is now retrieved from Maxwell3D register state and passed to the shader compiler.	2022-04-14 14:14:52 +05:30
PixelyIon	1c3f62b7b4	Implement Maxwell3D Indexed Drawing	2022-04-14 14:14:52 +05:30
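
The subpass limit quirk described in e294fa8c91 (breaking up a renderpass once a driver-specific subpass limit is hit) could be sketched roughly as below. This is only an illustration under stated assumptions and not the code from that commit: `Quirks`, `maxSubpassCount` and `RenderPassSplitter` are hypothetical names, and the sketch only counts subpasses instead of recording Vulkan commands.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-device quirk, 0 means "no limit".
struct Quirks {
    std::size_t maxSubpassCount;
};

// Hypothetical recorder that only tracks how many subpasses each renderpass holds.
class RenderPassSplitter {
  public:
    explicit RenderPassSplitter(Quirks pQuirks) : quirks(pQuirks) {}

    // Records one subpass, starting a new renderpass first if the current one is full.
    void AddSubpass() {
        bool full{!renderPasses.empty() && quirks.maxSubpassCount != 0 &&
                  renderPasses.back() >= quirks.maxSubpassCount};
        if (renderPasses.empty() || full)
            renderPasses.push_back(0); // Begin a fresh renderpass
        ++renderPasses.back();
    }

    // Subpass count of every renderpass, in recording order.
    const std::vector<std::size_t> &RenderPasses() const {
        return renderPasses;
    }

  private:
    Quirks quirks;
    std::vector<std::size_t> renderPasses;
};
```

With a limit of 64, recording 130 subpasses in this sketch yields three renderpasses holding 64, 64 and 2 subpasses, so no single renderpass exceeds the limit.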


111 Commits