skyline

mirror of https://github.com/skyline-emu/skyline.git synced 2024-11-16 23:29:26 +01:00

Author	SHA1	Message	Date
Robin Kertels	594f061b21	Implement SSBOs Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-04-14 14:14:52 +05:30
Billy Laws	82d2a9ab56	Unify engine related macros to avoid excessive code duplication	2022-04-14 14:14:52 +05:30
Billy Laws	ae41ddf4f0	Implement a skeleton compute engine The Kepler compute engine is used to run compute jobs encapsulated into QMDs on the GPU. This commit doesn't implement compute itself but adds the register and QMD structs that will be needed for it in the future.	2022-04-14 14:14:52 +05:30
Billy Laws	0298a7b1f6	Implement the actual inline to memory engine on subch 2 Used mostly by OGL games for copying stuff around.	2022-04-14 14:14:52 +05:30
Billy Laws	ba7111d33a	Add maxwell3d I2M support	2022-04-14 14:14:52 +05:30
Billy Laws	8c73b62b2c	Implement basic inline2memory engine support Not currently used by anything but will be used by both compute, 3D and its own engine in the future. Block linear copies are currently unsupported.	2022-04-14 14:14:52 +05:30
Billy Laws	5c387f5c5a	Fixup depth mode init value to allow ignoring redundant calls	2022-04-14 14:14:52 +05:30
PixelyIon	7a5c771f44	Rework GPU BufferView to have handle-like semantics We wanted views to extend the lifetime of the underlying buffers and at the same time preserve all views until the destruction of the buffer to prevent recreation which might be costly in the future when we need `VkBufferView`s of the buffer but also require a centralized list of all views for recreation of the buffer. It also removes the inconsistency between `BufferView*` being returned in `GetXView` in `GraphicsContext`.	2022-04-14 14:14:52 +05:30
Billy Laws	fae5332f20	Disable descriptor aliasing on Adreno to work around shader compiler bug Aliased descriptor sets are incorrectly interpreted by the shader compiler, causing it to bugger up LLVM function argument types and crash Co-authored-by: PixelyIon <pixelyion@protonmail.com>	2022-04-14 14:14:52 +05:30
Billy Laws	fc2c123ae2	Implement GPU depthMode register This controls the depth range used by the shader, hades already has support for the necessary patching so we only need to pass the current mode over to it and it'll do the necessary work.	2022-04-14 14:14:52 +05:30
Billy Laws	7e088ca465	Fix constbuf updates to actually increment the write offset Uses the register directly now as when we modify it we want the changes to be visible from macros too.	2022-04-14 14:14:52 +05:30
PixelyIon	d2f3479610	Use `eB5G6R5UnormPack16` VkFormat for `B5G6R5Unorm` and `R5G6B5Unorm` Using `eB5G6R5UnormPack16` (with a swizzle for `R5G6B5Unorm`) removes the need for `VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT` when those formats are aliased which happens in Sonic Mania among other titles.	2022-04-14 14:14:52 +05:30
PixelyIon	24d7066d8b	Add quirk to avoid `VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT` on Adreno GPUs Adreno GPUs have significant performance penalties from usage of `VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT`, which requires disabling UBWC and, on Turnip, forces linear tiling. As a result, it's been made an optional quirk which doesn't supply the flag in `VkImageCreateInfo` and logs a warning if a view with a different Vulkan format from the original image is created.	2022-04-14 14:14:52 +05:30
PixelyIon	731d06010d	Set `eMutableFormat` in Texture Image Creation We often need to alias the underlying data as multiple Vulkan formats which requires the `eMutableFormat` bit to be set in `VkImageCreateInfo`, without doing this there'll be validation layer errors and potentially GPU bugs.	2022-04-14 14:14:52 +05:30
PixelyIon	dafcfa68ca	Transition texture layout to `eGeneral` after creation We no longer set the layout to general inside the Texture constructor, yet we need it to be set prior to the image being used as an attachment, so we transition the layout to `eGeneral` after creation of the texture object.	2022-04-14 14:14:52 +05:30
MK73DS	647cb07dc8	Stub functions in IAccountServiceForApplication: - GetUserCount - InitializeApplicationInfo - IsUserAccountSwitchLocked	2022-04-14 14:14:52 +05:30
PixelyIon	730bf504f8	Correct Adreno texture binding quirk We incorrectly determined an Adreno driver bug to require padding between binding slots but the real issue was not supporting consecutive binding writes for `VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER` and was fixed by the padding slot unintentionally requiring individual writes. The quirk has now been corrected to explicitly specify this as the bug and the solution is more apt.	2022-04-14 14:14:52 +05:30
PixelyIon	da8cb48933	Fix Interval Map `GetAlignedRecursiveRange` lookup bug Any lookups done using `GetAlignedRecursiveRange` incorrectly added intervals in the exclusive interval entry lookups as the condition for adding them was the reverse of what it should've been due to a last minute refactor, it led to graphical glitches and crashes. This has been fixed and the lookups should return the correct results.	2022-04-14 14:14:52 +05:30
PixelyIon	f2faa74707	Fix crashes due to `SEGV_ACCERR` check On certain devices, accesses to a protected memory region can return `si_code` as non-`SEGV_ACCERR` values. As we only passed access violations to the trap handler, such faults on those devices were never passed to it and went to the crash handler instead, leading to a crash.	2022-04-14 14:14:52 +05:30
PixelyIon	77e2797219	Delete expired `weak_ptr`s for Texture/Buffer views A large amount of Texture/Buffer views would expire before reuse could occur in `Texture::GetView`/`Buffer::GetView`. These can lead to a substantial memory allocation given enough time and they are now deleted during the lookup while iterating on all entries. It should be noted that there are a lot of duplicate views that don't live long enough to be reused and the ultimate solution here is to make those views live long enough to be reused.	2022-04-14 14:14:52 +05:30
PixelyIon	881bb969c4	Implement access-driven Buffer synchronization Similar to constant redundant synchronization for textures, there is a lot of redundant synchronization of buffers. Although buffer synchronization is far cheaper than texture synchronization, it still has associated costs, which have now been reduced by only synchronizing on access.	2022-04-14 14:14:52 +05:30
PixelyIon	7532eaf050	Attach Texture to Cycle in `Texture::TransitionLayout` Not doing so could result in the texture being destroyed before the completion of a transition and lead to undefined behavior.	2022-04-14 14:14:52 +05:30
PixelyIon	3268b3779a	Implement access-driven Texture synchronization There was a lot of redundant synchronization of textures to and from host constantly as we were not aware of guest memory access, this has now been averted by tracking any memory accesses to the texture memory using the NCE Memory Trapping API and synchronizing only when required.	2022-04-14 14:14:52 +05:30
PixelyIon	3e33d49faf	Implement NCE Memory Trapping API An API for trapping accesses to guest memory and performing callbacks based on those accesses alongside managing protection of the memory. This is a fundamental building block for avoiding redundant synchronization of resources from the guest and host. Note: All accesses are treated as write accesses at the moment, support for picking up read accesses will be implemented later	2022-04-14 14:14:52 +05:30
PixelyIon	08510d75b0	Implement Interval Map An interval map is a crucial piece of infrastructure required for memory faulting to track any regions that have an associated callback and their protection. Additionally, efficient page-aligned lookups with semantics optimal for memory faulting are also a requirement and the ability to associate multiple regions with a single callback/protection entry rather than doing so on a per-region basis as we deal with split-mapping resources.	2022-04-14 14:14:52 +05:30
PixelyIon	5c9e42e384	Use mirror mappings for Textures and Buffers This is a prerequisite to memory trapping as we need to write to the mirror to avoid a race condition with external threads writing to a texture/buffer while we do so ourselves for the sync on a read/write, it also avoids an additional `mprotect` to `-WX`/`RWX` on a read access. An additional advantage for textures especially is that we now support split-mapping textures due to laying them out in a contiguous mirror and they will not require costly algorithmic changes. Buffers should also benefit from not needing to iterate over every region when they are split into multiple mappings.	2022-04-14 14:14:52 +05:30
PixelyIon	577a67babd	Support mirrors of multiple non-contiguous memory regions `CreateMirror` is limited to creating a mirror of a single contiguous region which does not work when creating a contiguous mirror of multiple non-contiguous regions. To support this functionality, `CreateMirrors` has been added, which expects a list of page-aligned regions and maps them into a contiguous mirror.	2022-04-14 14:14:52 +05:30
PixelyIon	e35ab6d1e0	Move to mapping guest AS as shared memory We want to create arbitrary mirrors in the guest address space and to make this possible, we map the entire address space as a shared memory file. A mirror is mapped by using `mmap` with the offset into the guest address space.	2022-04-14 14:14:52 +05:30
Billy Laws	a5dd961f01	Add support for batched method sending Important for constbuf updates which would be very slow if done one at a time.	2022-04-14 14:14:52 +05:30
Robin Kertels	43879e2476	Round up when calculating size of compressed texture in bytes	2022-04-14 14:14:52 +05:30
Robin Kertels	d889550e84	Don't set COLOR_ATTACHMENT_BIT for compressed formats. The better solution would be to only set this for formats that support it on original HW but this will get rid of the validation errors for now.	2022-04-14 14:14:52 +05:30
Robin Kertels	82296ac5b8	Use buffer size instead of allocation size for Buffer constructor Fixes a validation error.	2022-04-14 14:14:52 +05:30
Robin Kertels	752245c3c8	Enable provoking vertex feature	2022-04-14 14:14:52 +05:30
Robin Kertels	dd45d054e7	Enable shaderDrawParameters	2022-04-14 14:14:52 +05:30
Billy Laws	7e16c1f989	Heavily optimise GPFIFO command dispatch to reduce redundant checks Previously for methods with count > 1 the subchannel and engine would be looked up for each part of the method rather than only doing so at the start. Each call also needed to be looked up to see if it touched a macro or GPFIFO method. Fix this by doing checks outside of the main dispatch loop with templated helper lambdas to avoid needing to repeat lots of code. Maxwell3D is the only subchannel with a fast path for now but more can be added later if needed.	2022-04-14 14:14:52 +05:30
Billy Laws	b4927d0138	Add support for turnip and driver file redirection via libadrenotools	2022-04-14 14:14:52 +05:30
Billy Laws	dd91d063a5	Pass native library dir to OS + reorder OS init order so paths are first This is required for integrating libadrenotools, which needs access to library and app directories in the GPU class constructor.	2022-04-14 14:14:52 +05:30
Billy Laws	011de98940	Rework formats to support passing through guest swizzle values Almost every Maxwell format now directly corresponds to a Vulkan format. This allows formats to be passed through and the swizzle used directly from guest (with some extra swizzle handling for edge cases), thus saving the need to explicitly support each swizzle combination, which adds a lot of code bloat. The format header is additionally reordered with line breaks to separate formats by their bits-per-block.	2022-04-14 14:14:52 +05:30
Billy Laws	6f17d1351f	Fixup ordering for B10G11R11Float texture format	2022-04-14 14:14:52 +05:30
Billy Laws	78238d550a	Add 6 channel downmixing support for Audout The specific attenuations used for each channel are taken from Ryujinx.	2022-04-14 14:14:52 +05:30
Billy Laws	2e1a1a965d	Fixup AudioTrack locking	2022-04-14 14:14:52 +05:30
PixelyIon	727f83e969	Fix Incorrect Vertex Binding Divisor State Submission We always submit pipeline divisor descriptions regardless of binding input rate being vertex rather than instance. This is invalid behavior and has been fixed by only submitting binding descriptors when the input rate is per-instance.	2022-04-14 14:14:52 +05:30
PixelyIon	9f7e80cf8f	Fix Adreno Texture Sampler Binding Bug Adreno proprietary drivers suffer from a bug where `VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER` requires 2 descriptor slots rather than one, we add a padding slot to fix this issue. `QuirkManager` was introduced to handle per-vendor/per-device errata and allow enabling this on Adreno proprietary drivers specifically as to not affect the performance of other devices.	2022-04-14 14:14:52 +05:30
PixelyIon	ddb2ba8a1b	Rename `QuirkManager` to `TraitManager` Quirk terminology was deemed to be inappropriate for describing the features/extensions of a device. It has been replaced with traits which is far more fitting but quirks will be used as a terminology for errata in devices.	2022-04-14 14:14:52 +05:30
PixelyIon	0b2ce6a8f3	Fix Texture Handle Offset Calculation The texture handle offset calculation involved an incorrect shift by descriptor size which was found to be unnecessary and would result in an invalid handle that had the wrong TIC/TSC index and caused broken rendering.	2022-04-14 14:14:52 +05:30
PixelyIon	aa57ec6d55	Destroy `CommandExecutor` Nodes Before Waiting on Execution `nodes` and `syncTextures` were cleared after waiting on the `CommandExecutor` fence rather than before, this wasted execution time after the wait for something that could be performed prior to the wait.	2022-04-14 14:14:52 +05:30
PixelyIon	90a1b3348c	Implement D24S8 + R11G11B10 Formats	2022-04-14 14:14:52 +05:30
PixelyIon	bd718175ce	Enable `VK_KHR_uniform_buffer_standard_layout` when available We now attempt to enable `VK_KHR_uniform_buffer_standard_layout` when present as lax UBO layout significantly reduces complexity. If a device doesn't support this extension, we still assume that the device supports it implicitly as this has proven to be true across all major mobile GPU vendors regardless of the driver version but enabling this prevents validation layer errors.	2022-04-14 14:14:52 +05:30
PixelyIon	22ce531e6f	Force Memory Barrier at `VkRenderPass` Start We depend on past commands to have completed execution in a renderpass, a subpass dependency on all graphics stages from `VK_SUBPASS_EXTERNAL` to subpass #0 is used to enforce this. Nvidia and Adreno proprietary drivers implicitly do this but Turnip or Mali drivers require this or they execute out of order.	2022-04-14 14:14:52 +05:30
PixelyIon	35fde2cd0b	Rework Blocklinear Texture Deswizzling Blocklinear texture decoding was broken for padding blocks and would incorrectly decode them resulting in major texture corruption for any textures with their widths not aligned to 64 bytes. This has now been fixed with neater code which avoids redundant repetition of any code using lambdas and functions where necessary.	2022-04-14 14:14:52 +05:30
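
As an illustration of the rounding fix described in commit 43879e2476 ("Round up when calculating size of compressed texture in bytes"), the following is a minimal C++ sketch and not Skyline's actual code. The helper names `DivideCeil` and `CompressedSizeBytes` and their parameters are hypothetical. The example values assume a BC1-style format with 4x4 texel blocks of 8 bytes each.

```cpp
#include <cstddef>

// Hypothetical helper (not from the Skyline source): integer division rounded up.
constexpr std::size_t DivideCeil(std::size_t value, std::size_t divisor) {
    return (value + divisor - 1) / divisor;
}

// Hypothetical helper: size in bytes of a block-compressed texture.
// Partial blocks at the right and bottom edges still occupy a whole block,
// so the block counts are rounded up rather than truncated.
constexpr std::size_t CompressedSizeBytes(std::size_t width, std::size_t height,
                                          std::size_t blockWidth, std::size_t blockHeight,
                                          std::size_t bytesPerBlock) {
    return DivideCeil(width, blockWidth) * DivideCeil(height, blockHeight) * bytesPerBlock;
}

// A 5x5 texture with 4x4 blocks needs 2x2 blocks, i.e. 32 bytes at 8 bytes per block.
// Truncating division would count 1x1 blocks and report only 8 bytes.
static_assert(CompressedSizeBytes(5, 5, 4, 4, 8) == 32, "partial blocks are rounded up");
static_assert(CompressedSizeBytes(8, 8, 4, 4, 8) == 32, "aligned sizes are unaffected");
```

Because the helpers are `constexpr`, the checks run at compile time. Sizes that are already block-aligned give the same result as truncating division.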

667 Commits