Older Adreno proprietary drivers (5xx and below) will segfault while destroying the renderpass and associated objects if more than 64 subpasses are within a renderpass due to internal driver implementation details. This commit introduces checks to automatically break up a renderpass when that limit is hit.
Adreno GPUs have significant performance penalties from usage of `VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT` which require disabling UBWC and on Turnip, forces linear tiling. As a result, it's been made an optional quirk which doesn't supply the flag in `VkImageCreateInfo` and logs a warning if a view with a different Vulkan format from the original image is created.
We often need to alias the underlying data as multiple Vulkan formats which requires the `eMutableFormat` bit to be set in `VkImageCreateInfo`, without doing this there'll be validation layer errors and potentially GPU bugs.
As we no longer set the layout to general inside the Texture constructor, yet, we need it to be set prior to the image being used as an attachment. We need to transition the layout to `eGeneral` after creation of the texture object.
A large amount of Texture/Buffer views would expire before reuse could occur in `Texture::GetView`/`Buffer::GetView`. These can lead to a substantial memory allocation given enough time and they are now deleted during the lookup while iterating on all entries.
It should be noted that there are a lot of duplicate views that don't live long enough to be reused and the ultimate solution here is to make those views live long enough to be reused.
There was a lot of redundant synchronization of textures to and from host constantly as we were not aware of guest memory access, this has now been averted by tracking any memory accesses to the texture memory using the NCE Memory Trapping API and synchronizing only when required.
This is a prerequisite to memory trapping as we need to write to the mirror to avoid a race condition with external threads writing to a texture/buffer while we do so ourselves for the sync on a read/write, it also avoids an additional `mprotect` to `-WX`/`RWX` on a read access.
An additional advantage for textures especially is that we now support split-mapping textures due to laying them out in a contiguous mirror and they will not require costly algorithmic changes. Buffers should also benefit from not needing to iterate over every region when they are split into multiple mappings.
The size of blocklinear textures did not consider alignment to Block/ROB boundaries before, it is aligned to them now. Incorrect sizes led to textures not being aliased correctly due to different size calculations for GraphicBufferProducer surfaces and Maxwell3D color RTs.
We don't respect the host subresource layout in synchronizing linear textures from the guest to the host when mapped to memory directly, this leads to texture corruption and while the real fix would involve respecting the host subresource layout, this has been deferred for later as real world performance advantages/disadvantages associated with this change can be observed more carefully to determine if it's worth it.
Fixes texture corruption due to incorrect synchronization, the barrier would not enforce waiting till the texture was entirely rendered causing an incomplete texture to be downloaded which lead to rendering bugs for certain GPUs including ARM's Mali GPUs.
The descriptor sets should now contain a combined image and sampler handle for any sampled textures in the guest shader from the supplied offset into the texture constant buffer.
Note: Games tend to rely on inline constant buffer updates for writing the texture constant buffer and due to it not being implemented, the value will be read as 0 which is incorrect.
Only copying a single aspect was supported by `CopyIntoStagingBuffer` earlier due to not supplying a `VkBufferImageCopy` for each aspect separately, this has now been done with Color/Depth/Stencil aspects having their own `VkBufferImageCopy` for the `VkCmdCopyImageToBuffer` command.
The definition of the `TextureView` class was spread across `texture.cpp` and has now been moved to the top of the file above the other half of the definition.
Sets `VkImageUsageFlags` correctly rather than hardcoding it for color attachments and adds multiple `VkBufferImageCopy` to `VkCmdCopyBufferToImage` for Color/Depth/Stencil aspects of an image.
We want `TextureView`(s) to be disconnected from the backing on the host and instead represent a specific texture on the guest with a backing that can change depending on mapping of new textures which'd invalidate the backing but should now be automatically repointed to an appropriate new backing. This approach also requires locking of the backing to function as it is mutable till it has been locked or the backing has an attached `FenceCycle` that hasn't been signaled which will be added for `CommandExecutor` in a subsequent commit.
It was determined that `backing` wasn't a very descriptive name and that it conflicted with the texture's own backing, the name was changed to `texture` to make it more apparent that it was specifically the `Texture` object backing the view.
The Vulkan Pipeline Barriers were unoptimal and incorrect to some degree prior as we purely synchronized images and not staging buffers. This has now been fixed and improved in general with more relevant synchronization.
* Fix `AddClearColorSubpass` bug where it would not generate a `VkCmdNextSubpass` when an attachment clear was utilized
* Fix `AddSubpass` bug where the Depth Stencil texture would not be synced
* Respect `VkCommandPool` external synchronization requirements by making it thread-local with a custom RAII wrapper
* Fix linear RT width calculation as it's provided in terms of bytes rather than format units
* Fix `AllocateStagingBuffer` bug where it would not supply `eTransferDst` as a usage flag
* Fix `AllocateMappedImage` where `VkMemoryPropertyFlags` were not respected resulting in non-`eHostVisible` memory being utilized
* Change feature requirement in `AndroidManifest.xml` to Vulkan 1.1 from OGL 3.1 as this was incorrect
Infrastructure for always syncing textures has been introduced now, they will be synced prior to and after every execution. This does considerably reduce the performance alongside waiting on GPU execution to finish but it will be partially recouped once conditional syncing is performed.
Support for subpasses was added by reworking attachment reuse code to account for preserved attachments and subpass dependencies. A lot of RT formats were also added to allow SMO to boot up entirely, it should be noted that it doesn't render anything.
`FenceCycle` had a cyclic dependency which broke clean exit, we now utilize `std::weak_ptr<FenceCycle>` inside the `Texture` object. A minor fix for broken stack traces was also made caused by supplying a `nullptr` C-string to libfmt when a symbol was unresolved which caused an `abort` due to invocation of `strlen` with it.
This commit implements a filter by type for any validation layer output, this allows filtering out any logs which may be unnecessary and additionally triggering a breakpoint as required.
An issue concerning the `NDEBUG` flag never being set was fixed, it's now supplied as a release compiler flag. The issue can manifest itself by always relying on a validation layer even though it shouldn't on release, this is why the validation layer was mistakenly disabled entirely previously by using `#ifndef` rather than `#ifdef`.
An issue with the initial layout of a texture being supplied as neither `VK_IMAGE_LAYOUT_UNDEFINED` or `VK_IMAGE_LAYOUT_PREINITIALIZED` was fixed, these cases are now handled by transitioning to those layouts after creating the image rather than supplying it within `initialLayout`.
Another issue was fixed regarding not maintaining a transformation after a surface has been destroyed and recreated existed and manifested itself when the user would go out of the app and come back in, they would see the surface having an identity transformation rather than the desired one.
Implement the groundwork for the texture manager to be able to report basic overlaps and be extended to support more in the future. The Maxwell3D registers `RenderTargetControl`, `RenderTarget` and a stub for `ClearBuffers` were implemented.
A lot of changes were also made to `GuestTexture`/`Texture` for supporting mipmapping and multiple array layers alongside significant architectural changes to `GuestTexture` effectively disconnecting it from `Texture` with it no longer being a parent rather an object that can be used to create a `Texture` object.
Note: Support for fragmented CPU mappings hasn't been added for texture synchronization yet
We had issues when combining host and guest presentation since certain configurations in guest presentation such as double buffering were very unoptimal for the host and would significantly affect the FPS. As a result of this, we've now made host presentation have its own presentation textures which are copied into from the guest at presentation time, allowing us to change parameters of the host presentation independently of the guest.
We've implemented the infrastructure for this which includes being able to create images from host GPU memory using VMA, an optimized linear texture sync and a method to do on-GPU texture-to-texture copies.
We've also moved to driving the V-Sync event using AChoreographer on its on thread in this PR, which more accurately encapsulates HOS behavior and allows games such as ARMS to boot as they depend on the V-Sync event being signalled even when the game isn't presenting.
This commit reworks the `Texture` class to include a Vulkan Image backing that can be optionally owning or non-owning and swapped in with consideration for Vulkan image layout, it also adds CPU-sided synchronization for the texture objects with FenceCycle. It also makes the appropriate changes to `PresentationEngine` and `GraphicBufferProducer` to work with the new `Texture` class while setting the groundwork for supporting swapchain recreation. It also fixes a log in `IpcResponse` and improves the display mode selection algorithm by further weighing refresh rate.