The descriptor sets should now contain a combined image and sampler handle for any sampled textures in the guest shader from the supplied offset into the texture constant buffer.
Note: Games tend to rely on inline constant buffer updates for writing the texture constant buffer and due to it not being implemented, the value will be read as 0 which is incorrect.
We want read semantics inside the constant buffer object via the mappings to avoid a pointless GPU VMM mapping lookup. It is a fairly frequent operation so this is necessary, the ability to write directly will be added in the future as well.
Implements parsing for the Maxwell 3D TIC pool and conversion of a TIC into a `GuestTexture`, support is limited to pitch-linear RGB565/A8R8G8B8 textures at the moment but will be extended as games utilize more formats and layouts. Support for 1D buffers is also omitted at the moment since they need special handling with them effectively being treated as buffers in Vulkan rather than images.
The pitch of the texture should always be supplied in terms of bytes as it denotes alignment on a byte boundary rather than a pixel one, it is also always utilized in terms of bytes rather than pixels so this avoids an unnecessary conversion.
Note: GBP stride unit was assumed to be pixels earlier but is likely bytes which is why there are no changes to the supplied value there, if this is not the case it'll be fixed in the future
Maxwell3D `TextureSamplerControl` (TSC) are fully converted into Vulkan samplers with extension backing for all aspects that require them (border color/reduction mode) and approximations where Vulkan doesn't support certain functionality (sampler address mode) alongside cases where extensions may not be present (border color).
Code involving caching of mappings was copied from `RenderTarget` without much consideration for applicability in buffers, the reason for caching mappings in RTs was that the view may be invalidated by more than the IOVA/Size being changed but this doesn't hold true for buffers generally so invalidation can only be on the view level with the mappings being looked up every time since the invalidation would likely change them.
`std::hash` doesn't have a generic template where it can be utilized for arbitrary trivial objects and implementing this might result in conflicts with other types. To fix this a generic templated hash is now provided as a utility structure, that can be utilized directly in hash-based containers such as `unordered_map`.
Nullability allow for optional semantics where a span may be explicitly invalidated with `nullptr` being used as a sentinel value for it and a boolean operator that allows trivial checking for if the span is valid or not.
Adds support for index buffers including U8 index buffers via the `VK_EXT_index_type_uint8` extension which has been added as an optional quirk but an exception will be thrown if the guest utilizes it but the host doesn't support it.
Add support for parsing and combining `VertexA` and `VertexB` programs into a single vertex pipeline program prior to compilation, atomic reparsing and combining is supported to only reparse the stage that was modified and recombine once at most within a single pipeline compilation.
Atomically invalidate pipeline stages as runtime information that pertains to them changes rather than never recompiling pipelines on runtime information being updated resulting in out of date pipelines or recompiling all pipelines on any runtime information updates.
Shader compilation is now broken into shader program parsing and pipeline shader compilation which will allow for supporting dual vertex shaders and more atomic invalidation depending on runtime state limiting the amount of work that is redone.
Bindings are now properly handled allowing for bound UBOs to be converted to the appropriate host UBO as designated by the shader compiler by creating Vulkan Descriptor Sets that match it.
We need this to make the distinction between a shader and pipeline stage in as shader programs are bound at a different rate than that of pipeline stage resources such as UBO.
An instance of `Shader::Backend::Bindings` must be retained across all stages for correct emission of bindings, which is now done inside `GraphicsContext::GetShaderStages`.
The vertex attribute types supplied prior were just the default which is `Float`, this works for some cases but will entirely break if the attribute type isn't a float. The attribute types are now set correctly.
Only copying a single aspect was supported by `CopyIntoStagingBuffer` earlier due to not supplying a `VkBufferImageCopy` for each aspect separately, this has now been done with Color/Depth/Stencil aspects having their own `VkBufferImageCopy` for the `VkCmdCopyImageToBuffer` command.
The definition of the `TextureView` class was spread across `texture.cpp` and has now been moved to the top of the file above the other half of the definition.
A buffer with 0 as the start/end IOVA should be invalid as there shouldn't be any mappings at 0 in GPU VA, titles such as Puyo Puyo Tetris configure the Vertex Buffer with 0 IOVAs which leads to a segmentation fault without this exception.
The lifetime of a texture and buffer view is now bound by the `FenceCycle` in `CommandExecutor`, this ensures that a `VkImageView` isn't destroyed prior to usage leading to UB.
The lifetime of all textures bound to a RenderPass alongside syncing of textures is already handled by `CommandExecutor` and doesn't need to be redundantly handled by `RenderPassNode`. It's been removed as a result of this.
Adds the depth/stencil RT as an attachment for the draw but with `VkPipelineDepthStencilStateCreateInfo` stubbed out, it'll not function correctly and the contents will not be what the guest expects them to be.
Support for clearing the depth/stencil RT has been added as its own function via either optimized `VkAttachmentLoadOp`-based clears or `vkCmdClearAttachments`. A bit of cleanup has also been done for color RT clears with the lambda for the slow-path purely calling the command rather than creating the parameter structures.
Implements `AddClearDepthStencilSubpass` in `CommandExecutor` which is similar to `ClearColorAttachment` in that it uses `VK_ATTACHMENT_LOAD_OP_CLEAR` for the clear which is far more efficient than using `VK_ATTACHMENT_LOAD_OP_LOAD` then doing the clear.
The stage/access mask for `VkSubpassDependency` were hardcoded to only be valid for color attachments earlier, this has now been fixed by branching based on the format aspect.
Sets `VkImageUsageFlags` correctly rather than hardcoding it for color attachments and adds multiple `VkBufferImageCopy` to `VkCmdCopyBufferToImage` for Color/Depth/Stencil aspects of an image.
Support the Maxwell3D Depth RT for Z-buffering, this just creates an equivalent `RenderTarget` object with no support on the API-user side (IE: `Draw` and `ClearBuffers`).
This prefixes all RT functions that deal with color RTs with `Color` and abstracts out common functions that will be used for both color and depth RTs. All common Maxwell3D structures are also moved out of the `ColorRenderTarget` (`RenderTarget` previously) structure.
To allow for caching of pipelines on the host a `VkPipelineCache` has been added, it is entirely in-memory and is not flushed to the disk which'll be done in the future alongside caching guest shaders to further avoid translation where possible.
Uses all Maxwell3D state converted into Vulkan state to do an equivalent draw on the host GPU, it sets up RT/Vertex Buffer/Vertex Attribute/Shader state and creates a stubbed out `VkPipelineLayout` for the draw. Any descriptor state isn't currently handled and is yet to be implemented, currently there's no Vulkan pipeline cache supplied which will be implemented subsequently.
We require a handle to the current renderpass and the index of the subpass in certain cases, this is now tracked by the `CommandExecutor` and passed in as a parameter to `NextSubpassFunctionNode` and the newly-introduced `SubpassFunctionNode`.
Switch from `SubmitWithCycle` to manually allocating the active command buffer to tag dependencies with the `FenceCycle` that prevents them from being mutated prior to execution. This new paradigm could also allow eager recording of commands with only submission being deferred.
`CommandScheduler` API users can now directly allocate an active command buffer that they need to manage alongside its fence, this can allow for more efficient recording as it doesn't need to be immediately submitted after, it can also allow attaching objects to a `FenceCycle` prior to submission that can be useful for locking resources.
Compiles shaders supplied by the guest with caching and automatic invalidation, the size of the shader is also automatically determined by looking for `BRA $` instructions which cause an infloop, it should be noted that we have a maximum shader bytecode size, any shader above this size will not be supported.
Graphics shaders can now be compiled using the shader compiler and emit SPIR-V that can be used on the host. The binding state isn't currently handled alongside constant buffers and textures support in `GraphicsEnvironment` yet.
The operands of the subtraction in the X/Y translation calculation were the wrong way around which led to negative translations that would translate the viewport off the screen.
The default color write mask should mask no channels and write all of them and should be mutated to mask out certain channels as required by the guest.