Commit Graph

1024 Commits

Author SHA1 Message Date
Robin Kertels
0a3cf25823 Implement the Fermi 2D blitting engine
The Fermi 2D engine implements both image blit and resolve operations, supporting subpixel sampling with both linear and point filtering.

Resolve operations are performed by sampling from the center of each pixel in order to resolve the final image from the MSAA samples
MSAA images are stored in memory like regular images but each pixels dimensions are scaled: e.g for 2x2 MSAA
```
112233
112233
445566
445566
```
These would be sampled with both duDx and duDy as 2 (integer part), resolving to the following:
```
123
456
```
Blit operations are performed by sampling from the corner of each pixel, scaling the image as one would expect.

This implementation isn't fully complete as Vulkan blit doesn't support some combinations which Fermi does, most notably between colour and depth stencil. These will be implemented properly at a later date, likely after the texture manager rework.
Out of Bounds Blit, used by some OpenGL games is also missing since supporting it requires texture aliasing, this will also be supported after the texture manager rework.

Co-authored-by: Billy Laws <blaws05@gmail.com>
2022-05-13 22:37:37 +01:00
Billy Laws
be2546138d Move IOVA class to GMMU so it can be used for other engines 2022-05-13 22:37:37 +01:00
Billy Laws
3ad640fcbc Fix accidental graphics context member/parameter duplication 2022-05-13 22:37:37 +01:00
PixelyIon
7a6f27a19a Fix texture swizzling OOB writes
Certain writes during swizzling went out of bounds due to incorrect `blockExtentY` calculation, the previous commit to fix this ended up breaking it further. This commit returns to the original commit's calculations with the proper addendum of a check for exact alignment with a GOB which is the case that was broken earlier.
2022-05-13 14:52:41 +05:30
PixelyIon
168e51e7ad Always use GetLayerStride for layer stride in Texture
The `GuestTexture::GetLayerStride` function was not always being utilized to retrieve the layer stride inside `Texture`, it would instead directly access the `guestTexture::layerStride` member. This is problematic as it may not be initialized and return `0` which would lead to a broken image copy.
2022-05-13 14:21:37 +05:30
Billy Laws
b81d5bc865 Implement and cleanup semaphore operations in all engines
Most engines have the capability to release a semaphore payload (or reduce in the case of GPFIFO) when a method is called or action is complete. Semaphores are used by games for both timing how long things take on GPU and waiting on resources so missing them can cause deadlocks or other related issues.
2022-05-12 19:40:24 +01:00
Billy Laws
bca88685bd Stub nvdrv {Get,Dump}Status 2022-05-12 17:38:22 +01:00
Billy Laws
97e740c986 Fix slight locking bug with nvmap handle duplication 2022-05-12 17:38:22 +01:00
Billy Laws
57378457dc Treat symbol file paths without slashes as filenames
Prevents crashes printing backtrace if this occurs
2022-05-12 17:38:22 +01:00
Billy Laws
d08ac63bbf Use TIC maximum index over TSC when tscIndexLinked is set 2022-05-12 17:38:22 +01:00
Billy Laws
8e021a9f1f Load custom drivers from app private data dir
Required since /sdcard doesn't have exec perm support
2022-05-12 17:38:21 +01:00
Billy Laws
dcef597345 Introduce TrivialObject concept and use where appropriate
Simplifies type checking and handles excluding container types that are trivially copyable but contain pointers
2022-05-12 17:38:21 +01:00
PixelyIon
f2cc25ee9f Implement Array Texture Swizzling
Textures can have more than one layer which we currently don't handle, all layers past the initial one will be filled with random data or 0s, leading to incorrect rendering. This has now been implemented now which fixes any titles which utilize array textures, such as "Super Mario Odyssey" or "Hatsune Miku: Project DIVA MegaMix".
2022-05-12 18:23:45 +05:30
PixelyIon
2a99e1784d Fix Maxwell3D RT Depth/Layer Count Logic
The Maxwell3D RT layer count wasn't being set correctly as it has the same register as the depth values and is toggled between the two based on another register value.
2022-05-12 18:23:05 +05:30
Billy Laws
543ac3042e Cleanup account services and stub StoreSaveDataThumbnail 2022-05-11 23:24:35 +01:00
Billy Laws
7d30ac0cd8 Add additional nifm stubs 2022-05-11 23:24:35 +01:00
Billy Laws
a164635f32 Stub LibraryAppletPlayerSelect 2022-05-11 23:24:35 +01:00
PixelyIon
4ec1cc7086 Update Build Tools to 33.0.0-rc4
Google has removed `33.0.0-rc3` from their servers and it can no longer be downloaded by the CI, the build tools version has been updated as a result.
2022-05-12 02:53:01 +05:30
Billy Laws
dd0004e208 Set Host1x log tag correctly 2022-05-11 22:11:16 +01:00
Billy Laws
f89bacf8ae Fixup Host1x syncpoint locking 2022-05-11 22:04:02 +01:00
Billy Laws
d8ff318a1a Prevent infinite VFS read loop on EOF 2022-05-11 22:03:39 +01:00
shutterbug2000
f078a5d1ec Stub bt and btm:u
Stub BT services which is required by titles such as Pokémon Let's GO Pikachu and Eevee (non-Demo versions).
2022-05-11 20:44:09 +05:30
PixelyIon
588b4529ee Implement 3D Texture Swizzling
The Maxwell GPU supports 3D textures which are tiled with the block-linear layout which didn't handle swizzling 3D textures correctly till now. This commit addresses that by implementing proper swizzling for 3D textures. Titles such as Cluster Truck and Super Mario Odyssey utilize 3D textures alongside a vast majority of other titles.
2022-05-11 14:06:04 +05:30
Billy Laws
601d67e369 Use resource size rather than allocation size for staging buffer size
As per VMA docs: 'Allocation size returned in this variable may be greater than the size requested for the resource e.g. as VkBufferCreateInfo::size. Whole size of the allocation is accessible for operations on memory e.g. using a pointer after mapping with vmaMapMemory(), but operations on the resource e.g. using vkCmdCopyBuffer must be limited to the size of the resource.'
2022-05-10 18:48:20 +01:00
Billy Laws
d2acec24f5 Handle VFS reads into trapped memory regions
pread will refuse to read into any trapped regions so implement a manual path with a staging buffer and memcpy for such cases
2022-05-10 18:33:55 +01:00
Billy Laws
1609fd2a32 Account for layerCount in SynchronizeGuestWithBuffer staging buffer size 2022-05-10 18:33:31 +01:00
Billy Laws
5b97b87503 Restore previous cullMode when cullFace is enabled 2022-05-10 18:31:32 +01:00
Billy Laws
15e9fa1c80 Fix FillRandomBytes
There were two issues here:
- If a skyline span was passed as a param then the 'T &object' version would be called, filling the span itself with random values rather than its contents
- Random numbers were repeated every call since independent_bits_engine copied generator state and thus it was never actually updated
2022-05-10 18:28:15 +01:00
Billy Laws
622ff2a8f1 Correctly track 5.1 audio channel sample count
Size needs to be adjusted for 5.1 buffers since they're downsampled to stereo.
2022-05-10 18:26:20 +01:00
PixelyIon
56c9b03843 Fix incorrect swizzling Y extent calculation
This calculation for the amount of lines on the Y axis relative to the start of the last block was wrong and would instead determine the amount of lines to the last Y-axis GOB which wasn't accurate when padding was considered, this resulted in titles like Celeste having broken texture decoding (on a 1922x1082 texture) for the last ROB as most pixels would be masked out.
2022-05-09 20:25:43 +05:30
Billy Laws
018df355f0 Replace some VFS exceptions with warnings
These errors aren't necessarily fatal so tone them down.
2022-05-08 19:37:10 +01:00
Billy Laws
e1c13bbc08 Update hades 2022-05-08 19:37:10 +01:00
PixelyIon
b307fca115 Fix attachment reuse within the same subpass
Certain titles such as BOTW trigger behavior to reuse an attachment within the same subpass, this caused an exception inside `RenderPassNode::AddAttachment` as it cannot find corresponding subpass for attachment. To fix this issue, we now assume that when it cannot find a subpass for an existing attachment, it is attached to the latest subpass and return the attachment.
2022-05-08 18:26:40 +05:30
PixelyIon
e027555796 Handle Y-axis GOB non-alignment for swizzling
Certain textures may be unaligned with a GOB's height of 8 lines, we already handle the case of being unaligned with a GOB's width of 64-bytes. This case occurs on titles such as SMO when going in-game.
2022-05-07 18:37:22 +05:30
PixelyIon
c910e29168 Extend HostSignalHandler's SIGSEGV debugger path
The function now returns from a segmentation fault when a debugger is present, this allows the entire context to be intact which can allow the debugger to correctly pick up variables from all stack frames while it could not extrapolate most variables when trapped inside the signal handler without the values of all registers.
2022-05-07 18:37:22 +05:30
Billy Laws
4149ab1067 Implement Maxwell 3D instanced draw support
In the Maxwell 3D engine, instanced draws are implemented by repeating the exact same draw in sequence with special flag set in vertexBeginGl. This flag allows either incrementing the instance counter or resetting it, since we need to supply an instance count to the host API we defer all draws until state changes occur. If there are no state changes between draws we can skip them and count the occurences to get the number of instances to draw.
2022-05-07 13:56:09 +01:00
Billy Laws
03594a081c Ensure correct flushing for batched constant buffer updates
Cbufs could be read by non-maxwell3D engines so force a flush when switching to them or before Execute.
2022-05-07 13:56:09 +01:00
PixelyIon
ad989750fc Implement Maxwell3D Point Sprite Size
Implements register state that corresponds to the size of a single point sprite in Maxwell 3D, this is emitted by the shader compiler in the preamble but needs to be only applied if the input topology is a point primitive and it is invalid to set the point size in any other case.
2022-05-07 03:46:25 +05:30
PixelyIon
874a6a2a6c Fix getTextureType enum conversion fomatting 2022-05-07 03:46:25 +05:30
PixelyIon
ae5bcbdb5c Fix Depth RT lock to be in scope
Earlier texture locking design required the lock to be retained but since the introduction of `AttachTexture`, this no longer needs to be done. This being done caused deadlocks when the depth texture is sampled by the fragment shader while being bound as an RT since it would attempt to lock the texture again.
2022-05-07 02:37:48 +05:30
shutterbug2000
1c8d994161 Basic bcat:u implementation
A basic `bcat:u` implementation to prevent titles such as "Kirby and the Forgotten Land" dependent on BCAT support from crashing due to the lack of an implementation.
2022-05-06 15:41:48 +05:30
PixelyIon
4fd64a53e0 Require Vulkan samplerAnisotropy feature
This is a widely supported feature that games may require conditionally but due to it being supported on effectively all target devices, it was made mandatory. This is used by titles such as ARMS.
2022-05-06 15:41:48 +05:30
PixelyIon
1d9b4a865a Add additional formats to Adreno filter
`VK_FORMAT_R32G32B32A32_SFLOAT` and `D32_SFLOAT` have their capabilities misreported as well, this spams the logs in titles such as ARMS.
2022-05-06 15:41:48 +05:30
PixelyIon
b87295374e Improve Controller Applet log
Improves the readability of the log and replaces the previously uninformative prefix of `operator()` due to being in a lambda with `Controller support`.
2022-05-06 15:41:48 +05:30
PixelyIon
98c730a644 Implement linked TIC/TSC handle in Maxwell3D
Maxwell3D has a register for linking the TIC/TSC index in bindless texture handles, this is used by games to implement bindless combined texture-sampler handles.
2022-05-06 14:58:20 +05:30
PixelyIon
23a091100d Implement ReadCbufValue + ReadTextureType
Implements `GraphicsEnvironment::ReadCbufValue` & `GraphicsEnvironment::ReadTextureType` with a framework of heterogeneous lookups for caching and callbacks for querying constant buffer or TIC values with validation checks for successive draws to ensure unique IR is generated.
2022-05-06 14:39:36 +05:30
PixelyIon
765c3f4e1f Allow draws with no descriptor set resources
The `descriptorSetWrites` being filled is now optional and the case of it being empty is handled correctly, this is done by certain titles such as ARMS and is entirely valid behavior. It should be noted that not doing this leads to errors in the guest due to invalid GPU state while working on the host GPU.
2022-05-06 10:33:47 +05:30
PixelyIon
37327f1955 Fix and refactor SVC SignalToAddress/WaitForAddress
SVC `SignalToAddress` had a bug with the behavior of `SignalAndModifyBasedOnWaitingThreadCountIfEqual` which was entirely incorrect and led to deadlocks in titles such as ARMS that were dependent on it. This commit corrects the behavior and refactors both SVCs and moves their arbitration/waiting to inside the corresponding `KProcess` function rather than the SVC to avoid redundancies and improve code readability.
2022-05-05 19:15:37 +05:30
PixelyIon
396979e897 Extend Adreno format-based filtering for Validation Layer
Filtering of validation logs is now extended beyond BCn formats and now covers other format which have their feature set misreported by the driver, this significantly drives down the amount of logs depending on the title.
2022-05-05 19:15:37 +05:30
PixelyIon
62ea2a6da5 Avoid format aliasing warnings on Adreno
Implements an algorithm to determine formats that can be aliased as views without needing `VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT`, this avoids spamming warning logs on view creation when the aliased formats will function in practice.
2022-05-05 19:15:37 +05:30