Billy Laws
3e5992e366
Update hades
2023-01-08 19:30:52 +00:00
Billy Laws
45bbf3bb2a
Fix indirect draws with direct buffers
...
We need to wait on the GPFIFO manually as we won't hit the traps when accessing the indirect params with direct buffers as we usually would.
2023-01-08 19:30:52 +00:00
Billy Laws
68ad052cb1
Add geometry passthrough shader support for vertex layer writes
2023-01-08 19:30:52 +00:00
Billy Laws
ec519a7d52
Return null texture on encountering unmapped textures
2023-01-08 19:30:52 +00:00
Billy Laws
97e127153b
Make shader trap mutex recursive
...
There are cases where we hit a shader trap within the GPU; by making it recursive we avoid deadlocking on reads within the GPU.
2023-01-08 19:30:52 +00:00
Billy Laws
1a6165f74d
Fix GetReadOnlyBackingSpan for non-direct buffers
...
This was missed in the initial implementation
2023-01-08 19:30:52 +00:00
Billy Laws
4e5141f879
Fix missed attempt increment in spinlock
...
Should hog CPU slightly less and correctly yield now
2023-01-08 19:30:52 +00:00
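As an illustration of the spinlock fix above, here is a minimal sketch of a spin-then-yield lock. The names `SpinLock` and `SpinLimit`, and the threshold value, are hypothetical and not taken from the project source. The point is that the attempt counter has to be incremented, otherwise the yield branch is never reached and the loop just burns CPU.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>

// Hypothetical spinlock: spins a bounded number of times, then yields to the scheduler.
class SpinLock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
    static constexpr std::size_t SpinLimit{64}; // Assumed threshold, for illustration only

  public:
    void lock() {
        std::size_t attempt{};
        while (flag.test_and_set(std::memory_order_acquire)) {
            // If this increment is missed, the condition below never becomes true
            // and the thread spins forever without yielding.
            if (++attempt >= SpinLimit) {
                std::this_thread::yield();
                attempt = 0;
            }
        }
    }

    // Returns true if the lock was acquired without waiting.
    bool try_lock() {
        return !flag.test_and_set(std::memory_order_acquire);
    }

    void unlock() {
        flag.clear(std::memory_order_release);
    }
};
```

The class satisfies the standard Lockable requirements, so it can be used with `std::scoped_lock` or `std::unique_lock`.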
Billy Laws
35a46acbb1
Determine storage buffer alignment dynamically
2023-01-08 19:30:52 +00:00
Billy Laws
12d80fe6c2
Use a shared mutex for GPU VMM to avoid deadlocks
...
Two reads need to be able to occur simultaneously or deadlocks can occur (e.g. a read traps to wait on the GPU but the GPU needs to read).
2023-01-08 19:30:52 +00:00
Billy Laws
28b2a7a8a1
Dynamically apply GPU turbo clocks only when GPU submissions are queued
...
Allows for the GPU to clock down in cases where it's idle for most of the time, while still forcing maximum clocks when we care.
2023-01-08 19:30:52 +00:00
Billy Laws
81f3ff348c
Transition memory handling from memfd to anonymous shared mappings
...
Memfd mappings are incompatible with KGSL user memory importing on older kernels; transition to shared anon mappings to avoid this.
2023-01-08 19:30:52 +00:00
Billy Laws
cc3c869b9f
Attempt to signal the vsync event at present time if possible
...
Some games rely on the vsync event to schedule frames; by matching its timing with presentation we can reduce needless waiting, as the game will immediately be able to queue the next frame after presentation.
2023-01-08 19:30:52 +00:00
Billy Laws
918a493a45
Implement wfi and setReference GPFIFO barriers
2023-01-08 19:30:52 +00:00
Billy Laws
7315ba04e6
Fixup optional flattenable binder obj structure
2023-01-08 19:30:52 +00:00
Billy Laws
90e21b0ca1
Split syncpoints into host-guest pairs
...
This allows for the presentation engine to grab the presentation image early when direct buffers are in use, since it'll handle sync on its own using semaphores it doesn't need to wait for GPU execution.
2023-01-08 19:30:52 +00:00
Billy Laws
966c31810a
Return appropriate fences in surfaceflinger queue buffer
2023-01-08 19:30:52 +00:00
Billy Laws
afef6c5123
Always populate all colour attachments
...
This better follows the Vulkan spec, which doesn't mention anything about writes to OOB attachments, only those marked as unused.
2023-01-08 19:30:52 +00:00
Billy Laws
3571737392
Reset maxwell3d quick bind state before adding subpasses to executor
...
If a submission happens during the call to addsubpass we could end up with invalid quick bind state; move this to before to prevent that.
2023-01-08 19:30:52 +00:00
Billy Laws
3d31ade35f
Implement an alternative buffer path using direct memory importing
...
By importing guest memory directly onto the host GPU we can avoid many of the complexities that occur with memory tracking, as well as the heavy performance overhead in some situations. Since it's still desired to support the traditional buffer method, as it's faster in some cases and more widely supported, most of the exposed buffer methods have been split into two variants with just a small amount of shared code. While in most cases the code is simpler, one area with more complexity is handling CPU accesses that need to be sequenced, since we don't have any place we can easily apply writes to on the GPFIFO thread that won't also impact the buffer on the GPU. To solve this, when the GPU is actively using a buffer's contents, an interval list is used to keep track of any GPFIFO-written regions on the CPU, and any CPU reads to them will instead be directed to a shadow of the buffer with just those writes applied. Once the GPU has finished using the buffer contents the shadow can be removed, as all writes will have been done by the GPU.
The main caveat of this is that it requires tying host sync to guest sync; this can reduce performance in games which double buffer command buffers, as it prevents us from fully saturating the CPU with the GPFIFO thread.
2023-01-08 19:30:52 +00:00
Billy Laws
b3f7e990cc
Allow for tying guest GPU sync operations to host GPU sync
...
This is necessary for the upcoming direct buffer support, as in order to use guest buffers directly without trapping we need to recreate any guest GPU sync on the host GPU. This avoids the guest thinking work is done that isn't and overwriting in-use buffer contents.
2023-01-08 19:30:52 +00:00
Billy Laws
89c6fab1cb
Implement a way to check if the command record thread is idle
...
Useful for debugging and testing
2023-01-08 19:30:52 +00:00
Billy Laws
c67f27e914
Add a setting to control the maximum number of accumulated GPU cmds
...
This helps to keep the GPU fed when processing large command buffers which don't have any syncpoints to force a flush in between.
2023-01-08 19:30:52 +00:00
Billy Laws
77214a98dd
Add a setting to force maximum GPU clocks on KGSL devices
2023-01-08 19:30:52 +00:00
Billy Laws
3ecaedd71e
Add adrenotools direct mapping support
2023-01-08 19:30:52 +00:00
Pablo
8846a85d3a
Stub some IPurchaseEventManager functions
2022-12-31 10:45:18 +00:00
PabloG02
80c0f8f04d
Implement full profile picture support
...
Extends the profile picture stub into a full-fledged implementation with the ability for users to set their profile picture in settings while having the Skyline icon as the default profile picture.
2022-12-27 22:53:41 +05:30
PixelyIon
7a3d2e4a26
Start KThread TID from 1 rather than 0
...
HOS's TIDs are one-based rather than zero-based. Certain titles such as Pokémon Arceus, Naruto Shippuden: Ultimate Ninja Storm 3, Splatoon 3, etc. use a TID of zero as a sentinel value, but as we previously assigned this ID to our first thread it broke this logic. This commit fixes that by matching HOS behavior.
2022-12-27 22:36:06 +05:30
Billy Laws
bab659587f
Use e1 sample count for blits
2022-12-22 18:05:45 +00:00
Billy Laws
516ece6b04
Calculate renderarea from attachment min size
2022-12-22 18:05:45 +00:00
Billy Laws
4a3cd69257
Populate graphics pipeline manager from cache at launch-time
2022-12-22 18:05:45 +00:00
Billy Laws
e9bcdd06eb
Introduce a pipeline cache manager for simple read/write cache accesses
...
All writes are done async into a staging file, which is then merged into the main pipeline cache file at the time of the next launch. Upon encountering file corruption the cache can be trimmed up to the last-known-good entry to avoid any excessive loss of data from just one error.
2022-12-22 18:05:45 +00:00
Billy Laws
06bf1b38af
Introduce a pipeline state accessor that reads from a bundle
2022-12-22 18:05:45 +00:00
Billy Laws
7dd3a1db0f
Avoid InterconnectContext use in graphics PipelineManager
...
We will soon move to a global pipeline manager instance, so it won't be possible to use InterconnectContext at pipeline-creation time anymore.
2022-12-22 18:05:45 +00:00
Billy Laws
ffe7263848
Add quirk for 615 drivers with broken multithreaded compilation
2022-12-22 18:05:45 +00:00
Billy Laws
755f7c75af
Add pipeline (de)serialisation support to bundle
...
See comments in code for details on the on-disk format.
2022-12-22 18:05:45 +00:00
Billy Laws
937eff392f
Switch execution-numbers to be globally unique tags
...
This is required for making pipelines usable across channels without introducing caching bugs.
2022-12-22 18:05:45 +00:00
Billy Laws
072b8193a1
Implement thread pool based async pipeline compilation with futures
...
By distributing the load of shader compiling onto multiple threads and then only waiting for completion when absolutely necessary we can reduce compilation stutters significantly.
2022-12-22 18:05:45 +00:00
Billy Laws
186549748d
Implement HelperShader-local pipeline cache and use dynamic state
...
Avoids the heavy overhead of the VK pipeline cache when we really only have a few bits of non-dynamic state
2022-12-22 18:05:45 +00:00
Billy Laws
9115b8cae8
Properly hash dynamic states in pipeline cache
2022-12-22 18:05:45 +00:00
Billy Laws
7c4b4765bf
Reduce thresholds for slot increase and buffer/texture fast readback
2022-12-22 18:05:45 +00:00
Billy Laws
ce428af2e6
Use attachment formats rather than views in VK pipeline cache
2022-12-22 18:05:45 +00:00
Billy Laws
e849264028
Abstract out pipeline-compile-time GPU state accesses
...
Introduces the base abstractions that will be used for pipeline caching, with a 'PipelineStateBundle' that can be (de)serialised to/from disk and an abstract accessor class to allow switching between creating disk-cached pipelines and fresh ones.
2022-12-22 18:05:45 +00:00
Billy Laws
2e96248fb6
Track RT format info in PackedPipelineState and move VK conv code there
...
When caching pipelines we can't cache whole images, only their formats so refactor PackedPipelineState so that it can be used for pipeline creation, as opposed to passing in a list of attachments.
2022-12-22 18:05:45 +00:00
Billy Laws
bc7e1eb380
Split-out hash from ShaderBinary struct
...
This isn't necessary for pipeline creation and creates some difficulty with pipeline caching.
2022-12-22 18:05:45 +00:00
Dima
de10ab1219
Stub SetConnectionConfirmationOption
2022-12-18 20:34:55 +00:00
Dima
f3b2b4317e
Stub some IPrepoService calls
2022-12-18 20:34:55 +00:00
Dima
efef67b92b
Stub some IAudioDevice calls
2022-12-18 20:34:55 +00:00
Dima
3a94bcf692
Fix ListOpenContextStoredUsers stub
...
The problem was that StoreOpenContext wasn't storing any user, but ListOpenContextStoredUsers was writing the default user (even when it hadn't been stored by StoreOpenContext).
2022-12-18 20:34:55 +00:00
TheASVigilante
3c5f8dd876
Fix small typo
2022-12-18 14:49:54 +00:00
lynxnb
6599c1dccf
Stub GyroscopeZeroDriftMode
...
Related service calls are called in a loop by SM3DW. A variable tracking zero drift mode has been added to `npad_device`, but it's unused at the moment.
2022-12-10 14:59:44 +00:00
Dima
dcc3047ba8
Stub ErrorCommonArg
2022-12-10 14:58:20 +00:00
Dima
68253fe995
Stub mii:e/mii:u
...
Needed for SSBU
2022-12-10 14:58:20 +00:00
Dima
69ee3cfc66
Stub DeleteDirectory
...
Should allow deleting/rewriting saves in some games
2022-12-10 14:58:20 +00:00
Dima
bbd34ae7e7
Validate if entries are not empty before using
...
Should fix saving problem in Baldur's Gate: Dark Alliance II at least
2022-12-10 14:58:20 +00:00
Dima
5f510d84d7
Stub IsVibrationPermitted
2022-12-10 14:58:20 +00:00
Dima
51d1f519af
Stub ListDisplays
2022-12-10 14:58:20 +00:00
Dima
a3866a3129
Stub LibraryAppletShop
2022-12-10 14:58:20 +00:00
Dima
1ebec7db82
Stub GetImageSize and LoadImage
2022-12-10 14:58:20 +00:00
Dima
52c4228ecf
Stub some friends service calls
...
Needed for Diablo 3
2022-12-10 14:58:20 +00:00
Dima
ebcbc5b05b
Validate NpadId for ActivateVibrationDevice
2022-12-10 14:58:20 +00:00
Dima
4bdd033354
Stub SetRecordVolumeMuted
2022-12-10 14:58:20 +00:00
Dima
f6d95aae01
Stub GetCacheStorageSize
2022-12-10 14:58:20 +00:00
Dima
4ab8699cd4
Stub ImportServerPki
2022-12-10 14:58:20 +00:00
Dima
41cf4bb12d
Stub GetLanguageCode
2022-12-10 14:58:20 +00:00
Dima
3e078d54b6
Stub GetIdleTimeDetectionExtension
2022-12-10 14:58:20 +00:00
Dima
2311f777fc
Stub IsCpuOverclockEnabled
2022-12-10 14:58:20 +00:00
Dima
4601c28c28
Stub GetCurrentIpAddress
2022-12-10 14:58:20 +00:00
Dima
18e6a6c53c
Stub DeclareOpenOnlinePlaySession and DeclareCloseOnlinePlaySession
2022-12-10 14:58:20 +00:00
Dima
150c1370c2
Stub some IApplicationFunctions funcs
2022-12-10 14:58:20 +00:00
Dima
a6f3aa3062
Stub TrySelectUserWithoutInteraction and ListQualifiedUsers
2022-12-10 14:58:20 +00:00
Dima
5a9a2861df
Add TitleId TextView in App Dialog
2022-12-10 14:57:46 +00:00
Billy Laws
7c802aea46
Mark vertex buffers as dirty on limit changes
2022-12-03 22:50:56 +00:00
Billy Laws
df19810c6c
Always set vertex stride for unbound buffers
2022-12-03 22:50:56 +00:00
Billy Laws
f4f658e3b7
Fix typo
2022-12-03 22:50:56 +00:00
Billy Laws
45b10ef776
Return whole mapping for shader code when end instrs aren't found
2022-12-03 22:50:56 +00:00
Billy Laws
d849875656
Only unlock GPU channel state on queue wait if it was previously locked
2022-12-03 22:50:56 +00:00
Billy Laws
a5e0a64adc
Switch patch error logs to debug
2022-12-03 22:50:56 +00:00
Billy Laws
af7c54297f
Cache staging buffer used for texture download
2022-12-03 22:50:56 +00:00
Billy Laws
bba07fb101
Update for new hades
2022-12-03 22:50:56 +00:00
Billy Laws
a16383fd4b
Disable compute shaders on mali
...
This will need to be debugged properly at some point but it's fine for now.
2022-12-03 22:50:56 +00:00
Billy Laws
137d801843
Skip host1x HW emulation and effectively stub submission
...
This was causing a bunch of logspam and isn't really needed as we will be using a HLE approach.
2022-12-03 22:50:56 +00:00
Billy Laws
579a2d9337
Add dynamic executor slot growth
2022-12-03 22:50:56 +00:00
Billy Laws
60169fce4c
Support 0-sized constant buffers
2022-12-03 22:50:56 +00:00
Billy Laws
b86dd99e1a
Align all SSBOs to 0x40 bytes
...
Required by Adreno GPUs
2022-12-03 22:50:56 +00:00
Billy Laws
bfae292fb0
Make executor slot count setting exponential
2022-12-03 22:50:56 +00:00
Billy Laws
e0ae94be9d
Enable robustness1 Vulkan feature
2022-12-03 22:50:56 +00:00
Billy Laws
bf03f945ee
Implement the Kepler compute engine
...
This can reuse a fair bit of the now-commonised Maxwell 3D code and mostly consists of compute-specific pipeline code which was deemed not suitable for being commonised (e.g. descriptor update code is somewhat duplicated). Of note is how compute lacks any active state at all due to its use of QMDs, which bundle up all state into a single object in memory.
2022-12-03 22:50:56 +00:00
Billy Laws
4bc81f007f
Add some convenience helpers to compute engine regs
2022-12-03 22:50:56 +00:00
Billy Laws
4267a6af36
Add support for parsing and compiling compute shaders to the shader manager
2022-12-03 22:50:56 +00:00
Billy Laws
86dab65af4
Commonise maxwell3d state updater
2022-12-03 22:50:56 +00:00
Billy Laws
a0b81d54d6
Use pitch layout for linear RTs
...
More likely to match in the texture cache when being sampled.
2022-12-03 22:50:56 +00:00
Billy Laws
ac85df7b7a
Start transition cache lookup with most recent one
2022-12-03 22:50:56 +00:00
Billy Laws
62c86b7690
Move maxwell3d to common constant buffer code
2022-12-03 22:50:56 +00:00
Billy Laws
8f0a6e78c5
Add Vulkan stride dynamic state and robustness support
...
Fixes the waterfall in SMO by specifying vertex buffer bounds.
2022-12-03 22:50:56 +00:00
Billy Laws
23a7f70a8e
Commonise maxwell3d guest shader caching code
2022-12-03 22:50:56 +00:00
Billy Laws
6f6a312692
Commonise maxwell3d pipeline binding handling code
...
A lot of pipeline code is difficult to commonise due to the inherent difference between compute and graphics pipelines, however the binding layout is shared so we can at least commonise that
2022-12-03 22:50:56 +00:00
Billy Laws
be8cbabd97
Commonise maxwell3d texture code
...
This will be shared with the compute engine implementation.
2022-12-03 22:50:56 +00:00
Billy Laws
61e95c4b2c
Commonise maxwell3d sampler code
...
This will be shared with the compute engine implementation, the only thing of note with this is that the binding register is now passed as a param since it is part of the compute QMD which can't be dirty tracked.
2022-12-03 22:50:56 +00:00
Billy Laws
7f93ec3df6
Commonise maxwell3d interconnect common code for use by other engines
...
The compute engine will require most of this for basic functionality.
2022-12-03 22:50:56 +00:00
Billy Laws
281838fde1
Apply GPU readback hack to both buffers and textures
...
And rename as appropriate.
2022-12-03 22:50:56 +00:00
Billy Laws
eb00dc62f8
Implement support for 36 bit games by using split code/heap mappings
...
Although rtld and IPC prevent TLS/IO and code from being above the 36-bit AS limit, nothing depends on the heap being below it. We can take advantage of this by stealing as much AS as possible for code in the lower 36-bits.
2022-12-02 22:10:03 +00:00
Dima
e8e1b910c3
Add possibility to disable audio output
2022-12-02 00:33:28 +01:00
lynxnb
70109f8fbd
Work around invalid values in CNTFRQ_EL0 register
...
Exynos SoCs have a bug where the `CNTFRQ_EL0` register is either set to 0 or contains incoherent values. With this patch, the frequency value is loaded into a static variable and used instead of reading the register. The value will be initialised to the correct value for affected SoCs, while unaffected ones will use the value from the register.
2022-12-02 00:23:28 +01:00
Billy Laws
b2384e83f5
Add prepo:a service
2022-11-25 16:26:00 +00:00
Billy Laws
736216a6f4
Stub OpenPatchDataStorageByCurrentProcess
2022-11-25 16:26:00 +00:00
Billy Laws
44033d7f8d
Adjust CalendarTime year to be relative to 0AD
2022-11-25 16:26:00 +00:00
Billy Laws
2ce2604421
Implement VFS file deletion
2022-11-25 16:26:00 +00:00
Billy Laws
6c968e0357
Fix GetEntryType IPC return type
2022-11-25 16:26:00 +00:00
Billy Laws
7f1667de82
Avoid using trapping for frequently trapped shaders
...
Fall back to hashing for every shader access as that ends up being faster than applying traps for every execution.
2022-11-19 12:49:05 +00:00
Billy Laws
06095918a9
Introduce per-channel sequence number for invalidation tracking
...
For cases like shaders, which may be uploaded through I2M (which no longer causes an execution) we need a way to cause an invalidation on all writes
2022-11-19 12:49:05 +00:00
Billy Laws
97e3f7fd34
Increase max swapchain image count
2022-11-19 12:49:05 +00:00
Billy Laws
c49119f5ef
Fixup depth bounds register arguments
2022-11-19 12:49:05 +00:00
Billy Laws
db3c5c33c4
Clamp depth bounds into 0-1 range
2022-11-19 12:49:05 +00:00
Billy Laws
e1bbd521d9
Fix potential circular queue submission race
...
If a producer thread was waiting for the queue to have free space and the consumer thread hadn't yet acquired the production mutex a deadlock could occur
2022-11-19 12:49:05 +00:00
Billy Laws
13baf2312f
Add a workaround for sampling BGRA textures with a swizzle
2022-11-19 12:49:05 +00:00
Billy Laws
13a96c5aba
Implement a helper shader for partial clears
...
These are not natively supported by Vulkan, so use a helper shader and colorWriteMask for the same behaviour.
2022-11-19 12:49:05 +00:00
Billy Laws
ac0e225114
Use vkCmdBlit for texture copies when formats don't match
2022-11-19 12:49:05 +00:00
Billy Laws
c8fc8f84ec
Fallback to RGBA888 for unsupported swapchain formats as opposed to swizzle
2022-11-19 12:49:05 +00:00
Billy Laws
e0bc0d3a97
Avoid megabuffering buffers larger than the chunk size
2022-11-19 12:49:05 +00:00
Billy Laws
b6f49884b3
Use lower_bound to speedup texture hostMapping lookup
2022-11-19 12:49:05 +00:00
Billy Laws
e7fda28ac6
Skip over textures in cache which have been replaced with a layer/mip match
2022-11-19 12:49:05 +00:00
Billy Laws
88cc696c7f
Only use 2D array depth targets when depth > 1
2022-11-19 12:49:05 +00:00
Billy Laws
7fed971b2d
Take firstIndex into account when calculating index (quad) buffer size
...
Without this we would miss any elements beyond indexCount in the index buffer and they would be filled with random garbage causing vertex bombs
2022-11-19 12:49:05 +00:00
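To illustrate the firstIndex fix above: an indexed draw reads indices in the range [firstIndex, firstIndex + indexCount), so the region of the index buffer that must be valid extends past indexCount elements whenever firstIndex is non-zero. The following is only a sketch under that assumption; `IndexBufferSize` is a hypothetical helper, not the project's actual function.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: number of bytes of index data a draw may read,
// measured from the start of the index buffer.
constexpr std::size_t IndexBufferSize(std::uint32_t firstIndex, std::uint32_t indexCount, std::size_t indexSize) {
    // Widen before adding so firstIndex + indexCount cannot wrap at 32 bits
    return (static_cast<std::size_t>(firstIndex) + indexCount) * indexSize;
}
```

With 16-bit indices, a draw of 6 indices starting at index 4 needs the first 20 bytes of the buffer, not just 12.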
Billy Laws
1f9de17e98
Begin command buffers asynchronously in command executor
...
vkBeginCommandBuffer can take quite some time on adreno, move it to the cycle waiter thread where it won't block GPFIFO.
2022-11-19 12:49:05 +00:00
Billy Laws
4b3e906c22
Update cached buffer execution number when megabuffering
2022-11-19 12:49:05 +00:00
Billy Laws
3ae1e78544
Match mip layers and array layers in texture manager
2022-11-19 12:49:05 +00:00
Billy Laws
d502adb309
Avoid WRW hazard in subpass deps
2022-11-19 12:49:05 +00:00
Billy Laws
e9313cc291
Use view layer count over texture for attachments
2022-11-19 12:49:05 +00:00
Billy Laws
e65ca52d91
Avoid potential buffer copy race
2022-11-19 12:49:05 +00:00
Dima
720cfaafb6
Stub caps:su
2022-11-18 15:35:03 +00:00
Dima
74afca4aab
Stub caps:u
2022-11-18 15:35:03 +00:00
Dima
27ff1ae19b
Stub caps:c
2022-11-18 15:35:03 +00:00
Dima
ffb0546609
Stub caps:a
2022-11-18 15:35:03 +00:00
Dima
1c8736cb56
Stub IsLargeResourceAvailable
2022-11-18 12:52:25 +00:00
Dima
dcd9e4ff61
Stub SetIdleTimeDetectionExtension, SetAlbumImageTakenNotificationEnabled
2022-11-18 12:52:25 +00:00
Dima
60843269de
Stub GetBlockedUserListIds and UpdateUserPresence
2022-11-18 12:52:25 +00:00
Dima
2cdfc7640c
Stub GetPreviousProgramIndex
2022-11-18 12:52:25 +00:00
Dima
360306eb61
Stub GetAddOnContentListChangedEventWithProcessId
2022-11-18 12:52:25 +00:00
Dima
3d475ca122
Stub GetAccountId
2022-11-18 12:52:25 +00:00
Dima
0b452fe36b
Stub GetFriendList
2022-11-18 12:52:25 +00:00
Dima
cc37d2231d
Stub CheckFreeCommunicationPermission and IsFreeCommunicationAvailable
2022-11-18 12:52:25 +00:00
Dima
ec81c97fa9
Stub TryPopFromFriendInvitationStorageChannel
2022-11-18 12:52:25 +00:00
Dima
413f162cf2
Stub some account functions
2022-11-18 12:52:25 +00:00
lynxnb
675e8dbb2e
Move input handling code to a dedicated class
2022-11-17 21:54:15 +01:00
Dima
262ee28611
Stub some bsd functions
...
Co-authored-by: Lunar-Pixel <83507264+Lunar-Pixel@users.noreply.github.com>
2022-11-15 16:24:33 +00:00
Dima
9afa8b881e
Stub nsd:u/nsd:a and sfdnsres services
2022-11-15 16:24:33 +00:00
Billy Laws
01e27bd2dd
Implement ldr:ro LoadModule
2022-11-15 16:23:40 +00:00
Billy Laws
e571066409
Stub ldr:ro IRoInterface
...
Some games initialise this service on startup however don't actually use it. Add a simple stub to allow such games to boot.
2022-11-15 16:23:40 +00:00
Billy Laws
1fc2641746
Stub the web applet
2022-11-13 11:37:18 +00:00
Billy Laws
021f82ef08
Stub ListOpenContextStoredUsers
2022-11-13 11:35:40 +00:00
Billy Laws
e7bab27d85
Fixup nvdrv channel private memory allocation
...
This was incorrectly allocated in words, rather than bytes, meaning that guest allocations could overwrite the private memory and break inline syncpt operations
2022-11-13 11:35:40 +00:00
Billy Laws
8b523fa1f0
Avoid inline syncpt increments sending OOB GpEntries
...
In cases where no wfi is required, the space where the WFI commands would go needs to be zeroed out to avoid the GPU reading uninitialised memory.
2022-11-13 11:35:40 +00:00
Billy Laws
cd0b2636e5
Prevent truncation of big page start in GetVaRegions
2022-11-13 11:35:40 +00:00
Billy Laws
f650f32bf0
Avoid duplicating NvDrv buffer unmap code
2022-11-13 11:35:40 +00:00
Billy Laws
001064b7bf
Fix GraphicsBufferProducer recreation
...
We need to use a shared_ptr to ensure that the present callback doesn't do any UAFs. This also unlocks the GBP during presentation, as if the queue is full a deadlock could arise where the present callback wouldn't be able to run due to the (waiting) DequeueBuffer thread holding the lock.
2022-11-13 11:35:40 +00:00
Billy Laws
29e89a3950
Fix crashes when opening non-existent directories
2022-11-13 11:35:40 +00:00
Billy Laws
ec139b3027
Fixup CancelBuffer fence handling
2022-11-13 11:35:40 +00:00
Billy Laws
7f24c7b857
Store KMemory object ptrs in memory class to avoid linear-time unmap
...
This is quite a horrible solution but fixing it properly would require a whole rewrite of how we handle memory.
2022-11-13 11:35:15 +00:00
Billy Laws
388245789f
Restructure ConditionalVariableSignal to avoid potential deadlock
...
Since InsertThread can block for paused threads, we need to ensure we unlock syncWaiterMutex when calling it.
2022-11-09 23:02:26 +05:30
PixelyIon
f4a8328cef
Implement Symbol Hooking
...
Symbol hooking is required for HLE implementations of certain features in the future such as `nvdec` and for more in-depth debugging of games as we can inspect them on a SDK function level which allows us to debug issues far more easily.
2022-11-07 23:56:22 +05:30
PixelyIon
8892eb08e6
Fix MoveRegister to clear when value is 0
...
The register wouldn't be cleared with a `MOVZ` when a value was zero due to the condition for writing an instruction requiring the `offsetValue` to be non-zero.
2022-11-07 23:56:22 +05:30
Billy Laws
f7ab3abb86
Allow load balancing when waiting on condvars
2022-11-06 20:47:26 +00:00
german77
b6e2fb894c
service: bcat: Stub CreateDeliveryCacheStorageService
2022-11-06 20:39:41 +00:00
Billy Laws
026bb04386
Impl some more texture formats
2022-11-02 17:46:07 +00:00
Billy Laws
133f08ed14
Stash new register value before executing deferred draws/updates
...
Since the register writes technically happen after the draw, issues can occur if they happen before it: e.g. Skyrim updates ctSelect and disables all RTs after a draw, but previously this would happen before the draw and crash the driver.
2022-11-02 17:46:07 +00:00
Billy Laws
c50852e546
Implement the draw(...)BeginEnd Maxwell3D draw registers
...
Used by guest Vulkan games and nouveau.
2022-11-02 17:46:07 +00:00
Billy Laws
270ef3e0d2
Implement GPFIFO semaphore acquire operations
2022-11-02 17:46:07 +00:00
Billy Laws
2ce146e28f
Don't crash on the Grp0SetSubDevMask TertOp
...
Used by Vulkan games to set the SLI mask, not applicable to the Switch.
2022-11-02 17:46:07 +00:00
Billy Laws
1d83dadefb
Drop size restriction bypass for frequently synced buffers
...
In cases where large buffers are updated every draw this could seriously increase memory usage beyond 3GB in the megabuffer.
2022-11-02 17:46:07 +00:00
Billy Laws
1088ed514c
Introduce texture usage system to ensure RPs are split when necessary
...
Vulkan doesn't allow sampling a texture and using it as an RT in the same RP, by tracking the texture usage status and splitting RPs when this occurs we can avoid such potential sync errors.
2022-11-02 17:46:07 +00:00
Billy Laws
2dd4698441
Adjust texture matching hacks
2022-11-02 17:46:07 +00:00
Billy Laws
4f5c9047ef
Add some additional texture formats used by Vulkan games
2022-11-02 17:46:07 +00:00
Billy Laws
6a830dfac5
Use shader-compiler side {S,U}Scaled format emulation
2022-11-02 17:46:07 +00:00
Billy Laws
579fd04117
Fixup ReadTextureType shader compiler callback
2022-11-02 17:46:07 +00:00
Billy Laws
b04d18eba5
Add support for split mappings to I2M uploads
...
Used by Super Mario Sunshine and other Vulkan games.
2022-11-02 17:46:07 +00:00
Billy Laws
db5e208379
Clear images even when aspects mismatch
2022-11-02 17:46:07 +00:00
Billy Laws
3c8df327f1
Fixup subpass barriers and flags
2022-11-02 17:46:07 +00:00
Billy Laws
5ab80901c6
Drop some debug code
2022-11-02 17:46:07 +00:00
Billy Laws
4de89c8839
GPU NEW MARGEBAC
2022-11-02 17:46:07 +00:00
Billy Laws
7670c83405
Ensure textures are clean before paging them out
2022-11-02 17:46:07 +00:00
Billy Laws
1a2351386d
Add u64 iova ctor
2022-11-02 17:46:07 +00:00
Billy Laws
93d43e0115
Fully fill in swizzle component mappings
...
Avoids the rest being default initialised to identity, which would break the intended effect of them.
2022-11-02 17:46:07 +00:00
Billy Laws
37ff0ab814
Add buffer manager support for accelerated copies
...
These will be sequenced on the GPU/CPU depending on what's optimal and avoid any serialisation
2022-11-02 17:46:07 +00:00
Billy Laws
cac287d9fd
Implement accelerated uploads/copies through buffer manager
...
Previously, both I2M uploads and DMA copies would force GPU serialisation if they happened to hit a trap or were used to copy GPU dirty buffers. By using the buffer manager to implement them on the host GPU we can avoid such slowdowns entirely.
2022-11-02 17:46:07 +00:00
Billy Laws
c5ec484d9a
Avoid redundantly passing executor in ctors when it's already in ChannelCtx
2022-11-02 17:46:07 +00:00
Billy Laws
463394ba72
Pass correct size for XFB buffers
2022-11-02 17:46:07 +00:00
Billy Laws
bd976676f4
Fix SNorm vertex formats
2022-11-02 17:46:07 +00:00
Billy Laws
b74098570f
Zero-out unused XFB varyings before passing to hades
2022-11-02 17:46:07 +00:00
Billy Laws
22f3ba6b93
Mark XFB buffers as GPU dirty
2022-11-02 17:46:07 +00:00
Billy Laws
26aeeaecf5
Add constant buffer GPU write pipeline barrier
2022-11-02 17:46:07 +00:00
Billy Laws
0b5d9308c4
Be more careful about potentially-unneeded GPU->CPU syncs
...
These can be especially expensive so should be avoided as much as possible.
2022-11-02 17:46:07 +00:00
Billy Laws
e6530e2386
Delete graphics_context
...
F
2022-11-02 17:46:07 +00:00
Billy Laws
b24a8465da
Don't require depthClamp
2022-11-02 17:46:07 +00:00
Billy Laws
9055c98e09
Only enable debug/verbose logs in (rel)debug builds
2022-11-02 17:46:07 +00:00
Billy Laws
0ebdbcf0ff
Don't lock stateMutex when updating buffer cycle
2022-11-02 17:46:07 +00:00
Billy Laws
dd360b8f75
Pass correct wait semaphore array size to queue submit
2022-11-02 17:46:07 +00:00
Billy Laws
c78a4b9699
Fixup buffer recreation to avoid deadlock when waiting on srcs
2022-11-02 17:46:07 +00:00
Billy Laws
d236bfe454
Enable depthClamp VK device feature
2022-11-02 17:46:07 +00:00
Billy Laws
95d849e1f6
Check FenceCycle signalled flag immediately before waiting
...
The lock release within the wait for submission means that another thread could end up signalling the cycle, and the VK wait would then still happen afterwards once the lock has been reacquired.
2022-11-02 17:46:07 +00:00
Billy Laws
1a23b929a7
Avoid chaining cycles in buffer recreation
...
This had a chance of creating circular chains which obviously caused issues, just do a wait instead for now.
2022-11-02 17:46:07 +00:00