Commit Graph

1769 Commits

Author SHA1 Message Date
Billy Laws
1a6165f74d Fix GetReadOnlyBackingSpan for non-direct buffers
This was missed in the initial implementation
2023-01-08 19:30:52 +00:00
Billy Laws
4e5141f879 Fix missed attempt increment in spinlock
Should hog CPU slightly less and correctly yield now
2023-01-08 19:30:52 +00:00
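For illustration, a minimal sketch of the kind of bug this commit describes: a spinlock that only yields once an attempt counter reaches a threshold will never yield if the counter is not incremented. The class name, the threshold and the overall structure are assumptions made for this example, not the project's actual code.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>

// Hypothetical spinlock: the names and the spin threshold are illustrative only
class SpinLock {
  public:
    void lock() {
        constexpr std::size_t MaxSpinAttempts{64}; // assumed threshold
        std::size_t attempt{};
        while (flag.test_and_set(std::memory_order_acquire)) {
            if (attempt < MaxSpinAttempts)
                ++attempt; // without this increment the yield below is never reached
            else
                std::this_thread::yield(); // stop hogging the core once spinning has not helped
        }
    }

    bool try_lock() {
        return !flag.test_and_set(std::memory_order_acquire);
    }

    void unlock() {
        flag.clear(std::memory_order_release);
    }

  private:
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
};
```

Because it exposes `lock`, `try_lock` and `unlock`, a type like this can be used with `std::lock_guard` or `std::scoped_lock`.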
Billy Laws
35a46acbb1 Determine storage buffer alignment dynamically 2023-01-08 19:30:52 +00:00
Billy Laws
12d80fe6c2 Use a shared mutex for GPU VMM to avoid deadlocks
Two reads need to be able to occur simultaneously or deadlocks can occur (e.g. a read traps to wait on the GPU but the GPU needs to read).
2023-01-08 19:30:52 +00:00
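As a rough sketch of the idea, a reader-writer lock such as `std::shared_mutex` lets any number of readers hold the lock at once while writers remain exclusive. The `AddressMap` class below is a hypothetical stand-in for a translation table and is not the emulator's VMM.

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <optional>
#include <shared_mutex>

// Hypothetical address-translation table guarded by a reader-writer lock
class AddressMap {
  public:
    void Map(std::uint64_t virt, std::uint64_t phys) {
        std::unique_lock<std::shared_mutex> lock{mutex}; // writers are exclusive
        table[virt] = phys;
    }

    std::optional<std::uint64_t> Translate(std::uint64_t virt) const {
        std::shared_lock<std::shared_mutex> lock{mutex}; // readers may overlap with each other
        auto it{table.find(virt)};
        if (it == table.end())
            return std::nullopt;
        return it->second;
    }

  private:
    mutable std::shared_mutex mutex;
    std::map<std::uint64_t, std::uint64_t> table;
};
```

With a plain `std::mutex`, a reader that blocks while holding the lock would stall every other reader; with shared ownership a second reader can still make progress.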
Billy Laws
28b2a7a8a1 Dynamically apply GPU turbo clocks only when GPU submissions are queued
Allows for the GPU to clock down in cases where it's idle for most of the time, while still forcing maximum clocks when we care.
2023-01-08 19:30:52 +00:00
Billy Laws
81f3ff348c Transition memory handling from memfd to anonymous shared mappings
Memfd mappings are incompatible with KGSL user memory importing on older kernels; transition to shared anon mappings to avoid this.
2023-01-08 19:30:52 +00:00
Billy Laws
cc3c869b9f Attempt to signal the vsync event at present time if possible
Some games rely on the vsync event to schedule frames. By matching its timing with presentation we can reduce needless waiting, as the game will immediately be able to queue the next frame after presentation.
2023-01-08 19:30:52 +00:00
Billy Laws
918a493a45 Implement wfi and setReference GPFIFO barriers 2023-01-08 19:30:52 +00:00
Billy Laws
7315ba04e6 Fixup optional flattenable binder obj structure 2023-01-08 19:30:52 +00:00
Billy Laws
90e21b0ca1 Split syncpoints into host-guest pairs
This allows the presentation engine to grab the presentation image early when direct buffers are in use. Since it handles sync on its own using semaphores, it doesn't need to wait for GPU execution.
2023-01-08 19:30:52 +00:00
Billy Laws
966c31810a Return appropriate fences in surfaceflinger queue buffer 2023-01-08 19:30:52 +00:00
Billy Laws
afef6c5123 Always populate all colour attachments
This better follows the Vulkan spec, which doesn't mention anything about writes to OOB attachments, only those marked as unused.
2023-01-08 19:30:52 +00:00
Billy Laws
3571737392 Reset maxwell3d quick bind state before adding subpasses to executor
If a submission happens during the call to addsubpass we could end up with invalid quick bind state; move this to before the call to prevent that.
2023-01-08 19:30:52 +00:00
Billy Laws
3d31ade35f Implement an alternative buffer path using direct memory importing
By importing guest memory directly onto the host GPU we can avoid many of the complexities that occur with memory tracking, as well as its heavy performance overhead in some situations. Since it's still desirable to support the traditional buffer method, as it's faster in some cases and more widely supported, most of the exposed buffer methods have been split into two variants with just a small amount of shared code. While in most cases the code is simpler, one area with more complexity is handling CPU accesses that need to be sequenced, since we don't have any place where we can easily apply writes on the GPFIFO thread that won't also impact the buffer on the GPU. To solve this, when the GPU is actively using a buffer's contents, an interval list is used to keep track of any GPFIFO-written regions on the CPU, and any CPU reads to them will instead be directed to a shadow of the buffer with just those writes applied. Once the GPU has finished using the buffer contents the shadow can be removed, as all writes will have been done by the GPU.

The main caveat of this is that it requires tying host sync to guest sync; this can reduce performance in games which double buffer command buffers, as it prevents us from fully saturating the CPU with the GPFIFO thread.
2023-01-08 19:30:52 +00:00
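The interval list mentioned in this commit can be pictured with a small sketch: a set of merged half-open byte ranges recording which regions have been written, plus a query that tells a CPU read whether it overlaps any of them (and so should be served from the shadow copy). The class and method names here are assumptions for illustration; the real data structure may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <map>

// Hypothetical interval list tracking written [start, end) byte ranges
class IntervalList {
  public:
    // Records a written range, merging it with any overlapping or adjacent ranges
    void Insert(std::size_t start, std::size_t end) {
        if (start >= end)
            return;
        auto it{intervals.upper_bound(start)}; // first range that begins after `start`
        if (it != intervals.begin()) {
            auto prev{std::prev(it)};
            if (prev->second >= start) { // the previous range reaches into the new one
                start = prev->first;
                end = std::max(end, prev->second);
                it = intervals.erase(prev);
            }
        }
        while (it != intervals.end() && it->first <= end) { // swallow ranges the new one covers
            end = std::max(end, it->second);
            it = intervals.erase(it);
        }
        intervals.emplace(start, end);
    }

    // True if a read of [start, end) touches any recorded write
    bool Intersects(std::size_t start, std::size_t end) const {
        if (start >= end)
            return false;
        auto it{intervals.lower_bound(end)}; // first range that begins at or after `end`
        if (it == intervals.begin())
            return false;
        return std::prev(it)->second > start;
    }

    void Clear() { intervals.clear(); } // e.g. once the GPU has finished with the buffer

    std::size_t Count() const { return intervals.size(); }

  private:
    std::map<std::size_t, std::size_t> intervals; // start -> end
};
```

A read path built on this would check `Intersects` first and fall back to the directly mapped memory when it returns false.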
Billy Laws
b3f7e990cc Allow for tying guest GPU sync operations to host GPU sync
This is necessary for the upcoming direct buffer support, as in order to use guest buffers directly without trapping we need to recreate any guest GPU sync on the host GPU. This avoids the guest thinking work is done that isn't and overwriting in-use buffer contents.
2023-01-08 19:30:52 +00:00
Billy Laws
89c6fab1cb Implement a way to check if the command record thread is idle
Useful for debugging and testing
2023-01-08 19:30:52 +00:00
Billy Laws
c67f27e914 Add a setting to control the maximum number of accumulated GPU cmds
This helps to keep the GPU fed when processing large command buffers which don't have any syncpoints to force a flush in between.
2023-01-08 19:30:52 +00:00
Billy Laws
77214a98dd Add a setting to force maximum GPU clocks on KGSL devices 2023-01-08 19:30:52 +00:00
Billy Laws
83ecc33a77 Update adrenotools 2023-01-08 19:30:52 +00:00
Billy Laws
3ecaedd71e Add adrenotools direct mapping support 2023-01-08 19:30:52 +00:00
Pablo
8846a85d3a Stub some IPurchaseEventManager functions 2022-12-31 10:45:18 +00:00
PabloG02
80c0f8f04d Implement full profile picture support
Extends the profile picture stub into a full-fledged implementation with the ability for users to set their profile picture in settings while having the Skyline icon as the default profile picture.
2022-12-27 22:53:41 +05:30
PixelyIon
7a3d2e4a26 Start KThread TID from 1 rather than 0
HOS's TIDs are one-based rather than zero-based. Certain titles, such as Pokémon Arceus, Naruto Shippuden: Ultimate Ninja Storm 3 and Splatoon 3, use a TID of zero as a sentinel value, but as we previously assigned this ID to our first thread it broke their logic. This commit fixes that by matching HOS behavior.
2022-12-27 22:36:06 +05:30
Billy Laws
bab659587f Use e1 sample count for blits 2022-12-22 18:05:45 +00:00
Billy Laws
516ece6b04 Calculate renderarea from attachment min size 2022-12-22 18:05:45 +00:00
Billy Laws
4a3cd69257 Populate graphics pipeline manager from cache at launch-time 2022-12-22 18:05:45 +00:00
Billy Laws
e9bcdd06eb Introduce a pipeline cache manager for simple read/write cache accesses
All writes are done async into a staging file, which is then merged into the main pipeline cache file at the time of the next launch. Upon encountering file corruption the cache can be trimmed up to the last-known-good entry to avoid any excessive loss of data from just one error.
2022-12-22 18:05:45 +00:00
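To make the "trim up to the last-known-good entry" idea concrete, here is a hedged sketch over an invented record layout: each entry is a payload size, a checksum and the payload bytes. The layout, the checksum and every function name are assumptions for this example only; the actual on-disk format is documented in the project's code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simple rolling checksum, a stand-in for whatever integrity check a real format would use
std::uint32_t Checksum(const std::uint8_t *data, std::size_t size) {
    std::uint32_t sum{};
    for (std::size_t i{}; i < size; ++i)
        sum = sum * 31 + data[i];
    return sum;
}

// Hypothetical entry layout: [u32 payload size][u32 checksum][payload bytes]
void AppendEntry(std::vector<std::uint8_t> &file, const std::vector<std::uint8_t> &payload) {
    std::uint32_t header[2]{static_cast<std::uint32_t>(payload.size()),
                            Checksum(payload.data(), payload.size())};
    const auto *raw{reinterpret_cast<const std::uint8_t *>(header)};
    file.insert(file.end(), raw, raw + sizeof(header));
    file.insert(file.end(), payload.begin(), payload.end());
}

// Returns the offset just past the last entry that still validates
std::size_t LastKnownGoodOffset(const std::vector<std::uint8_t> &file) {
    std::size_t offset{};
    while (file.size() - offset >= sizeof(std::uint32_t) * 2) {
        std::uint32_t header[2];
        std::memcpy(header, file.data() + offset, sizeof(header));
        std::size_t payloadStart{offset + sizeof(header)};
        if (header[0] > file.size() - payloadStart)
            break; // entry is truncated
        if (Checksum(file.data() + payloadStart, header[0]) != header[1])
            break; // entry is corrupted
        offset = payloadStart + header[0];
    }
    return offset;
}
```

Resizing the file to `LastKnownGoodOffset(file)` discards only the damaged tail, so every entry written before the corruption survives.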
Billy Laws
06bf1b38af Introduce a pipeline state accessor that reads from a bundle 2022-12-22 18:05:45 +00:00
Billy Laws
7dd3a1db0f Avoid InterconnectContext use in graphics PipelineManager
We will soon move to a global pipeline manager instance, so it won't be possible to use InterconnectContext at pipeline-creation time anymore.
2022-12-22 18:05:45 +00:00
Billy Laws
ffe7263848 Add quirk for 615 drivers with broken multithreaded compilation 2022-12-22 18:05:45 +00:00
Billy Laws
755f7c75af Add pipeline (de)serialisation support to bundle
See comments in code for details on the on-disk format.
2022-12-22 18:05:45 +00:00
Billy Laws
937eff392f Switch execution-numbers to be globally unique tags
This is required for making pipelines usable across channels without introducing caching bugs.
2022-12-22 18:05:45 +00:00
Billy Laws
072b8193a1 Implement thread pool based async pipeline compilation with futures
By distributing the load of shader compiling onto multiple threads and then only waiting for completion when absolutely necessary we can reduce compilation stutters significantly.
2022-12-22 18:05:45 +00:00
Billy Laws
186549748d Implement HelperShader-local pipeline cache and use dynamic state
Avoids the heavy overhead of the VK pipeline cache when we really only have a few bits of non-dynamic state
2022-12-22 18:05:45 +00:00
Billy Laws
9115b8cae8 Properly hash dynamic states in pipeline cache 2022-12-22 18:05:45 +00:00
Billy Laws
7c4b4765bf Reduce thresholds for slot increase and buffer/texture fast readback 2022-12-22 18:05:45 +00:00
Billy Laws
f32ab1feff Include BS thread pool library 2022-12-22 18:05:45 +00:00
Billy Laws
ce428af2e6 Use attachment formats rather than views in VK pipeline cache 2022-12-22 18:05:45 +00:00
Billy Laws
e849264028 Abstract out pipeline-compile-time GPU state accesses
Introduces the base abstractions that will be used for pipeline caching, with a 'PipelineStateBundle' that can be (de)serialised to/from disk and an abstract accessor class to allow switching between creating disk-cached pipelines and fresh ones.
2022-12-22 18:05:45 +00:00
Billy Laws
2e96248fb6 Track RT format info in PackedPipelineState and move VK conv code there
When caching pipelines we can't cache whole images, only their formats so refactor PackedPipelineState so that it can be used for pipeline creation, as opposed to passing in a list of attachments.
2022-12-22 18:05:45 +00:00
Billy Laws
bc7e1eb380 Split-out hash from ShaderBinary struct
This isn't necessary for pipeline creation and creates some difficulty with pipeline caching.
2022-12-22 18:05:45 +00:00
Dima
de10ab1219 Stub SetConnectionConfirmationOption 2022-12-18 20:34:55 +00:00
Dima
f3b2b4317e Stub some IPrepoService calls 2022-12-18 20:34:55 +00:00
Dima
efef67b92b Stub some IAudioDevice calls 2022-12-18 20:34:55 +00:00
Dima
3a94bcf692 Fix ListOpenContextStoredUsers stub
The problem was that StoreOpenContext wasn't storing any user, but ListOpenContextStoredUsers was writing the default user even when StoreOpenContext hadn't stored it.
2022-12-18 20:34:55 +00:00
TheASVigilante
3c5f8dd876 Fix small typo 2022-12-18 14:49:54 +00:00
lynxnb
6599c1dccf Stub GyroscopeZeroDriftMode
Related service calls are called in a loop by SM3DW. A variable tracking zero drift mode has been added to `npad_device`, but it's unused at the moment.
2022-12-10 14:59:44 +00:00
Dima
dcc3047ba8 Stub ErrorCommonArg 2022-12-10 14:58:20 +00:00
Dima
68253fe995 Stub mii:e/mii:u
Needed for SSBU
2022-12-10 14:58:20 +00:00
Dima
69ee3cfc66 Stub DeleteDirectory
Should allow deleting/rewriting saves in some games
2022-12-10 14:58:20 +00:00
Dima
bbd34ae7e7 Validate if entries are not empty before using
Should fix saving problem in Baldur's Gate: Dark Alliance II at least
2022-12-10 14:58:20 +00:00
Dima
5f510d84d7 Stub IsVibrationPermitted 2022-12-10 14:58:20 +00:00
Dima
51d1f519af Stub ListDisplays 2022-12-10 14:58:20 +00:00
Dima
a3866a3129 Stub LibraryAppletShop 2022-12-10 14:58:20 +00:00
Dima
1ebec7db82 Stub GetImageSize and LoadImage 2022-12-10 14:58:20 +00:00
Dima
52c4228ecf Stub some friends service calls
Needed for Diablo 3
2022-12-10 14:58:20 +00:00
Dima
ebcbc5b05b Validate NpadId for ActivateVibrationDevice 2022-12-10 14:58:20 +00:00
Dima
4bdd033354 Stub SetRecordVolumeMuted 2022-12-10 14:58:20 +00:00
Dima
f6d95aae01 Stub GetCacheStorageSize 2022-12-10 14:58:20 +00:00
Dima
4ab8699cd4 Stub ImportServerPki 2022-12-10 14:58:20 +00:00
Dima
41cf4bb12d Stub GetLanguageCode 2022-12-10 14:58:20 +00:00
Dima
3e078d54b6 Stub GetIdleTimeDetectionExtension 2022-12-10 14:58:20 +00:00
Dima
2311f777fc Stub IsCpuOverclockEnabled 2022-12-10 14:58:20 +00:00
Dima
4601c28c28 Stub GetCurrentIpAddress 2022-12-10 14:58:20 +00:00
Dima
18e6a6c53c Stub DeclareOpenOnlinePlaySession and DeclareCloseOnlinePlaySession 2022-12-10 14:58:20 +00:00
Dima
150c1370c2 Stub some IApplicationFunctions funcs 2022-12-10 14:58:20 +00:00
Dima
a6f3aa3062 Stub TrySelectUserWithoutInteraction and ListQualifiedUsers 2022-12-10 14:58:20 +00:00
Dima
5a9a2861df Add TitleId TextView in App Dialog 2022-12-10 14:57:46 +00:00
Abandoned Cart
b08fcd7027 Favor a predefined "click" over system vibration 2022-12-10 14:57:33 +00:00
Abandoned Cart
cfd3bfecba Add a rudimentary OSC button vibration setting 2022-12-10 14:57:33 +00:00
Billy Laws
7c802aea46 Mark vertex buffers as dirty on limit changes 2022-12-03 22:50:56 +00:00
Billy Laws
df19810c6c Always set vertex stride for unbound buffers 2022-12-03 22:50:56 +00:00
Billy Laws
f4f658e3b7 Fix typo 2022-12-03 22:50:56 +00:00
Billy Laws
45b10ef776 Return whole mapping for shader code when end instrs aren't found 2022-12-03 22:50:56 +00:00
Billy Laws
d849875656 Only unlock GPU channel state on queue wait if it was previously locked 2022-12-03 22:50:56 +00:00
Billy Laws
a5e0a64adc Switch patch error logs to debug 2022-12-03 22:50:56 +00:00
Billy Laws
af7c54297f Cache staging buffer used for texture download 2022-12-03 22:50:56 +00:00
Billy Laws
8c5e6d2bb4 Update VKMA 2022-12-03 22:50:56 +00:00
Billy Laws
bba07fb101 Update for new hades 2022-12-03 22:50:56 +00:00
Billy Laws
a16383fd4b Disable compute shaders on mali
This will need to be debugged properly at some point but it's fine for now.
2022-12-03 22:50:56 +00:00
Billy Laws
d69c6851f3 Update hades 2022-12-03 22:50:56 +00:00
Billy Laws
137d801843 Skip host1x HW emulation and effectively stub submission
This was causing a bunch of logspam and isn't really needed as we will be using a HLE approach.
2022-12-03 22:50:56 +00:00
Billy Laws
579a2d9337 Add dynamic executor slot growth 2022-12-03 22:50:56 +00:00
Billy Laws
60169fce4c Support 0-sized constant buffers 2022-12-03 22:50:56 +00:00
Billy Laws
b86dd99e1a Align all SSBOs to 0x40 bytes
Required by Adreno GPUs
2022-12-03 22:50:56 +00:00
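One common way to satisfy an offset alignment requirement like this is to round the start of a binding down and its end up to the alignment, as sketched below. The helper names and the `BufferRange` type are assumptions for illustration, and the alignment is passed as a parameter rather than hard-coded; how the project actually applies the 0x40 alignment is not shown in this log.

```cpp
#include <cstdint>

// Both helpers assume `alignment` is a power of two
constexpr std::uint64_t AlignDown(std::uint64_t value, std::uint64_t alignment) {
    return value & ~(alignment - 1);
}

constexpr std::uint64_t AlignUp(std::uint64_t value, std::uint64_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

struct BufferRange {
    std::uint64_t offset;
    std::uint64_t size;
};

// Widens a binding so that both its start and its end sit on an alignment boundary
constexpr BufferRange AlignBinding(std::uint64_t offset, std::uint64_t size, std::uint64_t alignment) {
    std::uint64_t start{AlignDown(offset, alignment)};
    std::uint64_t end{AlignUp(offset + size, alignment)};
    return {start, end - start};
}
```

A shader reading from the widened range would then need the difference between the original and aligned offsets to locate its data.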
Billy Laws
bfae292fb0 Make executor slot count setting exponential 2022-12-03 22:50:56 +00:00
Billy Laws
e0ae94be9d Enable robustness1 Vulkan feature 2022-12-03 22:50:56 +00:00
Billy Laws
e8ef2d80af CMake build file updates 2022-12-03 22:50:56 +00:00
Billy Laws
bf03f945ee Implement the Kepler compute engine
This can reuse a fair bit of the now-commonised Maxwell 3D code and mostly consists of compute-specific pipeline code which was deemed not suitable for being commonised (e.g. descriptor update code is somewhat duplicated). Of note is how compute lacks any active state at all due to its use of QMDs, which bundle up all state into a single object in memory.
2022-12-03 22:50:56 +00:00
Billy Laws
4bc81f007f Add some convenience helpers to compute engine regs 2022-12-03 22:50:56 +00:00
Billy Laws
4267a6af36 Add support for parsing and compiling compute shaders to the shader manager 2022-12-03 22:50:56 +00:00
Billy Laws
86dab65af4 Commonise maxwell3d state updater 2022-12-03 22:50:56 +00:00
Billy Laws
a0b81d54d6 Use pitch layout for linear RTs
More likely to match in the texture cache when being sampled.
2022-12-03 22:50:56 +00:00
Billy Laws
ac85df7b7a Start transition cache lookup with most recent one 2022-12-03 22:50:56 +00:00
Billy Laws
62c86b7690 Move maxwell3d to common constant buffer code 2022-12-03 22:50:56 +00:00
Billy Laws
8f0a6e78c5 Add Vulkan stride dynamic state and robustness support
Fixes the waterfall in SMO by specifying vertex buffer bounds.
2022-12-03 22:50:56 +00:00
Billy Laws
23a7f70a8e Commonise maxwell3d guest shader caching code 2022-12-03 22:50:56 +00:00
Billy Laws
6f6a312692 Commonise maxwell3d pipeline binding handling code
A lot of pipeline code is difficult to commonise due to the inherent difference between compute and graphics pipelines, however the binding layout is shared so we can at least commonise that
2022-12-03 22:50:56 +00:00
Billy Laws
be8cbabd97 Commonise maxwell3d texture code
This will be shared with the compute engine implementation.
2022-12-03 22:50:56 +00:00
Billy Laws
61e95c4b2c Commonise maxwell3d sampler code
This will be shared with the compute engine implementation, the only thing of note with this is that the binding register is now passed as a param since it is part of the compute QMD which can't be dirty tracked.
2022-12-03 22:50:56 +00:00
Billy Laws
7f93ec3df6 Commonise maxwell3d interconnect common code for use by other engines
The compute engine will require most of this for basic functionality.
2022-12-03 22:50:56 +00:00
Billy Laws
281838fde1 Apply GPU readback hack to both buffers and textures
And rename as appropriate.
2022-12-03 22:50:56 +00:00
Billy Laws
f358c4517e Update edge credits 2022-12-03 22:50:56 +00:00
Billy Laws
eb00dc62f8 Implement support for 36 bit games by using split code/heap mappings
Although rtld and IPC prevent TLS/IO and code from being above the 36-bit AS limit, nothing depends on the heap being below it. We can take advantage of this by stealing as much AS as possible for code in the lower 36-bits.
2022-12-02 22:10:03 +00:00
Dima
e8e1b910c3 Add possibility to disable audio output 2022-12-02 00:33:28 +01:00
lynxnb
70109f8fbd Work around invalid values in CNTFRQ_EL0 register
Exynos SoCs have a bug where the `CNTFRQ_EL0` register is either set to 0 or contains incoherent values. With this patch, the frequency value is loaded into a static variable and used instead of reading the register. The value will be initialised to the correct value for affected SoCs, while unaffected ones will use the value from the register.
2022-12-02 00:23:28 +01:00
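The shape of such a workaround can be sketched as a value resolved once at start-up: a known-good frequency is used when one is supplied for an affected SoC, and the register is read otherwise. `ClockSource`, the register-reading callback and the optional override are all hypothetical names for this example, which deliberately avoids real register access and real frequency values.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical holder for the counter frequency, resolved once and reused afterwards
struct ClockSource {
    std::uint64_t frequency;

    // `readRegister` stands in for reading the hardware register;
    // `knownFrequency` would only be supplied for SoCs known to report bad values
    ClockSource(std::uint64_t (*readRegister)(), std::optional<std::uint64_t> knownFrequency)
        : frequency{knownFrequency.has_value() ? *knownFrequency : readRegister()} {}
};
```

Callers then read `frequency` instead of querying the register on every use, so a faulty register is never consulted on affected devices.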
lynxnb
54d0246ca6 Tweak GpuDriverActivity FAB padding 2022-11-28 00:06:07 +01:00
lynxnb
2e8d7b559c Use the original view padding/margin when applying window insets
Adding to the current view padding/margin values results in applying the insets over and over again as insets listeners can be called multiple times.
2022-11-28 00:04:39 +01:00
Billy Laws
b2384e83f5 Add prepo:a service 2022-11-25 16:26:00 +00:00
Billy Laws
736216a6f4 Stub OpenPatchDataStorageByCurrentProcess 2022-11-25 16:26:00 +00:00
Billy Laws
44033d7f8d Adjust CalendarTime year to be relative to 0AD 2022-11-25 16:26:00 +00:00
Billy Laws
2ce2604421 Implement VFS file deletion 2022-11-25 16:26:00 +00:00
Billy Laws
6c968e0357 Fix GetEntryType IPC return type 2022-11-25 16:26:00 +00:00
lynxnb
ec220c8ea9 Use an extended FAB in GpuDriverActivity 2022-11-23 19:49:42 +05:30
lynxnb
163f4f2014 Fix window insets handling when in landscape mode
To avoid code duplication, insets handling has been moved to a separate interface.
2022-11-23 19:49:42 +05:30
lynxnb
ab6c5f4c50 Improve robustness of KeyReader.import
* Close the input and output file streams before moving the output file to the final destination
* Clean up the destination path before moving the new file
* Introduce an `ImportResult` return value to differentiate between the possible causes of import errors
* Display more meaningful error messages in the UI
2022-11-23 19:49:42 +05:30
lynxnb
38129d9dc3 Mark some strings as non-translatable 2022-11-23 19:49:42 +05:30
lynxnb
ee8c055641 Make GpuDriverInstallResult PascalCase 2022-11-23 19:49:42 +05:30
Billy Laws
7f1667de82 Avoid using trapping for frequently trapped shaders
Fall back to hashing for every shader access as that ends up being faster than applying traps for every execution.
2022-11-19 12:49:05 +00:00
Billy Laws
06095918a9 Introduce per-channel sequence number for invalidation tracking
For cases like shaders, which may be uploaded through I2M (which no longer causes an execution) we need a way to cause an invalidation on all writes
2022-11-19 12:49:05 +00:00
Billy Laws
97e3f7fd34 Increase max swapchain image count 2022-11-19 12:49:05 +00:00
Billy Laws
c49119f5ef Fixup depth bounds register arguments 2022-11-19 12:49:05 +00:00
Billy Laws
db3c5c33c4 Clamp depth bounds into 0-1 range 2022-11-19 12:49:05 +00:00
Billy Laws
e1bbd521d9 Fix potential circular queue submission race
If a producer thread was waiting for the queue to have free space and the consumer thread hadn't yet acquired the production mutex a deadlock could occur
2022-11-19 12:49:05 +00:00
Billy Laws
13baf2312f Add a workaround for sampling BGRA textures with a swizzle 2022-11-19 12:49:05 +00:00
Billy Laws
13a96c5aba Implement a helper shader for partial clears
These are not natively supported by Vulkan, so use a helper shader and colorWriteMask for the same behaviour.
2022-11-19 12:49:05 +00:00
Billy Laws
ac0e225114 Use vkCmdBlit for texture copies when formats don't match 2022-11-19 12:49:05 +00:00
Billy Laws
c8fc8f84ec Fallback to RGBA888 for unsupported swapchain formats as opposed to swizzle 2022-11-19 12:49:05 +00:00
Billy Laws
e0bc0d3a97 Avoid megabuffering buffers larger than the chunk size 2022-11-19 12:49:05 +00:00
Billy Laws
b6f49884b3 Use lower_bound to speedup texture hostMapping lookup 2022-11-19 12:49:05 +00:00
Billy Laws
e7fda28ac6 Skip over textures in cache which have been replaced with a layer/mip match 2022-11-19 12:49:05 +00:00
Billy Laws
88cc696c7f Only use 2D array depth targets when depth > 1 2022-11-19 12:49:05 +00:00
Billy Laws
7fed971b2d Take firstIndex into account when calculating index (quad) buffer size
Without this we would miss any elements beyond indexCount in the index buffer and they would be filled with random garbage causing vertex bombs
2022-11-19 12:49:05 +00:00
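The arithmetic behind this fix can be shown with a short sketch: a draw that starts at `firstIndex` reads indices up to `firstIndex + indexCount`, so the buffer range must cover both. The function name and parameter types are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes of index data a draw can touch when it starts `firstIndex` elements into the buffer
constexpr std::size_t IndexBufferSize(std::uint32_t firstIndex, std::uint32_t indexCount, std::size_t indexSize) {
    return (static_cast<std::size_t>(firstIndex) + indexCount) * indexSize;
}
```

For example, a draw of 6 two-byte indices starting at index 100 needs the first 212 bytes of the buffer, not just 12.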
Billy Laws
1f9de17e98 Begin command buffers asynchronously in command executor
vkBeginCommandBuffer can take quite some time on Adreno; move it to the cycle waiter thread where it won't block GPFIFO.
2022-11-19 12:49:05 +00:00
Billy Laws
4b3e906c22 Update cached buffer execution number when megabuffering 2022-11-19 12:49:05 +00:00
Billy Laws
3ae1e78544 Match mip layers and array layers in texture manager 2022-11-19 12:49:05 +00:00
Billy Laws
d502adb309 Avoid WRW hazard in subpass deps 2022-11-19 12:49:05 +00:00
Billy Laws
e9313cc291 Use view layer count over texture for attachments 2022-11-19 12:49:05 +00:00
Billy Laws
e65ca52d91 Avoid potential buffer copy race 2022-11-19 12:49:05 +00:00
Dima
720cfaafb6 Stub caps:su 2022-11-18 15:35:03 +00:00
Dima
74afca4aab Stub caps:u 2022-11-18 15:35:03 +00:00
Dima
27ff1ae19b Stub caps:c 2022-11-18 15:35:03 +00:00
Dima
ffb0546609 Stub caps:a 2022-11-18 15:35:03 +00:00
Dima
1c8736cb56 Stub IsLargeResourceAvailable 2022-11-18 12:52:25 +00:00
Dima
dcd9e4ff61 Stub SetIdleTimeDetectionExtension, SetAlbumImageTakenNotificationEnabled 2022-11-18 12:52:25 +00:00
Dima
60843269de Stub GetBlockedUserListIds and UpdateUserPresence 2022-11-18 12:52:25 +00:00
Dima
2cdfc7640c Stub GetPreviousProgramIndex 2022-11-18 12:52:25 +00:00
Dima
360306eb61 Stub GetAddOnContentListChangedEventWithProcessId 2022-11-18 12:52:25 +00:00
Dima
3d475ca122 Stub GetAccountId 2022-11-18 12:52:25 +00:00
Dima
0b452fe36b Stub GetFriendList 2022-11-18 12:52:25 +00:00