Commit Graph

1988 Commits

Author SHA1 Message Date
Billy Laws d45f9e4d26 Loosen some texture WaR sync when possible
By keeping track of the stages reading the image we can do more fine-grained WaR prevention, as opposed to waiting for all commands to complete.
2023-02-20 18:01:49 +00:00
Billy Laws 6ee8a919e5 Use pipeline barriers, as opposed to an ext dependency for RP barrier
Allows for waiting on compute shaders, which are not a graphics stage.
2023-02-20 18:01:49 +00:00
Billy Laws ee68facc5d Apply RP barrier masks for every draw, rather than the 1st in RP
I missed that addSubpass was only called once per subpass, meaning that if a new barrier requirement was discovered several draws into the RP it wouldn't be applied. Split out barriers into a separate function to avoid this.
2023-02-20 18:01:49 +00:00
Billy Laws bb20b145a8 Hash subpass dependencies in RP cache 2023-02-20 18:01:49 +00:00
Billy Laws 99bf7dbb36 Implement usage based implicit renderpass barrier generation
Full pipeline barriers between every RP can be extremely expensive on HW. By analysing the inputs and outputs of a draw, it's possible to construct a much more optimal barrier that only syncs what is necessary.
2023-02-20 18:01:49 +00:00
Billy Laws 7a759326b3 Don't break RPs on view pointer changes
Sometimes view pointers may change despite the underlying Vulkan image view not actually changing, so use vk::ImageViews for tracking to keep RP breaks to a minimum.
2023-02-20 18:01:49 +00:00
Billy Laws a02e1a2536 R.I.P. Subpasses 2023-02-20 18:01:49 +00:00
Billy Laws a47f010653 Add an option to allow CPU writes when fast readback is used 2023-02-20 18:01:49 +00:00
Billy Laws 2d56ed053d Implement all remaining ASTC formats 2023-02-20 18:01:49 +00:00
Billy Laws d93c96de83 Fix sign error when decoding bc5s images
Using an unsigned loop counter caused an implicit conversion breaking the decoder logic.
2023-02-20 18:01:49 +00:00
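The class of bug described in d93c96de83 can be illustrated with a small sketch. This is not the decoder's code: `BlendBuggy`, `BlendFixed` and the interpolation formula are hypothetical, chosen only to show how unsigned operands silently convert signed values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical interpolation between two signed endpoints; not the decoder's code.

// Buggy: 'i' and 'steps' are unsigned, so 'lo' and 'hi' are converted to
// unsigned before the arithmetic and a negative endpoint wraps around.
inline std::int32_t BlendBuggy(std::int32_t lo, std::int32_t hi, std::uint32_t i, std::uint32_t steps) {
    assert(steps != 0 && i <= steps);
    return static_cast<std::int32_t>((lo * (steps - i) + hi * i) / steps);
}

// Fixed: every operand is signed, so negative endpoints interpolate as expected.
inline std::int32_t BlendFixed(std::int32_t lo, std::int32_t hi, std::int32_t i, std::int32_t steps) {
    assert(steps > 0 && i >= 0 && i <= steps);
    return (lo * (steps - i) + hi * i) / steps;
}
```

With endpoints -127 and 127, the signed variant returns -90 for the first of seven steps, while the unsigned variant returns an unrelated positive value.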
Billy Laws 2d97b9fc2c Keep track of buffer dirtiness within an execution 2023-02-20 18:01:49 +00:00
Billy Laws 6ea1483c9a Fix a race with multiple threads pushing data into circular queue
If the 'end' changes when waiting for data to be consumed our 'waitNext' would be invalid leading to a deadlock.
2023-02-20 18:01:49 +00:00
Billy Laws f55b135243 Assert on geometry stream usage 2023-02-20 18:01:49 +00:00
Billy Laws 2e64199640 Add host shader replacement and dumping support
This is useful for debugging, but shouldn't generally be used as bindings in SPIR-V etc are unstable.
2023-02-20 18:01:49 +00:00
Billy Laws 02786839a5 Flush pipeline after texture uploads 2023-02-20 18:01:49 +00:00
Billy Laws ffdd50bdf3 Fix geometry and compute shaders on mali GPUs 2023-02-20 18:01:49 +00:00
mk73ds cb62e15748 Ignore new layer creations instead of replacing previous ones while waiting for proper multiple layers support. 2023-02-18 12:06:37 +00:00
Erwin Spitaler 2855d12f31 quick and dirty implementation for GetFreeSpaceSize 2023-02-18 12:06:18 +00:00
Abandoned Cart b20c6e9fc4 `onBackPressed` -> `onBackPressedDispatcher` 2023-02-16 14:50:02 +01:00
MrPurple666 bc3c49bc28 Add TIC format: 0x58D24946
Cult of the Lamb should now go in-game, and this should fix some textures in Bayonetta 3

Co-authored-by: neogan-bot <neogan-bot@users.noreply.github.com>
2023-02-13 12:10:29 +00:00
PixelyIon 3407f5a6d1 Fix Depth RT layer stride
The layer stride provided by the depth register in Maxwell3D needs to be shifted left by 2. Without the shift, the stride was 1/4th of what it needed to be, resulting in OOB access.
2023-02-13 12:02:29 +00:00
PixelyIon df3b961d5d Fix mipmapped GOB dimensions calculation
When calculating mip-level dimensions in terms of GOBs, they need to be divided by 2 while rounding upwards rather than downwards. This fixes corrupted textures and OOB access on lower mip levels across a substantial number of titles, reducing arbitrary crashes as a result.
2023-02-13 12:02:29 +00:00
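The rounding described in df3b961d5d can be sketched as follows. The function names and the per-level halving loop are assumptions made for illustration, not the project's actual helpers.

```cpp
#include <cassert>
#include <cstdint>

// Halve a dimension expressed in GOBs for the next mip level, rounding up.
// Written as 'gobs / 2 + gobs % 2' so it cannot overflow.
inline std::uint32_t NextMipGobs(std::uint32_t gobs) {
    assert(gobs != 0);
    return gobs / 2 + gobs % 2;
}

// Dimension in GOBs at 'level', starting from 'baseGobs' at level 0.
inline std::uint32_t MipGobs(std::uint32_t baseGobs, std::uint32_t level) {
    std::uint32_t gobs{baseGobs};
    for (std::uint32_t i{}; i < level; i++)
        gobs = NextMipGobs(gobs);
    return gobs;
}
```

Starting from 5 GOBs, rounding up gives 3, 2, 1 for the next three levels, whereas rounding down would give 2, 1, 0 and under-size the lower levels.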
Billy Laws bff232f326 Update edge names 2023-02-07 16:52:33 +00:00
Billy Laws 754a9dfd77 Avoid storing guest shader hash in generated spirv
This accidentally broke VK spec and could harm driver caching.
2023-02-07 16:50:46 +00:00
german77 9e1c9caa36 input: Fix motion orientation based on phone orientation 2023-02-07 16:16:41 +00:00
german77 56d43a70c0 Implement SixAxis sensor 2023-02-07 16:16:41 +00:00
lynxnb bac4ec2977 Update Kotlin + Android dependencies
* Update Kotlin to 1.7.21
* Migrate Hilt to Gradle Plugin DSL and use updated package name
* Update dependencies
* Misc cleanup of build scripts
2023-02-06 15:04:20 +01:00
lynxnb 85a711c420 Suppress warnings in `AndroidManifest` 2023-02-06 15:04:20 +01:00
lynxnb 42e0cc4290 Update build.gradle to remove deprecated properties 2023-02-06 15:04:20 +01:00
PixelyIon 2cdff40bcb Add debug log for SVC `CancelSynchronization`
This SVC was missing a log, which made it harder to trace issues to it. One has been added to assist with future debugging.
2023-02-05 18:07:11 +05:30
Billy Laws 7f1b6de1fe Update hades 2023-02-04 23:10:45 +00:00
Billy Laws 94ac457ce0 Ensure mappings are always aligned to big page size when deallocated and
mapped

Since we align up when allocating, not doing so when deallocating would result in a gradual buildup of boundary pages that eventually fill the whole address space.
2023-02-04 23:10:45 +00:00
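The leak described in 94ac457ce0 can be modelled with a simple usage counter. This is only a sketch: `BigPageSize` is a hypothetical value and `AddressSpace` is an illustrative stand-in, not the project's allocator.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical big page size, for illustration only.
constexpr std::uint64_t BigPageSize{0x20000};

// Round 'value' up to the next multiple of a power-of-two 'alignment'.
inline std::uint64_t AlignUp(std::uint64_t value, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Tracks how many bytes of address space are considered in use.
struct AddressSpace {
    std::uint64_t used{};

    void Allocate(std::uint64_t size) { used += AlignUp(size, BigPageSize); }

    // Mismatched: releases only the requested size, leaving the rest of the
    // big page permanently marked as used.
    void FreeUnaligned(std::uint64_t size) { used -= size; }

    // Matched: releases exactly what Allocate reserved.
    void Free(std::uint64_t size) { used -= AlignUp(size, BigPageSize); }
};
```

Each allocate/free cycle through `FreeUnaligned` strands the remainder of a big page, which is the gradual buildup the commit message refers to.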
Billy Laws d659b4f55e Swap min and max depth when negative scale is used
Fixes Super Mario 3D All Stars rendering.
2023-02-04 23:10:45 +00:00
Billy Laws 198e9e8e48 Avoid page faults when using the fallback shader size
These occurred in some homebrew otherwise.
2023-02-04 23:10:45 +00:00
Billy Laws 10e7e6272a Pass in pipeline tessellation state to Vulkan 2023-02-04 23:10:45 +00:00
Billy Laws 12c88babd0 Fix address space allocator slow path to avoid OOB 2023-02-04 23:10:45 +00:00
Billy Laws 4a4f6df792 Stub GetBufferHistory transaction 2023-02-04 23:10:45 +00:00
Billy Laws 3795ecceff Stub IdleTickCount GetInfo result 2023-02-04 23:10:45 +00:00
Billy Laws 3ef84b27c3 Avoid pipeline cache warning 2023-02-04 23:10:45 +00:00
Billy Laws bb3baa888d Add a hack to disable shader subgroup shuffles
These are about 100x more expensive on Adreno than on NVIDIA due to the lack of a dedicated instruction. Since some games work fine without them, add a hack to disable them.
2023-02-04 23:10:45 +00:00
Billy Laws 568306195f Prevent Vulkan guest crashes by avoiding intermediate syncpt event signal state
The vulkan guest driver doesn't expect a 0xB return code from SyncptEventWait, even though this is valid when an event is being signalled. Just ignore the intermediate state instead as doing so avoids races without causing any more.
2023-02-04 23:10:45 +00:00
Billy Laws fcb8f2a229 Apply texture shader compiler generated descriptor shifts
These were missed on a hades version upgrade.
2023-02-04 23:10:45 +00:00
Billy Laws bbef006051 Simplify free descriptor set accounting and update ratios
Slightly reduces descriptor usage in Breath of The Wild
2023-02-04 23:10:45 +00:00
Billy Laws 5e862cf5f7 Bail out early if the new pipeline key matches that of the current one
Prevents the transition cache of some pipelines from getting full of copies of itself in cases where an update happens redundantly.
2023-02-04 23:10:45 +00:00
Billy Laws 3e971d4043 Wait for pipeline compilation to finish before loading the guest
The excessive blocking caused by initial compilation happening async to the guest caused issues in some cases. Now that we have a Vulkan pipeline cache to speed it up, we can wait for a full compile before launch without too many issues.
2023-02-04 23:10:45 +00:00
Billy Laws be6f08cd97 Add debug pipeline statistics recording for finding redundant pipelines 2023-02-04 23:10:45 +00:00
Billy Laws 6333a92b53 Only include active RTs in pipeline state key
This was causing a buildup of many redundant pipelines in SMO as a depth-only shader was being called without previous RTs being unbound.
2023-02-04 23:10:45 +00:00
Billy Laws 9d3a9f63d5 Move graphics pipelines away from storing hades shader info struct
By only using what we need, and mirroring the descriptor structs to allow for much tighter packing (while keeping the same member names) we can reduce pipeline memory to about 1/3 of what it was before.
2023-02-04 23:10:45 +00:00
Billy Laws dd92cb1536 Implement support for (de)serialising VkPipelineCaches to/from storage
Significantly improves launch times in games with many shader combinations, giving a 5x speedup in some cases.
2023-02-04 23:10:45 +00:00
Billy Laws db173083d7 Update edge credits 2023-02-04 23:10:45 +00:00
PabloG02 35617930d5 Fix rebase 2023-01-28 11:57:19 +00:00
PabloG02 8b9d6f79ab Add option to enable/disable shader cache 2023-01-28 11:57:19 +00:00
PixelyIon 8bfda0d84d Fix hole punching in mappings with SVC `UnmapPhysicalMemory`
Certain titles such as Super Smash Bros Ultimate can use SVC `UnmapPhysicalMemory` to punch holes into physical memory mappings. This wasn't handled correctly as we completely deleted the portion after the hole. It has now been fixed, so titles which depend on this behavior now work.
2023-01-23 21:28:59 +00:00
hacobot.dev ff1e62df7a deleted unnecessary conversion 2023-01-23 21:28:49 +00:00
hacobot.dev 75f6f5e31c pull request requested changes 2023-01-23 21:28:49 +00:00
hacobot.dev 7cd13916a3 Main activity is now refreshing when the group checkbox is changed 2023-01-23 21:28:49 +00:00
hacobot.dev b67bfe3848 Added functionality to make optional to group games by format and sort 2023-01-23 21:28:49 +00:00
Billy Laws 6b9be2edd4 Add note about circular queue append contiguity guarantees 2023-01-20 21:19:04 +00:00
PabloG02 535eafb57a Add Android 13 themed icon 2023-01-20 21:08:33 +00:00
PabloG02 d544ccf5ea Stub INotificationServicesForApplication 2023-01-20 21:08:12 +00:00
PabloG02 2fa5ea451e Stub IPrepoService::SaveReportWithUserOld2 2023-01-20 21:08:12 +00:00
PabloG02 7327cdbde9 Stub some functions in IDeliveryCacheStorageService 2023-01-20 21:08:12 +00:00
PabloG02 c53d99d393 Stub IDeliveryCacheFileService and IDeliveryCacheDirectoryService 2023-01-20 21:08:12 +00:00
PabloG02 299d11d86f Stub IApplicationFunctions::GetNotificationStorageChannelEvent 2023-01-20 21:08:12 +00:00
Billy Laws 7c623f8301 Use a spinlock for thread waiter mutex
Since the waitermutex is only ever locked for a short amount of time, spinning in contention-heavy scenarios ends up quite a bit more efficient than a kernel wait.
2023-01-20 21:07:59 +00:00
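A minimal sketch of the kind of spinlock 7c623f8301 refers to is shown below. The class name, the `SpinLimit` threshold and the yield-after-spinning policy are assumptions for illustration, not the project's implementation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Spins in userspace instead of sleeping in the kernel, which suits locks
// that are only ever held for a very short time.
class SpinLock {
  public:
    void lock() {
        // Busy-wait; after SpinLimit failed attempts, yield on every retry
        for (unsigned attempt{}; locked.exchange(true, std::memory_order_acquire); attempt++)
            if (attempt >= SpinLimit)
                std::this_thread::yield();
    }

    bool try_lock() {
        return !locked.exchange(true, std::memory_order_acquire);
    }

    void unlock() {
        assert(locked.load(std::memory_order_relaxed)); // Unlocking an unheld lock is a bug
        locked.store(false, std::memory_order_release);
    }

  private:
    static constexpr unsigned SpinLimit{64}; // Hypothetical threshold
    std::atomic<bool> locked{false};
};
```

Because it exposes `lock`, `try_lock` and `unlock`, a type like this can be used with `std::lock_guard` or `std::unique_lock` in place of `std::mutex`.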
Billy Laws e2463b7619 Adjust gpfifo WFI to only do a pipeline barrier 2023-01-20 21:07:59 +00:00
Billy Laws 2b282ece1a Add more fine-grained buffer recreation locking 2023-01-20 21:07:59 +00:00
Billy Laws 85a23e73ba Implement a shared spinlock and use it for GPU VMM 2023-01-20 21:07:59 +00:00
Billy Laws fd5c141dbf Correct GetNpadIrCameraHandle return value 2023-01-20 21:07:59 +00:00
Billy Laws a8b32c3cef Cleanup helper pipeline cache code 2023-01-20 21:07:59 +00:00
Billy Laws 1f99d63a80 Incr transition cache size 2023-01-20 21:07:59 +00:00
Billy Laws 262f92900d Ensure unmapped VMM ranges return an invalid span 2023-01-20 21:07:59 +00:00
Billy Laws 0a608fb4b2 Update to latest hades 2023-01-20 21:07:59 +00:00
Billy Laws 44f6aada18 Always set blend state for all colour attachments 2023-01-20 21:07:59 +00:00
Billy Laws 177925be93 Avoid OOB memory accesses when trying to read OOB TICs
Some games pass in invalid texture handles (0xffff) when they don't need the texture, so return the null texture in this case.
2023-01-20 21:07:59 +00:00
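The guard described in 177925be93 amounts to a bounds check with a fallback. The sketch below uses a hypothetical `Texture` type and `LookupTexture` helper; the real TIC pool is more involved.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a texture object.
struct Texture {
    std::uint32_t id;
};

// Returns the texture for 'handle', or 'nullTexture' when the handle points
// outside the pool (e.g. the 0xffff placeholder some games pass in).
inline const Texture &LookupTexture(const std::vector<Texture> &pool, const Texture &nullTexture, std::uint32_t handle) {
    return handle < pool.size() ? pool[handle] : nullTexture;
}
```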
Billy Laws d8a4a2b08d Use a spinlock for GPU waiter thread 2023-01-20 21:07:59 +00:00
Billy Laws f1aed86177 Add a workaround for split-mapping shaders
Some games split shaders across multiple mappings and *also* miss the end header, so read a suitably large amount and hope that's enough for now.
2023-01-20 21:07:59 +00:00
Billy Laws 704660bbeb Store render nodes in a linearly allocated linked list
This is much faster in reldebug builds than boost::stable_vector while still providing iterator stability
2023-01-20 21:07:59 +00:00
Billy Laws 326c05a5de Add guest shader replacement and dumping support 2023-01-20 21:07:59 +00:00
Billy Laws 2f6d27e8d7 Rework circular queue locking
Should now be (hopefully) race-free, also switch to a spinlock to avoid any locking overhead.
2023-01-20 21:07:59 +00:00
lynxnb 5d527cb965 Add `CNTFRQ_EL0` workaround value for Exynos 1280 2023-01-15 10:16:01 +00:00
PabloG02 ea0217de47 Add TIC format: 0x78D24952 2023-01-13 18:05:22 +00:00
Abandoned Cart 88b3f371f4 Display a preview of the current profile picture
This removes the need to concatenate the variable multiple times, recycles the scaled bitmap after it has been stored, addresses the Android Studio complaint about that method name, and generates a preview of the current profile image as the preference icon.
2023-01-13 14:28:20 +01:00
lynxnb aa36c591c6 Exclude Home button from controller setup guide 2023-01-11 20:51:18 +00:00
Maccraft123 c3924e0f08 Stub out InlineKeyboard instead of throwing an error 2023-01-11 20:47:39 +00:00
lynxnb 2a421e7146 Run emulation in a separate process for release builds only 2023-01-11 23:38:57 +05:30
lynxnb 950438bf58 Enable `VK_KHR_image_format_list` during device init
`VK_KHR_image_format_list` is a requirement for `VK_KHR_imageless_framebuffer`, which we use.
2023-01-11 23:38:57 +05:30
PixelyIon d39112e9b9 Enable `IApplicationDisplayService::ConvertScalingMode` implementation
The implementation for this service function wasn't added to the service function table. Additionally, the type for the output `ScalingMode` was implicitly `int` as it was unspecified in the `enum class` which has now been corrected to `u64` as it should be.
2023-01-11 23:38:57 +05:30
PixelyIon 45d0558d00 Check for no Vulkan physical devices
Due to broken drivers, it's possible to find no Vulkan physical devices but this can lead to a cryptic segfault. This explicitly checks for it instead and throws an exception which will be emitted into logcat thus can be easily caught.
2023-01-11 23:38:57 +05:30
PixelyIon f882b613bc Fix `.hook` section being allocated without any hooked symbols
Due to the trampoline and save/load context functions, `GetHookSectionSize` returned a non-zero size for when there were no hooked symbols supplied to it. This is problematic as it isn't required and hooking is currently not stable so it can lead to crashes or freezes in certain titles.
2023-01-11 00:13:15 +05:30
PixelyIon 3fa314f6cb Always print thread IDs rather than handles for SVC logs
Handles are rather arbitrary and difficult to reference, as a result, we've moved to thread IDs across the board for logs.
2023-01-11 00:13:15 +05:30
PixelyIon e192d4e5c1 Warn when `RemoveThread` is called on a non-inserted thread 2023-01-11 00:13:15 +05:30
PixelyIon 3a6f205e6f Clear `insertThreadOnResume` in `RemoveThread`
A thread can be paused while it is in a synchronization primitive which will do `RemoveThread`, we need to update the state of `insertThreadOnResume` in this case by clearing it so it isn't incorrectly reinserted on resuming the thread.
2023-01-11 00:13:15 +05:30
PixelyIon 7fef849594 Make `UpdateCore`'s locking `coreMigrationMutex` requirement explicit
`Scheduler::UpdateCore` implicitly depended on `KThread::coreMigrationMutex` being locked during calls to it, this requirement has now been made explicit to avoid confusion.
2023-01-11 00:13:15 +05:30
PixelyIon c4b4532222 Check `waitThread` rather than `waitMutex` during condvar timeouts
When a timeout occurs in `ConditionVariableWait`, we used to check `waitMutex` which is cleared by `MutexUnlock` but when we hit the CAS case in `ConditionVariableSignal` then we don't clear `waitMutex`. It's far more reliable to check `waitThread` as an indication for if the thread has already been unlocked as it's cleared at the start of `ConditionVariableWait` and would implicitly stay cleared in the CAS case while being set in `MutexLock` and being unset in `MutexUnlock`.
2023-01-11 00:13:15 +05:30
PixelyIon 2525bafe06 Consolidate thread yielding in `Scheduler`
There are multiple locations where a thread is yielded in the scheduler and all of them repeat the code of checking for `pendingYield` and signalling with an optional optimization of checking if the thread being yielded is the calling thread.

All this functionality has now been consolidated into `Scheduler::YieldThread` which checks for `pendingYield` and does the calling thread yield optimization. This should lead to better readability and better performance in cases where `UpdatePriority` would signal the calling thread.
2023-01-11 00:13:15 +05:30
PixelyIon 8b973a3de3 Always set `forceYield` for running threads in `PauseThread`
`forceYield` was incorrectly not set when pausing running threads if the thread already had `pendingYield` set. This could lead to cases where `Rotate` would later throw an exception due to it being unset.
2023-01-11 00:13:15 +05:30
PixelyIon 6645692288 Don't block while inserting paused threads
Blocking while inserting a paused thread can lead to deadlocks where the inserting thread later resumes the paused thread.

Co-authored-by: Billy Laws <blaws05@gmail.com>
2023-01-11 00:13:15 +05:30
Billy Laws 643f4cf864 Ensure thread doesn't migrate during `InsertThread`
As we didn't hold `coreMigrationMutex`, the thread could simply migrate during `InsertThread` which would lead to the thread potentially never waking up as it's been inserted on a non-resident core.

Co-authored-by: PixelyIon <pixelyion@protonmail.com>
2023-01-11 00:13:15 +05:30
PixelyIon 7f7352ed59 Recalculate highest-priority waiters during cvar/address signaling
`SignalToAddress`/`ConditionVariableSignal` need to wake waiters in priority order. While threads are inserted in order, this doesn't remain the case as priority updates don't reinsert the thread into `syncWaiters`.

It was determined that reinsertion into `syncWaiters` would be fairly complex due to locking the `syncWaitersMutex` with the thread's mutexes. To avoid this, this commit instead sorts waiters by priority at signal time to always wake threads in the right order.
2023-01-11 00:13:15 +05:30
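Sorting at signal time, as 7f7352ed59 describes, can be sketched like this. `Waiter` and `WakeOrder` are hypothetical names, and the sketch assumes that a lower priority value means a higher priority and that waiters of equal priority wake in insertion order.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Waiter {
    int threadId;
    std::int8_t priority; // Assumption: lower value = higher priority
};

// Returns the IDs of up to 'count' waiters in the order they should be woken.
// Takes the list by value so the caller's insertion-ordered list is untouched.
inline std::vector<int> WakeOrder(std::vector<Waiter> waiters, std::size_t count) {
    // stable_sort keeps insertion order among waiters with equal priority
    std::stable_sort(waiters.begin(), waiters.end(), [](const Waiter &a, const Waiter &b) {
        return a.priority < b.priority;
    });

    std::vector<int> woken;
    for (std::size_t i{}; i < std::min(count, waiters.size()); i++)
        woken.push_back(waiters[i].threadId);
    return woken;
}
```

Sorting a copy on every signal costs more than keeping the list ordered, but it sidesteps the lock-ordering problem of reinserting a thread whenever its priority changes.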
PixelyIon 626008d8e2 Fix `WaitForAddress` timeout mutex deadlock
Calling `WaitSchedule` inside the block where `syncWaiterMutex` is locked causes a race with other threads which lock the core mutex and `syncWaiterMutex` together. This commit moves the `WaitSchedule` outside the block while simply setting a flag to wait later similar to `ConditionVariableWait`'s timeout case.

Co-authored-by: Billy Laws <blaws05@gmail.com>
2023-01-11 00:13:15 +05:30
PixelyIon 4df3c98225 Add double-insertion debug check to `InsertThread`
This is a cause for a large amount of scheduler bugs so we should generally check for this on debug builds as it is a fairly easy way to check for issues for some performance cost.
2023-01-11 00:13:15 +05:30
PixelyIon 5694c9b34b Rename `KThread::waitKey` to `KThread::waitMutex`
It was determined that `waitKey` is too ambiguous when waiter members are used for both mutexes and condition variables.
2023-01-11 00:13:15 +05:30
PixelyIon 91bb8d231a Rename `ConditionalVariable` -> `ConditionVariable`
"Conditional Variable" is a typo which was propagated through the codebase, it has been corrected to "Condition Variable".
2023-01-11 00:13:15 +05:30
PixelyIon f487d81769 Refactor Condition Variable Waiting/Signalling
The way we handled waking/timeouts of condition variables was fairly inaccurate to HOS as we moved locking of the mutex to the waker thread which could change the order of operations and would cause what were functionally spurious wakeups for all awoken threads.

This commit fixes it by doing all locks on the waker thread and only awakening the waiter thread once the condition variable was signalled and the mutex was unlocked. In addition, this fixes races between a timeout and a signal that could lead to double-insertion as a result of a refactor of how timeouts work in the new system.
2023-01-11 00:13:15 +05:30
PixelyIon 1eb4eec103 Allow locking external thread in `MutexLock`
We want the ability to lock mutexes on behalf of other threads to refactor condition variables to match HOS on waking behavior.
2023-01-11 00:13:15 +05:30
PixelyIon 6bbe9de881 Fix result returned by `MutexLock`
`MutexLock` incorrectly returned `InvalidCurrentMemory` for cases where the userspace value didn't match the expected value. It's been corrected to return no error in those cases while preserving the error code for usage in `ConditionalVariableWait`.
2023-01-11 00:13:15 +05:30
PixelyIon 08ef88b156 Add early-timeout path for `WaitForAddress` 2023-01-11 00:13:15 +05:30
PixelyIon d0c56235f4 Read `address` atomically in `WaitForAddress`
We didn't read the values for arbitration atomically in all cases as we should have, this consolidates the reading of the value and uses the value across all cases.
2023-01-11 00:13:15 +05:30
PixelyIon e8a1bd1aad Fix `WaitForAddress` timeout signal race
A race could occur when the timeout path in `WaitForAddress` ran at the same time as `SignalToAddress` was called, causing a deadlock due to double-insertion.
2023-01-11 00:13:15 +05:30
Billy Laws 0f1d97fe2c Update edge supporter names 2023-01-08 21:35:14 +00:00
Billy Laws 31fb6d30eb Fake maxwell occlusion query results 2023-01-08 19:30:52 +00:00
Billy Laws a92c26531e Keep holes in descriptors for unsupported bindings 2023-01-08 19:30:52 +00:00
Billy Laws 81d82008c7 Pre-signal suspend ticks event 2023-01-08 19:30:52 +00:00
Billy Laws 3e5992e366 Update hades 2023-01-08 19:30:52 +00:00
Billy Laws 45bbf3bb2a Fix indirect draws with direct buffers
We need to wait on the GPFIFO manually as we won't hit the traps when accessing the indirect params with direct as we usually would.
2023-01-08 19:30:52 +00:00
Billy Laws 68ad052cb1 Add geometry passthrough shader support for vertex layer writes 2023-01-08 19:30:52 +00:00
Billy Laws ec519a7d52 Return null texture on encountering unmapped textures 2023-01-08 19:30:52 +00:00
Billy Laws 97e127153b Make shader trap mutex recursive
There are cases where we hit a shader trap within the GPU; by making it recursive we avoid deadlocking on reads within the GPU.
2023-01-08 19:30:52 +00:00
Billy Laws 1a6165f74d Fix GetReadOnlyBackingSpan for non-direct buffers
This was missed in the initial implementation
2023-01-08 19:30:52 +00:00
Billy Laws 4e5141f879 Fix missed attempt increment in spinlock
Should hog CPU slightly less and correctly yield now
2023-01-08 19:30:52 +00:00
Billy Laws 35a46acbb1 Determine storage buffer alignment dynamically 2023-01-08 19:30:52 +00:00
Billy Laws 12d80fe6c2 Use a shared mutex for GPU VMM to avoid deadlocks
Two reads need to be able to occur simultaneously or deadlocks can occur (e.g. a read traps to wait on GPU but GPU needs to read).
2023-01-08 19:30:52 +00:00
Billy Laws 28b2a7a8a1 Dynamically apply GPU turbo clocks only when GPU submissions are queued
Allows for the GPU to clock down in cases where it's idle for most of the time, while still forcing maximum clocks when we care.
2023-01-08 19:30:52 +00:00
Billy Laws 81f3ff348c Transition memory handling from memfd to anonymous shared mappings
Memfd mappings are incompatible with KGSL user memory importing on older kernels, transition to shared anon mappings to avoid this.
2023-01-08 19:30:52 +00:00
Billy Laws cc3c869b9f Attempt to signal the vsync event at present time if possible
Some games rely on the vsync event to schedule frames. By matching its timing with presentation we can reduce needless waiting, as the game will immediately be able to queue the next frame after presentation.
2023-01-08 19:30:52 +00:00
Billy Laws 918a493a45 Implement wfi and setReference GPFIFO barriers 2023-01-08 19:30:52 +00:00
Billy Laws 7315ba04e6 Fixup optional flattenable binder obj structure 2023-01-08 19:30:52 +00:00
Billy Laws 90e21b0ca1 Split syncpoints into host-guest pairs
This allows for the presentation engine to grab the presentation image early when direct buffers are in use, since it'll handle sync on its own using semaphores it doesn't need to wait for GPU execution.
2023-01-08 19:30:52 +00:00
Billy Laws 966c31810a Return appropriate fences in surfaceflinger queue buffer 2023-01-08 19:30:52 +00:00
Billy Laws afef6c5123 Always populate all colour attachments
This better follows the Vulkan spec, which doesn't mention anything about writes to OOB attachments, only those marked as unused.
2023-01-08 19:30:52 +00:00
Billy Laws 3571737392 Reset maxwell3d quick bind state before adding subpasses to executor
If a submission happens during the call to addsubpass we could end up with invalid quick bind state, so move this to before the call to prevent that.
2023-01-08 19:30:52 +00:00
Billy Laws 3d31ade35f Implement an alternative buffer path using direct memory importing
By importing guest memory directly onto the host GPU we can avoid many of the complexities that occur with memory tracking as well as the heavy performance overhead in some situations. Since it's still desired to support the traditional buffer method, as it's faster in some cases and more widely supported, most of the exposed buffer methods have been split into two variants with just a small amount of shared code. While in most cases the code is simpler, one area with more complexity is handling CPU accesses that need to be sequenced, since we don't have any place we can easily apply writes to on the GPFIFO thread that won't also impact the buffer on the GPU. To solve this, when the GPU is actively using a buffer's contents, an interval list is used to keep track of any GPFIFO-written regions on the CPU, and any CPU reads to them will instead be directed to a shadow of the buffer with just those writes applied. Once the GPU has finished using the buffer contents the shadow can then be removed, as all writes will have been done by the GPU.

The main caveat of this is that it requires tying host sync to guest sync. This can reduce performance in games which double buffer command buffers as it prevents us from fully saturating the CPU with the GPFIFO thread.
2023-01-08 19:30:52 +00:00
Billy Laws b3f7e990cc Allow for tying guest GPU sync operations to host GPU sync
This is necessary for the upcoming direct buffer support, as in order to use guest buffers directly without trapping we need to recreate any guest GPU sync on the host GPU. This avoids the guest thinking work is done that isn't and overwriting in-use buffer contents.
2023-01-08 19:30:52 +00:00
Billy Laws 89c6fab1cb Implement a way to check if the command record thread is idle
Useful for debugging and testing
2023-01-08 19:30:52 +00:00
Billy Laws c67f27e914 Add a setting to control the maximum number of accumulated GPU cmds
This helps to keep the GPU fed when processing large command buffers which don't have any syncpoints to force a flush in between.
2023-01-08 19:30:52 +00:00
Billy Laws 77214a98dd Add a setting to force maximum GPU clocks on KGSL devices 2023-01-08 19:30:52 +00:00
Billy Laws 83ecc33a77 Update adrenotools 2023-01-08 19:30:52 +00:00
Billy Laws 3ecaedd71e Add adrenotools direct mapping support 2023-01-08 19:30:52 +00:00
Pablo 8846a85d3a Stub some IPurchaseEventManager functions 2022-12-31 10:45:18 +00:00
PabloG02 80c0f8f04d Implement full profile picture support
Extends the profile picture stub into a full-fledged implementation with the ability for users to set their profile picture in settings while having the Skyline icon as the default profile picture.
2022-12-27 22:53:41 +05:30
PixelyIon 7a3d2e4a26 Start `KThread` TID from 1 rather than 0
HOS's TIDs are one-based rather than zero-based. Certain titles such as Pokémon Arceus, Naruto Shippuden: Ultimate Ninja Storm 3, Splatoon 3, etc. use a TID of zero as a sentinel value, but as we previously assigned this ID to our first thread, it broke this logic. This commit fixes that by matching HOS behavior.
2022-12-27 22:36:06 +05:30
Billy Laws bab659587f Use e1 sample count for blits 2022-12-22 18:05:45 +00:00
Billy Laws 516ece6b04 Calculate renderarea from attachment min size 2022-12-22 18:05:45 +00:00
Billy Laws 4a3cd69257 Populate graphics pipeline manager from cache at launch-time 2022-12-22 18:05:45 +00:00
Billy Laws e9bcdd06eb Introduce a pipeline cache manager for simple read/write cache accesses
All writes are done async into a staging file, which is then merged into the main pipeline cache file at the time of the next launch. Upon encountering file corruption the cache can be trimmed up to the last-known-good entry to avoid any excessive loss of data from just one error.
2022-12-22 18:05:45 +00:00
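Trimming to the last-known-good entry, as e9bcdd06eb describes, can be sketched under an assumed record layout. Everything here is hypothetical: the `[u32 size][u32 checksum][payload]` layout, the toy `Checksum`, and the helper names. The real on-disk format is documented in the project's source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy checksum for illustration; a real cache would use a proper hash.
inline std::uint32_t Checksum(const std::uint8_t *data, std::size_t size) {
    std::uint32_t sum{};
    for (std::size_t i{}; i < size; i++)
        sum = sum * 31 + data[i];
    return sum;
}

// Appends one entry using the assumed layout: [u32 size][u32 checksum][payload].
inline void AppendEntry(std::vector<std::uint8_t> &file, const std::vector<std::uint8_t> &payload) {
    std::uint32_t header[2]{static_cast<std::uint32_t>(payload.size()), Checksum(payload.data(), payload.size())};
    const auto *bytes = reinterpret_cast<const std::uint8_t *>(header);
    file.insert(file.end(), bytes, bytes + sizeof(header));
    file.insert(file.end(), payload.begin(), payload.end());
}

// Returns the number of leading bytes that form complete, valid entries.
// Truncating the file to this length drops the first bad entry and all after it.
inline std::size_t LastKnownGoodSize(const std::vector<std::uint8_t> &file) {
    constexpr std::size_t HeaderSize{2 * sizeof(std::uint32_t)};
    std::size_t offset{};
    while (file.size() - offset >= HeaderSize) {
        std::uint32_t size, checksum;
        std::memcpy(&size, file.data() + offset, sizeof(size));
        std::memcpy(&checksum, file.data() + offset + sizeof(size), sizeof(checksum));

        std::size_t payload{offset + HeaderSize};
        if (size > file.size() - payload)
            break; // Entry is truncated
        if (Checksum(file.data() + payload, size) != checksum)
            break; // Entry is corrupt

        offset = payload + size;
    }
    assert(offset <= file.size());
    return offset;
}
```

A caller would resize the file to `LastKnownGoodSize(file)` on load, so a single damaged entry costs only the entries written after it.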
Billy Laws 06bf1b38af Introduce a pipeline state accessor that reads from a bundle 2022-12-22 18:05:45 +00:00
Billy Laws 7dd3a1db0f Avoid InterconnectContext use in graphics PipelineManager
We will soon move to a global pipeline manager instance, so it won't be possible to use InterconnectContext at pipeline-creation time anymore.
2022-12-22 18:05:45 +00:00
Billy Laws ffe7263848 Add quirk for 615 drivers with broken multithreaded compilation 2022-12-22 18:05:45 +00:00
Billy Laws 755f7c75af Add pipeline (de)serialisation support to bundle
See comments in code for details on the on-disk format.
2022-12-22 18:05:45 +00:00
Billy Laws 937eff392f Switch execution-numbers to be globally unique tags
This is required for making pipelines usable across channels without introducing caching bugs.
2022-12-22 18:05:45 +00:00
Billy Laws 072b8193a1 Implement thread pool based async pipeline compilation with futures
By distributing the load of shader compiling onto multiple threads and only waiting for completion when absolutely necessary, we can reduce compilation stutters significantly.
2022-12-22 18:05:45 +00:00
Billy Laws 186549748d Implement HelperShader-local pipeline cache and use dynamic state
Avoids the heavy overhead of the VK pipeline cache when we really only have a few bits of non-dynamic state
2022-12-22 18:05:45 +00:00
Billy Laws 9115b8cae8 Properly hash dynamic states in pipeline cache 2022-12-22 18:05:45 +00:00
Billy Laws 7c4b4765bf Reduce thresholds for slot increase and buffer/texture fast readback 2022-12-22 18:05:45 +00:00
Billy Laws f32ab1feff Include BS thread pool library 2022-12-22 18:05:45 +00:00
Billy Laws ce428af2e6 Use attachment formats rather than views in VK pipeline cache 2022-12-22 18:05:45 +00:00
Billy Laws e849264028 Abstract out pipeline-compile-time GPU state accesses
Introduces the base abstractions that will be used for pipeline caching, with a 'PipelineStateBundle' that can be (de)serialised to/from disk and an abstract accessor class to allow switching between creating disk-cached pipelines and fresh ones.
2022-12-22 18:05:45 +00:00
Billy Laws 2e96248fb6 Track RT format info in PackedPipelineState and move VK conv code there
When caching pipelines we can't cache whole images, only their formats so refactor PackedPipelineState so that it can be used for pipeline creation, as opposed to passing in a list of attachments.
2022-12-22 18:05:45 +00:00
Billy Laws bc7e1eb380 Split-out hash from ShaderBinary struct
This isn't necessary for pipeline creation and creates some difficulty with pipeline caching.
2022-12-22 18:05:45 +00:00
Dima de10ab1219 Stub SetConnectionConfirmationOption 2022-12-18 20:34:55 +00:00
Dima f3b2b4317e Stub some IPrepoService calls 2022-12-18 20:34:55 +00:00
Dima efef67b92b Stub some IAudioDevice calls 2022-12-18 20:34:55 +00:00
Dima 3a94bcf692 Fix ListOpenContextStoredUsers stub
The problem was that StoreOpenContext wasn't storing any user, but ListOpenContextStoredUsers was writing the default user even when it hadn't been stored by StoreOpenContext.
2022-12-18 20:34:55 +00:00
TheASVigilante 3c5f8dd876 Fix small typo 2022-12-18 14:49:54 +00:00
lynxnb 6599c1dccf Stub `GyroscopeZeroDriftMode`
Related service calls are called in a loop by SM3DW. A variable tracking zero drift mode has been added to `npad_device`, but it's unused at the moment.
2022-12-10 14:59:44 +00:00
Dima dcc3047ba8 Stub ErrorCommonArg 2022-12-10 14:58:20 +00:00
Dima 68253fe995 Stub mii:e/mii:u
Needed for SSBU
2022-12-10 14:58:20 +00:00
Dima 69ee3cfc66 Stub DeleteDirectory
Should allow deleting/rewriting saves in some games
2022-12-10 14:58:20 +00:00
Dima bbd34ae7e7 Validate if entries are not empty before using
Should fix saving problem in Baldur's Gate: Dark Alliance II at least
2022-12-10 14:58:20 +00:00
Dima 5f510d84d7 Stub IsVibrationPermitted 2022-12-10 14:58:20 +00:00
Dima 51d1f519af Stub ListDisplays 2022-12-10 14:58:20 +00:00
Dima a3866a3129 Stub LibraryAppletShop 2022-12-10 14:58:20 +00:00
Dima 1ebec7db82 Stub GetImageSize and LoadImage 2022-12-10 14:58:20 +00:00
Dima 52c4228ecf Stub some friends service calls
Needed for Diablo 3
2022-12-10 14:58:20 +00:00
Dima ebcbc5b05b Validate NpadId for ActivateVibrationDevice 2022-12-10 14:58:20 +00:00
Dima 4bdd033354 Stub SetRecordVolumeMuted 2022-12-10 14:58:20 +00:00
Dima f6d95aae01 Stub GetCacheStorageSize 2022-12-10 14:58:20 +00:00
Dima 4ab8699cd4 Stub ImportServerPki 2022-12-10 14:58:20 +00:00
Dima 41cf4bb12d Stub GetLanguageCode 2022-12-10 14:58:20 +00:00
Dima 3e078d54b6 Stub GetIdleTimeDetectionExtension 2022-12-10 14:58:20 +00:00
Dima 2311f777fc Stub IsCpuOverclockEnabled 2022-12-10 14:58:20 +00:00
Dima 4601c28c28 Stub GetCurrentIpAddress 2022-12-10 14:58:20 +00:00
Dima 18e6a6c53c Stub DeclareOpenOnlinePlaySession and DeclareCloseOnlinePlaySession 2022-12-10 14:58:20 +00:00
Dima 150c1370c2 Stub some IApplicationFunctions funcs 2022-12-10 14:58:20 +00:00
Dima a6f3aa3062 Stub TrySelectUserWithoutInteraction and ListQualifiedUsers 2022-12-10 14:58:20 +00:00
Dima 5a9a2861df Add TitleId TextView in App Dialog 2022-12-10 14:57:46 +00:00
Abandoned Cart b08fcd7027 Favor a predefined "click" over system vibration 2022-12-10 14:57:33 +00:00
Abandoned Cart cfd3bfecba Add a rudimentary OSC button vibration setting 2022-12-10 14:57:33 +00:00
Billy Laws 7c802aea46 Mark vertex buffers as dirty on limit changes 2022-12-03 22:50:56 +00:00
Billy Laws df19810c6c Always set vertex stride for unbound buffers 2022-12-03 22:50:56 +00:00
Billy Laws f4f658e3b7 Fix typo 2022-12-03 22:50:56 +00:00
Billy Laws 45b10ef776 Return whole mapping for shader code when end instrs aren't found 2022-12-03 22:50:56 +00:00
Billy Laws d849875656 Only unlock GPU channel state on queue wait if it was previously locked 2022-12-03 22:50:56 +00:00
Billy Laws a5e0a64adc Switch patch error logs to debug 2022-12-03 22:50:56 +00:00
Billy Laws af7c54297f Cache staging buffer used for texture download 2022-12-03 22:50:56 +00:00
Billy Laws 8c5e6d2bb4 Update VKMA 2022-12-03 22:50:56 +00:00
Billy Laws bba07fb101 Update for new hades 2022-12-03 22:50:56 +00:00
Billy Laws a16383fd4b Disable compute shaders on mali
This will need to be debugged properly at some point but it's fine for now.
2022-12-03 22:50:56 +00:00
Billy Laws d69c6851f3 Update hades 2022-12-03 22:50:56 +00:00
Billy Laws 137d801843 Skip host1x HW emulation and effectively stub submission
This was causing a bunch of logspam and isn't really needed as we will be using an HLE approach.
2022-12-03 22:50:56 +00:00
Billy Laws 579a2d9337 Add dynamic executor slot growth 2022-12-03 22:50:56 +00:00
Billy Laws 60169fce4c Support 0-sized constant buffers 2022-12-03 22:50:56 +00:00
Billy Laws b86dd99e1a Align all SSBOs to 0x40 bytes
Required by Adreno GPUs
2022-12-03 22:50:56 +00:00
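As a rough illustration of what a 0x40-byte SSBO alignment requirement implies, the sketch below widens a buffer range so that its start falls on an aligned boundary while still covering the original bytes. The commit message does not say how the alignment is applied, so the approach and every name here (`SsboAlignment`, `BufferRange`, `AlignRangeStart`) are assumptions, not the project's code.

```cpp
#include <cassert>
#include <cstdint>

// The 0x40 value comes from the commit message; all names are hypothetical.
constexpr std::uint64_t SsboAlignment{0x40};

struct BufferRange {
    std::uint64_t offset; // start of the range in bytes
    std::uint64_t size;   // length of the range in bytes
};

// Returns true if `value` is a non-zero power of two.
constexpr bool IsPowerOfTwo(std::uint64_t value) {
    return value != 0 && (value & (value - 1)) == 0;
}

// Moves the start of `range` down to a multiple of `alignment` and grows the
// size by the same amount, so the original bytes remain covered.
inline BufferRange AlignRangeStart(BufferRange range, std::uint64_t alignment = SsboAlignment) {
    assert(IsPowerOfTwo(alignment));
    std::uint64_t alignedOffset{range.offset & ~(alignment - 1)};
    std::uint64_t padding{range.offset - alignedOffset};
    return {alignedOffset, range.size + padding};
}
```

With this sketch, a range starting at 0x47 with size 0x10 becomes a range starting at 0x40 with size 0x17; any consumer would then have to offset its accesses by the 7 bytes of padding.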
Billy Laws bfae292fb0 Make executor slot count setting exponential 2022-12-03 22:50:56 +00:00
Billy Laws e0ae94be9d Enable robustness1 Vulkan feature 2022-12-03 22:50:56 +00:00
Billy Laws e8ef2d80af CMake build file updates 2022-12-03 22:50:56 +00:00
Billy Laws bf03f945ee Implement the Kepler compute engine
This can reuse a fair bit of the now-commonised Maxwell 3D code and mostly consists of compute-specific pipeline code which was deemed not suitable for being commonised (e.g. descriptor update code is somewhat duplicated). Of note is how compute lacks any active state at all due to its use of QMDs which bundle up all state into a single object in memory.
2022-12-03 22:50:56 +00:00
Billy Laws 4bc81f007f Add some convenience helpers to compute engine regs 2022-12-03 22:50:56 +00:00
Billy Laws 4267a6af36 Add support for parsing and compiling compute shaders to the shader manager 2022-12-03 22:50:56 +00:00
Billy Laws 86dab65af4 Commonise maxwell3d state updater 2022-12-03 22:50:56 +00:00
Billy Laws a0b81d54d6 Use pitch layout for linear RTs
More likely to match in the texture cache when being sampled.
2022-12-03 22:50:56 +00:00
Billy Laws ac85df7b7a Start transition cache lookup with most recent one 2022-12-03 22:50:56 +00:00
Billy Laws 62c86b7690 Move maxwell3d to common constant buffer code 2022-12-03 22:50:56 +00:00
Billy Laws 8f0a6e78c5 Add Vulkan stride dynamic state and robustness support
Fixes the waterfall in SMO by specifying vertex buffer bounds.
2022-12-03 22:50:56 +00:00
Billy Laws 23a7f70a8e Commonise maxwell3d guest shader caching code 2022-12-03 22:50:56 +00:00
Billy Laws 6f6a312692 Commonise maxwell3d pipeline binding handling code
A lot of pipeline code is difficult to commonise due to the inherent difference between compute and graphics pipelines, however the binding layout is shared so we can at least commonise that
2022-12-03 22:50:56 +00:00
Billy Laws be8cbabd97 Commonise maxwell3d texture code
This will be shared with the compute engine implementation.
2022-12-03 22:50:56 +00:00
Billy Laws 61e95c4b2c Commonise maxwell3d sampler code
This will be shared with the compute engine implementation. The only thing of note is that the binding register is now passed as a param, since it is part of the compute QMD which can't be dirty tracked.
2022-12-03 22:50:56 +00:00
Billy Laws 7f93ec3df6 Commonise maxwell3d interconnect common code for use by other engines
The compute engine will require most of this for basic functionality.
2022-12-03 22:50:56 +00:00
Billy Laws 281838fde1 Apply GPU readback hack to both buffers and textures
And rename as appropriate.
2022-12-03 22:50:56 +00:00
Billy Laws f358c4517e Update edge credits 2022-12-03 22:50:56 +00:00
Billy Laws eb00dc62f8 Implement support for 36 bit games by using split code/heap mappings
Although rtld and IPC prevent TLS/IO and code from being above the 36-bit AS limit, nothing depends on the heap being below it. We can take advantage of this by stealing as much AS as possible for code in the lower 36-bits.
2022-12-02 22:10:03 +00:00
Dima e8e1b910c3 Add possibility to disable audio output 2022-12-02 00:33:28 +01:00
lynxnb 70109f8fbd Work around invalid values in `CNTFRQ_EL0` register
Exynos SoCs have a bug where the `CNTFRQ_EL0` register is either set to 0 or contains incoherent values. With this patch, the frequency value is loaded into a static variable and used instead of reading the register. The value will be initialised to the correct value for affected SoCs, while unaffected ones will use the value from the register.
2022-12-02 00:23:28 +01:00
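The selection logic described in this commit can be sketched roughly as below. The commit message does not state how affected SoCs are detected or which frequency they should use, so `CounterConfig`, its fields, and both function names are hypothetical, and the register value is passed in as a plain argument rather than read with an `mrs` instruction.

```cpp
#include <cstdint>

// Hypothetical inputs: detection of affected SoCs and their correct
// frequency are not described in the commit message.
struct CounterConfig {
    bool registerIsUnreliable;    // true on SoCs where CNTFRQ_EL0 cannot be trusted
    std::uint64_t knownFrequency; // frequency to use when the register is unreliable
};

// Picks the known-good frequency on affected SoCs, the register value otherwise.
inline std::uint64_t ResolveCounterFrequency(const CounterConfig &config, std::uint64_t registerValue) {
    return config.registerIsUnreliable ? config.knownFrequency : registerValue;
}

// Resolves the frequency once and reuses it for every later call, mirroring
// the "loaded into a static variable" wording; later arguments are ignored.
inline std::uint64_t CachedCounterFrequency(const CounterConfig &config, std::uint64_t registerValue) {
    static const std::uint64_t frequency{ResolveCounterFrequency(config, registerValue)};
    return frequency;
}
```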
lynxnb 54d0246ca6 Tweak `GpuDriverActivity` FAB padding 2022-11-28 00:06:07 +01:00
lynxnb 2e8d7b559c Use the original view padding/margin when applying window insets
Adding to the current view padding/margin values results in applying the insets over and over again as insets listeners can be called multiple times.
2022-11-28 00:04:39 +01:00
Billy Laws b2384e83f5 Add prepo:a service 2022-11-25 16:26:00 +00:00
Billy Laws 736216a6f4 Stub OpenPatchDataStorageByCurrentProcess 2022-11-25 16:26:00 +00:00
Billy Laws 44033d7f8d Adjust CalendarTime year to be relative to 0AD 2022-11-25 16:26:00 +00:00
Billy Laws 2ce2604421 Implement VFS file deletion 2022-11-25 16:26:00 +00:00
Billy Laws 6c968e0357 Fix GetEntryType IPC return type 2022-11-25 16:26:00 +00:00
lynxnb ec220c8ea9 Use an extended FAB in `GpuDriverActivity` 2022-11-23 19:49:42 +05:30
lynxnb 163f4f2014 Fix window insets handling when in landscape mode
To avoid code duplication, insets handling has been moved to a separate interface.
2022-11-23 19:49:42 +05:30
lynxnb ab6c5f4c50 Improve robustness of `KeyReader.import`
* Close the input and output file streams before moving the output file to the final destination
* Clean up the destination path before moving the new file
* Introduce an `ImportResult` return value to differentiate between the possible causes of import errors
* Display more meaningful error messages in the UI
2022-11-23 19:49:42 +05:30
lynxnb 38129d9dc3 Mark some strings as non-translatable 2022-11-23 19:49:42 +05:30
lynxnb ee8c055641 Make `GpuDriverInstallResult` PascalCase 2022-11-23 19:49:42 +05:30
Billy Laws 7f1667de82 Avoid using trapping for frequently trapped shaders
Fall back to hashing for every shader access as that ends up being faster than applying traps for every execution.
2022-11-19 12:49:05 +00:00
Billy Laws 06095918a9 Introduce per-channel sequence number for invalidation tracking
For cases like shaders, which may be uploaded through I2M (which no longer causes an execution) we need a way to cause an invalidation on all writes
2022-11-19 12:49:05 +00:00
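The general pattern behind sequence-number invalidation can be sketched as follows: a counter advances on every write, and each cached item remembers the counter value it was last validated at. This is only an illustration of the technique under stated assumptions; `ChannelSequence` and `CachedEntry` are hypothetical names and do not reflect how the commit wires this into the channel state.

```cpp
#include <cstdint>

// Hypothetical stand-in for per-channel state: the counter advances on every write.
struct ChannelSequence {
    std::uint64_t value{0};

    void OnWrite() { ++value; }
};

// Hypothetical cache entry that records the sequence number it was last validated at.
class CachedEntry {
  public:
    // True if the entry was never validated, or a write happened since it was.
    bool NeedsRevalidation(const ChannelSequence &sequence) const {
        return !validated || validatedAt != sequence.value;
    }

    // Records that the entry matches guest memory as of the current sequence number.
    void MarkValidated(const ChannelSequence &sequence) {
        validatedAt = sequence.value;
        validated = true;
    }

  private:
    std::uint64_t validatedAt{0};
    bool validated{false};
};
```

Comparing a single integer is cheap, at the cost of being conservative: any write to the channel forces a recheck, even if it did not touch the cached data.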
Billy Laws 97e3f7fd34 Increase max swapchain image count 2022-11-19 12:49:05 +00:00
Billy Laws c49119f5ef Fixup depth bounds register arguments 2022-11-19 12:49:05 +00:00
Billy Laws db3c5c33c4 Clamp depth bounds into 0-1 range 2022-11-19 12:49:05 +00:00
Billy Laws e1bbd521d9 Fix potential circular queue submission race
If a producer thread was waiting for the queue to have free space and the consumer thread hadn't yet acquired the production mutex a deadlock could occur
2022-11-19 12:49:05 +00:00
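For context on this class of bug, the sketch below shows one common way to keep a bounded producer/consumer queue free of lost wakeups: both sides share one mutex and wait on a predicate that is re-checked while the lock is held. It is not the project's queue and not necessarily how the commit fixes the race; `BoundedQueue` and its members are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical bounded queue used only to illustrate predicate-based waiting.
template <typename T>
class BoundedQueue {
  public:
    explicit BoundedQueue(std::size_t maxItems) : maxItems(maxItems) {
        assert(maxItems > 0); // a zero-capacity queue would block producers forever
    }

    // Blocks while the queue is full, then appends the item.
    void Push(T item) {
        std::unique_lock<std::mutex> lock{queueMutex};
        // The predicate is evaluated under the lock, so a notification sent
        // before this thread started waiting cannot be missed.
        notFull.wait(lock, [this] { return items.size() < maxItems; });
        items.push_back(std::move(item));
        lock.unlock();
        notEmpty.notify_one();
    }

    // Blocks while the queue is empty, then removes and returns the oldest item.
    T Pop() {
        std::unique_lock<std::mutex> lock{queueMutex};
        notEmpty.wait(lock, [this] { return !items.empty(); });
        T item{std::move(items.front())};
        items.pop_front();
        lock.unlock();
        notFull.notify_one();
        return item;
    }

  private:
    std::mutex queueMutex;
    std::condition_variable notFull;
    std::condition_variable notEmpty;
    std::deque<T> items;
    std::size_t maxItems;
};
```

Because the full/empty checks and the state changes happen under the same mutex, a producer cannot decide to sleep between the consumer freeing a slot and the consumer signalling it.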
Billy Laws 13baf2312f Add a workaround for sampling BGRA textures with a swizzle 2022-11-19 12:49:05 +00:00
Billy Laws 13a96c5aba Implement a helper shader for partial clears
These are not natively supported by Vulkan, so use a helper shader and colorWriteMask for the same behaviour.
2022-11-19 12:49:05 +00:00
Billy Laws ac0e225114 Use vkCmdBlit for texture copies when formats don't match 2022-11-19 12:49:05 +00:00
Billy Laws c8fc8f84ec Fall back to RGBA888 for unsupported swapchain formats as opposed to swizzle 2022-11-19 12:49:05 +00:00
Billy Laws e0bc0d3a97 Avoid megabuffering buffers larger than the chunk size 2022-11-19 12:49:05 +00:00
Billy Laws b6f49884b3 Use lower_bound to speed up texture hostMapping lookup 2022-11-19 12:49:05 +00:00
Billy Laws e7fda28ac6 Skip over textures in cache which have been replaced with a layer/mip match 2022-11-19 12:49:05 +00:00