skyline

mirror of https://github.com/skyline-emu/skyline.git synced 2024-11-26 12:34:19 +01:00

Author	SHA1	Message	Date
Billy Laws	09f376e500	Add const accessors to OffsetMember	2022-11-02 17:46:07 +00:00
Billy Laws	64a9db2e82	Introduce `MergeInto` helper for simplified construction of arrays of structs In the upcoming GPU code each state member will hold a reference to its corresponding Maxwell 3D regs, this helper is needed to allow easy transformation from the main 3D register struct into them. Example: ```c++ struct Regs { std::array<View, 10> viewRegs; u32 enable; } regs; struct ViewState { const View &view; const u32 &enable; size_t index; }; std::array<ViewState, 10> viewStates{MergeInto<ViewState, 10>(regs.viewRegs, regs.enable, IncrementingT{})}; ```	2022-11-02 17:46:07 +00:00
Billy Laws	2c682f19a6	Add untracked linear allocator emplace/allocate functions Useful for cases where allocations are guaranteed to be unused by the time `Reset()` is called and calling `Free()` would be difficult or add extra performance cost due to how the allocation is used.	2022-11-02 17:46:07 +00:00
Billy Laws	6359852652	Introduce page size constants and replace all usages of PAGE_SIZE Avoids using macros and results in code which looks slightly cleaner.	2022-11-02 17:46:07 +00:00
Billy Laws	30ec844a1b	Use GPFIFO pushbuffer contents in-place if possible The memcpy within `Read()` was taking up a fair amount of time, avoid this by using the mapped range in-place when the mapping isn't split.	2022-11-02 17:46:07 +00:00
Billy Laws	be825b7aad	Utilise SegmentTable for rapid FlatMemoryManager lookups In some games performing the binary search in `TranslateRange()` ended up taking a fairly large (~8%) proportion of GPFIFO time. By using a segment table for O(1) lookups this is reduced to <2% for non-split mappings at the cost of slightly increased memory usage (2GiB in the absolute worst case but more like 50MiB in real world situations). In addition to adapting `TranslateRange()` to use the segment table, a new function `LookupBlock()` is added for cases where only a single mapping would ever be looked up, so the small_vector handling and fallback paths can be skipped and the entire lookup can be inlined.	2022-11-02 17:46:07 +00:00
Morph	4ea0b0e1e5	fssrv: IFileSystemProxy: Implement OpenReadOnlySaveDataFileSystem Forward this function to OpenSaveDataFileSystem for now. A proper implementation should wrap the underlying filesystem with nn::fs::ReadOnlyFileSystem.	2022-10-30 20:04:40 +00:00
Narr the Reg	25b9bb00fd	service: hid: Properly clear and set npad devices	2022-10-30 15:52:03 +00:00
german77	cf95cfb056	service: hid: Stub SetPalmaBoostMode	2022-10-30 15:52:03 +00:00
Narr the Reg	4da934579c	service: hid: Set the correct maxEntry value and signal on acquire event handle	2022-10-30 15:52:03 +00:00
Dima	baa6b5d5ea	check if NpadId is valid when update	2022-10-30 15:51:45 +00:00
Dima	a409f30e91	add GetAvailableLanguageCodeCount for both lists	2022-10-30 15:51:29 +00:00
Dima	51ce3f7c3c	Stub IClient::Poll	2022-10-29 19:52:50 +05:30
Billy Laws	1846f533bc	Add credits PreferenceCategory	2022-10-25 21:40:28 +01:00
Abandoned Cart	267d25b5a5	Prevent a false positive `SecurityException` for DocumentsProvider	2022-10-25 20:17:18 +05:30
lynxnb	1ae36cea24	Update `ngword2` archive with the correct content	2022-10-25 00:40:46 +02:00
Billy Laws	160c2f3457	Add Ko-Fi credits to settings	2022-10-23 21:14:39 +01:00
PixelyIon	128ea33073	Print NPDM + NACP metadata for title determination We determined that printing NPDM + NACP metadata is a significantly better way to determine what title is running rather than printing the filename.	2022-10-23 20:20:44 +05:30
PixelyIon	a0539a3edb	Trace Scheduler Preemption/Yield in Perfetto	2022-10-22 17:37:03 +05:30
PixelyIon	c874907eb5	Log and flush inside `KProcess::Kill` We want to know when the `KProcess` is being killed and flushing log during it is important since it can often result in hangs due to joining not working correctly.	2022-10-22 17:17:04 +05:30
PixelyIon	597a6ff31d	Wait on slot to be freed in `GraphicBufferProducer::DequeueBuffer` We currently don't wait on a slot to be freed if none are free, this worked prior to async presentation as GBP's slots wouldn't change their state until other commands were called but now slots can be held by the presentation engine. As a result, we now have to wait on the presentation engine to free up slots. This commit also fixes the behavior of the `async` flag in `DequeueBuffer` as it was treated as a non-blocking flag but isn't supposed to do anything on HOS.	2022-10-22 17:15:33 +05:30
lynxnb	cdb2b85d6c	Stub `ngword` and `ngword2` Co-Authored-By: Timotej Leginus <35149140+timleg002@users.noreply.github.com>	2022-10-18 20:54:57 +01:00
lynxnb	a7be4fd1e1	Stub `pl::RequestLoad` and `pl::GetSharedFontInOrderOfPriority` Co-Authored-By: Timotej Leginus <35149140+timleg002@users.noreply.github.com>	2022-10-18 20:54:57 +01:00
lynxnb	782f9e37ee	Add a system region setting Needed for games such as AC:NH. The `Auto` option automatically selects a region based on the currently selected system language. Co-Authored-By: Timotej Leginus <35149140+timleg002@users.noreply.github.com>	2022-10-18 20:54:57 +01:00
lynxnb	d4800d13b8	Stub `hid::ActivateMouse` and `hid::ActivateKeyboard` Co-Authored-By: Timotej Leginus <35149140+timleg002@users.noreply.github.com>	2022-10-18 20:54:57 +01:00
lynxnb	45830633eb	Stub `am::SetTerminateResult`	2022-10-18 20:54:57 +01:00
lynxnb	bc016aff47	Make the vulkan validation layer toggleable via setting As part of this commit, a new preference category for debug settings is being introduced. All future settings only relevant for debugging purposes will be put there. The category is hidden on release builds.	2022-10-18 19:47:23 +02:00
lynxnb	5cf14e45e1	Enable frame throttling when triple buffering is disabled	2022-10-18 19:27:14 +02:00
lynxnb	6b76c61cd1	Introduce a `releasedebug` build variant	2022-10-17 18:39:32 +02:00
lynxnb	b17364bb92	Introduce a `dev` app flavor for side-by-side installation	2022-10-01 13:01:46 +02:00
Billy Laws	8d1026d0cc	Reduce font shared memory size for compacted fonts	2022-09-22 21:51:13 +01:00
Billy Laws	25fb09800a	Compact fonts to only include necessary glyphs	2022-09-22 21:51:13 +01:00
Billy Laws	e31ed6a429	Increase font shared memory size for Noto fonts	2022-09-22 21:34:29 +01:00
Billy Laws	0d4893c448	Cleanup font magic generation code	2022-09-22 21:34:29 +01:00
Billy Laws	5c4cc3d51f	Fix font load order to match HOS Without this games would load an inappropriate font file for their target language.	2022-09-22 21:34:29 +01:00
Billy Laws	272bbf6cd2	Switch to Noto Sans fonts for shared fonts replacement Provides CJK characters, which the previous replacements lacked entirely.	2022-09-22 21:34:29 +01:00
lynxnb	54172322fe	Fix host synchronization for texture with a different guest format Host synchronization of a guest texture with a different guest format represents a valid use case where the host doesn't support the guest format and conversion to a host-compatible format must be performed. The issue is most evident on Mali GPUs, as they don't support BCn texture formats thus needing manual decoding before submission. It was disabled by mistake in a previous commit, this commit re-enables it.	2022-09-15 15:22:52 +02:00
TheASVigilante	a1ff4e1777	Implement OpenHardwareOpusDecoderEx and GetWorkBufferSizeEx Implements these 2 functions which were introduced in HOS 12.0.0. Fixes a crash in Xenoblade Chronicles 3.	2022-09-10 22:04:15 +01:00
lynxnb	34bd16426c	Fix quads index buffer conversion not accounting for first index Unindexed quad draws were broken when multiple draw calls were done on the same vertex buffer, with a non-zero `first` index. Indexed quad draws also suffered from the same issue, but it was never encountered in games. This commit fixes both cases by accounting for the `first` drawn index when generating conversion index buffers.	2022-09-04 12:42:33 +02:00
Billy Laws	9af5df4bae	Increase IPC pointer buffer size Some services report a pointer buffer size > 0x1000, so increase it so as not to cause issues when they're implemented.	2022-09-02 23:14:05 +01:00
Billy Laws	51f4e7662e	Add support for the TIPC protocol introduced in HOS 12.0.0 TIPC is a much lighter layer on top of the Horizon IPC system than CMIF and is used by SM in 12.0.0+. This implementation is slightly hacky since it doesn't really keep a separation between the underlying kernel IPC stuff and userspace like CMIF/TIPC, this should be fixed eventually, probably together with an IPC dispatch rewrite to avoid the mess of frozen maps. Tested with Hentai Uni, which now crashes needing 'ldr:ro'.	2022-09-02 23:13:23 +01:00
Billy Laws	a40d7c78ad	Always recreate oboe stream on error This is what's done in oboe examples to avoid spurious errors breaking audio entirely.	2022-08-31 21:26:14 +01:00
Billy Laws	8917ec9c88	Don't set framesPerCallback for main stream as per oboe guidance It's best to let oboe figure it out on its own	2022-08-31 21:26:14 +01:00
Billy Laws	b00008daf5	Fix identifier release check in AudioTrack::Stop	2022-08-31 21:26:14 +01:00
Billy Laws	d9c8e62d1c	Don't warn on GetConfig IOCTL fails	2022-08-31 21:26:14 +01:00
Billy Laws	4aef24ba32	Implement NVGPU_GPU_IOCTL_GET_GPU_TIME in nvdrv	2022-08-31 21:26:14 +01:00
Billy Laws	5841799420	Fix decoding of IOCTLs with padding at the end	2022-08-31 21:26:14 +01:00
Billy Laws	82444f3b0a	Don't set push descriptor flag for desc sets This is redundant and against the spec since we no longer use push descriptors.	2022-08-31 21:26:14 +01:00
PixelyIon	94fdd6aa43	Fix HID touch points not being removed from screen Tapping anything in titles that supported touch (such as Puyo Puyo Tetris or Sonic Mania) wouldn't work due to the first touch point never being removed from the screen, it is supposed to be removed after a 3 frame delay from the touch ending. This commit introduces a mechanism to "time-out" touch points which counts down during the shared memory updates and removes them from the screen after a specified timeout duration.	2022-08-31 23:43:02 +05:30
PixelyIon	70ad4498a2	Write HID LIFO entries at fixed intervals Certain titles depend on HID LIFO entries being written out at a fixed frequency rather than on actual state change, not doing this can lead to applications freezing till the LIFO is filled up to maximum size, this behavior is seen in Super Mario Odyssey. In other cases such as Metroid Dread, the game can run into race conditions that would lead to crashes, these were worked around by smashing a button during loading prior. This commit introduces a thread which sleeps and wakes up occasionally to write LIFO entries into HID shared memory at the desired frequencies. This alleviates any issues as it fills up the LIFO instantly and correctly emulates HID Shared Memory behavior expected by the guest. Co-authored-by: Narr the Reg <juangerman-13@hotmail.com>	2022-08-31 22:49:36 +05:30
PixelyIon	7966bfa9f6	Fix PI update `KThread::waiterMutex` deadlock It was determined that deadlocks inside `KThread::UpdatePriorityInheritance` would not only arise from the first level of locking with `waitingOn->waiterMutex` but also the second level of locking with `nextThread->waiterMutex` which has now also been fixed to fallback when facing contention.	2022-08-28 20:15:08 +05:30
Abandoned Cart	86f6fc510e	Remove printing result message from author There is no purpose in printing the same result twice, so any duplicate messages should instead allow the field to be truncated.	2022-08-27 18:54:27 +05:30
Abandoned Cart	1013857fc4	Refactor subtitle as author to remove subtitle Subtitle is no longer used, so instances have been rerouted to author. DataItem was also updated to reflect the removal of a subtitle.	2022-08-27 18:54:27 +05:30
Abandoned Cart	04cae942ea	Follow typical per-file detail formatting Format the details in the expected format of individual files (instead of a complete game) and move the code to match the updated placement.	2022-08-27 18:54:27 +05:30
lynxnb	5d6eaee301	Correctly save/restore ROM version to/from game entry cache PR #1758 introduced a bug where the game list would be entirely loaded every time the app was opened. This commit addresses that issue, which was caused by the `version` member of the cached game list being serialized to file (although incorrectly) but never actually read back when deserializing.	2022-08-21 20:24:36 +02:00
lynxnb	cfa5f0e030	Fix OSC alpha not changing on button press	2022-08-20 17:00:40 +02:00
KikiManjaro	8fb4e62c28	Add version information about rom Review: Co-authored-by: Niccolò Betto <niccolo.betto@gmail.com>	2022-08-20 13:48:07 +02:00
KikiManjaro	3407f6d530	Add OSC opacity adjustment	2022-08-20 13:46:17 +02:00
lynxnb	d129fb09cd	Android Manifest cleanup * Remove `package` from manifest and from activity prefixes, gradle `namespace` will be used instead * Removed deprecated `android.support.PARENT_ACTIVITY` metadata entries * Make `MainActivity` and `SettingsActivity` launched in `singleTop` mode to avoid unnecessary activity restarts while navigating the app	2022-08-17 15:33:53 +02:00
lynxnb	d128856c7d	Update target SDK version to Android 13	2022-08-17 12:29:46 +02:00
lynxnb	39f398f76b	Update Kotlin (1.7.10), NDK (25.0.8775105), AGP (7.2.2) and Kotlin deps	2022-08-17 12:28:31 +02:00
lynxnb	e9618d9e2c	Use pragma pack directions for tightly packing structs containing `u128` Using `__attribute__((packed))` doesn't work in new NDKs when a struct contains 128-bit integer members, likely because of a ndk/compiler bug. We now enclose the requiring structs in `#pragma pack` directives to tightly pack them.	2022-08-17 12:22:11 +02:00
lynxnb	c4bf92a49f	Fix Kotlin compilation errors from incorrect overloading of null-safe types	2022-08-17 12:16:26 +02:00
Billy Laws	bf491f71f9	Simplify blit helper shader vertex order	2022-08-10 15:43:16 +01:00
Billy Laws	c32bec071c	Adjust blit src{X,Y} to account for centred sampling before calling into helper shader Since the blit engine itself samples from pixel corners and the helper shader from pixel centres, the src coordinates need to be adjusted to avoid the helper shader wrapping round on the final column.	2022-08-10 15:39:37 +01:00
Billy Laws	08f36aac33	Enable hades vertex position input workaround for Adreno Caused crashes in any games using geometry shaders as by default hades uses the position builtin directly.	2022-08-08 18:09:00 +01:00
Billy Laws	04e7b684d2	Enable vertexPipelineStoresAndAtomics, fragmentStoresAndAtomics and shaderStorageImageWriteWithoutFormat Vulkan features Used by Xenoblade Chronicles DE	2022-08-08 17:43:18 +01:00
Billy Laws	390558c802	Add partial support for legacy attribute conversion We previously missed the hades pass for attribute conversion leading to crashes when games would attempt to use such an attribute. The hades pass for this isn't a proper fix however as it modifies the IR directly and will break if any of the previous stages in the pipeline change. Enable it to allow for games using them to at least have a chance at working. In the long term the pass will be reworked on the hades side to avoid modifying the IR in a way that can't be undone.	2022-08-08 17:43:18 +01:00
Billy Laws	540437b547	Fixup index buffer view caching We forgot to set the view size, which would end up forcing a view to be recreated with every call	2022-08-08 17:43:18 +01:00
Billy Laws	c966cd3b26	Prevent runtimeInfo vertex state from leaking into wrong shaders This vertex state must only be present for the last pipeline stage that touches vertices, if it is present for other stages it could result in incorrect behaviour like performing TFB in the fragment shader or flipping device coordinates twice.	2022-08-08 17:43:13 +01:00
Billy Laws	c52d3195cf	Ensure shader stage enable state matches pipeline stage enable state As the code was before, if we had a shader that was disabled and enabled again after without being invalidated the pipeline stage would stay disabled and break rendering.	2022-08-08 17:40:35 +01:00
Billy Laws	b1c669ba14	Always keep the VertexB shader stage enabled HW doesn't allow disabling the VertexB stage, enforce this in code.	2022-08-08 17:40:35 +01:00
lynxnb	d5174175d1	Implement indexed quads support We previously only supported non-indexed quads. Support for this is implemented by converting the index buffer at record time and pushing the result into the megabuffer, which is then used as the index buffer in the final draw command.	2022-08-08 17:40:35 +01:00
lynxnb	e6741642ba	Split out megabuffer allocation from pushing data The `Allocate` method allocates the given amount of space in a megabuffer chunk, returning a descriptor of the allocated region. This is useful for situations where you want to write directly to the megabuffer, avoiding the need for an intermediary buffer.	2022-08-08 17:40:35 +01:00
Billy Laws	cdc6a4628a	Enable VK uint8 indices feature when supported	2022-08-08 17:40:35 +01:00
Billy Laws	dccc86ea97	Implement transform feedback with VK_EXT_transform_feedback Tested to work in Xenoblade Chronicles DE, the code handles both hades varying input and buffer setup.	2022-08-08 17:40:35 +01:00
Billy Laws	06053d3caf	Rewrite Fermi 2D engine to use the blit helper shader Entirely rewrites the engine and interconnect code to take advantage of the subpixel and OOB blit support offered by the blit helper shader. The interconnect code is also cleaned up significantly with the 'context' naming being dropped due to potential conflicts with the 'context' from context lock	2022-08-08 17:40:35 +01:00
Billy Laws	395f665a13	Implement a system for helper shaders together with a simple blit shader It is desirable for us to use a shader for blits to allow easily emulating out of bounds blits and blits between different swizzled colour formats. The helper shader infrastructure is designed to be generic so it can be reused by any other helper shaders that we may need in the future.	2022-08-08 17:40:35 +01:00
Billy Laws	1da1698f90	Disable unused Vulkan HPP setters and smart handles	2022-08-08 14:57:44 +01:00
Billy Laws	f4e58a9238	Remove redundant synchost creating a new buffer	2022-08-08 14:57:44 +01:00
Billy Laws	11a8feb037	Correct nvdrv DMA copy class ID Was wrongly copy and pasted.	2022-08-08 14:57:44 +01:00
Billy Laws	13e7b54c61	Ensure failed IOCTLs are logged as a warning log	2022-08-08 14:57:44 +01:00
Billy Laws	eeb86a4f8a	Calculate renderArea from min(attachments.dimensions...) Vulkan doesn't support a renderArea larger than that of the smallest attachment	2022-08-08 14:57:44 +01:00
Billy Laws	9ea658d0ed	Don't throw on unsupported TIC formats These sometimes spuriously occur in games during transitions, to avoid crashing during them just use the null texture if they occur and log an error	2022-08-08 14:57:44 +01:00
Billy Laws	856818c8eb	Emulate the 'None' mipfilter by adjusting LOD Borrowed this technique from yuzu since Vulkan has no direct equivalent	2022-08-08 14:57:44 +01:00
Billy Laws	9d50b6d0f7	Avoid locking presentation mutex in GetTransformHint This caused slowdown in Pokemon as it was being called every frame	2022-08-08 14:57:44 +01:00
Billy Laws	460e6c9c84	Use raw pointers to hold constant buffer views The constant destruction and creation of `BufferView`s in cbuf-heavy games showed up as a large chunk of the profiler. Fix this by taking advantage of the fact that constant buffer `BufferView`s are never deleted and always kept around in the cache to just return a pointer to them in the cache.	2022-08-08 14:54:57 +01:00
Billy Laws	6b2e84712b	Avoid race in nvdrv debug prints Looking up the device name without locking it could race with map insertions or deletions, so lock it to avoid that	2022-08-08 13:24:23 +01:00
Billy Laws	683cd594ad	Use a linear allocator for most per-execution GPU allocations Currently we heavily thrash the heap each draw, with malloc/free taking up about 10% of GPFIFOs execution time. Using a linear allocator for the main offenders of buffer usage callbacks and index/vertex state helps to reduce this to about 4%	2022-08-08 13:24:21 +01:00
Billy Laws	70eec5a414	Store delegate attached state within the delegate itself Avoids a costly map lookup for every AttachBuffer call, this was a serious bottleneck in SMO	2022-08-08 13:23:26 +01:00
Billy Laws	0268e1d5a0	Force a submit before any i2m engine writes We need traps to be in place so we don't end up overwriting a resource that's being actively used by the current context without setting it to dirty	2022-08-08 13:22:37 +01:00
Billy Laws	cb0b132486	Allow supplying push constants to GetPipeline	2022-08-08 13:22:37 +01:00
Billy Laws	1c8863ec3b	Use const references for holding pipeline state in pipeline cache Allows passing in constexpr structs to state directly	2022-08-08 13:22:37 +01:00
Billy Laws	b6b04fa6c5	Use small_vector for VMM TranslateRange results This was the source of a lot of heap allocs, moving to small_vector helps to avoid most of them	2022-08-08 13:22:37 +01:00
Billy Laws	1fe6d92970	Wait on Swapchain Image copy to complete Certain titles can display frames out of order due to not waiting on the copy from the final RT to the swapchain image to occur. Although `PresentFrame` does wait on the syncpoint, that isn't enough to ensure the source texture is up-to-date due to us signalling syncpoints early. By waiting on the swapchain texture after the copy is submitted, we now implicitly wait on the source texture's cycle to be signalled thus waiting on the frame to be done which fixes the issue.	2022-08-07 03:12:27 +05:30
PixelyIon	5b7572a8b3	Introduce chunked `MegaBuffer` allocation After the introduction of workahead a system to hold a single large megabuffer per submission was implemented, this worked fine for most cases however when many submissions were in flight at the same time memory usage would increase dramatically due to the amount of megabuffers needed. Since only one megabuffer was allowed per execution, it forced the buffer to be fairly large in order to accommodate the upper-bound, even further increasing memory usage. This commit implements a system to fix the memory usage issue described above by allowing multiple megabuffers to be allocated per execution, as well as reuse across executions. Allocations now go through a global allocator object which chooses which chunk to allocate into on a per-allocation scale, if all are in use by the GPU another chunk will be allocated, that can then be reused for future allocations too. This reduces Hollow Knight megabuffer memory usage by a factor of 4 and SMO by even more.	2022-08-07 03:12:27 +05:30
Billy Laws	99b5fc35c6	Change `SegmentTable` semantics to respect unset entries Access to unset entries is now clearly defined as returning a 0'd out value, the prior behavior would be to optimize sets for border segments to use L2 atomicity when the specific segment had no L1 entries set. This would lead to any future lookups of offsets within the same L2 segment but a different L1 entry to incorrectly return an inaccurate value as the only prior guarantee was that lookups after setting a segment would return the same value as was set but lacked the guarantee for unset segments to also consistently return unset values. This could lead to issues in practical usages such as the `BufferManager` lookups returning the existence of a `Buffer` at a location falsely even though the segment was never set to the value, this was problematic as raw pointers were utilized and bound checks would lead to a segmentation fault. This commit fixes this issue by introducing this guarantee and refactoring the class accordingly, it also deletes the `Set` method for setting a single entry as the meaning is ambiguous and its functionality was more akin to the past guarantee and no longer makes sense. Co-authored-by: PixelyIon <pixelyion@protonmail.com>	2022-08-06 22:20:54 +05:30
PixelyIon	36b8d3c445	Account for `SegmentTable` insertions entirely within an L2 entry We would always write all L1 entries that correspond to an L2 entry, even if setting an input range ended before that. This would effectively reduce the atomicity of the segment table to that of the L2 range and lead to breaking API guarantees by returning entirely wrong segment values for a lookup covering a region that was overwritten.	2022-08-06 22:20:54 +05:30
PixelyIon	c72316d9f6	Rename `RangeTable` to `SegmentTable` It was determined that `RangeTable` was too ambiguous of a name as it could be interpreted to be holding ranges rather than looking them up, to avoid confusion the terminology has been changed from `range` to `segment`. As "segment table" is more clear in describing that it is a table comprised of descriptors regarding segments and it avoids any overlaps with terminology concerning "pages" which would be overly specific for this data structure or the ambiguous "ranges".	2022-08-06 22:20:54 +05:30
Billy Laws	5398eff045	Fix `KProcess::MutexUnlock` PI CAS The PI CAS in `MutexUnlock` ends up loading `basePriority` rather than `priority` which could lead to an infinite CAS loop when `basePriority` doesn't equal `priority` and the `highestPriorityThread`'s priority is lower than `basePriority`.	2022-08-06 22:20:54 +05:30
PixelyIon	850c0f4092	Make `Texture::SynchronizeGuest` Blocking It was determined that `Texture::SynchronizeGuest`'s `TextureBufferCopy` had races that were exposed by the introduction of the cycle waiter thread, the synchronization did not take place under a locked context so the texture could be mutated at any point in addition to the destructor not being run during `FenceCycle::Wait` due to `shouldDestroy` being `false`. This commit fixes the issue by making `SynchronizeGuest` entirely blocking as all usages of the function required blocking semantics regardless so it would be pointless to retain its async nature while solving any races that may arise from it being async. Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-08-06 22:20:54 +05:30
Billy Laws	77d15b02a3	Ensure backing continuity when recreating GPU dirty buffers Since we don't call `SynchronizeHost` on source buffers which are GPU dirty, their mirrors will be out of date. The backing contents of this source buffer's region in the new buffer will be incorrect. By copying from the backing directly, we can ensure that no writes are lost and that if the newly created buffer needs to turn GPU dirty during recreation no copies need to be done since the backing is as up to date as the mirror at a minimum.	2022-08-06 22:20:54 +05:30
Billy Laws	c1bf5a804a	Extend `stateMutex` scope inside `Buffer::SynchronizeHost` The code is much simpler to reason about when reading the code as it doesn't require evaluating all the potential edge cases of trap handlers in different states. It should be noted that this should not change behavior in any meaningful way, at most it can prevent a minor race where the protection could be upgraded after being downgraded by the signal handler leading to a redundant trap.	2022-08-06 22:20:54 +05:30
PixelyIon	c3cf79cb39	Rework `KThread::waiterMutex` Locking Two issues exist with locking of `KThread::waiterMutex`: * It was not always locked when accessing waiter members such as `waitThread`, `waitKey` and `waitTag` which would lead to a race that could end up in a deadlock or most notably a segfault inside `UpdatePriorityInheritance` * There could be a deadlock from `UpdatePriorityInheritance` locking `waiterMutex` of a thread and waiting to get the owner's `waiterMutex` while on another thread `MutexUnlock` holds the owner's `waiterMutex` and waits on locking the `waiterMutex` held by `UpdatePriorityInheritance` This commit fixes both issues by adding appropriate locking to all locations where waiter members are accessed in addition to adding a fallback mechanism inside `UpdatePriorityInheritance` that unlocks `waiterMutex` on contention to avoid a deadlock.	2022-08-06 22:20:54 +05:30
PixelyIon	68615703c1	Fix `KProcess`/`SetThreadPriority` PI CAS The condition for exiting the CAS loops is incorrect in several places which leads to additional loops, while this doesn't make the behavior incorrect it does lead to redundant iterations. Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-08-06 22:20:54 +05:30
PixelyIon	8fc3cc7a16	Rework Descriptor Set Allocation/Updates A substantial amount of time would be spent on creation/destruction of `VkDescriptorSet` which scales on titles doing a substantial amount of draws with bindings, this leads to poor performance on those titles as the frametime is dragged down by performing these tasks while they repeatedly create descriptor sets of the same layouts. This commit fixes it by pooling descriptor sets per-layout in a dynamically resizable pool and keeping them around rather than destroying them after usage which leads to the vast majority of cases not requiring a new descriptor set to even be created. It leads to significantly improved performance where it would otherwise be spent on redundant destruction/recreation or push descriptor updates which took a substantial amount of time themselves. Additionally, the `BaseDescriptorSizes` were not kept up to date with all of the descriptor types, it led to no crashes on Adreno/Mali as they were purely used for size calculations on either driver but has been corrected to avoid any future issues.	2022-08-06 22:20:54 +05:30
PixelyIon	e1a4325137	Introduce `FenceCycle` Waiter Thread A substantial amount of time is spent destroying dependencies for any threads waiting or polling `FenceCycle`s, this is not optimal as it blocks them from moving onto other tasks while destruction is a fundamentally async task and can be delayed. This commit solves this by introducing a thread that is dedicated to waiting on every `FenceCycle` then signalling and destroying all dependencies which entirely fixes the issue of destruction blocking on more important threads.	2022-08-06 22:20:54 +05:30
PixelyIon	5f8619f791	Optimize Buffer Lookups using Range Tables Buffer lookups are a fairly expensive operation: we currently spend `O(log n)` even on the simplest and most frequent case, a direct match, which may be insufficient for such a frequent operation. This commit optimizes that case to `O(1)` by utilizing a `RangeTable` at the cost of slightly higher insertion/deletion costs for setting ranges of values but these are minimal in frequency compared to lookups.	2022-08-06 22:20:54 +05:30
PixelyIon	578ae86cca	Implement Multi-Level Range Table A data structure that can represent the same value for a range of addresses (pages) is required for fast lookup in certain cases. This commit implements a near optimal data structure for mass insertion and O(1) lookup of range-based data, this is achieved using the host MMU and implementing multiple levels of atomicity for the ranges. It should be noted that the table is limited to two levels but can be extended to a variable amount of ranges in the future, it was determined that additional levels of ranges can be beneficial for performance depending on the specific use-case.	2022-08-06 22:20:54 +05:30
Billy Laws	38eab80ed8	Disable Vulkan Push Descriptors on Adreno Adreno drivers have certain errata which lead to Vulkan Push Descriptors being broken on them in certain cases, resulting in a descriptor set update being swallowed. This has been worked around by disabling push descriptors on Adreno drivers, this may lead to reduced performance on certain titles which frequently bind new descriptors.	2022-08-06 22:20:54 +05:30
Billy Laws	88fd491ed5	Submit after Maxwell3D Semaphore Release Any semaphore releases are implicit synchronization events that can be utilized by the guest to pick up that the GPU has executed till a certain point and therefore we must submit all prior work accordingly.	2022-08-06 22:20:54 +05:30
Billy Laws	b77da1182f	Don't flush submission on DMA Copies DMA copies utilized `SubmitWithFlush` instead of `Submit`, this is not required and incurs significant additional synchronization penalties which will no longer be required.	2022-08-06 22:20:54 +05:30
PixelyIon	0992fde028	Don't block on surface creation in `GetTransformHint` We want to avoid blocking on surface creation unless necessary, this commit doesn't wait on the creation of the surface as it default initializes the value which'll generally be `Identity` or the transformation of the previous surface if it was lost. Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-08-06 22:20:54 +05:30
Billy Laws	35133381b6	Fix V-Sync `KEvent` construction order The V-Sync `KEvent` would be used by the presentation thread prior to construction leading to dereferencing an invalid value, this has been fixed by changing the order of construction to move the construction of the presentation thread after the V-Sync event.	2022-08-06 22:20:54 +05:30
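The ordering issue in the V-Sync fix above follows from a general C++ rule: non-static members are constructed in declaration order, regardless of the order written in the constructor's initializer list. The following is a minimal sketch of that rule only; `VsyncEvent`, `PresentationThread` and `Engine` are hypothetical stand-ins that record their construction order, not Skyline's actual classes.

```c++
#include <cassert>
#include <string>

// Hypothetical stand-in for the V-Sync event, records its construction
struct VsyncEvent {
    explicit VsyncEvent(std::string &order) { order += "E"; }
};

// Hypothetical stand-in for the presentation thread, which would start using the event immediately
struct PresentationThread {
    explicit PresentationThread(std::string &order) { order += "T"; }
};

struct Engine {
    // Members are constructed in declaration order, so the event must be declared before the thread
    VsyncEvent vsyncEvent;
    PresentationThread thread;

    explicit Engine(std::string &order) : vsyncEvent{order}, thread{order} {}
};
```

Constructing an `Engine` with an empty string leaves it as `"ET"`, showing the event exists before the thread stand-in is created.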
PixelyIon	ffad246d67	Split NCE Trap page-out functionality from `TrapRegions` The `TrapRegions` function performed a page-out on any regions that were trapped as read-only, this wasn't optimal as it would tie them both into the same operation while Buffers/Textures need to protect, then synchronize, then page out. The trap was being moved to after the synchronize to get around this limitation but that can cause a potential race due to certain writes being done after the synchronization but prior to the trap which would be lost. This commit fixes these issues by splitting paging out into `PageOutRegions` which can be called after `TrapRegions` by any API users. Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-08-06 22:20:54 +05:30
PixelyIon	da464d84bc	Consolidate `NCE::TrapRegions` functionality into `CreateTrap` `NCE::TrapRegions` was a bit too overloaded as a method as it implicitly trapped which was unnecessary in all current usage cases, this has now been made more explicit by consolidating the functionality into `NCE::CreateTrap` which handles just creation of the trap and nothing past that, `RetrapRegions` has been renamed to `TrapRegions` and handles all trapping now. Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-08-06 22:20:54 +05:30
PixelyIon	8a62f8d37b	Rework Texture Synchronization API + Locking Similar to `Buffer`s, `Texture`s suffered from suboptimal behavior due to using atomics for `DirtyState` and had certain bugs with replacement of the variable at times where it shouldn't be possible which have now been fixed by moving to using a mutex instead of atomics. This commit also updates the API to more closely match what is expected of it now and removes any functions that weren't utilized such as `SynchronizeGuestWithBuffer`.	2022-08-06 22:20:54 +05:30
Billy Laws	04bcd7e580	Rework Buffer `DirtyState` with `BackingImmutability` Having a single variable denoting the exact state of a buffer and the operations that could be performed on it was found to be too restrictive, so it's now been expanded with an additional `BackingImmutability` variable. With two variables we can no longer use atomics without significant additional complexity, so all accesses to the state are now mediated through `stateMutex`, a mutex specifically designed for tracking the state. While designing the system around `stateMutex` it was determined to be more efficient than atomics, as it blocks far less than the regular atomic fallback of locking the main resource lock, which is generally held for significantly longer. Co-authored-by: PixelyIon <pixelyion@protonmail.com>	2022-08-06 22:20:54 +05:30
PixelyIon	1af781c0a5	Add Perfetto Tracing to NCE Trapping API As a performance sensitive part of code, the NCE Trapping API benefits from having tracing and it helps us better determine where guest code is spending its time for more targeted optimizations.	2022-08-06 22:20:54 +05:30
PixelyIon	9d294b9ccc	Use `weak_ptr` for `TrapHandler` Callbacks The lifetime of the `this` pointer in the trap callbacks could be invalid as the lifetime of the underlying `Buffer`/`Texture` object wasn't guaranteed, this commit fixes that by passing a `weak_ptr` of the objects into the callbacks which is locked during the callbacks and ensures that a destroyed object isn't accessed. Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-08-06 22:20:54 +05:30
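The `weak_ptr` pattern described above can be sketched as follows. This is only an illustration under assumptions: `Resource` and `MakeTrapCallback` are hypothetical names, and the real trap callbacks have a different signature. The callback holds a `std::weak_ptr`, promotes it with `lock()` for the duration of the call, and bails out if the object has already been destroyed.

```c++
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for a GPU resource such as `Buffer` or `Texture`
struct Resource {
    int trapHits{0};
};

// Builds a callback that holds only a weak reference to the resource (hypothetical helper)
inline std::function<bool()> MakeTrapCallback(const std::shared_ptr<Resource> &resource) {
    std::weak_ptr<Resource> weak{resource};
    return [weak]() {
        auto locked = weak.lock(); // Null if the resource has already been destroyed
        if (!locked)
            return false; // The object no longer exists, so it must not be touched
        locked->trapHits++; // `locked` keeps the object alive until the callback returns
        return true;
    };
}
```

Capturing `this` (or a raw pointer) instead would leave the callback with a dangling pointer once the last `shared_ptr` owner goes away.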
Billy Laws	96d8676d5b	Fix `SubmitWithFlush` not updating `MegaBuffer` cycle The `CommandExecutor`'s `MegaBuffer` was not being updated with the latest `FenceCycle` on being flushed in `SubmitWithFlush`, this led to the megabuffer being overwritten prior to its GPU-side usage being complete. This commit fixes that by replacing the cycle with the latest cycle and prevents any races that occurred prior.	2022-08-06 22:20:54 +05:30
PixelyIon	3e9d84b0c3	Split `FindOrCreate` functionality across `BufferManager` `FindOrCreate` ended up being a monolithic function with poor readability, this commit addresses those concerns by refactoring the function to split it up into multiple member functions of `BufferManager`. While some of these member functions may only have a single call-site, they are important for logically categorizing tasks into individual functions. The end result is far neater logic which is far more readable and slightly better optimized by virtue of being abstracted better.	2022-08-06 22:20:54 +05:30
PixelyIon	d2a34b5f7a	Implement `ContextLock` Move-assignment operator In certain cases the move constructor may not suffice and the move assignment operator is required, this commit implements that and moves to using a pointer for storing the `resource` member rather than a reference as its semantics matched what we desired more and allowed for assignment of the `resource`.	2022-08-06 22:20:54 +05:30
PixelyIon	38d3ff4300	Fix `BufferManager::FindOrCreate` Recreation Bugs It was determined that `FindOrCreate` has several issues which this commit fixes: * It wouldn't correctly handle locking of the newly created `Buffer` as the constructor would set up traps prior to being able to lock it which could lead to UB * It wouldn't propagate the `usedByContext`/`everHadInlineUpdate` flags correctly * It wouldn't correctly set the `dirtyState` of the buffer according to that of its source buffers	2022-08-06 22:20:54 +05:30
Billy Laws	d1a682eace	Fix `setDirty` behavior in `Buffer::SynchronizeGuest` The condition for `setDirty` in the dirty state CAS was inverted from what it should've been resulting in synchronizing incorrectly, this commit fixes the condition to correct synchronization.	2022-08-06 22:20:54 +05:30
Billy Laws	00d434efdc	Remove `Texture::CopyFrom` format check The formats of the textures involved in a texture copy were checked for equality, this broke certain copies as the presentation engine would invoke copies between textures of different yet compatible formats. Co-authored-by: PixelyIon <pixelyion@protonmail.com>	2022-08-06 22:20:54 +05:30
Billy Laws	58174f255f	Improve `ContextLock` semantics `ContextLock` had suboptimal semantics in the form of direct access to the `isFirst` member which wasn't clearly defined, it's now been broken up into function calls `IsFirstUsage` and `OwnsLock` with explicit move semantics and a function for releasing the lock. Co-authored-by: PixelyIon <pixelyion@protonmail.com>	2022-08-06 22:20:54 +05:30
Billy Laws	561103d3da	Submit GPFIFO work prior to `CircularQueue` waiting The position at which we call submit is a significant factor in performance and we did so at the end of PBs (PushBuffers), this isn't optimal as there could be multiple PBs queued up that would benefit from being in the same submission. We now delay the submission of the workload till we run out of PBs.	2022-08-06 22:20:54 +05:30
PixelyIon	3ac5ed8c06	Attach coalesced `Buffer` if any source `Buffer` is attached A buffer that's attached to a context could be coalesced into a larger buffer which isn't attached, this would break as it wouldn't keep the buffer alive till the end of the associated context. To fix this if any source buffers are attached then the resulting coalesced buffer is also attached now.	2022-08-06 22:20:54 +05:30
PixelyIon	284ac53d88	Fix KThread Priority Inheritance CAS The CAS condition for KThread PI was inverted, which led to entirely incorrect behavior that might still work in the vast majority of cases but would otherwise be significantly inaccurate.	2022-08-06 22:20:54 +05:30
PixelyIon	45cb8388cc	Fix NCE Trap API Lock Callback The lock callback would `continue` which would end up skipping over the current item as it applied to the inner loop rather than the outer loop as intended. This has now been fixed by using `break` and a check instead.	2022-08-06 22:20:54 +05:30
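The loop-control pitfall behind the lock callback fix above is general to C++: a `continue` inside a nested loop only advances the innermost loop. Below is a minimal sketch of the `break` plus check pattern, using a hypothetical model in which each group is a list of lock attempts and `false` means a lock could not be taken; it is not the actual NCE trap code.

```c++
#include <cassert>
#include <vector>

// Counts the groups that were processed, skipping any group with a failed lock attempt
// (hypothetical model, `false` meaning the lock couldn't be taken)
inline int CountProcessedGroups(const std::vector<std::vector<bool>> &groups) {
    int processed{};
    for (const auto &group : groups) {
        bool skip{false};
        for (bool locked : group) {
            if (!locked) {
                skip = true;
                break; // A `continue` here would only move on to the next lock attempt
            }
        }
        if (skip)
            continue; // This `continue` applies to the outer loop as intended
        processed++;
    }
    return processed;
}
```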
PixelyIon	745d809e07	Fix `Buffer::SynchronizeGuest` Non-Blocking Behavior The buffer's non-blocking behavior could lead to an invalid state where the dirty state doesn't adequately represent the buffer's true state, the check has now been moved inside the CAS loop as its behavior changes depending on the dirty state. In addition, `SynchronizeGuest` returns a boolean denoting if the synchronization was successful now to make code flows depending on non-blocking synchronization cleaner.	2022-08-06 22:20:54 +05:30
PixelyIon	c1f2445772	Set state to `CpuDirty` directly in `SynchronizeGuest` `SynchronizeGuest` could only set the dirty state to `Clean` which was redundant since calls to it from inside the write trap handler would set it to `CpuDirty` directly after, this fixes that by doing it inside the function when necessary.	2022-08-06 22:20:54 +05:30
PixelyIon	4f6a67af36	Fix `Texture` Trap Data Race The trap callbacks did not wait on the `Texture` to complete synchronization to the guest, this resulted in races where the contents written to the texture would be overwritten by the synced content. This commit fixes that by waiting on the fences at the end of the trap callback.	2022-08-06 22:20:54 +05:30
PixelyIon	cb7c3602e7	Attach `TextureView` to `FenceCycle` The lifetime of `TextureView` objects wasn't correctly managed as they weren't being attached to the `FenceCycle` in `AttachTexture`, this led to them getting deleted and causing all sorts of UB.	2022-08-06 22:20:54 +05:30
PixelyIon	ffaefc82d3	Call all flush callbacks prior to `CommandExecutor` submission The flush callbacks inside `CommandExecutor` weren't being called prior to submission as they should've been, this fixes that by calling them. It additionally removes the requirement to manually flush Maxwell3D at the end of `ChannelGpfifo` pushbuffers as it's a flush callback and will automatically be called by `Submit`. Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-08-06 22:20:54 +05:30
PixelyIon	e65707cd9d	Handle `CommandExecutor` submission at end of `ChannelGpfifo` PB Any work that was done in a `ChannelGpfifo` pushbuffer needs to be submitted at the end of it, otherwise the work might incorrectly be deferred till the next submission. This commit fixes it by calling `CommandExecutor::Submit` at the end of a pushbuffer, submitting any buffers that would've been left over. Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-08-06 22:20:54 +05:30
PixelyIon	7b209c54a2	Only reallocate `MegaBuffer` on usage Certain submissions might not utilize megabuffering but reserve a `MegaBuffer` regardless, this is not optimal since it can inflate the allocations and waste memory. This commit addresses the issue by eliding the allocation when the current submission doesn't utilize it.	2022-08-06 22:20:54 +05:30
PixelyIon	2366f81443	Fix `Buffer::PollFence` incorrectly handling null-`FenceCycle` If a `FenceCycle` isn't attached then `PollFence` returned `false` while it should return if the buffer has any concurrent GPU usages in flight, this has now been fixed by returning `true` in those cases.	2022-08-06 22:20:54 +05:30
PixelyIon	34e1e39d1c	Always reset all attached resources on `Submit` Certain resources can be attached to an empty `Submit` with no nodes, this can cause it to become a false dependency and not be removed till the next non-empty submission. This has now been fixed by doing a reset regardless of if any nodes exist.	2022-08-06 22:20:54 +05:30
PixelyIon	47db8e8cbc	Fix GPU inline copy callback for `Buffer::Write` The GPU inline copy callback was broken for `Buffer::Write` as it wasn't always called when it needed to be and didn't handle attaching of the buffer to the executor which would cause it to be unlocked. This commit addresses both of these issues, it introduces an `AttachLockedBuffer` method to attach an already locked buffer to the executor.	2022-08-06 22:20:54 +05:30
PixelyIon	2636a37b31	Introduce alternative FPS measurement for disabled frame throttling The FPS is implicitly bound to the refresh rate due to the timestamp being that of the presentation time, this leads to a misleading FPS figure for disabled frame throttling. It has now been fixed by using the frame submission time rather than the presentation time when frame throttling is disabled and to make this more apparent the color of the OSD FPS has been changed.	2022-08-06 22:20:54 +05:30
PixelyIon	0f56d01e58	Fix `Packed` format component ordering in `IsAdrenoAliasCompatible` All `Packed` formats have their components stored in the opposite ordering to the label, this was not followed for `IsAdrenoAliasCompatible` prior and the ordering has now been flipped.	2022-08-06 22:18:42 +05:30
PixelyIon	3ca56ef578	Fix NCE Trapping API Deadlock A deadlock was caused by holding `trapMutex` while waiting on the lock of a resource inside a callback while another thread holding the resource's mutex waits on `trapMutex`. This has been fixed by no longer allowing blocking locks inside the callbacks and introducing a separate callback for locking the resource which is done after unlocking the `trapMutex` which can then be locked by any contending threads.	2022-08-06 22:18:42 +05:30
PixelyIon	a6599c30b4	Correct `IntervalMap` insertion `end` calculation The `end` pointer for `interval` was incorrectly calculated as `interval.data() + interval.size_bytes()` which would be incorrect when the interval span type is not `u8` as the pointer derived from `interval.data()` would be a pointer to the span type rather than a byte pointer and be subject to arithmetic of that object's size rather than in terms of a byte.	2022-08-06 22:18:42 +05:30
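The pointer arithmetic issue in the `IntervalMap` fix above can be illustrated with a small sketch. `Span` here is a hypothetical minimal stand-in for a typed span (not Skyline's `span` class), and the helper names are made up for illustration. Adding `n` to a `T *` advances by `n * sizeof(T)` bytes, so adding a byte count to a typed pointer overshoots unless the pointer is first cast to a byte pointer.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>

// Minimal hypothetical stand-in for a typed span
template<typename T>
struct Span {
    T *ptr;
    std::size_t count;

    T *data() const { return ptr; }
    std::size_t size() const { return count; }
    std::size_t size_bytes() const { return count * sizeof(T); }
};

// Byte offset that `interval.data() + interval.size_bytes()` would land on,
// as typed pointer arithmetic scales the added count by sizeof(T)
template<typename T>
std::size_t WrongEndOffset(Span<T> interval) {
    return interval.size_bytes() * sizeof(T);
}

// Correct end: cast to a byte pointer before adding a byte count
template<typename T>
std::uint8_t *EndPointer(Span<T> interval) {
    return reinterpret_cast<std::uint8_t *>(interval.data()) + interval.size_bytes();
}
```

For a span of four `u32` values the interval is 16 bytes long, while the incorrect calculation would point 64 bytes past the start.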
PixelyIon	b0910e7b1a	Avoid locking `Texture`/`Buffer` in trap handler We generally don't need to lock the `Texture`/`Buffer` in the trap handler, this is particularly problematic now as we hold the lock for the duration of a submission of any workloads. This leads to a large amount of contention for the lock and stalling in the signal handler when the resource may be `Clean` and can simply be switched over to `CpuDirty` without locking and utilizing atomics which is what this commit addresses.	2022-08-06 22:18:42 +05:30
PixelyIon	a60d6ec58f	Replace host immutability `FenceCycle` with GPU usage tracking We utilized a `FenceCycle` to keep track of whether the buffer was mutable and introduced another cycle to track GPU-side requirements, only on fulfillment of which could the buffer be utilized on the host, but due to the recent change in behavior this system ended up being suboptimal. This commit replaces the cycle with a boolean tracking if there are any usages of the resource on the GPU within the current context that may prevent it from being mutated on the CPU. The fence of the context is simply attached to the buffer based on this, which is allowed as the new behavior of buffer fences matches all the requirements for this.	2022-08-06 22:18:42 +05:30
PixelyIon	217d484cba	Abstract `TextureView`/`BufferDelegate` locking into `LockableSharedPtr` An atomic transactional loop was performed on the backing `std::shared_ptr` inside `BufferView`/`TextureView`'s `lock`/`LockWithTag`/`try_lock` functions, these locks utilized `std::atomic_load` for atomically loading the value from the `shared_ptr` recursively till it was the same value pre/post-locking. This commit abstracts the locking functionality of `TextureView`/`BufferDelegate` into `LockableSharedPtr` to avoid code duplication and removes the usage of `std::atomic_load` in either case as it is not necessary due to the implicit memory barrier provided by locking a mutex.	2022-08-06 22:18:42 +05:30
PixelyIon	2d08886e4e	Utilize `TextureView` rather than `Texture` for presentation `PresentationEngine` and `GraphicBufferProducer` methods that utilized textures for the surface utilized the `Texture` type rather than the `TextureView` type, this was never correct but at the time of authoring this code `TextureView` was not finalized and in major flux, which is why it was not utilized and `Texture` was utilized instead. Now that it is far more stable, it has been replaced with `TextureView`.	2022-08-06 22:18:42 +05:30
PixelyIon	d7399e33c1	Avoid waiting on mutex in `PresentationEngine::Present` We want to block on the host thread during presentation while the host surface isn't present to implicitly pause the game, this can end up being fairly costly as it involves locking the `PresentationEngine` mutex which can lead to a lot of contention with the presentation thread. This fixes the issue by polling if there is a surface and only if there isn't then doing the wait as it isn't mandatory to wait always, we'll eventually run into the guest thread stalling.	2022-08-06 22:18:42 +05:30
PixelyIon	30475ffc43	Fix `queueBuffer` `GraphicBuffer` Compatibility Check Newer versions of the Deko3D homebrew were crashing due to this check and it was discovered that the check was incorrect and rather than comparing the `NvSurface` what had to be compared was the `GraphicBuffer` associated with the slot directly. Co-authored-by: lynxnb <niccolo.betto@gmail.com>	2022-08-06 22:18:42 +05:30
PixelyIon	c2685d5f5c	Fix consistency issues with external project copyright headers The copyright headers for external projects such as yuzu/Ryujinx were inconsistent in ordering, Skyline should always be the first item in the list. In addition, they didn't always link to the project's GitHub which has also been fixed.	2022-08-06 22:18:42 +05:30
PixelyIon	0ac5f4ce27	Lock `TextureManager`/`BufferManager` during submission Multiple threads concurrently accessing the `TextureManager`/`BufferManager` (Referred to as "resource managers") has a potential deadlock with a resource being locked while acquiring the resource manager lock while the thread owning it tries to acquire a lock on the resource resulting in a deadlock. This has been fixed with locking of resource manager now being externally handled which ensures it can be locked prior to locking any resources, `CommandExecutor` provides accessors for retrieving the resource manager which automatically handles locking aside doing so on attachment of resources.	2022-08-06 22:18:42 +05:30
PixelyIon	1239907ce8	Rework `Texture` & `Buffer` for `Context` and `FenceCycle` Chaining GPU resources have been designed with locking by fences in mind, fences were treated as implicit locks on a GPU, design paradigms such as `GraphicsContext` simply unlocking the texture mutex after attaching it which would set the fence cycle were considered fine prior but are suboptimal as it enforces that a `FenceCycle` effectively ensures exclusivity. This conflates the function of a mutex which is mutual exclusion and that of the fence which is to track GPU-side completion and led to tying if it was acceptable to use a GPU resource to GPU completion rather than simply if it was not currently being used by the CPU which is the function of the mutex. This rework fixes this with the groundwork that has been laid with previous commits, as `Context` semantics are utilized to move back to using mutexes for locking of resources and tracking the usage on the GPU in a cleaner way rather than arbitrary fence comparisons. This also leads to cleaning up a lot of methods that involved usage of fences that no longer require it and therefore can be entirely removed, further cleaning up the codebase. It also opens the door for future improvements such as the removal of `hostImmutableCycle` and replacing them with better solutions, the implementation of which is broken at the moment regardless. While moving to `Context`-based locking the question of multiple GPU workloads being in-flight while using overlapping resources came up which brought a fundamental limitation of `FenceCycle` to light which was that only one resource could be concurrently attached to a cycle and it could not adequately represent multi-cycle dependencies. `FenceCycle` chaining was designed to fix this inadequacy and allows for several different GPU workloads to be in-flight concurrently while utilizing the same resources as long as they can ensure GPU-GPU synchronization.	2022-08-06 22:18:42 +05:30
PixelyIon	07d45ee504	Introduce `FenceCycle` Chaining If we want to allow submitting multiple pieces of work to the GPU at once while still requiring CPU synchronization, we'll need to track all past fence cycles associated with a resource alongside the current one. To solve this the concept of chaining fences has been introduced, fences from past usages can be chained to the latest fence which'll then recursively forward operations to chained fences. This change also ends up mandating a move away from `FenceCycleDependency` as it would prevent fences from concurrently locking the same resources which is required for chaining to work as two fences being chained fundamentally means they're locking the same resources. The `AtomicForwardList` is therefore used as the new container.	2022-08-06 22:18:42 +05:30
PixelyIon	cf9e31c1eb	Implement Atomic Forward List An implementation of a singly-linked list with atomic access to allow for lock-free access semantics, it eliminates the requirement for a mutex which can introduce additional consideration for synchronization.	2022-08-06 22:18:42 +05:30
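A lock-free singly-linked list of the kind described above is commonly built around a compare-and-swap on the head pointer. The following is a minimal sketch of that general technique and not Skyline's actual `AtomicForwardList`; the class name, `PushFront` and `Iterate` are hypothetical. It assumes nodes are only removed in the destructor, when no other thread uses the list, which sidesteps ABA and reclamation concerns.

```c++
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical minimal sketch of a singly-linked list with lock-free insertion
template<typename T>
class AtomicForwardListSketch {
    struct Node {
        T value;
        Node *next;
    };

    std::atomic<Node *> head{nullptr};

  public:
    AtomicForwardListSketch() = default;

    AtomicForwardListSketch(const AtomicForwardListSketch &) = delete;

    AtomicForwardListSketch &operator=(const AtomicForwardListSketch &) = delete;

    // Assumes no other thread is using the list during destruction
    ~AtomicForwardListSketch() {
        Node *node = head.load();
        while (node) {
            Node *next = node->next;
            delete node;
            node = next;
        }
    }

    // Inserts at the front without taking a lock
    void PushFront(T value) {
        Node *node = new Node{std::move(value), head.load(std::memory_order_relaxed)};
        // On failure the current head is written into `node->next`, so the loop simply retries
        while (!head.compare_exchange_weak(node->next, node, std::memory_order_release, std::memory_order_relaxed)) {}
    }

    // Calls `function` on every element, starting from the most recently inserted one
    template<typename Function>
    void Iterate(Function function) const {
        for (Node *node = head.load(std::memory_order_acquire); node; node = node->next)
            function(node->value);
    }
};
```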
PixelyIon	6b9269b88e	Introduce `Context` semantics to GPU resource locking Resources on the GPU can be fairly convoluted and involve overlaps which can lead to the same GPU resources being utilized with different views, we previously utilized fences to lock resources to prevent concurrent access but this was overly harsh as it would block usage of resources till GPU completion of the commands associated with a resource. Fences have now been replaced with locks but locks run into the issue of being per-view and therefore to add a common object for tracking usage the concept of "tags" was introduced to track a single context so locks can be skipped if they're from the same context. This is important to prevent a deadlock when locking a resource which has been already locked from the current context with a different view.	2022-08-06 22:18:42 +05:30
PixelyIon	d913f29662	Only set `hasFragileUserData` for signed builds We do not want to allow saving of user data on unsigned builds as they don't have a stable signature and will not properly handle reinstallation. This can lead to a situation where the user has to resort to complex techniques to completely uninstall the package such as ADB or calling into PM directly.	2022-08-06 22:18:42 +05:30
PixelyIon	3139889a09	Implement Asynchronous Presentation We currently present all frames synchronously on the thread that calls into SurfaceFlinger functions, this is suboptimal as it doesn't match guest behavior which can lead to delaying the guest from working on the next frame. This commit makes queuing up frames non-blocking and handles all waiting for and then presenting of the frame on a dedicated thread.	2022-08-06 22:18:42 +05:30
PixelyIon	6e09dc5204	Fix thread name setting We utilize `pthread_setname_np` to set the thread names but didn't check for any errors which resulted in the `Skyline-Choreographer` and `ChannelCmdFifo` not having proper names as they exceeded the 16 character limit on thread names for the pthread function. This has now been fixed by changing the names and introducing error checking to invocations of this function.	2022-08-06 22:18:42 +05:30
PixelyIon	7a0cfb484c	Add NPOT `AlignUp` utility All our normal alignment functions are designed to only handle power of 2 (`POT`) multiples as we only align or check alignment to `POT` multiples but there are cases where this is not possible and we deal with `NPOT` multiples which is why this function is required.	2022-08-06 22:18:42 +05:30
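The difference between the two kinds of alignment mentioned above can be sketched as follows. The function names are hypothetical and the bodies are the standard formulas rather than a copy of Skyline's utilities: the power-of-two variant can use a bit mask, while the NPOT variant has to divide.

```c++
#include <cassert>
#include <cstddef>

// Rounds `value` up to the next multiple of `multiple`, which need not be a power of two
// Assumes `multiple` is non-zero and that `value + multiple - 1` doesn't overflow
constexpr std::size_t AlignUpNpot(std::size_t value, std::size_t multiple) {
    return ((value + multiple - 1) / multiple) * multiple;
}

// The usual mask-based variant for comparison, only valid when `multiple` is a power of two
constexpr std::size_t AlignUpPot(std::size_t value, std::size_t multiple) {
    return (value + multiple - 1) & ~(multiple - 1);
}

static_assert(AlignUpNpot(25, 12) == 36, "25 rounds up to the next multiple of 12");
static_assert(AlignUpPot(10, 16) == 16, "10 rounds up to the next multiple of 16");
```

Applying the mask-based variant to a multiple such as 12 would give wrong results, which is why a separate function is needed.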
PixelyIon	662ea532d8	Skip waiting on host GPU after command buffer submission We waited on the host GPU after `Execute` but this isn't optimal as it causes a major stall on the CPU which can lead to several adverse effects such as downclocking by the governor and losing the opportunity to work in parallel with the GPU. This has now been fixed by splitting `Execute`'s functionality into two functions: `Submit` and `SubmitWithFlush` which both execute all nodes and submit the resulting command buffer to the GPU but flushing will wait on the GPU to complete while the non-flush variant will not wait and work ahead of the GPU.	2022-08-06 22:18:42 +05:30
PixelyIon	5129d2ae78	Add move-assignment semantics to `ActiveCommandBuffer`/`MegaBuffer` We need move-assignment semantics to viably utilize these objects as class members, they cannot be replaced without move-assign (or copy-assign but that is undesirable here). This commit fixes that by introducing a move assignment operator to them while making the `slot` a pointer which has the necessary nullability semantics.	2022-08-06 22:18:42 +05:30
lynxnb	8991ccac65	Pass `ViewHolder` on bind to RecyclerView items instead of `ViewBinding` This change lets items get the updated position of their view holder in the adapter. Fixes an issue where the position of items was not updated after being removed from a `SelectableGenericAdapter`.	2022-08-06 22:00:19 +05:30
lynxnb	bb922100cb	Improve rendering for Right-To-Left layouts	2022-08-06 22:00:19 +05:30
lynxnb	240e7033d7	Support loading a user-selected driver during vulkan initialization	2022-08-06 22:00:19 +05:30
lynxnb	c812de48ea	Show an undo button after deleting a gpu driver After a driver has been deleted, a snackbar will be shown confirming the deletion, with a button to undo it.	2022-08-06 22:00:19 +05:30
lynxnb	59c60df993	Add `GPU Driver Configuration` preference This preference launches `GpuDriverActivity` for managing custom gpu drivers. When the device has an incompatible GPU, the preference will be disabled and greyed out.	2022-08-06 22:00:19 +05:30
lynxnb	48cf1263bc	Add a custom GPU driver configuration activity The activity adds the following functionalities: * Lists installed drivers * Allows the user to install new drivers, or remove installed ones * Allows the user to select the driver that will be used by the emulator	2022-08-06 22:00:19 +05:30
lynxnb	e9f609b923	Add a `gpuDriver` preference setting This setting represents the GPU driver selected by the user to be used by the emulator.	2022-08-06 22:00:19 +05:30
lynxnb	1815199d2b	Add utilities for reading and installing gpu driver packages	2022-08-06 22:00:19 +05:30
lynxnb	f3dd3e53c1	Miscellaneous imports cleanup in `preference` package	2022-08-06 22:00:19 +05:30
lynxnb	1dfea9ef6f	Create an `ItemDecorations` file for all `RecyclerView` item decorations All item decorations are now placed in one file so that any `RecyclerView` in the app can use the same ones.	2022-08-06 22:00:19 +05:30
lynxnb	a59f2baa3a	Add a `SelectableGenericAdapter` as subclass of `GenericAdapter` `SelectableGenericAdapter` extends `GenericAdapter` with support for marking one item as selected.	2022-08-06 22:00:19 +05:30
lynxnb	e93fdce845	Add support for removal of items from `GenericAdapter`	2022-08-06 22:00:19 +05:30
lynxnb	0d1c7965df	Add a `ZipUtils` class for unpacking zip files	2022-08-06 22:00:19 +05:30
lynxnb	b03f624191	Add `kotlinx.serialization-json` dependencies	2022-08-06 22:00:19 +05:30
Billy Laws	f52ea7bddb	Make deferred draw and constant buffer updates reentrant-safe At some point we will call Submit within draws or constant buffer updates, to avoid any infinite recursion mark draw/cbuf pending as false before performing any operation	2022-07-29 20:07:14 +01:00
Billy Laws	dbb684835f	Fix depthClampDisable register offset in Maxwell 3D	2022-07-29 20:07:14 +01:00
Billy Laws	7fd9d347e3	Use per-RT blend enable registers even when independent blend is disabled The common blend enable register seems to be used for something else. This is required for blending to work correctly in OpenGL games	2022-07-29 20:07:14 +01:00
Billy Laws	048c2fdd29	Fix Vulkan framebuffer dimensions calculations The framebuffer needs to be large enough to contain both the render area extent and offset	2022-07-29 20:07:14 +01:00
Billy Laws	0e1aa765fc	Prevent CNTVCT_EL0 reads from being optimised out by the compiler Without this the compiler will assume the read always produces the same value, causing issues when the register is used to time function execution	2022-07-29 20:07:14 +01:00
Billy Laws	1df98ba57f	Enable fwrapv for defined signed integer overflow behaviour Nintendo enables this for HOS so we should do the same to avoid any cases where it's relied on.	2022-07-29 20:07:14 +01:00
lynxnb	d183d14e2a	Make accesses to setting values thread-safe	2022-07-26 20:16:24 +05:30
lynxnb	30667a0899	Remove unused `Compact Logs` settings Since we don't have a log viewer in the app anymore, the setting was left unused and can be safely removed.	2022-07-26 20:16:24 +05:30
lynxnb	5aa2a4cd1c	Rename `SettingsValues` to `NativeSettings` The previous name was chosen as an afterthought and didn't clearly indicate what the purpose of the class is. We needed a separate, simple class without delegate members (like PreferenceSettings), so that its fields can be easily accessed via JNI to get settings values from native code.	2022-07-26 20:16:24 +05:30
lynxnb	f734c4d145	Make log level setting changes immediately active	2022-07-26 20:16:24 +05:30
lynxnb	bb4937121f	Remove settings from SharedPreference if they are of the wrong type	2022-07-26 20:16:24 +05:30
lynxnb	2840a126dd	Introduce `AndroidSettings` class and use inheritance The `Settings` class now has a pure virtual `Update` method, and uses inheritance over template specialization for platform-specific behavior override.	2022-07-26 20:16:24 +05:30
lynxnb	3905728447	Make every setting observable individually A `Setting` delegate class has been introduced, holding the raw value of the setting and adding support for registering callbacks to that setting. Callbacks will then be called when the value of that setting changes. As a result of this, raw setting values have been made accessible through pointer dereference semantics.	2022-07-26 20:16:24 +05:30
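An observable setting of the kind described above could look roughly like the sketch below. This is an illustration under assumptions and not the actual `Setting` class: the member names `AddCallback` and `Update` are hypothetical, and skipping callbacks when the new value equals the old one is a design choice of this sketch.

```c++
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch of a delegate that holds a raw value and notifies observers on change
template<typename T>
class Setting {
    T value;
    std::vector<std::function<void(const T &)>> callbacks;

  public:
    explicit Setting(T initial) : value{std::move(initial)} {}

    // Pointer dereference semantics for reading the raw value
    const T &operator*() const { return value; }

    const T *operator->() const { return &value; }

    // Registers a callback to be run whenever the value changes
    void AddCallback(std::function<void(const T &)> callback) {
        callbacks.push_back(std::move(callback));
    }

    // Stores the new value and notifies observers, only if it actually differs (assumption)
    void Update(T newValue) {
        if (newValue == value)
            return;
        value = std::move(newValue);
        for (const auto &callback : callbacks)
            callback(value);
    }
};
```

A caller would read the value with `*setting` and react to changes by registering a lambda through `AddCallback`.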
lynxnb	2d70be60d1	Remove `PugiXML` submodule `PugiXML` was only used for parsing the SharedPreferences settings file, not needed anymore.	2022-07-26 20:16:24 +05:30
lynxnb	5b4ca79dc8	Rename `Settings` Kotlin class to `PreferenceSettings` SharedPreferences will be partially swapped out in the future to support per-game settings. In the meantime, make it clear which class settings are coming from.	2022-07-26 20:16:24 +05:30
lynxnb	3b27540250	Rename `operationMode` setting to `isDocked`	2022-07-26 20:16:24 +05:30
lynxnb	69cf25b1a7	Initial support for updating settings during emulation + observing settings changes	2022-07-26 20:16:24 +05:30
lynxnb	c5dde5953a	Rework how settings are shared between Kotlin and native side Settings are now shared with the native side by passing an instance of Kotlin's `Settings` class. This way the C++ `Settings` class doesn't need to parse the SharedPreferences xml anymore.	2022-07-26 20:16:24 +05:30
lynxnb	4be8b4cf66	Add missing SPDX licence header	2022-07-26 20:16:24 +05:30
lynxnb	365ca66b1b	Make integer settings use IntegerListPreference Avoids unnecessary type casting of setting values and duplication in resource files.	2022-07-26 20:16:24 +05:30
lynxnb	cbc896c8f8	Fix `waitForFences` crash on Mali drivers Mali GPU drivers utilize the `ppoll()` syscall inside `waitForFences` which isn't correctly restarted after a signal, which we can receive at any time on a guest thread. This commit fixes that by recursively calling the function on failure till it succeeds or returns an unexpected error. Co-authored-by: PixelyIon <pixelyion@protonmail.com> Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-07-14 20:34:16 +02:00
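The retry approach described above can be sketched generically. Everything here is hypothetical: how the driver actually reports an interrupted wait is not stated in the commit, so `WaitResult` and `WaitUntilNotInterrupted` are illustrative names, and the sketch uses a loop in place of the recursive call the commit mentions.

```c++
#include <cassert>
#include <functional>

// Hypothetical result type, the real code deals with Vulkan results/exceptions instead
enum class WaitResult {
    Success,
    Interrupted, // The wait was cut short by a signal and should be restarted
    Error,       // Any other unexpected failure
};

// Repeats the wait until it either succeeds or fails with something other than an interruption
inline WaitResult WaitUntilNotInterrupted(const std::function<WaitResult()> &wait) {
    WaitResult result;
    do {
        result = wait();
    } while (result == WaitResult::Interrupted);
    return result;
}
```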
MCredstoner2004	942e22f275	Write `ApplicationErrorArg` `ErrorApplet`s to log These applets are used by applications to display a custom error message to the user. Both the error message and the detailed error message are printed to the error log. Co-authored-by: lynxnb <niccolo.betto@gmail.com>	2022-07-02 09:48:59 +05:30
MCredstoner2004	f9a0394577	Implement Software Keyboard applet This implements the non-inline version of the Software Keyboard (swkbd) applet, which games use to get text input from the user.	2022-07-01 15:19:53 -05:00
MCredstoner2004	a9ee06914d	Add ByteBufferSerializable This allows sending C-like structs between Kotlin and C++ without struct-specific code	2022-06-30 01:17:32 +05:30
Billy Laws	a0275418d6	Add a single-header linear allocator implementation This conforms to the C++ 'Allocator' named requirement allowing it to be used with any STL type and allows drastically reducing allocation times in cases which are suited for linear allocation.	2022-06-28 21:33:04 +01:00
Billy Laws	e816256220	Add blend, scissor, viewport and vertex state to shader hash These caused a ton of additional comparisons in Zelda Link's Awakening as many shaders would have the same hash.	2022-06-28 21:32:59 +01:00
lynxnb	e6cfdeb06a	Fix non-indexed quad draws Certain non-indexed quad draws would mistakenly take the indexed quad path because of the assumption that they would not have a bound index buffer. This resulted in a crash for most games using quads due to a faulty exception `Indexed quad conversion is not supported`, when in fact they were not using indexed quads. Co-authored-by: PixelyIon <pixelyion@protonmail.com> Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-06-23 10:57:11 +02:00
lynxnb	8fc3bc75f4	Allow providing an index type to calculate quad conversion buffer size	2022-06-23 00:15:44 +02:00
Billy Laws	7709dc8cf6	Rewrite buffer megabuffering to be per view and more efficient This commit implements several key optimisations in megabuffering that are all inherently interlinked. - Megabuffering is moved from per-buffer to per-view copies, this makes megabuffering possible for small views into larger underlying buffers, which is often the case with even the simplest of games. - Megabuffering is no longer the default option, it is only enabled for buffer views that have had inline GPU writes applied to them in the past as that is the only case where it is beneficial. In any other case the cost of copying, even with a 128KiB limit, can be significant. - With both of these changes, there is now the possibility of overlapping views where one uses megabuffering and one does not. In order to allow GPU inline writes to work consistently in such cases a system of 'host immutability' has been implemented: when a buffer is marked as host immutable for a given cycle, all writes to the buffer from that point to the point the cycle is signalled will be performed on the GPU, ensuring that the backing contents are correctly sequenced.	2022-06-11 17:05:39 +05:30
MCredstoner2004	2e356b8f0b	Use spans instead of ptr and size in kernel memory	2022-06-11 17:05:39 +05:30
PixelyIon	e3e92ce1d4	Handle unsigned builds on CI We don't always have access to CI secrets; for example, when a CI action is triggered by a PR from an external repository, it won't have access to secrets and the build won't be signed. While we likely will allow for this in the future as all workflows do have to be approved, it is still important to not crash when keys are unavailable and have a graceful fallback for those situations.	2022-06-11 17:05:39 +05:30
Billy Laws	8689886bbb	Update build tools to 33.0.0 to fix CI We're never switching to RC build tools again	2022-06-10 00:11:12 +01:00
Billy Laws	22039df301	Transition to std::unordered_set for buffer view tracking Has the same guarantees of pointer stability while also being significantly faster in cases where a buffer has thousands of views. This is the case in RE4 and this change leads to an almost 1000% performance improvement in that game.	2022-06-09 23:52:13 +01:00
Billy Laws	b75a06af1b	Support forcing 60Hz display on Xiaomi MIUI Uses an API found through RE since none of the AOSP APIs work, additionally the code for setting RR was consolidated to a single function that can be run after all display updates.	2022-06-09 19:29:18 +01:00
Billy Laws	42c365fe70	Automatically exclude llvm and boost submodules in gradle project There is a god in this world... his name is bylaws	2022-06-06 23:11:56 +01:00
PixelyIon	a5ca370c36	Implement thread-safe MegaBuffer pool We currently have a global `MegaBuffer` instance that is shared across all channels, this is very problematic as `MegaBuffer` fundamentally works like a state machine with allocations (especially resetting/freeing) and is thread-specific. Therefore, we now have a pool of several `MegaBuffer`s which is allocated from by the `CommandExecutor` and kept channel specific as a result which also limits its usage to a single thread, this allows for individually resetting or freeing any allocations.	2022-06-05 13:04:40 +05:30
PixelyIon	3e08494146	Minor `CommandScheduler` refactor There was a lot of redundant code in the `CommandScheduler` when the same functionality could be achieved with much shorter and cleaner code which this commit fixes. This includes no changes to the user-facing API and does not require any changes on the user side as a result.	2022-06-05 13:04:40 +05:30
Billy Laws	bd99d79b51	OsFileSystem: Close directory after file listing is finished	2022-06-04 21:46:23 +01:00
Billy Laws	4888919515	Stub GetFriendInvitationStorageChannelEvent (0x8C)	2022-06-04 21:45:53 +01:00
Billy Laws	d9f6540831	Fix VFS CreateFile directory creation	2022-06-04 19:19:30 +01:00
Billy Laws	f5bcb40c41	Return number of audio outs in ListAudioOuts	2022-06-04 19:12:37 +01:00
Billy Laws	5d6902b3f8	Stub audin:u	2022-06-04 19:11:57 +01:00
Billy Laws	54999957a2	Remove RGB565 format workaround Will soon be redundant with new texture manager and is quite hacky so drop it.	2022-06-04 17:49:13 +01:00
Billy Laws	d79832091d	Force append slash to directory path in OsFilesystem::CreateDirectory The recursive path creation algorithm requires this to be the case	2022-06-04 17:44:49 +01:00
Billy Laws	616f7b7826	Correct instanced draw topology changed warning location Before it would trigger even when the draw had the instanceNext flag set and thus wasn't part of the instanced draw at all.	2022-06-04 17:43:03 +01:00
Billy Laws	deb7a0e22a	Implement 5x5 and 10x10 ASTC texture formats	2022-06-04 17:42:37 +01:00
Billy Laws	cc5a3f99c1	Reformat format description file	2022-06-04 17:42:13 +01:00
Billy Laws	a476bbaf4d	Add 11_11_10 vertex buffer format	2022-06-04 17:41:10 +01:00
Billy Laws	71c37dd6c4	Add D24X8Unorm depth RT format support	2022-06-04 17:40:49 +01:00
Billy Laws	d3af629b83	Support R32G32B32A32 int RT formats	2022-06-04 17:38:57 +01:00
Billy Laws	0f5f04ade3	Set default surfaceflinger parameters based off of preallocated buffers Required by Resident Evil 4 as otherwise Dequeue would fail due to it using BGRA buffers but the default being RGBA.	2022-06-04 16:55:08 +01:00
Billy Laws	106ad597db	Support BGRA8888 surfaceflinger format A swizzle is applied to R8G8B8A8 to transform it to BGRA since BGRA isn't a commonly supported swapchain format on Android.	2022-06-04 16:49:26 +01:00
Billy Laws	2bbeb6b08f	Fix OsFileSystem initial directory creation By passing basePath as an argument the CreateDirectory function did mkdir(basePath+basePath) which is obviously not the intended behaviour, fix this.	2022-06-03 19:33:31 +01:00
Billy Laws	84dec7561c	Don't cache rendertarget mappings Some games remap rendertargets or map them late which would lead to weird graphical bugs or crashes. Drop the caching since VMM lookup is fairly cheap anyway.	2022-06-03 19:31:52 +01:00
Billy Laws	581a016991	Add GuestTexture::GetSize helper function This code was getting duplicated a bit so commonise into a helper function.	2022-06-03 19:30:54 +01:00
Billy Laws	31d418ad54	Fix 3D semaphore counter type 0 handling Counter type 0 actually releases the semaphore payload rather than a constant zero as was previously thought. This is required by Skyrim.	2022-06-02 22:03:19 +01:00
Billy Laws	0202bf5531	Add semaphore release debug logs	2022-06-02 22:02:59 +01:00
Billy Laws	55cddc7a66	Update hades	2022-06-02 21:58:30 +01:00
Billy Laws	3736d36b75	Fix KPrivateMemory remap permissions	2022-06-02 18:10:35 +01:00
Billy Laws	389ab0fb50	Add {Map,Unmap}Physical memory debug logs	2022-06-02 18:10:10 +01:00
PixelyIon	2712b3276b	Fix incorrect `VkBufferImageCopy` offset calculations The `VkBufferImageCopy` offset calculations were wrong inside `CopyIntoStagingBuffer` as it multiplied the mip level's linear size by `levelCount` rather than `layerCount`. This led to substantial UB in games which called this function as it led to an overflow and resulted in writing to other areas of the buffer which caused major issues such as vertex/index buffer corruption and corresponding graphical glitches alongside likely being the cause of some crashes.	2022-06-02 22:14:22 +05:30
PixelyIon	06901ef22a	Fix BC7 output swizzling from BGRA to RGBA BC7 CPU decoding had the red and blue channels swapped around as it outputted a BGRA image after decoding while we expected an RGBA image to be produced. This should fix the colors of certain textures in titles such as Cuphead or Sonic Forces.	2022-06-02 19:48:55 +05:30
Billy Laws	9cb68c31e1	Stub nfp IUser::AttachAvailabilityChangeEvent	2022-06-02 00:04:01 +01:00
Billy Laws	33c9731eca	Implement IFileSystem::CreateDirectory	2022-06-02 00:04:01 +01:00
Billy Laws	a09414424b	Fix broken VFS directory creation	2022-06-02 00:04:01 +01:00
Billy Laws	3518e04a18	Correct Directory EntryType to be u8 rather than u32	2022-06-02 00:04:01 +01:00
Billy Laws	0c11d9e294	Implement IDirectory::GetEntryCount	2022-06-02 00:04:01 +01:00
MCredstoner2004	c15b3a8d40	Make Applet accesses to the data queues lock Avoids potential races when the guest accesses the same applet from more than one thread.	2022-06-02 03:47:38 +05:30
Billy Laws	91b2c47991	Fix potential nvdrv submission race The syncpoint maximum value represents the maximum possible syncpt value at a given time, however due to PBs being submitted before max was incremented, for a brief moment of time this is not the case which could lead to crashes or other such behaviour if a game waits on the fence at the right moment.	2022-06-01 17:15:25 +01:00
PixelyIon	37453ed7fa	Use `DocumentsProvider` for log sharing We used a `FileProvider` for log sharing prior, this is no longer necessary since it comes under the `DocumentsProvider` now which can be utilized to share the log document directly.	2022-06-01 21:41:14 +05:30
PixelyIon	8efa9298f9	Fix name conflict resolution for `copyDocument` Any documents with the same name existing in a directory that is copied to would cause an exception due to existing already, this fixes that by handling conflict resolution in those cases and automatically determining a file name that would avoid a conflict.	2022-06-01 21:41:14 +05:30
Billy Laws	c4bd9c47e4	Stub NVGPU_GPU_IOCTL_ZBC_SET_TABLE nvdrv ioctl This was missed in the original implementation and caused crashes in some games.	2022-06-01 16:59:14 +01:00
Billy Laws	c639fdcf06	Fixup NFP service stub state handling Previously a broken state value was returned from GetState that caused crashes in games using newer SDKs and NFP, correctly handle state now by updating it after initialisation.	2022-06-01 15:00:26 +01:00
Billy Laws	c745e0e02b	Move image type logic to GuestTexture, allowing 2D array views for 3D RTs We can't render to a 3D texture through a 3D view, we instead have to create a 2D array view into it and render to that. The texture manager previously didn't support having a different view type/layer count between a guest texture view and the underlying storage texture that is required to support this so that was also implemented by reading the view layer count from the dimensions depth instead if the underlying texture is 3D (and the view type is 2D array). Additionally move away from our own view type enum to Vulkan, in line with other guest texture member types.	2022-05-31 22:09:53 +01:00
Billy Laws	22695c4feb	Stub nim services used for eShop communication We obviously don't need to implement these so add a simple set of stubs to satisfy games using them (mainly demos such as DQXII)	2022-05-31 22:07:01 +01:00
Billy Laws	ff12dc9c10	Add R32_SFLOAT to adreno validation layer format filtering	2022-05-31 22:03:53 +01:00
Billy Laws	6cc925c2d3	Reset RT mappings on dimension and format changes	2022-05-31 17:49:16 +01:00
Billy Laws	8180bf852e	Lock textures before attaching in BlitContext	2022-05-31 16:54:13 +01:00
Billy Laws	cb2b36e3ab	Allow providing a callback that's called after every CPU access in GMMU Required for the planned implementation of GPU side memory trapping.	2022-05-31 16:04:27 +01:00
Billy Laws	46ee18c3e3	Require depthBiasClamp Vulkan device feature Used in some UE4 games and supported by 95% of devices so skip implementing a fallback path.	2022-05-31 14:46:45 +01:00
PixelyIon	e592b11039	Drop `samplerAnisotropy` as a required GPU feature Sampler anisotropy was made a required feature in an earlier commit due to its widespread availability but this was determined to be incorrect as certain Mali GPUs that can otherwise run 2D games in Skyline do not have this feature. While they are still not officially supported, this was the only roadblock to supporting them, so it has now been made an optional feature.	2022-05-31 01:37:40 +05:30
PixelyIon	4336134b07	Reintroduce `android:hasFragileUserData` due to stable signature `android:hasFragileUserData` was added in an earlier commit but then removed due to it not functioning because of signature checks. Now that signatures are consistent across builds, it has been readded and should now allow carrying data across CI and developer builds.	2022-05-31 01:37:07 +05:30
PixelyIon	b91ce939a2	Introduce CI build signing We've done no signing of any Skyline APKs to date which causes issues regarding authenticity of any APKs as they could be entirely unofficial builds which have not been vetted by the team. Additionally, the different keys remove the ability to reinstall a different build successively as Android checks for matching signatures before installing an APK.	2022-05-31 01:25:18 +05:30
PixelyIon	e1cc8676cf	Add option to view internal directory With the Skyline document provider, easy access to the internal directory is required which may be hard to navigate to through the system file manager. This adds an option in settings to directly open up the directory in the system file manager.	2022-05-31 01:25:18 +05:30
PixelyIon	ba97985b55	Revise `DocumentsProvider` URI structure The URIs (Document ID + Root) of the Skyline `DocumentsProvider` were suboptimal as they weren't relative to a base directory. This is required for opening a root without knowledge of the full path in advance, it is therefore cleaner to provide a uniform `ROOT_ID` in a companion class.	2022-05-31 01:25:18 +05:30
Mylah Dee	ee7da31fc6	Add `DocumentsProvider` for accessing internal files On Android 12 and above, files from an application's external storage directory cannot be accessed by the user. The only proper SAF-compliant way to solve this is to create a `DocumentsProvider` which proxies access to internal storage accordingly.	2022-05-30 15:09:30 +05:30
Narr the Reg	7aa6a5c4ca	Add HID touch attribute and index reporting Adds the missing TouchAttribute parameter and correctly emulates the touch point index. Both changes are necessary on Voez to keep track of each finger.	2022-05-29 10:28:51 +01:00
PixelyIon	80c8fb8791	Implement CPU BCn Texture Decoding Certain GPU vendors such as ARM's Mali do not have support for BCn textures whatsoever while other vendors such as AMD only have partial support (BC1-BC3). Most titles on the guest utilize BC textures and to address this on host GPUs without support for BCn, we need to decompress the texture on the CPU. This commit implements a CPU BCn texture decoder based off Swiftshader's BC decoder, it also adds the necessary infrastructure to have different formats for the `GuestTexture` and `Texture` objects.	2022-05-28 21:22:24 +05:30
PixelyIon	fe615b1e03	Clarify texture swizzling inner-loop iteration count The iteration count of the inner loop for sector deswizzling was calculated as `SectorWidth * SectorHeight`. While the result was correct at `32`, it should be determined by the amount of sector lines within a GOB i.e.: `(GobWidth / SectorWidth) * GobHeight`.	2022-05-28 21:22:24 +05:30
PixelyIon	7d4e0a7844	Implement Mipmapped Texture Support Support for mipmapped textures was not implemented which is fairly crucial to proper rendering of games as the only level that would load is the first level (highest resolution), that might result in a lot more memory bandwidth being utilized. Mipmapping also has associated benefits regarding aliasing as it has a minor anti-aliasing effect on distant textures. This commit entirely implements mipmapping support but it does not extend to full support for views into specific mipmap levels due to the texture manager implementation being incomplete.	2022-05-28 21:22:24 +05:30
PixelyIon	da7e6a7df7	Replace Maxwell DMA `GuestTexture` usage with new swizzling API Maxwell DMA requires swizzled copies to/from textures and earlier it had to construct an arbitrary `GuestTexture` to do so but with the introduction of the cleaner API, this has become redundant which this commit cleans up and replaces with direct calls to the API with all the necessary values.	2022-05-28 21:22:24 +05:30
PixelyIon	de300bfdbe	Refactor Texture Swizzling The API for texture swizzling is now more concrete and abstracted out from `GuestTexture`, this allows for neater usage in certain areas such as MaxwellDMA while having a `GuestTexture` wrapper as well allowing for neater usage in those cases. The code itself has also been cleaned up slightly with all usage of `u32`s being upgraded to `size_t` as this is simply more efficient due to the compiler not needing to emulate wraparound behavior for integer types smaller than the processor word size.	2022-05-19 17:13:55 +05:30
Billy Laws	72473369b6	Account for OOB copyOffsets in CircularBuffer::Read Caused crashes in Pokemon	2022-05-14 15:30:59 +01:00
Robin Kertels	0a3cf25823	Implement the Fermi 2D blitting engine The Fermi 2D engine implements both image blit and resolve operations, supporting subpixel sampling with both linear and point filtering. Resolve operations are performed by sampling from the center of each pixel in order to resolve the final image from the MSAA samples. MSAA images are stored in memory like regular images but each pixel's dimensions are scaled: e.g. for 2x2 MSAA ``` 112233 112233 445566 445566 ``` These would be sampled with both duDx and duDy as 2 (integer part), resolving to the following: ``` 123 456 ``` Blit operations are performed by sampling from the corner of each pixel, scaling the image as one would expect. This implementation isn't fully complete as Vulkan blit doesn't support some combinations which Fermi does, most notably between colour and depth stencil. These will be implemented properly at a later date, likely after the texture manager rework. Out of Bounds Blit, used by some OpenGL games, is also missing since supporting it requires texture aliasing, this will also be supported after the texture manager rework. Co-authored-by: Billy Laws <blaws05@gmail.com>	2022-05-13 22:37:37 +01:00
Billy Laws	be2546138d	Move IOVA class to GMMU so it can be used for other engines	2022-05-13 22:37:37 +01:00
Billy Laws	3ad640fcbc	Fix accidental graphics context member/parameter duplication	2022-05-13 22:37:37 +01:00
PixelyIon	7a6f27a19a	Fix texture swizzling OOB writes Certain writes during swizzling went out of bounds due to incorrect `blockExtentY` calculation, the previous commit to fix this ended up breaking it further. This commit returns to the original commit's calculations with the proper addendum of a check for exact alignment with a GOB which is the case that was broken earlier.	2022-05-13 14:52:41 +05:30
PixelyIon	168e51e7ad	Always use `GetLayerStride` for layer stride in Texture The `GuestTexture::GetLayerStride` function was not always being utilized to retrieve the layer stride inside `Texture`, it would instead directly access the `guestTexture::layerStride` member. This is problematic as it may not be initialized and return `0` which would lead to a broken image copy.	2022-05-13 14:21:37 +05:30
Billy Laws	b81d5bc865	Implement and cleanup semaphore operations in all engines Most engines have the capability to release a semaphore payload (or reduce in the case of GPFIFO) when a method is called or action is complete. Semaphores are used by games for both timing how long things take on GPU and waiting on resources so missing them can cause deadlocks or other related issues.	2022-05-12 19:40:24 +01:00
Billy Laws	bca88685bd	Stub nvdrv {Get,Dump}Status	2022-05-12 17:38:22 +01:00
Billy Laws	97e740c986	Fix slight locking bug with nvmap handle duplication	2022-05-12 17:38:22 +01:00
Billy Laws	57378457dc	Treat symbol file paths without slashes as filenames Prevents crashes printing backtrace if this occurs	2022-05-12 17:38:22 +01:00
Billy Laws	d08ac63bbf	Use TIC maximum index over TSC when tscIndexLinked is set	2022-05-12 17:38:22 +01:00
Billy Laws	8e021a9f1f	Load custom drivers from app private data dir Required since /sdcard doesn't have exec perm support	2022-05-12 17:38:21 +01:00
Billy Laws	dcef597345	Introduce TrivialObject concept and use where appropriate Simplifies type checking and handles excluding container types that are trivially copyable but contain pointers	2022-05-12 17:38:21 +01:00
PixelyIon	f2cc25ee9f	Implement Array Texture Swizzling Textures can have more than one layer which we currently don't handle, all layers past the initial one will be filled with random data or 0s, leading to incorrect rendering. This has now been implemented which fixes any titles which utilize array textures, such as "Super Mario Odyssey" or "Hatsune Miku: Project DIVA MegaMix".	2022-05-12 18:23:45 +05:30
PixelyIon	2a99e1784d	Fix Maxwell3D RT Depth/Layer Count Logic The Maxwell3D RT layer count wasn't being set correctly as it has the same register as the depth values and is toggled between the two based on another register value.	2022-05-12 18:23:05 +05:30
Billy Laws	543ac3042e	Cleanup account services and stub StoreSaveDataThumbnail	2022-05-11 23:24:35 +01:00
Billy Laws	7d30ac0cd8	Add additional nifm stubs	2022-05-11 23:24:35 +01:00
Billy Laws	a164635f32	Stub LibraryAppletPlayerSelect	2022-05-11 23:24:35 +01:00
PixelyIon	4ec1cc7086	Update Build Tools to `33.0.0-rc4` Google has removed `33.0.0-rc3` from their servers and it can no longer be downloaded by the CI, the build tools version has been updated as a result.	2022-05-12 02:53:01 +05:30
Billy Laws	dd0004e208	Set Host1x log tag correctly	2022-05-11 22:11:16 +01:00
Billy Laws	f89bacf8ae	Fixup Host1x syncpoint locking	2022-05-11 22:04:02 +01:00
Billy Laws	d8ff318a1a	Prevent infinite VFS read loop on EOF	2022-05-11 22:03:39 +01:00
shutterbug2000	f078a5d1ec	Stub `bt` and `btm:u` Stub BT services which is required by titles such as Pokémon Let's GO Pikachu and Eevee (non-Demo versions).	2022-05-11 20:44:09 +05:30
PixelyIon	588b4529ee	Implement 3D Texture Swizzling The Maxwell GPU supports 3D textures tiled with the block-linear layout, but swizzling of 3D textures wasn't handled correctly till now. This commit addresses that by implementing proper swizzling for 3D textures. Titles such as Cluster Truck and Super Mario Odyssey utilize 3D textures alongside a vast majority of other titles.	2022-05-11 14:06:04 +05:30
Billy Laws	601d67e369	Use resource size rather than allocation size for staging buffer size As per VMA docs: 'Allocation size returned in this variable may be greater than the size requested for the resource e.g. as VkBufferCreateInfo::size. Whole size of the allocation is accessible for operations on memory e.g. using a pointer after mapping with vmaMapMemory(), but operations on the resource e.g. using vkCmdCopyBuffer must be limited to the size of the resource.'	2022-05-10 18:48:20 +01:00
Billy Laws	d2acec24f5	Handle VFS reads into trapped memory regions pread will refuse to read into any trapped regions so implement a manual path with a staging buffer and memcpy for such cases	2022-05-10 18:33:55 +01:00
Billy Laws	1609fd2a32	Account for layerCount in SynchronizeGuestWithBuffer staging buffer size	2022-05-10 18:33:31 +01:00
Billy Laws	5b97b87503	Restore previous cullMode when cullFace is enabled	2022-05-10 18:31:32 +01:00
Billy Laws	15e9fa1c80	Fix FillRandomBytes There were two issues here: - If a skyline span was passed as a param then the 'T &object' version would be called, filling the span itself with random values rather than its contents - Random numbers were repeated every call since independent_bits_engine copied generator state and thus it was never actually updated	2022-05-10 18:28:15 +01:00
Billy Laws	622ff2a8f1	Correctly track 5.1 audio channel sample count Size needs to be adjusted for 5.1 buffers since they're downsampled to stereo.	2022-05-10 18:26:20 +01:00
PixelyIon	56c9b03843	Fix incorrect swizzling Y extent calculation This calculation for the amount of lines on the Y axis relative to the start of the last block was wrong and would instead determine the amount of lines to the last Y-axis GOB which wasn't accurate when padding was considered, this resulted in titles like Celeste having broken texture decoding (on a 1922x1082 texture) for the last ROB as most pixels would be masked out.	2022-05-09 20:25:43 +05:30
Billy Laws	018df355f0	Replace some VFS exceptions with warnings These errors aren't necessarily fatal so tone them down.	2022-05-08 19:37:10 +01:00
Billy Laws	e1c13bbc08	Update hades	2022-05-08 19:37:10 +01:00
PixelyIon	b307fca115	Fix attachment reuse within the same subpass Certain titles such as BOTW trigger behavior to reuse an attachment within the same subpass, this caused an exception inside `RenderPassNode::AddAttachment` as it cannot find a corresponding subpass for the attachment. To fix this issue, we now assume that when it cannot find a subpass for an existing attachment, it is attached to the latest subpass and return the attachment.	2022-05-08 18:26:40 +05:30
PixelyIon	e027555796	Handle Y-axis GOB non-alignment for swizzling Certain textures may be unaligned with a GOB's height of 8 lines, we already handle the case of being unaligned with a GOB's width of 64-bytes. This case occurs on titles such as SMO when going in-game.	2022-05-07 18:37:22 +05:30
PixelyIon	c910e29168	Extend `HostSignalHandler`'s `SIGSEGV` debugger path The function now returns from a segmentation fault when a debugger is present, this allows the entire context to be intact which can allow the debugger to correctly pick up variables from all stack frames while it could not extrapolate most variables when trapped inside the signal handler without the values of all registers.	2022-05-07 18:37:22 +05:30
Billy Laws	4149ab1067	Implement Maxwell 3D instanced draw support In the Maxwell 3D engine, instanced draws are implemented by repeating the exact same draw in sequence with a special flag set in vertexBeginGl. This flag allows either incrementing the instance counter or resetting it, since we need to supply an instance count to the host API we defer all draws until state changes occur. If there are no state changes between draws we can skip them and count the occurrences to get the number of instances to draw.	2022-05-07 13:56:09 +01:00
Billy Laws	03594a081c	Ensure correct flushing for batched constant buffer updates Cbufs could be read by non-maxwell3D engines so force a flush when switching to them or before Execute.	2022-05-07 13:56:09 +01:00
PixelyIon	ad989750fc	Implement Maxwell3D Point Sprite Size Implements register state that corresponds to the size of a single point sprite in Maxwell 3D, this is emitted by the shader compiler in the preamble but needs to be only applied if the input topology is a point primitive and it is invalid to set the point size in any other case.	2022-05-07 03:46:25 +05:30
PixelyIon	874a6a2a6c	Fix `getTextureType` enum conversion formatting	2022-05-07 03:46:25 +05:30
PixelyIon	ae5bcbdb5c	Fix Depth RT lock to be in scope Earlier texture locking design required the lock to be retained but since the introduction of `AttachTexture`, this no longer needs to be done. This being done caused deadlocks when the depth texture is sampled by the fragment shader while being bound as an RT since it would attempt to lock the texture again.	2022-05-07 02:37:48 +05:30
shutterbug2000	1c8d994161	Basic `bcat:u` implementation A basic `bcat:u` implementation to prevent titles such as "Kirby and the Forgotten Land" dependent on BCAT support from crashing due to the lack of an implementation.	2022-05-06 15:41:48 +05:30
PixelyIon	4fd64a53e0	Require Vulkan `samplerAnisotropy` feature This is a widely supported feature that games may require conditionally but due to it being supported on effectively all target devices, it was made mandatory. This is used by titles such as ARMS.	2022-05-06 15:41:48 +05:30
PixelyIon	1d9b4a865a	Add additional formats to Adreno filter `VK_FORMAT_R32G32B32A32_SFLOAT` and `D32_SFLOAT` have their capabilities misreported as well, this spams the logs in titles such as ARMS.	2022-05-06 15:41:48 +05:30
PixelyIon	b87295374e	Improve Controller Applet log Improves the readability of the log and replaces the previously uninformative prefix of `operator()` due to being in a lambda with `Controller support`.	2022-05-06 15:41:48 +05:30
PixelyIon	98c730a644	Implement linked TIC/TSC handle in Maxwell3D Maxwell3D has a register for linking the TIC/TSC index in bindless texture handles, this is used by games to implement bindless combined texture-sampler handles.	2022-05-06 14:58:20 +05:30
PixelyIon	23a091100d	Implement `ReadCbufValue` + `ReadTextureType` Implements `GraphicsEnvironment::ReadCbufValue` & `GraphicsEnvironment::ReadTextureType` with a framework of heterogeneous lookups for caching and callbacks for querying constant buffer or TIC values with validation checks for successive draws to ensure unique IR is generated.	2022-05-06 14:39:36 +05:30
PixelyIon	765c3f4e1f	Allow draws with no descriptor set resources The `descriptorSetWrites` being filled is now optional and the case of it being empty is handled correctly, this is done by certain titles such as ARMS and is entirely valid behavior. It should be noted that not doing this leads to errors in the guest due to invalid GPU state while working on the host GPU.	2022-05-06 10:33:47 +05:30
PixelyIon	37327f1955	Fix and refactor SVC `SignalToAddress`/`WaitForAddress` SVC `SignalToAddress` had a bug with the behavior of `SignalAndModifyBasedOnWaitingThreadCountIfEqual` which was entirely incorrect and led to deadlocks in titles such as ARMS that were dependent on it. This commit corrects the behavior and refactors both SVCs and moves their arbitration/waiting to inside the corresponding `KProcess` function rather than the SVC to avoid redundancies and improve code readability.	2022-05-05 19:15:37 +05:30
PixelyIon	396979e897	Extend Adreno format-based filtering for Validation Layer Filtering of validation logs is now extended beyond BCn formats and now covers other formats which have their feature set misreported by the driver, this significantly drives down the amount of logs depending on the title.	2022-05-05 19:15:37 +05:30
PixelyIon	62ea2a6da5	Avoid format aliasing warnings on Adreno Implements an algorithm to determine formats that can be aliased as views without needing `VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT`, this avoids spamming warning logs on view creation when the aliased formats will function in practice.	2022-05-05 19:15:37 +05:30
PixelyIon	7206ab4c67	Fix `exclusiveSubpass` by finishing render pass at end There was an oversight with exclusive subpasses which could lead to RPs with more than one subpass being created even though one pass was exclusive, this oversight was not finishing the render pass at the end of `AddSubpass`. This could lead to a future subpass adding to the end of that RP even though it was intended to exclusively have a single subpass. This case occurs in titles such as Celeste (in-game) and breaks rendering on GPUs that may require exclusive subpasses for proper functionality.	2022-05-05 11:14:38 +05:30
PixelyIon	96fe5f0a0e	Set initial `subpassCount` value to 1 rather than 0	2022-05-05 11:07:43 +05:30
PixelyIon	5d08d6e06f	Disable unnecessary Khronos Validation Layer logs The Khronos Validation Layer can often generate warning/error logs due to our intentional breakage from Vulkan specification, these can occur several times a frame resulting in the logs being spammed and making it difficult to extract useful information out of logs. The scope of these logs has now been reduced with more general filtering and the introduction of specialized filtering to handle complex cases such as BCn hacks with `libadrenotools` on Adreno devices.	2022-05-04 13:20:59 +05:30
PixelyIon	23c9388caf	Fix `VK_KHR_push_descriptor`-less path for descriptor set updates Descriptor set updates were broken on the non-push-descriptor path due to lifetime issues with VkDescriptorSetLayout's usage during the execution phase which entirely broke rendering on AMD/Mali GPUs due to them not supporting `VK_KHR_push_descriptor`. This commit addresses that by moving the allocation of a descriptor set to outside the lambda and into the recording phase, it also simplifies the semantics and resources passed into the lambda by removing redundancies.	2022-05-04 00:49:21 +05:30
PixelyIon	47bc3b4d99	Fix Render Pass Cache The Vulkan render pass cache was fundamentally broken since it was designed around the Render Pass Compatibility clause due to being designed for framebuffer compatibility initially. As this scope was extended to a general render pass cache, the amount of data in the key was not extended to include everything it should have. This commit introduces the missing pieces in the RP cache and simplifies the underlying code in the process.	2022-05-01 20:31:36 +05:30
PixelyIon	25a29f9044	Skip zero-initializing shader bytecode backing The backing for shader data would implicitly be zero-initialized due to a `resize` on every shader parse, this was entirely unnecessary as we would overwrite the entire range regardless. We avoid this by using statically allocated storage and a span over it containing the shader bytecode which avoids any unnecessary clear semantics without resorting to more complex solutions such as a custom allocator.	2022-05-01 18:27:27 +05:30
PixelyIon	42573170c6	Implement Framebuffer Cache Implements a cache for storing `VkFramebuffer` objects with a special path on devices with `VK_KHR_imageless_framebuffer` to allow for more cache hits due to an abstract image rather than a specific one. Caching framebuffers is a fairly crucial optimization due to the cost of creating framebuffers on TBDRs since it involves calculating tiling memory allocations and in the case of Adreno's proprietary driver involves several kernel calls for mapping and allocating the corresponding framebuffer memory.	2022-05-01 18:27:27 +05:30
PixelyIon	af7f0c301e	Avoid redundant `VkImageView` recreation There are a lot of cases of `VkImageView` being recreated arbitrarily due to it being tied to the ephemeral object `TextureView` rather than `Texture`, this commit flips that by storing all `VkImageView`s inside `Texture` with `TextureView` simply holding a copy of the handle to them. Additionally, this change results in stable `VkImageView` handles and helps in paving the path for framebuffer caching when `VK_KHR_imageless_framebuffer` is unavailable.	2022-05-01 18:27:27 +05:30
PixelyIon	41b2c2dc7b	Add `profileable` attribute to `AndroidManifest.xml` As we desire more accurate profiling data in certain circumstances, making the app explicitly profileable will allow for this, it will also remove the (annoying) prompt to do this in the Android Studio profiler.	2022-05-01 18:27:27 +05:30
PixelyIon	da931cf07b	Implement Render Pass Cache Implements a cache for storing `VkRenderPass` objects which are often reused, they are not extremely expensive to create generally but this is a required step to build up to a framebuffer cache which is an extremely expensive object to create on TBDRs generally since it involves calculating tiling memory allocations and in the case of Adreno's proprietary driver involves several kernel calls for mapping and allocating the corresponding memory.	2022-05-01 18:16:53 +05:30
Billy Laws	ae77bde171	Fixup audio device name writing in services Games expect the output buffer to be entirely zero filled past the device name.	2022-04-30 16:00:33 +01:00
Billy Laws	194cbe6c7c	Stub several HID functions	2022-04-30 16:00:33 +01:00
Billy Laws	112c20cef2	Stub QueryAudioDevice{Input,Output}Event Used in many 3.0.0+ games	2022-04-30 16:00:33 +01:00
Billy Laws	8d7dbe2c4e	Add a way to get a readonly span of Buffer contents Avoids the need to redundantly copy data when it is being directly processed on the CPU (e.g. quad conversion)	2022-04-30 16:00:33 +01:00
MK73DS	4c71ef5c31	Fix American English language code	2022-04-30 18:43:22 +05:30
PixelyIon	4ec0f62e30	Update Kotlin, AGP, Gradle and Build Tools Kotlin was updated to 1.6.21, AGP to 7.1.3, Gradle to 7.4.2 and Build Tools to 33.0.0-rc3.	2022-04-27 14:00:36 +05:30
PixelyIon	90c635bf78	Coalesce subpasses with compatible attachments together We run into a lot of successive subpasses with the exact same framebuffer configuration which we now exploit to avoid the creation of a new subpass due to the overhead involved with this. This provides significant performance boosts in certain cases due to the magnitude of difference in the amount of subpasses being created while providing next to no benefit in other cases.	2022-04-27 13:22:34 +05:30
PixelyIon	a947933bf0	Fix `Buffer` cycle check being inverted The check for the fence cycle being the same as the current cycle was incorrectly inverted to be the opposite of what it should have been, leading to bugs.	2022-04-27 13:07:36 +05:30
PixelyIon	54794f4b71	Move `Texture` locking and synchronization to `PresentationEngine` The responsibility for synchronizing a texture and locking it is now on the `PresentationEngine` rather than the API-user as this'll allow more fine grained locking and delay waiting until necessary.	2022-04-25 21:01:16 +05:30
Billy Laws	1dd230afde	Refactor all std::lock_guard usages to std::scoped_lock	2022-04-25 15:00:30 +01:00
PixelyIon	94e6f3cfa0	Add quirk for relaxed render pass compatibility As we require a relaxed version of the Vulkan render pass compatibility clause for caching multi-subpass render passes, we now utilize a quirk to determine if this is supported; it is supported on Nvidia/Adreno, while on AMD/Mali, where it isn't, we force single-subpass render passes.	2022-04-24 16:18:36 +05:30
PixelyIon	44615c8dd2	Implement per-vendor `VkQueue` maximum global priority We found out that certain vendors such as Nvidia had a limitation on the global priority of a queue and requesting `VK_QUEUE_GLOBAL_PRIORITY_HIGH_EXT` would result in `VK_ERROR_NOT_PERMITTED_EXT`. A quirk has been introduced to supply the maximum supported global priority which is currently set on a per-vendor basis to avoid future crashes.	2022-04-24 16:15:01 +05:30
PixelyIon	7ef4959060	Implement Graphics Pipeline Cache Implements a cache for storing `VkPipeline` objects which are fairly expensive to create and doing so on a per-frame basis was rather wasteful and consumed a significant part of frametime. It should be noted that this is not compliant with the Vulkan specification and will break unless the driver supports a relaxed version of the Vulkan specification's Render Pass Compatibility clause.	2022-04-24 14:31:00 +05:30
PixelyIon	50a8b69f7b	Optimize descriptor set writes using push descriptors We can use inline push descriptors for writing descriptors rather than allocating a descriptor set for a one-time write and freeing it, as this is rather inefficient while an inline push descriptor generally ends up being a direct `memcpy` on the driver side designed for this use-case.	2022-04-24 13:45:09 +05:30
PixelyIon	5adafbff04	Set `VkQueue`'s global priority to high We want Skyline to have the most favorable GPU scheduling possible due to low latency and high throughput requirements, we request high priority scheduling due to this reason.	2022-04-24 13:34:09 +05:30
PixelyIon	f9c052d1b7	Implement Maxwell3D Tessellation State This implements all Maxwell3D registers and HLE Vulkan state for Tessellation including invalidation of the TCS (Tessellation Control Shader) state during state changes.	2022-04-24 13:23:00 +05:30
Billy Laws	de796cd2cd	Implement overhead-free sequenced buffer updates with megabuffers Previously constant buffer updates would be handled on the CPU and only the end result would be synced to the GPU before execute. This caused issues as if the constant buffer contents were changed between each draw in a renderpass (e.g. text rendering) the draws themselves would only see the final resulting constant buffer. We had earlier tried to fix this by using vkCmdUpdateBuffer however this caused significant performance loss due to an oversight in Adreno drivers. We could have worked around this simply by using vkCmdCopyBuffer however there would still be a performance loss due to renderpasses being split up with copies in between. To avoid this we introduce 'megabuffers', a brand new technique not done before in any other Switch emulators. Rather than replaying the copies in sequence on the GPU, we take advantage of the fact that buffers are generally small in order to replay buffers on the GPU instead. Each write and subsequent usage of a buffer will cause a copy of the buffer with that write, and all prior writes, applied to be pushed into the megabuffer, this way at the start of execute the megabuffer will hold all used states of the buffer simultaneously. Draws then reference these individual states in sequence to allow everything to work without any copies. In order to support this buffers have been moved to an immediate sync model, with synchronisation being done at usage-time rather than execute (in order to keep contents properly sequenced) and GPU-side writes now need to be explicitly marked (since they prevent megabuffering). It should also be noted that a fallback path using cmdCopyBuffer exists for the cases where buffers are too large or GPU dirty.	2022-04-23 22:48:28 +01:00
lynxnb	0d9992cb8e	Implement `QuadList` support for non-indexed draws	2022-04-20 18:17:10 +02:00
lynxnb	bcaf7dfe1c	Make `GetVertexBuffer` return a pointer to the requested buffer This avoids a redundancy in the `Draw` function and makes code easier to read	2022-04-20 18:16:45 +02:00
Billy Laws	5c3559e888	Revert "Implement support for GPU-side constant buffer updating" This reverts commit `d79635772f`.	2022-04-18 13:28:58 +01:00

1549 Commits