skyline

mirror of https://github.com/skyline-emu/skyline.git synced 2024-11-05 11:45:10 +01:00

Author	SHA1	Message	Date
PixelyIon	94e6f3cfa0	Add quirk for relaxed render pass compatibility As we require a relaxed version of the Vulkan render pass compatibility clause for caching multi-subpass render passes, we now utilize a quirk to determine if this is supported. It is supported on Nvidia/Adreno, while on AMD/Mali, where it isn't supported, we force single-subpass render passes.	2022-04-24 16:18:36 +05:30
PixelyIon	44615c8dd2	Implement per-vendor `VkQueue` maximum global priority We found out that certain vendors such as Nvidia had a limitation on the global priority of a queue and requesting `VK_QUEUE_GLOBAL_PRIORITY_HIGH_EXT` would result in `VK_ERROR_NOT_PERMITTED_EXT`. A quirk has been introduced to supply the maximum supported global priority which is currently set on a per-vendor basis to avoid future crashes.	2022-04-24 16:15:01 +05:30
PixelyIon	7ef4959060	Implement Graphics Pipeline Cache Implements a cache for storing `VkPipeline` objects which are fairly expensive to create and doing so on a per-frame basis was rather wasteful and consumed a significant part of frametime. It should be noted that this is not compliant with the Vulkan specification and will break unless the driver supports a relaxed version of the Vulkan specification's Render Pass Compatibility clause.	2022-04-24 14:31:00 +05:30
PixelyIon	50a8b69f7b	Optimize descriptor set writes using push descriptors We can use inline push descriptors for writing to descriptor rather than allocating a descriptor set for a one time write and freeing it as this is rather inefficient while an inline push descriptor generally ends up being a direct `memcpy` on the driver side designed for this use-case.	2022-04-24 13:45:09 +05:30
PixelyIon	5adafbff04	Set `VkQueue`'s global priority to high We want Skyline to have the most favorable GPU scheduling possible due to low latency and high throughput requirements, we request high priority scheduling due to this reason.	2022-04-24 13:34:09 +05:30
PixelyIon	f9c052d1b7	Implement Maxwell3D Tessellation State This implements all Maxwell3D registers and HLE Vulkan state for Tessellation including invalidation of the TCS (Tessellation Control Shader) state during state changes.	2022-04-24 13:23:00 +05:30
Billy Laws	de796cd2cd	Implement overhead-free sequenced buffer updates with megabuffers Previously constant buffer updates would be handled on the CPU and only the end result would be synced to the GPU before execute. This caused issues as if the constant buffer contents were changed between each draw in a renderpass (e.g. text rendering) the draws themselves would only see the final resulting constant buffer. We had earlier tried to fix this by using vkCmdUpdateBuffer however this caused significant performance loss due to an oversight in Adreno drivers. We could have worked around this simply by using vkCmdCopyBuffer however there would still be a performance loss due to renderpasses being split up with copies in between. To avoid this we introduce 'megabuffers', a brand new technique not done before in any other Switch emulators. Rather than replaying the copies in sequence on the GPU, we take advantage of the fact that buffers are generally small in order to replay buffers on the GPU instead. Each write and subsequent usage of a buffer will cause a copy of the buffer with that write, and all prior writes, applied to be pushed into the megabuffer; this way at the start of execute the megabuffer will hold all used states of the buffer simultaneously. Draws then reference these individual states in sequence to allow everything to work without any copies. In order to support this buffers have been moved to an immediate sync model, with synchronisation being done at usage-time rather than execute (in order to keep contents properly sequenced) and GPU-side writes now need to be explicitly marked (since they prevent megabuffering). It should also be noted that a fallback path using vkCmdCopyBuffer exists for the cases where buffers are too large or GPU dirty.	2022-04-23 22:48:28 +01:00
lynxnb	0d9992cb8e	Implement `QuadList` support for non-indexed draws	2022-04-20 18:17:10 +02:00
lynxnb	bcaf7dfe1c	Make `GetVertexBuffer` return a pointer to the requested buffer This avoids a redundancy in the `Draw` function and makes code easier to read	2022-04-20 18:16:45 +02:00
Billy Laws	5c3559e888	Revert "Implement support for GPU-side constant buffer updating" This reverts commit `d79635772f`.	2022-04-18 13:28:58 +01:00
Billy Laws	7bf3580031	Revert "Allow external synchronization for buffers" This reverts commit `372ab8befa`.	2022-04-18 13:28:58 +01:00
PixelyIon	ddc9622b90	Fix Shader Module Cache As bindings weren't correctly handled due to the fact that `EmitSPIRV` would change the bindings, the shader module cache would not correctly function and have no cache hits in `find` and rather have them in `try_emplace` which negated any performance benefit of it. This has now been fixed by retaining the initial cache key for insertion into the cache while also storing the post-emit bindings and restoring them during a cache hit.	2022-04-18 12:18:15 +05:30
Billy Laws	32fe01e145	Implement batch constant buffer updates Avoids spamming the driver with hundreds of cbuf updates per frame by batching all consecutive updates into one.	2022-04-17 00:35:00 +01:00
PixelyIon	02f99273ac	Implement Shader Module Cache Implements caching of the compiled shader module (`VkShaderModule`) in an associative map based on the supplied IR, bindings and runtime state to avoid constant recompilation of shaders. This doesn't entirely address shader compilation as an issue since host shader compilation is tied to Vulkan pipeline objects rather than Vulkan shader modules, they need to be cached to prevent costly host shader recompilation.	2022-04-16 18:45:56 +05:30
PixelyIon	76d8172a35	Implement Shader IR Cache This implements the first step of a full shader cache with caching any IR by treating the shared pointer as a handle and key for an associative map alongside hashing the Maxwell shader bytecode, it supports both single shader program and dual vertex program caching.	2022-04-16 18:45:56 +05:30
PixelyIon	0baa90d641	Implement `SpanEqual` and `SpanHash` We desire the ability to hash and check equality of data across spans to use associative containers such as `std::unordered_map` with spans. The implemented functions provide an easy way to do that.	2022-04-16 18:45:56 +05:30
Billy Laws	df5d1256c2	Implement an object backed IStorage backing This is more convenient and efficient to use when passing structured data out of applets	2022-04-16 18:45:56 +05:30
Billy Laws	d115ce3c05	Stub the controller applet Mostly based off of yuzu's implementation, this will need to be extended in the future to open up a UI for configuring controllers according to the application's requirements.	2022-04-16 18:45:56 +05:30
Billy Laws	9a8e39cba1	Slightly refactor controller code in HID Now uses ranges where possible and a function to get the number of connected controllers has been added.	2022-04-16 18:45:56 +05:30
Billy Laws	2873f11baa	Pass shared pointers by value in applet infrastructure This is more optimal than crefs when used together with std::move	2022-04-16 18:45:56 +05:30
PixelyIon	8ccef733ff	Fix UB with guest-less Texture/Buffers in `MarkGpuDirty` As there was no check for the lack of a `GuestTexture`/`GuestBuffer`, it would lead to UB when a texture/buffer that had no guest such as the `zeroTexture` from `GraphicsContext` would be marked as dirty they would cause a call to `NCE::RetrapRegions` with a `nullptr` handle that would be dereferenced and cause a segmentation fault.	2022-04-16 18:45:56 +05:30
PixelyIon	372ab8befa	Allow external synchronization for buffers In certain situations such as constant buffer updates, we desire to use the guest buffer as a shadow buffer forwarding all writes directly to it while we update the host using inline buffer updates so they happen in-sequence. This requires special behavior as we cannot let any synchronization operations take place as they would break the shadow buffer, as a result, an external synchronization flag has been added to prevent this from happening. It should be noted that this flag is not respected for buffer recreation which will lead to UB, this can and will break updates in certain cases and this change isn't complete without buffer manager support.	2022-04-16 18:44:53 +05:30
PixelyIon	c0c4db68a8	Fix `BufferView` offset not being added in `vkCmdUpdateBuffer` The offset of the view wasn't added to the `vkCmdUpdateBuffer`, this would cause the offset to be incorrect given the buffer was a view of a larger buffer that wasn't the start of it. This commit fixes that by adding the offset of the view to the buffer update.	2022-04-14 18:06:15 +05:30
PixelyIon	a1c06e0401	Mark GPU resources as dirty before GPU usage We didn't call `MarkGpuDirty` on textures/buffers prior to GPU usage, this would cause them to not be R/W protected when they should be and provide outdated copies if there were any read accesses from the CPU (which are not possible at the moment since we assume all accesses are writes). This has now been fixed by calling it after synchronizing the resource.	2022-04-14 17:20:05 +05:30
PixelyIon	41a6afed01	Fix `GraphicsContext` code formatting for auto formatter	2022-04-14 15:27:22 +05:30
PixelyIon	624df92616	Change `AddNonGraphicsPass` to `AddOutsideRpCommand` The terminology "Non-Graphics pass" was deemed to be fairly inaccurate since it simply covered all Vulkan commands (not "passes") outside the render-pass scope, these may be graphical operations such as blits and therefore it is more accurate to use the new terminology of "Outside-RenderPass command" due to the lack of such an implication while being consistent with the Vulkan specification.	2022-04-14 15:20:22 +05:30
Billy Laws	a31332e35f	Align Maxwell 3D macro newline slashes	2022-04-14 14:14:52 +05:30
Billy Laws	d79635772f	Implement support for GPU-side constant buffer updating Previously constant buffer updates would be handled on the CPU and only the end result would be synced to the GPU before execute. This caused issues as if the constant buffer contents were changed between each draw in a renderpass (e.g. text rendering) the draws themselves would only see the final resulting constant buffer. Fix this by updating cbufs on the GPU/CPU separately, only ever syncing them back at the start or after a guest side CPU write, at the moment only a single word is updated at a time however this can be optimised in the future to batch all consecutive updates into one large one.	2022-04-14 14:14:52 +05:30
Robin Kertels	036faedabd	Implement a way to run non-graphics passes with command executor These commands will end the current renderpass and run on their own, this is useful for compute, blits etc.	2022-04-14 14:14:52 +05:30
Billy Laws	feb179fcff	Implement primitive restart support Maxwell3D also supports using an arbitrary restart index value however no games are known to use this so leave it for now.	2022-04-14 14:14:52 +05:30
Billy Laws	3f3acc31d8	Rework swizzle infrastructure to support arbitrary format swizzles This is required to support R4G4B4A4 which has no directly corresponding Vulkan format. Co-authored-by: Lunar-Pixel <lunarn452@gmail.com>	2022-04-14 14:14:52 +05:30
PixelyIon	6f85a66151	Implement host-only `Buffer`s We require certain buffers to only be on the host while being accessible through the same abstractions as a guest buffer as they must be interchangeable in usage.	2022-04-14 14:14:52 +05:30
Billy Laws	2c697ec36a	Determine depth/stencil texture aspect based off of image swizzle Required since we can't have a non-rt image with both a depth/stencil aspect at the same time according to vk spec.	2022-04-14 14:14:52 +05:30
PixelyIon	1878e582ad	Add `ScopedStackBlocker` to `RomFile.populate` We needed to block stack frame lookups past JNI code as Java doesn't follow the ARMv8 frame pointer ABI which leads to invalid pointer dereferences. Any JNI function that throws or handles exceptions must do this now or it may lead to a `SIGSEGV`.	2022-04-14 14:14:52 +05:30
Billy Laws	68e693d9f4	Fix DMA Engine debug logs to not crash emu Address causes some type issues when printing directly so explicitly cast to u64 first to prevent them.	2022-04-14 14:14:52 +05:30
Billy Laws	8eaca87de8	Use an empty host texture in place of invalid TIC entries on guest Some games may pass empty TICs as inputs to shaders while not actually using them within the shader. Create an empty texture and pass this in instead when we hit this case, the nullDescriptor feature could be used but it's not supported by all devices so we chose to do it this way instead.	2022-04-14 14:14:52 +05:30
PixelyIon	41b98c7daa	Add stack tracing to `skyline::exception` Skyline's `exception` class now stores a list of all stack frames during the invocation of the exception. These can later be parsed by the exception handler to generate a human-readable stack trace. To assist with more complete stack traces, `-fno-omit-frame-pointer` is now passed on debug builds which forces the inclusion of frames on function calls.	2022-04-14 14:14:52 +05:30
PixelyIon	cd8fa66326	Fix NCE Destruction NCE is implicitly depended on by the `GPU` class due to the NCE Memory Trapping API so the destruction of it must take place after the destruction of the `GPU` class. Additionally, to prevent bugs the NCE destructor must set `staticNce` to `nullptr` as the signal handler will potentially access a destroyed instance of NCE otherwise.	2022-04-14 14:14:52 +05:30
Billy Laws	815f1f4067	Add support for sRGB TIC textures Without this sRGB textures would be interpreted as RGB leading to colours being slightly off. The sRGB flag isn't stored as part of format word so we reuse the _pad_ field of it to store the flag for the switch case.	2022-04-14 14:14:52 +05:30
Billy Laws	1ba4abf950	Add Astc{6x6,8x8} and R4G4B4A4 image formats	2022-04-14 14:14:52 +05:30
MCredstoner2004	dec0571eee	Infrastructure for applets to be implemented This removes a stub for an applet and implements several applet related service calls.	2022-04-14 14:14:52 +05:30
PixelyIon	164d4852fa	Sleep-loop rather than abort during termination We don't want to actually exit the process as it'll automatically be restarted gracefully due to a timeout after being unable to exit within a fixed duration so we just want to infinite sleep during termination. This should fix issues where exiting any game would cause the app to force close after some time as exception signal handling would fail in the background, the app should stay open now and automatically restart itself when another game is loaded in.	2022-04-14 14:14:52 +05:30
PixelyIon	ea00f1bb82	Flush emulation logs after exceptions A lot of logs are incomplete due to being unable to flush inside the signal handler, now we flush after any exceptions so that there is a guarantee of any exceptions being logged as this is crucial for proper debugging.	2022-04-14 14:14:52 +05:30
PixelyIon	62ba180550	Use R5G6B5 as Vulkan swapchain format rather than B5G6R5 B5G6R5 isn't generally supported by the swapchain and the format is used for R5G6B5 with swapped R/B channels to avoid aliasing so we reverse that by using R5G6B5 as the underlying Vulkan format for the swapchain which should be automatically handled by the driver for any copies from B5G6R5 textures and the data representation should be the same as B5G6R5 with swapped R/B channels so not reporting the correct texture::Format should be fine.	2022-04-14 14:14:52 +05:30
MK73DS	e54f86e923	Fix IApplicationFunctions::GetDisplayVersion id (https://switchbrew.org/wiki/Applet_Manager_services#IApplicationFunctions)	2022-04-14 14:14:52 +05:30
Billy Laws	77cf33b643	Trigger command executor before DMA copies DMA copies can use textures currently in active use on the GPU as dst/src so Execute before to prevent a deadlock	2022-04-14 14:14:52 +05:30
Billy Laws	dbbc5704d2	Implement DMA engine Block Linear->Linear copies	2022-04-14 14:14:52 +05:30
Billy Laws	3e4e8de1d2	Implement primitive Linear->Block Linear DMA engine copies Slightly inaccurate and misses some features but good enough for most games, should be revisited later.	2022-04-14 14:14:52 +05:30
Billy Laws	3c26921d54	Implement the Maxwell DMA engine The DMA engine is used to perform DMA buffer/texture copies directly on the GPU. It can deswizzle arbitrary regions of input textures, perform component remapping and swizzle into output textures. This impl only supports 1D buffer copies, 2D ones will come later.	2022-04-14 14:14:52 +05:30
Billy Laws	3df76e84c3	Stub IRequest::GetAppletInfo in nifm	2022-04-14 14:14:52 +05:30
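
The megabuffer idea described in commit de796cd2cd can be illustrated with a small CPU-only model. This is a rough sketch under stated assumptions, not Skyline's implementation: `MegaBuffer`, `ConstantBuffer`, `Push`, `Write` and `RecordDraw` are hypothetical names, and a `std::vector` stands in for GPU memory. It only shows the core behaviour from the message: every usage after a write snapshots the whole (small) buffer, and each draw references its own snapshot by offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a megabuffer: one large allocation that accumulates
// a full snapshot of a small buffer for every draw that uses it.
struct MegaBuffer {
    std::vector<std::uint8_t> storage;

    // Appends a copy of `contents` and returns the offset of that snapshot.
    std::size_t Push(const std::vector<std::uint8_t> &contents) {
        std::size_t offset = storage.size();
        storage.insert(storage.end(), contents.begin(), contents.end());
        return offset;
    }
};

// Hypothetical guest constant buffer whose contents change between draws.
struct ConstantBuffer {
    std::vector<std::uint8_t> data;

    // Applies a single-byte write immediately (the "immediate sync" model).
    void Write(std::size_t offset, std::uint8_t value) {
        assert(offset < data.size());
        data[offset] = value;
    }
};

// Records one draw: snapshots the current buffer state and returns the
// offset that the draw would bind instead of the live buffer.
inline std::size_t RecordDraw(MegaBuffer &mega, const ConstantBuffer &cbuf) {
    return mega.Push(cbuf.data);
}
```

With this model, two draws separated by a write end up with two distinct offsets into the same storage, so both buffer states exist at once and no copy is needed between the draws. The fallback path for large or GPU-dirty buffers mentioned in the commit is not modelled here.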


874 Commits