skyline

mirror of https://github.com/skyline-emu/skyline.git synced 2024-12-27 21:41:50 +01:00

Author	SHA1	Message	Date
Billy Laws	37ff0ab814	Add buffer manager support for accelerated copies These will be sequenced on the GPU/CPU depending on what's optimal and avoid any serialisation	2022-11-02 17:46:07 +00:00
Billy Laws	cac287d9fd	Implement accelerated uploads/copies through buffer manager Previously, both I2M uploads and DMA copies would force GPU serialisation if they happened to hit a trap or were used to copy GPU dirty buffers. By using the buffer manager to implement them on the host GPU we can avoid such slowdowns entirely.	2022-11-02 17:46:07 +00:00
Billy Laws	c5ec484d9a	Avoid redundantly passing executor in ctors when it's already in ChannelCtx	2022-11-02 17:46:07 +00:00
Billy Laws	463394ba72	Pass correct size for XFB buffers	2022-11-02 17:46:07 +00:00
Billy Laws	bd976676f4	Fix SNorm vertex formats	2022-11-02 17:46:07 +00:00
Billy Laws	b74098570f	Zero-out unused XFB varyings before passing to hades	2022-11-02 17:46:07 +00:00
Billy Laws	22f3ba6b93	Mark XFB buffers as GPU dirty	2022-11-02 17:46:07 +00:00
Billy Laws	26aeeaecf5	Add constant buffer GPU write pipeline barrier	2022-11-02 17:46:07 +00:00
Billy Laws	0b5d9308c4	Be more careful about potentially-unneeded GPU->CPU syncs These can be especially expensive so should be avoided as much as possible.	2022-11-02 17:46:07 +00:00
Billy Laws	e6530e2386	Delete graphics_context F	2022-11-02 17:46:07 +00:00
Billy Laws	ac2e6c125b	Switch to Roboto for Korean font	2022-11-02 17:46:07 +00:00
Billy Laws	b24a8465da	Don't require depthClamp	2022-11-02 17:46:07 +00:00
Billy Laws	9055c98e09	Only enable debug/verbose logs in (rel)debug builds	2022-11-02 17:46:07 +00:00
Billy Laws	0ebdbcf0ff	Don't lock stateMutex when updating buffer cycle	2022-11-02 17:46:07 +00:00
Billy Laws	dd360b8f75	Pass correct wait semaphore array size to queue submit	2022-11-02 17:46:07 +00:00
Billy Laws	c78a4b9699	Fixup buffer recreation to avoid deadlock when waiting on srcs	2022-11-02 17:46:07 +00:00
Billy Laws	d236bfe454	Enable depthClamp VK device feature	2022-11-02 17:46:07 +00:00
Billy Laws	95d849e1f6	Check FenceCycle signalled flag immediately before waiting The lock release within the wait for submission means that another thread could end up signalling the cycle, and the VK wait would then still happen after the lock has been reacquired.	2022-11-02 17:46:07 +00:00
Billy Laws	1a23b929a7	Avoid chaining cycles in buffer recreation This had a chance of creating circular chains, which obviously caused issues; just do a wait instead for now.	2022-11-02 17:46:07 +00:00
Billy Laws	a15db9cb06	Update hades submodule	2022-11-02 17:46:07 +00:00
Billy Laws	cfc55e60b0	Add robin map submodule	2022-11-02 17:46:07 +00:00
Billy Laws	6c0f084aae	Introduce hack to ignore frequently read-back textures Readback can be especially slow on mobile due to the varying load pattern it creates which often prevents the CPU/GPU from clocking up. Since some games perform texture readback but don't actually use it for anything significant implement a hack to skip it and significantly improve performance in such cases.	2022-11-02 17:46:07 +00:00
Billy Laws	e45e7546c8	Redesign buffer megabuffering Due to the frequency at which it is called, megabuffering performance is critical to the performance of the entire emulator, especially in high-drawcall-count scenarios. After the view redesign, megabuffering on a per-view level was no longer possible nor desirable, and thus megabuffering was modified to just copy for every usage of a view. This worked great at the time since there were other bottlenecks, however gpu-new has since removed almost all of them and megabuffering is now a major sore point. Fix this by megabuffering small chunks and storing them in a page-table-like structure within the buffer; these chunks can be referenced by multiple views and will be smartly invalidated whenever the sequence number or execution number changes to avoid any sequencing issues. In addition to this, to help the case where almost the whole buffer is read every single frame across a set of multiple views, an optimisation to skip the chunked tracking and use one large single megabuffer allocation and one single memcpy has been introduced. This reduces the overall amount of time spent in memcpy since large memcpys are quicker.	2022-11-02 17:46:07 +00:00
Billy Laws	7ea9aa52f5	Speed up reported guest GPU time Avoids triggering DRS in games in cases where it wouldn't actually benefit anything due to being CPU bottlenecked.	2022-11-02 17:46:07 +00:00
Billy Laws	31c2fb7d7a	Fixup IDirectory read	2022-11-02 17:46:07 +00:00
Billy Laws	7491178a9e	Pass base array layer to texture views	2022-11-02 17:46:07 +00:00
Billy Laws	ff57d2fbbf	Enforce stronger format and weaker dimension texture compat checks Rather than using just bpb for format compat, additionally check that the exact component bit layout matches since many games end up reusing RTs for unrelated textures. The texture size requirements have also been weakened to only check the resulting layer size as opposed to width/height - this is somewhat hacky but it gets around the problem of blocklinear alignment.	2022-11-02 17:46:07 +00:00
Billy Laws	14af383238	Only allow submitting `swapchainImageCount` images for host present at a time Prevents situations where nothing would otherwise be waiting on the GPU and, since presentation no longer blocks, too many images would be submitted for presentation.	2022-11-02 17:46:07 +00:00
Billy Laws	bcd96ac77d	Fixup A8R8G8B8 TIC format mapping 8-bit formats are inverted in TICs compared to Vulkan	2022-11-02 17:46:07 +00:00
Billy Laws	90466b8830	Implement depth clamp rasterisation state Used in SMO for shadows.	2022-11-02 17:46:07 +00:00
Billy Laws	1cfc4278f9	Disable preserve buffer/texture attachment opt for now Causes several issues and crashes in Pokemon without an obvious cause.	2022-11-02 17:46:07 +00:00
Billy Laws	e483cf9634	Use shader memory mirror when reading guest shaders Avoids triggering any traps that may be present on the region	2022-11-02 17:46:07 +00:00
Billy Laws	f6e4328b5a	Ensure blit src/dst textures are attached as execution cycle dependencies Since they're not in the TIC pool they would otherwise be freed	2022-11-02 17:46:07 +00:00
Billy Laws	77a131df60	Support using in-app renderdoc API to capture individual executions	2022-11-02 17:46:07 +00:00
Billy Laws	576bc6f37e	Add CommandExecutor slot count setting	2022-11-02 17:46:07 +00:00
Billy Laws	1a0819fb76	Use semaphores for presentation engine frame synchronisation Avoids waits on the CPU which can be costly and confuse the scheduler, also reduces latency significantly.	2022-11-02 17:46:07 +00:00
Billy Laws	0670e0e0dc	Support using Vulkan semaphores with fence cycles In some cases like presentation, it may be possible to avoid waiting on the CPU by using a semaphore to indicate GPU completion. Due to the binary nature of Vulkan semaphores this requires a fair bit of code as we need to ensure semaphores are always unsignalled before they are waited on and signalled again. This is achieved with a special kind of chained cycle that can be added even after guest GPFIFO processing for a given cycle, the main cycle's semaphore can be waited and then the cycle for the wait attached to the main cycle and it will be waited on before signalling.	2022-11-02 17:46:07 +00:00
Billy Laws	5b72be88c3	Stub ldn:u service	2022-11-02 17:46:07 +00:00
Billy Laws	77d76ed05a	Batch contiguous GMMU ranges into one	2022-11-02 17:46:07 +00:00
Billy Laws	e52dbf202f	Pass more Maxwell3D registers into interconnect	2022-11-02 17:46:07 +00:00
Billy Laws	83c7ed314e	Setup KThread pthread handle in StartThread Avoids a race with starting the thread and the handle not being set yet	2022-11-02 17:46:07 +00:00
Billy Laws	9784ae23e9	Skip checking affinity before taking load-balance WaitScheduler path The affinity mask may be set after the wait has begun	2022-11-02 17:46:07 +00:00
Billy Laws	ad3195e06f	Split out guest texture layer size calcs into a separate func	2022-11-02 17:46:07 +00:00
Billy Laws	8fa83fdf13	Fix deswizzling non-pow2 block size formats We need to use DivideCeil to avoid rounding off part of the texture. Fixes texture in Nier Automata: Game of the YoRHa edition.	2022-11-02 17:46:07 +00:00
Billy Laws	27de42f8df	Use surfaceClip as a hint for the underlying rendertarget size TIC sizes may not be aligned to block linear dimensions whereas RT sizes are and then limited by the surface clip. By using this to determine surface size we are more likely to get a match in texture manager for any future usages.	2022-11-02 17:46:07 +00:00
Billy Laws	297597f697	Fix texture manager depth compat comparison	2022-11-02 17:46:07 +00:00
Billy Laws	500f817a28	Synchronize all non-matching textures back to host before recreation	2022-11-02 17:46:07 +00:00
Billy Laws	05581f2230	Remove now redundant buffer/texture/megabuffer manager locks They have been superseded by the global channel lock	2022-11-02 17:46:07 +00:00
Billy Laws	f5a141a621	Add dirty resource operator*	2022-11-02 17:46:07 +00:00
Billy Laws	b72720e8db	Finish off transform feedback implementation	2022-11-02 17:46:07 +00:00
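
Note on commit 8fa83fdf13 (deswizzling non-pow2 block size formats): the message says DivideCeil is needed so that part of the texture is not rounded off. The sketch below only illustrates that arithmetic. It is not Skyline's code: the function body, the 10-texel width and the 6-texel block width are assumptions chosen for the example.

```python
def divide_ceil(dividend: int, divisor: int) -> int:
    """Integer division that rounds up instead of down (hypothetical helper)."""
    if dividend < 0 or divisor <= 0:
        raise ValueError("expects a non-negative dividend and a positive divisor")
    return (dividend + divisor - 1) // divisor


# Assumed example values: a texture 10 texels wide stored in blocks 6 texels wide.
width_texels = 10
block_width = 6

# Plain floor division drops the partially filled final block.
blocks_floor = width_texels // block_width

# Rounding up keeps the block that holds the remaining 4 texels.
blocks_ceil = divide_ceil(width_texels, block_width)

print(blocks_floor, blocks_ceil)
```

With a power-of-two block width and a texture width that is a multiple of it, both divisions agree, which may be why the rounding only showed up with non-pow2 block sizes.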

1515 Commits