Avoid trapping frequently synced buffers by using megabuffer copies

When a buffer is trapped nearly every frame, the cost of trapping and synchronising its contents quickly adds up. Always using the megabuffer in this case avoids that cost: megabuffer copies are done directly from the guest, so there is no need to synchronise or trap the backing.
Billy Laws 2022-09-17 12:53:50 +01:00
parent a24aec03a6
commit 99a34df4cc
3 changed files with 4 additions and 2 deletions

@@ -279,7 +279,7 @@ namespace skyline::gpu {
 // Bail out if buffer cannot be synced, we don't know the contents ahead of time so the sequence is indeterminate
 return {};
-if (!everHadInlineUpdate)
+if (!everHadInlineUpdate && sequenceNumber < FrequentlySyncedThreshold)
 // Don't megabuffer buffers that have never had inline updates and are not frequently synced, since performance is only going to be harmed as a result of the constant copying and there won't be any benefit since there are no GPU inline updates that would be avoided
 return {};

@@ -70,6 +70,7 @@ namespace skyline::gpu {
 bool everHadInlineUpdate{}; //!< Whether the buffer has ever had an inline update since it was created, if this is set then megabuffering will be attempted by views to avoid the cost of inline GPU updates
 static constexpr u64 InitialSequenceNumber{1}; //!< Sequence number that all buffers start off with
+static constexpr u64 FrequentlySyncedThreshold{15}; //!< Threshold for the sequence number after which the buffer is considered eligible for megabuffering
 u64 sequenceNumber{InitialSequenceNumber}; //!< Sequence number that is incremented after all modifications to the host side `backing` buffer, used to prevent redundant copies of the buffer being stored in the megabuffer by views
 constexpr static vk::DeviceSize MegaBufferingDisableThreshold{1024 * 128}; //!< The threshold at which a view is considered to be too large to be megabuffered (128KiB)
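
The eligibility rule introduced by the two hunks above can be illustrated with a small standalone model. This is only a sketch and not the Skyline implementation: `BufferModel`, `MarkBackingModified` and `EligibleForMegaBuffering` are hypothetical names invented for illustration, and the model ignores the other bail-out conditions (such as the view size limit). Only the two constants and the two member names are taken from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the buffer state touched by this commit
struct BufferModel {
    static constexpr std::uint64_t InitialSequenceNumber{1};      // value taken from the diff
    static constexpr std::uint64_t FrequentlySyncedThreshold{15}; // value taken from the diff

    bool everHadInlineUpdate{};                          // set once a GPU inline update has occurred
    std::uint64_t sequenceNumber{InitialSequenceNumber}; // bumped on each host backing modification

    // Hypothetical helper: models one modification of the host-side backing
    void MarkBackingModified() {
        sequenceNumber++;
    }

    // Hypothetical helper: the inverse of the bail-out condition in the first hunk,
    // i.e. !(!everHadInlineUpdate && sequenceNumber < FrequentlySyncedThreshold)
    bool EligibleForMegaBuffering() const {
        return everHadInlineUpdate || sequenceNumber >= FrequentlySyncedThreshold;
    }
};

// Walks a fresh buffer across the threshold and checks each stage
inline void CheckModel() {
    BufferModel buffer{};
    assert(!buffer.EligibleForMegaBuffering()); // fresh buffer, sequence number 1

    for (int i{}; i < 14; i++)
        buffer.MarkBackingModified();           // sequence number reaches 15
    assert(buffer.EligibleForMegaBuffering());
}
```

Under these assumptions, a buffer that has never had an inline update becomes eligible after 14 backing modifications, when its sequence number reaches the threshold of 15. A buffer with `everHadInlineUpdate` set is eligible immediately, which matches the behaviour before this commit.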

@@ -286,7 +286,8 @@ namespace skyline::gpu::interconnect {
 commandBuffer.end();
 for (const auto &attachedBuffer : attachedBuffers)
-attachedBuffer->SynchronizeHost(); // Synchronize attached buffers from the CPU without using a staging buffer, this is done directly prior to submission to prevent stalls
+if (attachedBuffer->SequencedCpuBackingWritesBlocked())
+attachedBuffer->SynchronizeHost(); // Synchronize attached buffers from the CPU without using a staging buffer, this is done directly prior to submission to prevent stalls
 gpu.scheduler.SubmitCommandBuffer(commandBuffer, cycle);