Commit Graph

1050 Commits

Author SHA1 Message Date
Billy Laws
d1e7bbc1d8 Introduce common code for Maxwell 3D interconnect rewrite
This common code will be used across the entirety of the 3D rewrite, it also includes a stub for StateUpdateBuilder, which will be used by active state code to apply state updates.
2022-11-02 17:46:07 +00:00
Billy Laws
a6c49115f9 Rewrite all Maxwell 3D registers up to clears to match Nvidia docs
All the names are directly translated from Nvidia docs, with minimal conversions to enums/structs when appropriate. Not all registers have been rewritten, only those that are needed to implement clears and dynamic state, the rest will be added as they are used in the GPU rework.
2022-11-02 17:46:07 +00:00
Billy Laws
d7eab40f1c Introduce resource based dirty tracking infrastructure
This will be heavily used by the upcoming GPU rework. It provides an intuitive way to track dirtiness based on using the underlying pointers of objects, as opposed to other methods which often need an enum entry per dirty state and don't support overlaps. Wrappers for dirty state objects are also provided to abstract as much of the dirty tracking as possible from user code. The pointer based mechanism also serves to avoid having to handle dirty bindings on the user side of the dirty resources, allowing them to bind things internally instead.
2022-11-02 17:46:07 +00:00
Billy Laws
8471ab754d Introduce a spin lock for resources locked at a very high frequency
Constant buffer updates result in a barrage of std::mutex calls that take a lot of time even under no contention (around 5%). Using a custom spinlock in cases like these allows inlining locking code reducing the cost of locks under no contention to almost 0.
2022-11-02 17:46:07 +00:00
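Example (an illustrative sketch, not the code from this commit): a minimal spin lock built on `std::atomic_flag`, showing why the uncontended path is cheap. The class name `SpinLock` and its layout are assumptions; the real implementation may add backoff or yielding under contention.

```c++
#include <atomic>
#include <mutex>

// Hypothetical minimal spin lock, the name and layout are assumptions
class SpinLock {
    std::atomic_flag locked = ATOMIC_FLAG_INIT;

  public:
    void lock() {
        // Uncontended case: a single atomic exchange that the compiler can inline
        while (locked.test_and_set(std::memory_order_acquire)) {
            // Contended case: keep spinning until the holder calls unlock()
        }
    }

    bool try_lock() {
        // Succeeds only if the flag was previously clear
        return !locked.test_and_set(std::memory_order_acquire);
    }

    void unlock() {
        locked.clear(std::memory_order_release);
    }
};
```

Because it provides `lock()`, `try_lock()` and `unlock()`, such a type can be used with `std::lock_guard<SpinLock>` in place of `std::mutex`.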
Billy Laws
d810619203 Drop 3D engine method calling fast path in GPFIFO
This ended up actually turning out to be a slow path when Maxwell 3D method handling code was inlined.
2022-11-02 17:46:07 +00:00
Billy Laws
ded02e3eac Small engine.h fixups 2022-11-02 17:46:07 +00:00
Billy Laws
38ba963311 Drop usage of unique_ptr for Maxwell3D
Since graphics context is being replaced and split into cpp files there will no longer be any circular includes that previously prevented this.
2022-11-02 17:46:07 +00:00
Billy Laws
90db743c56 Source AsGpu GMMU page sizes from GMMU class 2022-11-02 17:46:07 +00:00
Billy Laws
e72fe02c15 Add inline fast-path for Buffer::FindOrCreate()
This can be inlined by the compiler much more easily, which helps perf a fair bit due to the number of times buffers are looked up. It also avoids the need for the small vector construction that was done in the previous fast-path.
2022-11-02 17:46:07 +00:00
Billy Laws
49478e178a Avoid redundantly syncing buffers before every Write in an execution
This isn't a guarantee provided by actual HW so we don't need to provide it either; the sync can be skipped once the buffer has already been synced at least once within the execution.
2022-11-02 17:46:07 +00:00
Billy Laws
f7a726e452 Allow attempting to write to buffers without passing a GPU copy callback
Constructing the GPU copy callback in `ConstantBuffers::Load()` ended up taking a fair amount of time despite it almost never being used in practice. By making it optional it can be skipped most of the time and only done when it's actually necessary by calling `Write()` again if the initial call returned true.
2022-11-02 17:46:07 +00:00
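Example (an illustrative sketch, not the code from this commit): the retry pattern described above, where the callback is only constructed when the first `Write()` reports that it is needed. The `Buffer` struct, its `gpuDirty` flag, the counters, and the free function `Load()` are all hypothetical stand-ins.

```c++
#include <functional>

// Hypothetical stand-in for the real buffer type
struct Buffer {
    bool gpuDirty{}; // Assumed: set while the GPU holds newer contents than the CPU copy
    int cpuWrites{}; // Counts writes that were done directly on the CPU
    int gpuCopies{}; // Counts writes that went through the GPU copy callback

    // Returns true if the write requires a GPU copy callback but none was passed
    bool Write(const std::function<void()> &gpuCopyCallback = {}) {
        if (!gpuDirty) {
            cpuWrites++;
            return false;
        }
        if (!gpuCopyCallback)
            return true; // Caller should retry with a callback
        gpuCopyCallback();
        gpuCopies++;
        return false;
    }
};

void Load(Buffer &buffer) {
    // Common case: no callback is constructed at all
    if (buffer.Write())
        buffer.Write([] { /* A GPU-side copy would be recorded here */ });
}
```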
Billy Laws
5dca5cc10e Redesign buffer view infra to remarkably reduce creation overhead
Buffer view creation was a significant pain point, requiring several layers of caching to reduce the number of creations, which introduced a lot of complexity. By reworking delegates to be per-buffer rather than per-view and then linearly allocating delegates (without ever freeing), views can be reduced to just {delegatePtr, offset, size}, avoiding the need for any allocations or set operations in GetView. The one difficulty with this is the need to support buffer recreation, which is achieved by allowing delegates to be chained: during recreation all source buffers have their delegates modified to point to the newly created buffer's delegate. Upon accessing a view with such a chained delegate, the view will be modified to point directly to the end delegate with the offset being updated accordingly, skipping the need to traverse the chain for future accesses.
2022-11-02 17:46:07 +00:00
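Example (an illustrative sketch, not the code from this commit): chained delegates with the view rewriting itself on access, as described above. All type and member names here (`BufferDelegate`, `link`, `linkOffset`, `Resolve()`) are assumptions made for illustration.

```c++
#include <cstddef>

struct Buffer {
    int id;
};

// Hypothetical per-buffer delegate, `link` is set when the buffer is recreated
struct BufferDelegate {
    Buffer *buffer;
    BufferDelegate *link{};   // Next delegate in the chain, null at the end of the chain
    std::size_t linkOffset{}; // Offset of this buffer inside the linked (newer) buffer
};

// A view is just {delegatePtr, offset, size}, so creating one needs no allocation
struct BufferView {
    BufferDelegate *delegate;
    std::size_t offset;
    std::size_t size;

    // Walks the chain once and repoints the view at the end delegate,
    // so later accesses skip the traversal entirely
    Buffer *Resolve() {
        while (delegate->link) {
            offset += delegate->linkOffset;
            delegate = delegate->link;
        }
        return delegate->buffer;
    }
};
```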
Billy Laws
09f376e500 Add const accessors to OffsetMember 2022-11-02 17:46:07 +00:00
Billy Laws
64a9db2e82 Introduce MergeInto helper for simplified construction of arrays of structs
In the upcoming GPU code each state member will hold a reference to its corresponding Maxwell 3D regs; this helper is needed to allow easy transformation from the main 3D register struct into them.

Example:
```c++
struct Regs {
    std::array<View, 10> viewRegs;
    u32 enable;
} regs;

struct ViewState {
    const View &view;
    const u32 &enable;
    size_t index;
};

std::array<ViewState, 10> viewStates{MergeInto<ViewState, 10>(regs.viewRegs, regs.enable, IncrementingT{})};
```
2022-11-02 17:46:07 +00:00
Billy Laws
2c682f19a6 Add untracked linear allocator emplace/allocate functions
Useful for cases where allocations are guaranteed to be unused by the time `Reset()` is called and calling `Free()` would be difficult or add extra performance cost due to how the allocation is used.
2022-11-02 17:46:07 +00:00
Billy Laws
6359852652 Introduce page size constants and replace all usages of PAGE_SIZE
Avoids using macros and results in code which looks slightly cleaner.
2022-11-02 17:46:07 +00:00
Billy Laws
30ec844a1b Use GPFIFO pushbuffer contents in-place if possible
The memcpy within `Read()` was taking up a fair amount of time, avoid this by using the mapped range in-place when the mapping isn't split.
2022-11-02 17:46:07 +00:00
Billy Laws
be825b7aad Utilise SegmentTable for rapid FlatMemoryManager lookups
In some games performing the binary search in `TranslateRange()` ended up taking a fairly large (~8%) proportion of GPFIFO time. By using a segment table for O(1) lookups this is reduced to <2% for non-split mappings at the cost of slightly increased memory usage (2GiB in the absolute worst case but more like 50MiB in real world situations).

In addition to adapting `TranslateRange()` to use the segment table, a new function `LookupBlock()` is introduced for cases where only a single mapping would ever be looked up, so the small_vector handling and fallback paths can be skipped and the entire lookup can be inlined.
2022-11-02 17:46:07 +00:00
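Example (an illustrative sketch, not the code from this commit): a flat segment table that trades memory for O(1) lookups. The 64KiB segment size, the `Set()`/`Lookup()` names, and the requirement that ranges are segment aligned are assumptions; the real table also has to handle split mappings, which this sketch omits.

```c++
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical segment table: one entry per fixed-size segment of the address space
template <typename Entry>
class SegmentTable {
    static constexpr std::size_t SegmentBits{16}; // Assumed 64KiB segments
    std::vector<Entry> entries;                   // Memory cost grows with the address space size

  public:
    explicit SegmentTable(std::size_t addressSpaceSize)
        : entries(addressSpaceSize >> SegmentBits) {}

    // Fills every segment covered by [start, end), both assumed to be segment aligned
    void Set(std::uint64_t start, std::uint64_t end, const Entry &entry) {
        for (std::uint64_t segment{start >> SegmentBits}; segment < (end >> SegmentBits); segment++)
            entries[segment] = entry;
    }

    // O(1): a shift and an array index rather than a binary search over mappings
    const Entry &Lookup(std::uint64_t address) const {
        return entries[address >> SegmentBits];
    }
};
```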
Morph
4ea0b0e1e5 fssrv: IFileSystemProxy: Implement OpenReadOnlySaveDataFileSystem
Forward this function to OpenSaveDataFileSystem for now. A proper implementation should wrap the underlying filesystem with nn::fs::ReadOnlyFileSystem.
2022-10-30 20:04:40 +00:00
Narr the Reg
25b9bb00fd service: hid: Properly clear and set npad devices 2022-10-30 15:52:03 +00:00
german77
cf95cfb056 service: hid: Stub SetPalmaBoostMode 2022-10-30 15:52:03 +00:00
Narr the Reg
4da934579c service: hid: Set the correct maxEntry value and signal on acquire event handle 2022-10-30 15:52:03 +00:00
Dima
baa6b5d5ea check if NpadId is valid when update 2022-10-30 15:51:45 +00:00
Dima
a409f30e91 add GetAvailableLanguageCodeCount for both lists 2022-10-30 15:51:29 +00:00
Dima
51ce3f7c3c Stub IClient::Poll 2022-10-29 19:52:50 +05:30
PixelyIon
128ea33073 Print NPDM + NACP metadata for title determination
We determined that printing NPDM + NACP metadata is a significantly better way to determine what title is running rather than printing the filename.
2022-10-23 20:20:44 +05:30
PixelyIon
a0539a3edb Trace Scheduler Preemption/Yield in Perfetto 2022-10-22 17:37:03 +05:30
PixelyIon
c874907eb5 Log and flush inside KProcess::Kill
We want to know when the `KProcess` is being killed, and flushing the log during it is important since killing can often result in hangs due to joining not working correctly.
2022-10-22 17:17:04 +05:30
PixelyIon
597a6ff31d Wait on slot to be freed in GraphicBufferProducer::DequeueBuffer
We currently don't wait on a slot to be freed if none are free, this worked prior to async presentation as GBP's slots wouldn't change their state until other commands were called but now slots can be held by the presentation engine. As a result, we now have to wait on the presentation engine to free up slots.

This commit also fixes the behavior of the `async` flag in `DequeueBuffer` as it was treated as a non-blocking flag but isn't supposed to do anything on HOS.
2022-10-22 17:15:33 +05:30
lynxnb
cdb2b85d6c Stub ngword and ngword2
Co-Authored-By: Timotej Leginus <35149140+timleg002@users.noreply.github.com>
2022-10-18 20:54:57 +01:00
lynxnb
a7be4fd1e1 Stub pl::RequestLoad and pl::GetSharedFontInOrderOfPriority
Co-Authored-By: Timotej Leginus <35149140+timleg002@users.noreply.github.com>
2022-10-18 20:54:57 +01:00
lynxnb
782f9e37ee Add a system region setting
Needed for games such as AC:NH.
The `Auto` option automatically selects a region based on the currently selected system language.

Co-Authored-By: Timotej Leginus <35149140+timleg002@users.noreply.github.com>
2022-10-18 20:54:57 +01:00
lynxnb
d4800d13b8 Stub hid::ActivateMouse and hid::ActivateKeyboard
Co-Authored-By: Timotej Leginus <35149140+timleg002@users.noreply.github.com>
2022-10-18 20:54:57 +01:00
lynxnb
45830633eb Stub am::SetTerminateResult 2022-10-18 20:54:57 +01:00
lynxnb
bc016aff47 Make the vulkan validation layer toggleable via setting
As part of this commit, a new preference category for debug settings is being introduced. All future settings only relevant for debugging purposes will be put there. The category is hidden on release builds.
2022-10-18 19:47:23 +02:00
Billy Laws
8d1026d0cc Reduce font shared memory size for compacted fonts 2022-09-22 21:51:13 +01:00
Billy Laws
e31ed6a429 Increase font shared memory size for Noto fonts 2022-09-22 21:34:29 +01:00
Billy Laws
0d4893c448 Cleanup font magic generation code 2022-09-22 21:34:29 +01:00
Billy Laws
5c4cc3d51f Fix font load order to match HOS
Without this games would load an inappropriate font file for their target language.
2022-09-22 21:34:29 +01:00
lynxnb
54172322fe Fix host synchronization for texture with a different guest format
Host synchronization of a guest texture with a different guest format represents a valid use case where the host doesn't support the guest format and conversion to a host-compatible format must be performed. The issue is most evident on Mali GPUs, as they don't support BCn texture formats thus needing manual decoding before submission. It was disabled by mistake in a previous commit, this commit re-enables it.
2022-09-15 15:22:52 +02:00
TheASVigilante
a1ff4e1777 Implement OpenHardwareOpusDecoderEx and GetWorkBufferSizeEx
Implements these 2 functions which were introduced in HOS 12.0.0. Fixes a crash in Xenoblade Chronicles 3.
2022-09-10 22:04:15 +01:00
lynxnb
34bd16426c Fix quads index buffer conversion not accounting for first index
Unindexed quad draws were broken when multiple draw calls were done on the same vertex buffer, with a non-zero `first` index.
Indexed quad draws also suffered from the same issue, but it was never encountered in games.
This commit fixes both cases by accounting for the `first` drawn index when generating conversion index buffers.
2022-09-04 12:42:33 +02:00
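Example (an illustrative sketch, not the code from this commit): one way to account for `first` when generating a conversion index buffer for an unindexed quad draw. The function name and the (0, 1, 2) + (0, 2, 3) triangle split are assumptions.

```c++
#include <cstdint>
#include <vector>

// Hypothetical helper: builds a triangle-list index buffer for an unindexed quad draw,
// turning each quad (v0, v1, v2, v3) into the triangles (v0, v1, v2) and (v0, v2, v3)
std::vector<std::uint32_t> GenerateQuadListIndices(std::uint32_t first, std::uint32_t vertexCount) {
    std::vector<std::uint32_t> indices;
    std::uint32_t quadCount{vertexCount / 4};
    indices.reserve(quadCount * 6);
    for (std::uint32_t quad{}; quad < quadCount; quad++) {
        // Offset by `first` so draws starting partway into a shared vertex buffer
        // reference the intended vertices
        std::uint32_t base{first + quad * 4};
        for (std::uint32_t offset : {0u, 1u, 2u, 0u, 2u, 3u})
            indices.push_back(base + offset);
    }
    return indices;
}
```

With `first = 4` and `vertexCount = 4` this yields the indices 4, 5, 6, 4, 6, 7; ignoring `first` would instead yield 0, 1, 2, 0, 2, 3 and draw the wrong quad.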
Billy Laws
9af5df4bae Increase IPC pointer buffer size
Some services report a pointer buffer size > 0x1000, so increase it so as not to cause issues when they're implemented.
2022-09-02 23:14:05 +01:00
Billy Laws
51f4e7662e Add support for the TIPC protocol introduced in HOS 12.0.0
TIPC is a much lighter layer on top of the Horizon IPC system than CMIF and is used by SM in 12.0.0+. This implementation is slightly hacky since it doesn't really keep a separation between the underlying kernel IPC stuff and userspace like CMIF/TIPC. This should be fixed eventually, probably together with an IPC dispatch rewrite to avoid the mess of frozen maps.

Tested with Hentai Uni, which now crashes needing 'ldr:ro'.
2022-09-02 23:13:23 +01:00
Billy Laws
a40d7c78ad Always recreate oboe stream on error
This is what's done in oboe examples to avoid spurious errors breaking audio entirely.
2022-08-31 21:26:14 +01:00
Billy Laws
8917ec9c88 Don't set framesPerCallback for main stream as per oboe guidance
It's best to let oboe figure it out on its own.
2022-08-31 21:26:14 +01:00
Billy Laws
b00008daf5 Fix identifier release check in AudioTrack::Stop 2022-08-31 21:26:14 +01:00
Billy Laws
d9c8e62d1c Don't warn on GetConfig IOCTL fails 2022-08-31 21:26:14 +01:00
Billy Laws
4aef24ba32 Implement NVGPU_GPU_IOCTL_GET_GPU_TIME in nvdrv 2022-08-31 21:26:14 +01:00
Billy Laws
5841799420 Fix decoding of IOCTLs with padding at the end 2022-08-31 21:26:14 +01:00