Commit Graph

166 Commits

Author SHA1 Message Date
Mary
48f6570557
Salieri: shader cache (#1701)
Here comes Salieri, my implementation of a disk shader cache!

"I'm sure you know why I named it that."
"It doesn't really mean anything."

This implementation collects shaders at runtime and caches them so they can be compiled later, when starting a game.
2020-11-13 00:15:34 +01:00
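The collect-then-precompile flow described in this commit can be sketched roughly as follows. This is a minimal Python illustration, not Salieri's actual code: `DiskShaderCache`, `get_or_compile`, `preload`, and the one-file-per-hash layout are all hypothetical, and `compile_fn` stands in for the real shader compiler.

```python
import hashlib
import tempfile
from pathlib import Path


class DiskShaderCache:
    """Hypothetical sketch: store guest shader code on disk, keyed by its hash."""

    def __init__(self, cache_dir, compile_fn):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.compile_fn = compile_fn  # stand-in for the real compiler
        self.compiled = {}            # hash -> compiled program (in memory)

    def get_or_compile(self, code: bytes):
        """Runtime path: compile on first use and record the code on disk."""
        key = hashlib.sha256(code).hexdigest()
        if key not in self.compiled:
            (self.cache_dir / key).write_bytes(code)
            self.compiled[key] = self.compile_fn(code)
        return self.compiled[key]

    def preload(self):
        """Startup path: compile everything collected during earlier runs."""
        for entry in sorted(self.cache_dir.iterdir()):
            self.compiled[entry.name] = self.compile_fn(entry.read_bytes())
        return len(self.compiled)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # First run: the shader is seen at runtime and written to the cache.
        DiskShaderCache(tmp, bytes.upper).get_or_compile(b"shader code")
        # Second run: the cached shader is compiled up front at startup.
        print(DiskShaderCache(tmp, bytes.upper).preload())
```

Keying entries by a hash of the shader code means the same shader is stored only once, however many times it is encountered.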
riperiperi
02872833b6
Size hints for copy regions and viewport dimensions to avoid data loss (#1686)
* Size hints for copy regions and viewport dimensions to avoid data loss

* Reword comment.

* Use info for the rule rather than calculating aligned size.

* Reorder min/max, remove spaces
2020-11-09 21:41:13 -03:00
gdkchan
934a78005e
Simplify logic for bindless texture handling (#1667)
* Simplify logic for bindless texture handling

* Nits
2020-11-09 19:35:04 -03:00
gdkchan
8d168574eb
Use explicit buffer and texture bindings on shaders (#1666)
* Use explicit buffer and texture bindings on shaders

* More XML docs and other nits
2020-11-08 12:10:00 +01:00
riperiperi
5561a3b95e
Synchronize Rasterizer State before Clear (#1680)
2020-11-07 16:21:10 -03:00
riperiperi
500b48251c
Only report that GPU commands are available when the queue is not empty. (#1656)
* Only report that commands are available when the queue is not empty.

* Address Feedback

Co-authored-by: FICTURE7 <FICTURE7@gmail.com>
2020-11-06 23:04:26 -03:00
riperiperi
a16f201a6f
Do not align sizes for buffer texture targets. (#1671)
This should fix a random crash in Hyrule Warriors, and potentially other games that use buffer textures.
2020-11-06 18:45:30 +01:00
gdkchan
24dbfc0fe6
Correct BPP of buffer to texture copies (#1670)
2020-11-06 18:37:05 +01:00
gdkchan
a89b81a812
Separate zeta from color formats (#1647)
2020-11-05 23:50:34 +01:00
riperiperi
4c6feb652f
Add seamless cubemap flag in sampler parameters. (#1658)
* Add seamless cubemap flag in sampler parameters.

* Check for the extension
2020-11-02 17:03:06 -03:00
riperiperi
e1da7df207
Support res scale on images, correctly blacklist for SUST, move logic out of backend. (#1657)
* Support res scale on images, correctly blacklist for SUST, move logic out of backend.

* Fix Typo
2020-11-02 16:53:23 -03:00
gdkchan
11a7c99764
Support 3D BC4 and BC5 compressed textures (#1655)
* Support 3D BC4 and BC5 compressed textures

* PR feedback

* Fix some typos
2020-11-01 15:32:53 -03:00
gdkchan
876fa656f6
Remove unused texture and sampler pool invalidation code (#1648)
2020-11-01 15:17:29 -03:00
gdkchan
423da5cc91
Scale texture resolution before sending to backend (#1646)
* Work

* Propagate scale factor to copy temp. Not really needed, just here for consistency

* PR feedback
2020-10-29 22:57:34 +01:00
gdkchan
812e32f775
Fix transform feedback errors caused by host pause/resume and multiple uses (#1634)
* Fix transform feedback errors caused by host pause/resume

* Fix TFB being used as something else issue with copies

* This is supposed to be StreamCopy
2020-10-25 17:23:42 -03:00
gdkchan
cf0f0fc4e7
Improve the speed of redundant ASTC texture data updates (#1636)
2020-10-25 17:09:45 -03:00
gdkchan
efa77a2415
Add missing null check on image binding (#1632)
2020-10-21 14:06:13 +02:00
gdkchan
2dcc6333f8
Fix image binding format (#1625)
* Fix image binding format

* XML doc
2020-10-20 19:03:20 -03:00
riperiperi
08332bdc04
Ensure storage is set for Buffer Textures when binding an Image. (#1627)
2020-10-20 18:56:23 -03:00
riperiperi
b4d8d893a4
Memory Read/Write Tracking using Region Handles (#1272)
* WIP Range Tracking

- Texture invalidation seems to have large problems
- Buffer/Pool invalidation may have problems
- Mirror memory tracking puts an additional `add` in compiled code, we likely just want to make HLE access slower if this is the final solution.
- Native project is in the messiest possible location.
- [HACK] JIT memory access always uses native "fast" path
- [HACK] Trying some things with texture invalidation and views.

It works :)

Still a few hacks, messy things, slow things

More work in progress stuff (also move to memory project)

Quite a bit faster now.
- Unmapping GPU VA and CPU VA will now correctly update write tracking regions, and invalidate textures for the former.
- The Virtual range list is now non-overlapping like the physical one.
- Fixed some bugs where regions could leak.
- Introduced a weird bug that I still need to track down (consistent invalid buffer in MK8 ribbon road)

Move some stuff.

I think we'll eventually just put the dll and so for this in a nuget package.

Fix rebase.

[WIP] MultiRegionHandle variable size ranges

- Avoid reprotecting regions that change often (needs some tweaking)
- There's still a bug in buffers, somehow.
- Might want different api for minimum granularity

Fix rebase issue

Commit everything needed for software only tracking.

Remove native components.

Remove more native stuff.

Cleanup

Use a separate window for the background context, update opentk. (fixes linux)

Some experimental changes

Should get things working up to scratch - still need to try some things with flush/modification and res scale.

Include address with the region action.

Initial work to make range tracking work

Still a ton of bugs

Fix some issues with the new stuff.

* Fix texture flush instability

There's still some weird behaviour, but it's much improved without this. (textures with cpu modified data were flushing over it)

* Find the destination texture for Buffer->Texture full copy

Greatly improves performance for nvdec videos (with range tracking)

* Further improve texture tracking

* Disable Memory Tracking for view parents

This is a temporary approach to better match behaviour on master (where invalidations would be soaked up by views, rather than trigger twice)

The assumption is that when views are created to a texture, they will cover all of its data anyways. Of course, this can easily be improved in future.

* Introduce some tracking tests.

WIP

* Complete base tests.

* Add more tests for multiregion, fix existing test.

* Cleanup Part 1

* Remove unnecessary code from memory tracking

* Fix some inconsistencies with 3D texture rule.

* Add dispose tests.

* Use a background thread for the background context.

Rather than setting and unsetting a context as current, doing the work on a dedicated thread with signals seems to be a bit faster.

Also nerf the multithreading test a bit.

* Copy to texture with matching alignment

This extends the copy to work for some videos with unusual size, such as tutorial videos in SMO. It will only occur if the destination texture already exists at XCount size.

* Track reads for buffer copies. Synchronize new buffers before copying overlaps.

* Remove old texture flushing mechanisms.

Range tracking all the way, baby.

* Wake the background thread when disposing.

Avoids a deadlock when games are closed.

* Address Feedback 1

* Separate TextureCopy instance for background thread

Also `BackgroundContextWorker.InBackground` as a more sensible identifier for whether we're in a background thread.

* Add missing XML docs.

* Address Feedback

* Maybe I should start drinking coffee.

* Some more feedback.

* Remove flush warning, Refocus window after making background context
2020-10-16 17:18:35 -03:00
gdkchan
b066cfc1a3
Add support for shader constant buffer slot indexing (#1608)
* Add support for shader constant buffer slot indexing

* Fix typo
2020-10-12 21:40:50 -03:00
gdkchan
0954e76a26
Improve BRX target detection heuristics (#1591)
2020-10-03 15:43:33 +10:00
gdkchan
86412ed30a
Support 2D array ASTC compressed texture formats decoding (#1593)
2020-10-02 11:22:23 +10:00
gdkchan
1560f236da
Convert 1D texture targets to 2D (#1584)
* Convert 1D texture targets to 2D

* Fix typo

* Simplify some code

* Should mask that too

* Consistency
2020-09-29 22:28:50 +02:00
riperiperi
f89b754abb
Always set new texture data for textures initialized by a copy. (#1576)
2020-09-27 09:37:45 +10:00
gdkchan
bd28ce90e6
Implement small indexed draws and other fixes to make guest Vulkan work (#1558)
2020-09-24 09:48:34 +10:00
riperiperi
5dd6f41ff4
Make viewStorage still valid after view removal. (#1564)
2020-09-21 16:51:33 -03:00
gdkchan
1eea35554c
Better viewport flipping and depth mode detection method (#1556)
* Use a better viewport flipping approach

* New approach to detect depth mode

* nit: Sort method on the OpenGL backend

* Adjust spacing on comment

* Unswap near and far parameters based on ScaleZ
2020-09-19 19:46:49 -03:00
riperiperi
3d055da5fc
Allow swizzles to match with "undefined" components (#1538)
* Add swizzle matching rules.

Improves rules which try to match incompatible formats as perfect, such as D32 float -> R32 float.

Remove Format.HasOneComponent, since this information is now available via the FormatInfo struct.

* Fix this rule.

* Update component counts for depth formats.
2020-09-11 09:48:48 +10:00
riperiperi
5d69d9103e
Texture/Buffer Memory Management Improvements (#1408)
* Initial implementation. Still pending better valid-overlap handling, disposed pool, compressed format flush fix.

* Very messy backend resource cache.

* Oops

* Dispose -> Release

* Improve Release/Dispose.

* More rule refinement.

* View compatibility levels as an enum - you can always know if a view is only copy compatible.

* General cleanup.

Use locking on the resource cache, as it is likely to be used by other threads in future.

* Rename resource cache to resource pool.

* Address some of the smaller nits.

* Fix regression with MK8 lens flare

Texture flushes done the old way should trigger memory tracking.

* Use TextureCreateInfo as a key.

It now implements IEquatable and generates a hashcode based on width/height.

* Fix size change for compressed+non-compressed view combos.

Before, this could give either the compressed or the non-compressed texture the wrong size, depending on which texture had its size changed. This caused exceptions when flushing the texture.

Now it correctly takes the block size into account, assuming that these textures are only related because a pixel in the non-compressed texture represents a block in the compressed one.

* Implement JD's suggestion for HashCode Combine

Co-authored-by: jduncanator <1518948+jduncanator@users.noreply.github.com>

* Address feedback

* Address feedback.

2020-09-10 16:44:04 -03:00
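The size relationship behind the "compressed+non-compressed view combos" fix can be illustrated with a small sketch. It assumes, as the commit message states, that one pixel of the non-compressed texture stands for one block of the compressed one. The function names and the default 4x4 block size are illustrative choices, not taken from the Ryujinx source.

```python
def to_block_size(width, height, block_w=4, block_h=4):
    """Size in blocks of a compressed texture, rounding up partial blocks.

    Under the stated assumption, this is also the size in pixels of the
    related non-compressed texture.
    """
    return (-(-width // block_w), -(-height // block_h))


def to_texel_size(blocks_x, blocks_y, block_w=4, block_h=4):
    """Size in texels of the compressed texture covering a block grid."""
    return (blocks_x * block_w, blocks_y * block_h)


# A 256x128 compressed texture pairs with a 64x32 non-compressed view.
print(to_block_size(256, 128))
print(to_texel_size(64, 32))
```

When one texture of such a pair is resized, the other needs its size converted through the block dimensions rather than copied as-is.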
gdkchan
bdfbcf4017
Fix regression on texture compatibility match checks (#1521)
2020-09-01 16:58:40 +10:00
sharmander
bc19114bb5
Fix: Issue #1475 Texture Compatibility Check methods need to be centralized (#1482)
* Texture Compatibility Check methods need to be centralized #1475

* Fix spacing

* Fix spacing

* Undo removal of .ToString()

* Move isPerfectMatch back to Texture.cs

Rename parameters in TextureCompatibility.cs for consistency

* Add switch from 1474 to TextureCompatibility as requested by mageven.

* Actually add TextureCompatibility changes to the PR (Add DeriveDepthFormat method)

* Alignment corrections + Derive method signature adjustment.

* Removed empty line as requested

* Remove empty lines

* Remove blank lines, fix alignment

* Fix alignment

* Remove empty line
2020-08-31 21:06:27 -03:00
gdkchan
09341dc11d
Fix off by one error in pages count calculation on GPU pool (#1511)
2020-08-29 16:42:34 -03:00
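The commit title does not say what the error was. As a generic illustration of how a page count can end up off by one, here is a minimal sketch. The 4 KiB page size and both function names are assumptions for the example, not the code that was fixed.

```python
PAGE_BITS = 12             # assumed 4 KiB pages, for illustration only
PAGE_SIZE = 1 << PAGE_BITS


def pages_count(address: int, size: int) -> int:
    """Number of pages touched by the byte range [address, address + size)."""
    if size <= 0:
        return 0
    first_page = address >> PAGE_BITS
    last_page = (address + size - 1) >> PAGE_BITS  # page of the last byte
    return last_page - first_page + 1


def pages_count_naive(address: int, size: int) -> int:
    """Error-prone variant: ignores where the range starts within its page."""
    return (size + PAGE_SIZE - 1) >> PAGE_BITS


# A 2-byte range straddling a page boundary touches two pages, not one.
print(pages_count(4095, 2), pages_count_naive(4095, 2))
```

The naive variant is only correct when `address` is page-aligned, which is why the inclusive last-byte form is the safer default.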
mageven
2a314f3c28
Add missing depth-color conversions in CopyTexture (#1474)
* Add missing depth-color conversions in CopyTexture

* Whitespace

* switch expression
2020-08-14 20:03:19 +10:00
LDj3SNuD
8624dd8de6
Fix MacroJit SubtractWithBorrow Alu Reg Operation. (#1473)
2020-08-13 12:08:48 -03:00
gdkchan
157ad3f54f
Silence several build warnings (#1428)
* Silence several build warnings

* Remove fixed buffers from NVDEC struct

* Remove unused field and usings

* Fix wrong name

* Silence more warnings on H264 PictureInfo
2020-08-06 23:40:41 +02:00
mageven
a33dc2f491
Improved Logger (#1292)
* Logger class changes only

Now compile-time checking is possible with the help of Nullable Value types.

* Misc formatting

* Manual optimizations

PrintGuestLog
PrintGuestStackTrace
Surfaceflinger DequeueBuffer

* Reduce SendVibrationXX log level to Debug

* Add Notice log level

This level is always enabled and used to print system info, etc...
Also, rewrite LogColor to switch expression as colors are static

* Unify unhandled exception event handlers

* Print enabled LogLevels during init

* Re-add App Exit disposes in proper order

nit: switch case spacing

* Revert PrintGuestStackTrace to Info logs due to #1407

PrintGuestStackTrace is now called in some critical error handlers
so revert to old behavior as KThread isn't part of Guest.

* Batch replace Logger statements
2020-08-04 01:32:53 +02:00
gdkchan
60db4c3530
Implement a Macro JIT (#1445)
* Implement a Macro JIT

* Nit: space
2020-08-03 03:36:57 +02:00
gdkchan
991784868f
Fix shader regression on Intel iGPUs by reverting layout changes (#1425)
2020-07-29 08:01:11 +10:00
gdkchan
43c13057da
Implement alpha test using legacy functions (#1426)
2020-07-28 18:30:08 -03:00
gdkchan
51fbc1fde4
Use polygon offset clamp if supported (#1429)
2020-07-26 18:11:28 -03:00
gdkchan
111534a74e
Remove GPU MemoryAccessor (#1423)
* Remove GPU MemoryAccessor

* Update outdated XML doc

* Update more outdated stuff
2020-07-25 16:39:45 +10:00
gdkchan
5a7df48975
New GPFifo and fast guest constant buffer updates (#1400)
* Add new structures from official docs, start migrating GPFifo

* Finish migration to new GPFifo processor

* Implement fast constant buffer data upload

* Migrate to new GPFifo class

* XML docs
2020-07-23 23:53:25 -03:00
mageven
723ae240dc
GL: Implement more Point parameters (#1399)
* Fix GL_INVALID_VALUE on glPointSize calls

* Implement more of Point primitive state

* Use existing Origin enum
2020-07-20 21:59:13 -03:00
gdkchan
986be200ba
Force TFB rebind after buffer modifications (#1392)
2020-07-15 19:05:06 -03:00
gdkchan
788ca6a411
Initial transform feedback support (#1370)
* Initial transform feedback support

* Some nits and fixes

* Update ReportCounterType and Write method

* Can't change shader or TFB bindings while TFB is active

* Fix geometry shader input names with new naming
2020-07-15 13:01:10 +10:00
gdkchan
2900dda633
Fix depth stencil formats copy by matching equivalent color formats (#1198)
2020-07-13 21:41:30 +10:00
gdkchan
4d02a2d2c0
New NVDEC and VIC implementation (#1384)
* Initial NVDEC and VIC implementation

* Update FFmpeg.AutoGen to 4.3.0

* Add nvdec dependencies for Windows

* Unify some VP9 structures

* Rename VP9 structure fields

* Improvements to Video API

* XML docs for Common.Memory

* Remove now unused or redundant overloads from MemoryAccessor

* NVDEC UV surface read/write scalar paths

* Add FIXME comments about hacky things/stuff that will need to be fixed in the future

* Cleaned up VP9 memory allocation

* Remove some debug logs

* Rename some VP9 structs

* Remove unused struct

* No need to compile Ryujinx.Graphics.Host1x with unsafe anymore

* Name AsyncWorkQueue threads to make debugging easier

* Make Vp9PictureInfo a ref struct

* LayoutConverter no longer needs the depth argument (broken by rebase)

* Pooling of VP9 buffers, plus fix a memory leak on VP9

* Really wish VS could rename projects properly...

* Address feedback

* Remove using

* Catch OperationCanceledException

* Add licensing information

* Add THIRDPARTY.md to release too

Co-authored-by: Thog <me@thog.eu>
2020-07-12 05:07:01 +02:00
riperiperi
f224769c49
Implement Logical Operation registers and functionality (#1380)
* Implement Logical Operation registers and functionality.

* Address Feedback 1
2020-07-10 14:23:15 -03:00
riperiperi
484eb645ae
Implement Zero-Configuration Resolution Scaling (#1365)
* Initial implementation of Render Target Scaling

Works with most games I have. No GUI option right now, it is hardcoded.

Missing handling for texelFetch operation.

* Realtime Configuration, refactoring.

* texelFetch scaling on fragment shader (WIP)

* Improve Shader-Side changes.

* Fix potential crash when no color/depth bound

* Workaround random uses of textures in compute.

This was blacklisting textures in a few games despite causing no bugs. Will eventually add full support so this doesn't break anything.

* Fix scales oscillating when changing between non-native scales.

* Scaled textures on compute, cleanup, lazier uniform update.

* Cleanup.

* Fix stupidity

* Address Thog Feedback.

* Cover most of GDK's feedback (two comments remain)

* Fix bad rename

* Move IsDepthStencil to FormatExtensions, add docs.

* Fix default config, square texture detection.

* Three final fixes:

- Nearest copy when texture is integer format.
- Texture2D -> Texture3D copy correctly blacklists the texture before trying an unscaled copy (caused driver error)
- Discount small textures.

* Remove scale threshold.

Not needed right now - we'll see if we run into problems.

* All CPU modification blacklists scale.

* Fix comment.
2020-07-07 04:41:07 +02:00