* Discard gas_mode renders
This discards the gas_mode / fog effect from games that use it and allows the games to display without it. Note that gas mode is still unimplemented and will LOG<CRITICAL>.
This bypasses #3287. (Doesn't fix it)
* fix clang
citra-qt: Add customizable speed limit target
* Update SDL config for the new frame_limit option
* Made max lag time a function of target speed percent.
* Added a checkbox to enable/disable frame limiter
* UI: Prevent frame_limit from under/overflowing
* UI: Hide target speed percent when frame limiter is off
* Disable frame limit spin box when framelimit isn't enabled
Optimize AttributeBuffer to OutputVertex conversion
First I unrolled the inner loop, then I pushed semantics validation
outside of the hotloop.
I also added overflow slots to avoid conditional branches.
Super Mario 3D Land's intro runs at almost full speed when compiled with
Clang, and theres a noticible speed increase in MSVC. GCC hasn't been
tested but I'm confident in its ability to optimize this code.
Several games such as Smash will cause some regions that are cached on
the gpu to be revalidated, but (seemingly) we can just ignore these
cases. If the data is already found on the gpu in dirty_regions, then we
validate those, and skip flushing that region from cpu.
Its unknown if this breaks any games, but it does speed up many games.
Additionally, it removes outlines in the pokemon games.
The previous commits added the methods where they were located
originally to try to get an easy to read diff between changes. This
commit fixes compliation since the static methods are now declared
before they are used.
Changes the public interface of the surface cache to make it easier to
use. Reintroduces the cached page count cached pages that was removed in
an earlier commit.
Breaks CachedSurface into two classes, the parameters used to create or
find a cached surface, and the actual cached surface. This also adds a
few helper methods for getting surfaces from cache
In a future commit, the count of cached pages will be reintroduced in
the actual surface cache. Also adds an Invalidate only to the cache
which marks a region as invalid in order to try to avoid a costly flush
from 3ds memory
The only functional change is the error handling of GSP_GPU::ReadHWRegs function. We previously didn't return error codes (not even for success). The new returns were found by reverse engineering the GSP module.
It is unlikely we will ever use this without first doing a Cast to a signed type.
Fixes 9 "unary minus operator applied to unsigned type, result still unsigned" warnings on MSVC2017.3
PR #1461 introduced a regression where some games would change configuration
even while in the poorly named "drawing" mode, which broke the heuristic
citra was using to determine when to draw the batch. This change adds
back in a draw call for batching, and also adds in a draw call in
immediate mode each time it adds a triangle.
This function is called in clipping, before the pespective divide, and is not used in later rasterization. Thus it doesn't need perspective correction.
The geometry pipeline manages data transfer between VS, GS and primitive assembler. It has known four modes:
- no GS mode: sends VS output directly to the primitive assembler (what citra currently does)
- GS mode 0: sends VS output to GS input registers, and sends GS output to primitive assembler
- GS mode 1: sends VS output to GS uniform registers, and sends GS output to primitive assembler. It also takes an index from the index buffer at the beginning of each primitive for determine the primitive size.
- GS mode 2: similar to mode 1, but doesn't take the index and uses a fixed primitive size.
hwtest shows that immediate mode also supports GS (at least for mode 0), so the geometry pipeline gets refactored into its own class for supporting both drawing mode.
In the immediate mode, some games don't set the pipeline registers to a valid value until the first attribute input, so a geometry pipeline reset flag is set in `pipeline.vs_default_attributes_setup.index` trigger, and the actual pipeline reconfigure is triggered in the first attribute input.
In the normal drawing mode with index buffer, the vertex cache is a little bit modified to support the geometry pipeline. Instead of OutputVertex, it now holds AttributeBuffer, which is the input to the geometry pipeline. The AttributeBuffer->OutputVertex conversion is done inside the pipeline vertex handler. The actual hardware vertex cache is believed to be implemented in a similar way (because this is the only way that makes sense).
Both geometry pipeline and GS unit rely on states preservation across drawing call, so they are put into the global state. In the future, the other three vertex shader units should be also placed in the global state, and a scheduler should be implemented on top of the four units. Note that the current gs_unit already allows running VS on it in the future.
hwtest shows that, although GS always emit a group of three vertices as one primitive, it still respects to the topology type, as if the three vertices are input into the primitive assembler independently and sequentially. It is also shown that the winding flag in SETEMIT only takes effect for Shader topology type, which is believed to be the actual difference between List and Shader (hence removed the TODO). However, only Shader topology type is observed in official games when GS is in use, so the other mode seems to be just unintended usage.
Among four shader units in pica, a special unit can be configured to run both VS and GS program. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent internal structure in order to keep it standard layout type for JIT to access.
This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits
While debugging the software renderer implementation, it was noticed
that this is actually exactly what the hardware does, upgrading the
status of this "hack" to being a proper implementation. And there was
much rejoicing.
Modules didn't correctly define their dependencies before, which relied
on the frontends implicitly including every module for linking to
succeed.
Also changed every target_link_libraries call to specify visibility of
dependencies to avoid leaking definitions to dependents when not
necessary.
Current order of operations (rotate then normalize) seems to produce a
lot more distortion than normalizing and then rotating. This makes Citra
results match pretty closesly with hardware, and indicates that hardware
may also be using lerp instead of slerp to interpolate the quaternions.
One of the later commits will enable writing to GS regs.
It turns out that on startup, most games will write 4096 GS program words.
The current limit of 1024 would hence result in 3072 (4096 - 1024) error messages:
```
HW.GPU <Error> video_core/shader/shader.cpp:WriteProgramCode:229: Invalid GS program offset 1024
```
New constants have been introduced to represent these limits.
The swizzle data size has also been raised. This matches the given field sizes of [GPUREG_SH_OPDESCS_INDEX](https://3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_OPDESCS_INDEX) and [GPUREG_SH_CODETRANSFER_INDEX](https://www.3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_CODETRANSFER_INDEX) (12 bit = [0; 4095]).
1. removed zl, zr and c-stick from HID::PadState. They are handled by IR, not HID
2. removed button handling in EmuWindow
3. removed key_map
4. cleanup #include
Corrects a few issues with regards to Doxygen documentation, for example:
- Incorrect parameter referencing.
- Missing @param tags.
- Typos in @param tags.
and a few minor other issues.
Now based on std::chrono, and also works in terms of emulated time
instead of frames, so we can in the future frame-limit even when the
display is disabled, etc.
The frame limiter can also be enabled along with v-sync now, which
should be useful for those with displays running at more than 60 Hz.