Commit Graph

1158 Commits

Author SHA1 Message Date
James Rowe
5b872c41d8 OpenGL Cache: Reorder methods
The previous commits added the methods where they were located
originally to try to get an easy to read diff between changes. This
commit fixes compliation since the static methods are now declared
before they are used.
2017-12-23 16:10:28 -07:00
James Rowe
24e187891f OpenGL Rasterizer: Update to use the new cache 2017-12-23 16:10:28 -07:00
James Rowe
e5adb6a26b OpenGL Cache: Add the rest of the Cache methods
Fills in the rasterizer cache methods using the helper methods added in
the previous commits.
2017-12-23 16:10:27 -07:00
James Rowe
81ea32d1e0 OpenGL Cache: Refactor Surface Cache interface
Changes the public interface of the surface cache to make it easier to
use. Reintroduces the cached page count cached pages that was removed in
an earlier commit.
2017-12-23 16:10:27 -07:00
James Rowe
3e1cbb7d14 OpenGL Cache: Split CachedSurface
Breaks CachedSurface into two classes, the parameters used to create or
find a cached surface, and the actual cached surface. This also adds a
few helper methods for getting surfaces from cache
2017-12-23 16:10:27 -07:00
James Rowe
0b98b768f5 OpenGL Cache: Add surface utility functions
Separates creating and filling surfaces into static functions that
can be reused from the different RasterizerCache methods.
2017-12-23 16:10:26 -07:00
James Rowe
e9e2d444ef OpenGL Cache: Optimize Morton Copy to copy in tiles
Compiles two lookup arrays of functions for the different
configurations of Morton Copy.
2017-12-23 16:10:26 -07:00
James Rowe
160ac25527 OpenGL State: Change setters so they don't directly write to curstate 2017-12-23 16:10:25 -07:00
James Rowe
13606a6d0b Memory: Remove count of cached pages and add InvalidateRegion
In a future commit, the count of cached pages will be reintroduced in
the actual surface cache. Also adds an Invalidate only to the cache
which marks a region as invalid in order to try to avoid a costly flush
from 3ds memory
2017-12-23 16:10:25 -07:00
James Rowe
c821c14908 Settings: Change resolution scaling to an integer instead of a float 2017-12-23 16:10:25 -07:00
Subv
3652809408 HLE: Convert GSP_GPU to ServiceFramework.
The only functional change is the error handling of GSP_GPU::ReadHWRegs function. We previously didn't return error codes (not even for success). The new returns were found by reverse engineering the GSP module.
2017-12-21 10:30:22 -05:00
Tillmann Karras
fd3ec6be30 video_core: fix infinity and NaN conversions 2017-12-14 19:51:58 +00:00
Yuri Kunde Schlesner
aecd2b85fe
Merge pull request #3261 from MerryMage/DPH
shader_jit_x64_compiler: Use haddps for horizontal summation
2017-12-13 09:09:42 -05:00
bunnei
4695f12a08
Merge pull request #3264 from lioncash/cmake-target
CMakeLists: Derive the source directory grouping from targets themselves
2017-12-12 14:34:51 -05:00
MerryMage
6c199e4699 fixup! shader_jit_x64_compiler: Use haddps for horizontal summation 2017-12-12 15:37:00 +00:00
Lioncash
ab021d163e CMakeLists: Derive the source directory grouping from targets themselves
Removes the need to store to separate SRC and HEADER variables,
and then construct the target in most cases.
2017-12-11 21:11:52 -05:00
Yuri Kunde Schlesner
ae7240a2cb
Merge pull request #3097 from ds84182/round-primary-color-swrast
Round primary color in swrast
2017-12-11 20:06:21 -05:00
MerryMage
efec8fe513 shader_jit_x64_compiler: Use haddps for horizontal summation 2017-12-10 22:04:30 +00:00
Yuri Kunde Schlesner
230a7557f1 Shader: Store AttributeBuffers in GS output buffer
This also does the output masking early at EMIT time, instead of when a
triangle is sent to the vertex handler.
2017-12-09 20:33:59 -08:00
Yuri Kunde Schlesner
0184419814 Shader: Refactor output_mask copy loop to function 2017-12-09 20:31:24 -08:00
Tillmann Karras
1c2750d5bd video_core: optimize NaN check 2017-12-05 22:34:22 +00:00
MerryMage
c1aef260af shader_jit_x64_compiler: Remove ABI overhead of LG2 and EX2
This involves reimplementing log2f and exp2f.
2017-11-30 18:17:35 +00:00
MerryMage
235a251d3c tests: Add tests for x64 shader jit
Tests LG2 and EX2 instructions
2017-11-30 18:17:35 +00:00
Dwayne Slater
fcc141a327 Maintain the PICA's 8 bits of color precision when using the interpolated primary color
This matches the software renderer by using round.
The actual hardware rounds the results up instead of flooring.
2017-11-29 16:49:04 -05:00
Dwayne Slater
350082ab75 Fix logic ops not being enabled in the OpenGL renderer 2017-11-29 16:30:19 -05:00
Dwayne Slater
dc48deaecc Round primary color inputs in software rasterizer
OpenGL version coming soon.
2017-11-29 16:30:18 -05:00
James Rowe
9d9693c13d Revert "Extracted the attribute setup and draw commands into their own functions"
This reverts commit b3b34a1e76. This
commit causes a performance regression for not enough benefits
2017-11-16 11:46:17 -07:00
wwylele
47c0c87c47 video_core: clean format warnings 2017-11-01 12:35:32 +02:00
Dragios
3e26b0dee5 swrasterizer folder minor edit 2017-10-27 09:44:45 +08:00
Dragios
9b3eb69973 Utilize vector function instead 2017-10-26 23:50:20 +08:00
Dragios
84054b7cd8 Get rid of narrowing conversion warning 2017-10-24 00:02:46 +08:00
Dragios
520929dd6d Fix typo for -Wunused-local-typedefs 2017-10-22 15:56:50 +08:00
Huw Pascoe
b3b34a1e76 Extracted the attribute setup and draw commands into their own functions 2017-10-04 01:08:29 +01:00
Huw Pascoe
a13ab958cb Fixed type conversion ambiguity 2017-09-30 09:34:35 +01:00
Subv
a321bce378 Disable unary operator- on Math::Vec2/Vec3/Vec4 for unsigned types.
It is unlikely we will ever use this without first doing a Cast to a signed type.
Fixes 9 "unary minus operator applied to unsigned type, result still unsigned" warnings on MSVC2017.3
2017-09-27 09:06:41 -05:00
B3n30
dc6a365337 Merge pull request #2951 from huwpascoe/perf-4
Optimized Morton
2017-09-25 08:28:55 +02:00
Huw Pascoe
903906da3b Optimized Float<M,E> multiplication
Before:

ucomiss xmm1, xmm1
jp      .L9
pxor    xmm2, xmm2
mov     edx, 1
ucomiss xmm0, xmm2
setp    al
cmovne  eax, edx
test    al, al
jne     .L9
.L3:
movaps  xmm0, xmm2
ret
.L9:
ucomiss xmm0, xmm0
jp      .L10
pxor    xmm2, xmm2
mov     edx, 1
ucomiss xmm1, xmm2
setp    al
cmovne  eax, edx
test    al, al
je      .L3

After:

movaps  xmm2, xmm1
mulss   xmm2, xmm0
ucomiss xmm2, xmm2
jnp     .L3
ucomiss xmm1, xmm0
jnp     .L11
.L3:
movaps  xmm0, xmm2
ret
.L11:
pxor    xmm2, xmm2
jmp     .L3
2017-09-25 00:54:02 +01:00
Huw Pascoe
876aa82c29 Optimized Morton 2017-09-24 22:27:14 +01:00
James Rowe
93930a966f Merge pull request #2921 from jroweboy/batch-fix-2
GPU: Add draw for immediate and batch modes
2017-09-24 07:57:16 -06:00
James Rowe
19d41dcc6e Remove pipeline.gpu_mode and fix minor issues 2017-09-23 09:28:20 -06:00
Yuri Kunde Schlesner
a7758b0b36 Merge pull request #2928 from huwpascoe/master
Fixed framebuffer warning
2017-09-22 04:06:38 +02:00
Huw Pascoe
a234e4c200 Improved performance of FromAttributeBuffer
Ternary operator is optimized by the compiler
whereas std::min() is meant to return a value.

I've noticed a 5%-10% emulation speed increase.
2017-09-17 15:56:36 +01:00
Huw Pascoe
6a110ac5f5 Fixed framebuffer warning 2017-09-17 11:57:06 +01:00
Yuri Kunde Schlesner
699c920991 Merge pull request #2900 from wwylele/clip-2
PICA: implement custom clip plane
2017-09-16 10:23:00 +02:00
James Rowe
ad0b57f407 GPU: Add draw for immediate and batch modes
PR #1461 introduced a regression where some games would change configuration
even while in the poorly named "drawing" mode, which broke the heuristic
citra was using to determine when to draw the batch. This change adds
back in a draw call for batching, and also adds in a draw call in
immediate mode each time it adds a triangle.
2017-09-11 09:21:43 -06:00
bunnei
11baa40d75 Merge pull request #2865 from wwylele/gs++
PICA: implemented geometry shader
2017-09-07 23:02:59 -04:00
bunnei
ff4941fb3a Merge pull request #2914 from wwylele/fresnel-fix
pica/lighting: only apply Fresnel factor for the last light
2017-09-05 10:00:49 -04:00
wwylele
12fbc8c8df pica/lighting: only apply Fresnel factor for the last light 2017-09-03 08:22:03 +03:00
wwylele
e2c41a5891 video_core: report telemetry for gas mode 2017-08-31 12:54:17 +03:00
bunnei
f0e461bf6f Merge pull request #2891 from wwylele/sw-bump
SwRasterizer/Lighting: implement bump mapping
2017-08-30 21:07:30 -04:00
Weiyi Wang
647f017c6d Merge pull request #2892 from Subv/warnings2
Warnings: Fixed a few missing-return warnings in video_core.
2017-08-28 03:21:51 -05:00
Subv
da88f3b8f0 Warnings: Fixed a few missing-return warnings in video_core. 2017-08-26 11:58:22 -05:00
wwylele
417cb45e3f SwRasterizer/Clipper: flip the sign convention to match PICA and OpenGL 2017-08-25 07:26:45 +03:00
wwylele
addbcd5784 gl_rasterizer: implement custom clip plane 2017-08-25 07:26:45 +03:00
wwylele
ea51a3af26 SwRasterizer: implement custom clip plane 2017-08-24 15:34:27 +03:00
wwylele
17c6104d2a gl_rasterizer/lighting: more accurate CP formula 2017-08-22 09:34:44 +03:00
wwylele
b5aa570354 SwRasterizer/Lighting: implement LUT input CP 2017-08-22 09:34:44 +03:00
wwylele
3e478ca131 SwRasterizer/Lighting: implement bump mapping 2017-08-22 09:34:44 +03:00
wwylele
63b6e802cd swrasterizer: remove invalid TODO
This function is called in clipping, before the pespective divide, and is not used in later rasterization. Thus it doesn't need perspective correction.
2017-08-21 08:03:07 +03:00
wwylele
72b26ac32f swrasterizer/clipper: remove tested TODO
hwtested. Current implementation is the correct behavior
2017-08-21 08:03:07 +03:00
wwylele
5a4af616c6 gl_shader_gen: simplify and clarify the depth transformation between vertex shader and fragment shader 2017-08-21 08:03:07 +03:00
wwylele
1eca380886 gl_rasterizer: add clipping plane z<=0 defined in PICA 2017-08-21 08:03:07 +03:00
Yuri Kunde Schlesner
46d1ca768d Merge pull request #2872 from wwylele/sw-geo-factor
SwRasterizer/Lighting: implement geometric factor
2017-08-20 17:49:42 -07:00
James Rowe
8afa81ac1b Merge pull request #2871 from wwylele/sw-spotlight
SwRasterizer/Lighting: implement spot light
2017-08-19 20:10:24 -06:00
wwylele
0f35755572 pica/command_processor: build geometry pipeline and run geometry shader
The geometry pipeline manages data transfer between VS, GS and primitive assembler. It has known four modes:
 - no GS mode: sends VS output directly to the primitive assembler (what citra currently does)
 - GS mode 0: sends VS output to GS input registers, and sends GS output to primitive assembler
 - GS mode 1: sends VS output to GS uniform registers, and sends GS output to primitive assembler. It also takes an index from the index buffer at the beginning of each primitive for determine the primitive size.
 - GS mode 2: similar to mode 1, but doesn't take the index and uses a fixed primitive size.
hwtest shows that immediate mode also supports GS (at least for mode 0), so the geometry pipeline gets refactored into its own class for supporting both drawing mode.
In the immediate mode, some games don't set the pipeline registers to a valid value until the first attribute input, so a geometry pipeline reset flag is set in `pipeline.vs_default_attributes_setup.index` trigger, and the actual pipeline reconfigure is triggered in the first attribute input.
In the normal drawing mode with index buffer, the vertex cache is a little bit modified to support the geometry pipeline. Instead of OutputVertex, it now holds AttributeBuffer, which is the input to the geometry pipeline. The AttributeBuffer->OutputVertex conversion is done inside the pipeline vertex handler. The actual hardware vertex cache is believed to be implemented in a similar way (because this is the only way that makes sense).
Both geometry pipeline and GS unit rely on states preservation across drawing call, so they are put into the global state. In the future, the other three vertex shader units should be also placed in the global state, and a scheduler should be implemented on top of the four units. Note that the current gs_unit already allows running VS on it in the future.
2017-08-19 10:13:20 +03:00
wwylele
8285ca4ad8 pica/shader/jit: implement SETEMIT and EMIT 2017-08-19 10:13:20 +03:00
wwylele
36981a5aa6 pica/primitive_assembly: Handle winding for GS primitive
hwtest shows that, although GS always emit a group of three vertices as one primitive, it still respects to the topology type, as if the three vertices are input into the primitive assembler independently and sequentially. It is also shown that the winding flag in SETEMIT only takes effect for Shader topology type, which is believed to be the actual difference between List and Shader (hence removed the TODO). However, only Shader topology type is observed in official games when GS is in use, so the other mode seems to be just unintended usage.
2017-08-19 10:13:20 +03:00
wwylele
bb63ae3052 correct constness 2017-08-19 10:13:20 +03:00
wwylele
28128348f2 pica/shader/interpreter: implement SETEMIT and EMIT 2017-08-19 10:13:20 +03:00
wwylele
46c6973d2b pica/shader: extend UnitState for GS
Among four shader units in pica, a special unit can be configured to run both VS and GS program. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent internal structure in order to keep it standard layout type for JIT to access.
This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits
2017-08-19 10:13:20 +03:00
wwylele
686fb3e78c gl_shader_gen: don't call SampleTexture when bump map is not used 2017-08-11 18:35:00 +03:00
wwylele
945f9a1b04 SwRasterizer/Lighting: implement spot light 2017-08-11 01:19:10 +03:00
wwylele
14ee32c46a SwRasterizer/Lighting: implement geometric factor 2017-08-11 01:18:43 +03:00
wwylele
5d9d42f0d0 SwRasterizer/Lighting: use make_tuple instead of constructor
implicit tuple constructor is a c++17 thing, which is not supported by some not-so-old libraries. Play safe for now
2017-08-10 12:19:58 +03:00
wwylele
db309b2423 pica/regs: layout geometry shader configuration regs
All the register meanings are derived from ctrulib (3dbrew is outdated for most of them)
2017-08-10 01:53:08 +03:00
Weiyi Wang
792dee47a7 Merge pull request #2822 from wwylele/sw_lighting-2
Implement fragment lighting in the sw renderer (take 2)
2017-08-09 18:54:29 +03:00
wwylele
baa24f4ea9 pica: upload shared shader code to both unit 2017-08-07 10:30:05 +03:00
wwylele
2252a63f80 SwRasterizer/Lighting: shorten file name 2017-08-03 13:51:22 +03:00
wwylele
eda28266fb SwRasterizer/Lighting: move to its own file 2017-08-02 22:20:40 +03:00
wwylele
48b4105871 SwRasterizer/Lighting: reduce confusion 2017-08-02 22:07:15 +03:00
wwylele
c59ed47608 SwRasterizer/Lighting: move quaternion normalization to the caller 2017-08-02 22:05:53 +03:00
wwylele
c89f804a01 pica/shader_interpreter: fix off-by-one in LOOP 2017-07-27 13:48:27 +03:00
Sebastian Valle
c6a2e519ef Merge pull request #2816 from wwylele/proctex-lutlutlut
gl_rasterizer: use texture buffer for proctex LUT
2017-07-22 23:03:48 -05:00
Sebastian Valle
e646bd902d Merge pull request #2834 from wwylele/depth-enable-fix
gl_rasterizer_cache: fix using_depth_fb
2017-07-22 23:02:59 -05:00
bunnei
df8b9863f9 telemetry: Log performance, configuration, and system data. 2017-07-17 21:32:28 -04:00
wwylele
4feff63ffa SwRasterizer/Lighting: dist atten lut input need to be clamp 2017-07-11 22:19:00 +03:00
wwylele
56e5425e59 SwRasterizer/Lighting: unify float suffix 2017-07-11 22:15:35 +03:00
wwylele
e415558a4f SwRasterizer/Lighting: get rid of nested return 2017-07-11 22:15:35 +03:00
wwylele
c6d1472513 SwRasterizer/Lighting: refactor GetLutValue into a function.
merging similar pattern. Also makes the code more similar to the gl one
2017-07-11 22:15:35 +03:00
wwylele
f13cf506e0 SwRasterizer: only interpolate quat and view when lighting is enabled 2017-07-11 21:35:57 +03:00
wwylele
efc655aec0 SwRasterizer/Lighting: pass lighting state as parameter 2017-07-11 20:06:26 +03:00
Subv
9906feefbd SwRasterizer/Lighting: Move the clamp highlight calculation to the end of the per-light loop body. 2017-07-11 19:39:15 +03:00
Subv
7526af5e52 SwRasterizer/Lighting: Move the lighting enable check outside the ComputeFragmentsColors function. 2017-07-11 19:39:15 +03:00
Subv
b8229a7684 SwRasterizer/Lighting: Do not use global registers state in ComputeFragmentsColors. 2017-07-11 19:39:15 +03:00
Subv
7bc467e872 SwRasterizer/Lighting: Do not use global state in LookupLightingLut. 2017-07-11 19:39:15 +03:00
Subv
37ac2b6657 SwRasterizer/Lighting: Fixed a bug where the distance attenuation bias was being set to the dist atten scale. 2017-07-11 19:39:15 +03:00
Subv
6250f52e93 SwRasterizer: Fixed a few conversion warnings and moved per-light values into the per-light loop. 2017-07-11 19:39:15 +03:00
Subv
2d69a9b8bf SwRasterizer: Run clang-format 2017-07-11 19:39:15 +03:00
Subv
73566ff7a9 SwRasterizer: Flip the vertex quaternions before clipping (if necessary). 2017-07-11 19:39:15 +03:00
Subv
2a75837bc3 SwRasterizer: Corrected the light LUT lookups. 2017-07-11 19:39:15 +03:00
Subv
f2d4d5c219 SwRasterizer: Corrected the light LUT lookups. 2017-07-11 19:39:15 +03:00
Subv
80b6fc592e SwRasterizer: Fixed the lighting lut lookup function. 2017-07-11 19:39:15 +03:00
Subv
10b0bea060 SwRasterizer: Calculate fresnel for fragment lighting. 2017-07-11 19:39:15 +03:00
Subv
46b8c8e1da SwRasterizer: Calculate specular_1 for fragment lighting. 2017-07-11 19:39:15 +03:00
Subv
be25e78b07 SwRasterizer: Calculate specular_0 for fragment lighting. 2017-07-11 19:39:15 +03:00
Subv
b2f472a2b1 SwRasterizer: Implement primary fragment color. 2017-07-11 19:39:15 +03:00
wwylele
8482933db8 gl_rasterizer: use texture buffer for proctex LUT 2017-07-01 11:02:48 +03:00
wwylele
8978ecb09c gl_rasterizer: use texture buffer for fog LUT 2017-06-22 20:41:00 +03:00
wwylele
f1e377f57e gl_rasterizer: create the texture before applying the state
this is a rebasing error from #2792. It doesn't affect much though, because the later more Apply() call fixes/hides it
2017-06-22 17:47:46 +03:00
wwylele
457659fe01 gl_state: reset 1d textures 2017-06-21 23:13:06 +03:00
wwylele
42f7ca7412 gl_rasterizer: fix glGetUniformLocation type 2017-06-21 23:13:06 +03:00
wwylele
be9e952bdc gl_rasterizer: manage texture ids in one place 2017-06-21 23:13:06 +03:00
wwylele
ab60414122 gl_rasterizer/lighting: fix LUT interpolation 2017-06-21 23:13:06 +03:00
Yuri Kunde Schlesner
d0888f8548 Merge pull request #2776 from wwylele/geo-factor
Fragment lighting: implement geometric factor
2017-06-18 14:18:48 -07:00
wwylele
5a454173a8 gl_rasterizer/lighting: use the formula from the paper for germetic factor 2017-06-18 10:29:02 +03:00
Yuri Kunde Schlesner
f6715f98f5 Stop using reserved operator names (and/or/xor) with Xbyak
Also has the Dynarmic upgrade with the same change
2017-06-17 12:20:22 -07:00
wwylele
7052d43a67 gl_rasterizer/lighting: implement geometric factor 2017-06-15 14:59:01 +03:00
Yuri Kunde Schlesner
da1bec121a Merge pull request #2762 from wwylele/light-cp-tangent
Fragment lighting: implement lut input 5 (CP) and tangent mapping
2017-06-14 20:08:26 -07:00
Yuri Kunde Schlesner
5fe5ccac42 Merge pull request #2743 from wwylele/wrap-fix
pica/rasterizer: implement/stub texture wrap mode 4-7
2017-06-13 21:28:12 -07:00
Yuri Kunde Schlesner
791cd14c8d Merge pull request #2767 from yuriks/quaternion-flip-comment
OpenGL: Update comment on AreQuaternionsOpposite with new information
2017-06-12 16:31:55 -07:00
wwylele
972548e3ee gl_rasterizer/lighting: Implement tangent mapping 2017-06-11 21:30:53 +03:00
wwylele
40b7d0bf3f gl_rasterizer/lighting: implement lut input 5 (CP) 2017-06-11 21:30:53 +03:00
Sebastian Valle
39c7c1f580 Merge pull request #2727 from wwylele/spot-light
Fragment lighting: implement spot light
2017-06-11 18:23:47 +00:00
wwylele
b3b9468573 gl_rasterizer_cache: depth write is disabled if allow_depth_stencil_write is false 2017-06-10 15:10:34 +03:00
Yuri Kunde Schlesner
ba01a8302a OpenGL: Update comment on AreQuaternionsOpposite with new information
While debugging the software renderer implementation, it was noticed
that this is actually exactly what the hardware does, upgrading the
status of this "hack" to being a proper implementation. And there was
much rejoicing.
2017-06-10 01:55:17 -07:00
wwylele
28d1e73d2f pica/rasterizer: implement/stub texture wrap mode 4-7 2017-06-04 09:47:25 +03:00
bunnei
54ea95cca7 Merge pull request #2721 from wwylele/texture-cube
swrasterizer: implemented TextureCube
2017-05-30 10:21:05 -04:00
wwylele
10906dceec gl_rasterizer: implement spot light 2017-05-30 10:54:58 +03:00
wwylele
686cbf3ac6 gl_rasterizer: sync spot light status 2017-05-30 10:54:58 +03:00
wwylele
b5addf8fb8 pica: prepare registers for spotlight 2017-05-30 10:54:58 +03:00
Yuri Kunde Schlesner
a4f88c7d7c Merge pull request #2734 from yuriks/cmake-imported-libs
CMake: Use CMake target properties for all libraries
2017-05-29 15:12:21 -07:00
wwylele
0b9bb082c3 swrasterizer: implement TextureCube 2017-05-29 22:28:48 +03:00
wwylele
077cc683e5 pica: add registers for texture cube 2017-05-29 22:03:08 +03:00
Yuri Kunde Schlesner
3df85a103a Merge pull request #2729 from yuriks/quaternion-fix
OpenGL: Improve accuracy of quaternion interpolation
2017-05-28 01:24:06 -07:00
Yuri Kunde Schlesner
d736cca848 CMake: Create INTERFACE targets for microprofile and nihstro 2017-05-27 22:34:52 -07:00
Yuri Kunde Schlesner
4660bc1c78 CMake: Use IMPORTED target for libpng 2017-05-27 20:44:51 -07:00
Yuri Kunde Schlesner
7b81903756 CMake: Correct inter-module dependencies and library visibility
Modules didn't correctly define their dependencies before, which relied
on the frontends implicitly including every module for linking to
succeed.

Also changed every target_link_libraries call to specify visibility of
dependencies to avoid leaking definitions to dependents when not
necessary.
2017-05-27 18:41:24 -07:00
Yuri Kunde Schlesner
eb10f25025 Move screen size constants from video_core to core
video_core didn't even properly use them, and they were the source of
many otherwise-unnecessary dependencies from core to video_core.
2017-05-27 18:41:24 -07:00
Yuri Kunde Schlesner
6665557ff7 OpenGL: Remove unused RendererOpenGL fields 2017-05-27 18:02:46 -07:00
Yuri Kunde Schlesner
669ef82aee OpenGL: Improve accuracy of quaternion interpolation
Current order of operations (rotate then normalize) seems to produce a
lot more distortion than normalizing and then rotating. This makes Citra
results match pretty closesly with hardware, and indicates that hardware
may also be using lerp instead of slerp to interpolate the quaternions.
2017-05-27 00:13:41 -07:00
wwylele
90c8d09098 gl_shader: refactor texture sampler into its own function 2017-05-27 01:56:22 +03:00
Yuri Kunde Schlesner
bae3799bd5 Merge pull request #2697 from wwylele/proctex
Implemented Procedural Texture (Texture Unit 3)
2017-05-24 21:37:42 -07:00
wwylele
36526c63ef swrasterizer: add missing tc0_w and fragment lighting attribute processing 2017-05-21 09:09:15 +03:00
wwylele
4d62e75fb2 gl_rasterizer: implement procedural texture 2017-05-20 13:50:50 +03:00
wwylele
ade45b5b99 pica/swrasterizer: implement procedural texture 2017-05-20 13:50:50 +03:00
wwylele
393fee10a2 pica: use correct register value for shader bool_uniforms
variable value is not masked. the masked and combined register value should be used instead
2017-05-17 22:14:09 +03:00
Yuri Kunde Schlesner
8d558777a6 Merge pull request #2703 from wwylele/pica-reg-revise
pica: correct bit field length for some registers
2017-05-16 10:00:37 -07:00
wwylele
86ee1f6101 pica: correct bit field length for some registers 2017-05-16 19:24:06 +03:00
Jannik Vogel
ba722be2ac Pica: Write GS registers
This adds the handlers for the geometry shader register writes which will call the functions from the previous commit to update registers for the GS.
2017-05-12 16:22:37 +02:00
Jannik Vogel
3fd3775d35 Pica: Write shader registers in functions
The commit after this one adds GS register writes, so this moves the VS handlers into functions so they can be re-used and extended more easily.
2017-05-12 16:22:37 +02:00
Jannik Vogel
925724c990 Pica: Set program code / swizzle data limit to 4096
One of the later commits will enable writing to GS regs.
It turns out that on startup, most games will write 4096 GS program words.

The current limit of 1024 would hence result in 3072 (4096 - 1024) error messages:
```
HW.GPU <Error> video_core/shader/shader.cpp:WriteProgramCode:229: Invalid GS program offset 1024
```

New constants have been introduced to represent these limits.
The swizzle data size has also been raised. This matches the given field sizes of [GPUREG_SH_OPDESCS_INDEX](https://3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_OPDESCS_INDEX) and [GPUREG_SH_CODETRANSFER_INDEX](https://www.3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_CODETRANSFER_INDEX) (12 bit = [0; 4095]).
2017-05-11 15:01:27 +02:00
wwylele
039b293092 pica: shader_dirty if texture2 coord changed 2017-05-05 15:35:17 +03:00
wwylele
0f664ef89d pica: use correct coordinates for texture 2 2017-05-03 22:12:46 +03:00
bunnei
ea53d6085a Merge pull request #2671 from wwylele/dot3-rgba
rasterizer: implement combiner operation 7 (Dot3_RGBA)
2017-04-21 17:03:22 -04:00
wwylele
2c2e872b31 gl_shader_gen: remove TODO about Lerp behaviour verification. The implementation is verified against hardware 2017-04-20 22:56:07 +03:00
wwylele
b624a95205 rasterizer: implement combiner operation 7 (Dot3_RGBA) 2017-04-19 23:48:10 +03:00
Yuri Kunde Schlesner
52a4489d65 OpenGL: Pass Pica regs via parameter 2017-04-17 10:34:45 -07:00
Yuri Kunde Schlesner
a6fd4533f6 OpenGL: Move PicaShaderConfig to gl_shader_gen.h
Also move the implementation of CurrentConfig to the cpp file.
2017-04-16 21:49:32 -07:00
Yuri Kunde Schlesner
40e28f6217 OpenGL: Move Attributes enum to a more appropriate file 2017-04-16 20:47:04 -07:00
Jannik Vogel
1b397c77fa Pica/Regs: Correct bit width for blend-equations 2017-04-08 18:33:17 +02:00
wwylele
e02c4b7195 Input: remove unused stuff & clean up
1. removed zl, zr and c-stick from HID::PadState. They are handled by IR, not HID
2. removed button handling in EmuWindow
3. removed key_map
4. cleanup #include
2017-03-01 23:30:57 +02:00
Mat M
0cb52ee74a Doxygen: Amend minor issues (#2593)
Corrects a few issues with regards to Doxygen documentation, for example:

- Incorrect parameter referencing.
- Missing @param tags.
- Typos in @param tags.

and a few minor other issues.
2017-02-26 17:58:51 -08:00
Yuri Kunde Schlesner
fb1979d7e2 Core: Re-write frame limiter
Now based on std::chrono, and also works in terms of emulated time
instead of frames, so we can in the future frame-limit even when the
display is disabled, etc.

The frame limiter can also be enabled along with v-sync now, which
should be useful for those with displays running at more than 60 Hz.
2017-02-26 17:22:04 -08:00
Yuri Kunde Schlesner
b285c2a4ed Core: Make PerfStats internally locked
More ergonomic to use and will be required for upcoming changes.
2017-02-26 17:22:03 -08:00
Yuri Kunde Schlesner
3b4e400333 Remove built-in (non-Microprofile) profiler 2017-02-26 17:22:03 -08:00
Yuri Kunde Schlesner
c75ae6c585 Add performance statistics to status bar 2017-02-26 17:22:03 -08:00
Jannik Vogel
e594e63bb5 OpenGL: Check if uniform block exists before updating it (#2581) 2017-02-18 11:46:26 -08:00
Weiyi Wang
e085e6a768 video_core: remove #pragma once in cpp file (#2570) 2017-02-15 00:16:50 -08:00
Yuri Kunde Schlesner
426fda1d52 SWRasterizer: Move more framebuffer functions to file 2017-02-12 18:13:04 -08:00
Yuri Kunde Schlesner
1683cb0ec9 SWRasterizer: Move texturing functions to their own file 2017-02-12 18:12:37 -08:00
Yuri Kunde Schlesner
f9026e8a7a SWRasterizer: Convert large no-capture lambdas to standalone functions 2017-02-12 18:11:05 -08:00
Yuri Kunde Schlesner
e1ad7d69b9 SWRasterizer: Move framebuffer operation functions to their own file 2017-02-12 18:11:03 -08:00
Yuri Kunde Schlesner
e24717bca0 VideoCore: Move software rasterizer files to sub-directory 2017-02-12 18:08:11 -08:00
Yuri Kunde Schlesner
e10b11a5d0 video_core/shader: Document sanitized MUL operation 2017-02-12 13:29:14 -08:00
Yuri Kunde Schlesner
443bb3d522 Merge pull request #2550 from yuriks/pica-refactor2
Small VideoCore cleanups
2017-02-12 12:33:26 -08:00
Yuri Kunde Schlesner
e2fa1ca5e1 video_core: Fix benign out-of-bounds indexing of array (#2553)
The resulting pointer wasn't written to unless the index was verified as
valid, but that's still UB and triggered debug checks in MSVC.

Reported by garrettboast on IRC
2017-02-10 20:51:09 -08:00
Yuri Kunde Schlesner
553e672777 VideoCore: Split u64 Pica reg unions into 2 separate u32 unions
This eliminates UB when aliasing it with the array of u32 regs, and
is compatible with non-LE architectures.
2017-02-09 00:04:25 -08:00
Yuri Kunde Schlesner
bfb1531352 VideoCore: Force enum sizes to u32 in LightingRegs
All enums that are used with BitField must have their type forced to u32
to ensure correctness.
2017-02-09 00:04:24 -08:00
Yuri Kunde Schlesner
af65e1c0a0 OpenGL: Remove unused duplicate of IsPassThroughTevStage
This copy was left behind when the shader generation code was moved to a
separate file.
2017-02-09 00:04:24 -08:00
Yuri Kunde Schlesner
60fc0b086f VideoCore: Split regs.h inclusions 2017-02-09 00:04:24 -08:00
Yuri Kunde Schlesner
f241bb72f5 Pica/Regs: Use binary search to look up reg names
This gets rid of the static unordered_map. Also changes the return type
const char*, avoiding unnecessary allocations (the result was only used
by calling .c_str() on it.)
2017-02-09 00:04:24 -08:00
Yuri Kunde Schlesner
602f57da38 VideoCore: Use union to index into Regs struct
Also remove some unused members.
2017-02-08 22:13:25 -08:00
Yuri Kunde Schlesner
2889372e47 Merge pull request #2482 from yuriks/pica-refactor
Split up monolithic Regs struct
2017-02-08 22:07:34 -08:00
Lectem
f146a6d45a Use std::array<u8,2> instead of u8[2] to fix MSVC build 2017-02-05 14:55:51 +01:00
Yuri Kunde Schlesner
5759d94b5c VideoCore: Move Regs to its own file 2017-02-04 13:59:12 -08:00
Yuri Kunde Schlesner
f7c7f422c6 VideoCore: Split shader regs from Regs struct 2017-02-04 13:59:11 -08:00
Yuri Kunde Schlesner
8fca90b5d5 VideoCore: Split geometry pipeline regs from Regs struct 2017-02-04 13:59:11 -08:00
Yuri Kunde Schlesner
f443c7e5b0 VideoCore: Split lighting regs from Regs struct 2017-02-04 13:59:11 -08:00
Yuri Kunde Schlesner
23713d5dee VideoCore: Split framebuffer regs from Regs struct 2017-02-04 13:59:11 -08:00
Yuri Kunde Schlesner
9017093f58 VideoCore: Split texturing regs from Regs struct 2017-02-04 13:59:09 -08:00
Yuri Kunde Schlesner
000e78144c VideoCore: Split rasterizer regs from Regs struct 2017-02-04 13:08:47 -08:00
Yuri Kunde Schlesner
97e06b0a0d Merge pull request #2476 from yuriks/shader-refactor3
Oh No! More shader changes!
2017-02-04 13:02:48 -08:00
Yuri Kunde Schlesner
c74787a11c Pica/Texture: Move part of ETC1 decoding to new file and cleanups 2017-02-04 12:33:28 -08:00
Yuri Kunde Schlesner
09a750e866 Pica/Texture: Simplify/cleanup texture tile addressing 2017-02-04 12:33:25 -08:00
Yuri Kunde Schlesner
a1c9ac7845 VideoCore: Move LookupTexture out of debug_utils.h 2017-02-04 12:31:40 -08:00
wwylele
6dc1d6e568 ShaderJIT: add 16 dummy bytes at the bottom of the stack 2017-02-03 14:53:38 +02:00
Weiyi Wang
0b9c59ff22 Common/x64: remove legacy emitter and abi (#2504)
These are not used any more since we moved shader JIT to xbyak.
2017-01-31 01:06:42 -08:00
Merry
f7e96dc068 shader_jit_x64_compiler: esi and edi should be persistent (#2500) 2017-01-31 00:38:31 -08:00
Yuri Kunde Schlesner
37a4ea046d VideoCore: Make PrimitiveAssembler const-correct 2017-01-29 21:31:38 -08:00
Yuri Kunde Schlesner
dcdffabfe6 VideoCore: Extract swrast-specific data from OutputVertex 2017-01-29 21:31:38 -08:00
Yuri Kunde Schlesner
8ed9f9d49f VideoCore/Shader: Clean up OutputVertex::FromAttributeBuffer
This also fixes a long-standing but neverthless harmless memory
corruption bug, whech the padding of the OutputVertex struct would get
corrupted by unused attributes.
2017-01-29 21:31:38 -08:00
Yuri Kunde Schlesner
92bf5c88e6 VideoCore: Split shader output writing from semantic loading 2017-01-29 21:31:37 -08:00
Yuri Kunde Schlesner
335df895b9 VideoCore: Consistently use shader configuration to load attributes 2017-01-29 21:31:37 -08:00
Yuri Kunde Schlesner
fccb28d2e9 VideoCore: Use correct register for immediate mode attribute count 2017-01-29 21:31:36 -08:00
Yuri Kunde Schlesner
ab6954e942 VideoCore: Rename some types to more accurate names 2017-01-29 21:31:36 -08:00
Yuri Kunde Schlesner
bbc7844021 VideoCore: Change misleading register names
A few registers had names such as "count" or "number" when they actually
contained the maximum (that is, count - 1). This can easily lead to hard
to notice off by one errors.
2017-01-29 21:31:36 -08:00
Kloen
eee37b857b video_core: gl_rasterizer_cache.cpp removed unused type alias 2017-01-30 05:18:28 +01:00
Kloen
6a3a3964b0 video_core: gl_rasterizer.cpp removed unused type alias 2017-01-30 05:16:48 +01:00
Kloen
4652d70572 video_core: silence unused-local-typedef boost related warning on GCC 2017-01-29 21:24:24 +01:00
Yuri Kunde Schlesner
0e9081b973 VideoCore/Shader: Move entry_point to SetupBatch 2017-01-25 18:53:25 -08:00
Yuri Kunde Schlesner
0f64274145 VideoCore/Shader: Move per-batch ShaderEngine state into ShaderSetup 2017-01-25 18:53:25 -08:00
Yuri Kunde Schlesner
6fa3687afc Shader: Remove OutputRegisters struct 2017-01-25 18:53:25 -08:00
Yuri Kunde Schlesner
9ea5eacf91 Shader: Initialize conditional_code in interpreter
This doesn't belong in LoadInputVertex because it also happens for
non-VS invocations. Since it's not used by the JIT it seems adequate to
initialize it in the interpreter which is the only thing that cares
about them.
2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
1a2acc3baa Shader: Don't read ShaderSetup from global state 2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
fa4ac279a7 shader_jit_x64: Don't read program from global state 2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
ade7ed7c5f VideoCore/Shader: Move ProduceDebugInfo to InterpreterEngine 2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
114d6b2f97 VideoCore/Shader: Split interpreter and JIT into separate ShaderEngines 2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
8eefc62833 VideoCore/Shader: Rename shader_jit_x64{ => _compiler}.{cpp,h} 2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
dd4a1672a7 VideoCore/Shader: Split shader uniform state and shader engine
Currently there's only a single dummy implementation, which will be
split in a following commit.
2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
bd82cffd0b VideoCore/Shader: Add constness to methods 2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
1e1f939817 VideoCore/Shader: Use only entry_point as ShaderSetup param
This removes all implicit dependency of ShaderState on global PICA
state.
2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
e3caf669b0 VideoCore/Shader: Use self instead of g_state.vs in ShaderSetup 2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
34d581f2dc VideoCore/Shader: Extract input vertex loading code into function 2017-01-25 18:53:20 -08:00
Kloen
5cc94c17f6 video_core: fix shader.cpp signed / unsigned warning 2017-01-23 16:53:31 +01:00
Kloen
753fea5d65 video_core: gl_rasterizer float to int warning 2017-01-23 16:53:30 +01:00
Kloen
b6063d9a93 video_core: fix gl_rasterizer warning on MSVC 2017-01-23 16:53:30 +01:00
bunnei
22ad9094e6 config: Add option for specifying screen resolution scale factor. 2017-01-07 03:23:22 -05:00
Jonathan Hao
c18cb1b192 Fix some warnings (#2399) 2017-01-04 13:48:29 -03:00
bunnei
2f746e9946 Merge pull request #2367 from JayFoxRox/lighting-lut-quickfix
Lighting LUT Quickfix
2016-12-29 13:41:51 -05:00
Jannik Vogel
6ed4206f87 Minor cleanup in GLSL code 2016-12-25 21:38:10 +01:00
Jannik Vogel
88f409aec9 Offset lighting LUT samples correctly 2016-12-25 21:37:26 +01:00
MerryMage
64f98f4d0f core: Move emu_window and key_map into core
* Removes circular dependences (common should not depend on core)
2016-12-23 13:42:39 +00:00
bunnei
29564d73bd Merge pull request #2319 from yuriks/profile-scopes
VideoCore: Make profiling scope more representative
2016-12-21 13:33:49 -05:00
Albin Bernhardsson
ddec9cb369 Use GL_TRUE when setting color_mask 2016-12-19 19:06:35 +01:00
bunnei
3a1eaf2efc Merge pull request #2318 from yuriks/trace-opt
VideoCore: Inline IsPicaTracing
2016-12-18 21:15:24 -05:00
Yuri Kunde Schlesner
c135317de1 VideoCore/Shader: Extract DebugData out from UnitState 2016-12-16 00:16:25 -08:00
Yuri Kunde Schlesner
6e7e767645 Remove unnecessary cast 2016-12-16 00:15:55 -08:00
Yuri Kunde Schlesner
b5e3599704 VideoCore/Shader: Extract evaluate_condition lambda to function scope 2016-12-16 00:15:51 -08:00
Yuri Kunde Schlesner
960578f4e1 VideoCore/Shader: Extract call lambda up a scope and remove unused param 2016-12-15 23:08:05 -08:00
Yuri Kunde Schlesner
e4e962bc7c VideoCore/Shader: Remove dynamic control flow in (Get)UniformOffset 2016-12-15 23:08:05 -08:00
Yuri Kunde Schlesner
d27cb1dedc VideoCore/Shader: Move DebugData to a separate file 2016-12-15 23:08:05 -08:00
Yuri Kunde Schlesner
fb9e856b91 shader_jit_x64: Use LOOPCOUNT_REG as a 64-bit reg when indexing 2016-12-15 10:02:42 -08:00
Yuri Kunde Schlesner
ac9f937477 VideoCore: Make profiling scope more representative 2016-12-14 22:52:09 -08:00
Yuri Kunde Schlesner
945f554b84 VideoCore: Inline IsPicaTracing
Speeds up ALBW main menu slightly (~3%)
2016-12-14 22:06:40 -08:00
Yuri Kunde Schlesner
f00ada3363 VideoCore: Eliminate an unnecessary copy in the drawcall loop 2016-12-14 21:00:29 -08:00
Yuri Kunde Schlesner
5ff3206207 shader_jit_x64: Use Reg32 for LOOP* registers, eliminating casts 2016-12-14 20:06:09 -08:00
Yuri Kunde Schlesner
f4e98ecf3f VideoCore: Convert x64 shader JIT to use Xbyak for assembly 2016-12-14 20:06:08 -08:00
Lioncash
963aedd8cc Add all services to the Service namespace
Previously there was a split where some of the services were in the
Service namespace and others were not.
2016-12-11 00:07:27 +00:00
Markus Wick
d0d49bb951 OpenGL: Drop framebuffer completeness check.
This OpenGL call synchronize the worker thread of the nvidia blob.
It can be verified on linux with the __GL_THREADED_OPTIMIZATIONS=1 environment variable.
Those errors should not happen on tested drivers.
It was used as a workaround for https://bugs.freedesktop.org/show_bug.cgi?id=94148
2016-12-07 22:09:13 +01:00
emmauss
c4e4fa53d9 Implement Frame rate limiter (#2223)
* implement frame limiter

* fixes
2016-12-06 14:33:19 -05:00