1731 Commits

Author SHA1 Message Date
degasus
304adc6e0d IndexGenerator: inline all variables
As we do lots of writes to *Iptr, the compiler isn't allowed to cache any shared variable (neither index nor Iptr itself).
This commit inlines Iptr + index into the index generator functions, so the compiler know that they are const.
2014-01-17 16:34:53 +01:00
degasus
1d6425bd5e IndexGenerator: drop unused variable 2014-01-17 16:34:53 +01:00
degasus
6b01839525 VideoCommon: merge triangle+list+point index buffers
We are used to render them out of order as long as everything else matches, but rendering order does matter, so we have to flush on primitive switch. This commit implements this flush.
Also as we flush on primitive switch, we don't have to create three different index buffers. All indices are now stored in one buffer.

This will slow down games which switch often primitive types (eg ztp), but it should be more accurate.
2014-01-17 16:34:53 +01:00
degasus
770485ad04 VertexLoader: don't check for possible range
I(index) < std::numeric_limits<I>::max() is always true, so we don't have to check it
2014-01-16 22:07:48 +01:00
degasus
5eae39766b enable buffer_storage on nvidia 331.38 on linux
it works fine here, and as the VSH is removed, this is the newest driver.
2014-01-16 17:51:38 +01:00
degasus
331af32038 fixup "Remove the ZTP speedup hack"
This fixes revision b49c09c36b67
2014-01-16 00:26:49 +01:00
Tony Wasserka
f1adc56a56 Remove vertex streaming hack.
NV has buffer_storage, AMD has pinned memory.
Both are better than that hack which shouldn't ever have been introduced in the first place.
2014-01-16 00:11:12 +01:00
Tony Wasserka
b49c09c36b Remove the ZTP speedup hack. Also remove useless debugging code, and a presumably outdated workaround (which was commented out).
Fixes issue 6875.
2014-01-16 00:11:12 +01:00
degasus
5e5db9fbc6 VideoCommon: cleanup of "components" usage
This "u32 components" is a list of flags which attributes of the vertex loader are present.
We are used to append this variable to lots of vertex generation functions, but some of them don't need it at all.
2014-01-15 16:58:36 +01:00
Pierre Bourdon
a561c436fc Change the default GFX backend from D3D11 to OGL.
Rationale and discussion:
    https://ml.dolphin-emu.org/archives/dolphin-dev/2014-January/000003.html
2014-01-14 21:57:32 +01:00
degasus
e00c3ce363 TextureConverter: remove implicit int->float convertion
They was used to check if we're writing to the first or second part of one pixel.
So this is now done with a boolean and a ternary operator.
2014-01-13 12:10:17 +01:00
Scott Mansell
9aff16e7c1 Fix stupid bug in Z16L depth texture efb2ram encoding shader. 2014-01-12 13:32:06 +13:00
Ryan Houdek
e3d103f60c Update some of the comments in DriverDetails.h for drivers that have fixed their bugs. 2014-01-11 07:31:47 -06:00
Ryan Houdek
67f099af33 Enable buffer_storage for Nvidia drivers 332.21 and above. 2014-01-09 12:06:12 -06:00
degasus
eb310cbd1d VideoCommon: disable efb access + perf querys on cph thread
The usual way to handle this kind of request is to rise a flag which the gpu thread polls.
The gpu thread itself either generates the result or just write zeros if disabled.
After this, it rise another flag which says that this work is done.

So if disabled, we still have the cpu-gpu round trip time. This commit just returns 0 on the cpu thread
instead of playing ping pong...
2014-01-09 18:37:59 +01:00
Ryan Houdek
b55a4bb087 Slight optimization in the pixel shader. We are using pow(2.0, X) in place of exp2(X). This can be faster in places that don't optimize a pow to a exp2 in this case.
Notice this from here: http://cgit.freedesktop.org/mesa/mesa/commit/?id=847bc36a38d42967ad6bf0492fe90a4892d9d799
Intel Haswell GPU is 24 cycles for POW and 14 cycles for EXP2.
Maybe other GPUs don't optimize this either. Just be safe.
2014-01-08 16:40:31 -06:00
Ryan Houdek
7acc64eb0a [Android] Reenable the bug for dynamic UBO array member accesses.
Some information on this bug since this isn't quite true.
Seemingly with the v53 driver, Qualcomm has actually fixed this bug. So we can dynamically access UBO array members.
The issue that is cropping up is actually converting our attribute 'fposmtx' to an integer.
int posmtx = int(fpostmtx);
This line causes some seemingly garbage values to enter in to the posmtx variable.
Not sure exactly why it is failing, probably them just not actually converting the float to an integer and just handling the float directly as a integer.
So the bug is going to stay active with Qualcomm devices until we convert this vertex attribute from a float to a integer.
2014-01-07 07:56:30 -06:00
degasus
e6676b4565 OpenGL: fix scaled efb2ram copys
This fix a regression in revision 687097d4bc1ee2f8ee60011280823e01907a986a because of the wrong order of moving the sampled rect and scaling.
2014-01-05 18:19:17 +01:00
Pierre Bourdon
ed67d1ae2f Fix the Zelda: The Wind Waker heat effect glitch.
Let's talk a bit about this bug. 12nd oldest bug not fixed in Dolphin, it was a
lot of fun to debug and it kept me busy for a while :)

Shoutout to Nintendo for framework.map, without which this could have taken a
lot longer.

Basic debugging using apitrace shows that the heat effect is rendered in an
interesting way:
* An EFB copy texture is created, using the hardware scaler to divide the
  texture resolution by two and that way create the blur effect.
* This texture is then warped using indirect texturing: a deformation map is
  used to "move" the texture coordinates used to sample the framebuffer copy.

Pixel shader: http://pastie.org/private/25oe1pqn6s0h5yieks1jfw

Interestingly, when looking at apitrace, the deformation texture was only 4x4
pixels... weird. It also does not have any feature that you would expect from a
deformation map. Seeing how the heat effect glitches, this deformation texture
being wrong looks like a good candidate for the problem. Let's see how it's
loaded!

By NOPing random calls to GXSetTevIndirect, we find a call that when removed
breaks the effect completely. The parameters used for this call come from the
results of methods of JPAExTexShapeArc objects. 3 different objects go through
this code path, by breaking each one we can notice that the one "controlling"
the heat effect is the one at 0x81575b98.

Following the path of this object a bit more, we can see that it has a method
called "getIndTexId". When this is called, the returned texture ID is used to
index a map and get a JPATextureArc object stored at 0x81577bec.

Nice feature of JPATextureArc: they have a getName method. For this object, it
returns "AK_kagerouInd01". We can probably use that to see how this texture
should look like, by loading it "manually" from the Wind Waker DVD.
Unfortunately I don't know how to do that. Fortunately @Abahbob got me the
texture I wanted in less than 10min after I asked him on Twitter.
AK_kagerouInd01 is a 32x32 texture that really looks like a deformation map:
http://i.imgur.com/0TfZEVj.png . Fun fact: "kagerou" means "heat haze" in JP.

So apparently we're not using the right texture object when rendering! The
GXTexObj that maps to the JPATextureArc is at offset 0x81577bf0 and points to
data at 0x80ed0460, but we're loading texture data from 0x0039d860 instead.

I started to suspect the BP write that loads the texture parameters "did not
work" somehow. Logged that and yes: nothing gets loaded to texture stage 1! ...
but it turns out this is normal, the deformation map is loaded to texture stage
5 (hardcoded in the DOL). Wait, why is the TextureCache trying to load from
texture stage 1 then?!

Because someone sucked at hex.

Fixes issue 2338.
2014-01-05 11:33:15 +01:00
degasus
c42f274e22 OpenGL: use shader 420pack if available to staticly bind ubo location
Bindung locations after compiling a shader stalls the driver. So if we manage not to bind anything after compilation, the lag would be reduced much.
2014-01-05 10:38:45 +01:00
degasus
4fff5ac90d OpenGL: drop UBO-workaround usage for efb2ram shaders
It's just brainfuck to use this workaroung there. Just fetch the uniform location like all other util shaders.
2014-01-05 09:52:26 +01:00
degasus
01351795f0 TextureCache: Warn for invalid custom textures
At the moment, custom textures with:
- invalid mipmap size
- invalid aspect ratio
- non-fractional scaling factors
are allowed. But they can't be loaded fine by the backend, so generate a warning if someone trys to load them.
2014-01-03 14:30:12 +01:00
degasus
0f0a3cc509 ogl: clamp to edge for out of bound efb access
fixes issue 6898

OpenGL defaults are GL_REPEAT, which is even more unlikely than GL_CLAMP_TO_EDGE.
As I can't test the behavoir of the real hardware, I changed it to how it works before,
but I guess just clip the texture makes more sense.
2014-01-03 08:15:19 +01:00
Ryan Houdek
1118226f27 Merge branch 'master' into buffer_storage 2013-12-31 19:18:30 -06:00
Ryan Houdek
8d8b0fc884 Merge branch 'master' into buffer_storage
Conflicts:
	Source/Core/VideoBackends/OGL/Src/Render.cpp
	Source/Core/VideoCommon/Src/DriverDetails.cpp
	Source/Core/VideoCommon/Src/DriverDetails.h
2013-12-31 15:41:50 -06:00
Jasper St. Pierre
34692ab826 Remove unnecessary Src/ folders 2013-12-31 14:03:19 -05:00
Jasper St. Pierre
43e618682e Convert all vcxproj files to UNIX line endings 2013-12-31 14:03:18 -05:00
Ryan Houdek
6d63db96e9 Disable primitive restart on buggy OS X Intel HD 3000 drivers. 2013-12-30 18:26:55 -06:00
Tony Wasserka
3aa0a63fe6 VertexShaderGen: Remove Sonic Unleashed hack. Doesn't seem to be required anymore.
Either way, even if it's still needed for anything, this is not the correct way to fix the issue.
2013-12-30 20:28:07 +01:00
NeoBrainX
3cfa04b5cf VertexShaderManager: Remove a hardcoded projection hack. 2013-12-30 19:26:10 +00:00
Ryan Houdek
ce99921c20 [buffer_storage] Implement ARB_buffer_storage. Disable it for GL_ARRAY_BUFFER due to a bug in Nvidia's drivers that causes black screen with it. 2013-12-27 10:56:03 -06:00
Ryan Houdek
e697d7a2dd [Android] Work around Qualcomm's broken garbage in their v53 drivers. This doesn't fix the issue, just a work around. This is the stupidest issue coming from Qualcomm. Now Dolphin Mobile won't crash immediately, but there are new SPS issues. 2013-12-19 17:30:39 -06:00
Ryan Houdek
945b903499 Work around AMD's broken Linux drivers when it comes to pinned memory and base_vertex usage. It seems that using pinned memory with base_vertex disabled is quicker than the other way around. 2013-12-19 09:40:13 -06:00
Ryan Houdek
a35b62358a [Android] Things fixed in Qualcomm driver v53. GLSL Centroid usage. SHADER_INFO_LOG reporting 0 at all times. Some crazy nonsense that broke the FPS counter. Those are all fixed. glBufferSubData still makes the device do a OOM error, and is still stupidly slow to use. Many more bugs remain in this latest Qualcomm driver. 2013-12-18 22:23:26 -06:00
Ryan Houdek
eb3b933dd0 Remove all instances of OpenCL in the Dolphin Project. A brief history of OpenCL in Dolphin. OpenCL was originally added to the Dolphin codebase 1 month after it was released with OS X Snow Leopard in 2009. OpenCL was one of the largest group projects that Dolphin ever has had. The OpenCL texture decoder was originally aded with version 1.0 of the OpenCL spec; This version didn't have the capability of a OpenCL-OpenGL interop which would allow for uploading textures once and have it decoded directly to a OpenGL texure. This was to be worked out when the OpenCL 1.1 spec was released and allowed the interop. This work has never been done, and no one in the team is willing to work on it for various reasons. OpenCL has had the unreasonable expectation that it increases the performance of video games that require a large amount of EFB copies like NSMBW. In reality, enabling OpenCL just put the graphics card in a higher power mode which increased the game speed. This is due to the unfortunate effect of Dolphin tending to not push GPUs out of their lower frequency power savings modes. Thanks to everyone that had contributed to the OpenCL texture decoder. 2013-12-11 15:15:55 -06:00
Tony Wasserka
c9d9081bf9 Use less brain damaged names for DLCache and TextureDecoder. 2013-12-11 20:35:12 +01:00
degasus
2d8515c0cf VideoCommon: remove outdated copy of OGL::VertexManager::vFlush 2013-12-09 23:49:09 +01:00
degasus
42619c1d2d Merge branch 'ogl-tex2d'
Conflicts:
	Source/Core/VideoBackends/OGL/Src/TextureConverter.cpp
2013-12-09 13:04:14 +01:00
degasus
687097d4bc OGL: use integer uniforms for efb2ram texture converter 2013-12-09 12:33:50 +01:00
Ryan Houdek
14d9802ea4 Oops. Fix a typo in the DriverDetails change. 2013-12-06 12:18:20 -06:00
Ryan Houdek
faf8792351 Support OS specific bugs in our DriverDetails. 2013-12-05 09:32:27 -06:00
degasus
2cbefa2905 PixelShaderManager: clear s_bViewPortChanged flag
This flag wasn't cleared at all, so we set our constants dirty every time...

This could fix some performance regressions because of revision 6798a4763e9c
2013-12-03 09:37:45 +01:00
degasus
69137cff4c Merge X11+D3D FreeLook feature into DolphinWX
This removes the redundant code and also implements this feature for OSX and Wayland.
But so it's dropped for non-wx builds...

imo DolphinWX still isn't the best place for this, but now it's in the same file as all other hotkeys. Maybe they'll be moved to InputCommon sometimes at once ...
2013-11-29 06:09:54 +01:00
degasus
11973d31c1 TextureConverter: remove WriteIncrementSampleX 2013-11-25 17:11:41 +01:00
Ryan Houdek
421fd0e16e Fix OpenGL ES 3. 2013-11-25 15:36:24 +00:00
degasus
64a1969e36 TextureConverter: fix scoping 2013-11-25 16:34:08 +01:00
degasus
2a2f2fd4eb TextureConvertion: merge Write*Swizzler 2013-11-25 16:19:08 +01:00
degasus
6750a81972 TextureConverter: Use integer math for swizzling
also move int(efb_coord) -> float(ogl_fb_coord) into WriteSampleColor
2013-11-25 15:49:13 +01:00
degasus
bcb31b09d3 TextureConverter: Use gl_FragCoord instead of uv0 2013-11-25 15:01:18 +01:00
degasus
a289e0604f TextureConverter: remove D3D9 foo
This file is in VideoCommon, but as D3D11 doesn't use it and D3D9 is dropped, it's time to clean up.
2013-11-25 14:53:44 +01:00