161 Commits

Author SHA1 Message Date
Jordan Woyak
f2647a1216 Implement OGL sampler cache. Allows binding a texture multiple times with different parameters. Also possibly gives a very small speed improvement. 2013-02-19 21:18:53 -06:00
degasus
a629dea4dc Merge branch 'master' into GLSL-master
Conflicts:
	CMakeLists.txt
	Source/Core/DolphinWX/Dolphin.vcxproj
	Source/Core/DolphinWX/Src/GLInterface/WX.cpp
	Source/Core/DolphinWX/Src/GLInterface/WX.h
	Source/Core/VideoCommon/Src/TextureCacheBase.cpp
	Source/Core/VideoCommon/Src/TextureCacheBase.h
	Source/Plugins/Plugin_VideoDX11/Src/TextureCache.cpp
	Source/Plugins/Plugin_VideoDX11/Src/TextureCache.h
	Source/Plugins/Plugin_VideoDX9/Src/TextureCache.cpp
	Source/Plugins/Plugin_VideoDX9/Src/TextureCache.h
	Source/Plugins/Plugin_VideoOGL/Src/Render.cpp
	Source/Plugins/Plugin_VideoOGL/Src/TextureCache.cpp
	Source/Plugins/Plugin_VideoOGL/Src/TextureCache.h
	Source/Plugins/Plugin_VideoSoftware/Src/SWmain.cpp

damn mipmap_fixes ...
2013-02-18 18:49:20 +01:00
Jordan Woyak
d994e56b60 Changes/cleanup to TextureCache::Load and other mipmap related code.
The significant change is what is now line 520 of TextureCacheBase.cpp:
((std::max(mipWidth, bsw) * std::max(mipHeight, bsh) * bsdepth) >> 1)
to
TexDecoder_GetTextureSizeInBytes(expanded_mip_width, expanded_mip_height, texformat);

Fixes issue 5328.
Fixes issue 5461.
2013-02-15 22:56:29 -06:00
degasus
bbc292c210 merge Vertex and PixelShaderCache into ProgramShaderCache
this is the first step, uniform handling is still missing
2013-02-13 13:12:19 +01:00
NeoBrainX
ed0abc9dc5 Merge branch 'mipmap_fixes'. 2013-02-07 20:40:33 +01:00
degasus
76adc77fa6 bigger buffers 2013-02-05 18:01:27 +01:00
degasus
3af9840a4c stream by map and sync
but not working perfectly, so disabled
2013-02-01 15:15:25 +01:00
degasus
878bd7f26c implement streaming by bufferSubData, split upload and allocation in ringbuffer 2013-02-01 12:30:08 +01:00
degasus
30170575c8 create StreamBuffer class for ogl upload 2013-01-31 23:11:53 +01:00
degasus
fd06342a97 set hint GL_STREAM_READ
it's wrong, but so we are guaranteed to get pinned memory.
it's slower for rendering, but faster for mapping.
2013-01-28 13:03:31 +01:00
degasus
41b1128fdd orphan vbo also with glBufferData 2013-01-25 13:28:05 +01:00
degasus
e0ffdda26e Merge branch 'immediate-removal' into GLSL-master
Conflicts:
	Source/Core/VideoCommon/Src/PixelShaderGen.cpp
	Source/Plugins/Plugin_VideoSoftware/Src/SWRenderer.cpp

immediate-removal is a new created branch seperated from master but reverted the revert of immediate-removal
so we get less conflicts by merging
2013-01-24 16:58:28 +01:00
degasus
d60cc373d1 Revert "Revert 30dd9c2 e9d00bf db5f4c8 and bff0fae"
This reverts commit d0301ca89d318d8d6e3e7fde9c4b60e470e184e4.

Conflicts:
	.gitignore
2013-01-24 16:11:07 +01:00
degasus
e7d5b274c0 add stage parameter for texture load, so ogl can bind to the correct sampler 2013-01-19 00:47:48 +01:00
degasus
714ff50fdf set blending if dual source might be triggered 2013-01-18 00:44:35 +01:00
degasus
2f78986e2c Merge branch 'Graphic_Update' into GLSL-master
Conflicts:
	Source/Core/VideoCommon/Src/VertexManagerBase.cpp
	Source/Plugins/Plugin_VideoOGL/Src/NativeVertexFormat.cpp
	Source/Plugins/Plugin_VideoOGL/Src/Render.cpp
	Source/Plugins/Plugin_VideoOGL/Src/VertexManager.cpp
2013-01-14 21:36:31 +01:00
degasus
5fe3def64c videoConfig cleanup 2013-01-14 20:00:33 +01:00
degasus
c3aafc77b3 upload complete uniform buffer at once
this is the way of dx11. it would upload more per draw, but uses less calls.
will be faster if many uniforms are changed, but slower else
2013-01-14 13:58:11 +01:00
NeoBrainX
d26bcb0847 Move alpha pretest to BPMemory.h and rename a bunch of alpha testing related stuff 2013-01-08 18:56:01 +01:00
Ryan Houdek
d0301ca89d Revert 30dd9c2 e9d00bf db5f4c8 and bff0fae 2013-01-07 13:47:34 -06:00
degasus
b38b62afc6 remove glsl binding support. convert every shader to version 130 2013-01-02 16:56:08 +01:00
degasus
70c63ce6cf fix dual pass alpha 2012-12-28 14:24:12 +01:00
degasus
193056493a also use shaderCaches in rasterFont 2012-12-28 00:52:44 +01:00
degasus
316a33d1e6 Merge branch 'master' into GLSL-master
Conflicts:
	Source/Core/DolphinWX/Src/VideoConfigDiag.h
	Source/Plugins/Plugin_VideoOGL/Src/GLUtil.h
	Source/Plugins/Plugin_VideoOGL/Src/Render.cpp
	Source/Plugins/Plugin_VideoOGL/Src/TextureCache.cpp
	Source/Plugins/Plugin_VideoOGL/Src/TextureConverter.cpp
2012-12-27 10:36:54 +01:00
Ryan Houdek
9209253e0d Initial removal of Nvidia CG. Still some more cleanup to go 2012-12-24 11:09:52 -06:00
degasus
1919a458e8 only use one buffer, orphaning should do the rest 2012-12-15 17:28:58 +01:00
degasus
ba8264c2ac use VAO in VertexManager
to use VAO, we must use VBO, so some legency code was removed:

- ARB_map_buffer_range must be available (OGL 3.0), don't call glBufferSubData if not
- ARB_draw_elements_base_vertex also (OGL 3.2), else we have to set the pointers every time
- USE_JIT was removed, it was broken and it isn't needed any more

And the index and vertex buffers are now synchronized, so that there will be one VAO per
NativeVertexFormat and Buffer.
2012-12-15 14:43:01 +01:00
degasus
6864b40e26 reset glEnableClientState befor every draw
should be done with VAO, but atm, this is not possible :-(
this also partial revert the fix in fb92c338af83 (activating texture0 globally).

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
2012-12-13 15:27:46 -06:00
rodolfoosvaldobogado
3936c06ee8 more fixes for my last commit, player problem in twin snakes is fixed 2012-11-11 22:39:27 -03:00
rodolfoosvaldobogado
53b62ab169 This should fix the bugs introduced by my last commit please retest the games that where showing graphical glitches. 2012-11-10 11:37:08 -03:00
rodolfoosvaldobogado
ee72852491 implement some code to reduce the amounts of calls to setup vertex format, in d3d9 it gives no noticeable speedup, in opengl it still does not work right.
thanks to neobrain for the idea
2012-10-26 23:18:09 -03:00
rodolfoosvaldobogado
eaa1ea71c1 Implement the new buffer approach in opengl. sadly in my machine it gives my only 2 more fps and if your hardware does not support ARB_map_buffer_range is even slower than plain vertex arrays.
change naming in all the backends vertex managers to make more easy to continue with the merge an some future improvements.
please test this as i'm interested in knowing the performance in linux and windows with the different hardware platforms.
2012-10-26 11:34:02 -03:00
NeoBrainX
52963584f0 [bugfix] Make sure to specify the correct max lod level. 2012-10-20 21:07:03 +02:00
rodolfoosvaldobogado@gmail.com
5230146c73 Hey, long time no commits :).
So to compensate lets bring back some speed to the emulation.
change a little the way the vertex are send to the gpu,
This first implementation changes dx9 a lot and dx11 a little to increase the parallelism between the cpu and gpu.
ogl: is my next step in ogl is a little more trickier so i have to take a little more time.
the original concept is Marcos idea, with my little touch to make it even more faster.
what to look for: SPEEEEEDDD :).
please test it a lot and let me know if you see any problem.
in dx9 the code is prepared to fall back to the previous implementation if your card does not support the amount of buffers needed.
So if you did not experience any speed gains you know where is the problem :).
for the ones with more experience and compression of the code please test changing the amount and size of the buffers to tune this for your specific machine.
The current values are the sweet spot for my machine.
All must Thanks Marcos, I hate him for giving good ideas when I'm full of work.
2012-10-20 10:22:15 -03:00
Ryan Houdek
2e15440896 Add support for Dual source blending to older ATI cards that don't support 420pack but do support GL_ARB_blend_func_extended. This is more proper as well anyways. 2012-10-09 23:56:00 -05:00
Ryan Houdek
e3854ded73 Woops, better not forget the ing 2012-10-09 23:54:18 -05:00
Ryan Houdek
4cd748bbec Remove some warnings in ProgramShadercache, Was using wrong variable for checking dual source blending. 2012-10-09 23:54:18 -05:00
Shawn Hoffman
31a8424bcc fix formatting uglies introduced in glsl-master branch 2012-10-09 23:54:17 -05:00
Ryan Houdek
9996f27120 Give OSX users more of a chance of supporting Single pass DSB in the future. 2012-10-09 23:42:41 -05:00
Ryan Houdek
f8d0c28e53 Set Sampler values at program make time instead of every frame. Fix an issue when The user had UBO support but not Binding support. 2012-10-09 23:42:40 -05:00
Jordan Woyak
d70726b035 glMapBuffer was slow, go back to glBufferSubData, single combined ps/vs ubo now 2012-10-09 23:41:48 -05:00
Jordan Woyak
e641ede232 make use of glMapBuffer to set ubo data 2012-10-09 23:41:48 -05:00
Ryan Houdek
4a84c6f742 Add in UBOs, doesn't work yet. Still debugging here. 2012-10-09 23:41:05 -05:00
Ryan Houdek
d83ead5914 Support Dual Source Blending in OGL plugin with GLSL. 2012-10-09 23:39:16 -05:00
Ryan Houdek
7cec31dbf3 Almost there. 2012-10-09 23:33:02 -05:00
Ryan Houdek
a357c77257 Add in GLSL setting again.
PS and VS making. Untested and won't work for now.

Add in program shader cache files.

Readd NativeVertexFormat stuffs.

Add in PS and VS cache things.

SetShaders in places.

Fixed EFB cache index computations in OpenGL renderer.

The previous computation was very likely to go out of array bounds,
which could result in crashes on EFB access.

Also, the cache size was rounded down instead of up. This is a problem
since EFB_HEIGHT (528) is not a multiple of EFB_CACHE_RECT_SIZE (64).
2012-10-09 23:23:37 -05:00
NeoBrainX
08a9c66037 Revert the recent zcomploc changes including the Graphic_Fixes merge.
Reason:
- It's wrong, zcomploc can't be emulated perfectly in HW backends without severely impacting performance.
- It provides virtually no advantages over the previous hack while introducing lots of code.
- There is a better alternative: If people insist on having some sort of valid zcomploc emulation, I suggest rendering each primitive separately while using a _clean_ dual-pass approach to emulate zcomploc.

This reverts commit 0efd4e5c29766ba5f5d22204339637ade9ccec83.
This reverts commit b4ec836aca4392a86b864dc58b1030ca616fe0d5.
This reverts commit bb4c9e2205d4117f48fd4ca50774ee56c28c92e4.
This reverts commit 146b02615c07dd52dddaa18b7e23d09bc23b549e.
2012-08-10 20:12:02 +02:00
skidau
0efd4e5c29 Skipped the ZCompLoc pass if the result can be determined at compile time. Brings back the speed lost by r146b02615c07. 2012-08-06 09:29:01 +10:00
NeoBrainX
b5ad382b07 Fast mipmaps deserves to die!! 2012-06-08 00:22:57 +02:00
skidau
146b02615c Merge rodolfoosvaldobogado's zcomploc code (Graphic_Fixes branch) 2012-05-26 13:47:07 +10:00