rpcsx/rpcs3/Emu/RSX/Common
linkmauve cfd5cf6bdb Optimise primitive_restart::upload_untouched() (#6881)
* rsx: Optimise primitive_restart::upload_untouched() with SSE4.1

This optimisation is only applied when skip_restart is false.

I’ve only tested the u16 codepath, as it is the one used in NieR.

In some very unscientific profiling at the save point of the port town,
this function used to take 2.76% of the total frame time; it now takes
about 0.40%.

* rsx: Mark all SSE4.1 functions with attributes on gcc and clang

This assures the compiler that we only call these functions after
checking that the CPU supports these instructions.
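
As an illustration of the attribute-plus-runtime-check pattern described above, here is a minimal sketch. It is not the code from this commit: the macro `SSE41_FUNC` and the functions `max_u16_sse41`, `max_u16_scalar` and `max_u16` are hypothetical names chosen for the example, and the dispatch assumes the `__builtin_cpu_supports` builtin available in gcc and clang on x86.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical macros for this sketch (not taken from the commit).
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAS_SSE41_PATH 1
// Compile only the marked function with SSE4.1 enabled, so the rest of the
// translation unit can still be built for a baseline CPU.
#define SSE41_FUNC __attribute__((target("sse4.1")))
#else
#define HAS_SSE41_PATH 0
#endif

// Portable fallback: largest u16 in the buffer (0 for an empty buffer).
static std::uint16_t max_u16_scalar(const std::uint16_t* src, std::size_t count)
{
    std::uint16_t result = 0;
    for (std::size_t i = 0; i < count; ++i)
        if (src[i] > result) result = src[i];
    return result;
}

#if HAS_SSE41_PATH
// SSE4.1 path: compares 8 lanes per iteration with _mm_max_epu16.
// Must only be called after the CPU check in max_u16() below.
SSE41_FUNC static std::uint16_t max_u16_sse41(const std::uint16_t* src, std::size_t count)
{
    __m128i acc = _mm_setzero_si128();
    std::size_t i = 0;
    for (; i + 8 <= count; i += 8)
    {
        const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
        acc = _mm_max_epu16(acc, v); // SSE4.1 instruction
    }

    // Reduce the 8 accumulated lanes, then handle the remaining tail elements.
    alignas(16) std::uint16_t lanes[8];
    _mm_store_si128(reinterpret_cast<__m128i*>(lanes), acc);
    std::uint16_t result = 0;
    for (std::uint16_t lane : lanes)
        if (lane > result) result = lane;
    for (; i < count; ++i)
        if (src[i] > result) result = src[i];
    return result;
}
#endif

// Dispatcher: checks CPU support at run time before taking the SSE4.1 path.
std::uint16_t max_u16(const std::uint16_t* src, std::size_t count)
{
#if HAS_SSE41_PATH
    if (__builtin_cpu_supports("sse4.1"))
        return max_u16_sse41(src, count);
#endif
    return max_u16_scalar(src, count);
}
```

The attribute only changes how the marked function is compiled; the caller remains responsible for the CPU check, which is why the dispatcher is the only place that calls the SSE4.1 variant.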

* rsx: Add an AVX2 implementation of primitive restart ibo upload

* rsx: Remove redefinition of SSE4.1 instructions

Now that clang is aware that our functions are compiled with SSE4.1, it
lets us generate this code using its intrinsics.

* rsx: Optimise vector to scalar conversion

This is done using the minpos and srli intrinsics, and generates less
code than before.

Thanks Nekotekina for the suggestion!
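
The vector to scalar conversion mentioned above can be sketched roughly as follows. This is an assumption about the general technique, not the code from this commit: the helper names (`hmin_epu16`, `hmax_epu16`, `hmin_epu32`, `reduce_u16`, `min_u32`) and the `range_u16` struct are hypothetical. It uses `_mm_minpos_epu16` for 16-bit lanes and `_mm_srli_si128` byte shifts for 32-bit lanes, with a scalar fallback when SSE4.1 is unavailable.

```cpp
#include <algorithm>
#include <cstdint>

#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define SKETCH_SSE41 1
#define SSE41_FUNC __attribute__((target("sse4.1")))
#else
#define SKETCH_SSE41 0
#endif

struct range_u16 { std::uint16_t lo, hi; }; // hypothetical result type

#if SKETCH_SSE41
// Horizontal minimum of 8 u16 lanes: minpos leaves the smallest lane in
// bits 15:0 of the result (and its index in bits 18:16, discarded by the cast).
SSE41_FUNC static std::uint16_t hmin_epu16(__m128i v)
{
    return static_cast<std::uint16_t>(_mm_cvtsi128_si32(_mm_minpos_epu16(v)));
}

// Horizontal maximum via the identity max(v) == ~min(~v) for unsigned lanes.
SSE41_FUNC static std::uint16_t hmax_epu16(__m128i v)
{
    const __m128i inverted = _mm_xor_si128(v, _mm_set1_epi16(-1));
    return static_cast<std::uint16_t>(~_mm_cvtsi128_si32(_mm_minpos_epu16(inverted)));
}

// Horizontal minimum of 4 u32 lanes: fold the upper half onto the lower half
// with byte shifts, then read lane 0.
SSE41_FUNC static std::uint32_t hmin_epu32(__m128i v)
{
    v = _mm_min_epu32(v, _mm_srli_si128(v, 8)); // lanes {0,1} against {2,3}
    v = _mm_min_epu32(v, _mm_srli_si128(v, 4)); // lane 0 against lane 1
    return static_cast<std::uint32_t>(_mm_cvtsi128_si32(v));
}

SSE41_FUNC static range_u16 reduce_u16_sse41(const std::uint16_t* lanes)
{
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(lanes));
    return { hmin_epu16(v), hmax_epu16(v) };
}

SSE41_FUNC static std::uint32_t min_u32_sse41(const std::uint32_t* lanes)
{
    return hmin_epu32(_mm_loadu_si128(reinterpret_cast<const __m128i*>(lanes)));
}
#endif

// Smallest and largest of exactly 8 u16 values.
range_u16 reduce_u16(const std::uint16_t (&lanes)[8])
{
#if SKETCH_SSE41
    if (__builtin_cpu_supports("sse4.1"))
        return reduce_u16_sse41(lanes);
#endif
    return { *std::min_element(lanes, lanes + 8), *std::max_element(lanes, lanes + 8) };
}

// Smallest of exactly 4 u32 values.
std::uint32_t min_u32(const std::uint32_t (&lanes)[4])
{
#if SKETCH_SSE41
    if (__builtin_cpu_supports("sse4.1"))
        return min_u32_sse41(lanes);
#endif
    return *std::min_element(lanes, lanes + 4);
}
```

Compared with storing the vector to memory and looping over the lanes, each reduction here is a handful of instructions with no branches, which fits the commit's remark about generating less code.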
2019-10-30 16:42:44 +03:00
BufferUtils.cpp Optimise primitive_restart::upload_untouched() (#6881) 2019-10-30 16:42:44 +03:00
BufferUtils.h Remove unnecessary header includes 2019-06-25 17:11:10 +03:00
FragmentProgramDecompiler.cpp rsx: Copypasta fix 2019-10-23 00:50:24 +03:00
FragmentProgramDecompiler.h rsx/fp: Warnings cleanup 2019-09-01 18:59:50 +03:00
GLSLCommon.h rsx: Add support for delayed shader discard. 2019-10-22 13:44:49 +03:00
GLSLTypes.h rsx: Add support for delayed shader discard. 2019-10-22 13:44:49 +03:00
ProgramStateCache.cpp rsx/decompiler: Restructure program register behavior 2019-08-26 20:03:31 +03:00
ProgramStateCache.h rsx/prog: Warnings cleanup 2019-09-01 18:59:50 +03:00
ring_buffer_helper.h rsx/ring_buffer: Warnings cleanup 2019-09-01 18:59:50 +03:00
ShaderParam.cpp RSX: Add a class factorizing decompiler code 2015-05-23 20:45:07 +02:00
ShaderParam.h rsx/decompiler: Restructure program register behavior 2019-08-26 20:03:31 +03:00
surface_store.cpp EXCEPTION macro removed 2016-08-08 19:19:32 +03:00
surface_store.h rsx: Explicity describe transfer regions for both source and destination blocks 2019-10-04 18:10:46 +03:00
surface_utils.h rsx: Explicity describe transfer regions for both source and destination blocks 2019-10-04 18:10:46 +03:00
TextGlyphs.h rsx: TextGlyphs optimizations 2019-06-09 23:09:11 +01:00
texture_cache.h rsx: Separate subresource_layout:dim_in_block and 2019-10-29 20:03:54 +03:00
texture_cache_checker.h Texture cache cleanup, refactoring and fixes 2018-09-24 15:26:40 +03:00
texture_cache_helpers.h rsx: Fixup for slice gathering for structures with multiple mipmap levels 2019-10-17 18:18:00 +03:00
texture_cache_predictor.h Add missing #includes to header files 2019-06-25 17:11:10 +03:00
texture_cache_utils.h rsx: Experiments with nul sink 2019-09-12 23:32:21 +03:00
TextureUtils.cpp rsx: Separate subresource_layout:dim_in_block and 2019-10-29 20:03:54 +03:00
TextureUtils.h rsx: Separate subresource_layout:dim_in_block and 2019-10-29 20:03:54 +03:00
VertexProgramDecompiler.cpp rsx/vp: Warnings cleanup 2019-09-01 18:59:50 +03:00
VertexProgramDecompiler.h rsx/vp: Warnings cleanup 2019-09-01 18:59:50 +03:00