Commit graph

924 commits

Author SHA1 Message Date
kd-11 b7470cfc1a rsx: Tighten format checks in cache hit tests 2019-04-09 13:40:54 +03:00
kd-11 443fde760f rsx: Blit engine clipping fixes
- Do not round up sub-pixel offsets, round down instead
- Do not allow incomplete sources for hw blit transfer
- Reimplement src clipping (slice_h)
- Check 'area' of incoming texels and correct for them before RTT lookup/transfer
- Filter out incomplete targets when performing RTT lookup (1 texel or less contribution)
2019-04-09 13:40:54 +03:00
kd-11 41b87cf577 rsx: Blit engine fixes
- If a transfer writes to a RTT and depth mismatch happens, create a local target and the upload function will likely resolve between the two
- If a surface is rejected, reset the target region!
2019-03-22 21:27:15 +03:00
kd-11 86ad204636 rsx: Rebase output region when using upload-fallback path 2019-03-22 21:27:15 +03:00
kd-11 dbc8e70ddd rsx: Silence some compiler noise 2019-03-22 21:27:15 +03:00
kd-11 adc59f9810 rsx: Fix blit transfers when texel sizes mismatch
- Also refactors some bpp handling code
- Simplify texture intersection test to use a normalized/uniform coordinate space
- Fix broken bounds checking as well
2019-03-22 21:27:15 +03:00
kd-11 03fca73cf4 rsx: Fix blit intersection falling outside the available texture
- Just becaue we have a hit inside the tile of interest does not guarantee that it sits inside the texture!
2019-03-20 10:05:54 +03:00
kd-11 3ef16bee47 rsx: Fix texture lookups and avoid out-of-bounds copies/transfers 2019-03-17 21:50:11 +03:00
kd-11 bb65e45614 rsx: Implement GPU acceleration for rotated images 2019-03-17 21:50:11 +03:00
kd-11 5260f4b47d rsx: Improvements to memory flush mechanism
- Batch dma transfers whenever possible and do them in one go
- vk: Always ensure that queued dma transfers are visible to the GPU before they are needed by the host
  Requires a little refactoring to allow proper communication of the commandbuffer state
- vk: Code cleanup, the simplified mechanism makes it so that its not necessary to pass tons of args to methods
- vk: Fixup - do not forcefully do dma transfers on sections in an invalidation zone! They may have been speculated correctly already
2019-03-17 21:50:11 +03:00
kd-11 385485204b vk/gl: Omit unlocked data when grabbing flip sources from texture cache 2019-03-17 21:50:11 +03:00
kd-11 74eeacd091 vk/gl: Improve memory tag sync and test
- Properly pass parameters such as rsx-pitch to the surface store
- Do not crash if a surface fails verification in flip, use fall-back instead
2019-03-17 21:50:11 +03:00
kd-11 1a44446250 rsx: Fix dst upload block region
- The section needed starts at image origin, not transfer origin!
2019-03-17 21:50:11 +03:00
kd-11 a49a0f2a86 vk/gl: Synchronization improvements
- Properly wait for the buffer transfer operation to finish before map/readback!
- Change vkFence to vkEvent which works more like a GL fence which is what is needed.
- Implement supporting methods and functions
- Do not destroy fence by immediately waiting after copying to dma buffer
2019-03-17 21:50:11 +03:00
kd-11 85cb703633 rsx/cache: Debugging bugs introduced by the atlas coverage check
- Figured out why it breaks things, ofc can't actually check for coverage when there is no proper fbo data persistence
2019-03-17 21:50:11 +03:00
kd-11 3a4083263e rsx: Fix texture transfer when pitch does not match exactly 2019-03-17 21:50:11 +03:00
kd-11 612160a8ff rsx: Fix zero-pitch textures
- Assumption here is that only texel (0, 0) is accessible. Inline with other pitch 0 operations.
- TODO: Verify pitch 0 does not advance in Y either
2019-03-17 21:50:11 +03:00
kd-11 17c49d21a5 rsx/blit: Remove workarounds/hacks added for master. Start implementation/stubs for blit engine rotations in GPU 2019-03-17 21:50:11 +03:00
kd-11 745f8f9627 rsx: Remove pointless assert 2019-03-17 21:50:11 +03:00
kd-11 358558aaa7 cleanup and fixups 2019-03-10 16:09:05 +03:00
kd-11 04dda44225 rsx: Properly generate render target data with all parameters provided
- Build-up to variable-sized framebuffers and AA implementation
- Also allows accurate range calculation for our hit testing
2019-03-10 16:09:05 +03:00
kd-11 21bc6c7a87 rsx: Properly resolve data for upload when needed.
- Avoids blindly reusing blit dst sections as they may contain garbage.
  If a section was unlocked for a flush, just discard it as its reuse introduces potential data corruption.
  Since the data needs to be reuploaded anyway (for now), its better to start afresh
- In case of format mismatch, reset the calculated dst block
- Add a bounds check to determine if data contained in an atlas is good enough for sampling the cache.
  If not enough data is provided, fall back to full upload
2019-03-10 16:09:05 +03:00
kd-11 9d4d3d9443 rsx: Reimplement render target intersection tests when using hw accelerated blit engine
- Properly collapse memory tree when scanning in case of overlaps!
2019-03-10 16:09:05 +03:00
kd-11 7c379432dd rsx: Implement proper pitch compatibility lookup
- When a single row is required or is all that is available, pitch has
no meaning as the coordinate space changed to 1D
2019-03-10 16:09:05 +03:00
kd-11 dccb4a4888 rsx/texture_cache: fixes to commit_framebuffer_memory 2019-03-10 16:09:05 +03:00
kd-11 b9e7b085fe rsx/texture_cache: Fixups for local resource hit and fast-path added 2019-03-10 16:09:05 +03:00
kd-11 10dc3dadee rsx/texture_cache: Improve framebuffer memory locking when WCB/WDB is not enabled
- Adds a new mode that removes non-framebuffer stuff inside framebuffer range
2019-03-10 16:09:05 +03:00
kd-11 563e205a72 rsx/texture_cache: Fix 'AA' scaling hack and restore collection template selection 2019-03-10 16:09:05 +03:00
kd-11 fa628f0ac4 rsx/surface_store: More aggressive tag sampling
- Use a 5-point tap with an X pattern across the target's memory space to reduce chances of false positives
- TODO: Potential false positives identified, requires some minor
restructuring of surface_store
2019-03-10 16:09:05 +03:00
kd-11 3a071a9c07 rsx: Texture search rewrite
- Perform a full search across all resource types as needed without
taking too many shortcuts/hacks
2019-03-10 16:09:05 +03:00
kd-11 6ef9dcd62e rsx: Handle mismatched/invalidated framebuffer sections when WCB is enabled 2019-03-10 16:09:05 +03:00
kd-11 ef071ebb6b rsx: Synchronize surface cache and texture cache data
- TODO: The whole upload_texture thing is a big hack, fix it properly
2019-03-10 16:09:05 +03:00
kd-11 2163a59649 rsx: Typo fix 2019-01-25 14:34:22 +03:00
kd-11 fb778e4821 rsx: Reimplement attrib divisor 2019-01-25 14:34:22 +03:00
kd-11 736415fcd9 rsx/fp: Detect broken/NOP shaders automatically
- Do not compile body if the shader is of no consequence, leave as a passthrough shader
2019-01-25 14:34:22 +03:00
kd-11 6fdc0fd7f0 rsx: Reimplement MSAA transparency
- Apply dither to edges that almost fail the straight-up alpha test
- Significantly improves alpha tested geometry far from the camera
- Also removes blend factor overrides/hacks as they give incorrect results due to background bleeding
2019-01-25 14:34:22 +03:00
kd-11 417a2e6731 rsx: Refactor index buffers
- Index offset is ignored anyway and only used to calculate vertex attribute divisor index
- Specialized optimization for untouched xfer without primitive restart
2019-01-25 14:34:22 +03:00
Nekotekina bd9131ae1c Implement fs::get_cache_dir
Win32: equal to config dir for now
Linux: respect XDG_CACHE_HOME if specified
OSX: possibly incomplete
2019-01-13 14:45:36 +03:00
kd-11 52ac0a901a rsx: improve memory coherency
- Avoid tagging and rely on read/write barriers and the dirty flag mechanism. Testing is done with a weak 8-byte memory test
- Introducing new data when tagging breaks applications with race conditions where tags can overwrite flushed data
2019-01-06 10:44:40 +03:00
kd-11 89c9c54743 rsx: Minor hot-fix
- Pitch 0 makes sense if width == 1 and height == 1
2019-01-06 10:44:40 +03:00
kd-11 2a62fa892b rsx: Texture cache refactor
- gl: Include an execution state wrapper to ensure state changes are consistent. Also removes a lot of required 'cleanup' for helper methods
- texture_cache: Make execition context a mandatory field as it is required for all operations. Also removes a lot of situations where duplicate argument is added in for both fixed and vararg fields
- Explicit read/write barrier for framebuffer resources depending on
  usage. Allows for operations like optional memory initialization before
  reading
2019-01-06 10:44:40 +03:00
kd-11 3be4b474d9 rsx: Handle rsx-self-tripping in draw call and triggering invalid invalidation
- If draw call resources consume memory that intersects with NA parts of the texture cache, we get a framebuffer test mismatch.
  This mismatch is false and happens because the thread has not yet reached the point of relocking the pages
2019-01-06 10:44:40 +03:00
kd-11 a95a44cf66 rsx: Strictness cleanups
- Also account for variable pitch textures (swizzled scan)
2019-01-06 10:44:40 +03:00
kd-11 362eea09a1 whitespace fix only 2019-01-06 10:44:40 +03:00
kd-11 15d5507154 rsx: Rewrite memory inheritance transfers
- Implicitly invoke a memory barrier if actively reading from an unsynchronized texture
- Simplify memory transfer operations
- Should allow more games to work without strict mode
2019-01-06 10:44:40 +03:00
kd-11 97704d1396 rsx: Fix texture size calculations 2019-01-06 10:44:40 +03:00
kd-11 50c07833e4 rsx: Do not force upload for missing data
- TODO: Finish implementing GPU RCB for mem-sync
- TODO: Refactor mem-sync
2019-01-06 10:44:40 +03:00
kd-11 15488eb247 rsx: Avoid unnecessarily touching framebuffer memory
- Do not bind companion framebuffer when clearing single aspect; let the
  contest mechanism sort it out instead
- Do not prematurely tag framebuffers, instead only do so at
  write-confirmation time. Should avoid false tagging if setup does not
  allow a render to occur.
2019-01-06 10:44:40 +03:00
Megamouse bb464b0b64 fix some warnings 2019-01-05 04:03:18 +01:00
kd-11 7555be232f rsx/vp: Fix double dst commands
- Test the vec_result mask before assigning to actual output
  Sometimes, VEC op is used to write to Rx, and SCA op is used to write to o[x]!
2018-12-24 09:05:19 +03:00
kd-11 4b79ef1ad9 rsx: Implement stencil mirror views
- Implements a mirror view of D24S8 data that accesses the stencil components.
  Finishes the implementation of TEX2D_DEPTH_RGBA as the stencil component was previously missing from the reconstructed data
- Add a few missing destructors
  Image classes are inherited a lot and I forgot to make the dtors virtual
2018-12-24 09:05:19 +03:00
kd-11 696b91cb9b rsx: Reimplement conditional execution in shaders
- Per-channel conditional execution introduces RAW hazards all over the place
- Its cheaper to process both branches and select between the two
- Also improves ShaderVariable functionality to allow functionality such as match_size and taking complex variables as inputs
2018-12-24 09:05:19 +03:00
Rui Pinheiro 54bfe2e102 Add log warning on slow flush path 2018-12-11 22:37:10 +03:00
Rui Pinheiro 18b9ee4541 Reimplement overlapping fbo "hack"
To avoid the need (and performance hit) of Read Color/Depth Buffers, we
may not invalidate overlapping fbos inside lock_memory_region unless
they are guaranteed to be superseded by the new one.

This avoids e.g. issues with overblooming, among others.
2018-12-11 22:37:10 +03:00
Rui Pinheiro 5ab7296665 Fix xcode build 2018-12-11 22:37:10 +03:00
Rui Pinheiro bcdf91edbb Misc. Texture Cache fixes 2018-12-11 22:37:10 +03:00
Rui Pinheiro 9d1cdccb1a Implement dedicated texture cache predictor 2018-12-11 22:37:10 +03:00
Rui Pinheiro af360b78f2 Texture cache section management fixups
Fixes VRAM leaks and incorrect destruction of resources, which could
lead to drivers crashes.

Additionally, lock_memory_region is now able to flush superseded
sections. However, due to the potential performance impact of this
for little gain, a new debug setting ("Strict Flushing") has been
added to config.yaml
2018-12-11 22:37:10 +03:00
eladash 4ddafc481e remove unreachable code 2018-12-04 13:01:29 +03:00
kd-11 504ab5a6d4 rsx: Minor cleanup to silence stupid compiler warnings 2018-12-03 20:01:23 +03:00
kd-11 7b065d7781 rsx: Fixup; input attributes blob decoding
- Use an unstructured blob and index into the vec4 structures to extract the real data
2018-11-30 23:51:25 +03:00
kd-11 846daadd5d rsx: Fixups
- Improve vertex attribute layout format. Allows for full 16-bit attribute divisor
- Use actual pitch when declaring framebuffer rsx pitch instead of register value in case of swizzle? rendering
2018-11-30 23:51:25 +03:00
kd-11 1ad76ad331 rsx: Restructure programs
- Also re-enable pipeline optimizations
2018-11-30 23:51:25 +03:00
kd-11 677b16f5c6 rsx: Fixups
- Also fix visual corruption when using disjoint indexed draws

- Refactor draw call emit again (vk)

- Improve execution barrier resolve
  - Allow vertex/index rebase inside begin/end pair
  - Add ALPHA_TEST to list of excluded methods [TODO: defer raster state]

- gl bringup

- Simplify
  - using the simple_array gets back a few more fps :)
2018-11-30 23:51:25 +03:00
kd-11 e01d2f08c9 rsx: Refactor FIFO
- Removes fifo structures from common RSXThread
- Sets up a dedicated FIFO controller
- Allows for configurable queue optimizations
2018-11-30 23:51:25 +03:00
eladash 83b6c98563 rsx: Fix u16 index arrays overflow
Force u32 index array destinations to avoid overflows when adding vertex base index.
2018-10-08 16:39:47 +03:00
eladash e361e0daa6 rsx: Fix restart index check for u16 index arrays
Dont ignore upper bits of the restart index with u16 types
2018-10-08 16:39:47 +03:00
eladash 348db050ae rsx: Fix texture height read 2018-10-03 20:57:46 +03:00
eladash fa723f6dc4 rsx: Fix texture depth read 2018-10-03 20:57:46 +03:00
eladash 6586090307 rsx: Remove texture size hack 2018-10-03 20:57:46 +03:00
eladash eacd1b8f13 rsx: Remove texture address hack 2018-10-03 20:57:46 +03:00
Nekotekina da6ce80f4f Make vm::get_super_ptr return contiguous memory
Cleanup RSX code complexity
2018-09-27 23:37:13 +03:00
kd-11 dab30c0051 rsx: Disable predictions if 50% of predictions are wrong
- This happens often in loading screens where the memory usage pattern is often randomized by loading in of assets
2018-09-24 21:19:38 +03:00
Rui Pinheiro 35139ebf5d Texture cache cleanup, refactoring and fixes 2018-09-24 15:26:40 +03:00
kd-11 dafc914bcc rsx: temporary hack
- Removes all use of valid_count as a metric until the new refactor is merged
2018-09-21 16:32:23 +03:00
kd-11 fc486a1bac rsx: Preserve memory order when doing flush
- Orders flushing to preserve memory at all cost
- Avoids false positive where flushing overlapping sections can falsely invalidate another with head/tail test
2018-09-21 16:32:23 +03:00
kd-11 a21bdb9f45 rsx; blit engine fixes
- Forcefully downloads and reuploads data from the CPU in case of unexpected overlaps
- Properly detect correct size of newly created blit targets
- Remember to clear any existing views when changing the default component map!
2018-09-21 16:32:23 +03:00
kd-11 16dcbe8c74 rsx/vp: Fix ARL opcode properly
- NOTE: The address swizzle index is only for use as src. The address registers are only used one channel at a time.
- When the destination of ARL, the encoding is the same as the other temp registers
2018-09-15 11:57:06 +03:00
kd-11 f413996362 rsx: Minor texture cache fixes
- Retag resources reprotected under flush_always rules
- Properly check for blit resource fitting taking into account format
mismatch, pitch mismatch and typeless transfers
2018-09-10 15:43:28 +03:00
Dzmitry Malyshau 27474316fd Add missing virtual desctructors (#5094) 2018-09-07 14:35:40 +03:00
kd-11 66610a28af rsx/common: Clean up shared glsl header to minimize string concat operations 2018-09-06 21:11:11 +03:00
kd-11 346b97f871 rsx: Preserve fog coordinate across shader stages
- The x value contains the VP output value interpolated across primitive surface
- The y coordinate contains the fog fraction according to the selected fog formula
2018-09-06 21:11:11 +03:00
scribam d7bb59cd99 c++17: use std::size 2018-09-06 13:15:59 +03:00
Nekotekina ca5158a03e Cleanup semaphore<> (sema.h) and mutex.h (shared_mutex)
Remove semaphore_lock and writer_lock classes, replace with std::lock_guard
Change semaphore<> interface to Lockable (+ exotic try_unlock method)
2018-09-03 23:00:36 +03:00
Nekotekina ce4c4696dd Try to get rid of SIZE_32 macro 2018-09-03 21:40:36 +03:00
kd-11 dea5193fd7 rsx: Fix FP temp register count 2018-09-03 21:39:18 +03:00
kd-11 6399833182 rsx: Fix endianness order when immediate mode register is updated, but used as register lookup
- Simplify the code by unifying all the register-backed memory
2018-09-03 18:24:20 +03:00
Nekotekina a93a40e8d9 Disable texture_cache::emit_once (MSVC crash) 2018-08-25 01:58:28 +03:00
Nekotekina 1c6c24f8ac Update GSL and yaml-cpp submodules 2018-08-25 01:15:47 +03:00
Nekotekina 923314aef5 Rewrite texture_cache::emit_once
Also trying to workaround MSVC bug
2018-08-25 01:15:47 +03:00
kd-11 c6e35706a3 vk: Support sw component swizzle decode because metal sucks 2018-08-23 22:54:56 +03:00
kd-11 f3d3a1a4a5 rsx: Rework section reuse logic 2018-08-22 17:22:54 +03:00
kd-11 937f1e8cd0 fix gcc build 2018-08-18 16:14:30 +03:00
kd-11 4b2b662c3a rsx: Followup to the memory inheritance hierachy patch
- Tags framebuffer resources on first use (when on_write is called to verify memory)
- Texture cache now selects the best match and even sorts atlas writes with memory write order to avoid older data showing over newer one
2018-08-18 16:14:30 +03:00
kd-11 cca488d0cf rsx: Enable swizzled decode for all formats unless proven otherwise
- Some formats are proven to ignore swizzle flag
  - DXT compressed textures
  - COMPRESSED_BG_GB class textures
- Some applications are using swizzled wide integer formats so those are confirmed to swizzle
2018-08-18 16:14:30 +03:00
kd-11 f8a9b1fa30 [WIP] rsx: Improve memory inheritance hierachy
- Cascade memory writes by invalidating 'downstream' subsurfaces
- Fixup; always resolve for overlapping surfaces before sampling (force
  atlas gather test)
2018-08-18 16:14:30 +03:00
kd-11 cc7848b3ef vulkan: Fix blit engine transfer to ARGB8 render target memory 2018-08-18 16:14:30 +03:00
kd-11 0267221586 Minor optimizations and fixes
- FIFO: avoid multiline spam
- VK: Fix program setup counter
- FS: Precalculate fragment constants buffer size during analysis step
2018-08-18 16:14:30 +03:00
Rui Pinheiro 23b52e1b1c Mark unsync textures dirty when deferred flushing
invalidate_range_impl_base does not mark all textures that will only be
unprotected as dirty when doing a deferred flush, since that is done by
flush_all.

However, if there are no sections to flush, the deferred flush will
use the same code path as non-deferred flushes for unprotecting textures
and forget to mark them as dirty.

This commit fixes this bug.
2018-08-16 15:38:36 +03:00
Rui Pinheiro fa6a5761b3 Refactor get_intersecting_set
The existing implementation restarts the loop immediately after
finding a range_data instance that updates the trampled_range.

This commit refactors this method to continue the loop with the updated
trampled_range, and then repeat only those range_data instances that
were iterated through before the trampled_range was last updated.
As a result, the number of total iterations required is reduced.
2018-08-16 15:38:36 +03:00
Rui Pinheiro b534d49e48 Fix off-by-one error in get_intersecting_set
When the trampled range changes, get_intersecting_set restarts the
outer loop. However, due to an off-by-one error, it skips the first
cache entry when doing so. This can cause a texture not to be
correctly unlocked, which could lead to issues or even deadlocks.

This commit fixes this off-by-one error.
2018-08-16 15:38:36 +03:00
eladash f349695a75 Rsx: rewrite address translation 2018-08-13 16:16:34 +03:00
kd-11 19d808d378 rsx/gl: Minor cleanup and optimization
- Track register change status
- Remove unused gl classes
2018-07-22 17:19:59 +03:00
kd-11 8695f95267 rsx: Reimplement cached textures and their views 2018-07-22 17:19:59 +03:00
scribam 65d270e5d8 clang-tidy: performance-faster-string-find
https://clang.llvm.org/extra/clang-tidy/checks/performance-faster-string-find.html
2018-07-15 12:51:09 +04:00
kd-11 e7f30640ef rsx: Async shader compilation
- Defer compilation process to worker threads
- vulkan: Fixup for graphics_pipeline_state.
  Never use struct assignment operator on vk** structs due to padding after sType member (4 bytes)
2018-07-14 15:19:56 +03:00
kd-11 fa55a8072c rsx: Improve vertex textures support
- Adds proper support for vertex textures, including dimensions other than 2D textures
- Minor analyser fixup, removes spurious 'analyser failed' errors
- Minor optimizations for program state tracking
2018-07-12 18:02:28 +03:00
kd-11 d78957d1cf rsx/vp: CodeGen improvements
- Fix double destination writes on conditional write masking
- Fix codegen to simplify simple scalar comparisons vs vector functions
2018-07-07 16:20:33 +03:00
kd-11 2c34195954 rsx/vp: Discard broken vertex programs with no writes to POS register 2018-07-07 16:20:33 +03:00
kd-11 2ca935a26b vp: Improve vertex program analyser
- Adds dead code elimination
- Fix absolute branch target addresses to take base address into account
- Patch branch targets relative to base address to improve hash matching
- Bumps shader cache version
- Enables shader logging option to write out vertex program binary,
  helpful when debugging problems.
2018-07-07 16:20:33 +03:00
kd-11 bd915bfebd rsx: vp decompiler fixes
- Fix program abort logic to never abort before resolving later label addresses
  Fixes jumping over broken code and jumping over END markers
- TEXTURE_CONTROL2 has indexing range of [0..15] without stride skipping!
  This register does not have interleaving with other texture registers
- Track shader address poke as it seems to invalidate programs as well
2018-07-07 16:20:33 +03:00
kd-11 24f4c92759 rsx: Improve texture cache read speculation 2018-06-26 20:07:20 +03:00
kd-11 1730708f47 rsx: Rework memory protection management for framebuffer access
- Avoid re-locking memory if there is no reason to do so (no draws issued)
- Actively bound regions should always get written to the backing cache
- Forcefully read memory during download if writes to the target have occured since last sync event
2018-06-26 20:07:20 +03:00
kd-11 d77e62c94e rsx: Improve GPU resource read prediction 2018-06-18 17:32:22 +03:00
kd-11 2afcf369ec vk: Add synchronous compute pipelines
- Compute is now used to assist in some parts of blit operations, since there are no format conversions with vulkan like OGL does
- TODO: Integrate this into all types of GPU memory conversion operations instead of downloading to CPU then converting
2018-06-18 17:32:22 +03:00
kd-11 dd4c13b625 rsx: Avoid race conditions in unsynchronized unprotect 2018-06-18 17:32:22 +03:00
kd-11 3150619320 rsx: Preserve read AA state separate from write AA state
- Some applications (e.g Backbreaker) use an evil hack to resolve MSAA.
  The application respecifies a formerly AA region as a region with no AA then performs a framebuffer feedback lookup.
  The old memory keeps AA during read, but writes back to itself with AA resolved.
  This is evil on several levels but it just happens to work on PS3
2018-06-08 22:17:50 +03:00
kd-11 0f24379c0e rsx: Obey MSAA resolve during memory persistence transfer
- Ugh. This is a bandaid on a festering wound, AA badly needs a rewrite

 Also silence some warnings
2018-06-08 22:17:50 +03:00
Dravonic 400079a006 Parallel shader cache loading (#4677)
* Parallel shader cache loading
2018-06-01 19:49:29 +03:00
kd-11 b030d1900c rsx: Fixup - fix broken memory protection fail caused by region respec
- Some applications will alternate memory between framebuffer and texture data
2018-05-24 10:36:04 +03:00
kd-11 f8d999b384 fixup - range check 2018-05-23 19:07:08 +03:00
kd-11 3f14bc6961 rsx: Silence some meaningless error 2018-05-23 19:07:08 +03:00
kd-11 8fcd5c1e5a rsx: Texture cache fixes
1. rsx: Rework section synchronization using the new memory mirrors
2. rsx: Tweaks
    - Simplify peeking into the current rsx::thread instance.
      Use a simple rsx::get_current_renderer instead of asking fxm for the same
    - Fix global rsx super memory shm block management
3. rsx: Improve memory validation. test_framebuffer() and
tag_framebuffer() are simplified due to mirror support
4. rsx: Only write back confirmed memory range to avoid overapproximation errors in blit engine
5. rsx: Explicitly mark clobbered flushable sections as dirty to have them
removed
6. rsx: Cumulative fixes
    - Reimplement rsx::buffered_section management routines
    - blit engine subsections are not hit-tested against confirmed/committed memory range
      Not all applications are 'honest' about region bounds, making the real cpu range useless for blit ops
2018-05-23 19:07:08 +03:00
scribam 04ad49de4d typos 2018-05-14 21:14:39 +04:00
kd-11 4836a03a7d rsx: Fix build 2018-05-13 14:44:14 +03:00
kd-11 b7979d3f57 rsx/vk: Improvements and minor optimizations
- Improve dirty state tracking affecting program state
- vk: Refactor out transform constants upload into a separate channel to avoid if possible
  transform data uploads are quite expensive
2018-05-13 14:44:14 +03:00
kd-11 a52ea7f870 rsx: Improve fragment and vertex program usage
- Introduces a gpu program analyser step to examine shader contents before attempting compilation or cache search
  - Avoids detecting shader as being different because of unused textures having state changes
  - Adds better program size detection for vertex programs
- Improved vertex program decompiler
  - Properly support CAL type instructions
  - Support jumping over instructions marked with a termination marker with BRA/CAL class opcodes
  - Fix SRC checks and abort
  - Fix CC register initialization
  - NOTE: Even unused SRC registers have to be valid (usually referencing in.POS)
2018-05-13 14:44:14 +03:00
kd-11 f3210a9a33 rsx: Workaround for lost memory sections
- TODO: surface_cache and texture_cache need a better method of persisting partial framebuffer resources
2018-04-25 19:14:36 +03:00
kd-11 291a828217 fixups 2018-04-25 19:14:36 +03:00
kd-11 da99f3cb9a rsx: Critical fixes
- texture cache: Avoid leaking memory sections
  - Avoid double ref increment on flush-always reprotection
  - Detect invalidated_resources entries in surface cache when protecting fbo memory
- vk: Copypasta bugfix, properly initialize aspect mask
2018-04-25 19:14:36 +03:00
kd-11 a42b00488d rsx: Texture fixes
- gl/vk: Fix subresource copy/blit
- gl/vk: Fix default_component_map reading
- vk: Reimplement cell readback path and improve software channel decoder
- Properly name the subresource layout field - its in blocks not bytes!
- Implement d24s8 upload from memory correctly
- Do not ignore DEPTH_FLOAT textures - they are depth textures and abide by the depth compare rules
- NOTE: Redirection of 16-bit textures is not implemented yet
2018-04-25 19:14:36 +03:00
kd-11 9abbbb79ae rsx: Blit engine fixes
- Ignore unlocked blit sections [TODO]
- Do not attempt blit on hw if bytesize is unsupported
- gl: Implement typeless memory transfers
  Uses pbo to handle type-agnostic memory transfer
2018-04-25 19:14:36 +03:00
kd-11 cf1b700ebc rsx: Improve format mismatch detection hack 2018-04-25 19:14:36 +03:00
kd-11 cfd0b8a975 rsx: Fix alphakill 2018-04-05 01:06:50 +03:00
kd-11 53f2533a08 rsx: Implement proper Z-order curve in 3 dimensions
- Should fix garbage palette textures getting uploaded (LSD graphics)
2018-04-05 01:06:50 +03:00
kd-11 e291494282 rsx: Texture cache updates
- Properly implement section gather for 3d and cubemaps
  Implements render-to-3d and fixes some corner cases for render-to-cubemap
2018-04-05 01:06:50 +03:00
Jake 6d6d6fa827 dx12/vk/gl: implement use of vertex_data_base_index when calculating index 2018-03-30 13:30:04 +03:00
kd-11 ee0fe28ddc rsx: Fix copypasta 2018-03-29 13:52:11 +03:00
kd-11 5aac8aa424 rsx: Clamp negative fog distance 2018-03-25 16:02:47 +03:00
kd-11 887ea43e39 rsx: Fix some texture cache problems
- gl/vk: Properly handle remapping temporary resources
2018-03-25 13:31:06 +03:00
kd-11 321c360dcb rsx: Overhaul rendertarget sampling/shuffles
- Reimplements render target views used for sampling
- Optimizes access using an encoded control token
- Adds proper encoding for 24-bit textures (DRGB8 -> ORGB/OBGR)
- Adds proper encoding for ABGR textures (ABGR8 -> ARGB8)
- Silence some compiler warnings as well
- TODO: Real texture views for OGL current method is a hack
2018-03-25 13:31:06 +03:00
kd-11 9fc1740608 rsx/fp: Fragment program overhaul
- Separate TXB from TXL: They are completely different!
- Properly perform TMU emulation in the fragment shader. Implemens SRGB conversion and alphakill at the moment
- Properly perform ROP emulation in the fragment shader. Implements FRAMEBUFFER_SRGB. While support on the chip looks to be incomplete (and wierd), it does work
- Document some more bits in SHADER_CONTROL register
2018-03-25 13:31:06 +03:00
kd-11 9f416e5ce1 rsx/gl/vk: Obey channel remapping on framebuffer resources if requested 2018-03-25 13:31:06 +03:00
kd-11 27552891ad rsx/fp: Improvements
- Export some debug information in the free texture register space components zw
  Very useful when analysing renderdoc captures
- Enable shadow comparison on depth as long as compare function is active and texture is uploaded for depth read
  Some engines (UE3) read all the components in the shader and use mul/mad with the result
2018-03-25 13:31:06 +03:00
kd-11 5817f9fe3f rsx: Texture format fixes
- Implement SRGB (gamma corrected) textures (DXT1, DXT3, DXT5, RGBA8 only)
- Fix channel map decode for XY data texture formats
- Fix remap layout for X16 textures (verified with Mass Effect 3)
2018-03-25 13:31:06 +03:00
pauls-gh 44cddda4b4 Fix VTC source index increment 2018-03-23 12:01:30 +03:00
pauls-gh d79a544320 VTC tiling - fix source offset increment. 2018-03-23 12:01:30 +03:00
pauls-gh e5b4710471 Add end condition for VTC copy. This handles the case when depth is not a multiple of 4. 2018-03-23 12:01:30 +03:00
pauls-gh e6010ba2ca Fix code formatting 2018-03-23 12:01:30 +03:00
pauls-gh fd8d2ecbf4 Remove Volume Texture Compression (VTC) tiling for Vulkan, DX12 and ATI (OpenGL). 2018-03-23 12:01:30 +03:00
kd-11 ffe6c9ba5a fix linux builds 2018-03-13 18:55:03 +03:00
kd-11 e230867492 rsx: Properly implement raster window offsets 2018-03-13 18:55:03 +03:00
kd-11 d41b49d8b4 rsx/fp: Color output registers are always present and zero initialized
- According to NV_fragment_program spec, registers are zero initialized always
- A program even without writing to these registers will have black (0, 0, 0, 0) output
  Confirmed behaviour with MotorStorm games. Their engine uses this quirk to clear color buffers when doing depth replace
  Might be an unfixed game bug
2018-03-13 18:55:03 +03:00
kd-11 20d4c09a1c rsx/vk/gl: Enforce format matching for render target resources. Fall back to raw data copy if match fails
- Forces Bitcast of texture data if input format cannot possibly be the
  same as the existing texture format

- rsx: Other minor improvements to texture cache :-
  - remove obsolete blit engine incompatibility warning. The texture will be re-uploaded if it is indeed incompatible
  - Implement warn_once and err_once to avoid spamming the log with systemic errors
  - Track mispredicted flushes
  - Reswizzle bitcasted texture data to native layout
    TODO: Also needs reshuffle according to input remap vector
2018-03-13 18:55:03 +03:00
kd-11 68b3229756 rsx/fp: Improve rgister component gather detection
- Also avoids clobbering register data by keeping gathered bits in a temp var
2018-03-13 18:55:03 +03:00
kd-11 87741141f1 rsx/vulkan: Add post-compilation key validation and dynamically determine attachment write maks based on decompiled shader
- A new step is added between decompilation and pipeline object creation allowing for properties to be updated based on shader contents
- Allos masking off attachment writes that are unmodified in the shader
2018-03-13 18:55:03 +03:00
kd-11 705820c430 rsx: Nvidia driver compatibility workarounds
- Sanitize NaN values before they reach the driver. On nvidia (X * NaN = X)
2018-03-13 18:55:03 +03:00
kd-11 07cbf3da48 rsx/gl: Minor fixes
- Identify depth textures reaching the gpu via shader_read upload path
- Use correct timestamp counter for opengl
- inline draw_state::test_property because msvc doesnt do it for us
2018-03-13 18:55:03 +03:00
kd-11 8ccaabb502 vulkan: Optimize vertex data upload
- Reuse buffer views as much as possible, vkCreateBufferView is slow on NV
  Implemented as a large sliding window, reuseable until it is filled
2018-03-13 18:55:03 +03:00
kd-11 01349b8cee rsx: Texture cache fixes - Optionally attempt to merge framebuffers into an atlas if partial resources are missing - Support for data update requests to the temporary subresource handler This is useful for framebuffer feedback loops where a new copy is needed after every draw call (resource is always dirty) 2018-03-13 18:55:03 +03:00
kd-11 4487cc8e7a Remove an ugly hack pertaining to partial framebuffer-resident texture data - Its better to fill in the missing information with a wrap or clamp than to fake the texture reads in valid regions - Texture coordinate scaling is used to fill in for the cropped dimension available 2018-03-13 18:55:03 +03:00
kd-11 661b8b006f rsx: Add texture readback statistics to the texture cache and debug overlay 2018-02-16 16:14:54 +03:00
kd-11 1bd77c2f51 rsx: Add cache pattern checking to blit engine resources
- Feature was implemented long ago but was not functional due to bugs
2018-02-16 16:14:54 +03:00
kd-11 c191a98ec3 vulkan API fixes
- Fix for texture barriers
- vulkan: Rework texture cache handling of depth surfaces
- Support for scaled depth blit using overlay pass
- Support proper readback of D24S8 in both D32F_S8 and D24U_S8 variants
- Optimize the depth conversion routines with SSE
- vulkan: Replace slow single element copy with std::memcpy
- Check heap status before attempting blit operations
- Bump guard size on upload buffer as well
2018-02-16 16:14:54 +03:00
kd-11 3bbecd998a infinitesimal fixes 2018-02-16 16:14:54 +03:00
kd-11 a64bea1286 rsx/fp: Discard shaders with undefined (non-existent) writes. On nvidia+vulkan, undefined writes autofill with blue color 2018-02-16 16:14:54 +03:00
kd-11 89c548b5d3 rsx: fbo fixes 2.5
- Implement flush-always behaviour to partially fix readback from a currently bound fbo
  - Without this, only the first read is correct, as more draws are added the results become 'wrong'
  - Fixes WCB and cpublit behviour
- Synchronize blit_dst surfaces to avoid data loss when gpu texture scaling is used
  - Its still faster in such cases to disable gpu texture scaling but some types cannot be disabled without force cpu blit (e.g framebuffer transfers)
- Memory management tuning
  - rsx: on-demand texture cache rescanning for unprotected sections
  - rsx: Only framebuffer resources are upscaled
  - Do not resize regular blit engine resources
  - Lazy initialize readback buffer when using opengl
  -- These measures should help minimize vram usage
2018-02-16 16:14:54 +03:00
Nekotekina cce0ad0c35 Clean vm::ps3 namespace use 2018-02-09 17:49:37 +03:00
kd-11 eeb6e29e39 vulkan: implement proper texture read barriers 2018-02-02 10:07:55 +03:00
kd-11 b9cca71c47 gl: API compliance fixes
- Do not assume texture2D when creating new textures
- Flag invalid texture cache if readonly texture is trampled by fbo memory.
  Avoids binding a stale handle to the pipeline and is rare enough that it should not hurt performance
2018-02-02 10:07:55 +03:00
kd-11 33bcdd476c glsl/fp/vp: Avoid shader clutter
- Do not add unused subroutines in shaders unless necessary
-- makes shaders easier to read and disassembled spir-v has less clutter
- glsl: Replace switch block with lookup table
2018-01-30 21:16:43 +03:00
kd-11 2e04dceaf0 rsx: misc fixes
- Supply explicit options for spv emit allowing optimizations (not yet compiled into the backend)
- Add epsilon fix to glslcommon
- Fix shader dialog crash when using qt (race condition)
2018-01-30 21:16:43 +03:00
kd-11 648fc92184 rsx/fp/vp: Epsilon value is too large!
- Original epsilon value was 1.E-10 which nvidia linux driver could not read properly
-- Restores the original value represented in decimal notation
2018-01-30 21:16:43 +03:00
kd-11 743928b379 vk/gl: Preserve clamped z precision to some extent
- Use edges of depth range to map clamped stuff

Disable range compression on regular draws vs extended range draws
- Some applications require full 0-1 usage without compromises.
-- TODO: This leaves the extended range z values to fight with regular draws in the .99 - 1.0 range
2018-01-22 11:43:35 +03:00
kd-11 6828fbf658 rsx/texture_cache: Remove hacks; it has been proven that in offsets are in x16 fixed point 2018-01-19 12:03:57 +03:00
kd-11 0a2992839b rsx/gl/vk: Simulate z clipping with selective depth clamp
- The scale offset matrix is fine but on real hardware the z results seem to be independent of near/far clipping distances
-- If depth falls within near/far, clamp depth value to [0,1]
2018-01-19 12:03:57 +03:00
lewmpk d64e79bd9f fix clang warning: logical-op-parentheses 2017-12-31 22:08:17 +03:00
kd-11 1ea5e7404a rsx: Workaround for nvidia linux
- For some reason, using 1.E-x notation does not work on nvidia linux. Could be a bug in spir-v generator or the driver itself
2017-12-31 12:43:40 +03:00
kd-11 d6bc6ec2c1 rsx: fix initial swizzle ordering for render target data 2017-12-22 20:08:14 +03:00
kd-11 4a0c4259f0 c++ is hard
- Remove unnecessary const definitions
2017-12-22 20:08:14 +03:00
Nekotekina 61de20a633 RSX: remove SSSE3 dependency 2017-12-20 00:04:08 +03:00
kd-11 47060cdc5f rsx/fp: Fix typo 2017-12-18 10:45:37 +03:00
kd-11 7dd349ae8e Update FragmentProgramDecompiler.cpp 2017-12-18 10:45:37 +03:00
kd-11 4e80858bed rsx/fp: Hotfix for TEXBEM/TXPBEM 2017-12-18 10:45:37 +03:00
kd-11 e89a035e8b rsx/fp: Implement TXPBEM 2017-12-18 10:45:37 +03:00
kd-11 f7c52d3bb7 rsx/fp: Implement TEXBEM (untested) 2017-12-18 10:45:37 +03:00
kd-11 6f8dd20f03 rsx/fp: Stuff
- Implement BEM
- Add LG2 to special instructions
2017-12-18 10:45:37 +03:00
kd-11 6891323c18 rsx: framebuffer textures do not have mipmaps!
- Force mipmap count to 1 if sampling from an RTV/DSV
- TODO: Better wcb flush detection, it should be better to re-upload the texture after it has been dwnloaded if expected mipmaps are > 1
2017-12-18 10:45:37 +03:00
kd-11 7c7cd4153e rsx: Framebuffer setup fixes
- Sometimes square renders are done to surfaces with pitch=64 and re-uploaded with swizzle scanning
-- This setup avoids discarding targets if they are square and pitch == 64
2017-12-18 10:45:37 +03:00
kd-11 ff0f1510e5 rsx: Minor fixes
- Abort nv406e semaphore acquire if the rsx thread stalls/crashes
- Fix texture size approximation to take mipmaps into account. Fixes some games hanging with WCB
2017-12-18 10:45:37 +03:00
kd-11 3338fdb936 rsx: Fix RGB565 blits. Data is byteswapped on input
- Fixes messed up BG on retroarch glyphs
2017-12-18 10:45:37 +03:00
kd-11 6dfe32c6d2 fix linux builds 2017-12-18 10:45:37 +03:00
kd-11 95966a467e rsx: Texture cache fixes
- Handle blit resources in a more consistent way
- TODO: Handle some corner cases (piyotama)
2017-12-18 10:45:37 +03:00
kd-11 ac0022483a rsx: Implement delayed swizzle remap for blit engine resources
- Fixes remap vectors for memory copied via blit engine as it has no context
2017-12-18 10:45:37 +03:00
kd-11 0b3fbf1d4c rsx: Narrow the race condition window further
- Needs aliased paging to be implemented to fix properly or a re-entrant global IO lock
2017-12-06 12:55:49 +03:00
kd-11 a2b4cf22b5 rsx: Reimplement invalidate_range_base_impl
- Avoid unprotecting memory until just before we have to write the data
- Avoids race conditions where the caller thread takes too long to enter the second phase and another thread accesses the "bad" memory
2017-12-06 12:55:49 +03:00
kd-11 9853027f72 rsx/vp: Decide default return values in case of undefined attributes based on location ID
- Different default values should be returned for different attributes
2017-12-04 18:22:18 +03:00
kd-11 90c2324e47 rsx: Program cache fixes
- Reorganize storage hash vs ucode hash
- Scan for actual fragment program start in case leading NOPed code precedes the actual instructions
-- e.g FEAR2 Demo has over 32k of padding before actual program code that messes up hashes
2017-12-04 18:22:18 +03:00
kd-11 cdd4fd9867 rsx/fp: Explicitly insert global functions.
- Functions such as pack/unpack ops must exist before the shared gather functions are declared
2017-12-04 18:22:18 +03:00
kd-11 896c8991de rsx/fp: Properly implement PK/UP instructions based on NV_fragment_program documentation 2017-12-01 21:00:50 +03:00
kd-11 fe9090bd39 rsx/fp: Implement register gather (only for UP(X) instructions)
- Workaround for temp register aliasing between H and R variants
- TODO: Implement temp regs as 128 bit-blocks with r/w as pack/unpack
2017-12-01 21:00:50 +03:00
kd-11 a18ae0f6ac rsx/fp: Reimplement PK(X) and UP(X) opcodes. The read back values are obviously in normalized range
- Confirmed with a GOW shader which writes result of UP8 to BGRA8 output
2017-12-01 21:00:50 +03:00
kd-11 6c9c300fe0 rsx: Fix texture cache memory usage statistics 2017-12-01 21:00:50 +03:00
kd-11 ddebc334bf rsx: Fixes
- Discard intentionally invalidated framebuffer resources. These are created after a flush has happened, forcing reupload since contents cannot be guaranteed (strict mode only)
- Fix for blits using vulkan; dont use the copy method if formats do not match, use generic blit instead
2017-12-01 21:00:50 +03:00
kd-11 145ecb00fc rsx: Texture cache hotfixes 2017-12-01 21:00:50 +03:00
kd-11 4d75e98647 rsx/fp: Do not apply input mods to all types of inputs
- Temp registers are confirmed to be affected
- Const registers are confirmed to be unaffected
- Varying inputs are not confirmed yet
2017-12-01 21:00:50 +03:00
kd-11 de5a4fe083 rsx: Reimplement depth <-> RGBA reinterpretation code
- Implements proper channel order for fp24-ARGB8 conversion
- Takes swizzle remap into account when reconstructing source bytes
2017-12-01 21:00:50 +03:00
kd-11 33f3a3e014 rsx: Major fixes
- Handle aliased depth + color target by disabling depth writes. This looks to be the correct way
- Add support for generic passes that cannot be done using general imaging operations. Lays the framework for tons of features and effects
- Implement RGBA->D24D8 casting. Sometimes games will split depth texture into RGBA8 then use the new RGBA8 as a depth texture directly
-- This happens alot in ps3 games and I'm not sure why. Its likely the ps3 did not sample fp values with linear filtering so this is a workaround
-- Only implemented for openGL at the moment
-- Requires a workaround for an AMD driver bug
2017-12-01 21:00:50 +03:00
kd-11 0aaae000b3 rsx: Minor improvements 2017-12-01 21:00:50 +03:00
kd-11 db58cd7513 rsx: Invalidate both depth and color surfaces when binding a new surface 2017-12-01 21:00:50 +03:00
Zion Nimchuk 3a9ae2df9e silence warnings in RSX stuff 2017-11-30 18:07:19 +03:00
kd-11 df7d52b177 rsx/fp: Give abs higher prio as it invalidates any precision checks 2017-11-20 15:18:57 +03:00
kd-11 f5addbf751 rsx/fp: improve SRC modifier order
- Neg modifier is applied after clamping. Abs has not been tested/proven so precision clamp goes first now, not last
2017-11-20 15:18:57 +03:00
kd-11 a8c0dd649e rsx/fp: RE work on precision modifier bits
- Testing DS2 has revealed clamping bits in SRC1 that were not respected and left negative values reaching the framebuffer
2017-11-20 15:18:57 +03:00
kd-11 6d2dcbd164 rsx: Enable hw blit engine for local->main memory blit operations as well 2017-11-20 15:18:57 +03:00
kd-11 be6b5922dd rsx: research native texel byte order on cpu readback (WCB) [WIP] 2017-11-20 15:18:57 +03:00
kd-11 3c9126d91f rsx: Ignore FENCE instruction as it seems like its ignored on realhw
- This is likely a compiler hint for performance reasons and not a mandate
2017-11-09 14:39:50 +03:00
kd-11 ed21bb309f rsx: Minor fixups
- Fix texture cache blit behaviour when src has AA enabled and dst is a blit dst texture with or without AA
-- This requires handling AA resolve by removing a half downscale on multisampled axes

- Return all ones when a vertex attribute is disabled.
-- Some games forget to enable vertex attributes actually needed by the fs
2017-11-08 13:15:34 +03:00
kd-11 4e9160104a rsx/vk/gl: Cleanup and refector glsl::getFunctionImpl
- Both backends now generate very similar code
2017-11-08 13:15:34 +03:00
kd-11 8733505d0a rsx: Minor fixes
- texture_cache: Fix internal size calculation for subresources
- vk: Delay dynamic state updates until just about to draw to ensure no flush has discarded the cb state
2017-11-08 13:15:34 +03:00
kd-11 baa5a261a5 rsx: Rewrite invalidate_range_impl_base in a way that makes sense. Fixes wcb hanging 2017-11-08 13:15:34 +03:00
kd-11 3730b9d1da rsx: More fixes
- Support for raster offsets in surface descriptors (looks to be unused)
- Do not tag disabled render targets when using MRT (pitch = 64)
- Add missing notify_surface_changed() call for openGL
2017-11-08 13:15:34 +03:00
kd-11 4ca98e53a6 rsx: Fix for unnormalized texture access 2017-11-08 13:15:34 +03:00
kd-11 300a36d3d6 rsx: Fixes for cubemap reconstruction
- Do not abort generation if sides are missing, replace with blank surfaces instead
- Make cubemaps scale with res scaling
2017-11-08 13:15:34 +03:00
kd-11 60c7a508a7 rsx: Refactor create_subresource_view(deferred_subresource&) and implement a subresource cache
- This limits the number of times an image is copied and improves performance
2017-11-08 13:15:34 +03:00
kd-11 1fa18757fc rsx: Implement render-to-cubemap; Also simplify unnormalized samplers [WIP, DELETE SHADER CACHE, VERY SLOW]
- Enables real-time cubemap reflections
- TODO: Vulkan is broke; rsx is very slow with this feature
2017-11-08 13:15:34 +03:00
kd-11 fbb7186e66 rsx/gl: Addendum - Fix fragment shader to consume texture scale parameters 2017-11-08 13:15:34 +03:00
kd-11 0961a43997 rsx: Implement 1D<->2D image type casts 2017-11-08 13:15:34 +03:00
kd-11 7037504dcf rsx: Workaround for missing AA flags on some surfaces
- This just doesnt work right yet. It looks like AA is being used dynamically? (RDR)
- TODO: Try to locate flags to set AA if AA mode is not changed
2017-11-08 13:15:34 +03:00
kd-11 eed55a446c rsx: Minor optimization
- Defer resolving image copy operations to the binding step
2017-11-08 13:15:34 +03:00
kd-11 bbcb6b6851 rsx: Fbo fixes 2
- Use AA mode to predict surface compression. Compression mode is useless without AA activated
- Rewrites most image subresource fetch routines to use the new heuristic
- Fix rsx:🧵:find_tile. FEED000(X) can be substituted for (X) in the code
-- Fixes alot of failures when looking for tiled regions

rsx: Fix antialiased unnormalized coords
- scaling factors are inverse to allow proper coordinates to be computed in fs
2017-11-08 13:15:34 +03:00
kd-11 b95630d84a rsx: Minor fixups
- Optimize framebuffer memory invalidate conditions
- Fix texture sampling of AA textures (wider by 2x surfaces)
2017-11-08 13:15:34 +03:00
kd-11 af1d3c2aa6 rsx: Improve surface store resource management
- vk: Use frame testing to determine invalidated resources that can be safely deleted
2017-11-08 13:15:34 +03:00
kd-11 ec3e5c547f rsx: More fixes
- Tag surface store to help determine when contents have been invalidated
- Crop framebuffer textures if they are not the requested dimensions!
2017-11-08 13:15:34 +03:00
kd-11 173d05b54f rsx: Optimizations
- Reimplement fragment program fetch and rewrite texture upload mechanism
-- All of these steps should only be done at most once per draw call
-- Eliminates continously checking the surface store for overlapping addresses as well

addenda - critical fixes
- gl: Bind TIU before starting texture operations as they will affect the currently bound texture
- vk: Reuse sampler objects if possible
- rsx: Support for depth resampling for depth textures obtained via blit engine

vk/rsx: Minor fixes
- Fix accidental imageview dereference when using WCB if texture memory occupies FB memory
- Invalidate dirty framebuffers (strict mode only)
- Normalize line endings because VS is dumb
2017-11-08 13:15:34 +03:00
scribam 4600094829 [RSX] Fix uninitialized value before usage 2017-11-04 01:28:53 +03:00
kd-11 daaa83b9ca rsx: Fix for framebuffer validation 2017-11-04 00:08:30 +03:00
kd-11 30bba09fed disable fb testing for partial framebuffer resources 2017-11-02 14:35:19 +03:00
kd-11 31b07f2c5c rsx: Tweaks
- Optimize get_surface_subresource
- Add check_program_status time to draw call setup statistics. It can slow down games significantly
2017-11-02 14:35:19 +03:00
kd-11 c2ac05f734 rsx: Fix for rsx thread lockup due to nested access violations when WCB is enabled 2017-10-29 15:25:17 +03:00
kd-11 361e80f7dc rsx: Tag cache blocks returned on access violation to validate data passed
to flush_all is up to date. Should prevent recursive exceptions

Partially revert Jarves' fix to invalidate cache on tile unbind. This will
need alot more work. Fixes hangs
2017-10-29 15:25:17 +03:00
kd-11 7abf755a57 rsx: Avoid false positives by early rejection. Should keep cache thashing to a minimum 2017-10-28 13:26:16 +03:00
kd-11 055f0e2e4a rsx: Export more information about affected cache sections when handling violations
- This allows for better handling of deferred flushes.
-- There's still no guarantee that cache contents will have changed between the set acquisition and following flush operation
-- Hopefully this is rare enough that it doesnt cause serious issues for now
2017-10-28 13:26:16 +03:00
kd-11 49f4da3016 rsx: Fixes
- vk: Always reopen primary command buffers. They should only be closed in flush_command_queue
- If uploading a texture and there are collisions with protected buffers, do not rebuild the cache
- Perform writes via flush before reprotecting pages that were not trampled
- Only flush no pages once
2017-10-28 13:26:16 +03:00
kd-11 bf234dc668 rsx: Implement memory tags for strict mode to validate render target memory 2017-10-28 13:26:16 +03:00
Jake e0d1ac676e rsx: invalidate surface store address when tile is unbound 2017-10-28 12:46:20 +03:00
kd-11 e6849a59a2 rsx: Better detection of framebuffer memory copy operations
- Still requires texture stitching to work correctly, but matching dimensions works well for now
2017-10-24 22:59:09 +03:00
kd-11 6918e265ec rsx/vk: Be a little more frugal with texture memory to avoid running out of VRAM on 1GB cards 2017-10-24 22:59:09 +03:00
kd-11 e9f293f522 rsx: Improve separate treatment of write exceptions vs read exceptions
- Optimizes search functionality and avoids thrashing valid sections
2017-10-24 22:59:09 +03:00
kd-11 5fc36d64b6 fix build 2017-10-24 22:59:09 +03:00
kd-11 95e6d78689 rsx: Workaround for 0 pitch textures.
- Should these be ignored? Needs investigation
2017-10-24 22:59:09 +03:00
kd-11 f4a666345a rsx: Even more texture cache fixes
- Fix subresource sampling
- Invalidate memory range before uploading textures to prevent hangs
2017-10-24 22:59:09 +03:00
kd-11 0de0dded53 rsx: Texture fixes continued
- Fix buffer invalidate behaviour (wcb)
- Disable auto rebuild with only framebuffer storage getting rebuilt
- Fix vulkan subresource sampling
2017-10-24 22:59:09 +03:00
kd-11 5e58cf6079 rsx: Restructuring [WIP]
- Refactor invalidate memory functions into one function
- Add cached object rebuilding functionality to avoid throwing away useful memory on an invalidate
- Added debug monitoring of texture unit VRAM usage
2017-10-24 22:59:09 +03:00
kd-11 ddcacb8258 general fixes; Force u32 return type for index_count and add RX Vega to primitive restart blacklist 2017-10-19 12:22:52 +03:00
kd-11 89dcafbe41 rsx: Reimplement index buffer generation
- Emulate primitive restart in software whenever we get the chance
- Ensure PRIMITIVE_RESTART is never active when LIST topologies are active
- Reimplement TRIANGLE_FAN, POLYGON and QUAD expansion
2017-10-19 12:22:52 +03:00
kd-11 86bf61ad35 rsx: Fix memory protection
- Fixes hanging when wcb is enabled
2017-10-14 14:19:14 +03:00
kd-11 eab9d06981 rsx: Texture cache fixes
- Fix src/dst framebuffer detection
- Silence some warnings
2017-10-13 15:23:48 +03:00
kd-11 12ab03b0b5 rsx/gl: Implement resolution scaling
rsx: Revise wpos calculation to take resolution scale into account
2017-10-09 20:25:41 +03:00
kd-11 47202d5839 rsx: Set up patch functionality for program coeffecients 2017-10-09 20:25:41 +03:00
kd-11 393e3b702f rsx: Clean up debug overlays. Add unreleased textures metric to track texture memory 2017-09-23 16:46:41 +03:00
kd-11 9ee21af524 vulkan: Optimize memory allocation 2017-09-23 16:46:41 +03:00
kd-11 b74cdcde00 rsx: Make the 3rd texture dimension matter
- Affects cube maps and texture3D surfaces
2017-09-23 16:46:41 +03:00
kd-11 4d83d749a0 rsx: Texture cache fixes
- Update section flags when requested
- Fix nullptr dereference: cached_dest will be null if dst_is_render_target is true
2017-09-23 16:46:41 +03:00
kd-11 d0148728c6 rsx: Fixes
- Fix section scanning range for early reject
- Specify IMAGE_ASPECT_STENCIL when uploading image_from_cpu
2017-09-21 20:05:07 +03:00
kd-11 3499d089e7 rsx: Texture cache fixes and improvements
rsx: Conditional lock hack removed
vulkan - Fixes
- Remove unused texture class
- Fix native pitch calculation (WCB)
rsx: Catch hanging begin/end pairs when flushing deferred draw calls
vulkan: Register DXT compressed formats
vulkan: Register depth formats
gl: Workaround for 'texture stitching' when gathering flip surface
- TODO: Add a proper flip hack option
rsx: Fix texture memory size calculation
- DXT textures dont have real pitch. Since pitch is used to calculate memory size, make sure it always evaluates to rsx_size
rsx: Fix cpu copy detection
rsx: Validate blit dst surface and dont make assumptions about region blit order
- Also relax restrictions on memory owned by the blit engine if strict rendering is not enabled
rsx: Fix depth texture detection
rsx: Do not manually offset into dst. The overlapped range check does so automatically
rsx: Minor optimizations
rsx: Minor fixes
- Fix to detect incompatible formats when using GPU texture scaling and show message
- Better 'is_depth_texture' algorithm to eliminate false positives
2017-09-21 16:17:06 +03:00
kd-11 6b96a2022a rsx: Add support for non-projective shadow sampling
- Fixes missing shadows in persona 5

vk: Enable polygon depth bias a.k.a polygonOffset
- Fixes shadow acne in persona 5
2017-09-21 16:17:06 +03:00
kd-11 3836b40bf7 rsx: Fixups 2017-09-21 16:17:06 +03:00
kd-11 571dbfb7b1 rsx: Texture cache improvements
- Limits buffer size to min 720 in the Y axis (1024 section causes conflicts in some cases - TODO)
rsx: Fixups to allow large textures for blit operation
- Also includes checks for both leaking sections and blit regions for vulkan
hotfix for hanging when using WCB
addendum - unlock both ro and no blocks before attempting to copy memory blocks
gl: Fixups for ARB_explicit_uniform_location
- Forces glsl v 430 to make use of the extension
rsx/vk: Rework texture cache to minimize recursive access violations
- Also modifies the vulkan commandbuffer begin/end/submit mechanism
gl: Fix cached_texture_section::is_flushable to take memory protection into account
rsx: Fix blit dst offset calculation
2017-09-21 16:17:06 +03:00
kd-11 45d0e821dc gl: Minor optimizations
rsx: Texture cache - improvements to locking
rsx: Minor optimizations to get_current_vertex_program and begin-end batch flushes
rsx: Optimize texture cache storage
- Manages storage in blocks of 16MB
rsx/vk/gl: Fix swizzled texture input
gl: Hotfix for compressed texture formats
2017-09-21 16:17:06 +03:00
kd-11 e37a2a8f7d rsx: Texture cache fixes and improvments
gl/vk/rsx: Refactoring; unify texture cache code
gl: Fixups
- Removes rsx::gl::texture class and leave gl::texture intact
- Simplify texture create and upload mechanisms
- Re-enable texture uploads with the new texture cache mechanism
rsx: texture cache - check if bit region fits into dst texture before attempting to copy
gl/vk: Cleanup
- Set initial texture layout to DST_OPTIMAL since it has no data in it anyway at the start
- Move structs outside of classes to avoid clutter
2017-09-21 16:17:06 +03:00
kd-11 07c83f6e44 gl: cleanup; fix program linkage on mesa using GL_ARB_explicit_uniform_location, also make use of ARB_multidraw 2017-09-21 16:17:06 +03:00
kd-11 9359b8c170 rsx/fp: Shader decompiler fixes
- Requires proper 2-pass impl
rsx/fp: Catch hanging code blocks
rsx/fp: Don't pause on scaling error
2017-09-21 16:17:06 +03:00
kd-11 2d0f1f27a8 esx: Fixes to the texture cache
rsx: Blit engine improvements
- Always handle blits to and from framebuffers through the GPU
- Handle depth surfaces properly when using GL
- Check for format mismatches when blitting to the surface store [WIP]
2017-09-21 16:17:06 +03:00
kd-11 2033f3f7dc rsx/vk/gl: Refactoring and reimplementation of blit engine
Fix rsx offscreen-render-to-display-buffer-blit surface reads
- Also, properly scale display output height if reading from compressed tile

gl: Fix broken dst height computation
- The extra padding is only there to force power-of-2 sizes and isnt used

gl: Ignore compression scaling if output is rendered to in a renderpass

rsx/gl/vk: Cleanup for GPU texture scaling. Initial impl [WIP]
- TODO: Refactor more shared code into RSX/common
2017-09-21 16:17:06 +03:00
kd-11 2e9405db4c rsx: Remove index expansion for quad strips 2017-08-26 21:53:54 +03:00
kd-11 fe5828cb47 rsx: Implement QUAD_STRIP
- QUAD_STRIP evaluates to TRIANGLE_STRIP in memory. The memory layout is identical.
- The only difference between the two modes would be the primitive_ID but that doesnt matter on RSX
- Its worth noting that results will be different between the two modes if input vertices are non-coplanar for every set of N verts
2017-08-26 21:53:54 +03:00
kd-11 9a7ce2fd29 rsx/vp: ARL fix 2017-08-26 21:53:54 +03:00
kd-11 f71f67c4ff rsx: Make fragment state dynamic to reduce shader permutations 2017-08-26 21:53:54 +03:00
Jake 5d7c454e52 rsx: Vertex Decompiler, fix sca register assignment 2017-08-19 12:27:53 +03:00
kd-11 334327df67 rsx: Add a success message on program compile completion
- Should help users wondering if rpcs3 'froze' during shader compile
2017-08-16 23:58:30 +03:00
kd-11 650c1c64f1 gl: Workarounds for intel GPUs which dont seem to be truly GL4 compliant 2017-08-16 23:58:30 +03:00
kd-11 c04aa05398 rsx: Shader pipeline fixes and improvements
- Do not set zfunc if alphakill is not enabled. This is because at the moment alphakill requires a different shader to be built

- use glsl loop-unroll friendly comparison; skip vertex input compare if either key requests it

- Minor tweaks to fp key generation
2017-08-16 23:58:30 +03:00
kd-11 00c6a589a5 rsx/util: Add simple consistent hash function
rsx/vk/shaders_cache: Move vp control mask to dynamic state

rsx/vk/gl: adds a shader cache for GL. Also Separates pipeline storage for each backend

rsx: Add more texture state variables to the cache
2017-08-16 23:58:30 +03:00
kd-11 c7dca1dbef rsx/vk: Implement shaders cache and fix broken pipeline_storage comparison and hash
- Fix pipeline state storage to uniquely store each pipeline variant
- Adds a progress bar to indicate loading is taking place
2017-08-16 23:58:30 +03:00
kd-11 00b0311c86 rsx/gl/vulkan: Refactoring and partial vulkan rewrite
- Updates vulkan to use GPU vertex processing
- Rewrites vulkan to buffer entire frames and present when first available to avoid stalls
- Move more state into dynamic descriptors to reduce progam cache misses; Fix render pass conflicts before texture access
- Discards incomplete cb at destruction to avoid refs to destroyed objects
- Move set_viewport to the uninterruptible block before drawing in case cb is switched before we're ready
- Manage frame contexts separately for easier async frame management
- Avoid wasteful create-destroy cycles when sampling rtts
2017-08-16 23:58:30 +03:00
kd-11 6a707f515e vk/gl: Factorize shared GLSL code
- prep vulkan for shared glsl backend
2017-08-16 23:58:30 +03:00
kd-11 d54c2dd39a gl: Move vertex processing to the GPU
- Significant gains from greatly reduced CPU work
- Also reorders command submission in end() to improve throughput

- Refactors most of the vertex buffer handling
- All vertex processing is moved GPU side
2017-08-16 23:58:30 +03:00
mp-t 607d2486ea Code review (#3114)
* Fix always-true conditions in sceNp module

* gl_render_targets: useless check on unsigned variable, possible bug

* fixed UB in crypto utility functions

* copy-paste error in vk::init_default_resources

* pass strings by const ref

* Dont copy vectors. Make sure copies are not needed because functions are used in a multi-threaded context.
2017-08-01 20:22:33 +03:00
kd-11 a7c28f5827 rsx: Fix remainder/iteration computations in BufferUtils 2017-07-24 16:52:42 +03:00
Jake fde6099769 rsx: Fix vertex decompiler to support 2 arg destination 2017-07-22 09:41:00 +03:00
scribam 9747ab61f9 Missing function names (HLE) and small fixes (#3038)
* Add sceNpScoreGetFriendsRanking and sceNpScoreGetFriendsRankingAsync functions

* Add sceNpSnsFbGetLongAccessToken function

* Add new functions for the sceNpTus module

* Add new functions for the cellSailRec module

* Stub cellCrossControllerInitialize

* Add sceNpAuth* functions for the sceNp2 module

* Remove unnecessary call to c_str()

* Add missing module id "CELL_SYSMODULE_ADEC_AT3MULTI"

* Add Turkish keyboard mapping constant

* Add cellOskDialogExtRegisterKeyboardEventHookCallbackEx function

* Update cellSubDisplay

* Update cotire version to 1.7.10

* Replace cellSubdisplay by cellSubDisplay

* Update cellSysutil.cpp with new functions stubbed
2017-07-21 18:41:11 +03:00
kd-11 2526626646 rsx: Surface cache bug fixes
- Properly handle data 'transfer' when recycling frame buffer images
- Clear 'recycled' surfaces before use
2017-07-19 23:28:33 +03:00
kd-11 94c1b74a17 fix build; restore asmjit reader_lock for now 2017-07-19 23:28:33 +03:00
kd-11 f69121116a rsx/vk: Optimize framebuffer lifetime management
- Significant gains due to avoiding aggressive create-delete cycles every frame
2017-07-19 23:28:33 +03:00
kd-11 9e7a42d057 rsx: Minor bug fixes
- vk: Do not select first available format when choosing a swapchain format
- gl/vk: Ignore rendering zero sized framebuffers/scissors
- fp: Re-enable range clamp on fp16 registers; fix fx12 clamping [-2, 2]
2017-07-08 14:52:16 +03:00
kd-11 d43e06c0ea rsx: Fix some fp bugs
rsx/fp: Properly fix RCP
- Input is always scalar, output is a vector

rsx/fp: Ignore forced unit for SIP and TEX instructions
2017-07-08 14:52:16 +03:00
kd-11 3d935b64f2 rsx/gl/vk: Enable contents transfer when a new framebuffer is created and not cleared 2017-07-08 14:52:16 +03:00
kd-11 d7662e54cc rsx/fp: Do not swizzle shadow lookups 2017-06-29 13:13:19 +03:00
kd-11 459a7ba5a2 rsx: Avoid using push_back/emplace_back on empty STL containers
- Reckless management of STL containers causes significant slowdown
- Also reorders vertex compare steps to fail quickly on simpler checks
2017-06-29 13:13:19 +03:00
kd-11 47e5074dc5 rsx: Emulated index buffers are based on vertex 0 with no disjoint ranges
- Drop the 'first' argument as it is unused for now
2017-06-29 13:13:19 +03:00
kd-11 17318112eb rsx: Do not sample as pcf shader if writing a vector result 2017-06-22 23:36:15 +03:00
kd-11 590bb7cbe4 rsx: Bug fixes
rsx: Give more info when ring buffer allocations fail
2017-06-22 23:36:15 +03:00
kd-11 b2e906f4cc rsx: Code cleanup. Fixes several dozen warnings
- Wrap unused parameters as comments to prevent C1400
- Fix sized variable conversions with explicit casts
2017-06-22 23:36:15 +03:00
kd-11 11317acdbe rsx: Handle non-zero base vertex better
- Vertex buffer contents treat the base vertex as vertex 0 so we do the same for indices

rsx: Fix vertex base indexing

rsx: Properly fix non-zero offset indexed rendering
2017-06-22 23:36:15 +03:00
kd-11 75964c686f rsx/gl/vk: Fix some warnings and whitespace issues (LF vs CRLF) 2017-06-22 23:36:15 +03:00
kd-11 db1a90d828 rsx: Discard surface store contents once per frame (temp workaround)
Need to find the proper command issued to discard all surfaces
2017-06-22 23:36:15 +03:00
kd-11 110974af0b vk/gl: Fix sampling of shadow2D textures 2017-06-22 23:36:15 +03:00
kd-11 6a9eef0382 rsx/gl/vk: Enable use of native PCF shadows 2017-06-22 23:36:15 +03:00
kd-11 5f66d0b996 rsx/wip: Fix depth surface reuse and clearing (fixes shadows) 2017-06-22 23:36:15 +03:00
kd-11 9aa632bcc1 rsx/vk: Fixes for ring buffer allocation and image clipping (#2850) 2017-06-10 23:32:17 +03:00
kd-11 d5df4a4616 rsx/fp/gl: Minor fixes (#2823)
* rsx/fp: expand glsl unpack instructions to vec4

* rsx/fp: Ignore BRK outside LOOP/REP

* fix string compare typo
2017-06-01 15:53:25 +03:00
kd-11 18df292f90 rsx/fp: Better handling of flow control ops 2017-05-22 14:28:33 +03:00
kd-11 786bcb0d1b rsx: bugfix - avoid a divide by zero 2017-05-22 14:28:33 +03:00
kd-11 d4ddc40988 rsx: Add support for repeated data streams (broken attrib divisor?) 2017-05-22 14:28:33 +03:00
kd-11 e8b4d332eb rsx: Use faster upload path when conditions allow
Fix aligned memory access (SSE)

rsx: BufferUtils; always use optimized paths
2017-05-22 14:28:33 +03:00
kd-11 c5975d5f66 rsx: Vertex program output fixes 2017-05-12 20:10:03 +03:00
kd-11 2d99f3556e rsx: Fix line_loop -> line_strip indexing 2017-04-03 13:50:58 +03:00
Jochen Schleu ce7d62968e Only pass positive values to sqrt and log2 in the fragment program. (#2624) 2017-04-03 13:17:20 +03:00
kd-11 ef822d785e rsx/fp: src3 workaround 2017-03-24 09:30:23 +03:00
kd-11 8fa3f0721e fix false alphakill flags when texture fetch is optimized away 2017-03-24 09:30:23 +03:00
kd-11 1de2ceca9b rsx/vp: Fixes (#2533)
* rsx/vp: Fix rsq opcode broken in previous commit

* fix ms compiler error

* fix another possible conflict with ms d3d compiler
2017-03-14 16:05:59 +03:00
kd-11 be4bb48476 rsx/fp: Fix some decompiler bugs 2017-03-13 23:40:34 +03:00
kd-11 7c73c3b75c rsx/gl: Minor refactoring; prepare vulkan backend 2017-03-01 00:16:55 +03:00
kd-11 9263999ad1 [rsx/vp] Improve BRB opcode implementation
fix merge issues
2017-02-26 10:17:34 +03:00
kd-11 d6159a35aa gl/vk/dx12: Fix texture scaling on unnormalized rtt access 2017-02-11 15:45:59 +03:00
Zangetsu38 73906f9f29 d3d12: add x1r5g5b5_z1r5g5b5 and cleanup in D3D12Formats.
Add info in BufferUtils for log.
2017-02-10 22:04:45 +03:00
O1L 07a366b608 VP decompiler: fixed condition update flags using 2017-01-23 23:49:17 +03:00
kd-11 fb5df32990 rsx/vs: decode sca ops after vec ops 2016-12-15 14:36:28 +03:00
kd-11 8454949eea gl/vk/rsx: Add a cross-platform overlay text; Minor perf improvements and rsx bugfixes (#2196)
* gl/rsx: Implement platform-agnostic text overlays

gl: Restore performance metrics using new text out helper

gl/rsx: Refactor text generation class

* vk: Enable text overlay

gl/vk: Polish overlay counters implementation

gl: Better resource shutdown for text writer

* gl: Optimization, do not rebind TIUs every frame. Speedup

* gl: Optimizations and improvements to vertex upload code

* gl/vk: Texture format swizzles

vk: Texture format fix

vk: Fix YX format swizzles

* rsx: Decode vertex texture index
2016-10-11 08:55:42 +08:00
kd-11 5430e1d310 rsx/gl/vk/dx12: Add emulated texture fetch for depth read (#2173)
* rsx/gl/vk/dx12: Add emulated texture fetch for depth read

gl/vk/dx12: Simplify reinterpretation equation

* gl: Remove unnecessary re-swizzle

* glsl: explicitly cast uint to float
2016-09-29 14:54:32 +08:00
kd-11 38562155d4 gl/vk: Flip wpos if origin != top 2016-09-28 07:22:45 +08:00
kd-11 867e9210d7 gl/vk: Enable vertex texture fetch (#2127)
* gl: Enable vertex textures

* rsx: use textureLod instead of generic texture sample

* rsx: handle uploading of W32_X32_Y32_Z32

* gl: Re-enable proper shader logging

remove old logging method that overwrites single file

* gl: Declare texture_coord_scale for vertex samplers

* gl: texture remap fixes; enable remap for vertex textures

* gl: offset texture indices to base layer 16

* rsx: Fix W32_Z32_Y32_X32_FLOAT subresource layout

* vk: Enable vertex textures

* rsx: define special calls for vertex texture fetch

* gl: improved vertex texture fetch setup

* vk: Fix texture formats and component mapping

* vk: Implement vertex texture fetch functions properly

* vk/gl: proper fix for primitive restart index

revert inadvertent decompiler update

* gl: Disable filtering for vertex textures
2016-09-20 22:23:56 +08:00
raven02 77f8ce503d RSX texture refactor (#2144) 2016-09-19 09:25:49 +08:00
raven02 691d87978b RSX: fix wrong format 0x9b (#2121) 2016-09-04 15:23:43 +08:00
raven02 a270ac7f02 GL: re-use common fp/vp decompiler (#2100) 2016-08-26 22:23:23 +08:00
Vincent Lejeune 42b518cf7e rsx: use range for vertex buffer attribute. 2016-08-24 21:58:59 +02:00
kd-11 9beb2d8ae0 vk/rsx: Bug fixes (#2092)
* vk: fix separate front and back lighting

* vk: Inlined arrays can have emulated primitives too!

* vk: Use float input attribs for better compatibility

* vk: Free resources during shutdown
2016-08-24 08:50:07 +08:00
raven02 af1ff4439d RSX: fix wrong format 0x9c (#2087) 2016-08-23 16:18:58 +08:00
Nekotekina 84d0d396ed EXPECTS usage removed 2016-08-15 16:29:38 +03:00
Nekotekina 1f3433464c ENSURES usage removed 2016-08-14 22:41:01 +03:00
Nekotekina a7e808b35b EXCEPTION macro removed
fmt::throw_exception<> implemented
::narrow improved
Minor fixes
2016-08-08 19:19:32 +03:00
Vincent Lejeune fb47945930 rsx: Returns u32 instead of size_t for get_index_count/type_size 2016-08-06 00:25:23 +02:00
Vincent Lejeune 7a6f5b6ee5 rsx: Move index pointer generation in rsx::thread. 2016-08-05 17:54:44 +02:00
raven02 4dd67cdd54 texture: ignore when texture width > pitch 2016-08-04 17:54:34 +08:00
raven02 e1ff3f4674 rsx: use fragment_textures_count (#1948)
* rsx:  use fragment_textures_count

* Typo: unknow -> unknown
2016-07-19 22:50:40 +08:00
Nekotekina ceb4cb59ac Typo fix: comparaison->comparison 2016-07-19 14:17:25 +03:00
Vincent Lejeune 772706ca4c Factorize rsx state 2016-07-07 21:38:57 +02:00
DH 989f954432 Added WIP vertex textures support 2016-06-28 12:58:44 +03:00
DHrpcs3 3b5cd4845e OpenGL renderer: use correct MVP matrix. Cleanup
Simplified gl::ring_buffer helper
2016-06-21 19:56:05 +03:00
kd-11 35ab3b0cd8 gl/vk/dx12: re-implement pack/unpack operations (#1764)
dx12: implement pack/unpack operations

dx12: Fix shader compilation when pack/unpack is used

dx12: pk16/up16 - relax half-float range to more realistic values
2016-06-10 14:42:48 +03:00
raven02 db27ea923d VP: add few opcodes comment for vec/sca (#1750) 2016-06-10 01:03:43 +03:00
raven02 8f67c910ab FP: Implement REFL and LRP (#1712) 2016-06-04 10:23:45 +03:00
raven02 fc1408e643 FP: Implement texture lookup with explicit gradients (#1706) 2016-05-29 18:33:41 +03:00
Nekotekina 266db1336d The rest 2016-05-23 16:22:25 +03:00
raven02 42423588c8 Use native function for OP_CODE_PK2/UK2 and UP2/UK2 2016-05-21 22:08:34 +08:00
Ivan aafcf44581 Header optimizations (#1684)
Shouldn't break anything. I hope.
2016-04-27 01:27:24 +03:00
Ivan 75fe95eeb1 GSL moved from stdafx.h (#1676)
Added GSL.h helper for correct including
2016-04-20 02:32:27 +03:00
Nekotekina b85a68e8a1 Partial commit: RSX 2016-04-15 19:22:36 +03:00
Vincent Lejeune 3a3d264cb5 rsx/common/d3d12/gl/vulkan: Set dst stride in write_vertex_array_data_to_buffer. 2016-04-07 22:17:28 +02:00
Vincent Lejeune 2ae5a7ff39 rsx/common/d3d12/gl/vulkan: Use single overload for write_index_array_data_to_buffer. 2016-04-07 22:17:28 +02:00
Vincent Lejeune 2e17ea1490 rsx/common/d3d12/vulkan: Factorise data_heap between vulkan and d3d12. 2016-04-07 22:17:28 +02:00
Vincent Lejeune cbe119b457 rsx/common: Remove MIN2/MAX2 macro. 2016-04-07 22:17:28 +02:00
Raul Tambre 5ad060f150 Vulkan/DX12: Texture format fixes
DX12 also had a couple fixes
2016-04-07 21:34:32 +03:00
Raul Tambre a8e15ce18a Fix forced_unit for unimplemented instructions
For SCT and SCB, the forced unit is always set to FORCE_NONE before
handling of the instruction. This makes the error for unimplemented
instructions' forced unit be incorrect. This fixes that.
2016-04-07 21:34:32 +03:00
Raul Tambre 3ee56627eb DX12 texture format fixes and improvements 2016-04-07 21:34:32 +03:00
Vincent Lejeune 91d0229bc5 rsx/common: Use an help texture_dimension_extended to handle cubemap more cleanly. 2016-03-30 22:19:29 +02:00
Vincent Lejeune b7c539ad7a rsx/common: Make get_exact_mipmap_count take compressed format into account 2016-03-30 22:19:29 +02:00
Vincent Lejeune f2c82d3cf4 rsx/common: Use a typed class for texture dimension. 2016-03-30 20:03:50 +02:00
kd-11 0327e76320 Fix quad strip triangle winding 2016-03-24 10:52:35 +03:00
Vincent Lejeune b00acff9dd rsx/common: Turn alignment constraints in textureUtils to multiple_of constraints. 2016-03-22 19:06:09 +01:00
Vincent Lejeune 284d2c43f9 rsx/common: Use protected instead of private for surface_store content. 2016-03-22 19:06:09 +01:00
Vincent Lejeune 5de70628d7 rsx/common/d3d12/gl/vulkan: Unify texture upload code. 2016-03-14 19:10:51 +01:00
kd-11 82bc41f4ad rsx: support for more formats
rsx: support R5G5B5A1 textures
2016-03-11 18:02:29 +03:00
kd-11 bd52bcf8d4 Fix nvidia crash (API version). Fix linux builds
Properly set up vulkan API version when creating instance

Fix gcc error about passing function result by reference

Fix alot of warnings in VKGSRender project

More fixes for gcc

Fix texture create function
2016-03-10 23:55:25 +03:00
kd-11 3b6e3fb3b4 Rework vertex upload code and fix indexed renders
Rebase on current master; Refactor vertex upload code

Fix build; Minor fixes

Start preparations for merge

Fix generic indexed drawing bugs

Define WIN32_KHR only for windows

Remove linking against vulkan-1.lib
2016-03-10 23:55:25 +03:00
kd-11 2ae687cf00 Properly compute texture size 2016-03-05 18:54:06 +03:00
Vincent Lejeune 9cdb74efc7 rsx/common: Add supports for quads strip
Used in Hitman 2
2016-02-27 19:38:16 +01:00
Vincent Lejeune a6ba47265f rsx/common/gl: s32k is actually signed short unormalized.
gl fix
2016-02-27 00:21:12 +01:00
Vincent Lejeune a6d8d1144c rsx/common: Supports D24X8 texture format when copying
Some app uses this type before setting proper depth surface
2016-02-27 00:21:08 +01:00
Vincent Lejeune 5ef7f8bf3e rsx/common: Fix handling of UB256 2016-02-27 00:21:06 +01:00
kd-11 8a3d15d4fe Handle swizzled CELL_GCM_B8 textures
Properly handle swizzled single-channel textures
2016-02-24 17:44:24 +03:00
Vincent Lejeune 5a14644cd4 rsx/common/d3d12/gl: Use span in vertex upload function. 2016-02-22 20:22:47 +01:00
Vincent Lejeune 1675a82efd rsx/common/d3d12/gl: Use gsl::span in TextureUtils.cpp
* get_placed_texture_storage_size returns more accurate result (fix crash in Outrun)
* Factors lot of code and use integer type more carrefully
* Treat warning as error in TextureUtils.cpp
2016-02-16 18:08:22 +01:00
Vincent Lejeune 837e06e85b rsx/common/d3d12: Support non default alpha function
Fix After burner climax cloud effects.
2016-02-13 17:07:12 +01:00
kd-11 843d0ed298 Fragment position is given as gl_FragCoord not gl_Position
Fix references to gl_Position in Dx12
2016-02-12 18:34:41 +03:00
Vincent Lejeune f0dc38cadd rsx/common/d3d12: Support back spec/diffuse color.
Fix green car in Outrun.
2016-02-08 17:35:52 +01:00
Vincent Lejeune 1f7a1e4078 rsx/common/d3d12/gl: Fix lit and rsq behavior near 0 in vertex shaders. 2016-02-08 17:35:49 +01:00
kd-11 7b889a10cc Add vertex texture buffers for VS input
Support vertex instancing in vertex shader using VertexID

Relax OpenGL requirements by removing 4.5 features

Use EXT version of TexBufferRange; Implement buffer copy using TexBuffer

Apply travis workaround by danilaml

Fix vertex upload in in case of inlined array
2016-02-03 13:38:23 +03:00
Vincent Lejeune 5f35f2ac7d rsx/common/d3d12: Support for texture 1d too.
They are used in after burner climax
2016-01-30 01:13:15 +01:00
Vincent Lejeune 149fa9d750 rsx/common: Make RSXFragmentProgram key and not just pointer. 2016-01-27 23:16:06 +01:00
Vincent Lejeune acd384ae2d rsx/common: Base offset is actually correctly supported.
Outrun uses it and cars are correctly displayed.
2016-01-27 22:05:43 +01:00
Nekotekina 7417033d7f GLGSRender fix 2016-01-27 18:14:39 +03:00
Vincent Lejeune 3c3f92f29b rsx/common/d3d12: Support 3d textures 2016-01-26 17:56:02 +01:00
Vincent Lejeune 24255f7883 rsx/common/d3d12/gl: Add some texture info to RSXFragmentProgram 2016-01-26 17:56:01 +01:00
Vincent Lejeune 9b8522e734 rsx/common: Div is vector over scalar division
According to investigation on Resogun.
2016-01-24 00:13:17 +01:00
Vincent Lejeune 4ce4cf5242 rsx: Add vertex input and output in RSXVertexProgram. 2016-01-22 01:24:54 +01:00
DHrpcs3 19ce0cdc09 rsx methods constants moved to rsx namespace
minor fix
2016-01-20 20:12:48 +03:00
DHrpcs3 7972cb5bdc Code style fixes #1 2016-01-20 16:23:25 +03:00
Vincent Lejeune 440c637b1f rsx/common/d3d12: Move surface_store in common 2016-01-19 22:49:10 +01:00
Vincent Lejeune 1ce49b60d9 rsx-debug/d3d12: Support all rtt formats. 2016-01-17 20:02:30 +01:00
Nekotekina 960668ecf1 For #1355
offsetof() eliminated
OFFSET_32, SIZE_32, ALIGN_32 used
2016-01-14 19:07:27 +03:00
Nekotekina 38531459df Logging system rewritten
GUI doesn't freeze anymore
Some things simplified
2016-01-13 18:54:57 +03:00
Vincent Lejeune 689dee9944 rsx/common/d3d12: Consider separate index range as a whole.
Fix Wolf of the Battlefield 3
2016-01-13 00:28:48 +01:00
Vincent Lejeune bab52c132d rsx/common/d3d12/gl: Clean ProgramStateCache
Use a_b_c format.
Use using =
Use tuple as output
Use RAII to delete program safely
Ensure const correctness.
2016-01-11 19:21:57 +01:00
Vincent Lejeune 4ef76866a5 rsx/common/d3d12/gl: Support texture lod sampling. 2016-01-10 00:16:26 +01:00
Vincent Lejeune 675ccd4510 rsx/common/d3d12/gl: Mimic divsq and rsq fragment instruction behaviour with 0.
Fix Super Puzzle Turbo HD 2 and SH3 HD
2016-01-09 23:18:05 +01:00
Vincent Lejeune d153575e59 rsx/common/d3d12/gl: Support for CMP/non pow of 2 size vertex formats.
Also use class enum for base_vertex_type everywhere.
Fix Bomberman Ultra color and Cubixx HD geometry.
2016-01-09 23:18:03 +01:00
DHrpcs3 48919330d7 rsx methods moved from rsx thread 2016-01-06 13:30:24 +02:00
DHrpcs3 ba12c489ec gl: using tiled region for read/write color buffers and flip
gl: fixed flip buffer row length
compilation fixes
2016-01-06 13:30:23 +02:00
Vincent Lejeune 3586c7613a rsx/common: Fix program state cache Shader program comparaison.
Comparaison was not taking the last instruction of shader into account.
Also remove "constant masking" since it wasn't actually usefull.

Fix DBZ: Burst Limits, SH3 and likely much more games.
2016-01-02 00:47:51 +01:00
Vincent Lejeune 5f12a4f7b5 rsx/common/d3d12/gl: Use separate vertex array/vertex register states. 2015-12-30 17:04:34 +01:00
Vincent Lejeune 969e2d8c57 rsx/common: Support RSX_FP_OPCODE_DIV for scb
Fix glitches in dbz
2015-12-29 17:08:01 +01:00
Vincent Lejeune 9c6539ea2d rsx/common/d3d12: Force depth to be at least 1. 2015-12-23 22:26:18 +01:00
Vincent Lejeune a97dc349b7 rsx/common: If swizzle bit is not set then there is no padding, even for dxtc textures.
Fixes some textures in dbz and after burner climax.
2015-12-23 22:26:16 +01:00
Nekotekina 3ed603074c Changes done by [DH] rewritten
Added rsx_program_decompiler submodule
Added fs::dir iterator
Added fmt::match
2015-12-22 23:11:20 +03:00
DHrpcs3 d8bef46c2a Do not use global static variables in headers 2015-12-21 05:35:56 +02:00
Nekotekina aa811b6eef Cleanup (noexcept, unreachable)
%x formatting fixes
2015-12-20 15:41:07 +03:00
Vincent Lejeune 69b3828086 rsx/common: Vertex program condition swizzle should apply to cc0, not float4(0.) 2015-12-16 20:36:50 +01:00
Vincent Lejeune 1cda2977bb common/d3d12: emulate polygon mode 2015-12-16 20:36:36 +01:00
Vincent Lejeune 6221fecf3b common/d3d12/gl: Start implementing cubemap sampling 2015-12-16 20:36:34 +01:00
Vincent Lejeune 80dc122742 common/d3d12: Clean texture upload code.
Some typos are fixed in the process
2015-12-16 20:36:32 +01:00
Vincent Lejeune 929f518ef3 rsx/d3d12/gl: Make output write backend dependent. 2015-12-16 20:36:31 +01:00
Vincent Lejeune 6fae5863cf common/d3d12/gl: Add support for textureProj 2015-12-16 20:36:29 +01:00
Jake 52be47ca89 rsx: Style changes 2015-12-02 07:06:40 -06:00
Jake 19cf749944 rsx: fix convert_linear_swizzle converting backwards 2015-12-02 04:22:19 -06:00
Jake 178bcfc8df rsx: Improve NV3089_IMAGE_IN_SIZE and use faster loop for swizzle conversions 2015-12-02 04:22:18 -06:00
Vincent Lejeune a21c9f9861 rsx: Avoid mixing float4 and int4 in declaration of AddrReg. 2015-11-30 17:35:51 +01:00
Vincent Lejeune c86cfef58e rsx/common: Remove getFragmentConstantOffsetsCache 2015-11-28 20:58:00 +01:00
Vincent Lejeune b9d8d9383a rsx/d3d12: dump program content when capturing frame 2015-11-24 23:34:03 +01:00
Vincent Lejeune 3e5f0e5c37 rsx: Add missing SCB DIVSQ opcode support
Fix a lot of gfx glitches in SH3 HD
2015-11-19 19:24:58 +01:00
Vincent Lejeune 9fdc458d69 rsx: Make SCT/SCB/TEX SRB function complete member of FragmentProgram 2015-11-19 19:24:57 +01:00
Vincent Lejeune a79ffdb485 rsx/common: Fix ARL register type and write function in vtx shader 2015-11-15 17:20:41 +01:00
Vincent Lejeune ae5d95d462 rsx/common: Take primitive restart index in account and turns it into -1. 2015-11-12 18:29:01 +01:00
Raul Tambre fac9d74344 Lots of defect fixes 2015-11-09 07:39:50 +02:00
Vincent Lejeune 2a9895b7f0 rsx/d3d12: Move fragment constants filling code to ProgramStateCache 2015-11-06 20:08:45 +01:00
Vincent Lejeune 2ad7051746 rsx/d3d12: Move vertex constants filling code to RSXThread 2015-11-06 20:08:41 +01:00
Vincent Lejeune 02ce78482c rsx/d3d12: Move scale offset buffer setting to RSXThread 2015-11-06 20:08:17 +01:00
Vincent Lejeune 1ec18bdf64 RSX/common: Clean BufferUtils code
* Add noexcept
* Use a_b_c code style
* Use anonymous namespace
2015-10-29 18:48:50 +01:00
Vincent Lejeune 42467ba40f RSX/common: Clean TextureUtils code.
* Use a_b_c code style
* Add noexcept
* Use anonymous namespace
2015-10-29 18:48:50 +01:00
Vincent Lejeune 9f49232cac d3d12: Avoid copying index data and use correct index range.
This fixes Braid.
2015-10-27 01:24:04 +01:00
Vincent Lejeune a2997a1109 d3d12: Avoid an extra vertex copy 2015-10-15 17:13:43 +02:00
Vincent Lejeune b0f8611f49 Common/GL/D3D12: Fix int vector ctor in vertex shader and a compare opcode. 2015-10-15 17:13:42 +02:00
Nekotekina a974ee009e vm::var improved, cleanup
Mostly vm::var initialization introduced.
Added vm::make_var function.
2015-10-14 18:17:37 +03:00
DH 4a55ba3067 OpenGL renderer improvements
Flush program cache at thread exit
Use cached locations
2015-10-14 03:16:39 +03:00
Vincent Lejeune a63fdf6c45 Use files from master
- Drop smart vertex storage and use OpenGL's one instead.
2015-10-13 14:27:17 +02:00
Vincent Lejeune 60bccf0f10 Remove RSXVertexArray 2015-10-13 00:04:12 +02:00
Vincent Lejeune d27f6c8fa7 Use rsx::limits values 2015-10-13 00:04:05 +02:00
Vincent Lejeune 6f71d04aa4 move linear to swizzle and get_size_type
symbol undef though
2015-10-13 00:04:04 +02:00
Vincent Lejeune 3de47c201c RSX: Create a rsx namespace.
Put get_address inside.
2015-10-13 00:04:04 +02:00
Vincent Lejeune f483c3b9ca Revert "Merge pull request #1245 from DHrpcs3/master"
This reverts commit 5feba39ff7, reversing
changes made to ebf28f8da0.
2015-10-09 20:04:20 +02:00
DH 6cb036d35f Fix for gcc/clang build 2015-10-08 00:05:04 +03:00
DH cc0c3fc98d Implemented fragment constants loading (OpenGL renderer)
Fixed nv308a::color
Minor improvements
2015-10-07 17:36:26 +03:00
DH 4fdeeace66 D3D12Renderer: fixed some compilation errors
Removed GSFrameBase2 and D3DGSFrame.
Added frame for NullRender.
Minor improvements and fixes
2015-10-05 13:03:23 +03:00
Vincent Lejeune d511153836 Common: Fix element count computation if addr is null (RSXVertexData) 2015-10-05 01:57:57 +02:00
Vincent Lejeune c7b7d1f71f Common: Move generic vertex buffer code from d3d12 backend 2015-10-03 18:25:19 +02:00
Vincent Lejeune 62d7bf2159 Common: Move generic upload texture code from d3d12 2015-10-03 18:25:18 +02:00
Raul Tambre ea376e7751 Implement console_write and GetHomeDataExportPath 2015-09-12 14:11:26 +03:00
Raul Tambre 4666f190db Fix BRI instruction, fixes #1165 2015-09-07 20:14:00 +03:00
Vincent Lejeune 095c8fa19b RSX/D3D12: Improve shader lookup performance 2015-08-26 18:45:57 +02:00
Nekotekina ce494f8847 fmt::by_value, fmt::Format removed 2015-08-24 21:22:42 +03:00
vlj 6a408301d7 d3d12: Another fix 2015-08-12 00:28:35 +02:00
vlj 312ff7e8f5 RSX: Fix for default value of temp reg 2015-08-12 00:25:33 +02:00
Nekotekina 8c00dcd02d Bugfix 2015-07-10 04:31:21 +03:00
Nekotekina ef6f9f6ded be_t constructor implemented, make() eliminated
be_t enums are forbidden, le_t improved, some operators cleaned.
2015-07-10 04:31:07 +03:00
Nekotekina edb9595721 Using vm::ps3 namespace moved in proper places
Various fixes
2015-07-10 04:30:41 +03:00
raven02 2d6dd873cd FP: RSQ instruction alternative 2015-05-23 20:45:12 +02:00
raven02 f98b03b61f VP: use getFloatTypeName() with compare instructions 2015-05-23 20:45:11 +02:00
raven02 eac5147a45 FP: fix SFL instruction 2015-05-23 20:45:11 +02:00
raven02 bebd437a7e RSX: use getFloatTypeName 2015-05-23 20:45:10 +02:00
raven02 79cb025d25 RSX : factorize DPH 2015-05-23 20:45:09 +02:00
raven02 f961a2e3b4 GL: fix IFE instruction 2015-05-23 20:45:08 +02:00
vlj 2416d49dba RSX: Add a class factorizing decompiler code 2015-05-23 20:45:07 +02:00
vlj 145f411324 RSX: Add a template class that helps caching programs. 2015-05-19 17:26:05 +02:00