Commit graph

178 commits

Author SHA1 Message Date
kd-11 507ec8252b vk: Refactor renderpass management
- Ensures the current renderpass matches the image properties even when a cyclic reference is detected
- Solves SDK debug output error spam due to mismatching layouts and renderpasses
2019-05-25 14:07:29 +03:00
kd-11 4037225e98 vk: Workaround for cyclic feedback loops
- Transition attachments to LAYOUT_GENERAL in case of a feedback loop
  - Fixes appearance of garbage along polygon edges in some
post-processing passes.
  - Also reverse this transition when rendering goes back to normal
2019-05-17 16:41:17 +03:00
kd-11 05eb1e9193 rsx: Fix zombie image references from inside the texture cache
- Do not add locked orphans to the flush_always cache! They will not remove their cache entries as they are not bound
2019-05-16 19:25:26 +03:00
kd-11 214bb3ec87 rsx: Always initialize memory unless it is guaranteed to be wiped 2019-05-16 19:25:26 +03:00
kd-11 88290d9fab rsx: Hack around using data regions as transfer targets 2019-05-16 19:25:26 +03:00
kd-11 3c7d8a1099 rsx: Minor texture/surface scanning optimization
- Also re-enable optimization in blit engine accidentally disabled during debugging
2019-05-16 19:25:26 +03:00
kd-11 9f0090772a rsx: Fix write tagging when comments are transferred in by blit engine 2019-05-16 19:25:26 +03:00
kd-11 88c20afd3a rsx: Implement unaligned surface inheritance with hierachial contribution
- Allows render targets to behave like stacked 3D views same as shader inputs are resolved
- Basically implements most of 'Read Color/Depth Buffers" option for 'free'.
- Allows splitting RTV/DSV resources if they are superceded by a partial surface
- Also allows intersecting new resources through the surface cache for proper inheritance from other scattered data
- TODO: Refactor bind_surface_as_rtt and bind_surface_as_ds to reduce asinine code duplication
2019-05-16 19:25:26 +03:00
kd-11 6b7cd458e3 rsx: Silence some diagnostics unless compiled with debugging options 2019-05-01 15:36:21 +03:00
kd-11 48cb265c2c rsx: Bounds check on local resource for atlas merge.
- Local resources can also have padded pitch dimensions and false-positives on range overlap tests
2019-05-01 15:36:21 +03:00
kd-11 ec9aa74008 rsx: Fix section base offset calculation for blit_dst targets which affects confirmed memory range
- Fixes flushes only writing partially to target memory
2019-05-01 15:36:21 +03:00
kd-11 df3b46a611 rsx: Improve texture sourcing and clipping when reverse scanning is enabled
- When reverse scanning, offsets are inverted and offset value of 0 is logically equivalent to an offset of -1
- Add an explicit message if clipping happens to avoid silent errors/bugs
2019-04-12 15:36:21 +03:00
kd-11 e4e86455f2 rsx: Fix temporary subresource caching behaviour
- Do not cache if a gathered subresource contains a bound RTT
- Change op to dynamic copy if parent is still bound
2019-04-09 13:40:54 +03:00
kd-11 3249000511 rsx: Improvements to texture scanning
- Removes CPU-only transforms that broke GPU-side code.
 -- Channels in GPU compute are laid out in cell-order, but CPU was uploading in favorable order and compensating with swizzles.
 -- This leads to 2 different layouts depending on the location of the data (CPU vs GPU)
- Implement R8G8_R8B8 interleaved format decode
- General improvements
2019-04-09 13:40:54 +03:00
kd-11 366e4c2422 rsx: Preliminary support for format conversions using typeless resolve 2019-04-09 13:40:54 +03:00
kd-11 b7470cfc1a rsx: Tighten format checks in cache hit tests 2019-04-09 13:40:54 +03:00
kd-11 443fde760f rsx: Blit engine clipping fixes
- Do not round up sub-pixel offsets, round down instead
- Do not allow incomplete sources for hw blit transfer
- Reimplement src clipping (slice_h)
- Check 'area' of incoming texels and correct for them before RTT lookup/transfer
- Filter out incomplete targets when performing RTT lookup (1 texel or less contribution)
2019-04-09 13:40:54 +03:00
kd-11 41b87cf577 rsx: Blit engine fixes
- If a transfer writes to a RTT and depth mismatch happens, create a local target and the upload function will likely resolve between the two
- If a surface is rejected, reset the target region!
2019-03-22 21:27:15 +03:00
kd-11 86ad204636 rsx: Rebase output region when using upload-fallback path 2019-03-22 21:27:15 +03:00
kd-11 dbc8e70ddd rsx: Silence some compiler noise 2019-03-22 21:27:15 +03:00
kd-11 adc59f9810 rsx: Fix blit transfers when texel sizes mismatch
- Also refactors some bpp handling code
- Simplify texture intersection test to use a normalized/uniform coordinate space
- Fix broken bounds checking as well
2019-03-22 21:27:15 +03:00
kd-11 3ef16bee47 rsx: Fix texture lookups and avoid out-of-bounds copies/transfers 2019-03-17 21:50:11 +03:00
kd-11 bb65e45614 rsx: Implement GPU acceleration for rotated images 2019-03-17 21:50:11 +03:00
kd-11 5260f4b47d rsx: Improvements to memory flush mechanism
- Batch dma transfers whenever possible and do them in one go
- vk: Always ensure that queued dma transfers are visible to the GPU before they are needed by the host
  Requires a little refactoring to allow proper communication of the commandbuffer state
- vk: Code cleanup, the simplified mechanism makes it so that its not necessary to pass tons of args to methods
- vk: Fixup - do not forcefully do dma transfers on sections in an invalidation zone! They may have been speculated correctly already
2019-03-17 21:50:11 +03:00
kd-11 385485204b vk/gl: Omit unlocked data when grabbing flip sources from texture cache 2019-03-17 21:50:11 +03:00
kd-11 74eeacd091 vk/gl: Improve memory tag sync and test
- Properly pass parameters such as rsx-pitch to the surface store
- Do not crash if a surface fails verification in flip, use fall-back instead
2019-03-17 21:50:11 +03:00
kd-11 1a44446250 rsx: Fix dst upload block region
- The section needed starts at image origin, not transfer origin!
2019-03-17 21:50:11 +03:00
kd-11 a49a0f2a86 vk/gl: Synchronization improvements
- Properly wait for the buffer transfer operation to finish before map/readback!
- Change vkFence to vkEvent which works more like a GL fence which is what is needed.
- Implement supporting methods and functions
- Do not destroy fence by immediately waiting after copying to dma buffer
2019-03-17 21:50:11 +03:00
kd-11 85cb703633 rsx/cache: Debugging bugs introduced by the atlas coverage check
- Figured out why it breaks things, ofc can't actually check for coverage when there is no proper fbo data persistence
2019-03-17 21:50:11 +03:00
kd-11 612160a8ff rsx: Fix zero-pitch textures
- Assumption here is that only texel (0, 0) is accessible. Inline with other pitch 0 operations.
- TODO: Verify pitch 0 does not advance in Y either
2019-03-17 21:50:11 +03:00
kd-11 17c49d21a5 rsx/blit: Remove workarounds/hacks added for master. Start implementation/stubs for blit engine rotations in GPU 2019-03-17 21:50:11 +03:00
kd-11 745f8f9627 rsx: Remove pointless assert 2019-03-17 21:50:11 +03:00
kd-11 358558aaa7 cleanup and fixups 2019-03-10 16:09:05 +03:00
kd-11 21bc6c7a87 rsx: Properly resolve data for upload when needed.
- Avoids blindly reusing blit dst sections as they may contain garbage.
  If a section was unlocked for a flush, just discard it as its reuse introduces potential data corruption.
  Since the data needs to be reuploaded anyway (for now), its better to start afresh
- In case of format mismatch, reset the calculated dst block
- Add a bounds check to determine if data contained in an atlas is good enough for sampling the cache.
  If not enough data is provided, fall back to full upload
2019-03-10 16:09:05 +03:00
kd-11 9d4d3d9443 rsx: Reimplement render target intersection tests when using hw accelerated blit engine
- Properly collapse memory tree when scanning in case of overlaps!
2019-03-10 16:09:05 +03:00
kd-11 7c379432dd rsx: Implement proper pitch compatibility lookup
- When a single row is required or is all that is available, pitch has
no meaning as the coordinate space changed to 1D
2019-03-10 16:09:05 +03:00
kd-11 dccb4a4888 rsx/texture_cache: fixes to commit_framebuffer_memory 2019-03-10 16:09:05 +03:00
kd-11 b9e7b085fe rsx/texture_cache: Fixups for local resource hit and fast-path added 2019-03-10 16:09:05 +03:00
kd-11 10dc3dadee rsx/texture_cache: Improve framebuffer memory locking when WCB/WDB is not enabled
- Adds a new mode that removes non-framebuffer stuff inside framebuffer range
2019-03-10 16:09:05 +03:00
kd-11 563e205a72 rsx/texture_cache: Fix 'AA' scaling hack and restore collection template selection 2019-03-10 16:09:05 +03:00
kd-11 3a071a9c07 rsx: Texture search rewrite
- Perform a full search across all resource types as needed without
taking too many shortcuts/hacks
2019-03-10 16:09:05 +03:00
kd-11 6ef9dcd62e rsx: Handle mismatched/invalidated framebuffer sections when WCB is enabled 2019-03-10 16:09:05 +03:00
kd-11 ef071ebb6b rsx: Synchronize surface cache and texture cache data
- TODO: The whole upload_texture thing is a big hack, fix it properly
2019-03-10 16:09:05 +03:00
kd-11 2163a59649 rsx: Typo fix 2019-01-25 14:34:22 +03:00
kd-11 52ac0a901a rsx: improve memory coherency
- Avoid tagging and rely on read/write barriers and the dirty flag mechanism. Testing is done with a weak 8-byte memory test
- Introducing new data when tagging breaks applications with race conditions where tags can overwrite flushed data
2019-01-06 10:44:40 +03:00
kd-11 2a62fa892b rsx: Texture cache refactor
- gl: Include an execution state wrapper to ensure state changes are consistent. Also removes a lot of required 'cleanup' for helper methods
- texture_cache: Make execition context a mandatory field as it is required for all operations. Also removes a lot of situations where duplicate argument is added in for both fixed and vararg fields
- Explicit read/write barrier for framebuffer resources depending on
  usage. Allows for operations like optional memory initialization before
  reading
2019-01-06 10:44:40 +03:00
kd-11 3be4b474d9 rsx: Handle rsx-self-tripping in draw call and triggering invalid invalidation
- If draw call resources consume memory that intersects with NA parts of the texture cache, we get a framebuffer test mismatch.
  This mismatch is false and happens because the thread has not yet reached the point of relocking the pages
2019-01-06 10:44:40 +03:00
kd-11 a95a44cf66 rsx: Strictness cleanups
- Also account for variable pitch textures (swizzled scan)
2019-01-06 10:44:40 +03:00
kd-11 15d5507154 rsx: Rewrite memory inheritance transfers
- Implicitly invoke a memory barrier if actively reading from an unsynchronized texture
- Simplify memory transfer operations
- Should allow more games to work without strict mode
2019-01-06 10:44:40 +03:00
kd-11 97704d1396 rsx: Fix texture size calculations 2019-01-06 10:44:40 +03:00