Commit graph

485 commits

Author SHA1 Message Date
kd-11 88c20afd3a rsx: Implement unaligned surface inheritance with hierachial contribution
- Allows render targets to behave like stacked 3D views same as shader inputs are resolved
- Basically implements most of 'Read Color/Depth Buffers" option for 'free'.
- Allows splitting RTV/DSV resources if they are superceded by a partial surface
- Also allows intersecting new resources through the surface cache for proper inheritance from other scattered data
- TODO: Refactor bind_surface_as_rtt and bind_surface_as_ds to reduce asinine code duplication
2019-05-16 19:25:26 +03:00
scribam 6c5ea068c9 Remove redundant semicolons
Fix "-Wextra-semi" warnings
2019-05-12 18:32:11 +03:00
scribam 3623f4343f gl/vk: clear scissor_setup_invalid bit along with scissor_config_state_dirty bit 2019-05-11 13:13:49 +03:00
kd-11 1d5c52f476 rsx: Ignore stencil clear flag if the stencil write mask is disabled 2019-05-01 15:36:21 +03:00
kd-11 63f9b8e0c6 gl/vk: Minor cleanup 2019-05-01 15:36:21 +03:00
kd-11 4e3ec162e2 rsx: Fix broken texture cache search when flipping 2019-05-01 15:36:21 +03:00
eladash 6f76e34104 rsx: Fix race on clearing native_ui vs emu_requested flag 2019-04-20 01:04:41 +03:00
kd-11 adc59f9810 rsx: Fix blit transfers when texel sizes mismatch
- Also refactors some bpp handling code
- Simplify texture intersection test to use a normalized/uniform coordinate space
- Fix broken bounds checking as well
2019-03-22 21:27:15 +03:00
kd-11 385485204b vk/gl: Omit unlocked data when grabbing flip sources from texture cache 2019-03-17 21:50:11 +03:00
kd-11 74eeacd091 vk/gl: Improve memory tag sync and test
- Properly pass parameters such as rsx-pitch to the surface store
- Do not crash if a surface fails verification in flip, use fall-back instead
2019-03-17 21:50:11 +03:00
kd-11 9d4d3d9443 rsx: Reimplement render target intersection tests when using hw accelerated blit engine
- Properly collapse memory tree when scanning in case of overlaps!
2019-03-10 16:09:05 +03:00
kd-11 7c379432dd rsx: Implement proper pitch compatibility lookup
- When a single row is required or is all that is available, pitch has
no meaning as the coordinate space changed to 1D
2019-03-10 16:09:05 +03:00
kd-11 a80f1a6ed4 gl: Fix memory tag sampling
- Also fixes a bad arg passed to glClearBuffer
2019-03-10 16:09:05 +03:00
kd-11 3a071a9c07 rsx: Texture search rewrite
- Perform a full search across all resource types as needed without
taking too many shortcuts/hacks
2019-03-10 16:09:05 +03:00
kd-11 9ed9d7e947 overlays/osk: Implement native osk interface 2019-02-02 11:54:01 +03:00
kd-11 5a4bea8c4f gl: Blit fixup
- Typo fix. I meant to disable scissor test, not stencil test
- Also clean up and simplify/optimize the core logic
2019-01-25 14:34:22 +03:00
kd-11 fb778e4821 rsx: Reimplement attrib divisor 2019-01-25 14:34:22 +03:00
kd-11 6fdc0fd7f0 rsx: Reimplement MSAA transparency
- Apply dither to edges that almost fail the straight-up alpha test
- Significantly improves alpha tested geometry far from the camera
- Also removes blend factor overrides/hacks as they give incorrect results due to background bleeding
2019-01-25 14:34:22 +03:00
kd-11 8093c9b573 rsx: Disable rtt side-effects when async compilation is ongoing. Only real renders should promote buffer state from underined to drawn, otherwise keep previous contents intact. 2019-01-25 14:34:22 +03:00
kd-11 417a2e6731 rsx: Refactor index buffers
- Index offset is ignored anyway and only used to calculate vertex attribute divisor index
- Specialized optimization for untouched xfer without primitive restart
2019-01-25 14:34:22 +03:00
kd-11 52ac0a901a rsx: improve memory coherency
- Avoid tagging and rely on read/write barriers and the dirty flag mechanism. Testing is done with a weak 8-byte memory test
- Introducing new data when tagging breaks applications with race conditions where tags can overwrite flushed data
2019-01-06 10:44:40 +03:00
kd-11 95245bdd83 rsx: Improve ARGB8->D24S8 casting
- Set up partial transfers
- Force clear of target before starting the transfer
2019-01-06 10:44:40 +03:00
kd-11 475cc99117 rsx: Fix dirty flag reset after a partial attachment initialization
- D24S8 targets have 2 aspects that are dealt with separately; Forcefully initialize the remaining data if a partial init is done. Its 'free' anyway
- It seems that the stencil mask matters when clearing unlike the depth mask and color mask
2019-01-06 10:44:40 +03:00
kd-11 c80c7f06bb rsx: Typo fix
- This silly typo broke the flip improvements in the GT fixes PR
2019-01-06 10:44:40 +03:00
kd-11 2a62fa892b rsx: Texture cache refactor
- gl: Include an execution state wrapper to ensure state changes are consistent. Also removes a lot of required 'cleanup' for helper methods
- texture_cache: Make execition context a mandatory field as it is required for all operations. Also removes a lot of situations where duplicate argument is added in for both fixed and vararg fields
- Explicit read/write barrier for framebuffer resources depending on
  usage. Allows for operations like optional memory initialization before
  reading
2019-01-06 10:44:40 +03:00
kd-11 1ffadbe086 rsx: Reorganize write barrier implementation (either clear or memory barrier) 2019-01-06 10:44:40 +03:00
kd-11 474d0f61a2 minor typo fix 2019-01-06 10:44:40 +03:00
kd-11 15d5507154 rsx: Rewrite memory inheritance transfers
- Implicitly invoke a memory barrier if actively reading from an unsynchronized texture
- Simplify memory transfer operations
- Should allow more games to work without strict mode
2019-01-06 10:44:40 +03:00
kd-11 9c46386dd4 rsx: Check av configuration when selecting display buffers!
- Some applications have mismatch between video output configuration and display buffer sizes
2018-12-24 09:05:19 +03:00
kd-11 4b79ef1ad9 rsx: Implement stencil mirror views
- Implements a mirror view of D24S8 data that accesses the stencil components.
  Finishes the implementation of TEX2D_DEPTH_RGBA as the stencil component was previously missing from the reconstructed data
- Add a few missing destructors
  Image classes are inherited a lot and I forgot to make the dtors virtual
2018-12-24 09:05:19 +03:00
kd-11 c75749f8ce rsx: fix flip logic when grabbing output from the surface cache 2018-12-24 09:05:19 +03:00
Rui Pinheiro 9d1cdccb1a Implement dedicated texture cache predictor 2018-12-11 22:37:10 +03:00
kd-11 504ab5a6d4 rsx: Minor cleanup to silence stupid compiler warnings 2018-12-03 20:01:23 +03:00
kd-11 2168159d03 gl: Fix flip regression
- Restore graphics state after flip (including active fbo) because flip can be made through a syscall
2018-11-30 23:51:25 +03:00
kd-11 5b6e1420f3 rsx: Pipeline barriers fixed up
- Ensure barriers are invoked even if no draw occurs!
-- Ensures that deferred commands are executed eventually
2018-11-30 23:51:25 +03:00
kd-11 8a186bb97e rsx: Fix insertion of execution barriers
- Ignore barriers inserted after BEGIN but before any draw commands are emitted
- Properly process tail barriers inserted before END but after draw commands are submitted
- Ignore execution barriers with no effect (same register value written)
2018-11-30 23:51:25 +03:00
kd-11 5193c99973 rsx: Enable dynamic FIFO preprocessing
- Tries to detect when FIFO preprocessing is beneficial and only enables optimizations if the benefit outweighs the cost
- Current threshold is at least 500 draw calls saved at over 2000 draw calls to justify the overhead
- TODO: More tuning for other CPUs
2018-11-30 23:51:25 +03:00
kd-11 846daadd5d rsx: Fixups
- Improve vertex attribute layout format. Allows for full 16-bit attribute divisor
- Use actual pitch when declaring framebuffer rsx pitch instead of register value in case of swizzle? rendering
2018-11-30 23:51:25 +03:00
kd-11 d6b4440ef9 gl: Separate vertex env from program env 2018-11-30 23:51:25 +03:00
kd-11 54ec363e88 rsx: Critical pipeline fixes
- Fix scissor and viewport binding behavior
- Fixes recovery if empty scissor is specified and then 'fixed' later
- Optimizes state binding a bit
2018-11-30 23:51:25 +03:00
kd-11 1ad76ad331 rsx: Restructure programs
- Also re-enable pipeline optimizations
2018-11-30 23:51:25 +03:00
kd-11 b0a6b72ce8 rsx: Optimizations
- Replace a few more vectors with simple_array<T>
- Avoid unnecessary string comparisons in backends. We already know referenced textures from the program analysers!
2018-11-30 23:51:25 +03:00
kd-11 677b16f5c6 rsx: Fixups
- Also fix visual corruption when using disjoint indexed draws

- Refactor draw call emit again (vk)

- Improve execution barrier resolve
  - Allow vertex/index rebase inside begin/end pair
  - Add ALPHA_TEST to list of excluded methods [TODO: defer raster state]

- gl bringup

- Simplify
  - using the simple_array gets back a few more fps :)
2018-11-30 23:51:25 +03:00
kd-11 e01d2f08c9 rsx: Refactor FIFO
- Removes fifo structures from common RSXThread
- Sets up a dedicated FIFO controller
- Allows for configurable queue optimizations
2018-11-30 23:51:25 +03:00
Nekotekina 1b37e775be Migration to named_thread<>
Add atomic_t<>::try_dec instead of fetch_dec_sat
Add atomic_t<>::try_inc
GDBDebugServer is broken (needs rewrite)
Removed old_thread class (former named_thread)
Removed storing/rethrowing exceptions from thread
Emu.Stop doesn't inject an exception anymore
task_stack helper class removed
thread_base simplified (no shared_from_this)
thread_ctrl::spawn simplified (creates detached thread)
Implemented overrideable thread detaching logic
Disabled cellAdec, cellDmux, cellFsAio
SPUThread renamed to spu_thread
RawSPUThread removed, spu_thread used instead
Disabled deriving from ppu_thread
Partial support for thread renaming
lv2_timer... simplified, screw it
idm/fxm: butchered support for on_stop/on_init
vm: improved allocation structure (added size)
2018-10-19 22:22:35 +03:00
kd-11 6a9f234dc7 rsx: Fixup flip behaviour
- handle_emu_flip is very heavy, only fire
2018-09-26 19:41:50 +03:00
kd-11 f72157bcec rsx: Fix vertex attrib parsing 2018-09-25 22:03:35 +03:00
kd-11 a3d44b5e1f rsx: Cleanup changes for the flip patch 2018-09-24 16:44:02 +03:00
Jake 699eadc84f rsx: Move render flip from rsx queue command to flip command 2018-09-24 16:44:02 +03:00
Rui Pinheiro 35139ebf5d Texture cache cleanup, refactoring and fixes 2018-09-24 15:26:40 +03:00
kd-11 2b6e6a9ae9 gl: Fix problems with framebuffer reuse
- Matching attachments with resource id fails because drivers are reusing
  handles!
- Properly sets up stale fbo ref counting and removal
- Properly sets up resource reference test with subsequent removal to
  avoid using a broken fbo entry
2018-09-21 16:32:23 +03:00
kd-11 fc486a1bac rsx: Preserve memory order when doing flush
- Orders flushing to preserve memory at all cost
- Avoids false positive where flushing overlapping sections can falsely invalidate another with head/tail test
2018-09-21 16:32:23 +03:00
kd-11 23dc9d54e3 rsx: Fix flip source selector 2018-09-21 16:32:23 +03:00
kd-11 d6dc1493cb rsx/overlays: Implement blur, darkening and ability to disable custom background 2018-09-18 16:24:13 +03:00
kd-11 9f61fb5a78 overlays: Allow custom background for message dialog 2018-09-18 16:24:13 +03:00
scribam 4cb98014a2 rsx: tiny zcull optimizations 2018-09-13 12:43:40 +03:00
eladash efbd77deb4 rsx: dont silently ignore null shader address 2018-09-12 00:40:20 +03:00
Nekotekina ca5158a03e Cleanup semaphore<> (sema.h) and mutex.h (shared_mutex)
Remove semaphore_lock and writer_lock classes, replace with std::lock_guard
Change semaphore<> interface to Lockable (+ exotic try_unlock method)
2018-09-03 23:00:36 +03:00
kd-11 5a08b690d5 gl: always clean up the heap when using legacy buffers 2018-09-03 18:24:20 +03:00
scribam 7d0e94ab0a Compilation fixes for optional on osx 2018-08-31 20:13:54 +04:00
kd-11 ec31157bc7 gl: Avoid unnecessary scissor state change every draw call. 2018-08-18 16:14:30 +03:00
kd-11 8c93db342f gl: Reuse framebuffer resources
- WIP optimizations for GL backend
2018-08-18 16:14:30 +03:00
kd-11 f8a9b1fa30 [WIP] rsx: Improve memory inheritance hierachy
- Cascade memory writes by invalidating 'downstream' subsurfaces
- Fixup; always resolve for overlapping surfaces before sampling (force
  atlas gather test)
2018-08-18 16:14:30 +03:00
kd-11 ba5b59dc59 gl: Do not create secondary context if async is disabled
- Some third party programs fall apart when multiple contexts are created
2018-08-18 16:14:30 +03:00
kd-11 0267221586 Minor optimizations and fixes
- FIFO: avoid multiline spam
- VK: Fix program setup counter
- FS: Precalculate fragment constants buffer size during analysis step
2018-08-18 16:14:30 +03:00
kd-11 3b47e43380 rsx: Synchronization rewritten
- Do not do a full sync on a texture read barrier
- Avoid calling zcull sync in FIFO spin wait
- Do not flush memory to cache from the renderer side; this method is now obsolete
2018-08-18 16:14:30 +03:00
eladash f349695a75 Rsx: rewrite address translation 2018-08-13 16:16:34 +03:00
kd-11 19d808d378 rsx/gl: Minor cleanup and optimization
- Track register change status
- Remove unused gl classes
2018-07-22 17:19:59 +03:00
kd-11 e7f30640ef rsx: Async shader compilation
- Defer compilation process to worker threads
- vulkan: Fixup for graphics_pipeline_state.
  Never use struct assignment operator on vk** structs due to padding after sType member (4 bytes)
2018-07-14 15:19:56 +03:00
kd-11 fa55a8072c rsx: Improve vertex textures support
- Adds proper support for vertex textures, including dimensions other than 2D textures
- Minor analyser fixup, removes spurious 'analyser failed' errors
- Minor optimizations for program state tracking
2018-07-12 18:02:28 +03:00
kd-11 2ca935a26b vp: Improve vertex program analyser
- Adds dead code elimination
- Fix absolute branch target addresses to take base address into account
- Patch branch targets relative to base address to improve hash matching
- Bumps shader cache version
- Enables shader logging option to write out vertex program binary,
  helpful when debugging problems.
2018-07-07 16:20:33 +03:00
kd-11 24f4c92759 rsx: Improve texture cache read speculation 2018-06-26 20:07:20 +03:00
kd-11 1730708f47 rsx: Rework memory protection management for framebuffer access
- Avoid re-locking memory if there is no reason to do so (no draws issued)
- Actively bound regions should always get written to the backing cache
- Forcefully read memory during download if writes to the target have occured since last sync event
2018-06-26 20:07:20 +03:00
Megamouse a8f19fbfae RSX: fix shader cache progress bar exit state shenanigans 2018-06-11 22:41:38 +03:00
Megamouse 4003aacc6a RSX: add taskbar progress to native ui progress dialogs 2018-06-08 23:41:56 +03:00
kd-11 c9e367befd rsx/debug: Fix rendering when FIFO reordering is disabled 2018-06-08 22:17:50 +03:00
kd-11 1b9c9267f0 rsx: Update memory flags after memory transfer 2018-06-08 22:17:50 +03:00
kd-11 fc18e17ba6 vk: Implement depth scaling using hardware blit/copy engines
- Removes the old depth scaling using an overlay.
  It was never going to work properly due to per-pixel stencil writes being unavailable
- TODO: Preserve stencil buffer during ARGB8->D32S8 shader conversion pass
2018-06-08 22:17:50 +03:00
kd-11 3150619320 rsx: Preserve read AA state separate from write AA state
- Some applications (e.g Backbreaker) use an evil hack to resolve MSAA.
  The application respecifies a formerly AA region as a region with no AA then performs a framebuffer feedback lookup.
  The old memory keeps AA during read, but writes back to itself with AA resolved.
  This is evil on several levels but it just happens to work on PS3
2018-06-08 22:17:50 +03:00
kd-11 0f24379c0e rsx: Obey MSAA resolve during memory persistence transfer
- Ugh. This is a bandaid on a festering wound, AA badly needs a rewrite

 Also silence some warnings
2018-06-08 22:17:50 +03:00
Dravonic 400079a006 Parallel shader cache loading (#4677)
* Parallel shader cache loading
2018-06-01 19:49:29 +03:00
kd-11 f543fb0243 vk/gl: Fix flush synchronization to be kinder to weaker CPUs but not harm higher end CPUs 2018-05-30 13:30:23 +03:00
kd-11 6362942928 rsx: Avoid semaphore acquire deadlock 2018-05-30 13:30:23 +03:00
kd-11 83f9be2524 rsx: Promote FIFO optimizations outside of strict mode
- The benefits of FIFO optimizations are huge in some cases.
  The optimizations also do not break any tested applications so no need to disable with strict mode
- A debug option is provided to disable this behaviour for testing
2018-05-29 13:54:30 +03:00
kd-11 2adb2ebb00 overlays: Avoid race condition on remove-on-update views
- Improves cleanup code to consist of 2 parts, remove then dispose. Remove
  does not deallocate the item until dispose is called on it, allowing the
  backends to first deallocate external references.
- Caller is responsible for managing list locking and tracking disposable list
  of items when external references have been cleaned up before using
  dispose method.
2018-05-29 13:54:30 +03:00
kd-11 b957eac6e8 rsx: Avoid calling any blocking callbacks from threads that are not rsx::thread
- Defers on_notity_memory_unmapped to only run from within rsx context
- Avoids passive_lock + writer_lock deadlock
2018-05-23 19:07:08 +03:00
kd-11 c9669818eb Facepalm
- overlays: Do not free self handle!!!!
2018-05-21 15:55:25 +03:00
kd-11 f6f45b8699
Native UI refactored (#4623)
Refactor and improve native overlays
2018-05-20 23:05:00 +03:00
scribam 04ad49de4d typos 2018-05-14 21:14:39 +04:00
kd-11 1aa44ede31 gl: Improve AMD multidraw workaround
- Reimplements the AMD workaround using an identity buffer to avoid the performance hit of doing multiple glDrawArrays for every single compiled set
- Reimplements first/count allocation using a scratch buffer to reduce allocation overhead when large number of draw calls is used
2018-05-13 14:44:14 +03:00
kd-11 b7979d3f57 rsx/vk: Improvements and minor optimizations
- Improve dirty state tracking affecting program state
- vk: Refactor out transform constants upload into a separate channel to avoid if possible
  transform data uploads are quite expensive
2018-05-13 14:44:14 +03:00
kd-11 440a31ef18 rsx: Optimizations for program management 2018-05-13 14:44:14 +03:00
kd-11 a52ea7f870 rsx: Improve fragment and vertex program usage
- Introduces a gpu program analyser step to examine shader contents before attempting compilation or cache search
  - Avoids detecting shader as being different because of unused textures having state changes
  - Adds better program size detection for vertex programs
- Improved vertex program decompiler
  - Properly support CAL type instructions
  - Support jumping over instructions marked with a termination marker with BRA/CAL class opcodes
  - Fix SRC checks and abort
  - Fix CC register initialization
  - NOTE: Even unused SRC registers have to be valid (usually referencing in.POS)
2018-05-13 14:44:14 +03:00
kd-11 ffa62918aa gl: Improve pixel transfer code and notify on AMD driver bug
- Readback does not work at all with float textures on AMD openGL
  Driver throws a bogus OUT_OF_MEMORY error regardless of amount of VRAM and system RAM available
2018-04-25 19:14:36 +03:00
kd-11 91a6091d26 rsx: Minor fixes
- vk: Clear dirty textures before copying 'old contents' in case the old data does not fill the new region
- rsx: Properly decode border color - seems to be in BGRA format
- vk: better approximation of border color to better choose between the presets
- vk: Individually clear color images outside render pass and without scissor
- vk: Fix renderpass selection for clear overlay pass
- vk: Include scissor region when emulating clear mask

NOTES:
- vk: Completely avoid using vkClearXXXXimage - its 'broken' on nvidia drivers
  Spec is vague about the function so its not an actual bug
  ClearAttachment is clearly defined as bypassing bound state which works correctly
- TODO: Implement memory sampling to simulate loading precleared memory if cell used memset to preinitialize the framebuffer
  Autoclear depth to 1|255 and color to 0 is hacky!
2018-04-25 19:14:36 +03:00
kd-11 63d9cb37ec rsx: Framebuffer fixes
Primary:
- Fix SET_SURFACE_CLEAR channel mask - it has been wrong for all these
  years! Layout is RGBA not ARGB/BGRA like other registers

Other Fixes:
- vk: Implement subchannel clears using overla pass
- vk: Simplify and clean up state management
- gl: Fix nullptr deref in case of failed subresource copy
- vk/gl: Ignore float buffer clears as hardware seems to do
2018-04-25 19:14:36 +03:00
kd-11 6d46ac1ad6 gl: Reimplement textures
- Separate texture data from texture views
2018-04-25 19:14:36 +03:00
kd-11 c5cd758700 rsx: Workaround for G8B8 render targets
- Mainly affected are colormasks and read swizzles

NOTES:
- Writes to G write to the second and fourth component (YW)
- Writes to B write to first and third component (XZ)
- This means the actual format layout is BGBG (RGBA) making RG mapping actually GR
- Clear does not seem to have any intended effect on this format (TLOU)
2018-04-25 19:14:36 +03:00
Talkashie 64992f758d Fix typos (#4410)
* MASSIVE TYPO FIX part 1

* ANOTHER HUUUUGE TYPO FIX part 2

* thank you :hcorion: for all of your help. I could not have done this without you
2018-04-08 01:01:39 +01:00
pauls-gh a17025c465 Strict Rendering Mode (SRM) fix. Move old surface copy before texture upload.
Fixes the following issues on Tales of Vesperia which requires SRM.
- Blacked out scene after the sleeping dog now renders correctly
- Ghosting effect. The ghosting was most noticeable as a delay between the character rendering and the cell shading around the character. This appears to be gone with this change.
2018-03-29 11:01:58 +03:00