Commit graph

393 commits

Author SHA1 Message Date
kd-11 27552891ad rsx/fp: Improvements
- Export some debug information in the free texture register space components zw
  Very useful when analysing renderdoc captures
- Enable shadow comparison on depth as long as compare function is active and texture is uploaded for depth read
  Some engines (UE3) read all the components in the shader and use mul/mad with the result
2018-03-25 13:31:06 +03:00
kd-11 5f047034ae rsx: Disable async count verification to avoid lockup due to zombie reports in ZCULL 2018-03-13 18:55:03 +03:00
kd-11 f00d9a7c7f rssx" Halfplement alpha-to-coverage AA transparency 2018-03-13 18:55:03 +03:00
kd-11 2dce55d036 rsx: ZCULL synchronization fixes
- Track asynchronous operations in RSX core
- Add read barriers to force pending writes to finish.
  Fixes zcull delay flicker in all UE3 titles without forcing hard stall
- Increase zcull latency as all writes should be synchronized now
2018-03-13 18:55:03 +03:00
kd-11 315798b1f4 rsx: ZCULL rewrite and other improvements
- ZCULL unit emulation rewritten
- ZCULL reports are now deferred avoiding pipeline stalls
- Minor optimizations; replaced std::mutex with shared_mutex where contention is rare
- Silence unnecessary error message
- Small improvement to out of memory handling for vulkan and slightly bump vertex buffer heap
2018-03-13 18:55:03 +03:00
kd-11 dece1e01f4 rsx: Improve transform constants management
- Removes the duplicate local_transform_constants
- Resets the transform constants on every context reset
- Simplifies the code abit which should make it faster
- NOTE: Transform constants are persistent across context re-init events (VF5)
2018-03-13 18:55:03 +03:00
kd-11 0c8e4c0887 rsx: Improve FIFO commandlist flattening
- TODO: Alot of work is still needed to execute draw commands out of order
  Thats the only solution to games sending many draw calls with high frequency of state changes
2018-03-13 18:55:03 +03:00
kd-11 84b8a08d26 rsx: Basic performance counters 2018-03-13 18:55:03 +03:00
kd-11 af1b13550b rsx/vk: More optimizations
- Do not bother rechecking the dirty sampler pool for hits. Its faster to create new sampler than to search the pool
- Reserve some memory on vertex layout struct to reduce reallocation penalty
2018-03-13 18:55:03 +03:00
Jake 7233640cf0 rsx: add vertex data base to offset and mask before translating address 2018-03-07 16:57:20 +03:00
kd-11 bd297d079d rsx: Minor optimizations 2018-02-16 16:14:54 +03:00
kd-11 a5500ebfa4 rsx: Fix disjoint draw range splitting
- Fixes flickering and missing draws in R&C and other games such as Motorstorm Apocalypse and Okami HD when strict mode is disabled
2018-02-16 16:14:54 +03:00
Nekotekina cce0ad0c35 Clean vm::ps3 namespace use 2018-02-09 17:49:37 +03:00
Jake 2f414f96bf rsx: fix potential hang during thread close 2018-01-24 16:28:09 +00:00
kd-11 3d9e3a16f1 rsx/gl/vk: Fixes and optimizations
- opengl driver optimization for nvidia. On nvidia glTextureBufferRange performance is horrendous
-- Initialize texture buffer to whole buffer at startup and use absolute offsets to read data instead
-- Over 2x performance in some cases (Resogun, TNT racers)
- gl/vk: Do not flip non-existent display buffers. Fixes spec violation at boot in TNT racers demo
- whitespace fixes for sys_rsx
2018-01-22 11:43:35 +03:00
kd-11 0a2992839b rsx/gl/vk: Simulate z clipping with selective depth clamp
- The scale offset matrix is fine but on real hardware the z results seem to be independent of near/far clipping distances
-- If depth falls within near/far, clamp depth value to [0,1]
2018-01-19 12:03:57 +03:00
kd-11 cbc8bf01a1 cell/scheduler: Manage thread placement depending on cpu hardware 2018-01-19 12:03:57 +03:00
kd-11 71f69d1d48
rsx/overlays: Introduce 'native' HUD UI and implement some common dialogs (#4011) 2018-01-17 19:14:00 +03:00
Jake 0477f8ed3c rsx: add log for potential source of error 2018-01-14 20:50:55 +03:00
Jake 7ca2c444cc rsx: Fix depth clipping 2018-01-14 20:50:55 +03:00
kd-11 ee009ec99c rsx: Robustness fixes
- Track last working state and reset to it if RSX starts to desync
-- This is especially useful when running vulkan since the renderer will easily outpace the rest of the system when merely recording draw commands
- Ignore empty sets
-- Mark empty/invalid IB sets as having 0 element counts.
2018-01-02 21:17:56 +03:00
kd-11 0d0821e914 rsx: Pause FIFO queue when changing ctrl registers 2017-12-18 10:45:37 +03:00
kd-11 90c2324e47 rsx: Program cache fixes
- Reorganize storage hash vs ucode hash
- Scan for actual fragment program start in case leading NOPed code precedes the actual instructions
-- e.g FEAR2 Demo has over 32k of padding before actual program code that messes up hashes
2017-12-04 18:22:18 +03:00
kd-11 90a3f3af30 rsx: Discard queue if RET is found without CALL 2017-12-01 21:00:50 +03:00
kd-11 de5a4fe083 rsx: Reimplement depth <-> RGBA reinterpretation code
- Implements proper channel order for fp24-ARGB8 conversion
- Takes swizzle remap into account when reconstructing source bytes
2017-12-01 21:00:50 +03:00
kd-11 ccc0383f75 vulkan: Implement overlay shader passes
- Implements vk::overlay_pass and vk::depth_convert_pass
- Also added a sanity check in RSX core for depth replace shaders
2017-12-01 21:00:50 +03:00
kd-11 680ca1d12a rsx: Zcull refactoring and vulkan implementation 2017-12-01 21:00:50 +03:00
kd-11 0aaae000b3 rsx: Minor improvements 2017-12-01 21:00:50 +03:00
Zion Nimchuk 3a9ae2df9e silence warnings in RSX stuff 2017-11-30 18:07:19 +03:00
Nekotekina dbc9bdfe02 Implement set_ideal_processor_core (linux) 2017-11-15 21:00:02 +03:00
kd-11 1fa18757fc rsx: Implement render-to-cubemap; Also simplify unnormalized samplers [WIP, DELETE SHADER CACHE, VERY SLOW]
- Enables real-time cubemap reflections
- TODO: Vulkan is broke; rsx is very slow with this feature
2017-11-08 13:15:34 +03:00
kd-11 0961a43997 rsx: Implement 1D<->2D image type casts 2017-11-08 13:15:34 +03:00
kd-11 bbcb6b6851 rsx: Fbo fixes 2
- Use AA mode to predict surface compression. Compression mode is useless without AA activated
- Rewrites most image subresource fetch routines to use the new heuristic
- Fix rsx:🧵:find_tile. FEED000(X) can be substituted for (X) in the code
-- Fixes alot of failures when looking for tiled regions

rsx: Fix antialiased unnormalized coords
- scaling factors are inverse to allow proper coordinates to be computed in fs
2017-11-08 13:15:34 +03:00
kd-11 173d05b54f rsx: Optimizations
- Reimplement fragment program fetch and rewrite texture upload mechanism
-- All of these steps should only be done at most once per draw call
-- Eliminates continously checking the surface store for overlapping addresses as well

addenda - critical fixes
- gl: Bind TIU before starting texture operations as they will affect the currently bound texture
- vk: Reuse sampler objects if possible
- rsx: Support for depth resampling for depth textures obtained via blit engine

vk/rsx: Minor fixes
- Fix accidental imageview dereference when using WCB if texture memory occupies FB memory
- Invalidate dirty framebuffers (strict mode only)
- Normalize line endings because VS is dumb
2017-11-08 13:15:34 +03:00
Jake e0d1ac676e rsx: invalidate surface store address when tile is unbound 2017-10-28 12:46:20 +03:00
Jake 626b9f93c4 rsx: make dmactrl get 'readonly' 2017-10-28 12:46:20 +03:00
kd-11 9c9495621c rsx: Fix critical bug concerning transient data layout in memory 2017-10-26 00:35:45 +03:00
kd-11 5db45c3699 rsx: More fixes
- Workaround for AMD glMultiDrawArrays bug
- Disable disjoint command submission when multidraw support is disabled
2017-10-19 12:22:52 +03:00
kd-11 86bf61ad35 rsx: Fix memory protection
- Fixes hanging when wcb is enabled
2017-10-14 14:19:14 +03:00
kd-11 c570410e06 rsx: Stop executing broken command queues if the application fails to recover in 3 retries 2017-10-12 13:51:29 +03:00
kd-11 9af71699a4 rsx: Minor fixes
- Dont skip cb if a problem occurs, just spin on it instead to allow possibility of recovery
- Vulkan cleanup for the die_with_error helper
2017-10-12 13:51:29 +03:00
kd-11 fc0f98b5db rsx: Res scaling improvements
- gl: Reintroduce the wcb hw downscaling
- rsx: Clamp scalable render target size with a config var
2017-10-09 20:25:41 +03:00
kd-11 12ab03b0b5 rsx/gl: Implement resolution scaling
rsx: Revise wpos calculation to take resolution scale into account
2017-10-09 20:25:41 +03:00
kd-11 3499d089e7 rsx: Texture cache fixes and improvements
rsx: Conditional lock hack removed
vulkan - Fixes
- Remove unused texture class
- Fix native pitch calculation (WCB)
rsx: Catch hanging begin/end pairs when flushing deferred draw calls
vulkan: Register DXT compressed formats
vulkan: Register depth formats
gl: Workaround for 'texture stitching' when gathering flip surface
- TODO: Add a proper flip hack option
rsx: Fix texture memory size calculation
- DXT textures dont have real pitch. Since pitch is used to calculate memory size, make sure it always evaluates to rsx_size
rsx: Fix cpu copy detection
rsx: Validate blit dst surface and dont make assumptions about region blit order
- Also relax restrictions on memory owned by the blit engine if strict rendering is not enabled
rsx: Fix depth texture detection
rsx: Do not manually offset into dst. The overlapped range check does so automatically
rsx: Minor optimizations
rsx: Minor fixes
- Fix to detect incompatible formats when using GPU texture scaling and show message
- Better 'is_depth_texture' algorithm to eliminate false positives
2017-09-21 16:17:06 +03:00
kd-11 3836b40bf7 rsx: Fixups 2017-09-21 16:17:06 +03:00
kd-11 571dbfb7b1 rsx: Texture cache improvements
- Limits buffer size to min 720 in the Y axis (1024 section causes conflicts in some cases - TODO)
rsx: Fixups to allow large textures for blit operation
- Also includes checks for both leaking sections and blit regions for vulkan
hotfix for hanging when using WCB
addendum - unlock both ro and no blocks before attempting to copy memory blocks
gl: Fixups for ARB_explicit_uniform_location
- Forces glsl v 430 to make use of the extension
rsx/vk: Rework texture cache to minimize recursive access violations
- Also modifies the vulkan commandbuffer begin/end/submit mechanism
gl: Fix cached_texture_section::is_flushable to take memory protection into account
rsx: Fix blit dst offset calculation
2017-09-21 16:17:06 +03:00
kd-11 10e25eb362 rsx: Fix multidraw range splits again
rsx: Hotfix for disjoint range detection
2017-09-21 16:17:06 +03:00
kd-11 45d0e821dc gl: Minor optimizations
rsx: Texture cache - improvements to locking
rsx: Minor optimizations to get_current_vertex_program and begin-end batch flushes
rsx: Optimize texture cache storage
- Manages storage in blocks of 16MB
rsx/vk/gl: Fix swizzled texture input
gl: Hotfix for compressed texture formats
2017-09-21 16:17:06 +03:00
kd-11 061824a7ec rsx: Add support for batched multidraw
gl: Fix multidraw [WIP]
rsx: Ignore vertex base when data source is generated using arithmetic
vk: Check pending flag before doing fence poke
vk/gl: Fix for inlined array and immediate draws
rsx: Collapse joined draws when batching
2017-09-21 16:17:06 +03:00
kd-11 abb56a354d rsx: Bug fixes and improvements
rsx: Try to skip unknown commands without discarding entire cb
2017-09-21 16:17:06 +03:00
kd-11 f71f67c4ff rsx: Make fragment state dynamic to reduce shader permutations 2017-08-26 21:53:54 +03:00
kd-11 142bfb5250 rsx: Fix immediate indexed drawing 2017-08-18 16:51:38 +03:00
kd-11 2fd4726919 rsx: Fix single vertex array input declarations 2017-08-16 23:58:30 +03:00
kd-11 c04aa05398 rsx: Shader pipeline fixes and improvements
- Do not set zfunc if alphakill is not enabled. This is because at the moment alphakill requires a different shader to be built

- use glsl loop-unroll friendly comparison; skip vertex input compare if either key requests it

- Minor tweaks to fp key generation
2017-08-16 23:58:30 +03:00
kd-11 bbf2a97d2e rsx: Zero-initialize the vertex register block
- Some games reference constant regs that they never initialize
2017-08-16 23:58:30 +03:00
kd-11 1da732bbf5 rsx/gl/vk: Invalidate texture regions when memory is unmapped
- Free GPU resources immediately if mappings change to avoid leaking VRAM
2017-08-16 23:58:30 +03:00
kd-11 b2b5f564a1 rsx: Add a few more depth format types to known behaviour paths 2017-08-16 23:58:30 +03:00
kd-11 d54c2dd39a gl: Move vertex processing to the GPU
- Significant gains from greatly reduced CPU work
- Also reorders command submission in end() to improve throughput

- Refactors most of the vertex buffer handling
- All vertex processing is moved GPU side
2017-08-16 23:58:30 +03:00
RipleyTom 2d7e91ba8a Yield instead of sleeping rsx thread. (#3158)
Another Yield
2017-08-06 01:46:01 +01:00
Jake 21dd715b42 sys_rsx: implement support for lle-gcm 2017-08-02 01:33:12 +03:00
Jake d9a693019b rsx/gcm: Implement rsx dma. Refactor gcm/rsx to not be as codependent 2017-08-02 01:33:12 +03:00
kd-11 4cd5624fa7 rsx/vk/gl: Refactoring - Also adds a vertex cache to openGL as well 2017-07-27 14:33:30 +03:00
kd-11 6557bf1b20 rsx: More aggressive thread scheduling for vertex processing
- Significantly helps vertex performance
- Not recommended as more threads will harm performance if the PC does not have the cores for it
2017-07-24 16:52:42 +03:00
kd-11 df8fa74e2a vulkan hotfix (#3046)
* Rework vertex attribute binding for vulkan. Allows always providing a buffer view to the pipeline even if the game has the attribute disabled as long as it is consumed by the vertex shader.
2017-07-22 01:54:28 +03:00
kd-11 05ffb50037 vk/rsx: Bug fixes and improvements
- Improvements to framebuffer usage; Avoid creating new resources every frame
- Handle null fragment program properly
- Collect vertex upload statistics

- vk: Pre-initialize 'unused' varying registers in the vertex shader in case it gets matched with a fs that consumes it
 -- Fixes a crash about fog_c not being declared

gl/dx12/vk: Handle null fragment program

- cleanup - use yield semantic instead of sleep(0) as yield is more cross-platform
 -- sleep(0) is a windows specific scheduler hint
2017-07-19 23:28:33 +03:00
kd-11 e9b8f94fb1 rsx/gl/vk: Enable frame skipping 2017-07-08 14:52:16 +03:00
kd-11 b95ffaf4dd rsx: Implement skip draw. Also, start working on MT vertex upload 2017-07-08 14:52:16 +03:00
kd-11 459a7ba5a2 rsx: Avoid using push_back/emplace_back on empty STL containers
- Reckless management of STL containers causes significant slowdown
- Also reorders vertex compare steps to fail quickly on simpler checks
2017-06-29 13:13:19 +03:00
kd-11 b2e906f4cc rsx: Code cleanup. Fixes several dozen warnings
- Wrap unused parameters as comments to prevent C1400
- Fix sized variable conversions with explicit casts
2017-06-22 23:36:15 +03:00
kd-11 11317acdbe rsx: Handle non-zero base vertex better
- Vertex buffer contents treat the base vertex as vertex 0 so we do the same for indices

rsx: Fix vertex base indexing

rsx: Properly fix non-zero offset indexed rendering
2017-06-22 23:36:15 +03:00
kd-11 98cf72e0fb rsx: Fix clip space computations 2017-06-22 23:36:15 +03:00
kd-11 69d3d47901 gl: Fix clip-space -> depth conversion. Fixes remaining depth read issues
- Also set some default values for samplers in a cleaner way using their 'natural' float values
2017-06-22 23:36:15 +03:00
kd-11 6a9eef0382 rsx/gl/vk: Enable use of native PCF shadows 2017-06-22 23:36:15 +03:00
kd-11 860b76452f vulkan bringup on linux
cleanup: drop unused stuff
2017-06-08 19:08:44 +03:00
Nekotekina f010b5b235 Configuration simplified 2017-05-20 16:01:48 +03:00
scribam 299f627321 Stub cell (#2785)
* Update cellGcmSys

* Update cellStorage

* Update cellSubdisplay

* Update sceNpTrophy
- Use error_code as return type
- Add few checks

* Update cellKey2char

* Update cellKb:
- Use error_code as return type
- Replace UNIMPLEMENTED_FUNC by .todo

* Update cellNetCtl

* Update cellSpudll

* Update cellSysutilAp

* Update cellUserInfo

* Stub sys_mempool_allocate_block (bad idea)
2017-05-15 14:30:14 +03:00
kd-11 c5975d5f66 rsx: Vertex program output fixes 2017-05-12 20:10:03 +03:00
kd-11 450d45354c rsx: Enable GPU texture scaling by default 2017-05-10 21:50:14 +03:00
Jake 60ce85f840 [Render] Userclip for d12/vk/ogl (#2719) 2017-04-25 18:32:39 +08:00
Ofek a5fd7abcf7 Trophy update (#2655)
* Added checksum check to TROPHY.TRP loader

* Implemented sceNpTrophyGetGameProgress, sceNpTrophyGetGameIcon & sceNpTrophyGetTrophyIcon

* Updates to up to date APIs and tiny changes

* Code style fixes for checksum verifier, and another fix for trophy functions

* Format fix
2017-04-13 20:29:47 +03:00
kd-11 adefd1fd63 rsx/ui: Add config toggle for GPU texture scaling/blit 2017-04-08 23:12:09 +03:00
kd-11 909f3e9b3e rsx: Support indexed immediate draw via ArrayElement method 2017-03-29 23:06:17 +03:00
kd-11 70d3a6d840 rsx: Support more base types for immediate rendering
fix alignment
2017-03-26 16:22:53 +03:00
kd-11 79d114cc06 rsx: Support immediate mode rendering 2017-03-26 16:22:53 +03:00
kd-11 34c2b8a55e rsx: recover from FIFO parse errors
- Validate FIFO registers before access

-- Validate the args ptr separate from the get ptr
2017-03-24 09:30:23 +03:00
kd-11 7c73c3b75c rsx/gl: Minor refactoring; prepare vulkan backend 2017-03-01 00:16:55 +03:00
kd-11 1e826f5ccf rsx: Minor optimization (tangible boost) 2017-03-01 00:16:55 +03:00
Nekotekina baf22527b0 Ditch fs::get_executable_dir 2017-02-22 17:17:26 +03:00
Ani 65104b5909 Rough implementation of GCM_CONTEXT_DMA methods
Rough implementation of GCM_CONTEXT_DMA methods.
Fixes #1487
2017-02-17 22:35:28 +03:00
Nekotekina 598c90f376 PPU thread scheduler 2017-02-13 22:26:11 +03:00
kd-11 d6159a35aa gl/vk/dx12: Fix texture scaling on unnormalized rtt access 2017-02-11 15:45:59 +03:00
Nekotekina 246b9f3182 CHECK_EMU_STATUS removal 2017-02-05 17:35:27 +03:00
Nekotekina a5a2d43d7c Thread.cpp refinement
Hide thread mutex
Safe notify() method
Other refactoring
2017-01-29 19:52:19 +03:00
kd-11 2c803dbe66 gl/vk: Bug fixes and improvements (#2206)
* gl: Only bind attrib textures on thread startup

* gl: Persistent mapped buffers

* gl: Fix emulated primitives in an inlined array

* gl: Do not re-update program information every draw call

* gl/vk: s1 type is signed normalized not unsigned normalized

* gl/rsx: Allow disabling of persistent buffers for debugging

gl: Large heap size is more practical

gl: Fix a bug with legacy opengl buffers

* gl/rsx: Allow emulation of unsupported attribute formats

* gl: Fix typos and remove dprints

gl: cleanup debug prints

* ui: Move the GL legacy buffer toggle to the left pane

* vk/gl: Fix cmp type, its range is [-1,1] not [0,1] SNORM_INT
2016-10-18 15:57:28 +08:00
Melissa Goad 22b1400018 Revamp PFIFO command submission emulation (#2179) 2016-10-01 22:13:15 +03:00
kd-11 5430e1d310 rsx/gl/vk/dx12: Add emulated texture fetch for depth read (#2173)
* rsx/gl/vk/dx12: Add emulated texture fetch for depth read

gl/vk/dx12: Simplify reinterpretation equation

* gl: Remove unnecessary re-swizzle

* glsl: explicitly cast uint to float
2016-09-29 14:54:32 +08:00
vlj 8d54bcbc0d rsx: Use variant based draw commands. 2016-09-17 23:37:52 +02:00
vlj 03c86ae43b rsx: Move inline array to draw_clause structure. 2016-09-17 23:37:52 +02:00
Oil 153a2d2b40 Fixed fog and alphakill implementation in glsl (based on DH's old commits) (#2137)
* Fixed NV4097_SET_COLOR_CLEAR_VALUE

* Fixed fog and alphakill implementation in glsl (based on DH's old commits)
2016-09-14 22:47:53 +08:00
vlj 11858dce1a rsx: Vertex array attributes don't need to be stored outside of regs. 2016-08-27 15:40:41 +02:00