Commit graph

776 commits

Author SHA1 Message Date
kd-11 60f3059d22 rsx: Compensate for nvidia's low precision attribute interpolation
- The hw generates inaccurate values when doing perspective-correct
  interpolation of vertex output attributes and makes the comparison (a ==
  b) fail even when they are a fixed constant value.
- Increase equality tolerance when doing comparisons in fragment
  shaders for NV cards only to work around this issue.
- Teepo fix
2019-04-25 16:23:05 +03:00
kd-11 463b1b220d rsx: Improve accuracy of shadow compare Ops when non-integer depth formats are used
- The fixed-point D24S8 format does special Z clamping during compare which matches PS3 behaviour
- D32S8 is a floating point format and comparison with Dref > 1 always fails causing black edges/borders
2019-04-25 16:23:05 +03:00
kd-11 06a85f00d1 rsx: Shader decompiler cleanup and improvements
- Improve support for float16_t by minimizing mixed inputs to functions
(ambiguous overloads)
- Minimize amount of downcasts in code by using opcode flags
- Re-enable float16_t support for vulkan
2019-04-25 16:23:05 +03:00
kd-11 a668560c68 rsx: Use native half float types if available
- Emulating f16 with f32 is not ideal and requires a lot of value clamping
- Using native data type can significantly improve performance and accuracy
- With openGL, check for the compatible extensions NV_gpu_shader5 and
AMD_gpu_shader_half_float
- With Vulkan, enable this functionality in the deviceFeatures if
applicable. (VK_KHR_shader_float16_int8 extension)
- Temporarily disable hw fp16 for vulkan
2019-04-25 16:23:05 +03:00
eladash 6f76e34104 rsx: Fix race on clearing native_ui vs emu_requested flag 2019-04-20 01:04:41 +03:00
kd-11 a5ed30a8c0 rsx: Fixups for data cast operations via typeless transfer 2019-04-09 13:40:54 +03:00
kd-11 f04a0a2bb6 rsx: Remove some old restrictions affecting memory persistence 2019-04-09 13:40:54 +03:00
kd-11 cc3809fbfe gl: Register a few more missing formats for conversion 2019-04-09 13:40:54 +03:00
kd-11 e4e86455f2 rsx: Fix temporary subresource caching behaviour
- Do not cache if a gathered subresource contains a bound RTT
- Change op to dynamic copy if parent is still bound
2019-04-09 13:40:54 +03:00
kd-11 3249000511 rsx: Improvements to texture scanning
- Removes CPU-only transforms that broke GPU-side code.
 -- Channels in GPU compute are laid out in cell-order, but CPU was uploading in favorable order and compensating with swizzles.
 -- This leads to 2 different layouts depending on the location of the data (CPU vs GPU)
- Implement R8G8_R8B8 interleaved format decode
- General improvements
2019-04-09 13:40:54 +03:00
kd-11 366e4c2422 rsx: Preliminary support for format conversions using typeless resolve 2019-04-09 13:40:54 +03:00
kd-11 dbc8e70ddd rsx: Silence some compiler noise 2019-03-22 21:27:15 +03:00
kd-11 adc59f9810 rsx: Fix blit transfers when texel sizes mismatch
- Also refactors some bpp handling code
- Simplify texture intersection test to use a normalized/uniform coordinate space
- Fix broken bounds checking as well
2019-03-22 21:27:15 +03:00
kd-11 b879b32271 rsx: Fix bpp calculation taking resolution scaling into account
- Do not rely on image->width(), use surface_width() instead for unscaled values
- Refactor/clean GL rendertarget class a bit
2019-03-20 10:05:54 +03:00
kd-11 bb65e45614 rsx: Implement GPU acceleration for rotated images 2019-03-17 21:50:11 +03:00
kd-11 5260f4b47d rsx: Improvements to memory flush mechanism
- Batch dma transfers whenever possible and do them in one go
- vk: Always ensure that queued dma transfers are visible to the GPU before they are needed by the host
  Requires a little refactoring to allow proper communication of the commandbuffer state
- vk: Code cleanup, the simplified mechanism makes it so that its not necessary to pass tons of args to methods
- vk: Fixup - do not forcefully do dma transfers on sections in an invalidation zone! They may have been speculated correctly already
2019-03-17 21:50:11 +03:00
kd-11 385485204b vk/gl: Omit unlocked data when grabbing flip sources from texture cache 2019-03-17 21:50:11 +03:00
kd-11 74eeacd091 vk/gl: Improve memory tag sync and test
- Properly pass parameters such as rsx-pitch to the surface store
- Do not crash if a surface fails verification in flip, use fall-back instead
2019-03-17 21:50:11 +03:00
kd-11 a49a0f2a86 vk/gl: Synchronization improvements
- Properly wait for the buffer transfer operation to finish before map/readback!
- Change vkFence to vkEvent which works more like a GL fence which is what is needed.
- Implement supporting methods and functions
- Do not destroy fence by immediately waiting after copying to dma buffer
2019-03-17 21:50:11 +03:00
kd-11 3a4083263e rsx: Fix texture transfer when pitch does not match exactly 2019-03-17 21:50:11 +03:00
kd-11 1875dc3f18 gl: Fix buffer size calculations 2019-03-10 16:09:05 +03:00
kd-11 04dda44225 rsx: Properly generate render target data with all parameters provided
- Build-up to variable-sized framebuffers and AA implementation
- Also allows accurate range calculation for our hit testing
2019-03-10 16:09:05 +03:00
kd-11 9d4d3d9443 rsx: Reimplement render target intersection tests when using hw accelerated blit engine
- Properly collapse memory tree when scanning in case of overlaps!
2019-03-10 16:09:05 +03:00
kd-11 7c379432dd rsx: Implement proper pitch compatibility lookup
- When a single row is required or is all that is available, pitch has
no meaning as the coordinate space changed to 1D
2019-03-10 16:09:05 +03:00
kd-11 a80f1a6ed4 gl: Fix memory tag sampling
- Also fixes a bad arg passed to glClearBuffer
2019-03-10 16:09:05 +03:00
kd-11 0395fb9955 rsx/tecture_cache: Addendum - fix data cast with scaling conversion (AA emulation)
- Blit operations do format conversion automatically which is NOT what we want!
- Scale onto temp buffer with similar format before performing data cast.
2019-03-10 16:09:05 +03:00
kd-11 10dc3dadee rsx/texture_cache: Improve framebuffer memory locking when WCB/WDB is not enabled
- Adds a new mode that removes non-framebuffer stuff inside framebuffer range
2019-03-10 16:09:05 +03:00
kd-11 563e205a72 rsx/texture_cache: Fix 'AA' scaling hack and restore collection template selection 2019-03-10 16:09:05 +03:00
kd-11 3a071a9c07 rsx: Texture search rewrite
- Perform a full search across all resource types as needed without
taking too many shortcuts/hacks
2019-03-10 16:09:05 +03:00
kd-11 ef071ebb6b rsx: Synchronize surface cache and texture cache data
- TODO: The whole upload_texture thing is a big hack, fix it properly
2019-03-10 16:09:05 +03:00
kd-11 38887bc03e gl/vk: Improvements to overlay rendering
- gl: Properly initialize and manage sampler states
- gl/vk: Snap overlay elements to pixel grid by aligning to pixel centers
- overlays: Disable grid snapping in stb since its now handled in the backend
2019-02-05 12:15:12 +03:00
kd-11 9e39e2d2c4 gl/vk: Fix clip region scaling for overlay elements 2019-02-02 11:54:01 +03:00
kd-11 9ed9d7e947 overlays/osk: Implement native osk interface 2019-02-02 11:54:01 +03:00
kd-11 660bfeabae gl: Fixup - inline arrays 2019-01-25 14:34:22 +03:00
kd-11 fa9b448686 vk: Spec fixups
- Disable DEPTH<->RGBA typeless transfers for now as they require a lot more work to work for all vendors
- Do not allow switching layouts to UNDEFINED/PREINITIALIZED formats
2019-01-25 14:34:22 +03:00
kd-11 521969bcc3 gl: Remove GL_R 'format'. There is no GL_R format, it part of the S-T-Q-R enums for texture coordinate space 2019-01-25 14:34:22 +03:00
kd-11 5a4bea8c4f gl: Blit fixup
- Typo fix. I meant to disable scissor test, not stencil test
- Also clean up and simplify/optimize the core logic
2019-01-25 14:34:22 +03:00
kd-11 fb778e4821 rsx: Reimplement attrib divisor 2019-01-25 14:34:22 +03:00
kd-11 6fdc0fd7f0 rsx: Reimplement MSAA transparency
- Apply dither to edges that almost fail the straight-up alpha test
- Significantly improves alpha tested geometry far from the camera
- Also removes blend factor overrides/hacks as they give incorrect results due to background bleeding
2019-01-25 14:34:22 +03:00
kd-11 7eec702c6d gl: Fix silly regression with blit dst resource readback 2019-01-25 14:34:22 +03:00
kd-11 8093c9b573 rsx: Disable rtt side-effects when async compilation is ongoing. Only real renders should promote buffer state from underined to drawn, otherwise keep previous contents intact. 2019-01-25 14:34:22 +03:00
kd-11 417a2e6731 rsx: Refactor index buffers
- Index offset is ignored anyway and only used to calculate vertex attribute divisor index
- Specialized optimization for untouched xfer without primitive restart
2019-01-25 14:34:22 +03:00
Nekotekina bd9131ae1c Implement fs::get_cache_dir
Win32: equal to config dir for now
Linux: respect XDG_CACHE_HOME if specified
OSX: possibly incomplete
2019-01-13 14:45:36 +03:00
kd-11 52ac0a901a rsx: improve memory coherency
- Avoid tagging and rely on read/write barriers and the dirty flag mechanism. Testing is done with a weak 8-byte memory test
- Introducing new data when tagging breaks applications with race conditions where tags can overwrite flushed data
2019-01-06 10:44:40 +03:00
kd-11 95245bdd83 rsx: Improve ARGB8->D24S8 casting
- Set up partial transfers
- Force clear of target before starting the transfer
2019-01-06 10:44:40 +03:00
kd-11 475cc99117 rsx: Fix dirty flag reset after a partial attachment initialization
- D24S8 targets have 2 aspects that are dealt with separately; Forcefully initialize the remaining data if a partial init is done. Its 'free' anyway
- It seems that the stencil mask matters when clearing unlike the depth mask and color mask
2019-01-06 10:44:40 +03:00
kd-11 c80c7f06bb rsx: Typo fix
- This silly typo broke the flip improvements in the GT fixes PR
2019-01-06 10:44:40 +03:00
kd-11 2a62fa892b rsx: Texture cache refactor
- gl: Include an execution state wrapper to ensure state changes are consistent. Also removes a lot of required 'cleanup' for helper methods
- texture_cache: Make execition context a mandatory field as it is required for all operations. Also removes a lot of situations where duplicate argument is added in for both fixed and vararg fields
- Explicit read/write barrier for framebuffer resources depending on
  usage. Allows for operations like optional memory initialization before
  reading
2019-01-06 10:44:40 +03:00
kd-11 1ffadbe086 rsx: Reorganize write barrier implementation (either clear or memory barrier) 2019-01-06 10:44:40 +03:00
kd-11 a95a44cf66 rsx: Strictness cleanups
- Also account for variable pitch textures (swizzled scan)
2019-01-06 10:44:40 +03:00