142 commits

Author SHA1 Message Date
kd-11 af9e217fa4 vk: Improve D16F handling
- Adds upload and download routines. Mostly untested, which is why the error message exists
2020-08-30 09:26:37 +03:00
kd-11 e8274d5a59 vk: Fix depth format mismatch detection in copy_image 2020-08-29 02:03:09 +01:00
kd-11 d257ba5156 vk: Add some more diagnostic messages for unoptimized image transfer setups 2020-08-27 12:52:28 +03:00
kd-11 65ead08880 rsx: Refactor and improve image memory manipulation routines 2020-08-27 12:52:28 +03:00
kd-11 f6c6c04648 vk: Implement transport for D24S8_FLOAT data 2020-08-27 12:52:28 +03:00
kd-11 faaf28b41d rsx: Basic support for creating depth float formats 2020-08-27 12:52:28 +03:00
kd-11 b41349546c rsx: Proper support for typeless transform of ABGR framebuffers using the RGBA8 format 2020-08-12 20:19:19 +03:00
kd-11 b437794e92 vk: Improve nvidia speedhack for non-turing cards
- Inverts the chip family check to skip any unidentified GPUs altogether
2020-06-28 22:54:58 +03:00
kd-11 d25ba03e82 vk: Lazy evaluate renderpass scope
- Spamming the driver with renderpass open/close cycles is bad for performance.
2020-03-15 18:39:40 +03:00
Nekotekina Aux1 f2f3321952 Fix warnings in VKGSRender 2020-03-04 21:23:34 +03:00
gamerforEA 93552a5958 Apply some Clang-Tidy fixes 2020-02-27 00:38:55 +03:00
Nekotekina c0f80cfe7a Use attributes for LIKELY/UNLIKELY
Remove LIKELY/UNLIKELY macro.
2020-02-05 10:42:34 +03:00
Nekotekina 15391f45d0 Modernize RSX logging (rsx_log variable) 2020-02-01 11:52:22 +03:00
kd-11 0a2b6a290d vk: Fixup
- Scaling is not needed for a direct typeless transfer!
2020-01-17 14:31:14 +03:00
kd-11 9b34f00241 vk: Optimize image transfers
- Adds the same optimization/simplification steps to complex image
transfer routines. Whenever possible, multi-step transfers are collapsed
into a single operation.
2020-01-16 22:29:26 +03:00
kd-11 621fab2ad9 vk: Fix D32S8 interpolation by using integer interpolation instead of floating point
- Interpolating floats is not the same as interpolating their bits!
  Use integer format to interpolate linearly for D32F formats instead of using R32F as intermediary
2020-01-16 11:12:08 +03:00
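The distinction drawn in the commit above can be illustrated with a small standalone sketch. It is not taken from the renderer's sources: `float_bits`, `bits_float`, `midpoint_by_value` and `midpoint_by_bits` are hypothetical helpers, and IEEE-754 binary32 floats are assumed. It only shows that averaging two floats and averaging their raw bit patterns generally give different results.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");

// Reinterpret a float as its raw 32-bit pattern (assumes IEEE-754 binary32).
inline std::uint32_t float_bits(float f)
{
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    return u;
}

// Reinterpret a raw 32-bit pattern as a float.
inline float bits_float(std::uint32_t u)
{
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
}

// Linear midpoint of the numeric values.
inline float midpoint_by_value(float a, float b)
{
    return (a + b) * 0.5f;
}

// Linear midpoint of the bit patterns, treated as unsigned integers.
// The sum is widened to 64 bits so it cannot overflow.
inline float midpoint_by_bits(float a, float b)
{
    const std::uint64_t sum = std::uint64_t{float_bits(a)} + float_bits(b);
    return bits_float(static_cast<std::uint32_t>(sum / 2));
}
```

For 1.0f (0x3F800000) and 4.0f (0x40800000), `midpoint_by_value` returns 2.5f, while `midpoint_by_bits` returns 2.0f (0x40000000).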
kd-11 086ecf4ba6 vk: Add some missing image memory barriers causing artifacting on AMD cards
- There needs to be a memory barrier after each step.
- TODO: Optimize scale_typeless_safe function
2020-01-16 11:12:08 +03:00
kd-11 3d96fe79cc vk: Implement dynamic sized compute heap
- Implements a dynamically sized compute heap that can grow if it is
too small.
2020-01-15 15:42:36 +03:00
Nekotekina 377e7d2a73 C-style cast cleanup VI 2019-12-04 17:56:22 +03:00
kd-11 fd751e3e7b rsx: Improve blit format mismatch detection 2019-11-19 13:18:15 +03:00
kd-11 4a0e1c79ed rsx: Improve format validation for blit engine
- Check all possible cases where format mismatch is possible.
- Warn if a slow path is going to be taken. Should help with future
optimizations.
2019-11-18 13:17:00 +03:00
kd-11 c415578e79 vk: Clamp buffer row length to never be less than declared width
- Fixes some games with broken textures
2019-11-18 13:17:00 +03:00
Emmanuel Gil Peyrot f76720ceb0 Remove extraneous ::narrow<int>() calls
GSL’s gsl::span didn’t use the correct type for its index_type, which is
why they were needed.
2019-11-09 19:30:06 +01:00
Emmanuel Gil Peyrot ef368c5171 rsx: Replace gsl::byte with C++17’s std::byte 2019-11-09 19:30:05 +01:00
kd-11 99d71fdc2a vk: Implement layer batching for the GPU swizzle decoder
- Handles all LODs per layer meaning cubemaps are now fully handled in 6 passes instead of 6 * (log2(width)) passes.
- Handles all LODs of a 3D texture in one pass as well.
- The improvements do warrant dropping down the number of allowed compute invocations a bit
2019-11-05 22:07:22 +03:00
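The pass counts quoted in the commit above follow from simple arithmetic, sketched here under an assumed cost model of one compute pass per (layer, LOD) pair before batching and one pass per layer afterwards. Both function names are hypothetical, and the LOD count is passed in directly rather than derived from the texture width.

```cpp
#include <cstdint>

// Unbatched (assumed model): one compute pass for every (layer, LOD) pair.
constexpr std::uint32_t passes_unbatched(std::uint32_t layers, std::uint32_t lods)
{
    return layers * lods;
}

// Batched per layer: a single pass covers every LOD of that layer.
constexpr std::uint32_t passes_batched(std::uint32_t layers)
{
    return layers;
}

// A cubemap has 6 layers; with 9 LODs the unbatched count is 54 passes.
static_assert(passes_unbatched(6, 9) == 54, "6 layers * 9 LODs");
static_assert(passes_batched(6) == 6, "one pass per cubemap face");
```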
kd-11 1266b63135 vk: Enable gpu deswizzling 2019-11-05 22:07:22 +03:00
kd-11 9cd3530c98 rsx: Set up framework for hw deswizzle 2019-11-05 22:07:22 +03:00
kd-11 aa3eeaa417 rsx: Separate subresource_layout::dim_in_block and
subresource_layout::dim_in_texel

- These two are not always linked when working with compressed textures.
The stored texels extend past the actual size of the image if the size
is not block-aligned. E.g. if height is 1, the stored height is 4, but it
is not possible to recover the original height from the aligned size: it
could be 1, 2, 3 or 4.
- Fixes image out-of-bounds writes when uploading from CPU
2019-10-29 20:03:54 +03:00
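The reason the two dimensions have to be tracked separately can be sketched as follows. This is an illustration only, assuming 4x4 texel blocks as used by DXT formats; `blocks_for` and `aligned_texels` are hypothetical helpers, not the fields or functions touched by the commit.

```cpp
#include <cassert>
#include <cstdint>

// Number of compressed blocks needed to cover `texels` along one axis.
// `block_dim` is the block edge length in texels (4 is assumed for DXT).
inline std::uint32_t blocks_for(std::uint32_t texels, std::uint32_t block_dim)
{
    assert(block_dim != 0);
    return (texels + block_dim - 1) / block_dim; // round up
}

// Texel extent actually covered by those blocks (the aligned size).
inline std::uint32_t aligned_texels(std::uint32_t texels, std::uint32_t block_dim)
{
    return blocks_for(texels, block_dim) * block_dim;
}
```

Heights of 1, 2, 3 and 4 texels all map to one block and an aligned height of 4, so the texel dimension cannot be reconstructed from the block dimension alone.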
kd-11 ee0633f43a vk: Add turing workaround
- Turing crashes if using the depth->color transfer hack
2019-09-26 20:12:25 +03:00
kd-11 cc313b052f rsx: Improve hit testing when scanning for overlapping surfaces
- Calculate exact sizes when doing hit tests to avoid false negatives
- Defer page checking until actually required to do memory setup
- Introduce align2 helper to do non-pow2 alignments
2019-09-12 23:32:21 +03:00
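The difference between power-of-two and arbitrary alignment mentioned in the commit above can be sketched as below. The signature of the real align2 helper is not shown in this log, so `align_up_pow2` and `align_up_any` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Round `value` up to a power-of-two `alignment` using a bit mask.
// The mask trick is only valid when `alignment` is a power of two.
inline std::uint64_t align_up_pow2(std::uint64_t value, std::uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Round `value` up to any non-zero `alignment` using integer division.
// Works for non-power-of-two alignments at the cost of a divide.
inline std::uint64_t align_up_any(std::uint64_t value, std::uint64_t alignment)
{
    assert(alignment != 0);
    return ((value + alignment - 1) / alignment) * alignment;
}
```

With an alignment of 6, `align_up_any(10, 6)` gives 12; the mask-based variant cannot express that alignment at all.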
kd-11 858014b718 rsx: Experiments with nul sink 2019-09-12 23:32:21 +03:00
kd-11 d1603fbb0b vk: Crop malformed image descriptors
- Some image descriptors (lle vdec?) are malformed with pitch being smaller than width
- Crop these for now pending hardware tests
2019-09-08 18:22:27 +03:00
kd-11 cbce309199 vk: Fix depth_stencil scaling 2019-09-08 13:56:41 +03:00
kd-11 440d58f2ff vk: Batch compute jobs when doing texture upload
- Reduces overall number of invocations
2019-09-07 16:23:20 +03:00
kd-11 6aa0b49dbc vk: Prefer using native alignment when uploading.
- Allows using fast copy paths and reduces memory and compute footprint
2019-09-07 16:23:20 +03:00
kd-11 99fb6d6a5d rsx: Allow GPU-accelerated stream manipulation when doing texture uploads 2019-08-30 21:46:19 +03:00
kd-11 141072023b rsx: Fix handling of ARGB8 memory
- Load into memory as straightforward BGRA
- Fixes a bug in vulkan caused by byte shuffling in blit engine vs shader access
- Removes the need for memory shuffling when transferring into a rendertarget
2019-08-21 21:17:15 +03:00
kd-11 dfe709d464 rsx: Surface cache restructuring
- Further improve aliased data preservation by unconditionally scanning.
  It is possible for cache aliasing to occur when doing memory split.
- Also sets up for RCB/RDB implementation
2019-08-18 20:45:48 +03:00
JohnHolmesII 23094b48bb Fix warnings related to -Wswitch
Add default cases.
Move default breaks to newline
Add proper handling in some instances.
Add missing enums to switches
2019-06-28 01:40:52 +03:00
kd-11 a245d9fb24 vk: Double general-purpose heap allocation to 128M and add a better diagnostic message for OOM 2019-05-19 17:33:21 +03:00
kd-11 e3cf3ab6b8 rsx: Minor fixes
- Fix transfer scaling (inverted)
- Fix under-estimated typeless acquisition when doing depth format scaling
2019-05-16 19:25:26 +03:00
kd-11 1c439f6198 vk: Fix some spec violations 2019-05-16 19:25:26 +03:00
kd-11 12dc3c1872 vk: Dynamic heap management to potentially fix ring buffer overflows
- Allows checking one heap type at a time, on demand
- Should avoid OOM situations unless inside an uninterruptible block
2019-04-09 13:40:54 +03:00
kd-11 a5ed30a8c0 rsx: Fixups for data cast operations via typeless transfer 2019-04-09 13:40:54 +03:00
kd-11 3249000511 rsx: Improvements to texture scanning
- Removes CPU-only transforms that broke GPU-side code.
 -- Channels in GPU compute are laid out in cell-order, but CPU was uploading in favorable order and compensating with swizzles.
 -- This leads to 2 different layouts depending on the location of the data (CPU vs GPU)
- Implement R8G8_R8B8 interleaved format decode
- General improvements
2019-04-09 13:40:54 +03:00
kd-11 0f7af391d7 vk: Implement copy-to-buffer and copy-from-buffer for depth_stencil
formats
- Allows D24S8 and D32S8 transport via typeless channels
- Allows uploading and downloading D24S8 data easily
- TODO: Implement optional byteswapping to fix flushed readbacks with
the same method
2019-04-09 13:40:54 +03:00
kd-11 bb65e45614 rsx: Implement GPU acceleration for rotated images 2019-03-17 21:50:11 +03:00
kd-11 0395fb9955 rsx/texture_cache: Addendum - fix data cast with scaling conversion (AA emulation)
- Blit operations do format conversion automatically which is NOT what we want!
- Scale onto temp buffer with similar format before performing data cast.
2019-03-10 16:09:05 +03:00
kd-11 3a071a9c07 rsx: Texture search rewrite
- Perform a full search across all resource types as needed without
taking too many shortcuts/hacks
2019-03-10 16:09:05 +03:00
kd-11 fa9b448686 vk: Spec fixups
- Disable DEPTH<->RGBA typeless transfers for now as they require a lot more work to work for all vendors
- Do not allow switching layouts to UNDEFINED/PREINITIALIZED formats
2019-01-25 14:34:22 +03:00
kd-11 2a62fa892b rsx: Texture cache refactor
- gl: Include an execution state wrapper to ensure state changes are consistent. Also removes a lot of required 'cleanup' for helper methods
- texture_cache: Make execution context a mandatory field as it is required for all operations. Also removes a lot of situations where a duplicate argument is added in for both fixed and vararg fields
- Explicit read/write barrier for framebuffer resources depending on
  usage. Allows for operations like optional memory initialization before
  reading
2019-01-06 10:44:40 +03:00
kd-11 9c45ce6d37 vk: Reimplement typeless memory allocation to handle resolution upscaling 2019-01-06 10:44:40 +03:00
kd-11 15d5507154 rsx: Rewrite memory inheritance transfers
- Implicitly invoke a memory barrier if actively reading from an unsynchronized texture
- Simplify memory transfer operations
- Should allow more games to work without strict mode
2019-01-06 10:44:40 +03:00
kd-11 1ad76ad331 rsx: Restructure programs
- Also re-enable pipeline optimizations
2018-11-30 23:51:25 +03:00
kd-11 c6e35706a3 vk: Support sw component swizzle decode because metal sucks 2018-08-23 22:54:56 +03:00
kd-11 bda65f93a6 vk: Tuning [WIP]
- Unroll main compute queue loop
- Do NOT run GPU cores on mappable memory! This has a dreadful impact on performance for obvious reasons
- Enable dynamic SSBO indexing (affects AMD)
- Make loop unrolling and loop length variable depending on hardware and find optimum
2018-06-26 20:07:20 +03:00
kd-11 5fb4009a07 vk: Add more compute routines to handle texture format conversions
- Implement le D24x8 to le D32 upload routine
- Implement endianness swapping and depth format conversions routines (readback)
2018-06-26 20:07:20 +03:00
kd-11 278cb52f19 facepalm 2018-06-26 20:07:20 +03:00
kd-11 c60f7b89ba vk: Implement safe typeless transfer
- Used to transfer D32S8 data where it makes sense to use this variant
 - On nvidia cards, it is very slow to move aspects from D24S8 probably due to the format being faked.
   For this reason, the unsafe variant is used for both D16 and D24S8 to avoid the heavy performance loss
2018-06-18 17:32:22 +03:00
kd-11 2afcf369ec vk: Add synchronous compute pipelines
- Compute is now used to assist in some parts of blit operations, since there are no format conversions with vulkan like OGL does
- TODO: Integrate this into all types of GPU memory conversion operations instead of downloading to CPU then converting
2018-06-18 17:32:22 +03:00
kd-11 0d5c071eee vk: Implement typeless image transport 2018-06-18 17:32:22 +03:00
kd-11 00eaf39c01 vk: RADV support for depth scaling and transfer 2018-06-08 22:17:50 +03:00
kd-11 fc18e17ba6 vk: Implement depth scaling using hardware blit/copy engines
- Removes the old depth scaling using an overlay.
  It was never going to work properly due to per-pixel stencil writes being unavailable
- TODO: Preserve stencil buffer during ARGB8->D32S8 shader conversion pass
2018-06-08 22:17:50 +03:00
kd-11 0f24379c0e rsx: Obey MSAA resolve during memory persistence transfer
- Ugh. This is a bandaid on a festering wound, AA badly needs a rewrite

 Also silence some warnings
2018-06-08 22:17:50 +03:00
kd-11 91a6091d26 rsx: Minor fixes
- vk: Clear dirty textures before copying 'old contents' in case the old data does not fill the new region
- rsx: Properly decode border color - seems to be in BGRA format
- vk: better approximation of border color to better choose between the presets
- vk: Individually clear color images outside render pass and without scissor
- vk: Fix renderpass selection for clear overlay pass
- vk: Include scissor region when emulating clear mask

NOTES:
- vk: Completely avoid using vkClearXXXXimage - it's 'broken' on nvidia drivers
  Spec is vague about the function so it's not an actual bug
  ClearAttachment is clearly defined as bypassing bound state which works correctly
- TODO: Implement memory sampling to simulate loading precleared memory if cell used memset to preinitialize the framebuffer
  Autoclear depth to 1|255 and color to 0 is hacky!
2018-04-25 19:14:36 +03:00
kd-11 a42b00488d rsx: Texture fixes
- gl/vk: Fix subresource copy/blit
- gl/vk: Fix default_component_map reading
- vk: Reimplement cell readback path and improve software channel decoder
- Properly name the subresource layout field - it's in blocks not bytes!
- Implement d24s8 upload from memory correctly
- Do not ignore DEPTH_FLOAT textures - they are depth textures and abide by the depth compare rules
- NOTE: Redirection of 16-bit textures is not implemented yet
2018-04-25 19:14:36 +03:00
kd-11 9f416e5ce1 rsx/gl/vk: Obey channel remapping on framebuffer resources if requested 2018-03-25 13:31:06 +03:00
pauls-gh fd8d2ecbf4 Remove Volume Texture Compression (VTC) tiling for Vulkan, DX12 and ATI (OpenGL). 2018-03-23 12:01:30 +03:00
kd-11 71f69d1d48 rsx/overlays: Introduce 'native' HUD UI and implement some common dialogs (#4011) 2018-01-17 19:14:00 +03:00
kd-11 ddebc334bf rsx: Fixes
- Discard intentionally invalidated framebuffer resources. These are created after a flush has happened, forcing reupload since contents cannot be guaranteed (strict mode only)
- Fix for blits using vulkan; don't use the copy method if formats do not match, use generic blit instead
2017-12-01 21:00:50 +03:00
kd-11 3499d089e7 rsx: Texture cache fixes and improvements
rsx: Conditional lock hack removed
vulkan - Fixes
- Remove unused texture class
- Fix native pitch calculation (WCB)
rsx: Catch hanging begin/end pairs when flushing deferred draw calls
vulkan: Register DXT compressed formats
vulkan: Register depth formats
gl: Workaround for 'texture stitching' when gathering flip surface
- TODO: Add a proper flip hack option
rsx: Fix texture memory size calculation
- DXT textures don't have real pitch. Since pitch is used to calculate memory size, make sure it always evaluates to rsx_size
rsx: Fix cpu copy detection
rsx: Validate blit dst surface and don't make assumptions about region blit order
- Also relax restrictions on memory owned by the blit engine if strict rendering is not enabled
rsx: Fix depth texture detection
rsx: Do not manually offset into dst. The overlapped range check does so automatically
rsx: Minor optimizations
rsx: Minor fixes
- Fix to detect incompatible formats when using GPU texture scaling and show message
- Better 'is_depth_texture' algorithm to eliminate false positives
2017-09-21 16:17:06 +03:00
kd-11 e37a2a8f7d rsx: Texture cache fixes and improvements
gl/vk/rsx: Refactoring; unify texture cache code
gl: Fixups
- Removes rsx::gl::texture class and leave gl::texture intact
- Simplify texture create and upload mechanisms
- Re-enable texture uploads with the new texture cache mechanism
rsx: texture cache - check if blit region fits into dst texture before attempting to copy
gl/vk: Cleanup
- Set initial texture layout to DST_OPTIMAL since it has no data in it anyway at the start
- Move structs outside of classes to avoid clutter
2017-09-21 16:17:06 +03:00
kd-11 deb590cb05 rsx/vk: Bug fixes
- Make each frame context own its own memory
- Fix GPU blit
- Fix image layout transitions in flip

vk: Improve frame-local memory usage tracking to prevent overwrites
- Also slightly bumps VRAM requirements for stream buffers to help with running out of storage
- Fixes flickering and missing graphics in some cases. Flickering is still there and needs more work
vk: Up vertex attribute heap size and increase the guard size on it
vulkan: Reorganize memory management
vulkan: blit cleanup
vulkan: blit engine improvements
- Override existing image mapping when conflicts detected
- Allow blitting of depth/stencil surfaces
2017-09-21 16:17:06 +03:00
mp-t 607d2486ea Code review (#3114)
* Fix always-true conditions in sceNp module

* gl_render_targets: useless check on unsigned variable, possible bug

* fixed UB in crypto utility functions

* copy-paste error in vk::init_default_resources

* pass strings by const ref

* Don't copy vectors. Make sure copies are not needed because functions are used in a multi-threaded context.
2017-08-01 20:22:33 +03:00
raven02 77f8ce503d RSX texture refactor (#2144) 2016-09-19 09:25:49 +08:00
Nekotekina a7e808b35b EXCEPTION macro removed
fmt::throw_exception<> implemented
::narrow improved
Minor fixes
2016-08-08 19:19:32 +03:00
kd-11 33c59fa51b vk: optionally center/offset images when scaling (#1998) 2016-07-30 10:07:39 +08:00
Vincent Lejeune 2e17ea1490 rsx/common/d3d12/vulkan: Factorise data_heap between vulkan and d3d12. 2016-04-07 22:17:28 +02:00
Vincent Lejeune 69d08b6691 vulkan: Support cube and 1D/3D textures. 2016-03-31 23:50:14 +02:00
Vincent Lejeune 77674be1c1 vulkan: Fix all warnings in VKGSRender project. 2016-03-30 21:16:53 +02:00
Vincent Lejeune 9485fe2693 rsx/common/gl/d3d12/vulkan: Use exact mipmap counts.
Fix invalid textures in gl backend.
2016-03-25 21:37:53 +01:00
Vincent Lejeune 36aace57ca vulkan: Use simpler texture object 2016-03-23 21:09:30 +01:00
Vincent Lejeune a14dd8ea51 vulkan: Move sampler object outside of texture. 2016-03-21 22:10:36 +01:00
Vincent Lejeune 4484e8c3f0 vulkan: Move vk_wrap_mode and max_aniso to vkFormat 2016-03-19 18:12:43 +01:00
Vincent Lejeune 3b3fffa962 vulkan: Remove redundant texture::create/init overloads 2016-03-15 22:03:24 +01:00
Vincent Lejeune 5de70628d7 rsx/common/d3d12/gl/vulkan: Unify texture upload code. 2016-03-14 19:10:51 +01:00
Vincent Lejeune 70a80b84d7 vulkan: Zero initialize as much structure info as possible.
This fixes a crash with nvidia driver in present call (likely because of
some uninitialized member)
2016-03-12 22:22:28 +01:00
kd-11 26964efa7e Support stencil formats
Fix appveyor build
2016-03-10 23:55:25 +03:00
kd-11 b018c91135 Make render-targets GPU resident
Fix minor regressions that occurred during merge
2016-03-10 23:55:25 +03:00
kd-11 d910d2c572 Fix vulkan swap modes for nvidia
CMakeLists edits

Check for linear tiling support for all usage attributes
2016-03-10 23:55:25 +03:00
kd-11 bd52bcf8d4 Fix nvidia crash (API version). Fix linux builds
Properly set up vulkan API version when creating instance

Fix gcc error about passing function result by reference

Fix a lot of warnings in VKGSRender project

More fixes for gcc

Fix texture create function
2016-03-10 23:55:25 +03:00
kd-11 3b6e3fb3b4 Rework vertex upload code and fix indexed renders
Rebase on current master; Refactor vertex upload code

Fix build; Minor fixes

Start preparations for merge

Fix generic indexed drawing bugs

Define WIN32_KHR only for windows

Remove linking against vulkan-1.lib
2016-03-10 23:55:25 +03:00