Commit graph

599 commits

Author SHA1 Message Date
kd-11 d167582f6b gl: Implement on-chip buffer-to-d24x8 conversion 2022-05-31 23:34:14 +03:00
kd-11 1b305bf789 gl: Workaround for poor AMD OpenGL performance
- Turns out the AMD driver really hates it if you render with a mapped index buffer.
  The driver internally seems to make a copy of the consumed indices and uses that. Very slow.
  I was able to isolate this after observing that glDrawArrays is not entirely shit, but glDrawElements duration scaled linearly with the number of vertices.
2022-05-31 23:34:14 +03:00
kd-11 60a2a39e88 gl: Deswizzle textures on the GPU 2022-05-31 23:34:14 +03:00
kd-11 3ee27bd434 gl: Optimize consumption of buffer objects when uploading textures 2022-05-31 23:34:14 +03:00
kd-11 e964060a6a gl: Handle texture binding using the global state tracker 2022-05-31 23:34:14 +03:00
kd-11 74696d2e44 gl: Commit to a consistent global state 2022-05-31 23:34:14 +03:00
kd-11 ed2068fb03 gl: Rewrite buffer mapping 2022-05-31 23:34:14 +03:00
kd-11 ec2d529832 rsx: Separate loop interrupts from graphics state
- The interrupts are for multithreaded signals and make the main loop run more aggressively for the next cycle
2022-05-20 16:29:27 +03:00
kd-11 93d93b4805 rsx: Fix typo 2022-05-20 16:29:27 +03:00
kd-11 e368453751 rsx: Rework loop interrupts a bit
- Reset backend interrupt in core handler
- Separate memory config interrupt from regular backend interrupt
2022-05-20 16:29:27 +03:00
kd-11 9a1e6cc3e8 rsx: Implement RSX reports area access detection and optimize around it
- If nobody is reading RSX reports, do not be in a hurry to write them
- Requires HLE of some methods (cellGcmGetTimestamp) to function correctly
2022-05-20 16:29:27 +03:00
kd-11 96742852eb Fix OGL 2022-03-26 16:10:18 +03:00
kd-11 de0e660d28 rsx: Handle vertex shaders with no constant references
- If no vc[] refs exist, do not upload anything!
2022-03-26 16:10:18 +03:00
kd-11 d057ffe80f rsx: Fix program generation and compact referenced data blocks 2022-03-26 16:10:18 +03:00
kd-11 9a2d4fe46b rsx: Relocatable transform constants 2022-03-26 16:10:18 +03:00
kd-11 bc7ed8eaab rsx/vk: Rework MSAA implementation 2022-03-17 22:02:20 +03:00
kd-11 1943d9819f rsx: Clean up surface cache routines around RTT invalidate 2022-03-10 20:43:58 +03:00
kd-11 6812fa4764 rsx: Fix surface write coherency when MSAA is active 2022-03-08 22:06:26 +03:00
kd-11 f923eaf09a rsx: Surface format remapping enhancements 2022-01-17 10:28:23 +03:00
kd-11 99fc90648b gl: Disable shader interpreter if hardware does not support bindless textures 2021-10-27 12:30:43 +03:00
kd-11 2e22a0d9bb rsx: Optimize thread self-tests 2021-09-28 17:43:15 +03:00
kd-11 3e09b97f58 rsx: Minor optimization; avoid preparing unused vertex streams
- Also discards unused program state variables
2021-09-28 17:43:15 +03:00
kd-11 dabfce5c82 rsx: Rework how depth/stencil initialization+clear works 2021-09-21 19:53:46 +03:00
kd-11 472efc08eb rsx: Implement precise ZCULL stats 2021-09-06 20:04:03 +03:00
Nick Renieris 47e784d5d0 gl/vk: Scale line width & point size by resolution scaling 2021-08-17 19:29:46 +03:00
kd-11 99b6963fab rsx: Improve unnormalized coordinate sampling
- Improve rounding when sampling nearest neighbour. This is mostly a problem with NVIDIA
- Implement unnormalized 3D sampling
2021-08-03 00:36:04 +03:00
kd-11 2c7c1c501d rsx: Implement support for extended vertex programs
- Some games are kinda pushing it with RSX register space and spilling VP data into adjacent unused space.
2021-06-28 10:52:05 +03:00
kd-11 cd8cb9cced rsx: Don't leak data during partial clears
- Partial clears either in active clear channels or scissor region must get barrier inserts to load previous data.
- Fixes some incorrectly discarded data during clear where data in untouched/uninitialized channels is lost.
2021-06-25 14:45:36 +03:00
kd-11 6ac9e6f9c4 gl: Add some debug visualization to internally verify consistency 2021-06-05 21:02:14 +03:00
kd-11 c5a06dab0a rsx: Refactor program texture state handling to be persistent across shader swaps 2021-05-15 23:51:12 +03:00
Megamouse a50be7a912 GL: resharper findings (too lazy for const functions) 2021-04-30 08:23:16 +02:00
kd-11 8b0e1d6c03 rsx: Make renderdoc compatibility mode a general option 2021-04-28 16:53:02 +03:00
Nekotekina c8fefc4434 Fix -Wpessimizing-move (Clang) 2021-02-18 14:38:56 +03:00
kd-11 195fb1cf66 rsx: Improve texture cache invalidate
- Bunch of improvements
- Properly signal renderer to rebind textures!
- TODO: Range checks, should be pretty easy
2021-02-10 11:37:14 +03:00
Nekotekina bd269bccaf types.hpp: remove intrinsic includes
Replace v128 with u128 in some places.
Removed some unused files.
2020-12-21 21:11:25 +03:00
Nekotekina eec11bfba9 Move align helpers to util/asm.hpp
Also add some files:
GLTextureCache.cpp
VKTextureCache.cpp
2020-12-18 18:07:42 +03:00
kd-11 fb1c790350 rsx: Make debug overlay dynamic 2020-12-16 10:10:06 +03:00
kd-11 f83c2f0b6b rsx: Restructure and simplify some header include chains 2020-12-13 15:38:35 +03:00
Nekotekina 36c8654fb8 Remove HERE macro
Some cleanup.
Add location to some functions.
2020-12-10 12:30:22 +03:00
Nekotekina e055d16b2c Replace verify() with ensure() with auto src location.
Expression ensure(x) returns x.
Using comma operator removed.
2020-12-09 15:43:38 +03:00
kd-11 3a0b3a85a5 rsx: Separate program environment state from program ucode state
- Allows for conservative texture uploads
- Allows updating a program object without running full ucode analysis for no reason
2020-12-07 00:45:27 +03:00
RipleyTom af8c661a64 Remove BOM markers 2020-12-06 15:30:12 +03:00
kd-11 cab4c78b7b rsx: Some shader compiler threads tuning
- Allow more threads for wide CPUs
- Simplify 'auto' selection a bit
2020-11-21 20:43:15 +03:00
kd-11 7553429130 gl: Thread shader source compilation dispatch
- glCompileShader is in itself much slower than anticipated
2020-11-21 20:43:15 +03:00
kd-11 3ddfa288cf rsx: Use multithreaded shader compiler backend 2020-11-21 20:43:15 +03:00
Nekotekina 71f1021648 Fix thread pool entry point and get_cycles()
Fix possible race between thread handle availability.
Don't treat zero thread as invalid one.
Now entry point is fully in assembly.
Attempt to fix #9282
Also fix some TLS.
2020-11-21 17:18:42 +03:00
kd-11 0e7a705254 rsx: Resolution scaling overhaul
- Enforce square pixels instead of per-axis scaling
2020-11-18 09:29:34 +03:00
Nekotekina ba5ed5f380 Fix vm::lock_range wrong check
Minor header refactoring.
2020-11-04 14:59:26 +03:00
kd-11 a5ac5a9861 rsx: Separate uint depth formats from float depth formats 2020-08-27 12:52:28 +03:00
kd-11 05dc6ad610 gl: Silence warnings 2020-07-05 16:58:44 +03:00
kd-11 5ea6535fd5 rsx: Force flushing of NaN/INF to zero
- This option was always enabled for NVIDIA cards, but it seems some games would benefit from the option on other GPUs as well.
- TODO: Hwtest to verify correct behavior and plan how to safely implement in hw
2020-06-26 09:24:15 +03:00
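
To illustrate what flushing NaN/INF to zero means for a computed value, here is a minimal Python sketch. The names `flush_non_finite` and `flush_vector` are hypothetical and not taken from the emulator. The real option acts on shader values on the GPU, which this CPU-side sketch does not model.

```python
import math

def flush_non_finite(value: float) -> float:
    """Return 0.0 for NaN or +/-infinity, otherwise the value unchanged."""
    # Hypothetical helper for illustration only, not the emulator's code.
    return value if math.isfinite(value) else 0.0

def flush_vector(components):
    """Apply the flush to each component of a vector-like sequence."""
    return [flush_non_finite(c) for c in components]
```

Finite inputs pass through untouched, so only results that are already invalid are replaced.
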
kd-11 c6a9a5d5d7 rsx/fixup: Fix color clear logic
- Enable fast clears on ABGR formats in vulkan
- Fix disabling color clears for unsupported formats in GL
2020-06-23 12:15:02 +03:00
kd-11 7f917c8ba5 rsx: Fix ABGR decoding for colormask and clear color
- The bytes in these values are based on the format according to hw tests
- G8B8 is unaffected as the first two bytes are already G8B8 for A8R8G8B8 standard layout (BGRA)
- A8B8G8R8 and its derivatives have words 0 and 2 exchanged.
2020-06-22 20:12:41 +03:00
kd-11 8d8fb4a2e4 rsx: Remove ARGB->D24S8 conversion shader which has been deprecated for years since compute capabilities were added to the emulator 2020-06-15 14:18:12 +03:00
kd-11 1677618c75 rsx: Implement stippled rendering 2020-05-30 14:47:10 +03:00
Eladash 3d20ce70f5 rsx: Fix possible case of NULL zcull_ctrl in on_exit() 2020-05-28 11:56:02 +02:00
Nekotekina cda8b3a59e RSX: fix new warnings 2020-05-01 22:00:57 +03:00
Megamouse 8f0af6a6fe rsx/interpreter: merge shader settings
- merge disable_asynchronous_shader_compiler and interpreter_mode
- removes disable_asynchronous_shader_compiler setting
- Adds the resulting settings as radio buttons to the gui tab
2020-04-30 15:02:59 +03:00
kd-11 2281c4f662 Fix build 2020-04-30 15:02:59 +03:00
kd-11 bc5c4c9205 rsx/gl: Implement variable path interpreter for optimal performance 2020-04-30 15:02:59 +03:00
kd-11 930bc9179d rsx/interpreter: Improve instructions support
- Must statically write the gl_ClipDistance registers else you get uninitialized trash.
  This problem is more readily apparent on NVIDIA technology but even AMD is not completely immune.
2020-04-30 15:02:59 +03:00
kd-11 0072df7f20 rsx/gl: Add basic interpreter support to OGL
- Adds basic interpreter functionality.
- Flow control and other instructions not yet implemented.
2020-04-30 15:02:59 +03:00
scribam 2e397e38a4 Typos 2020-04-14 17:06:58 +03:00
scribam f37adc4188 Add fallthrough attribute 2020-04-14 17:06:58 +03:00
kd-11 b301fecfd8 gl: Fix async shader compiler
- Removes glFinish hack.
- Adds proper server-side synchronization.
- Adds primary context detection to allow worker threads to be identified.
2020-04-05 16:35:20 +03:00
kd-11 4965bf7d7a gl/vk: Refactor draw call handling and stub shader interpreter
- Refactors backend draw call management to make it easier to extend
  functionality.
- Stubs shader interpreter functionality.
2020-03-23 14:47:28 +03:00
kd-11 2985a39d2e rsx: Rewrite async decompiler 2020-03-09 14:59:25 +03:00
Megamouse ee46ad1ca9 move overlays code to headers 2020-02-26 23:43:18 +01:00
Megamouse fe75311be2 move config structs to own files and clean up some headers 2020-02-17 15:08:17 +03:00
Eladash b7043ce000 Make rsx::get_address report caller location 2020-02-08 22:18:56 +03:00
Nekotekina c0f80cfe7a Use attributes for LIKELY/UNLIKELY
Remove LIKELY/UNLIKELY macro.
2020-02-05 10:42:34 +03:00
Nekotekina 15391f45d0 Modernize RSX logging (rsx_log variable) 2020-02-01 11:52:22 +03:00
kd-11 7453e46a7c rsx: Refactor out complex present code into separate files
- Also restructures present code to have image lookup in a separate
re-usable function.
2020-01-18 19:52:52 +03:00
Megamouse 5e7d25ad35 overlays: refactor shader loading dialogs 2020-01-03 14:22:40 +01:00
Megamouse c4b4ce46b8 cellSaveData: don't pause apps during dialogs 2019-12-29 14:22:58 +01:00
kd-11 e1b734fd12 rsx: Fix linux build 2019-12-29 13:49:46 +03:00
kd-11 5be7f08965 rsx: Restructure ZCULL report retirement
- Prefer lazy retire model. Sync commands are sent out and the reports will be
  retired when they are available without forcing.

- To make this work with conditional rendering, hardware support is
  required where the backend will automatically determine visibility by
  itself during rendering.
2019-12-29 13:49:46 +03:00
kd-11 8dfea032f2 rsx: Remove deprecated do_method path that has been superseded by C++ inheritance for many years 2019-12-29 13:49:46 +03:00
Nekotekina 377e7d2a73 C-style cast cleanup VI 2019-12-04 17:56:22 +03:00
Emmanuel Gil Peyrot f76720ceb0 Remove extraneous ::narrow<int>() calls
GSL’s gsl::span didn’t use the correct type for its index_type, which is
why they were needed.
2019-11-09 19:30:06 +01:00
kd-11 63bbf11a76 vk: Add video out calibration pass
- Adds gamma correction and RGB range filters to output to match PS3
2019-10-31 14:43:24 +03:00
kd-11 35794dc3f2 vk: Add checks for alphaToOne support
- This feature is very rarely used, as alphaToCoverage is commonly used as a replacement for blending, not in addition to it.
2019-10-30 01:06:28 +03:00
kd-11 f7842b765f rsx: Implement packed format renormalization
- Renormalizes arbitrary N-bit values as 8-bit normalized.
- NV hardware performs integer normalization at 8 bits if the size is less than 8.
- This can cause significant arithmetic drift because the error is multiplied by a huge number when sampling.
2019-10-22 13:44:49 +03:00
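
The arithmetic behind the renormalization described above can be sketched as follows. This is one reading of the commit message, assuming that a field narrower than 8 bits is divided by 255 when sampled and then has to be rescaled so that its own maximum maps to 1.0. The function names `sample_as_8bit` and `renormalize` are hypothetical. The actual implementation lives in shader code and may differ.

```python
def sample_as_8bit(raw: int) -> float:
    """Model hardware that normalizes any sub-8-bit field by 255."""
    return raw / 255.0

def renormalize(sample: float, bits: int) -> float:
    """Rescale a sample so the field's maximum (2**bits - 1) maps to 1.0."""
    # Assumption for this sketch: only fields of 1 to 8 bits are handled.
    if not 1 <= bits <= 8:
        raise ValueError("bits must be in 1..8")
    return sample * 255.0 / ((1 << bits) - 1)
```

For a 5-bit field the correction factor is 255/31, roughly 8.2, so any error already present in the sampled value is scaled up by the same amount.
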
kd-11 eee2237e19 rsx: Track uncached cache resources
- Uncacheable resources can be reused as soon as they're made visible to the draw call.
- Since they're likely to be reused every draw call until the shader changes, it is important to reuse as much as possible
2019-10-18 14:46:37 +03:00
kd-11 27f48fbc06 gl: Rewrite image transfer operations to support image subregions
- Working exclusively with full sized images is very expensive
2019-10-13 19:00:05 +03:00
kd-11 105d4b51e6 gl: Use compute shaders for typeless texture decode 2019-10-13 19:00:05 +03:00
kd-11 7a6e2e716f gl: Add a framework for compute shaders 2019-10-13 19:00:05 +03:00
kd-11 2275259bf5 rsx: Properly scale overlay passes to match drawable area 2019-09-28 13:24:14 +03:00
kd-11 e0005ec347 rsx: Refactoring and improvement
- Separate displayed statistics from actual backend statistics.
  Allows asynchronous flipping to work correctly as it just uses display stats.
  The real stats are used by the frame scope marker to determine behavior like engaging the FIFO optimizer or skipping draw calls correctly.
2019-09-19 23:10:09 +03:00
kd-11 2962e05f26 rsx: Implement per-RTT color masks
- Also refactors and simplifies some common code in surface store and rsx core
2019-08-27 21:59:02 +03:00
kd-11 27aeaf66bc gl: Restructure buffer objects to give more control over usage
- This allows creating buffers with no MAP bits set which should ensure they are created for VRAM usage only
- TODO: Implement compute kernels to avoid software fallback mode for pack/unpack operations
2019-08-27 21:59:02 +03:00
Nekotekina d2eba2387b Use g_fxo for display_manager 2019-08-27 03:50:15 +03:00
Nekotekina 928719b658 Use g_fxo for rsx::avconf 2019-08-27 03:50:15 +03:00
kd-11 3317e13b64 rsx: Hotfix for semaphore timeout bug
- Add pending flip requests as a reason to invoke the RSX local task handler and release the vblank semaphore
2019-08-26 22:33:29 +03:00
kd-11 3e28e4b1e0 rsx/decompiler: Restructure program register behavior
- Fix reading of varying registers in FP
  Different registers have different behavior
- Always write to varying registers. If a register is not written to, it is initialized to (0, 0, 0, 1)
- Reimplements two-sided lighting correctly without hacks
- Also bumps shader cache version
2019-08-26 20:03:31 +03:00
kd-11 9d981de96d rsx: Fix offloader deadlock
- Do not allow offloader to handle its own faults. Serialize them on RSX instead.
  This approach introduces a GPU race condition that should be avoided with improved synchronization.
- TODO: Use proper GPU-side synchronization to avoid this situation
2019-08-25 22:09:20 +03:00
kd-11 ca8b0da141 gl: Invalidate range before reading to prevent deadlock 2019-08-21 21:17:15 +03:00
kd-11 5e299111cc rsx/vk: Restructure surface access barriers and implement RCB/RDB
- Implements render target data load (aka Read Color Buffer/Read Depth Buffer)
- Refactors vulkan surface barrier to be much cleaner.
- Removes redundant surface barrier invocations after doing a merged load
  from surface cache.
- Adds explicit access modes when gathering surfaces from cache.
2019-08-18 20:45:48 +03:00
kd-11 dfe709d464 rsx: Surface cache restructuring
- Further improve aliased data preservation by unconditionally scanning.
  It is possible for cache aliasing to occur when doing memory split.
- Also sets up for RCB/RDB implementation
2019-08-18 20:45:48 +03:00
RipleyTom 87bf0386c4 Screenshot function 2019-08-14 19:24:42 +02:00