Commit graph

595 commits

Author SHA1 Message Date
Nekotekina 9c9c2eb2c9 Fix wrong g_fxo->init_crtp name, use just init<> 2020-02-25 14:07:50 +03:00
Nekotekina fa02a04baa Add g_fxo->init_crtp to simplify thread construction 2020-02-25 11:51:41 +03:00
Megamouse f7666f44da Untangle GUI and input includes 2020-02-24 16:31:01 +01:00
Nekotekina 8b4b859091 Remove "thread_ctrl::spawn" 2020-02-23 15:03:38 +03:00
Nekotekina 7069e7265f RSX: move g_dma_manager to g_fxo 2020-02-23 13:12:50 +03:00
Nekotekina 92e3eaf3ff Fix signed-unsigned comparisons and mark warning as error (part 2). 2020-02-19 22:54:58 +03:00
Nekotekina 771eff273b First part of fixing sign-compare warning (inside be_t). 2020-02-19 22:54:58 +03:00
Eladash df8d0cde4a RSX/SPU: Accurate reservation access 2020-02-19 18:11:30 +00:00
Megamouse fe75311be2 move config structs to own files and clean up some headers 2020-02-17 15:08:17 +03:00
Eladash 9344b21484 rsx: Unify FIFO recovery methods
TODO: Maybe consider fifo stack content when recovering.
2020-02-14 17:11:26 +03:00
Eladash 07f300a14e rsx: ZCULL typo fix 2020-02-14 17:11:26 +03:00
Eladash dcb30df7c8 rsx capture: Fix capture recovery after a crash 2020-02-10 21:39:39 +00:00
Eladash bdab26ec09 rsx: rewrite io mappings
Along with some with fixes to cellGcmSys HLE.
2020-02-10 21:39:39 +00:00
Eladash 9d1bb60ad7 cellGcm HLE: fix cellGcmMapMainMemory
Fix arguments order, softcode RsxReports::report offset.
2020-02-08 22:18:56 +03:00
Eladash b7043ce000 Make rsx::get_address report caller location 2020-02-08 22:18:56 +03:00
Nekotekina c0f80cfe7a Use attributes for LIKELY/UNLIKELY
Remove LIKELY/UNLIKELY macro.
2020-02-05 10:42:34 +03:00
Nekotekina 15391f45d0 Modernize RSX logging (rsx_log variable) 2020-02-01 11:52:22 +03:00
Eladash 92466165f6
Increase Maximum Vblank Rate and Clocks Scale
Allow x30 times the speed of vblank rate + clocks scale of original PS3.
In theory a 60 fps limit game which scales frame limit perfectly with vblank rate can be played at up to 1800 fps with this change.

And:
* Fixed lv2 sleep with Clocks Scaling
* Make these settings dynamicaly adjustable.
* Avoid code duplication
2020-01-29 21:42:41 +01:00
Eladash 85695c8bac rsx: FIFO wake-up pause control 2020-01-15 19:54:23 +03:00
Eladash 1ccb3c4492 rsx: Verify local memory offset 2020-01-15 13:23:56 +03:00
Nick Renieris 192912131e rsx: Update vblank count in LLE mode 2020-01-06 22:42:07 +03:00
Eladash 9690854e58 Some cleanup
* Prefer default initializer over std::memset 0 when possible and more readable.
* Use std::format in trophy files name obtaining.
* Use vm::ptr<>::operator bool() instead of comparing vm::ptr to vm::null or using addr().
* Add a few std::memset calls in hle where it matters (or in some places just to document an actual firmware memcpy call).
2019-12-31 22:27:27 +03:00
kd-11 ed2bdb8e0c rsx: Zcull synchronization tuning
- Also fixes a bug where sync_hint would erroneously update the sync tag
  even for old lookups (e.g conditional render using older query)
2019-12-29 13:49:46 +03:00
kd-11 fdb638436f rsx: Add toggle for zcull sync behaviour - Adds a relaxed sync mode where ZCULL reports are lazily nudged into flushing and the main core does not actually wait for the event to finish before proceeding - Can drastically improve performance in cases where the game actually does not utilize the report data 2019-12-29 13:49:46 +03:00
kd-11 cdd9c12132 vk: Emulate conditional rendering for AMD 2019-12-29 13:49:46 +03:00
kd-11 93895838c7 vk: Implement hw conditional rendering 2019-12-29 13:49:46 +03:00
kd-11 a51395370e vk: Implement multithreaded command submission
- A few nagging issues remain, specifically that partial command stream
  largely caused by poor synchronization structures for partial CS flush
  and also the fact that occlusion map entries wait on a command buffer
  and not an EID!
2019-12-29 13:49:46 +03:00
kd-11 5be7f08965 rsx: Restructure ZCULL report retirement
- Prefer lazy retire model. Sync commands are sent out and the reports will be
  retired when they are available without forcing.

- To make this work with conditional rendering, hardware support is
  required where the backend will automatically determine visibility by
  itself during rendering.
2019-12-29 13:49:46 +03:00
Eladash 6a926daee7 rsx: Delay FIFO recovery point creation if is in in_begin_end scope (#7080) 2019-12-12 15:38:56 +03:00
Eladash 7260af032e rsx: Ignore or recover from unknown primitives
This also fixes a bug when recovering FIFO or creating such recovery point inside in_begin_end == true scope.
2019-12-11 00:11:12 +03:00
Nekotekina 377e7d2a73 C-style cast cleanup VI 2019-12-04 17:56:22 +03:00
Megamouse f2b530823b overlays: add dynamic switch for perf overlay 2019-11-27 10:34:03 +01:00
kd-11 c10aa360b1 rsx: Remove more deprecated methods 2019-11-18 13:17:00 +03:00
Megamouse fb96047d2f overlays: add settings for overlay graphs 2019-11-15 14:53:18 +01:00
Megamouse d6b0361a02 overlays: perf_metrics_overlay to seperate header
this is done to prevent severe conflicts with upcoming changes
2019-11-15 14:53:18 +01:00
kd-11 5f39a594ac rsx: Clean up some unused legacy methods unnecessary after d3d removal 2019-11-10 17:53:12 +03:00
Emmanuel Gil Peyrot 56f82d2701 rsx: Wrap gsl::span definition into Utilities/span.h 2019-11-09 20:00:50 +01:00
Emmanuel Gil Peyrot 72cdf0b04c Replace gsl::span’s implementation with tcbrindle’s
This implementation optimises correctly on all relevant compilers,
unlike GSL’s which gave extremely slow code on any compiler other than
MSVC.

Supersedes #6948.
2019-11-09 19:30:06 +01:00
Emmanuel Gil Peyrot ef368c5171 rsx: Replace gsl::byte with C++17’s std::byte 2019-11-09 19:30:05 +01:00
kd-11 7072489a6e rsx: Implement point sprite coordinate generation
- When the point sprite flag is set, overrides the input similar to the
2D mask. The returned X and Y values are always the gl_PointCoord values
for the fragment.
- Stacks with the 2D mask to override the z and w coordinates.
2019-11-09 12:50:53 +03:00
kd-11 8d1505752f rsx: Validate depth test setup to avoid address contention 2019-11-07 11:32:44 +03:00
kd-11 57d3c9e171 rsx: Take empty queries into account for engines that spam report reads.
- Some games will spam the report queue with requests but have zpass
statistics enabled.
2019-11-04 18:48:41 +03:00
kd-11 2a8f2c64d2 rsx: Implement report transfer deferring
- Allow delaying report flushes triggered by image_in or buffer_notify
- When the report is ready, all the delayed transfers will automatically
be done.
- TODO: Make this configurable?
2019-11-04 18:48:41 +03:00
kd-11 51e0eaaddc rsx: Implement backend notification for upcoming zcull reads 2019-11-04 18:48:41 +03:00
Eladash 945abcc6cd rsx: Align down index array offset
* Also use improved to_be_t<> template (recetly ignoring one byte long types) for vm gsl::byte referencing, remove redundent narrow<> cast (same type)
2019-10-22 13:45:09 +03:00
kd-11 901942f24a rsx: Replace pointless f32[4] restriction on texture parameters.
- Use a struct instead to improve readability and remove pointless OpBitCast
2019-10-22 13:44:49 +03:00
kd-11 f7842b765f rsx: Implement packed format renormalization
- Renormalizes arbitrary N-bit values as 8-bit normalized.
- NV hardware performs integer normalization at 8 bits if the size is less than 8.
- This can cause significant arithmetic drift because the error is multiplied by a huge number when sampling.
2019-10-22 13:44:49 +03:00
Eladash 29cddc30f0 rsx: Fix vblank signals flood after Emu.Resume() 2019-10-21 15:31:45 +03:00
Eladash 5de0005f5a rsx: Report full method range on invalid methods
Also report full command on fifo desync event for the first time
2019-10-21 15:31:45 +03:00
eladash 730e9cde84 sys_rsx: Improve allocations and error checks
* allow sys_rsx_device_map to be called twice: in this case the DEVICE address retrived from the previous call returned
* Add ENOMEM checks for sys_rsx_memory_allocate and sys_rsx_context_allocate
* add EINVAL check for sys_rsx_context_allocate if memory handle is not found
* Separate sys_rsx_device_map allocation from sys_rsx_context_allocate's
* Implement sys_rsx_memory_free; used by cellGcmInit upon failure
* Added context_id checks
* Throw if sys_rsx_context_allocate was called twice.
2019-10-21 15:31:45 +03:00
Eladash 06017cb14e rsx: Recover from invalid writes to CELL_GCM_NV4097_SET_INDEX_ARRAY_DMA
Also: Trigger a FIFO recovery when encountering an invalid method.
2019-10-10 19:34:23 +03:00
Eladash 0b2fa6ffdc rsx: Flush FIFO GET before smeaphore_acquire 2019-09-30 17:30:15 +03:00
Eladash 822287b418 rsx: Avoid unsigned/signed mismatch with fifo ret addr 2019-09-29 13:05:24 +03:00
kd-11 1464069476 rsx: Restructure deferred flip queue handling
- Allows frameskipping to occur naturally if RSX thread is bombarded with flip requests but just jumping to the last one if possible
- See request_emu_flip() for async frame submission and implicit skipping
- Also allows display queue to fill faster than the flip thread can drain the queue
2019-09-28 21:13:56 +03:00
Nekotekina bd1a24b894 Tidy endianness support (se_t) implementation
Move se_t and se_storage to util/endian.hpp
Use single template instead of two specializations.
Add minor optimization for MSVC.
Remove v128 dependency.
Try to enable intrinsics for unaligned data.
Fix minor bug in u16/u32/u64 specializations.
2019-09-28 15:39:50 +03:00
Nekotekina 5f9c5e8765 Use g_fxo for rsx::thread 2019-09-26 23:26:36 +03:00
kd-11 1a892c6b1b rsx: Avoid recursion in flip handler 2019-09-20 15:08:41 +03:00
kd-11 e0005ec347 rsx: Refactoring and improvement
- Separate displayed statistics from actual backend statistics.
  Allows asynchronous flipping to work correctly as it just uses display stats.
  The real stats are used by the frame scope marker to determine behavior like engaging the FIFO optimizer or skipping draw calls correctly.
2019-09-19 23:10:09 +03:00
kd-11 2c76f47eec rsx: Restructure flip code and frame scoping
- Add an explicit frame scope marker tied in with the queue_prepare command
  Since queue_prepare is emitted at the end of a frame, it can be used as end-of-frame in games that emit this
- If this command is not emitted, fifo flatenner and frameskip will not work
2019-09-19 23:10:09 +03:00
kd-11 52e8747b83 rsx: Workaround for exit deadlock
- Avoids games locking up when the stop button is pressed
2019-09-12 23:32:21 +03:00
kd-11 7f99de36c1 rsx: Fixup for surface_target_a flag being broken
- While the mask for surface_a is at index 0, the surface cache expects the order to be maintained correctly!
  Set the correct mask since surface store now checks each RTT individually
2019-08-30 21:46:19 +03:00
kd-11 e0a7912d7c rsx: Check for stencil writes when determining zeta_write flag 2019-08-30 21:45:41 +03:00
kd-11 04c808b8ab rsx: Fixup for MRT color write lookup and surface_target_a 2019-08-28 16:12:10 +03:00
kd-11 2962e05f26 rsx: Implement per-RTT color masks
- Also refactors and simplifies some common code in surface store and rsx core
2019-08-27 21:59:02 +03:00
Nekotekina d2eba2387b Use g_fxo for display_manager 2019-08-27 03:50:15 +03:00
Nekotekina 38a06c4b14 Use g_fxo for SysRsxConfig
Rename to lv2_rsx_config
2019-08-27 03:50:15 +03:00
kd-11 3e28e4b1e0 rsx/decompiler: Restructure program register behavior
- Fix reading of varying registers in FP
  Different registers have different behavior
- Always write to varying registers. If a register is not written to, it is initialized to (0, 0, 0, 1)
- Reimplements two-sided lighting correctly without hacks
- Also bumps shader cache version
2019-08-26 20:03:31 +03:00
kd-11 f9aea076ae rsx: Implement depth_buffer_float support.
- Since this is transparent to the application at all time, it only becomes a problem when doing memory transfer or DEPTH->RGBA conversion in shaders.
2019-08-26 20:03:31 +03:00
kd-11 9d981de96d rsx: Fix offloader deadlock
- Do not allow offloader to handle its own faults. Serialize them on RSX instead.
  This approach introduces a GPU race condition that should be avoided with improved synchronization.
- TODO: Use proper GPU-side synchronization to avoid this situation
2019-08-25 22:09:20 +03:00
kd-11 f0bd0b5a7c rsx: Conditional render sync optimization
- ZCULL queue was updated to one-per-cb but the conditional render sync hint was not updated.
- Do not unconditionally flush the queue unless the upcoming ref is contained in the active CB.
- This avoids spamming queue flush, which frees up resources and improves performance
2019-07-30 21:13:42 +03:00
Eladash 230c3d55b6 Fixup 2019-07-27 04:03:29 +01:00
Eladash fcc75c8b0f rsx: Write atomically semaphore updates and fix zcull timestamp 2019-07-26 21:27:55 +03:00
Eladash c53f0dd7b5 rsx: Fix gcm unmap events 2019-07-26 21:27:55 +03:00
Eladash 85b1152e29 Timers scaling and fixes 2019-07-23 00:09:01 +01:00
kd-11 9a7c2784f0 rsx: Do not clip scissor to viewport when doing buffer clear 2019-07-20 16:39:32 +03:00
kd-11 e2574ff100 rsx: Support CSAA transparency without multiple rasterization samples enabled 2019-07-19 15:49:08 +03:00
kd-11 b5a2f0df68 rsx: Implement separate viewport raster clipping
- Merge viewport raster window and scissor into one clipping region
- Viewport raster clip is different from viewport geometry clipping in
hardware as the latter is configurable separately
2019-07-19 14:21:19 +03:00
Eladash c4d8ef4340 rsx: Allow to configure vblank rate
Removed "HLE protection" hack from sys_rsx_context_attribute
2019-07-12 00:19:56 +03:00
kd-11 d8f753f1e8 rsx: Do not allow framebuffer surfaces that exceed their allocated pitch dimensions
- Truncate surfaces to forcefully fit inside the declared region
2019-07-11 13:22:13 +03:00
kd-11 219a5382f7 rsx: If no array streams are enabled, mark inline array as disabled (null render) 2019-07-09 16:27:59 +03:00
Eladash 43f919c04b Fixup after #6143 (#6146)
vm::spu max address was overflowing resulting in issues, so cast to u64 where needed. Fixes #6145.
    Use vm::get_addr instead of manually substructing vm::base(0) from pointer in texture cache code.
    Prefer std::atomic_thread_fence over _mm_?fence(), adjust usage to be more correct.
    Used sequantially consistent ordering in semaphore_release for TSX path as well.
    Improved memory ordering for sys_rsx_context_iounmap/map.
    Fixed sync bugs in HLE gcm because of not using atomic instructions.
    Use release memory barrier in lwsync for PPU LLVM, according to this xbox360 programming guide lwsync is a hw release memory barrier.
    Also use release barrier where lwsync was originally used in liblv2 sys_lwmutex and cellSync.
    Use acquire barrier for isync instruction, see https://devblogs.microsoft.com/oldnewthing/20180814-00/?p=99485
2019-06-29 18:48:42 +03:00
Eladash 1ee7b91646 Refactoring (#6143)
Prefer vm::ptr<>::ptr over vm::get_addr.
    Prefer vm::_ptr/base over vm::g_base_addr with offset.
    Added methods atomic_t<>::bts and atomic_t<>::btr .
    Removed obsolute rsx:🧵:Read/WriteIO32 methods.
    Removed wrong check in semaphore_release.
    Added handling for PUTRx commands for RawSPU MFC proxy.
    Prefer overloaded methods of v128 instead of _mm_... in VPKSHUS ppu interpreter precise.
    Fixed more potential overflows that may result in wrong behaviour.
    Added io/size alignment check for sys_rsx_context_iounmap.
    Added rsx::constants::local_mem_base which represents RSX local memory base address.
    Removed obsolute rsx:🧵:main_mem_addr/ioSize/ioAddress members.
2019-06-29 01:27:49 +03:00
JohnHolmesII 23094b48bb Fix warnings related to -Wswitch
Add default cases.
Move default breaks to newline
Add proper handling in some instances.
Add missing enums to switches
2019-06-28 01:40:52 +03:00
kd-11 4ff77a8555 rsx: Improve balancing of the offloader thread
- Use two counters to avoid atomic operations
- Yield instead of sleeping because some games are very sensitive to timing
2019-06-25 20:50:54 +03:00
kd-11 d26b25816d rsx: Improve profiling setup
- Avoid spamming QPC when not needed
- Free performance when debug overlay is not enabled
2019-06-25 20:50:54 +03:00
kd-11 b893a75002 rsx: Rework RSX offloading
- Use a lockless queue
- Do not enqueue small transfers
2019-06-25 20:50:54 +03:00
kd-11 0fa3bcc336 rsx: Asynchronous data transfer 2019-06-25 20:50:54 +03:00
Lassi Hämäläinen c963c51a60 Remove unnecessary header includes
- Manually removed lot of unneeded #includes to clean code and reduce
  compilation time
- Reordered some of the #includes to be in more logical order
2019-06-25 17:11:10 +03:00
Eladash cd0ef99df5 Fix BE endianess arch support in semaphore_406e (#6116)
Add raw() methods for endianness support types and make use of it.
2019-06-21 19:29:49 +03:00
kd-11 bca5f94b3f rsx: Add option to toggle MSAA 2019-06-14 16:19:52 +03:00
kd-11 4a5bbba277 rsx: Enable MSAA
- vk: Enable depth buffer resolve+unresolve
- vk: Add AMD stenciling extension support
- rsx: Temporarily disables MSAA-compatible hacks such as transparency AA
- TODO: Add paths to optionally disable MSAA
2019-06-14 16:19:52 +03:00
kd-11 0d906d6974 rsx: Remove surface aa_mode hacks 2019-06-14 16:19:52 +03:00
scribam 13671d9684 rsx: Apply Clang-Tidy fix "modernize-loop-convert" + const when relevant 2019-06-12 15:11:52 +03:00
scribam 635695ac78 rsx: Apply Clang-Tidy fix "modernize-use-emplace" 2019-06-12 15:11:52 +03:00
scribam a555504142 rsx: Apply Clang-Tidy fix "modernize-deprecated-headers" 2019-06-12 15:11:52 +03:00
scribam db926ee671 rsx: Apply Clang-Tidy fix "performance-unnecessary-value-param" 2019-06-12 15:11:52 +03:00
scribam 81a3b49c2f rsx: Apply Clang-Tidy fix "readability-container-size-empty" 2019-06-12 15:11:52 +03:00
Nekotekina dfd50d0185 Implement std::bit_cast<>
Partial implementation of std::bit_cast from C++20.
Also fix most strict-aliasing rule break warnings (gcc).
2019-06-02 23:22:16 +03:00
scribam 78c7ef3039 rsx: Use clear() instead of resize(0)
The result is the same but clear [1] has slightly less code than resize [2] and signals better the intent IMHO.

[1] fb7fb646fa/libstdc%2B%2B-v3/include/bits/stl_vector.h (L1495)
[2] fb7fb646fa/libstdc%2B%2B-v3/include/bits/stl_vector.h (L934)
2019-06-01 22:59:23 +03:00
kd-11 463b1b220d rsx: Improve accuracy of shadow compare Ops when non-integer depth formats are used
- The fixed-point D24S8 format does special Z clamping during compare which matches PS3 behaviour
- D32S8 is a floating point format and comparison with Dref > 1 always fails causing black edges/borders
2019-04-25 16:23:05 +03:00