Commit graph

262 commits

Author SHA1 Message Date
Eladash 6a926daee7 rsx: Delay FIFO recovery point creation if is in in_begin_end scope (#7080) 2019-12-12 15:38:56 +03:00
Eladash 7260af032e rsx: Ignore or recover from unknown primitives
This also fixes a bug when recovering FIFO or creating such recovery point inside in_begin_end == true scope.
2019-12-11 00:11:12 +03:00
Nekotekina 377e7d2a73 C-style cast cleanup VI 2019-12-04 17:56:22 +03:00
kd-11 2a8f2c64d2 rsx: Implement report transfer deferring
- Allow delaying report flushes triggered by image_in or buffer_notify
- When the report is ready, all the delayed transfers will automatically
be done.
- TODO: Make this configurable?
2019-11-04 18:48:41 +03:00
Emmanuel Gil Peyrot 69e9ee26f6 rsx: Make input_is_swizzled a template parameter
This lowers the relative cost of this function from ~2.25% to ~1.80% on
gcc 9 which I found quite surprising, some of it probably gets inlined
better in the callers, but I haven’t been able to isolate which parts.
2019-10-28 13:28:51 +03:00
Eladash 5de0005f5a rsx: Report full method range on invalid methods
Also report full command on fifo desync event for the first time
2019-10-21 15:31:45 +03:00
eladash 730e9cde84 sys_rsx: Improve allocations and error checks
* allow sys_rsx_device_map to be called twice: in this case the DEVICE address retrived from the previous call returned
* Add ENOMEM checks for sys_rsx_memory_allocate and sys_rsx_context_allocate
* add EINVAL check for sys_rsx_context_allocate if memory handle is not found
* Separate sys_rsx_device_map allocation from sys_rsx_context_allocate's
* Implement sys_rsx_memory_free; used by cellGcmInit upon failure
* Added context_id checks
* Throw if sys_rsx_context_allocate was called twice.
2019-10-21 15:31:45 +03:00
Eladash 397007cf8b rsx: Fix FIFO_DRAW_BARRIER substituation 2019-10-11 12:34:53 +03:00
Eladash 9242f16560 rsx: Improve FIFO recovery from flip 2019-10-10 19:34:23 +03:00
Eladash 06017cb14e rsx: Recover from invalid writes to CELL_GCM_NV4097_SET_INDEX_ARRAY_DMA
Also: Trigger a FIFO recovery when encountering an invalid method.
2019-10-10 19:34:23 +03:00
Eladash 2eaf5df60b rsx: Register some more methods 2019-10-10 19:34:23 +03:00
Eladash 0b2fa6ffdc rsx: Flush FIFO GET before smeaphore_acquire 2019-09-30 17:30:15 +03:00
Nekotekina bd1a24b894 Tidy endianness support (se_t) implementation
Move se_t and se_storage to util/endian.hpp
Use single template instead of two specializations.
Add minor optimization for MSVC.
Remove v128 dependency.
Try to enable intrinsics for unaligned data.
Fix minor bug in u16/u32/u64 specializations.
2019-09-28 15:39:50 +03:00
kd-11 7fdb4976d8 rsx: Remove log spam for cond render 2019-09-12 14:08:21 +03:00
kd-11 f8617500b5 rsx/methods: Warnings cleanup 2019-09-01 18:59:50 +03:00
kd-11 2962e05f26 rsx: Implement per-RTT color masks
- Also refactors and simplifies some common code in surface store and rsx core
2019-08-27 21:59:02 +03:00
Eladash 7fda07eb5b rsx: UB fix (signed vs unsigned mismatch) 2019-08-13 20:48:50 +01:00
Eladash 519fe9309e rsx: Fix nv0039::buffer_notify 2019-08-13 20:48:50 +01:00
Eladash 527b1bb071 rsx: Fix overlapping transfer of nv3089::image_in when out_pitch != in_pitch
or out_pitch != out_bpp * out_w
2019-08-13 20:48:50 +01:00
kd-11 8866a3d6a9 rsx: Cleanup for blit engine fixes 2019-08-10 16:45:02 +01:00
kd-11 033836d88c rsx: Minor fixup for nv3089::image_in
- Typo scale_x->scale_y
- Remove convoluted temp buffer creation and just use vector instead
2019-08-08 15:48:22 +03:00
kd-11 f0bd0b5a7c rsx: Conditional render sync optimization
- ZCULL queue was updated to one-per-cb but the conditional render sync hint was not updated.
- Do not unconditionally flush the queue unless the upcoming ref is contained in the active CB.
- This avoids spamming queue flush, which frees up resources and improves performance
2019-07-30 21:13:42 +03:00
Eladash fcc75c8b0f rsx: Write atomically semaphore updates and fix zcull timestamp 2019-07-26 21:27:55 +03:00
kd-11 b5a2f0df68 rsx: Implement separate viewport raster clipping
- Merge viewport raster window and scissor into one clipping region
- Viewport raster clip is different from viewport geometry clipping in
hardware as the latter is configurable separately
2019-07-19 14:21:19 +03:00
kd-11 7840cd914e rsx: Fixup nv3089::image_in
- Correct pitch when sourcing from temp block
- Remove obsolete? double transfer that also introduced a stale pointer reference to freed memory
2019-07-09 16:27:59 +03:00
msuih 146e43b6ec Do not use negative unsigned literals 2019-07-01 04:33:23 +03:00
Eladash 2bce367488 Fixup for fixup (#6153)
* Fixup for fixup

* Fix memory ordering for MTRSX

volatile doesnt block reordering.

* ugh
2019-06-30 12:47:42 +03:00
Eladash 43f919c04b Fixup after #6143 (#6146)
vm::spu max address was overflowing resulting in issues, so cast to u64 where needed. Fixes #6145.
    Use vm::get_addr instead of manually substructing vm::base(0) from pointer in texture cache code.
    Prefer std::atomic_thread_fence over _mm_?fence(), adjust usage to be more correct.
    Used sequantially consistent ordering in semaphore_release for TSX path as well.
    Improved memory ordering for sys_rsx_context_iounmap/map.
    Fixed sync bugs in HLE gcm because of not using atomic instructions.
    Use release memory barrier in lwsync for PPU LLVM, according to this xbox360 programming guide lwsync is a hw release memory barrier.
    Also use release barrier where lwsync was originally used in liblv2 sys_lwmutex and cellSync.
    Use acquire barrier for isync instruction, see https://devblogs.microsoft.com/oldnewthing/20180814-00/?p=99485
2019-06-29 18:48:42 +03:00
Eladash 1ee7b91646 Refactoring (#6143)
Prefer vm::ptr<>::ptr over vm::get_addr.
    Prefer vm::_ptr/base over vm::g_base_addr with offset.
    Added methods atomic_t<>::bts and atomic_t<>::btr .
    Removed obsolute rsx:🧵:Read/WriteIO32 methods.
    Removed wrong check in semaphore_release.
    Added handling for PUTRx commands for RawSPU MFC proxy.
    Prefer overloaded methods of v128 instead of _mm_... in VPKSHUS ppu interpreter precise.
    Fixed more potential overflows that may result in wrong behaviour.
    Added io/size alignment check for sys_rsx_context_iounmap.
    Added rsx::constants::local_mem_base which represents RSX local memory base address.
    Removed obsolute rsx:🧵:main_mem_addr/ioSize/ioAddress members.
2019-06-29 01:27:49 +03:00
JohnHolmesII 23094b48bb Fix warnings related to -Wswitch
Add default cases.
Move default breaks to newline
Add proper handling in some instances.
Add missing enums to switches
2019-06-28 01:40:52 +03:00
JohnHolmesII be521ff0ab Fix warnings related to parentheses 2019-06-25 20:36:32 -07:00
Lassi Hämäläinen 499035512b Split Emu/Memory into more logical headers
- Add vm_locking.h and vm_reservation.h and move relevant functions
  and types to these headers.
- Change include order and make vm_ptr.h, vm_var.h and vm_ref.h headers
  usable invidually and them including vm.h instead of other way around
- Because usage of vm::ptr now requires including vm_ptr.h instead of
  vm.h updated multiple #includes
- Added additional #includes to vm_reservation.h and vm_locking to
  where vm::reservation_* and locking related functions are used
2019-06-25 17:11:10 +03:00
Eladash cd0ef99df5 Fix BE endianess arch support in semaphore_406e (#6116)
Add raw() methods for endianness support types and make use of it.
2019-06-21 19:29:49 +03:00
Nekotekina 9abb303569 vm: expand reservation lock bit area to 7 bit
This is minor change.
2019-05-19 17:46:55 +03:00
kd-11 3c7d8a1099 rsx: Minor texture/surface scanning optimization
- Also re-enable optimization in blit engine accidentally disabled during debugging
2019-05-16 19:25:26 +03:00
kd-11 6feffe6ff6 rsx: Ignore transfer offsets when wrapping behaviour is expected 2019-05-01 15:36:21 +03:00
eladash f25587d24c rsx: Write vblank semahpre, minor semaphore acquire optimization 2019-04-20 01:04:41 +03:00
kd-11 df3b46a611 rsx: Improve texture sourcing and clipping when reverse scanning is enabled
- When reverse scanning, offsets are inverted and offset value of 0 is logically equivalent to an offset of -1
- Add an explicit message if clipping happens to avoid silent errors/bugs
2019-04-12 15:36:21 +03:00
kd-11 443fde760f rsx: Blit engine clipping fixes
- Do not round up sub-pixel offsets, round down instead
- Do not allow incomplete sources for hw blit transfer
- Reimplement src clipping (slice_h)
- Check 'area' of incoming texels and correct for them before RTT lookup/transfer
- Filter out incomplete targets when performing RTT lookup (1 texel or less contribution)
2019-04-09 13:40:54 +03:00
kd-11 17c49d21a5 rsx/blit: Remove workarounds/hacks added for master. Start implementation/stubs for blit engine rotations in GPU 2019-03-17 21:50:11 +03:00
elad ce8c92262d Treat X8R8G8B8 format as A8R8G8B8 in image_in, Fixes #5510 2019-03-08 23:44:46 +03:00
eladash bc27f5f75c Implement invalid NV4097_NOTIFY context handling 2019-01-13 12:59:00 +03:00
kd-11 f48abde14b rsx: Fixups for immediate rendering mode
- Immediate mode is isolated from the rest of the vertex configuration
- TODO: Verify register behaviour when immediate mode is used
  Check if per-primitive const register values are supported (likely are)
2018-12-24 09:05:19 +03:00
eladash 45ed58cdaf Fix rsx capture replay
Allow to capture non-increment cmd flag that was missing in command.reg
2018-12-15 19:40:18 +03:00
eladash 835a552d8d rsx: Implement cellGcmSetNotify 2018-12-15 19:40:18 +03:00
eladash 45942c4962 Fix segfault when scaled image dimension is less than clip's 2018-12-04 13:01:29 +03:00
eladash fa5652fceb rsx image_in: Implement negative scaling 2018-12-04 13:01:29 +03:00
eladash ce500c75c4 throw exceptions in case of invalid/unknown operations in image_in 2018-12-04 13:01:29 +03:00
eladash 6ecf2fb3d0 rsx: default lv2 semaphore context + dma_4097
extracted from vsh
2018-12-04 13:01:29 +03:00
eladash 28e4a9e0d0 rsx image_in: Fix in_pitch 0
The hw doesnt fix pitch, when specifying src pitch 0 it copies the same pixels line to dst. keep in mind out_pitch = 0 is not allowed in image_in.
Same goes for buffer_notify, though it allows out_pitch to be 0.
2018-12-04 13:01:29 +03:00
eladash d1d3ac984e rsx image_in: Fix src size calculation when in_pitch != line_lengh 2018-12-04 13:01:29 +03:00
eladash 0a1da14a15 rsx image_in: remove clip h and w hack
If clip region is empty, dont execute
2018-12-04 13:01:29 +03:00
kd-11 8a186bb97e rsx: Fix insertion of execution barriers
- Ignore barriers inserted after BEGIN but before any draw commands are emitted
- Properly process tail barriers inserted before END but after draw commands are submitted
- Ignore execution barriers with no effect (same register value written)
2018-11-30 23:51:25 +03:00
kd-11 2e32777375 rsx: Scrap the prebuffered queue approach
- Basically starting over
- The cost of making command copies into the queue has a measurable impact
2018-11-30 23:51:25 +03:00
kd-11 435afcb865 rsx: Fix fifo draw barriers 2018-11-30 23:51:25 +03:00
kd-11 54ec363e88 rsx: Critical pipeline fixes
- Fix scissor and viewport binding behavior
- Fixes recovery if empty scissor is specified and then 'fixed' later
- Optimizes state binding a bit
2018-11-30 23:51:25 +03:00
kd-11 1ad76ad331 rsx: Restructure programs
- Also re-enable pipeline optimizations
2018-11-30 23:51:25 +03:00
kd-11 677b16f5c6 rsx: Fixups
- Also fix visual corruption when using disjoint indexed draws

- Refactor draw call emit again (vk)

- Improve execution barrier resolve
  - Allow vertex/index rebase inside begin/end pair
  - Add ALPHA_TEST to list of excluded methods [TODO: defer raster state]

- gl bringup

- Simplify
  - using the simple_array gets back a few more fps :)
2018-11-30 23:51:25 +03:00
kd-11 e01d2f08c9 rsx: Refactor FIFO
- Removes fifo structures from common RSXThread
- Sets up a dedicated FIFO controller
- Allows for configurable queue optimizations
2018-11-30 23:51:25 +03:00
Nekotekina 1b37e775be Migration to named_thread<>
Add atomic_t<>::try_dec instead of fetch_dec_sat
Add atomic_t<>::try_inc
GDBDebugServer is broken (needs rewrite)
Removed old_thread class (former named_thread)
Removed storing/rethrowing exceptions from thread
Emu.Stop doesn't inject an exception anymore
task_stack helper class removed
thread_base simplified (no shared_from_this)
thread_ctrl::spawn simplified (creates detached thread)
Implemented overrideable thread detaching logic
Disabled cellAdec, cellDmux, cellFsAio
SPUThread renamed to spu_thread
RawSPUThread removed, spu_thread used instead
Disabled deriving from ppu_thread
Partial support for thread renaming
lv2_timer... simplified, screw it
idm/fxm: butchered support for on_stop/on_init
vm: improved allocation structure (added size)
2018-10-19 22:22:35 +03:00
eladash 62f97f2e5f rsx: Fix default texture dimensions
haha
2018-10-03 20:57:46 +03:00
kd-11 a3d44b5e1f rsx: Cleanup changes for the flip patch 2018-09-24 16:44:02 +03:00
Jake 699eadc84f rsx: Move render flip from rsx queue command to flip command 2018-09-24 16:44:02 +03:00
eladash e8474145a5 rsx: Remove shader address verification
this came from a misunderstanding of the register's use
2018-09-24 13:25:05 +03:00
eladash 1a6c819176 cellgcm: Fix SET_REFERENCE initial value 2018-09-20 01:05:40 +03:00
eladash a8ea576b22 rsx/cellgcm: Implemet initialization registers reset 2018-09-20 01:05:40 +03:00
eladash efbd77deb4 rsx: dont silently ignore null shader address 2018-09-12 00:40:20 +03:00
kd-11 2e0ecb556c rsx: Possible fix for UB data type consistency 2018-09-03 18:24:20 +03:00
kd-11 6399833182 rsx: Fix endianness order when immediate mode register is updated, but used as register lookup
- Simplify the code by unifying all the register-backed memory
2018-09-03 18:24:20 +03:00
elad 685eaedbf9 rsx: Fix typos (#5054) 2018-08-30 00:47:48 +03:00
eladash 37ee0a2f55 Rsx/cellgcm: complete rsx_state::reset() 2018-08-29 13:37:50 +03:00
eladash fc50e6abcb Rsx: remove method registers reset
cellGcm manually resets registers each flip, tested with cellGcmSetFlip
2018-08-29 13:37:50 +03:00
eladash acf1286b49 Rsx: fix unknown cull faces 2018-08-28 10:47:24 +03:00
eladash 38a72cc6ee Rsx: fix flip method registers reset
driver flip does not reset registers
2018-08-28 10:47:24 +03:00
eladash e279bdb304 Rsx: add missing default vertex shader attributes registers states 2018-08-28 10:47:24 +03:00
kd-11 7915dcb23c rsx: Do not overflow the program buffer!
- Some games overflow the program buffer e.g Resistance games
  The observed overflow is one instruction longer, likely an engine bug
with counting instructions
2018-08-18 16:14:30 +03:00
kd-11 dd21e43ed5 rsx: Force disable draw reordering when capturing a frame 2018-08-18 16:14:30 +03:00
kd-11 0267221586 Minor optimizations and fixes
- FIFO: avoid multiline spam
- VK: Fix program setup counter
- FS: Precalculate fragment constants buffer size during analysis step
2018-08-18 16:14:30 +03:00
kd-11 8800c10476 zcull synchronization tweaks
- Implement forced reading when calling update method to sync partial lists
- Defer conditional render evaluation and use a read barrier to avoid extra work
- Fix HLE gcm library when binding tiles & zcull RAM
2018-08-18 16:14:30 +03:00
kd-11 3b47e43380 rsx: Synchronization rewritten
- Do not do a full sync on a texture read barrier
- Avoid calling zcull sync in FIFO spin wait
- Do not flush memory to cache from the renderer side; this method is now obsolete
2018-08-18 16:14:30 +03:00
eladash f349695a75 Rsx: rewrite address translation 2018-08-13 16:16:34 +03:00
Megamouse d057c79733 RSX: use localtime_s instead of localtime 2018-07-28 23:10:45 +02:00
Megamouse 67aff85e8e RSX/Qt: Move rrc captures to captures dir 2018-07-28 23:10:45 +02:00
Megamouse f8d396ac9e change rsx_capture filename 2018-07-28 23:10:45 +02:00
kd-11 19d808d378 rsx/gl: Minor cleanup and optimization
- Track register change status
- Remove unused gl classes
2018-07-22 17:19:59 +03:00
kd-11 fa55a8072c rsx: Improve vertex textures support
- Adds proper support for vertex textures, including dimensions other than 2D textures
- Minor analyser fixup, removes spurious 'analyser failed' errors
- Minor optimizations for program state tracking
2018-07-12 18:02:28 +03:00
kd-11 bd915bfebd rsx: vp decompiler fixes
- Fix program abort logic to never abort before resolving later label addresses
  Fixes jumping over broken code and jumping over END markers
- TEXTURE_CONTROL2 has indexing range of [0..15] without stride skipping!
  This register does not have interleaving with other texture registers
- Track shader address poke as it seems to invalidate programs as well
2018-07-07 16:20:33 +03:00
kd-11 66854b78fa rsx: Fix nv308a::color 2018-07-07 16:20:33 +03:00
Jake 00c9b323c2 rsx: fix image_in to use in_pitch when swizzling 2018-06-24 14:29:41 +04:00
eladash b456955688 rsx: fix hardcoded rsx allocation address 2018-06-24 10:57:30 +03:00
kd-11 6362942928 rsx: Avoid semaphore acquire deadlock 2018-05-30 13:30:23 +03:00
Nekotekina 72574b11ff SPU: use reservation spinlocks on writes (non-TSX)
This should decrease contention by avoiding global lock
2018-05-21 21:56:14 +03:00
Nekotekina 367f039523 Build transactions at runtime
Drop _xbegin family intrinsics due to bad codegen
Implemented `notifier` class, replacing vm::notify
Minor optimization: detach transactions from global mutex on TSX path
Minor optimization: don't acquire vm::passive_lock on PPU on TSX path
2018-05-16 17:31:58 +03:00
scribam 04ad49de4d typos 2018-05-14 21:14:39 +04:00
kd-11 b7979d3f57 rsx/vk: Improvements and minor optimizations
- Improve dirty state tracking affecting program state
- vk: Refactor out transform constants upload into a separate channel to avoid if possible
  transform data uploads are quite expensive
2018-05-13 14:44:14 +03:00
kd-11 440a31ef18 rsx: Optimizations for program management 2018-05-13 14:44:14 +03:00
Jake 75b40931fc rsx: initial capture/replay functionality (#4510)
* rsx: initial capture/replay functionality
2018-05-13 12:18:05 +03:00
kd-11 93b2776604 rsx: Fix vertex input detection
- Properly detect inline array registers vs constant value registers
- Silence needless spam, 306E is 2D surface engiine, the assumption that y is multiplied by 306E pitch is not crazy
2018-04-05 01:06:50 +03:00
kd-11 2dce55d036 rsx: ZCULL synchronization fixes
- Track asynchronous operations in RSX core
- Add read barriers to force pending writes to finish.
  Fixes zcull delay flicker in all UE3 titles without forcing hard stall
- Increase zcull latency as all writes should be synchronized now
2018-03-13 18:55:03 +03:00
kd-11 315798b1f4 rsx: ZCULL rewrite and other improvements
- ZCULL unit emulation rewritten
- ZCULL reports are now deferred avoiding pipeline stalls
- Minor optimizations; replaced std::mutex with shared_mutex where contention is rare
- Silence unnecessary error message
- Small improvement to out of memory handling for vulkan and slightly bump vertex buffer heap
2018-03-13 18:55:03 +03:00