Commit graph

761 commits

Author SHA1 Message Date
Eladash bf63a18c5f SPU Add ability to specify percentage of busy waiting 2022-08-21 15:02:01 +03:00
Eladash b0e2c959eb SPU: Disable notification if no changes were made in PUTLLC 2022-08-21 15:02:01 +03:00
Eladash 28bec8e1bf SPU: Implement custom reservation condition in atomic wait 2022-08-21 15:02:01 +03:00
Eladash 82b1a2bd7a SPU: add the concept of inaccurate reservations
Implement cellSpursRequestIdleSpu
2022-08-21 15:02:01 +03:00
Eladash c0e3b86064 SPU: Optimize spu_thread::get_events() 2022-08-21 15:02:01 +03:00
Eladash 6210a8491f SPU: Optimize and enable SPU GETLLAR Polling detection by default
* Make this setting guard all reservation waitings. (renamed)
* Revert atomic list usage: it's more expensive and not needed because it has a timeout and is not optimized for the rest of the waitables.
2022-08-21 15:02:01 +03:00
Eladash 33a4f05ffa SPU: Interleave loads/stores in reservation access utilities 2022-08-21 15:02:01 +03:00
Eladash a3007e11ca SPU: Fix minor race in sys_spu_thread_receive_event
Check final cpu_state::state value for suspend state because that's the variable the thread waits on.
2022-08-12 15:20:48 +03:00
Eladash 34bae90820 LV2: Move nearly all notifications out of all mutex scopes including IDM 2022-08-07 20:23:54 +03:00
Eladash 73aaff1b29 LV2: allocation-free synchronization syscalls
* Show waiters' ID in kernel explorer.
* Remove deque dependency from sys_sync.h
2022-08-07 20:23:54 +03:00
Eladash c7fbc16357 SPU: Postpone notifications to afterward group mutex ownership 2022-08-07 20:23:54 +03:00
Eladash 2eebbd307d LV2: Minor optimization regarding signal flag 2022-08-07 20:23:54 +03:00
sguo35 27acebc5f5 spu: use portable llvm recompiler on arm64
Since there is not yet an arm64 version of the assembly (fast) version.
2022-07-15 12:37:33 +03:00
Eladash 3e51426379 Savestates/SPU: Kill emulation when its safe to save SPU state 2022-07-15 09:30:53 +03:00
Eladash cdd6840826 Savestates/SPU: Complete fix for saving sys_spu_thread_receive_event 2022-07-12 15:15:42 +03:00
Eladash befd7ceb89 Savestates/sys_spu: Minor fix in saving sys_spu_thread_receive_event 2022-07-10 14:19:59 +03:00
Eladash 2cead6f328 Savestates/SPU: Fix saving sys_spu_thread_send_event 2022-07-10 14:19:59 +03:00
Eladash 87cd65ff03 Savestates: support game collections 2022-07-10 14:19:59 +03:00
Eladash 4ac88fa8d3 Savestates/RSX: Save drawing context 2022-07-08 12:57:43 +03:00
Eladash 5f8f9e33f1 RSX/Savestates: Replace GCM hack with a proper fix 2022-07-08 12:57:43 +03:00
Eladash 155bd09fd0 Savestates: Cleanup v128 usage
It's compatible with bitwise serialization so it is faster to load/save it this way.
2022-07-06 19:43:25 +03:00
Eladash 2815aecd0c Savestates: Save SPU decrementer state 2022-07-06 19:43:25 +03:00
Elad Ashkenazi fcd297ffb2
Savestates Support For PS3 Emulation (#10478) 2022-07-04 16:02:17 +03:00
Eladash cf0fcf5a2a SPU: Implement execution wake-up delay 2022-06-28 19:54:25 +03:00
Eladash 5e01ffdfd8 Debugger: Optimize cpu_thread::dump_regs()
Reuse string buffer. Copies and reallocations are expensive with such large strings.
2022-06-23 22:41:32 +02:00
Nekotekina 653a9e6e7f Debugger: always print cpu_thread::dump_misc()
Was removed for some reason.
2022-06-22 18:53:29 +03:00
Eladash ccb2724fc4 Debugger: Implement SPU breakpoints 2022-06-21 16:59:45 +03:00
Eladash d0e9108800 SPU: Implement "double" SNR storage 2022-06-20 20:50:11 +03:00
Jeff Guo cefc37a553
PPU LLVM arm64+macOS port (#12115)
* BufferUtils: use naive function pointer on Apple arm64

Use naive function pointer on Apple arm64 because ASLR breaks asmjit.
See BufferUtils.cpp comment for explanation on why this happens and how
to fix if you want to use asmjit.

* build-macos: fix source maps for Mac

Tell Qt not to strip debug symbols when we're in debug or relwithdebinfo
modes.

* LLVM PPU: fix aarch64 on macOS

Force MachO on macOS to fix LLVM being unable to patch relocations
during codegen. Adds Aarch64 NEON intrinsics for x86 intrinsics used by
PPUTranslator/Recompiler.

* virtual memory: use 16k pages on aarch64 macOS

Temporary hack to get things working by using 16k pages instead of 4k
pages in VM emulation.

* PPU/SPU: fix NEON intrinsics and compilation for arm64 macOS

Fixes some intrinsics usage and patches usages of asmjit to properly
emit absolute jmps so ASLR doesn't cause out of bounds rel jumps. Also
patches the SPU recompiler to properly work on arm64 by telling LLVM to
target arm64.

* virtual memory: fix W^X toggles on macOS aarch64

Fixes W^X on macOS aarch64 by setting all JIT mmap'd regions to default
to RW mode. For both SPU and PPU execution threads, when initialization
finishes we toggle to RX mode. This exploits Apple's per-thread setting
for RW/RX to let us be technically compliant with the OS's W^X
    enforcement while not needing to actually separate the memory
    allocated for code/data.

* PPU: implement aarch64 specific functions

Implements ppu_gateway for arm64 and patches LLVM initialization to use
the correct triple. Adds some fixes for macOS W^X JIT restrictions when
entering/exiting JITed code.

* PPU: Mark rpcs3 calls as non-tail

Strictly speaking, rpcs3 JIT -> C++ calls are not tail calls. If you
call a function inside e.g. an L2 syscall, it will clobber LR on arm64
and subtly break returns in emulated code. Only JIT -> JIT "calls"
should be tail.

* macOS/arm64: compatibility fixes

* vm: patch virtual memory for arm64 macOS

Tag mmap calls with MAP_JIT to allow W^X on macOS. Fix mmap calls to
existing mmap'd addresses that were tagged with MAP_JIT on macOS. Fix
memory unmapping on 16K page machines with a hack to mark "unmapped"
pages as RW.

* PPU: remove wrong comment

* PPU: fix a merge regression

* vm: remove 16k page hacks

* PPU: formatting fixes

* PPU: fix arm64 null function assembly

* ppu: clean up arch-specific instructions
2022-06-14 15:28:38 +03:00
Elad Ashkenazi 9bb7e8d614
rsx: Implement atomic FIFO fetching (stability improvement) (non-default setting) (#12107) 2022-06-04 15:35:06 +03:00
Eladash e7ced1aeab Debugger: Implement SPU mailbox content display 2022-05-25 17:36:28 +03:00
Eladash 961d41d0bd RawSPU: Reinvoke pending interrupts if missed 2022-05-25 11:46:51 +03:00
Eladash 2ba437b6dc SPU: Implement timer freezing ability 2022-05-14 22:03:47 +03:00
Eladash d77c9139ad Debugger: Show constant-formed attribute of register value 2022-05-10 22:34:29 +03:00
Eladash be5f8413ca
Avoid using PUTLLC in PUTLLUC if we know SPU LR has already been raised (#11940) 2022-05-05 22:15:08 +03:00
Nekotekina 10b33d0f79 SPU: optimize conflicting PUTLLUC (No-TSX)
Enable previously TSX-only optimization.
2022-05-05 19:16:16 +03:00
Eladash fcbeb2fa22 Remove slow vm::writer_lock usage from SPUThread.cpp 2022-05-04 23:36:57 +03:00
Eladash 3dda72e47f SPU: Cache reservation memory direct access handle (optimization) 2022-05-04 20:28:55 +03:00
RipleyTom a4d715e25d Warning Fixes 2022-03-23 19:35:10 +01:00
Nekotekina 14951d8713 Fix abuse of fs::pending_file
Debug dumps don't fall into category which needs atomic rewrite.
2022-01-24 22:39:01 +03:00
Nekotekina 12c83b340d Remove built_function
With today's branch prediction techniques, it's hardly useful.
2022-01-24 22:21:41 +03:00
Nekotekina 580bd2b25e Initial Linux Aarch64 support
* Update asmjit dependency (aarch64 branch)
* Disable USE_DISCORD_RPC by default
* Dump some JIT objects in rpcs3 cache dir
* Add SIGILL handler for all platforms
* Fix resetting zeroing denormals in thread pool
* Refactor most v128:: utils into global gv_** functions
* Refactor PPU interpreter (incomplete), remove "precise"
* - Instruction specializations with multiple accuracy flags
* - Adjust calling convention for speed
* - Removed precise/fast setting, replaced with static
* - Started refactoring interpreters for building at runtime JIT
*   (I got tired of poor compiler optimizations)
* - Expose some accuracy settings (SAT, NJ, VNAN, FPCC)
* - Add exec_bytes PPU thread variable (akin to cycle count)
* PPU LLVM: fix VCTUXS+VCTSXS instruction NaN results
* SPU interpreter: remove "precise" for now (extremely non-portable)
* - As with PPU, settings changed to static/dynamic for interpreters.
* - Precise options will be implemented later
* Fix termination after fatal error dialog
2022-01-15 06:48:04 +03:00
Malcolm Jestadt 31a5a77ae5 SPU: Use REP MOVSB in do_dma_transfer
- Try to use REP MOVSB when the size of the transfer is above a certain threshold
- This threshold is determined by the ERMS and FSRM cpuid flags
- The threshold values are (roughly) taken from GLIBC
- A threshold of 0xFFFFFFFF indicates that the cpu has neither flag
2022-01-02 21:35:46 +03:00
Nekotekina cb2748ae08 Update ASMJIT (new upstream API) 2021-12-29 02:45:00 +03:00
Nekotekina d836033212 LLVM: enable some JIT events (Intel, Perf)
Made some related adjustments.
Currently incomplete.
2021-12-26 16:41:37 +03:00
Nekotekina dcd011048d Implement "built_function" utility (runtime-generated assembly)
Similar to build_function_asm, but links without indirection.
Achieved by emitting code directly into a byte array.
2021-12-22 19:27:20 +03:00
Nekotekina c0bafbc804 TSX: enable same data optimization for PUTLLC 2021-12-19 20:23:01 +03:00
Nekotekina 61c64d1060 TSX: refactoring M
Remove first stage 'optimistic' transactions.
2021-12-19 20:23:01 +03:00
Nekotekina 3e1e1a683c TSX/PPU: fix conditional store regression 2021-12-17 21:48:01 +03:00
Eladash c0c52c33b9 SPU: Implement interrupts handling for remaining events 2021-10-20 15:46:50 +03:00
Eladash ab50e5483e
GUI Utilities: Implement instruction search, PPU/SPU disasm improvements (#10968)
* GUI Utilities: Implement instruction search in PS3 memory
* String Searcher: Case insensitive search
* PPU DisAsm: Comment constants with ORI
* PPU DisAsm: Add 64-bit constant support
* SPU/PPU DisAsm: Print CELL errors in disasm
* PPU DisAsm: Constant comparison support
2021-10-12 23:12:30 +03:00
Eladash e10c6cbaf7 SPU: cpu_work() fixup, fix recursion in AV handler 2021-09-18 19:43:55 +03:00
Eladash 5870da0b55 SPU MFC: Add shuffling in steps setting 2021-09-18 19:43:55 +03:00
Eladash ddec5d6908 CPUThread: Prevent recursive check_state calls 2021-09-17 14:02:22 +03:00
Eladash 975aae1d13 SPU MFC: Implement MFC commands execution shuffling 2021-09-17 11:38:10 +03:00
Eladash fafefb2cf5 Fixup No.3 after #10779 2021-09-10 11:46:39 +03:00
Megamouse 0debcfed0a Silence some warnings 2021-09-02 19:39:42 +02:00
Eladash 063df64108 SPU/event queue: Implement protocol for SPU queue 2021-08-13 08:58:09 +03:00
Eladash f1f93b8f81 SPU: Remove outdated assertation 2021-08-13 08:58:09 +03:00
Eladash 91737b11fe Fix sys_spu_thread_group_resume
Do not remove suspend flag when SPU group state is not SPU_THREAD_GROUP_STATUS_RUNNING after operation!
2021-08-12 22:24:54 +03:00
Eladash bf61c826d5 SPU/event queue: Atomically resume SPU group 2021-08-12 22:24:54 +03:00
Eladash 8e2c34a003 PPU debugger: Implement PPU calling history 2021-07-17 17:28:23 +02:00
Nekotekina 160b131de3 types.hpp: implement smin, smax, amin, amax
Rewritten the following global utility constants:
`umax` returns max number, restricted to unsigned.
`smax` returns max signed number, restricted to integrals.
`smin` returns min signed number, restricted to signed.
`amin` returns smin or zero, less restricted.
`amax` returns smax or umax, less restricted.

Fix operators == and <=> for synthesized rel-ops.
2021-05-22 12:10:57 +03:00
Eladash 638f20c80f Improve get_current_cpu_thread() 2021-05-20 09:25:51 +03:00
Eladash 8bd58b1ad4 Remove lv2_event_queue::check(weak_ptr) 2021-05-15 00:31:14 +03:00
Eladash c681395fb2 sys_interrupt: weak_ptr -> shared_ptr 2021-05-15 00:31:14 +03:00
Eladash 56471f4ad4 SPU: Optimize SPU ports/queues 2021-05-15 00:31:14 +03:00
Eladash daa53b77cf Simplify named_thread construction 2021-05-01 18:08:03 +03:00
Megamouse a16d8ba3ea More random changes 2021-04-11 14:01:51 +03:00
Nekotekina 95725bf7fc Add -Werror=missing-noreturn (GCC, clang)
May be useful to diagnose functions which fail assertions unconditionally.
2021-04-08 10:29:47 +03:00
Nekotekina b3fb6d7d18 Add and fix -Wredundant-decls (GCC) 2021-03-23 22:48:57 +03:00
Eladash 1864419561 Fix SPU mapped memory page size 2021-03-19 22:25:08 +03:00
Nekotekina 03332c340d Implement utils::bless (pointer cast)
Tries to workaround strict aliasing troubles.
Don't confuse with std::bless which works differently.
2021-03-10 16:02:00 +03:00
Nekotekina a4fdbf0a88 Enable -Wstrict-aliasing=1 (GCC)
Fixed partially.
2021-03-09 03:10:15 +03:00
Nekotekina 87af905018 Enable -Wunused-parameter 2021-03-06 18:07:08 +03:00
Timothy Redaelli fa5a2b6a85 SPUThread.cpp: remove "__attribute__((always_inline))"
cmp_rdata and mov_rdata are using __attribute__((always_inline)),
without inline, that is not supported on current g++ (see RPCS3#1546).

Moreover __attribute__((always_inline)) is a noop if used without inline so
just remove it.

A proper fix is to move the 2 functions in an header file as static
(with FORCE_INLINE) so it can be correctly inlined by the compiler.
2021-03-04 12:17:27 +03:00
Nekotekina 52fe86b56c fixed_typemap.hpp: make it a bit fool-proof
Require objects to be non-copyable (move is still allowed).
2021-03-02 21:58:49 +03:00
Eladash 004ebfdaee SPU debugger: Implement MFC journal
* Allow to dump up to 1820 commands with up 128 bytes of data each, using key D with the debugger.
2021-03-02 21:57:51 +03:00
Eladash 9ccf39b27f Atomic SPU LS capture writes 2021-02-23 11:29:23 +03:00
Eladash f43260bd58
Atomic waiting refactoring (#9208)
* Use atomic waitables instead instead of global thread wait as often as possible.
* Add ::is_stopped() and and ::is_paued() which can be used in atomic loops and with atomic wait. (constexpr cpu flags test functions)
* Fix notification bug of sys_spu_thread_group_exit/terminate. (old bug, enhanced by #9117)
* Function time statistics at Emu.Stop() restored. (instead of current "X syscall failed with 0x00000000 : 0")
2021-02-13 17:50:07 +03:00
Eladash 4f85f151fd SPU: Always signal the debugger about termination 2021-02-03 15:05:38 +03:00
Eladash 0652870204 New RSX Debugger 2021-01-28 17:40:26 +03:00
Nekotekina a69248299d SPU: Don't use shm::map_critical in SPU LS allocations
Use shm::try_map instead until proper area is found.
2021-01-25 17:45:47 +03:00
Eladash a58c12db0b SPU: fixup after #9630
Co-Authored-By: Ivan <nekotekina@gmail.com>
2021-01-21 21:32:13 +03:00
Eladash f81674232e Remove SPU and PPU destructors 2021-01-21 18:31:51 +03:00
Eladash e4c3b1c2bd vm: Remove vm::dealloc_verbose_nothrow 2021-01-15 17:37:52 +03:00
Eladash c94a98e15a Fix minor typo 2020-12-23 20:50:33 +03:00
Eladash d17d22139e SPU Debugger: Print reservation data 2020-12-23 08:25:56 +03:00
Nekotekina bd269bccaf types.hpp: remove intrinsic includes
Replace v128 with u128 in some places.
Removed some unused files.
2020-12-21 21:11:25 +03:00
Eladash ef884642e4 Cleanup disasm classes a bit 2020-12-21 13:46:26 +03:00
Nekotekina eec11bfba9 Move align helpers to util/asm.hpp
Also add some files:
GLTextureCache.cpp
VKTextureCache.cpp
2020-12-18 18:07:42 +03:00
Nekotekina db9b7db531 Cleanup and move sysinfo.h -> util/sysinfo.hpp 2020-12-18 12:55:54 +03:00
Nekotekina e321765c54 Split BEType.h to util/v128.hpp and util/to_endian.hpp 2020-12-13 16:34:45 +03:00
Nekotekina 65c04e4ddd Remove constexpr from ppu/spu decoders.
We don't need them at compile time (yet).
But can reduce compile time and complexity.
2020-12-10 15:06:01 +03:00
Nekotekina b382d3b3e9 Remove ASSUME macro
It's dangerous and sometimes bluntly misused feature.
Its optimization potential is near-zero.
2020-12-10 14:08:02 +03:00
Nekotekina 36c8654fb8 Remove HERE macro
Some cleanup.
Add location to some functions.
2020-12-10 12:30:22 +03:00
Nekotekina e055d16b2c Replace verify() with ensure() with auto src location.
Expression ensure(x) returns x.
Using comma operator removed.
2020-12-09 15:43:38 +03:00
Nekotekina eb66302907 atomic.hpp: replace std::atomic with atomic_t
Dual dependency is nothing good.
2020-12-07 17:13:12 +03:00
Nekotekina b16cc618b5 atomic.hpp: add some features and optimizations
Add atomic_t<>::observe() (relaxed load)
Add atomic_fence_XXX() (barrier functions)
Get rid of MFENCE instruction, replace with no-op LOCK OR on stack.
Remove <atomic> dependence from stdafx.h and relevant headers.
2020-12-07 17:13:12 +03:00
Nekotekina 77aa9e58f2 shared_ptr.hpp: add trivial conversion for shared/single types
These conversions don't exist in std::shared_ptr-alike types.
But I don't want to bother with == operators until we have proper C++20.
Removed trivial conversion for atomic_ptr because it's heavyweight.
2020-12-07 15:33:28 +03:00
Eladash 15a12afe25 Debugger: Implement code flow tracking 2020-12-06 15:32:13 +03:00
RipleyTom af8c661a64 Remove BOM markers 2020-12-06 15:30:12 +03:00
Nekotekina b5d498ffda Homebrew atomic_ptr rewritten (util/shared_ptr.hpp)
It's analogous to C++20 atomic std::shared_ptr

The following things brought into global namespace:
single_ptr
shared_ptr
atomic_ptr
make_single
2020-11-26 20:11:26 +03:00
Nekotekina 43952e18e2 Implement prefetch_write() and prefetch_exec() wrappers
Do some refactoring to prefetch_read() in util/asm.hpp as well.
Make all these function constexpr because they are no-ops.
2020-11-24 12:31:11 +03:00
Eladash 3f028fbb83 Fix SPU LS MMIO 2020-11-24 03:44:30 +03:00
Eladash 85880ffded
SPU: Log STOP full opcode (#9292)
Co-authored-by: Ani <ani-leo@outlook.com>
2020-11-20 13:53:16 +03:00
Nekotekina 0fec99e75b SPU: absolutely unacceptable hack for SPU LS
Make normal threads inaccessible in PS3 memory.
2020-11-17 15:22:04 +03:00
Nekotekina eaf0bbc108 SPU: don't allocate SPU LS in vm::main
Create its own shared memory object.
Use vm::spu to allocate all SPU types.
Use vm::writer_lock for shm::map_critical.
2020-11-16 12:46:15 +03:00
Eladash b1710bb712
SPU Debugger: Implement float registers view + General debugger fixes (#9265)
* SPU Debugger: Fix try_get_insert_mask_info
* Debugger: Always update thread state on context's data change
No longer needing to press on thread's instructions for actions to work!
2020-11-15 08:45:28 +03:00
Nekotekina dfae7bd073 SPU: Fix some stat printing 2020-11-15 04:41:16 +03:00
Nekotekina badb3dc2dd atomic.cpp/threads: remove old wait callback
Add new wait callback which simply collects statistics.
Shift workarounds towards actual problem detection.
2020-11-14 18:16:27 +03:00
Eladash fefab50e06 Fix vm::range_lock, imporve vm::check_addr 2020-11-11 10:30:09 +03:00
Eladash 74274f6d77 Debugger: Improve SPU/PPU vector registers 2020-11-11 10:27:45 +03:00
Eladash 52fa69d93d SPU Debugger: Improve registers panel 2020-11-10 22:51:52 +03:00
Nekotekina cdaa8cb5c4 CPU: Improve suspend_all g_suspend_counter handling
Increase in two stages, giving more chances to use it.
Second stage is when all wait flags have been seen.
2020-11-09 23:54:36 +03:00
Nekotekina 083397a555 vm: lock memory under "sudo" addr
Remove memory touching from transactions.
2020-11-09 23:54:36 +03:00
Nekotekina bc61835d97 CPU: use unsigned (u8) priority in suspend_all 2020-11-09 22:57:36 +03:00
Nekotekina 8bc9868c1f SPU/vm: Improve vm::range_lock a bit
Use some prefetching
Use optimistic locking
2020-11-08 17:23:17 +03:00
Nekotekina 3507cd0a37 SPU: improve spu_thread::reservation_check
Use optimistic locking and optimistic loop (expecting 1 iteration).
2020-11-08 16:43:15 +03:00
Nekotekina 1c99a2e7fb vm: add map_self() method to utils::shm
Add complementary unmap_self() method.
Move VirtualMemory to util/vm.hpp
Minor associated include cleanup.
Move asm.h to util/asm.hpp
2020-11-08 16:43:15 +03:00
Nekotekina b68bdafadc vm: refactor vm::range_lock again
Move bits to the highest, set RWX order.
Use only one reserved value (W = locked).
Assume lock size 128 for range_locked.
Add new "Size" template argument that replaces normal argument.
2020-11-08 16:43:15 +03:00
Eladash bacfa9be19
Debugger fixups (#9226)
Fix logic error in callstacks handling code, always set first to false after first iteration.
 Add explicit check for zero return addresses. Current code validity checks may not check for it properly when it sits on interrupt handler entry point (which may contain valid code).
 Do not allow 0x3FFF0 to be a back chain address because it needs space for LR save area, only 0x3FFE0 and below satisfy this criteria.
2020-11-08 16:42:20 +03:00
Eladash 516da4ecdd Debugger: Improve SPU/PPU callstack handling 2020-11-08 09:17:13 +03:00
Eladash 3cb5fd8ebc Debugger: Implement SPU callstack, fix PPU callstack 2020-11-07 20:45:57 +03:00
Eladash 6dcd482dd0 SPU reservations: Do not illegally dereference reservation data 2020-11-07 14:03:09 +03:00
Nekotekina 34fa010601 Improve cond_var notifiers
But nobody uses it anyway, so clean up includes.
2020-11-06 00:10:16 +03:00
Nekotekina 5248240e10 atomic.cpp: improvements.
Reduced static memory amount for waitable atomics.
Allow notifier to skip notifications if wait/notify masks don't overlap.
Improve raw_notify to wake up the thread by its id, add thread_id arg.
Add optional mask argument to notify_one() and notify_all().
2020-11-05 05:51:43 +03:00
Nekotekina 06ecc2ae68 vm::range_lock cleanup and minor optimization
Removed unused arg.
Linearized some branches.
2020-11-01 23:29:33 +03:00
Nekotekina 46d3066c62 Optimize vm::range_lock
Only test address on `range_locked`
Don't check current transaction
Remove vm::clear_range_locks completely
2020-11-01 16:46:06 +03:00
Nekotekina 8d12816001 TSX: fix transaction limit settings 2020-11-01 14:58:04 +03:00
Nekotekina fe03b55046 TSX: tiny optimization of transaction functions
Because new memory manager puts them in first 2G.
2020-11-01 14:44:59 +03:00
Nekotekina 78c986b5dd Improve vm::range_lock
Not sure how it ever worked
Clear redundant vm::clear_range_lock usage
2020-10-31 23:53:14 +03:00
Nekotekina 86fc842c89 TSX: new fallback method (time-based)
Basically, using timestamp counter.
Rewritten vm::reservation_op with the same principle.
Rewritten another transaction helper.
Add two new settings for configuring fallbacks.
Two limits are specified in nanoseconds (first and second).
Fix PUTLLC reload logic (prevent reusing garbage).
2020-10-31 15:34:14 +03:00
Nekotekina 150e18539c Allow cpu_thread& arg passed to the syscalls
Minor cleanup. cpu_mem(), cpu_unmem() removed.
2020-10-30 17:03:32 +03:00
Nekotekina 3419d15878 vm: add extern clear_range_locks function
Allows to wait for range locks to clear for specified range.
vm::range_lock now monitors specified reservation lock as well.
2020-10-30 07:58:16 +03:00
Nekotekina 0da24f21d6 CPU: improve cpu_thread::suspend_all for cache efficiency (TSX)
Add prefetch hint list parameter.
Workloads may be executed by another thread on another CPU core.
It means they may benefit from directly prefetching the data as hinted.
Also implement mov_rdata_nt, for "streaming" data from such workloads.
2020-10-30 05:22:09 +03:00
Nekotekina e794109a67 perf_meter.hpp: fix double rdtsc bug, add restart() method 2020-10-30 03:19:13 +03:00
Nekotekina 006c783aba SPU: make do_dma_transfer() static with _this arg
Instead of this, nullptr will be passed from another thread.
MMIO via MMIO is disabled if it even possible.
2020-10-30 02:58:39 +03:00
Nekotekina 425fce5070 SPU: load previous data on PUTLLC failure
Since it will most likely execute GETLLAR to load it again.
Only implemented for TSX at moment.
2020-10-30 02:58:39 +03:00
Nekotekina 8ce0819b42 SPU: add stx/ftx counters
Just count pure transaction successes and failures.
2020-10-29 18:57:57 +03:00
Nekotekina 688a456642 TSX tweaks
Allow to do more in first-chance transactions.
Give PUTLLC +1 priority (minor change).
2020-10-29 18:57:57 +03:00
Nekotekina 280958ee74 Revert "TSX: adjust transaction logic"
This reverts commit ff550b5c3c.
2020-10-28 21:59:12 +03:00
Nekotekina ff550b5c3c TSX: adjust transaction logic
Allow more in first-chance transactions.
Allow abandonment of PUTLLC as in original path.
Make PUTLLUC unconditionally shared-locked.
Give PUTLLC +1 priority (minor change).
2020-10-28 14:00:09 +03:00
Nekotekina d6daa0d05b Fix cpu_flag::temp, make sure it removes cpu_flag::wait 2020-10-28 14:00:09 +03:00
Nekotekina c491b73f3a SPU: improve accurate DMA
Remove vm::reservation_lock from it.
Use lock bits to prevent memory clobbering in GETLLAR.
Improve u128 for MSVC since it's used for bitlocking.
Improve 128 bit atomics for the same reason.
Improve vm::reservation_op and friends.
2020-10-28 03:47:41 +03:00
Nekotekina c50233cc92 atomics.cpp: add support for waiting on 128-bit atomics
Complementarily.
Also refactored to make waiting mask non-template arg.
2020-10-28 03:47:41 +03:00
Nekotekina 4966f6de73 vm: improve range_lock and shareable cache (Non-TSX)
Allocate "personal" range lock variable for each spu_thread.
Switch from reservation_lock to range lock for all stores.
Detect actual memory mirrors in shareable cache setup logic.
2020-10-27 17:56:19 +03:00
Nekotekina f1e66085cd Fixup for cpu_flag::temp
Wrong check_state() result was triggering assertion.
2020-10-26 01:18:26 +03:00
Nekotekina 130a0ef20e Implement cpu_flag::temp flag
Accompanies wait flag, indicating that it was set in limited conditions.
Such condition don't allow thread to terminate after its removal.
2020-10-25 21:48:20 +03:00
kd-11 18ca3ed449 rsx: Block-level reservation access 2020-10-25 20:21:04 +03:00
Eladash 4ea7628204 SPU: Fix LS capture entry point 2020-10-25 16:39:40 +03:00
Nekotekina 2b52b4a749 SPU: use normal notify() thread function
Using raw_notify() everywhere was overkill.
2020-10-24 14:16:32 +03:00
Eladash 49610f52f5 SPU: Save LS capture executable in one segment 2020-10-24 14:13:19 +03:00
Eladash b56bc7e087 SPU: cleanup channels logging 2020-10-23 13:13:04 +03:00
Nekotekina dc8252bb9f Remove XABORT in PPU/SPU transactions.
It's expensive for unknown reason. Simply XEND is usually much cheaper.
Add some minor improvements. Use g_sudo_addr.
2020-10-20 09:10:21 +03:00
Nekotekina 72d1ac22aa SPU: report too many PUTLLC attempts (TSX)
Mirrored to PPU STCX code and PUTLLUC (STORE128).
2020-10-19 19:41:28 +03:00
Nekotekina 8ce5392390 TSX: add prefetchw instruction in transaction code 2020-10-19 19:41:28 +03:00
Nekotekina 311682b341 SPU: fix GETLLAR regression
Misplaced mov_rdata
2020-10-19 19:41:28 +03:00
Nekotekina 120849c734 Implement perf stat counter for PPU/SPU reservation ops
Adds Emu/perf_meter.hpp header file.
Uses RDTSC for speed.
Prints stats at exit.
2020-10-19 19:41:28 +03:00
Nekotekina adf50b7c4b Implement cpu_thread::if_suspended
Use it for opportunistic guaranteed GETLLAR execution (TSX-FA).
2020-10-18 20:10:48 +03:00
Nekotekina f5c575961f Implement priorities for cpu_thread::suspend_all tasks
Give PUTLLUC increased priority.
2020-10-18 20:10:48 +03:00
Eladash 402e8b12a6 SPU: Touch unmapoed memory in reservation mismatch 2020-10-18 11:42:54 +03:00
Nekotekina d0057c92e4 Fix spu_putlluc_tx (insignificant) 2020-10-17 21:27:19 +03:00
Nekotekina 583ed61712 SPU: return some give-up behaviour for PUTLLC (TSX)
Despite using concept of "shared" lock, allow only first to proceed.
This is similar how conditional stores for PPU are implemented.
2020-10-16 12:14:42 +03:00
Nekotekina facde63460 PPU: fix ppu_stcx_accurate_tx
Don't destroy xmm6/xmm7 state on exit.
Improve addr arg handling (simplify).
2020-10-15 19:24:00 +03:00
Nekotekina 494953997e PPU/SPU: give up on conditional stores if locking fails
Restores Non-TSX behaviour partially.
2020-10-15 17:18:49 +03:00
Nekotekina 1b89ad00e7 SPU: restore some LR event setting logic after #9048 2020-10-15 17:18:49 +03:00
Nekotekina 3bddba0c7a SPU: fix spu_getllar_tx
Was not executing.
2020-10-14 02:53:29 +03:00
Nekotekina 97cd641da9 TSX: reimplement spu_getllar_tx
Only used as a backup method of reading reservation data.
Increase long GETLLAR reporting threshold.
2020-10-13 21:10:04 +03:00
Nekotekina 91db4b724c SPU: fix PUTLLC (TSX-FA)
Some forgotten checks may affect performance.
2020-10-13 17:46:03 +03:00
Nekotekina dcff8c2637 Fix remaining vm::reservation_lock usages (for now)
Optimization can be restored later.
2020-10-13 12:04:59 +03:00
Nekotekina dc39a9b84f SPU: Report 'GETLLAR took too long'
Also move similar code in PPU.
2020-10-13 00:12:11 +03:00
Nekotekina a806be8bc4 SPU: Implement S1/S2 (SNR) events (closes #8789)
Add TSX path in push_snr()
Add locks bits in ch_events
2020-10-12 21:41:57 +03:00
Eladash 95c1443e30
SPU: Validate reservation in GET commands (Accurate DMA) (#9062) 2020-10-12 15:20:06 +03:00
Nekotekina 5bd5a382c0 PPU: fix LDARX/LWARX in accurate mode (closes #9058)
Fixup after #9048
Use SSE intrinsics in mov_rdata.
2020-10-11 19:52:10 +03:00
Nekotekina 050c3e1d6b Rewrite cpu_thread::suspend_all
Now it's a function of higher order.
Make only one thread do the hard work of thread pausing.
2020-10-10 13:58:48 +03:00
Nekotekina 346a1d4433 vm: rewrite reservation bits
Implement classic unique/shared locking concept.
Implement vm::reservation_light_op.
2020-10-10 13:58:48 +03:00
Eladash 865464f607 SPU Local Storage capture 2020-10-08 19:05:14 +03:00
Eladash 493e57837b
SPU: Fix extremely rare bug of GETLLAR (#9011) 2020-10-03 09:31:28 +01:00
Eladash 56cebd99c2
SPU: Simplify logging of MFC commands (#9004) 2020-10-01 19:52:39 +03:00
Eladash f4ca6f02a1 PPU: Implement support for 128-byte reservations coherency 2020-09-28 22:34:42 +03:00
Eladash 09cddc84be SPU/PPU: Implement Atomic Cache Line Stores 2020-09-27 20:09:21 +03:00
Eladash bfa78870cb SPU: Fix unregistered channels in RCHCNT
Shouldn't throw exception on realhw.
2020-09-22 19:47:47 +03:00
Eladash ad37259ccc SPU: Implement many missing channel counts 2020-09-22 19:47:47 +03:00
Eladash c436ef0c6f
SPU: Implement channels 70, 71, add naming for channel 69 (#8932) 2020-09-19 13:08:35 +01:00
eladash 36ac68b436 SPU: Implement events channel count, minor interrupts fixes 2020-09-18 21:57:24 +03:00
Eladash 3206378ae6 sys_spu: Fix overexecution of cpu_return() 2020-09-12 22:11:40 +03:00
Eladash 7ce790f369 SPU: Use ASM for AVX2 coompilation instead of intrinsics 2020-09-12 18:49:49 +03:00
Eladash 9ff0b460a2 SPU: Make PUTLLUC LR event accurate 2020-09-11 09:02:18 +02:00
Eladash 4f0125a0e9 SPU: Remove "Accurate PUTLLUC" setting (always accurate) 2020-09-11 09:02:18 +02:00
Eladash 1e4655aef6
SPU: Remove STOP 0x0 hack (#8873) 2020-09-09 11:36:04 +01:00
Eladash 5060c779da
SPU: Use unaligned instructions in mov_rdata_avx (MSVC) (#8851) 2020-09-07 21:32:44 +01:00
Eladash abc715bc5c
SPU: Make PUT transfers use SEQ-CST ordering on Accurate DMA (#8844) 2020-09-06 12:09:14 +01:00
Eladash 2688081656 SPU: Use unaligned AVX instructions for cmp_data_avx 2020-09-06 07:51:12 +03:00
Eladash e4abd3dc5a SPU: Do not ignore pending PUT tranfers just becase GET may not be cache-line atomic
This is not the proper way to emulate non-atomic GET tranfers, as it makes it seems as if PUT atomic tranfers arent atomic. (TODO)
Incomplete GET cache line accesses still do not verify data though.
2020-09-05 22:23:55 +03:00
Eladash c7a185d4e7 SPU: Fix not acuiring reservation locks on DMA with more than one cache line (Accurate DMA) 2020-09-05 22:23:55 +03:00
Eladash 4ffc58a8ce SPU: Cleanup for Accurate PUTLLUC
Should no longer affect GET commands because Accurate DMA is available for this functionality.
2020-09-04 10:20:44 +02:00
Eladash c5c9ea1b21 SPU: Make GET's full and aligned cache line accesses atomic with Accurate DMA 2020-09-04 10:20:44 +02:00
Eladash 73d23eb6e6
SPU: Implement Accurate DMA (#8822) 2020-09-02 23:58:29 +02:00
Nekotekina ebc4a0188a Restore some code 2020-08-28 01:54:39 +03:00