Commit graph

240 commits

Author SHA1 Message Date
Nekotekina 0da24f21d6 CPU: improve cpu_thread::suspend_all for cache efficiency (TSX)
Add prefetch hint list parameter.
Workloads may be executed by another thread on another CPU core.
It means they may benefit from directly prefetching the data as hinted.
Also implement mov_rdata_nt, for "streaming" data from such workloads.
2020-10-30 05:22:09 +03:00
Nekotekina d6daa0d05b Fix cpu_flag::temp, make sure it removes cpu_flag::wait 2020-10-28 14:00:09 +03:00
Nekotekina 86785dffa4 SPU: make vm::check_addr checks safe under vm::range_lock
Reuse some internal locking mechanisms.
Also fix vm::range_lock missing check.
2020-10-28 14:00:09 +03:00
Nekotekina c491b73f3a SPU: improve accurate DMA
Remove vm::reservation_lock from it.
Use lock bits to prevent memory clobbering in GETLLAR.
Improve u128 for MSVC since it's used for bitlocking.
Improve 128 bit atomics for the same reason.
Improve vm::reservation_op and friends.
2020-10-28 03:47:41 +03:00
Nekotekina 4966f6de73 vm: improve range_lock and shareable cache (Non-TSX)
Allocate "personal" range lock variable for each spu_thread.
Switch from reservation_lock to range lock for all stores.
Detect actual memory mirrors in shareable cache setup logic.
2020-10-27 17:56:19 +03:00
Nekotekina 4384ae15b4 Improve vm::reservation_op
Remove XABORT, sync status handling with SPU/PPU transaction.
Limit max number of transaction attempts in loop.
Add Ack template parameter, as in vm::reservation_light_op.
Remove utils::tx_abort, improve utils::tx_start as well.
2020-10-20 09:10:21 +03:00
Nekotekina 120849c734 Implement perf stat counter for PPU/SPU reservation ops
Adds Emu/perf_meter.hpp header file.
Uses RDTSC for speed.
Prints stats at exit.
2020-10-19 19:41:28 +03:00
Nekotekina 1885e4345c Improve vm::reservation_update
Only respect unique lock.
2020-10-11 17:22:28 +03:00
Nekotekina 050c3e1d6b Rewrite cpu_thread::suspend_all
Now it's a function of higher order.
Make only one thread do the hard work of thread pausing.
2020-10-10 13:58:48 +03:00
Nekotekina 346a1d4433 vm: rewrite reservation bits
Implement classic unique/shared locking concept.
Implement vm::reservation_light_op.
2020-10-10 13:58:48 +03:00
Nekotekina 89f1248140 Implement vm::reservation_op
Implement vm::reservation_peek (memory load)
Implement vm::unsafe_ptr_cast helper
Example use in cellSpurs.cpp
Fix dma_lockb value and description
2020-10-07 20:11:59 +03:00
Eladash f4ca6f02a1 PPU: Implement support for 128-byte reservations coherency 2020-09-28 22:34:42 +03:00
Eladash 53c8ed6a63 vm: Fix stack memory release, always reset memory flags 2020-09-26 21:48:12 +03:00
Eladash 73d23eb6e6
SPU: Implement Accurate DMA (#8822) 2020-09-02 23:58:29 +02:00
Eladash ee953f7953 Fix vm::reservation_update 2020-08-16 22:58:49 +03:00
Eladash d9750e8f9f SPU/PPU reservations: Optimizations for reservation locks and check_state() (non-TSX) 2020-07-09 03:17:35 +01:00
Eladash 3df83e03a9 SPU: Fix possible deadlock after access violation via DMA transfers 2020-05-28 23:23:11 +03:00
Eladash a199c71a55 SPU: Fix page faults notifications (non-TSX) 2020-05-28 23:23:11 +03:00
Eladash 93122196d9 vm: Fix vm::passive_lock regression (#8175)
possible broken signaling in rare occusions.
2020-05-13 16:53:59 +03:00
Eladash b38580bf32 SPU: Use optimized PPU signaling to SPU on reservation pause 2020-05-13 11:10:13 +01:00
Eladash a6025df7de vm: reset stack memory after deallocation 2020-05-06 18:03:37 +03:00
Nekotekina 58ba6d68bb Don't use std::popcount (workaround)
It seems MSVC uses POPCNT instruction when compiling for SSE2.
2020-04-25 18:01:39 +03:00
Nekotekina 0f6a0d2740 Expand vm::g_addr_lock to 64 bit to support ranges
Optimization.
2020-04-16 02:25:43 +03:00
Nekotekina c7fe8567b8 Experimental squashing of reservation memory area.
Enables trivial synchronization between shared mem.
Reduces memory usage, but potentially degrades performance.
Rename an overload of vm::passive_lock to vm::range_lock.
2020-04-16 02:25:43 +03:00
Nekotekina f72af2973d Replace utils::popcnt32 with std::popcount
Cleanup includes.
2020-04-14 16:05:58 +03:00
Nekotekina 032e7c0491 Replace utils::cntlz{32,64} with std::countl_zero 2020-04-14 16:05:58 +03:00
Eladash bbbd06dcee Make vm::_test_map aware of overflow
+ remove vm::find_map 1024mb limit.
2020-04-05 17:40:23 +03:00
Eladash cb4192bce9 vm: Log all guest memory bases at startup 2020-03-14 18:30:14 +02:00
Nekotekina c0f80cfe7a Use attributes for LIKELY/UNLIKELY
Remove LIKELY/UNLIKELY macro.
2020-02-05 10:42:34 +03:00
Nekotekina 1a78e0e80c Make RPCS3 compile in C++2a mode 2020-02-04 23:43:55 +03:00
Nekotekina 3c0bd821c8 Give log channels fancier names
Improve LOG_CHANNEL macro to accept custom name.
2020-02-01 10:43:43 +03:00
Nekotekina 26cccead6e logs: remove legacy MEMORY channel
Add channels vm_log, sig_log.
2020-01-31 16:44:48 +03:00
Nekotekina 185c067d5b C-style cast cleanup V 2019-12-03 17:23:00 +03:00
Nekotekina 2290c389d6 vm: implement vm::try_access, vm::ptr::try_read/write 2019-11-26 00:12:45 +03:00
eladash 730e9cde84 sys_rsx: Improve allocations and error checks
* allow sys_rsx_device_map to be called twice: in this case the DEVICE address retrived from the previous call returned
* Add ENOMEM checks for sys_rsx_memory_allocate and sys_rsx_context_allocate
* add EINVAL check for sys_rsx_context_allocate if memory handle is not found
* Separate sys_rsx_device_map allocation from sys_rsx_context_allocate's
* Implement sys_rsx_memory_free; used by cellGcmInit upon failure
* Added context_id checks
* Throw if sys_rsx_context_allocate was called twice.
2019-10-21 15:31:45 +03:00
Nekotekina 5f9c5e8765 Use g_fxo for rsx::thread 2019-09-26 23:26:36 +03:00
Nekotekina feee3838eb Revert "Revert "Remove shared_cond and simplify reservation waiting""
This reverts commit b70c08a2e8.
2019-09-24 05:01:00 +03:00
Nekotekina 20cb19618d Fix vm::reserve_map NRVO 2019-09-18 21:24:04 +03:00
Nekotekina b70c08a2e8 Revert "Remove shared_cond and simplify reservation waiting"
This reverts commit 0a96497e13.
2019-09-14 00:02:48 +03:00
Nekotekina 0a96497e13 Remove shared_cond and simplify reservation waiting
Use atomic wait for reservations
Cleanup some obsolete code
2019-09-10 19:25:39 +03:00
Nekotekina a45f86a4a2 Remove notifier class
Poorly implemented condition variable.
2019-09-10 19:25:39 +03:00
Eladash ec9b896fbf Fix vm::reserve_map logic 2019-08-22 03:53:40 +03:00
Eladash 0d88f037ff Add new accuracy control for PUTLLUC accuracy setting (non-TSX)
With the option enabled GET commands are blocked until the current PUTLLC/PUTLLUC executer on that address finishes

Additional improvements:
- Minor race fix of sys_ppu_thread_exit (wait until the writer finishes)
- Max number of ppu threads bumped to 8
2019-08-17 00:42:46 +03:00
Nekotekina f8f3067deb Always check page_allocated in vm::check_addr 2019-08-14 20:28:34 +03:00
Eladash 6d3fc3a386 core config: Expose min/max ranges of integral settings and use it 2019-08-13 04:56:00 +03:00
Eladash a6c94a0eaf Fix possible infinite loop on vm area searching (sys_mmapper_allocate_address)
Specifically when allocation with 0x8000'0000 alignment fails.
2019-08-13 04:56:00 +03:00
Eladash cbcd06d1dc ppu: Stack size allocation improvements 2019-08-11 21:43:13 +03:00
Eladash 3ce18fd960 Implement vm::page_executable (#6330)
Fixes segfaults when attenpting to set segfaults on non-executable memory.
2019-08-11 21:04:17 +03:00
Eladash 997e3046e3 vm/sys_overlay Improvements
- Implement sys_overlay_load_module_by_fd.
- Implement special segment allocation when ppc_seg flag is specified.
2019-07-28 14:23:58 +03:00
Nekotekina c01f1a8968 Avoid transitive include of vm_ref.h
Add forward declarations of vm::_ref_base
Remove default AT = u32 in _ptr_base and _ref_base (doesn't play well).
2019-07-15 15:46:46 +03:00
Eladash 4c2fb54b99 Fix possible inconsistencies for sys_memory mem stats report 2019-07-04 22:35:22 +03:00
msuih 690cdff0d3 Minor fixes
- Fix a typo in OpenAL
- Fix typo in cellHttp.h
- Unused variables in catch
- Use 64-bit shifts
- Use use_count with shared pointers, unique is depracated and getting removed
- Explicitly cast boolean to int
- Signed/unsigned issues with loop variables
- Fix missing return statement (the code path is unreachable, but compiler wants a return)
- */ ouside of comment
- Fix duplicate layout name
2019-07-01 04:33:23 +03:00
Eladash 43f919c04b Fixup after #6143 (#6146)
vm::spu max address was overflowing resulting in issues, so cast to u64 where needed. Fixes #6145.
    Use vm::get_addr instead of manually substructing vm::base(0) from pointer in texture cache code.
    Prefer std::atomic_thread_fence over _mm_?fence(), adjust usage to be more correct.
    Used sequantially consistent ordering in semaphore_release for TSX path as well.
    Improved memory ordering for sys_rsx_context_iounmap/map.
    Fixed sync bugs in HLE gcm because of not using atomic instructions.
    Use release memory barrier in lwsync for PPU LLVM, according to this xbox360 programming guide lwsync is a hw release memory barrier.
    Also use release barrier where lwsync was originally used in liblv2 sys_lwmutex and cellSync.
    Use acquire barrier for isync instruction, see https://devblogs.microsoft.com/oldnewthing/20180814-00/?p=99485
2019-06-29 18:48:42 +03:00
Eladash 1ee7b91646 Refactoring (#6143)
Prefer vm::ptr<>::ptr over vm::get_addr.
    Prefer vm::_ptr/base over vm::g_base_addr with offset.
    Added methods atomic_t<>::bts and atomic_t<>::btr .
    Removed obsolute rsx:🧵:Read/WriteIO32 methods.
    Removed wrong check in semaphore_release.
    Added handling for PUTRx commands for RawSPU MFC proxy.
    Prefer overloaded methods of v128 instead of _mm_... in VPKSHUS ppu interpreter precise.
    Fixed more potential overflows that may result in wrong behaviour.
    Added io/size alignment check for sys_rsx_context_iounmap.
    Added rsx::constants::local_mem_base which represents RSX local memory base address.
    Removed obsolute rsx:🧵:main_mem_addr/ioSize/ioAddress members.
2019-06-29 01:27:49 +03:00
Nekotekina 1641be5e0c Fix UTF-8 BOM in vm.cpp 2019-06-25 22:21:56 +03:00
Lassi Hämäläinen c963c51a60 Remove unnecessary header includes
- Manually removed lot of unneeded #includes to clean code and reduce
  compilation time
- Reordered some of the #includes to be in more logical order
2019-06-25 17:11:10 +03:00
Lassi Hämäläinen 499035512b Split Emu/Memory into more logical headers
- Add vm_locking.h and vm_reservation.h and move relevant functions
  and types to these headers.
- Change include order and make vm_ptr.h, vm_var.h and vm_ref.h headers
  usable invidually and them including vm.h instead of other way around
- Because usage of vm::ptr now requires including vm_ptr.h instead of
  vm.h updated multiple #includes
- Added additional #includes to vm_reservation.h and vm_locking to
  where vm::reservation_* and locking related functions are used
2019-06-25 17:11:10 +03:00
Eladash 806a7bbf04 Fixup for #6115 (#6120) 2019-06-22 12:10:47 +03:00
Eladash ade291e73d Fix potential overflow in sys_vm 2019-06-21 00:02:52 +03:00
Nekotekina 5d45a3e47d Implement cpu_thread::suspend_all
Remove Accurate PUTLLC option.
Implement fallback path for SPU transactions.
2019-06-19 20:36:12 +03:00
eladash b307aff9eb Prefetch byteswapped opcodes in ppu interpreter 2019-04-11 17:47:52 +03:00
eladash 182054b8af Implement sys_vm_append/return_memory 2019-03-31 14:57:21 +03:00
eladash a3f65084df Fix sys_process_exit2 when SPUs are at av handler 2019-03-31 14:57:21 +03:00
eladash 6502d933df Fix stack memory view on the debugger
the debugger uses super ptr which was unmapped for stack.
2019-03-31 14:57:21 +03:00
eladash a43e7c172c Fix shared memory page flags
TODO: From hw testing, it seems like sys_memory_get_page_attribute and sys_rsx_context_iomap check page size a little differently

get_page_attribute() always go by area flags, sys_rsx_context_iomap checks page by the page granularity
This means that if the area page size 64k, but shared memory is mapped with SYS_MEMORY_GRANULARITY_1M
It can be mapped for rsxio, but the page attribute will indicate 64k page size :thonk:
rsxio memory is verified to need 1m pages.
2019-03-08 23:44:46 +03:00
eladash e38b7aee5a check address in sys_rsx_context_iomap
* Fix 0 vm page flags to behave like 1m flags, follows c8a681e60
* check if address exists and valid for rsx io allcations (must be allocated on 1m pages)
2019-03-05 21:23:24 +03:00
eladash d4a24433e8 Fix DECR mode allocations (sys_memory) 2019-01-31 16:03:38 +03:00
Nekotekina 2b66abaf10 Implement atomic_t<>::release
More relaxed store with release memory order
2019-01-29 03:32:16 +03:00
elad fc92ae4085 SPU/PPU atomics performance and LR event fixes (#5435)
* Fix SPU LR event setting in atomic commands according to hw test
* MFC: increment timestamp for PUT cmd in non-tsx path
* MFC: fix reservation lost test on non-tsx path in regard to the lock bit
* Reservation notification moved out of writer_lock scope to reduce its lifetime
* Use passive_lock/unlock in ppu atomic inctrustions to reduce redundancy
* Lock only once for dma transfers (non-TSX)
* Don't use RDTSC in reservation update logic
* Remove MFC cmd args passing to process_mfc_cmd
* Reorder check_state cpu_flag::memory check for faster unlocking
* Specialization for 128-byte data copy in SPU dma transfers
* Implement memory range locks and isolate PPU and SPU passive lock logic
2019-01-15 18:31:21 +03:00
Nekotekina 1b37e775be Migration to named_thread<>
Add atomic_t<>::try_dec instead of fetch_dec_sat
Add atomic_t<>::try_inc
GDBDebugServer is broken (needs rewrite)
Removed old_thread class (former named_thread)
Removed storing/rethrowing exceptions from thread
Emu.Stop doesn't inject an exception anymore
task_stack helper class removed
thread_base simplified (no shared_from_this)
thread_ctrl::spawn simplified (creates detached thread)
Implemented overrideable thread detaching logic
Disabled cellAdec, cellDmux, cellFsAio
SPUThread renamed to spu_thread
RawSPUThread removed, spu_thread used instead
Disabled deriving from ppu_thread
Partial support for thread renaming
lv2_timer... simplified, screw it
idm/fxm: butchered support for on_stop/on_init
vm: improved allocation structure (added size)
2018-10-19 22:22:35 +03:00
Nekotekina da6ce80f4f Make vm::get_super_ptr return contiguous memory
Cleanup RSX code complexity
2018-09-27 23:37:13 +03:00
Rui Pinheiro f3029b2b42 Change Cell->RSX map/unmap notifications
This allows for further flexibility on the RSX side, allowing us to fix
some bugs and crashes in later commits.
2018-09-24 15:26:40 +03:00
Nekotekina ed9fb8405b Move rotate/cntlz/cnttz helpers to Utilities/asm.h 2018-09-08 00:32:04 +03:00
Nekotekina 8abe6489ed Mega-cleanup for atomic_t<> and named bit-sets bs_t<>
Remove "atomic operator" classes
Remove test, test_and_set, test_and_reset, test_and_complement global functions
Simplify atomic_t<> with constexpr if, remove some garbage
Redesign bs_t<> to use class, mark its methods constexpr
Implement atomic_bs_t<> for optimizations
Remove unused __bitwise_ops concept (should be in other header anyway)
Bitsets can now be tested via safe bool conversion
2018-09-03 21:40:36 +03:00
Nekotekina 85fa0942e7 vm: allow 4k-aligned allocations for vm::stack
Fix utils::shm::map logic for MapViewOfFileEx
2018-08-30 14:56:45 +03:00
Nekotekina e8d144f399 Improve vm::reader_lock
Add upgrade() method
2018-08-18 21:14:52 +03:00
Nekotekina 6ec4a88eb5 Hardcode vm::user64k location 2018-08-18 21:02:16 +03:00
Nekotekina 182c04b59d Improve vm::unmap
Prevent unmapping predefined locations
2018-08-18 21:01:32 +03:00
eladash 061c787e56 rsx-capture: unbreak 2018-08-16 03:27:11 +04:00
Nekotekina 5e20d4b481 Fix vm regression from #4975
Incorrect vector management
2018-08-14 23:57:20 +03:00
Nekotekina aa4040bb7b Implement vm::find_map; improve memory allocation
Add vm::user64k and vm::user1m constants
Remove vm::user_space, unreserve it
2018-08-14 15:14:06 +03:00
eladash 449888b9db Rsx/vm: fix base addresses 2018-08-13 16:16:34 +03:00
eladash f349695a75 Rsx: rewrite address translation 2018-08-13 16:16:34 +03:00
Nekotekina 1ac203a958 Funny workaround 2018-07-05 22:26:35 +03:00
eladash b456955688 rsx: fix hardcoded rsx allocation address 2018-06-24 10:57:30 +03:00
Nekotekina 72574b11ff SPU: use reservation spinlocks on writes (non-TSX)
This should decrease contention by avoiding global lock
2018-05-21 21:56:14 +03:00
Nekotekina 367f039523 Build transactions at runtime
Drop _xbegin family intrinsics due to bad codegen
Implemented `notifier` class, replacing vm::notify
Minor optimization: detach transactions from global mutex on TSX path
Minor optimization: don't acquire vm::passive_lock on PPU on TSX path
2018-05-16 17:31:58 +03:00
Nekotekina 5d15d64ec8 Memory mirror support
Implemented utils::memory_release (not used)
Implemented utils::shm class (handler for shared memory)
Improved sys_mmapper syscalls
Rewritten ppu_patch function
Implemented vm::get_super_ptr (ignores memory protection)
Minimal allocation alignment increased to 0x10000
2018-05-09 23:35:34 +03:00
Nekotekina 3681507136 Rewrite vm::reservation
Use flat virtual memory area
2018-04-07 20:51:21 +03:00
Nekotekina 2b5cf2455f SPU: improve TSX usage
Reduce transaction failure amount
Remove vm::try_to_lock
2018-04-06 21:47:54 +03:00
Nekotekina d392379c7a Use vm::passive_lock for SPU threads 2018-04-06 15:47:00 +03:00
Nekotekina cce0ad0c35 Clean vm::ps3 namespace use 2018-02-09 17:49:37 +03:00
Nekotekina 76be7d40ac Remove PSP2 2018-02-09 15:24:46 +03:00
Nekotekina df2fc13b7a Add PPU instruction stat dumper
Needs PPU Debug option to activate and PPU Interpreter
Dumps after Resume (after Pause)
Fix utils::memory_decommit, clean vm.cpp
2017-10-11 20:06:33 +03:00
Nekotekina 22d8b57027 Improve lv2_memory object 2017-09-14 00:23:23 +03:00
kd-11 1da732bbf5 rsx/gl/vk: Invalidate texture regions when memory is unmapped
- Free GPU resources immediately if mappings change to avoid leaking VRAM
2017-08-16 23:58:30 +03:00
Nekotekina af11ad6253 Fix deadlock in vm::unmap 2017-08-13 21:39:08 +03:00
Nekotekina 7b4c044480 Relax allocations in ppu_load_exec 2017-08-04 14:33:53 +03:00
Jake d9a693019b rsx/gcm: Implement rsx dma. Refactor gcm/rsx to not be as codependent 2017-08-02 01:33:12 +03:00
Nekotekina b24eb621ae Use RTM instructions (skylake+) 2017-07-20 17:22:09 +03:00