With the option enabled GET commands are blocked until the current PUTLLC/PUTLLUC executer on that address finishes
Additional improvements:
- Minor race fix of sys_ppu_thread_exit (wait until the writer finishes)
- Max number of ppu threads bumped to 8
This is the only reason theres a need to specify lwmutex_sq id at creation. unlike sys_cond, lwcond isn't connected to lwmutex at the lv2 level.
SYS_SYNC_RETRY fix is done explicitly at the firmware level.
This fixes issues when the original lwcond and lwmutexol data got corrupted or deallocated, this can happen when the program simply memcpy the control data to somewhere else.
Or if it uses direct lv2 the lwcond conrol param can even be NULL which will make it access violate when dereferncing it. (This param is unchecked and can be anything)
- Fix a typo in OpenAL
- Fix typo in cellHttp.h
- Unused variables in catch
- Use 64-bit shifts
- Use use_count with shared pointers, unique is depracated and getting removed
- Explicitly cast boolean to int
- Signed/unsigned issues with loop variables
- Fix missing return statement (the code path is unreachable, but compiler wants a return)
- */ ouside of comment
- Fix duplicate layout name
vm::spu max address was overflowing resulting in issues, so cast to u64 where needed. Fixes#6145.
Use vm::get_addr instead of manually substructing vm::base(0) from pointer in texture cache code.
Prefer std::atomic_thread_fence over _mm_?fence(), adjust usage to be more correct.
Used sequantially consistent ordering in semaphore_release for TSX path as well.
Improved memory ordering for sys_rsx_context_iounmap/map.
Fixed sync bugs in HLE gcm because of not using atomic instructions.
Use release memory barrier in lwsync for PPU LLVM, according to this xbox360 programming guide lwsync is a hw release memory barrier.
Also use release barrier where lwsync was originally used in liblv2 sys_lwmutex and cellSync.
Use acquire barrier for isync instruction, see https://devblogs.microsoft.com/oldnewthing/20180814-00/?p=99485
Prefer vm::ptr<>::ptr over vm::get_addr.
Prefer vm::_ptr/base over vm::g_base_addr with offset.
Added methods atomic_t<>::bts and atomic_t<>::btr .
Removed obsolute rsx:🧵:Read/WriteIO32 methods.
Removed wrong check in semaphore_release.
Added handling for PUTRx commands for RawSPU MFC proxy.
Prefer overloaded methods of v128 instead of _mm_... in VPKSHUS ppu interpreter precise.
Fixed more potential overflows that may result in wrong behaviour.
Added io/size alignment check for sys_rsx_context_iounmap.
Added rsx::constants::local_mem_base which represents RSX local memory base address.
Removed obsolute rsx:🧵:main_mem_addr/ioSize/ioAddress members.
- Multiple header files where missing #includes to other headers that
where used in the header. Correct header was included in correct
order in source files which caused everything to compile.
- Added missing #includes so header files correctly include all their
dependencies and fixes problems with IDEs being unable to parse
headers correctly due to missing symbols
- Add vm_locking.h and vm_reservation.h and move relevant functions
and types to these headers.
- Change include order and make vm_ptr.h, vm_var.h and vm_ref.h headers
usable invidually and them including vm.h instead of other way around
- Because usage of vm::ptr now requires including vm_ptr.h instead of
vm.h updated multiple #includes
- Added additional #includes to vm_reservation.h and vm_locking to
where vm::reservation_* and locking related functions are used
Misc: fix EH frame registration (LLVM, non-Windows).
Misc: constant-folding bitcast (cpu_translator).
Misc: add syntax for LLVM arrays (cpu_translator).
Misc: use function names for proper linkage (SPU LLVM).
Changed function search and verification in Giga mode.
Basic stack frame layout analysis.
Function detection in Giga mode.
Basic use of new information in SPU LLVM.
Fixed jump table compilation in SPU LLVM.
Disable broken optimization in Accurate xfloat mode.
Make compiled SPU modules position-independent in SPU LLVM.
Optimizations include but not limited to:
* Compiling SPU functions as native functions when eligible
* Avoiding register context write-out
* Aligned stack assumption (CWD alike instruction)
If the rwlock is currently acquired by a writer signaling readers is wrong and will lead to EPERM for wunlock!
Only signal blocked readers if the rwlock is currently acquired by readers
readers can wait on the sleep queue if a writer lock has been blocked before it, in this case after runlock: writer should acquire the lock but the r's sleep queue is still not empty!
* Hw tests show state is unaffected by external destruction of the event queue
* Minor race regarding state check fixed (can result in an undestroyable state)
* Implement both RawSPU and threaded SPU page fault recovery
* Guard page_fault_notification_entries access with a mutex
* Add missing lock in sys_ppu_thread_recover_page_fault/get_page_fault_context
* Fix EINVAL check in sys_ppu_thread_recover_page_fault, previously when the event was not found begin() was erased and
CELL_OK was returned.
* Fixed page fault recovery waiting logic:
- Do not rely on a single thread_ctrl notification (unsafe)
- Avoided a race where ::awake(ppu) can be called before ::sleep(ppu) therefore nop-ing out the notification
* Avoid inconsistencies with vm flags on page fault cause detection
* Fix sys_mmapper_enable_page_fault_notification EBUSY check
from RE it's allowed to register the same queue twice (on a different area) but not to enable page fault notifications twice
TODO: From hw testing, it seems like sys_memory_get_page_attribute and sys_rsx_context_iomap check page size a little differently
get_page_attribute() always go by area flags, sys_rsx_context_iomap checks page by the page granularity
This means that if the area page size 64k, but shared memory is mapped with SYS_MEMORY_GRANULARITY_1M
It can be mapped for rsxio, but the page attribute will indicate 64k page size :thonk:
rsxio memory is verified to need 1m pages.
- From RE, only protocols SYS_SYNC_FIFO and SYS_SYNC_PRIORITY are valid
- Use conditional atomic op store in a few places
- Properly revert changes in sys_event_flag_set when aomic op fails
* Fix 0 vm page flags to behave like 1m flags, follows c8a681e60
* check if address exists and valid for rsx io allcations (must be allocated on 1m pages)
Implement helper functions balanced_wait_until and balanced_awaken
They include new path for Windows 8.1+ (WaitOnAddress)
shared_mutex, cond_variable, cond_one, cond_x16 modified to use it
Added helper function utils::popcnt16
Replace most semaphore<> with shared_mutex
Move lambda into a cpu_stop()
Use running thread counter to synchronize with sys_spu_thread_group_join()
Use SPU_STATUS_STOPPED_BY_STOP exclusively for sys_spu_thread_exit() as before
Remove unnecessary waiting in sys_spu_thread_group_exit()
Rollback some minor unnecessary changes
Use shared_mutex in SPU TG
*Set priority under a lock
*Fix yield command making threads going out of scheduler control by removing it from the queue (not a bug that affects compatibility)
* gcm: Fix tile offset setting
highest bit signifyies location, so ignore that while reading the offset.
* rsx-capture: Fix tile binding
fixes division by zero when dividing by pitch when the tile is not bound.
* rsx-capture: Fix zcull binding
Preprocess . and .. correctly
Don't use recursive locking
Also use std::string_view
Fix format system for std::string and std::string_view
Fix fmt::merge for std::string_view
Remove "atomic operator" classes
Remove test, test_and_set, test_and_reset, test_and_complement global functions
Simplify atomic_t<> with constexpr if, remove some garbage
Redesign bs_t<> to use class, mark its methods constexpr
Implement atomic_bs_t<> for optimizations
Remove unused __bitwise_ops concept (should be in other header anyway)
Bitsets can now be tested via safe bool conversion
- Matches ps3 accuracy for all tested values with few exceptions
- Do not enter the host OS kernel if waiting for less than 500us to avoid scheduler issues
1. rsx: Rework section synchronization using the new memory mirrors
2. rsx: Tweaks
- Simplify peeking into the current rsx::thread instance.
Use a simple rsx::get_current_renderer instead of asking fxm for the same
- Fix global rsx super memory shm block management
3. rsx: Improve memory validation. test_framebuffer() and
tag_framebuffer() are simplified due to mirror support
4. rsx: Only write back confirmed memory range to avoid overapproximation errors in blit engine
5. rsx: Explicitly mark clobbered flushable sections as dirty to have them
removed
6. rsx: Cumulative fixes
- Reimplement rsx::buffered_section management routines
- blit engine subsections are not hit-tested against confirmed/committed memory range
Not all applications are 'honest' about region bounds, making the real cpu range useless for blit ops
it's currently unknown whats the exact relationship between the prio and the group type SYS_SPU_THREAD_GROUP_TYPE_COOPERATE_WITH_SYSTEM (0x20).
tho we do know prio'es whom less than 16 are reserved for the system.
Use atomic variable to sync TTY size
Implement console_putc (liblv2)
Write plaintext instead of HTML
Slightly improve performance
Fix random line breaks in TTY
- opengl driver optimization for nvidia. On nvidia glTextureBufferRange performance is horrendous
-- Initialize texture buffer to whole buffer at startup and use absolute offsets to read data instead
-- Over 2x performance in some cases (Resogun, TNT racers)
- gl/vk: Do not flip non-existent display buffers. Fixes spec violation at boot in TNT racers demo
- whitespace fixes for sys_rsx
- Track last working state and reset to it if RSX starts to desync
-- This is especially useful when running vulkan since the renderer will easily outpace the rest of the system when merely recording draw commands
- Ignore empty sets
-- Mark empty/invalid IB sets as having 0 element counts.
Implement sys_net syscalls
Clean libnet functions
Use libnet.sprx
Use libhttp.sprx
Use libssl.sprx
Use librudp.sprx
Implement sys_ss_random_number_generator
Implemented _sys_process_exit2 syscall
Rewritten sys_game_process_exitspawn
Rewritten sys_game_process_exitspawn2
Implemented _sys_process_atexitspawn
Implemented _sys_process_at_Exitspawn
And some other changes