Nekotekina
8deb20e928
SPU: write cache before compiling
2019-04-13 22:56:11 +03:00
eladash
8da78c098c
SPU LLVM: Fix branch to self at start of block state check
2019-04-11 17:47:52 +03:00
eladash
eba8e2284b
SPU LLVM: Fix CFLTU
...
Clamp the result properly from both sides!
TODO: Figure out how CreateFPToUI differs from CFLTU and why it fails here.
2019-04-11 17:47:52 +03:00
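A minimal sketch of what "clamping from both sides" means for a float-to-unsigned conversion. This is not the RPCS3 code: the helper name `cfltu_clamped` is hypothetical, the CFLTU scale factor is left out, and mapping NaN to zero is an assumption made only to keep the C++ cast well defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from RPCS3): saturating float -> u32 conversion.
// A plain cast is undefined in C++ when the value is out of range, so both
// ends are clamped explicitly before casting.
inline std::uint32_t cfltu_clamped(float x)
{
    // Lower clamp: negatives and zero give 0 (NaN also lands here, by assumption).
    if (!(x > 0.0f))
        return 0;

    // Upper clamp: 2^32 and above saturate to the largest u32.
    if (x >= 4294967296.0f)
        return UINT32_MAX;

    // In range: the cast truncates toward zero.
    return static_cast<std::uint32_t>(x);
}
```

Without the two explicit checks the out-of-range inputs have no defined result in C++, which may be related to the unclamped conversion failing here, although the commit leaves that as a TODO.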
eladash
969af86eba
SPU: Implement BISLED
...
DFCMGT instruction removed; it was wrong to add it to begin with
ASMJIT: Fix compilation of double compare instructions, move exception to runtime instead of compile time!
Jarves confirmed that he implemented this instruction only because of that asmjit bug, which affected God of War 3
2019-04-11 17:47:52 +03:00
Nekotekina
d873802b9c
Use LLVM 9
...
Use new add/sub with saturation intrinsics
2019-03-30 01:36:48 +03:00
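For reference, a scalar sketch of the semantics that saturating add/sub intrinsics provide (LLVM exposes them as `llvm.uadd.sat`, `llvm.sadd.sat`, `llvm.usub.sat` and `llvm.ssub.sat`). The helper names below are hypothetical and the code only illustrates the arithmetic; it is not how the recompiler emits them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: unsigned 8-bit add that saturates at 255 instead of wrapping.
inline std::uint8_t uadd_sat_u8(std::uint8_t a, std::uint8_t b)
{
    const unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
    return sum > 0xFFu ? std::uint8_t{0xFF} : static_cast<std::uint8_t>(sum);
}

// Hypothetical helper: signed 16-bit subtract that saturates at INT16_MIN / INT16_MAX.
inline std::int16_t ssub_sat_i16(std::int16_t a, std::int16_t b)
{
    // Widen to int so the intermediate difference cannot overflow.
    const int diff = static_cast<int>(a) - static_cast<int>(b);
    if (diff > INT16_MAX)
        return INT16_MAX;
    if (diff < INT16_MIN)
        return INT16_MIN;
    return static_cast<std::int16_t>(diff);
}
```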
Nekotekina
d77fed6105
SPU LLVM: remove wrong dead code
2019-03-29 17:00:53 +03:00
Nekotekina
71b88cdc82
New SPU interpreter (SPU fast)
...
Use LLVM to build SPU interpreter.
Simplify interpreter loop.
2019-03-27 20:33:44 +03:00
Nekotekina
7ea04d5d76
Minor optimization in SPU analyser
...
Reduce vector copy/allocation
2019-03-23 02:43:41 +03:00
Nekotekina
4b381fbbb1
Implement spu_runtime::reset
...
To handle JIT: Out Of Memory error.
2019-03-23 02:43:41 +03:00
Nekotekina
1880a17f79
SPU recs: implement spu_runtime::find
...
Use this function to link to existing functions from branch patchpoints.
Don't compile from branch patchpoints.
2019-03-23 02:43:41 +03:00
Nekotekina
31304f4234
SPU rec: refactor some trampoline generation
...
Move branch/dispatch trampoline generation to startup.
2019-03-23 02:43:41 +03:00
Nekotekina
3794f65bb6
Add cpu_flag::jit_return
2019-03-23 02:43:41 +03:00
Nekotekina
466d58ccef
SPU LLVM: fix branch patchpoints
...
Forgot to passthrough 3rd arg (rip)
2019-03-23 02:43:41 +03:00
Nekotekina
e9b6beadfc
SPU LLVM: implement static branch weights
...
May help branch prediction in some cases
2019-03-13 21:14:55 +03:00
Nekotekina
388d49db80
SPU LLVM: fix SPU MMIO in TSX mode
2019-03-13 21:14:55 +03:00
Nekotekina
fb64b28886
SPU LLVM: reintroduce branch patchpoints
...
Previously only used on SPU ASMJIT, may improve perf in some cases.
Now refactored to spu_runtime::make_branch_patchpoint.
2019-03-01 00:08:20 +03:00
Nekotekina
765d15f23f
Optimize SPU trampolines
...
Load values in EAX and reuse it if possible
2019-03-01 00:08:19 +03:00
Nekotekina
58358e85dd
spu_runtime::add minor optimization
...
Use preallocated vectors in trampoline generation subroutine
2019-01-29 03:32:16 +03:00
Nekotekina
2b66abaf10
Implement atomic_t<>::release
...
More relaxed store with release memory order
2019-01-29 03:32:16 +03:00
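A sketch of a store with release memory order, written against `std::atomic` rather than RPCS3's own `atomic_t<>` (whose interface is not shown here). The `mailbox` type and both functions are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical example type: a payload guarded by a "ready" flag.
struct mailbox
{
    int payload = 0;
    std::atomic<bool> ready{false};
};

// Writer: the release store guarantees the payload write is visible to any
// thread that later observes ready == true through an acquire load.
inline void publish(mailbox& m, int value)
{
    m.payload = value;
    m.ready.store(true, std::memory_order_release);
}

// Reader: the acquire load pairs with the release store above.
inline bool try_consume(const mailbox& m, int& out)
{
    if (!m.ready.load(std::memory_order_acquire))
        return false;
    out = m.payload;
    return true;
}
```

A release store is cheaper than a sequentially consistent one on some targets because it does not need to act as a full barrier, which is presumably the "more relaxed" aspect the commit refers to.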
Nekotekina
50922faac9
Remove SPUThread::jit_dispatcher
...
Use global array - save memory
Move the array to JIT memory
2019-01-29 03:32:16 +03:00
Nekotekina
4292997a01
Added jit_runtime class
...
A memory manager for ASMJIT; replaces asmjit::JitRuntime
Unified memory manager for ASMJIT and LLVM
Unified SPU trampoline generation
Remove previous workarounds
2019-01-29 03:32:16 +03:00
Nekotekina
4f152ad126
SPU: multithread compilation
...
Allow parallel compilation of SPU code, both at startup and runtime
Remove 'SPU Shared Runtime' option (it became obsolete)
Refactor spu_runtime class (now common to ASMJIT and LLVM)
Implement SPU ubertrampoline generation in raw assembly (LLVM)
Minor improvement of balanced_wait_until<> and balanced_awaken<>
Make JIT MemoryManager2 shared (global)
Fix wrong assertion in cond_variable
2019-01-22 22:02:02 +03:00
elad
fc92ae4085
SPU/PPU atomics performance and LR event fixes ( #5435 )
...
* Fix SPU LR event setting in atomic commands according to hw test
* MFC: increment timestamp for PUT cmd in non-tsx path
* MFC: fix reservation lost test on non-tsx path in regard to the lock bit
* Reservation notification moved out of writer_lock scope to reduce its lifetime
* Use passive_lock/unlock in ppu atomic instructions to reduce redundancy
* Lock only once for dma transfers (non-TSX)
* Don't use RDTSC in reservation update logic
* Remove MFC cmd args passing to process_mfc_cmd
* Reorder check_state cpu_flag::memory check for faster unlocking
* Specialization for 128-byte data copy in SPU dma transfers
* Implement memory range locks and isolate PPU and SPU passive lock logic
2019-01-15 18:31:21 +03:00
eladash
f19fd23227
spu: Fix support for multiple lists when one is stalled
2019-01-15 02:33:22 +03:00
Nekotekina
a419e98acb
Move PPU and shader cache
...
New hash-based location (already used for SPU)
Bump PPU cache version, improve naming and decrease size
Remove fs::get_data_dir
Disable boot.elf cache
2019-01-14 01:24:05 +03:00
Nekotekina
aefee04c4a
SPU analyser: fix branch to self
...
Fixed not filling the predecessor list on BR-to-self on entry point
Version bumped (v1-tane)
Closes #5353
2019-01-14 00:01:27 +03:00
Nekotekina
d7be0a96f3
SPU LLVM: approximate xfloat option
...
Adapt previous SPU ASMJIT changes made by @kd-11
FM, FMA, FNMS, FMS are approximated.
FCGT, FCMGT are accurate.
2018-12-24 16:04:46 +03:00
Nekotekina
2fd384ae95
SPU LLVM: check state in every callable chunk
...
It's often redundant but may be necessary
2018-11-09 16:19:59 +03:00
Nekotekina
488928eca2
Fix SPU STOP instruction
...
Check thread state after STOP instruction
2018-11-05 14:35:50 +03:00
Nekotekina
1b37e775be
Migration to named_thread<>
...
Add atomic_t<>::try_dec instead of fetch_dec_sat
Add atomic_t<>::try_inc
GDBDebugServer is broken (needs rewrite)
Removed old_thread class (former named_thread)
Removed storing/rethrowing exceptions from thread
Emu.Stop doesn't inject an exception anymore
task_stack helper class removed
thread_base simplified (no shared_from_this)
thread_ctrl::spawn simplified (creates detached thread)
Implemented overrideable thread detaching logic
Disabled cellAdec, cellDmux, cellFsAio
SPUThread renamed to spu_thread
RawSPUThread removed, spu_thread used instead
Disabled deriving from ppu_thread
Partial support for thread renaming
lv2_timer... simplified, screw it
idm/fxm: butchered support for on_stop/on_init
vm: improved allocation structure (added size)
2018-10-19 22:22:35 +03:00
Nekotekina
ca5158a03e
Cleanup semaphore<> (sema.h) and mutex.h (shared_mutex)
...
Remove semaphore_lock and writer_lock classes, replace with std::lock_guard
Change semaphore<> interface to Lockable (+ exotic try_unlock method)
2018-09-03 23:00:36 +03:00
Nekotekina
8abe6489ed
Mega-cleanup for atomic_t<> and named bit-sets bs_t<>
...
Remove "atomic operator" classes
Remove test, test_and_set, test_and_reset, test_and_complement global functions
Simplify atomic_t<> with constexpr if, remove some garbage
Redesign bs_t<> to use class, mark its methods constexpr
Implement atomic_bs_t<> for optimizations
Remove unused __bitwise_ops concept (should be in other header anyway)
Bitsets can now be tested via safe bool conversion
2018-09-03 21:40:36 +03:00
Nekotekina
9578e1e923
SPU LLVM: lower some log levels
2018-08-14 15:14:06 +03:00
eladash
f349695a75
Rsx: rewrite address translation
2018-08-13 16:16:34 +03:00
Nekotekina
fdd4f03b93
SPU LLVM: improve xfloat precision
...
Use doubles for intermediate representation
Add option "Accurate xfloat" to enable
2018-08-12 15:42:47 +03:00
Nekotekina
14e6577700
SPU LLVM: improve debugging RPCS3
...
Build cache in reverse order
Catch exceptions in instruction loop: print IR
2018-08-12 02:42:32 +03:00
Nekotekina
711e0f75ee
SPU LLVM: inline WRCH (preview)
...
With lööps for TSX bróþers
2018-08-12 02:42:32 +03:00
Nekotekina
d01bf3bcb0
SPU LLVM: rewrite CGX
2018-08-12 02:42:32 +03:00
Nekotekina
d3ad44aec4
SPU LLVM: improve constant propagation
...
Propagate constants in non-volatile registers between chunks
Disable function table in Mega mode
2018-08-12 02:42:32 +03:00
Nekotekina
9b4e63df6d
SPU LLVM: simplify CG, CGX, BG, BGX
2018-07-21 12:18:07 +03:00
scribam
1b0f5b1ed9
spu: improve dfnma instruction
2018-07-09 03:33:05 +04:00
Nekotekina
d856dc89a8
SPU LLVM: combine SELB with comparison instructions
...
Turn bitwise select into a vector select
2018-07-06 02:26:18 +03:00
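Why a bitwise select can become a vector select: when the mask comes from a comparison, every lane of it is either all ones or all zeros, so selecting bit by bit gives the same result as selecting whole elements. The sketch below checks this on four 32-bit lanes; the type alias and function names are hypothetical and this is not the recompiler's code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using v4u32 = std::array<std::uint32_t, 4>; // hypothetical 4-lane vector type

// Bitwise select in the SELB style: take bits from b where mask is 1, else from a.
inline v4u32 selb_bitwise(const v4u32& a, const v4u32& b, const v4u32& mask)
{
    v4u32 r{};
    for (std::size_t i = 0; i < r.size(); i++)
        r[i] = (a[i] & ~mask[i]) | (b[i] & mask[i]);
    return r;
}

// Comparison mask: all ones in lanes where x > y, all zeros elsewhere.
inline v4u32 cmp_gt_mask(const v4u32& x, const v4u32& y)
{
    v4u32 r{};
    for (std::size_t i = 0; i < r.size(); i++)
        r[i] = x[i] > y[i] ? 0xFFFFFFFFu : 0u;
    return r;
}

// Combined form: a per-lane select driven directly by the comparison.
inline v4u32 select_gt(const v4u32& x, const v4u32& y, const v4u32& a, const v4u32& b)
{
    v4u32 r{};
    for (std::size_t i = 0; i < r.size(); i++)
        r[i] = x[i] > y[i] ? b[i] : a[i];
    return r;
}
```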
Nekotekina
caf827344f
SPU LLVM: improve SHL, SHLH, ROTM, ROTHM instructions
...
Avoid zero extension; select zero where the result would be undef
2018-07-06 00:33:52 +03:00
Nekotekina
b9c026d441
SPU LLVM: improve ROTMA and ROTMAH instructions
...
Avoid sign extension, clamp shift amount with min op
2018-07-06 00:33:52 +03:00
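A scalar sketch of the "clamp shift amount with min" idea for a 32-bit ROTMA-style arithmetic right shift. It assumes the shift count is `(0 - b) & 0x3f` and that counts of 32 or more should yield a word filled with the sign bit; the function name is hypothetical. It also relies on `>>` being an arithmetic shift for negative signed values, which C++20 guarantees and mainstream compilers provide earlier.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: arithmetic right shift with a 6-bit count.
// Shifting by 31 already fills the word with the sign bit, so clamping the
// count to 31 covers counts 32..63 without widening the operand to 64 bits.
inline std::int32_t rotma_word(std::int32_t a, std::uint32_t b)
{
    const std::uint32_t count = (0u - b) & 0x3fu;              // assumed count encoding
    const std::uint32_t clamped = std::min<std::uint32_t>(count, 31u);
    return a >> clamped;
}
```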
Nekotekina
2b9fa7ed23
SPU LLVM: combine SHUFB with CWD-alike instructions
...
Turn SHUFB into a vector insert
2018-07-06 00:33:52 +03:00
Nekotekina
253e8b4466
SPU LLVM: improve SHUFB with constant mask
2018-07-06 00:33:52 +03:00
Nekotekina
622f2f7f66
SPU LLVM: constant computation fixes
...
Fixed instructions:
Gather Bits: GB, GBH, GBB
Form Select Mask: FSM, FSMH, FSMB
2018-07-06 00:33:52 +03:00
Nekotekina
c959ab2698
SPU LLVM: fix constant propagation
...
Compute constant bitcasts
2018-07-06 00:33:52 +03:00
Nekotekina
afd5af04f6
SPU: improve analyser (v5)
...
Fix jumptable analysis
2018-07-06 00:33:52 +03:00
Nekotekina
712632d28a
SPU LLVM: inline RDCH
2018-07-06 00:33:52 +03:00