Elad Ashkenazi
3d2229ca05
SPU LLVM Precompilation Fixup
2023-08-28 13:33:43 +03:00
Eladash
b5faf5800b
SPU LLVM Precompilation
...
Implement SPU function discovery in images or random SPU code
2023-08-28 09:03:56 +03:00
Malcolm Jestadt
290ff5b839
Zero register optimization for AVX-512-VBMI
...
- Take advantage of the fact that AVX instructions zero the upper 128 bits for a nice optimization when one input vector is zeroed
2023-08-28 05:09:30 +03:00
Eladash
a001e6ef09
Progress Dialog: Fix race on PPU compilation status
2023-08-22 05:40:53 +03:00
Malcolm Jestadt
f2e782f5dd
SPU LLVM: Inline timer reads for WrDec and RdDec
...
- Uses RDTSC to emulate the SPU decrementer
2023-08-13 00:16:35 +03:00
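A minimal sketch of the idea behind a timestamp-based decrementer, not the recompiler's actual code. `DecrementerModel`, its members, and the `host_ticks_per_dec_tick` ratio are hypothetical names introduced for illustration. The host timestamp is passed in as a parameter instead of being read with RDTSC, so the sketch stays portable.

```cpp
#include <cstdint>

// Hypothetical model of a decrementer driven by a host timestamp counter.
struct DecrementerModel
{
    std::uint32_t written_value = 0; // value stored by the last write (WrDec)
    std::uint64_t written_at = 0;    // host timestamp captured at that write

    // WrDec: remember the value and the moment it was written.
    void write(std::uint32_t value, std::uint64_t now_ticks)
    {
        written_value = value;
        written_at = now_ticks;
    }

    // RdDec: subtract the elapsed decrementer ticks from the written value.
    // host_ticks_per_dec_tick is an assumed, non-zero conversion ratio.
    std::uint32_t read(std::uint64_t now_ticks, std::uint64_t host_ticks_per_dec_tick) const
    {
        const std::uint64_t elapsed = (now_ticks - written_at) / host_ticks_per_dec_tick;
        return written_value - static_cast<std::uint32_t>(elapsed); // wraps modulo 2^32
    }
};
```

Because only a timestamp is stored on write and a subtraction is done on read, no background timer is needed, which is presumably what makes inlining the reads attractive.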
Malcolm Jestadt
512f0a814c
SPU LLVM: Fix for AVX-512 CFLTU path
...
- vcvttps2udq doesn't turn negative numbers into 0; fix by using a signed integer max with 0 instead of vrangeps
2023-08-12 02:55:08 +03:00
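A scalar sketch of why a signed integer max with 0 clamps negative floats, assuming IEEE 754 single precision. `cfltu_lane_model` is a hypothetical name that models one lane only; it omits the CFLTU scale factor, and the explicit upper-bound check stands in for the saturating behaviour of the hardware conversion.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Hypothetical one-lane model: clamp negatives to zero, then convert to unsigned.
inline std::uint32_t cfltu_lane_model(float x)
{
    // Reinterpret the float's bits as a signed integer.
    std::int32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));

    // Any float with the sign bit set is a negative integer, so max(bits, 0)
    // replaces it with the bit pattern of +0.0f and leaves other values untouched.
    bits = std::max<std::int32_t>(bits, 0);

    float clamped;
    std::memcpy(&clamped, &bits, sizeof(clamped));

    // Saturate values that do not fit (this also catches NaN), since a plain
    // cast of an out-of-range float is undefined behaviour in C++.
    if (!(clamped < 4294967296.0f))
        return UINT32_MAX;

    return static_cast<std::uint32_t>(clamped); // truncates toward zero
}
```

For example, `cfltu_lane_model(-1.5f)` yields 0 and `cfltu_lane_model(3.9f)` yields 3.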
Eladash
2a0278fbb1
Fixup SPU/PPU Cache Abortion
2023-08-06 21:37:10 +03:00
Megamouse
343ba8733b
Merge xfloat options
2023-08-06 09:30:53 +03:00
Ivan Chikish
d34287b2cc
Linux: use futex_waitv syscall for atomic waiting
...
In order to make this possible, some unnecessary features were removed.
2023-08-02 21:46:06 +03:00
Whatcookie
fd6829f757
SPU LLVM: AVX-512 optimization for CFLTU ( #14384 )
...
- Takes advantage of vrangeps and the new float to uint instructions from AVX-512
- Down from 6 to 3 instructions
TODO: Somehow ensure that this is what llvm outputs using CreateFPToUI?
2023-07-29 09:01:01 +03:00
Whatcookie
4ecb06c901
SPU LLVM: Optimize common SFI+ROTQMBY pattern
2023-07-28 10:26:40 +03:00
Elad Ashkenazi
9265ff53d0
Include spu.log inside RPCS3.log when SPU Debug is true
2023-07-27 19:15:32 +03:00
Eladash
50dad6801b
SPU LLVM: Use get_known_bits() in SHUFB
2023-07-18 22:27:45 +03:00
Malcolm Jestadt
ee7475a9d4
SPU LLVM: Handle SHUFB special cases with a lookup table
...
- Needs 3 instructions to handle the special cases, since x86 lacks an 8-bit SIMD shift instruction
2023-07-18 22:27:45 +03:00
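A scalar reference sketch of the SHUFB special cases handled through a lookup table. The control-byte encodings follow the SPU ISA (10xxxxxx gives 0x00, 110xxxxx gives 0xFF, 111xxxxx gives 0x80). `Quad`, `kSpecial`, and `shufb_model` are hypothetical names; the model uses SPU byte numbering (index 0 is the leftmost byte) and says nothing about how the vectorized path lays out its table.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Quad = std::array<std::uint8_t, 16>;

// Constant results indexed by the top three control bits.
// Entries 0-3 are placeholders: those control bytes select a source byte instead.
constexpr std::array<std::uint8_t, 8> kSpecial = {0, 0, 0, 0, 0x00, 0x00, 0xFF, 0x80};

// Hypothetical scalar model of SHUFB over the 32-byte concatenation a || b.
inline Quad shufb_model(const Quad& a, const Quad& b, const Quad& control)
{
    Quad result{};
    for (std::size_t i = 0; i < 16; ++i)
    {
        const std::uint8_t c = control[i];
        if (c & 0x80)
        {
            // Special case: the result byte is a constant from the table.
            result[i] = kSpecial[c >> 5];
        }
        else
        {
            // Normal case: the low five bits pick a byte from a || b.
            const std::uint8_t sel = c & 0x1F;
            result[i] = sel < 16 ? a[sel] : b[sel - 16];
        }
    }
    return result;
}
```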
oltolm
0c94606fcf
Make it compile with MSVC, Clang and GCC on Windows
2023-07-11 21:40:30 +03:00
Eladash
482dd0e8f8
SPU: Remove wrong clamp in MFC_Size
...
Just crashes real MFC.
2023-07-09 13:33:03 +03:00
RipleyTom
cbb1b1f28e
Fix spu_fm
2023-05-19 18:26:42 +03:00
RipleyTom
f11770b88b
Better accuracy for FREST/FRSQEST ( #13863 )
2023-05-15 17:20:47 +01:00
RipleyTom
5c0113ce59
Deterministic FREST and FRSQEST
2023-05-06 12:59:34 +03:00
Ivan Chikish
bb8e43f16c
SPU LLVM: fixup custom LICM pass
2023-04-22 03:07:06 +03:00
Ivan Chikish
1041284384
SPU LLVM: sink stores deeper in custom LICM pass
2023-04-21 18:11:59 +03:00
Ivan Chikish
183bea3b98
SPU LLVM: upgrade custom DSE pass
2023-04-20 11:12:31 +03:00
Ivan Chikish
39d17a94c6
SPU LLVM: make savestates unsavable inside the code
2023-04-18 12:19:15 +03:00
Ivan Chikish
8153e5338f
SPU LLVM: optimize register stores
...
Custom implementation of DSE+LICM
2023-04-18 12:19:15 +03:00
Ivan Chikish
44b3709d1d
SPU LLVM: use volatile stores for PC update
2023-04-15 12:40:59 +03:00
Ivan Chikish
ba29f0ccd1
SPU LLVM: use atomic loads in read channel count
2023-04-14 13:36:04 +03:00
Ivan Chikish
3473e19508
SPU LLVM: fix savestate safety guards
...
Volatile was removed since it prevented optimizations.
2023-04-14 07:26:30 +03:00
RipleyTom
d35fecbeea
Forces deterministic FP operations when online
2023-04-12 15:31:36 +03:00
Ivan Chikish
06b0e35fb9
Update to LLVM 16.0.1
...
Fix Zen4+ AVX-512 detection
2023-04-11 12:13:09 +03:00
oltolm
6fbca1acfd
remove unnecessary pointer bitcasts
2023-04-09 12:45:18 +03:00
Ivan Chikish
fb88e1c1c9
Update to LLVM 16.0.0, switch to upstream LLVM
2023-04-06 10:19:31 +03:00
oltolm
cf5346c263
use new LLVM API in SPURecompiler
2023-03-12 10:11:06 +03:00
Ivan Chikish
776b3b5efa
SPU LLVM: fix regression from #13500
...
Fixes #13526
2023-03-11 19:48:55 +03:00
oltolm
520524285a
llvm: update code to new API ( #13500 )
...
* llvm: update code to new API
* llvm: remove OLDLLVM define
2023-03-11 01:57:21 +03:00
Malcolm Jestadt
813f7b50c1
SPU LLVM: Minor SUMB AVX-512 path optimization
...
- Tweak shuffle to allow LLVM to emit a cheap blend instruction instead of the expensive VPERMI2W instruction
2023-01-27 13:06:48 +03:00
Eladash
2a00a88e2a
SPU LLVM: don't force-enter process_mfc_cmd() because it's slower
2022-10-04 16:28:34 +03:00
Malcolm Jestadt
d8897c585d
PPU/SPU LLVM: Allow Zen4 CPUs to use VPERMI2B/VPERMT2B instead of the vperm2b256to128 path
...
- Zen4-based CPUs can process VPERMI2B/VPERMT2B in a single uop, unlike Intel, where it takes 3 uops.
2022-10-01 15:38:29 +03:00
Nekotekina
6ff6a4989a
Implement at32() util
...
Works like .at() but uses source location for "exception".
2022-09-26 18:04:15 +03:00
Nekotekina
b49a1f27eb
Warning fixes
2022-09-17 16:35:02 +03:00
Nekotekina
5985f0eefa
BufferUtils: cleanup regarding ARM64
2022-09-07 17:59:07 +03:00
sguo35
a0d48c588a
spu/arm64: clean up assembly code generation
...
Clean up asmjit usage so we no longer unnecessarily allocate memory
for SPURecompiler functions.
2022-09-07 17:33:01 +03:00
Eladash
ee1384341e
rsx: Implement atomic vertex upload (with Strict Rendering Mode)
2022-09-01 20:09:28 +03:00
Eladash
506b9deec5
Savestates/SPU LLVM: Improve saving performance
2022-08-25 23:54:56 +03:00
Malcolm Jestadt
51e6d0a336
SPU LLVM: Add integer compare optimization for FCMGT
2022-07-29 11:59:59 +03:00
sguo35
73ed657e00
spu/arm64: fix 16 byte branch patch alignment
2022-07-15 12:37:33 +03:00
sguo35
c52abed4d3
spu: implement ubertrampoline generator for arm64
...
Implement the ubertrampoline generator for arm64. It generally follows
the x86 version, but uses asmjit to generate code instead of writing raw
opcodes to memory, trading memory usage for readability. Currently the
trampoline implementation is fairly inefficient in terms of instruction
size and is substantially larger than the x86 version.
2022-07-15 12:37:33 +03:00
sguo35
9e57efe82c
spu: implement assembly functions for arm64
2022-07-15 12:37:33 +03:00
sguo35
77ab872bec
spu: remove rotqby C++ impl
...
The rotqby C++ implementation is broken: replacing it with the
intrinsic version reliably fixes the SPURS test. A conditional branch
immediately after a rotqby instruction will fail using the C++ version
but succeed using the intrinsic.
2022-07-15 12:37:33 +03:00
Eladash
3e51426379
Savestates/SPU: Kill emulation when it's safe to save SPU state
2022-07-15 09:30:53 +03:00
Nekotekina
4b787b22c8
Implement FN (lambda shortener)
...
Useful for some higher order functions.
Allows making short lambdas even shorter.
2022-07-08 14:47:41 +03:00
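A minimal sketch of what a lambda-shortening macro can look like. `SHORT_FN` is a hypothetical stand-in, not the actual FN definition; it assumes a single argument named `x` and capture by reference. `count_greater_than` is likewise only an illustrative caller.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical macro: wraps an expression in a generic lambda whose argument is `x`.
#define SHORT_FN(...) [&](const auto& x) { return (__VA_ARGS__); }

// Example of passing the shortened lambda to a higher-order function.
inline int count_greater_than(const std::vector<int>& values, int threshold)
{
    return static_cast<int>(
        std::count_if(values.begin(), values.end(), SHORT_FN(x > threshold)));
}
```

Without the macro, the predicate would be spelled out as `[&](const auto& x) { return x > threshold; }`.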