Commit graph

216 commits

Author SHA1 Message Date
Nekotekina f95395b351 PPU LLVM: improve accuracy of VSL/VSR
Passes tests, should now be equal to interpreter.
2022-01-15 21:13:31 +03:00
Nekotekina df24cff0b1 PPU LLVM: fix VMINFP and VMAXFP accuracy
PPU cache needs to be cleared.
2022-01-15 17:36:57 +03:00
Nekotekina 6dda047128 PPU LLVM: fix VNMSUBFP sign handling
PPU cache needs to be cleared.
2022-01-15 17:36:57 +03:00
Nekotekina e9efa73eed PPU: restore previous NJ mode handling option
Fix the divergence between PPU Interpreter and LLVM.
2022-01-15 17:36:57 +03:00
Nekotekina 580bd2b25e Initial Linux Aarch64 support
* Update asmjit dependency (aarch64 branch)
* Disable USE_DISCORD_RPC by default
* Dump some JIT objects in rpcs3 cache dir
* Add SIGILL handler for all platforms
* Fix resetting zeroing denormals in thread pool
* Refactor most v128:: utils into global gv_** functions
* Refactor PPU interpreter (incomplete), remove "precise"
* - Instruction specializations with multiple accuracy flags
* - Adjust calling convention for speed
* - Removed precise/fast setting, replaced with static
* - Started refactoring interpreters for building at runtime JIT
*   (I got tired of poor compiler optimizations)
* - Expose some accuracy settings (SAT, NJ, VNAN, FPCC)
* - Add exec_bytes PPU thread variable (akin to cycle count)
* PPU LLVM: fix VCTUXS+VCTSXS instruction NaN results
* SPU interpreter: remove "precise" for now (extremely non-portable)
* - As with PPU, settings changed to static/dynamic for interpreters.
* - Precise options will be implemented later
* Fix termination after fatal error dialog
2022-01-15 06:48:04 +03:00
Eladash a60cee6536 Update PPUTranslator::MTFSFI for its intention to be clearer 2022-01-12 03:37:39 +03:00
Nekotekina e3e39e8de3 PPU LLVM: rewrite and optimize saturation bit
Use vector accumulator
2021-12-03 00:14:06 +03:00
Nekotekina 209b14fbac PPU LLVM: inline remaining vector instructions 2021-12-03 00:14:06 +03:00
Nekotekina 04c9d01390 PPU LLVM: modernize most vector instructions
Rewritten VSUM instructions:
VSUMSWS, VSUM2SWS, VSUM4SBS, VSUM4SHS, VSUM4UBS
2021-12-03 00:14:06 +03:00
Nekotekina c9d8e59dbf PPU LLVM: allow to drop setting SAT flag (optimization, module-wide)
Implement ppu_attr::has_mfvscr (partially, module-wide search).
If this instruction isn't found, allow to drop setting SAT flag.
It's based on presumption that only MFVSCR can retrieve SAT flag.
2021-12-03 00:14:06 +03:00
Nekotekina 86b194014b PPU LLVM: rewrite more packing instructions
Rewritten VPKUHUM, VPKUHUS, VPKUWUM, VPKUWUS.
Decoupled saturation test from sat pack pattern.
2021-12-03 00:14:06 +03:00
Nekotekina e7c827f73b PPU LLVM: rewrite some packing instructions
Rewritten VPKSHSS, VPKSHUS, VPKSWSS, VPKSWUS.
Decoupled saturation test from sat pack pattern.
2021-12-03 00:14:06 +03:00
Nekotekina abe498f35c PPU LLVM: modernize some code with new DSL
PPU: rewritten instructions VMHADDSHS, VMHRADDSHS
PPU: added optimized path for VPERM (ra=rb)
2021-12-03 00:14:06 +03:00
Nekotekina 69f321a471 LLVM 13 2021-11-02 20:11:08 +03:00
Malcolm Jestadt f06c8b22e8 PPU/SPU LLVM: Emulate VPERM2B with a 256 bit wide VPERMB
- Save 1 uop by using 256 wide VPERMB instead of VPERM2B. (Compiles down to a vinserti128 and vpermb)
2021-10-13 17:51:54 +03:00
Nekotekina 4b8ee85995 LLVM DSL: reimplement pshufb, add 'calli'
Implement postponed custom intrinsic replacement.
Make bitcast operator static like other ones.
2021-09-17 10:23:43 +03:00
Nekotekina 7cf9d1380b LLVM DSL: add line number in get_const_vector automatically 2021-09-17 10:23:43 +03:00
Eladash f98595bee5 Patches/PPU: Add jump_link patch type 2021-09-10 11:46:39 +03:00
Nekotekina 06f733a7f2 Fixup No.2 for #10779 2021-09-01 16:56:38 +03:00
Eladash b40ed5bdb7
Patches/PPU: Extend and improve patching capabilities (code allocations, jumps to any address) (#10779)
* Patches/PPU: Implement dynamic code allocation + Any-Address jump patches

Also fix deallocation path of fixed allocation patches.
2021-09-01 13:38:17 +03:00
Eladash ddb042148d Patches/LLVM: Implement Complex Patches Support 2021-08-26 23:04:32 +03:00
Nekotekina 160b131de3 types.hpp: implement smin, smax, amin, amax
Rewritten the following global utility constants:
`umax` returns max number, restricted to unsigned.
`smax` returns max signed number, restricted to integrals.
`smin` returns min signed number, restricted to signed.
`amin` returns smin or zero, less restricted.
`amax` returns smax or umax, less restricted.

Fix operators == and <=> for synthesized rel-ops.
2021-05-22 12:10:57 +03:00
Megamouse a16d8ba3ea More random changes 2021-04-11 14:01:51 +03:00
Nekotekina 87af905018 Enable -Wunused-parameter 2021-03-06 18:07:08 +03:00
Nekotekina 0c034ad7de PPU LLVM: upgrade to GHC call conv
Get rid of some global variables.
Implement ppu_escape (unused yet).
Bump PPU cache version to v4.
2021-02-01 11:30:50 +03:00
Nekotekina c89362f6a2 PPU LLVM: don't use module name as PRX indicator 2021-02-01 11:30:50 +03:00
Nekotekina 8a029159cd PPU Analyser: compile certain functions on per-instruction basis
PPU LLVM: optimize small blocks
2021-02-01 11:30:50 +03:00
Nekotekina 382509d778 PPU LLVM: Implement inline __add_get_ov 2021-02-01 11:30:50 +03:00
Nekotekina f9ee8978ff PPU LLVM: improve analyser
Compile possibly executable holes between detected functions.
Add unused "PPU LLVM Greedy Mode" option (for future updates).
Add "nounwind" attribute to compiled functions (reduces size).
2021-02-01 11:30:50 +03:00
Nekotekina db8e6fe7a7 Enable -Wunused-variable 2021-01-12 14:34:14 +03:00
Nekotekina bd269bccaf types.hpp: remove intrinsic includes
Replace v128 with u128 in some places.
Removed some unused files.
2020-12-21 21:11:25 +03:00
Nekotekina fb29933d3d Add usz alias for std::size_t 2020-12-18 12:23:53 +03:00
Eladash 7eb16e13bb PRX loader: Fix libfs_155.sprx loading
Fix relocations' segments referencing when there are "empty" (memsize=0) LOAD segments.
2020-12-15 11:16:45 +03:00
Nekotekina e321765c54 Split BEType.h to util/v128.hpp and util/to_endian.hpp 2020-12-13 16:34:45 +03:00
Nekotekina 65c04e4ddd Remove constexpr from ppu/spu decoders.
We don't need them at compile time (yet).
But can reduce compile time and complexity.
2020-12-10 15:06:01 +03:00
Nekotekina 36c8654fb8 Remove HERE macro
Some cleanup.
Add location to some functions.
2020-12-10 12:30:22 +03:00
Nekotekina 5d934c8759 Improve narrow() and size32() with src_loc detection 2020-12-09 16:26:20 +03:00
Nekotekina e055d16b2c Replace verify() with ensure() with auto src location.
Expression ensure(x) returns x.
Using comma operator removed.
2020-12-09 15:43:38 +03:00
RipleyTom af8c661a64 Remove BOM markers 2020-12-06 15:30:12 +03:00
Nekotekina 1b8bf081b5 Upgrade to LLVM 11 Stable 2020-11-02 21:23:25 +03:00
Eladash 443c2b920d PPU: Handle cache line inconsistencies (PPU 128 reservations) 2020-10-16 22:51:30 +03:00
Nekotekina f2d2a6b605 JIT cleanup for PPU LLVM
Remove MemoryManager3 as unnecessary.
Rewrite MemoryManager1 to use its own 512M reservations.
Disabled unwind info registration on all platforms.
Use 64-bit executable pointers under vm::g_exec_addr area.
Stop relying on deploying PPU LLVM objects in first 2G of address space.
Implement jit_module_manager, protect its data with mutex.
2020-10-11 17:22:28 +03:00
Eladash f4ca6f02a1 PPU: Implement support for 128-byte reservations coherency 2020-09-28 22:34:42 +03:00
Eladash 09cddc84be SPU/PPU: Implement Atomic Cache Line Stores 2020-09-27 20:09:21 +03:00
Eladash 8cdfe5952a
SPU/PPU LLVM: Improve 0 addend FMA detection (#8709) 2020-08-13 04:13:08 +03:00
Whatcookie 4ce2ad54a8
PPU LLVM: Use VPERM2B to emulate VPERM (#8704)
- The VPERM2B instructions are a match of VPERM's behavior, besides operating in reverse byte order
2020-08-09 01:50:26 +01:00
Eladash 7e11855330 SPU/PPU LLVM: Fix FMA signed zeroes handling 2020-08-08 22:21:22 +01:00
Eladash 6a51c27fde PPU LLVM: Fix VMAXFP, VMINFP NaN handling 2020-08-03 15:43:00 +01:00
Eladash dd497625a5 PPU LLVM: Fix constant folding of BitCast 2020-07-30 17:06:24 +01:00
Eladash f6764767f6 SPU/PPU LLVM: Fix cpu_translator::get_const_vector<v128>() 2020-07-30 17:06:24 +01:00
Whatcookie 9f829b375a
SPU/PPU LLVM: Optimize VSEL/SELB with constant mask (#8559) 2020-07-25 17:59:35 +01:00
Eladash da44d5f10d
PPU: Fix DIVW, DIVWU, MULHW, MULLW, MULHWU when op.rc is set (#8630) 2020-07-25 17:13:58 +01:00
Eladash 917069e31a
PPU Precise/LLVM: Support NJ modes (#8617) 2020-07-25 07:41:41 +01:00
Eladash 3354c800d7
SPU/PPU LLVM: Improve expressions matching (#8620) 2020-07-24 16:53:48 +01:00
sampletext32 1a8fb61373 Fix some misspells
Note: in main.cpp there are many dirs similar to Program Files, so tip should be appropriate.
2020-05-20 22:53:24 +03:00
Nick Renieris 78ac2a86bb PPU LLVM: Accurate vector instruction NaNs
Tested with https://github.com/RPCS3/ps3autotests/tree/master/tests/cpu/ppu_vpu,
results in that test improved by about half.
2020-05-14 11:14:28 +01:00
Nekotekina e1042bc631 Get rid of "module" keyword
Workaround some intellisense problems.
2020-05-06 18:20:11 +03:00
Nekotekina 58ba6d68bb Don't use std::popcount (workaround)
It seems MSVC uses POPCNT instruction when compiling for SSE2.
2020-04-25 18:01:39 +03:00
Eladash dbce10d0e3
PPU LLVM: Fix rounding regression of FNMADDS, FNMSUBS (#8066)
* PPU LLVM: Fix rounding regression of FNMADDS, FNMSUBS
2020-04-19 20:55:26 +01:00
rxys 5101bc189e
Fix FMA copypasta (#8060) 2020-04-19 19:17:19 +01:00
Nekotekina f72af2973d Replace utils::popcnt32 with std::popcount
Cleanup includes.
2020-04-14 16:05:58 +03:00
Whatcookie 6b0f7a8f55
PPU LLVM: Optimize altivec FMA with 0 addend (#8013)
- When VMADDFP and VNMSUBFP are used with a constant addend of 0, they can be simplified into a single floating multiply
2020-04-12 09:52:21 +01:00
Eladash 158b24ec25 SPU LLVM: Add accurate double-precision FMA support 2020-04-09 17:27:14 +03:00
Eladash 92f821aeb1
PPU LLVM: Add FMA accuracy setting (#7874)
* PPU LLVM : Match PS3 for the instructions fmadd, fmadds, fmsub, fmsubs, fnmadd, fnmadds, fnmsub, fnmsubs

Co-authored-by: doesthisusername <yfirestorm@gmail.com>
2020-03-31 20:01:10 +03:00
Eladash 7ed570dc4a PPU LLVM: Add relocation 5 for ADDIS
+ Add some more for u16 relocations (4, 5, 6), simplify logic.
2020-03-26 17:52:45 +03:00
Nekotekina fa29c5aa94 ppu_iname: refactor to use actual strings 2020-03-26 15:28:41 +03:00
Eladash 453478c98b PPU LLVM: Log unsupported relocation opcode 2020-03-26 15:22:45 +03:00
Nekotekina 1ceb779a38 Make ppu_decoder<> objects constexpr (partial) 2020-03-24 13:46:46 +03:00
Nekotekina 5ebc538d7e Workaround for VS 16.5
Strange codegen bug didn't promote s32 to u64.
2020-03-23 14:48:49 +03:00
Eladash cccc32fa9d
sys_lwmutex/lwcond: track lwcond waiters (#7826)
In lwmutex destroy syscall, wait for pending waiters.
2020-03-23 10:30:17 +03:00
Malcolm Jestadt 0bfdc1f62e PPU LLVM: Improve VMADDFP and VNMSUBFP
- Use native FMA to emulate VMADDFP, with a fallback for processors that don't support FMA
- Use native FMA to emulate VNMSUBFP as well, but note that it differs from the emulated path with regards to negative zero
2020-03-19 06:47:16 +03:00
Nekotekina 04dedb17eb Disable exception handling.
Use -fno-exceptions in cmake.
On MSVC, enable _HAS_EXCEPTION=0.
Cleanup throw/catch from the source.
Create yaml.cpp enclave because it needs exception to work.
Disable thread_local optimizations in logs.cpp (TODO).
Implement cpu_counter for cpu_threads (moved globals).
2020-03-12 16:03:08 +03:00
Nekotekina e4a81b1d13 Move Log.h to util/logs.hpp 2020-03-07 12:29:23 +03:00
Nekotekina 0a41999818 PPU LLVM: fix regression from warning fixes
Forgot that negative power is used here.
2020-03-05 11:07:40 +03:00
Nekotekina Aux1 250736ece5 Fix warnings in emucore 2020-03-04 21:23:34 +03:00
Nekotekina 5b0476e772 Update LLVM to new llvm-mirror (LLVM 11)
Use clang-cl to build LLVM on Windows.
2020-03-03 18:33:02 +03:00
Nekotekina 92e3eaf3ff Fix signed-unsigned comparisons and mark warning as error (part 2). 2020-02-19 22:54:58 +03:00
Nekotekina 327bb2d8f0 Modernize PPU logging (ppu_log variable) 2020-02-01 11:52:24 +03:00
Eladash 923cd7ad72 SPU LLVM: rewrite comparison on non-xfloat path of CFLTU, CFLTS
CFLTU on non-xfloat path is accurate as xfloat path now.
* Also optimize FCTIW like FCTIWZ (PPU)
2019-12-30 22:20:34 +03:00
Nekotekina 185c067d5b C-style cast cleanup V 2019-12-03 17:23:00 +03:00
Nekotekina 6e19881b82 Update LLVM (10) 2019-10-23 16:01:14 +03:00
Eladash a902874b01 Fixup after #6286 2019-08-13 13:34:14 +03:00
Eladash 4b82006984 ppu: Improve LWSYNC
Block load<->load reordering as real lwsync.
2019-08-13 04:56:00 +03:00
Eladash a560498cd4 ppu: Improve FCTIW, FCTIWZ, FCTID and FCTIDZ 2019-08-13 04:56:00 +03:00
Eladash 43f919c04b Fixup after #6143 (#6146)
vm::spu max address was overflowing resulting in issues, so cast to u64 where needed. Fixes #6145.
    Use vm::get_addr instead of manually substructing vm::base(0) from pointer in texture cache code.
    Prefer std::atomic_thread_fence over _mm_?fence(), adjust usage to be more correct.
    Used sequantially consistent ordering in semaphore_release for TSX path as well.
    Improved memory ordering for sys_rsx_context_iounmap/map.
    Fixed sync bugs in HLE gcm because of not using atomic instructions.
    Use release memory barrier in lwsync for PPU LLVM, according to this xbox360 programming guide lwsync is a hw release memory barrier.
    Also use release barrier where lwsync was originally used in liblv2 sys_lwmutex and cellSync.
    Use acquire barrier for isync instruction, see https://devblogs.microsoft.com/oldnewthing/20180814-00/?p=99485
2019-06-29 18:48:42 +03:00
JohnHolmesII be521ff0ab Fix warnings related to parentheses 2019-06-25 20:36:32 -07:00
Nekotekina 7492f335e9 SPU analyser: basic function detection in Giga mode
Misc: fix EH frame registration (LLVM, non-Windows).
Misc: constant-folding bitcast (cpu_translator).
Misc: add syntax for LLVM arrays (cpu_translator).
Misc: use function names for proper linkage (SPU LLVM).

Changed function search and verification in Giga mode.
Basic stack frame layout analysis.
Function detection in Giga mode.
Basic use of new information in SPU LLVM.
Fixed jump table compilation in SPU LLVM.
Disable broken optimization in Accurate xfloat mode.
Make compiled SPU modules position-independent in SPU LLVM.

Optimizations include but not limited to:
 * Compiling SPU functions as native functions when eligible
 * Avoiding register context write-out
 * Aligned stack assumption (CWD alike instruction)
2019-05-11 02:13:19 +03:00
Nekotekina 2ade3c594c LLVM DSL: expression matching (preview 2)
Implement more instructions.
2019-04-25 03:33:18 +03:00
Nekotekina ac473eb400 Rewrite cpu_translator::rol, add fshl and fshr
Use new funnel shift intrinsics
2019-04-24 23:55:41 +03:00
Nekotekina 42448cf3e5 Remove cpu_translator::scarry, cpu_translator::merge 2019-04-24 23:55:41 +03:00
Nekotekina 524aac75ed LLVM DSL: rewrite bitcast, zext, sext, trunc, select, min, max ops
Are made composable in expressions similar to arithmetic ops.
Implement noncast in addition to bitcast (no-op case).
Implement bitcast constant folding.
Fixed some misuse of sext<>.
2019-04-24 23:55:41 +03:00
eladash b307aff9eb Prefetch byteswapped opcodes in ppu interpreter 2019-04-11 17:47:52 +03:00
eladash 3304e3b0b7 PPU LLVM: Fix STSWI and LSWI 2019-04-11 17:47:52 +03:00
eladash f028737db8 Implement fallback for PPU LLVM
This matches with interpreter implementation, fixing unregistered functions in lost cases
2019-04-11 17:47:52 +03:00
Nekotekina d873802b9c Use LLVM 9
Use new add/sub with saturation intrinsics
2019-03-30 01:36:48 +03:00
Nekotekina 7e0b941e9f PPU LLVM: implement get_vrs<>() adaptor
Make use of structured bindings
2019-03-30 01:36:48 +03:00
eladash fb8302817f ppu: Set link unconditionally 2018-12-10 01:34:02 +03:00
Nekotekina b2f29cd4d4 LLVM: remove false alarm errors
Writable sections ARE supported
2018-09-27 12:16:43 +03:00
scribam f294729b28 ppu: improve lvebx/lvehx/lvewx instructions 2018-09-11 21:20:52 +03:00
Nekotekina a424fcfcf7 PPU LLVM: fix phenoms 2018-08-12 02:42:32 +03:00