Commit graph

8152 commits

Author SHA1 Message Date
Herman S.
92b341e50c [Testing] Add support for building binutils with mingw64 2026-02-14 17:40:40 +09:00
Herman S.
f2b9b57e18 Reapply "[x64] Zero extend on mov to 8bit register"
This reverts commit 3a6f63f34f.

(The issue wasn't xbyak but rather the cherry-pick order with respect to
other commits; the xbyak upgrade was a red herring.)
2026-02-14 02:23:58 +09:00
Herman S.
f5afafaec0 Ensure stack allocations maintain 16-byte alignment for AVX instructions 2026-02-14 02:02:40 +09:00
Herman S.
1548b4e11c [CPU/PPC] Add VX128_2 VC field support for vperm128
Add 3-bit VC field to VX128_2 format structure to support
instructions like vperm128 that use 4 distinct vector operands.

The VC field occupies bits 6-8 of the instruction encoding.
2026-02-14 01:57:22 +09:00
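A minimal sketch of reading the VC field described in 1548b4e11c. It assumes "bits 6-8" counts from the least significant bit of the 32-bit instruction word. `ExtractVC` is a hypothetical helper, not a name taken from the codebase.

```cpp
#include <cstdint>

// Hypothetical helper. Assumption: bit 0 is the least significant bit of
// the instruction word, so the 3-bit VC field sits at bits 6, 7 and 8.
constexpr std::uint32_t ExtractVC(std::uint32_t code) {
  return (code >> 6) & 0x7u;
}

static_assert(ExtractVC(0x000001C0u) == 7u, "all three VC bits set");
static_assert(ExtractVC(0xFFFFFE3Fu) == 0u, "everything except VC set");
```

If the commit instead numbers bits from the most significant end, as PowerPC manuals do, the shift would differ; the 3-bit mask would stay the same.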
Herman S.
f4af1e2a77 [CPU/PPC] Implement and fix FPSCR-related instructions
mcrfs: Implement Move CR from FPSCR instruction
Copies a 4-bit FPSCR field to CR and clears the FPSCR exception bits.

mffsx: Fix Rc bit handling
Properly update CR1 from FPSCR when Rc=1 instead of returning error.
Previously treated Rc=1 as unimplemented.

fcfidx: Fix Rc field access
Use i.X.Rc instead of i.A.Rc for correct instruction format.
fcfid uses X format, not A format.
2026-02-14 01:55:54 +09:00
Herman S.
fb76cbb291 [CPU/PPC] Implement mcrxr instruction
Implement mcrxr (Move to Condition Register from XER).

Copies XER condition bits (SO, OV, CA) to a CR field and
clears those bits in XER. This was previously unimplemented.
2026-02-14 01:45:44 +09:00
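The mcrxr behavior described in fb76cbb291 can be sketched at the architectural level. This assumes CR and XER are held as packed 32-bit values with SO, OV and CA in the top bits of XER; the emulator may store these flags differently. `Mcrxr` and `McrxrResult` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical result pair: the updated CR and XER images.
struct McrxrResult {
  std::uint32_t cr;
  std::uint32_t xer;
};

// Assumption: packed 32-bit CR/XER. XER's top nibble holds SO, OV, CA and
// one reserved bit; CR field 0 is the most significant nibble of CR.
inline McrxrResult Mcrxr(std::uint32_t cr, std::uint32_t xer, unsigned crf) {
  const std::uint32_t field = xer >> 28;       // SO, OV, CA, reserved
  const unsigned shift = (7u - (crf & 7u)) * 4u;
  cr = (cr & ~(0xFu << shift)) | (field << shift);
  xer &= 0x0FFFFFFFu;                          // clear the copied bits
  return {cr, xer};
}
```

For example, `Mcrxr(0x12345678u, 0xA0000005u, 7)` copies SO and CA into CR field 7 and leaves only the low XER bits set.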
Herman S.
b5d2eea07b [CPU/PPC] Fix mfvscr/mtvscr instruction format
Correct mfvscr and mtvscr to use VX format instead of VX128_1.

These instructions operate on the standard Altivec VSCR register,
not VMX128 extended registers. The previous VX128_1 format was
incorrectly accessing the RB field instead of VD/VB.
2026-02-14 01:44:33 +09:00
Herman S.
9369464396 [PPC] vsubcuw is actually implemented, enable it 2026-02-14 01:39:00 +09:00
Herman S.
7e66a85f43 [x64/Linux] Return __m128i by value in xmm0 rather than a pointer in more places 2026-02-14 01:38:53 +09:00
Herman S.
0921a4fb04 [x64/Linux] Return __m128i by value in xmm0 rather than a pointer 2026-02-14 01:35:12 +09:00
Herman S.
2abf91603e [x64/Linux] Ensure EmitHostToGuestThunk saves rsi 2026-02-14 01:34:59 +09:00
Herman S.
83ff0c501b [CPU] Set VSCR NJ (non-Java) bit 2026-02-14 00:59:45 +09:00
Herman S.
11815400cd [CPU] Fix f16 pack rounding and SHORT_2 test input
Add round-to-nearest-even to the fast float16 pack path by folding a
0xFFF rounding bias into XMMF16PackLCPI0 and extracting bit 13 of the
source as the tie-breaker, matching the software fallback behavior.

Fix PACK_SHORT_2 test to use pre-biased float input (0x40400000 = 3.0)
as the hardware expects, rather than raw 0.0f which is out of range.
2026-02-14 00:48:16 +09:00
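The rounding trick named in 11815400cd (a 0xFFF bias plus bit 13 as the tie-breaker) can be shown with a scalar sketch. This is not the SSE code path from the commit. `PackFloat16Normal` is a hypothetical helper that assumes a finite input within the half-precision normal range; denormals, overflow, infinities and NaNs are deliberately not handled.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical scalar helper. Assumption: |value| is finite and within the
// half-precision normal range, roughly [2^-14, 65504].
inline std::uint16_t PackFloat16Normal(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  const std::uint32_t sign = (bits >> 16) & 0x8000u;
  std::uint32_t magnitude = bits & 0x7FFFFFFFu;
  // 13 mantissa bits are dropped. Adding 0xFFF plus the lowest kept bit
  // (bit 13) rounds to nearest and sends exact ties to the even result.
  magnitude += 0xFFFu + ((magnitude >> 13) & 1u);
  // Rebias the exponent from 127 (float) to 15 (half).
  magnitude -= (127u - 15u) << 23;
  return static_cast<std::uint16_t>(sign | (magnitude >> 13));
}
```

With this bias, a value exactly halfway between two half-precision neighbors lands on the one whose lowest mantissa bit is zero, while anything above the halfway point rounds up.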
Herman S.
bb70e5c651 [CPU] Fix incorrect all_same detection in constant vector shift paths
The loop conditions `n < 8 - n` and `n < 4 - n` terminated early,
only checking the first half of elements. This caused EmitInt16 and
EmitInt32 to incorrectly take the uniform shift path when trailing
elements had different shift amounts.

Resolves potential issues in SHL, SHR, and SHA.
2026-02-14 00:35:09 +09:00
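The early-terminating bound described in bb70e5c651 can be reproduced in isolation. The helpers below are hypothetical and only illustrate the loop condition; the real emitter code and its comparison logic may differ.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true only if every lane holds the same shift amount.
template <typename T, std::size_t N>
bool AllSame(const std::array<T, N>& lanes) {
  for (std::size_t n = 1; n < N; ++n) {  // visits every lane
    if (lanes[n] != lanes[0]) return false;
  }
  return true;
}

// The faulty bound from the commit message, kept for comparison:
// `n < N - n` stops once n reaches N / 2, so trailing lanes are skipped.
template <typename T, std::size_t N>
bool AllSameHalfChecked(const std::array<T, N>& lanes) {
  for (std::size_t n = 0; n < N - n; ++n) {
    if (lanes[n] != lanes[0]) return false;
  }
  return true;
}
```

With eight lanes where only the last one differs, `AllSame` reports a mismatch while `AllSameHalfChecked` wrongly reports a uniform vector, which is the situation that sends the code down the uniform shift path.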
Herman S.
10e8224a63 [CPU] Use RAII lock_guard in GuestTrampolineGroup 2026-02-14 00:17:45 +09:00
Herman S.
643c13668d [CPU] Correctly zero extend instead of sign extending
When performing unsigned multiplication on Linux/GCC,
the code incorrectly cast constant.i64 (which is a signed int64_t)
to unsigned __int128. This caused sign extension when
the value should be treated as unsigned.
2026-02-13 23:54:51 +09:00
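The difference described in 643c13668d is easy to demonstrate outside the emulator. The sketch below assumes a compiler that provides the `__int128` extension (GCC or Clang); both function names are hypothetical.

```cpp
#include <cstdint>

// High 64 bits of an unsigned 64x64 multiply. The signed storage type is
// first cast to uint64_t, so the widening to 128 bits zero-extends.
inline std::uint64_t MulHiUnsigned(std::int64_t a_bits, std::int64_t b_bits) {
  const unsigned __int128 a = static_cast<std::uint64_t>(a_bits);
  const unsigned __int128 b = static_cast<std::uint64_t>(b_bits);
  return static_cast<std::uint64_t>((a * b) >> 64);
}

// The faulty variant: casting int64_t directly to unsigned __int128
// sign-extends negative values before the multiply.
inline std::uint64_t MulHiSignExtended(std::int64_t a_bits,
                                       std::int64_t b_bits) {
  const unsigned __int128 a = static_cast<unsigned __int128>(a_bits);
  const unsigned __int128 b = static_cast<unsigned __int128>(b_bits);
  return static_cast<std::uint64_t>((a * b) >> 64);
}
```

For the bit pattern 0xFFFFFFFFFFFFFFFF (stored as -1) multiplied by 2, the zero-extending version yields a high half of 1, while the sign-extending version yields 0xFFFFFFFFFFFFFFFF.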
Herman S.
7912eab85e [CPU] Check for null when returning machine code ptr 2026-02-13 23:53:01 +09:00
Herman S.
dad5f327bf [CPU] Fix off-by-one in bsearch 2026-02-13 23:33:40 +09:00
Herman S.
3a6f63f34f Revert "[x64] Zero extend on mov to 8bit register"
This reverts commit 88b0aea272.

(It needs to be applied after xbyak update)
2026-02-13 23:13:07 +09:00
Herman S.
1182f9d73d [x64] Add software fallback for PACK_FLOAT16_4
The current implementation has an off-by-one in rounding: it should
round ties to even but doesn't. Need to figure out how to implement
it properly, so just leaving the software version here for later
verification.
2026-02-13 21:48:42 +09:00
Herman S.
374ec3634e [x64] Implement f32 arithmetic and tests
Probably not needed but doesn't hurt to be complete
2026-02-13 21:48:01 +09:00
Herman S.
fdb32b909f [x64] Remove AVX512 optimization for vrefp
The precision is too low so it's more trouble than it's worth.
2026-02-13 21:13:39 +09:00
Herman S.
77e320f79a [x64] Fix AVX2 optimization path for VECTOR_SHL_V128
Ensure the optimized path matches the fallback behavior
2026-02-13 20:40:48 +09:00
Herman S.
b3a131698c [x64] Fix AVX512 optimization path for vadduws 2026-02-13 19:57:44 +09:00
Herman S.
70c1092195 [x64] Fix AVX-512 vctuxs NaN handling
(using ordered comparison predicate)
2026-02-13 19:31:08 +09:00
Herman S.
f95ebb9c55 [x64] Fix mismatched operand sizes in CNTLZ fallback 2026-02-13 19:29:21 +09:00
Herman S.
88b0aea272 [x64] Zero extend on mov to 8bit register 2026-02-13 17:20:31 +09:00
Herman S.
15f61a1a10 [x64] Fix vector mask issue and add missing tests 2026-02-13 17:07:03 +09:00
Herman S.
435ea98a5a [Testing] Make sure tests clean up shm resources on exit 2026-02-13 15:22:43 +09:00
Herman S.
532418eed4 [Tests] Fix broken xenia-cpu tests 2026-02-13 15:20:19 +09:00
Herman S.
63d06a6083 [Testing] Ensure memory is fully initialized for stwcx test 2026-02-13 15:19:41 +09:00
Herman S.
d6a4493d5b [Testing] Add more instruction tests 2026-02-13 15:19:23 +09:00
Herman S.
2bb7c8dbf5 [Testing] Fix tests to use opcode mnemonics 2026-02-13 15:08:07 +09:00
Herman S.
ab182d4044 [Testing] Fix vsel128 and vnor128 tests 2026-02-13 15:07:55 +09:00
Herman S.
24c671bf63 [Testing/binutils] Fix VMX128 instruction definitions
Add dcbz128 instruction with opcode X(31,1014) | (1<<21)
dcbz and dcbz128 share extended opcode 1014, distinguished by
bit 21 (RT field): dcbz has RT=0, dcbz128 has RT=1.

Fix vspltisw128 operand list to {VD128, SIMM}
Removed incorrect VB128 operand; instruction only takes
destination and immediate value.

Fix VPERM128 field definition to use an 8-bit mask (0xff)
The vperm128 permute control is an 8-bit value (0-255),
not a 3-bit value (0-7).

Fix VC128 field flags from PPC_OPERAND_VR to 0
VC128 is a 3-bit immediate field, not a vector register operand.
2026-02-13 15:05:23 +09:00
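The dcbz128 opcode expression in 24c671bf63 can be checked with a small sketch. `PrimaryOpcode` and `XForm` are hypothetical constexpr stand-ins that assume the usual layout: a 6-bit primary opcode in the top bits and a 10-bit extended opcode starting at bit 1.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the opcode-table macros. Assumption: primary
// opcode in the top 6 bits, 10-bit extended opcode shifted left by one.
constexpr std::uint32_t PrimaryOpcode(std::uint32_t op) {
  return (op & 0x3Fu) << 26;
}
constexpr std::uint32_t XForm(std::uint32_t op, std::uint32_t xop) {
  return PrimaryOpcode(op) | ((xop & 0x3FFu) << 1);
}

// dcbz and dcbz128 share primary opcode 31 and extended opcode 1014;
// bit 21 (the low bit of the RT field) tells them apart.
constexpr std::uint32_t kDcbz = XForm(31, 1014);
constexpr std::uint32_t kDcbz128 = XForm(31, 1014) | (1u << 21);

static_assert(kDcbz == 0x7C0007ECu, "dcbz base encoding");
static_assert(kDcbz128 == 0x7C2007ECu, "dcbz128 sets RT=1");
```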
Herman S.
1b25574986 [Testing] Add a lot more missing instruction tests 2026-02-13 14:49:49 +09:00
Herman S.
9e9b733dfb [Testing] binutils to support older instructions used in 360 2026-02-13 14:49:09 +09:00
Herman S.
4832bb485f [Testing] Add some lvx* tests 2026-02-13 13:59:49 +09:00
Herman S.
32b0e2c323 [Testing] Add missing control instruction tests 2026-02-13 13:59:40 +09:00
Herman S.
72f81b56c7 [Testing] Add missing ALU instruction tests 2026-02-13 13:59:32 +09:00
Herman S.
4a954506c6 [Testing] Add vpkpx tests 2026-02-13 13:59:23 +09:00
Herman S.
c0d1723469 [Testing] Add vlogefp tests 2026-02-13 13:59:14 +09:00
Herman S.
b6af8ed545 [Testing] Add more tests for logic, conversion and fp instructions 2026-02-13 13:59:05 +09:00
Herman S.
910157f6e0 [Testing] Add more arithmetic instruction tests 2026-02-13 13:58:56 +09:00
Herman S.
4c7d152fee [Testing] Add missing avg and min/max tests 2026-02-13 13:58:45 +09:00
Herman S.
3461f2578b [Testing] Add missing saturate arithmetic tests 2026-02-13 13:58:36 +09:00
Herman S.
f455b0b3c8 [Testing] Add missing integer and rotation tests 2026-02-13 13:58:22 +09:00
Herman S.
09a15fbedc [Testing] Get tests running (dirty hack for linux) 2026-02-13 13:32:39 +09:00
Gliniak
14c2814654 [CPU] Fixed bug in VECTOR_SHL_V128 implementation
- EmitInt8 was missing the & 7 check, which caused graphical glitches in movies

- There is probably a similar bug in the 16/32 versions, but that's for another commit
2026-02-11 22:41:36 +01:00
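The `& 7` check mentioned in 14c2814654 corresponds to how per-byte vector shifts use only the low three bits of each shift amount. A minimal scalar sketch, with the hypothetical names `Vec16x8` and `ShiftLeftBytes`, not the emitter's actual code:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec16x8 = std::array<std::uint8_t, 16>;

// Hypothetical reference: shift each byte lane left by its own count.
// Only the low 3 bits of each count are used, hence the `& 7`.
inline Vec16x8 ShiftLeftBytes(const Vec16x8& value, const Vec16x8& shift) {
  Vec16x8 result{};
  for (std::size_t i = 0; i < result.size(); ++i) {
    const unsigned count = shift[i] & 7u;
    result[i] = static_cast<std::uint8_t>(value[i] << count);
  }
  return result;
}
```

Without the mask, a count of 8 or more would shift the whole byte out instead of wrapping the count, so a lane with value 1 and count 9 would produce 0 rather than 2.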
Gliniak
8f5da619f9 [Kernel] Replaced Yield in XAudioGetVoiceCategoryVolumeChangeMask with NanoSleep
- Removed Yield in XamUserGetSigninState
2026-02-08 23:01:44 +01:00