Commit graph

1172 commits

Author SHA1 Message Date
Lioncash d86fea0d28 A64: Implement FCMEQ (zero)'s vector single and double precision variant 2020-04-22 20:46:18 +01:00
Lioncash 9bec354791 A64: Implement FCMEQ (register)'s vector single and double precision variant 2020-04-22 20:46:18 +01:00
Lioncash 5ce187a54e ir: Add opcodes for floating-point vector equalities 2020-04-22 20:46:18 +01:00
Lioncash e64978ed89 fuzz_with_unicorn: Make float_numbers in floating-point tests constexpr
Given this is just a lookup table, this can be made immutable.
2020-04-22 20:46:18 +01:00
Lioncash cf188448d4 emit_x64_vector: Vectorize fallback case in EmitVectorMultiply64()
Gets rid of the need to perform a fallback.
2020-04-22 20:46:18 +01:00
Lioncash 954deff2d4 emit_x64_vector: Add break to final case in EmitVectorRoundingHalvingAddUnsigned()
This doesn't alter behavior but does make the code better if anything
else is ever added to this function in the future.
2020-04-22 20:46:18 +01:00
Lioncash 11a92eaaef A64: Implement SRHADD and URHADD 2020-04-22 20:46:18 +01:00
Lioncash bc718c5b28 ir: Add opcodes for performing rounding halving adds 2020-04-22 20:46:18 +01:00
Lioncash 054549da35 emit_x64_vector: Simplify AVX-512 codepath in EmitVectorMultiply64
I realized I introduced a helper for simple AVX operation emitting, so
use that instead of writing it all out long-form.
2020-04-22 20:46:18 +01:00
Lioncash cb456f914b A64: Implement UMLAL{2}, UMLSL{2}, and UMULL{2}
Now that we have the helper function set up for the signed variants, we
can also modify it to be used with the unigned ones by performing a zero
extension instead of a sign extension.
2020-04-22 20:46:18 +01:00
Lioncash 3576c02d91 A64: Implement SMLSL{2} 2020-04-22 20:46:18 +01:00
Lioncash ada5c0b2fa A64: Implement SMLAL{2} 2020-04-22 20:46:18 +01:00
Lioncash 2d1aca25e6 A64: Implement SMULL{2} 2020-04-22 20:46:18 +01:00
Lioncash 329137a277 fuzz_with_unicorn: Remove exclusion of FMOV (imm) for FP-16 floats
Qemu, or rather, Unicorn now supports FP-16, since I backported support
for the recent changes to mainline Qemu relating to FP-16 support.
2020-04-22 20:46:18 +01:00
Lioncash c5ae9107a9 A64: Implement SABAL/SABAL2 and SABDL/SABDL2
Now that we have a helper function for the unsigned variants, we can
modify it to also be usable with the signed variants.
2020-04-22 20:46:18 +01:00
Lioncash 26d4473851 A64: Implement UABAL/UABAL2 2020-04-22 20:46:18 +01:00
Lioncash 3397742c74 A64: Implement UABDL/UABDL2 2020-04-22 20:46:18 +01:00
Lioncash 6de5ed96e5 emit_x64_vector: Emit VPMULLQ in EmitVectorMultiply64 on AVX-512{DQ, VL} capable CPUs
Shortens code-gen down to a single instruction in the 64-bit path.
2020-04-22 20:46:18 +01:00
Lioncash 9054d1c20b A64: Implement LDR (literal, SIMD&FP) 2020-04-22 20:46:18 +01:00
Lioncash 0da5e949a8 Correct typo in DataCacheOperation enum
Fixes a typo for the InvalidateByVAToPoC enum entry. Given yuzu is the
only known user of 64-bit mode and it doesn't use this value, we can get
away with changing this.
2020-04-22 20:46:18 +01:00
Lioncash 9736e2cce2 A64: Implement FABS' half-precision variant 2020-04-22 20:46:18 +01:00
Lioncash 6e5750e4ec A64: Implement FABS' single and double precision variant 2020-04-22 20:46:18 +01:00
Lioncash 7bce8d8757 A64: Implement URSHR (scalar) and URSRA (scalar)
Now that the utility function is all set up from implementing SRSRA, the
unsigned variants can now be trivially implemented by modifying the
utility function to perform a logical shift right instead of an
arithmetical shift right for the unsigned case.
2020-04-22 20:46:18 +01:00
Lioncash 1e70a589b0 A64: Implement SRSRA (scalar) 2020-04-22 20:46:18 +01:00
Lioncash 998aef07f6 A64: Implement SRSHR (scalar) 2020-04-22 20:46:17 +01:00
Lioncash 7c0250e9f8 A64: Implement SABA 2020-04-22 20:46:17 +01:00
Lioncash f00789e6f7 A64: Implement SABD 2020-04-22 20:46:17 +01:00
Lioncash 1e10017f4b ir: Add opcodes for signed absolute differences 2020-04-22 20:46:17 +01:00
Tillmann Karras d3b44c1b5a decoder_detail: use structured bindings 2020-04-22 20:46:17 +01:00
Lioncash ed11c7d904 CMakeLists: Add detection for Aarch64 compiler environments
Just closes a small hole in architecture detection for the ARM family.
2020-04-22 20:46:17 +01:00
Lioncash f745eb28bf simd_two_register_misc: Handle 64-bit case for SCVTF_int_4 2020-04-22 20:46:17 +01:00
Lioncash 3f6c529da2 ir: Add opcode to perform the vector conversion S64->F64
Unfortunately x86 prior to AVX-512 doesn't really give us any convenient instruction to do the work for us
2020-04-22 20:46:17 +01:00
Lioncash 0e61ee6bf6 A64: Implement SHLL/SHLL2 2020-04-22 20:46:17 +01:00
Lioncash 43e6e98c3b A64: Add missing decoding for PRFM (unscaled offset) 2020-04-22 20:46:17 +01:00
Lioncash f2a85d5601 A64: Implement UHSUB 2020-04-22 20:46:17 +01:00
Lioncash b33360a324 A64: Implement SHSUB 2020-04-22 20:46:17 +01:00
Lioncash 44a5f8095a ir: Add opcodes for performing vector halving subtracts 2020-04-22 20:46:17 +01:00
Lioncash 4f37c0ec5a A64: Implement SM4EKEY 2020-04-22 20:46:17 +01:00
Lioncash 3bde3347a5 A64: Implement SM4E 2020-04-22 20:46:17 +01:00
Lioncash b312d28295 ir: Add an opcode for doing an SM4 lookup table query 2020-04-22 20:46:17 +01:00
Lioncash 27a6d5f6ce emit_x64_vector: Use VPOPCNTB in EmitVectorPopulationCount() if AVX-512 BITALG is available 2020-04-22 20:46:17 +01:00
Lioncash 0c63e8f396 fuzz_with_unicorn: Silence unused variable warning
Currently, structured bindings don't provide a way to ignore unused variables.
2020-04-22 20:46:17 +01:00
Lioncash dcadaeba80 externals: Update Catch to v2.2.2
Keeps the unit-testing library up to date.
2020-04-22 20:46:17 +01:00
Lioncash 4dcc7724e0 A64: Implement UHADD 2020-04-22 20:46:17 +01:00
Lioncash f8714f7250 A64: Implement SHADD 2020-04-22 20:46:17 +01:00
Lioncash 089096948a ir: Add opcodes for performing halving adds 2020-04-22 20:46:17 +01:00
Lioncash 3d00dd63b4 emit_x64_vector: Emit VPMINSQ and VPMINUQ for 64-bit vector min operations if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash b97b71b8aa emit_x64_vector: Emit VPMAXSQ and VPMAXUQ for 64-bit vector max operations if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash 033e400df0 emit_x64_vector_floating_point: Deduplicate accurate NaN handling code
Allows the code to both be used from the 32 bit and 64 bit operations without duplicating code.
2020-04-22 20:46:17 +01:00
Lioncash 0f067b7330 emit_x64_vector: Emit VPABSQ in EmitVectorAbs() for the 64-bit case if AVX-512VL is available 2020-04-22 20:46:17 +01:00