1172 commits

Author SHA1 Message Date
Lioncash d4ee878cbd emit_x64_vector: Use VPSRAQ in EmitVectorArithmeticShiftRight64() if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash b38dd191bd disassembler_arm: Remove rotation helper function in favor of Common::RotateRight
Mildly reduces the amount of duplicated behavior
2020-04-22 20:46:17 +01:00
Lioncash 51e4f1d9db emit_x64_vector: Vectorize fallback path of EmitVectorMaxS32() 2020-04-22 20:46:17 +01:00
Lioncash c692ccdd6d emit_x64_vector: Vectorize fallback path of EmitVectorMaxS8() 2020-04-22 20:46:17 +01:00
Lioncash b194313d8c emit_x64_vector: Vectorize fallback path in EmitVectorMinU32() 2020-04-22 20:46:17 +01:00
Lioncash 7ceda6d919 emit_x64_vector: Vectorize fallback path in EmitVectorMinU16() 2020-04-22 20:46:17 +01:00
Lioncash cda85a1da0 emit_x64_vector: Vectorize fallback path in EmitVectorMinS32() 2020-04-22 20:46:17 +01:00
Lioncash 6e08eed210 emit_x64_vector: Vectorize fallback path in EmitVectorMinS8() 2020-04-22 20:46:17 +01:00
Lioncash 0fb6dce689 emit_x64_vector: Remove unnecessary if constexpr expression in LogicalVShift
This can simply be merged with the previous one.
2020-04-22 20:46:17 +01:00
Lioncash 5b71b1337b emit_x64_vector: Avoid left shift of negative value in LogicalVShift
Now that we handle the signed variants, we also have to be careful about left shifts with negative values,
as this is considered undefined behavior.
2020-04-22 20:46:17 +01:00
Lioncash 9954d28868 a64_jitstate: Zero SP and PC on construction of A64JitState
Given we zero out/reset everything else in the struct, do the same for these members to keep initialization consistent
2020-04-22 20:46:17 +01:00
Lioncash 4efbd40ea4 backend_x64/callback: Default virtual destructor in the cpp file
Prevents the vtable being generated in each translation unit that includes the header (and silences -Wweak-vtables warnings)
2020-04-22 20:46:17 +01:00
Lioncash edd0b5c8c7 a32_interface/a64_interface: Change reinterpret_casts to static_casts in GetCurrentBlock thunks
It's well-defined to static_cast a void* to its proper type.
2020-04-22 20:46:17 +01:00
Lioncash e71612d394 A64: Implement SSHL (scalar) 2020-04-22 20:46:17 +01:00
Lioncash ef1e69a1e3 A64: Implement SSHL (vector) 2020-04-22 20:46:17 +01:00
Lioncash 21974ee57e backend_x64/ir: Amend generic LogicalVShift() template to also handle signed variants
Also adds IR opcodes to dispatch said variants
2020-04-22 20:46:17 +01:00
Lioncash 9fc89f0a0e emit_x64_vector_floating_point: Use arrays for retrieving size instead of hardcoding the size
Similar changes were done in emit_x64_vector, but these were missed.
2020-04-22 20:46:17 +01:00
Lioncash af28e89a13 emit_x64_vector: Vectorize fallback path in EmitVectorMaxU16() 2020-04-22 20:46:17 +01:00
Lioncash cda75e2079 A64: Implement CMTST's scalar variant 2020-04-22 20:46:17 +01:00
Lioncash 0d20423ad5 emit_x64_vector: Vectorize non-SSE4.1 fallback path for VectorMultiply32() 2020-04-22 20:46:17 +01:00
Lioncash d70ee7c0d1 emit_x64_vector: Use VPBROADCAST where applicable and available
Uses the instruction that does what it says in its name if available. Allows avoiding the use
of a scratch register in EmitVectorBroadcast8() and EmitVectorBroadcastLower8()'s SSSE3 path.
2020-04-22 20:46:17 +01:00
Lioncash bebe7235ae A64: Implement UZP1 and UZP2 2020-04-22 20:46:17 +01:00
Lioncash 26d77c6f09 ir: Add opcodes for performing vector deinterleaving 2020-04-22 20:46:17 +01:00
Lioncash d6f9ed47d9 A64: Implement FNEG (half-precision) 2020-04-22 20:46:17 +01:00
MerryMage 2b8bc1d4e1 README: Add usage example 2020-04-22 20:46:17 +01:00
Lioncash 7efbd73bac A64: Implement USHL (scalar) 2020-04-22 20:46:17 +01:00
Lioncash 41f4717f2b A64: Implement FNEG (vector) 2020-04-22 20:46:17 +01:00
Lioncash ba1cc6366d A64: Implement RSUBHN/RSUBHN2 2020-04-22 20:46:17 +01:00
Lioncash e41640fe33 A64: Implement RADDHN/RADDHN2 2020-04-22 20:46:17 +01:00
Lioncash b719a6b3f7 A64: Implement XAR 2020-04-22 20:46:17 +01:00
Lioncash 0b1b131ec2 simd_two_register_misc: Factor out common comparison code
Gets rid of a tiny bit of duplicated code.
2020-04-22 20:46:17 +01:00
Lioncash ed0b84da70 A64: Implement CMLE (zero)'s vector variant 2020-04-22 20:46:17 +01:00
Lioncash b595a68ffa A64: Implement CMTST (vector) 2020-04-22 20:46:17 +01:00
Lioncash 48c7f8630c A64: Implement ADDHN{2} and SUBHN{2} 2020-04-22 20:46:17 +01:00
Lioncash 3acd9c9200 translate: zero extend result in Vpart when storing to lower part of vector 2020-04-22 20:46:17 +01:00
Lioncash 87ca63699f emit_x64_vector: Emit PMAXUD in EmitVectorMaxU32 on SSE4.1-capable CPUs 2020-04-22 20:46:17 +01:00
Lioncash f17702f608 emit_x64_vector: Emit PMINUD in EmitVectorMinU32 on SSE4.1-capable CPUs 2020-04-22 20:46:17 +01:00
Lioncash 596a8dd1dd emit_x64_vector: Emit PMINSD in EmitVectorMinS32 on SSE4.1-capable CPUs
Provides a better alternative to a fallback operation.
2020-04-22 20:46:17 +01:00
Lioncash 75fd4eaaaa emit_x64_vector: Get rid of some magic numbers in loop bounds 2020-04-22 20:46:17 +01:00
Lioncash 7b80ac25eb emit_x64_vector: Generify variable shift functions 2020-04-22 20:46:17 +01:00
Lioncash 4ec735f707 A64: Implement CMLE (zero)'s scalar variant 2020-04-22 20:46:17 +01:00
Lioncash 6534184df2 A64: Implement CMLT (zero)'s scalar single/double-precision variant 2020-04-22 20:46:17 +01:00
Lioncash 8863c9bb4b A64: Implement SHA512H2 2020-04-22 20:46:17 +01:00
Lioncash 033b890e25 A64: Implement SHA512H 2020-04-22 20:46:17 +01:00
Lioncash d1f5b084b4 A64: Handle S32->F32 case for SCVTF (vector) 2020-04-22 20:46:17 +01:00
Lioncash 38fa984b53 IR: Add opcode for packed word->f32 conversions 2020-04-22 20:46:16 +01:00
Lioncash b8587d8e34 A64: Implement SHA512SU1 2020-04-22 20:46:16 +01:00
Lioncash 44d846045a A64: Implement SHA512SU0 2020-04-22 20:46:16 +01:00
Lioncash ca903c1585 A64: Implement SHA256H and SHA256H2 2020-04-22 20:46:16 +01:00
MerryMage e4237c44eb A64: Implement SCVTF (vector, integer), scalar variant 2020-04-22 20:46:16 +01:00