Commit graph

1611 commits

Author SHA1 Message Date
MerryMage bd47f2ca8f emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftS64 2020-04-22 20:55:50 +01:00
MerryMage 3bf183d7e8 emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftS32 2020-04-22 20:55:50 +01:00
MerryMage 94f9d402eb emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftU16() 2020-04-22 20:55:50 +01:00
MerryMage 6d9639e3b0 emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftU64() 2020-04-22 20:55:50 +01:00
MerryMage bbc066a266 emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftU32() 2020-04-22 20:55:50 +01:00
Lioncash da2e7fad87 emit_x64_vector: SSSE3 variant of EmitVectorCountLeadingZeros8()
pshufb lyfe
2020-04-22 20:55:50 +01:00
Mat M 9130ed45b9 Merge pull request #397 from VelocityRa/dec-shift-fix
decoders: Cast to correctly-sized type before shifting
2020-04-22 20:55:50 +01:00
MerryMage 238f2f2cd0 a64_emit_x64: Lowercase PAGE_SIZE
PAGE_SIZE is defined as a macro by musl.
2020-04-22 20:55:50 +01:00
VelocityRa c30b8dbe99 decoders: Cast to correctly-sized type before shifting
Fixes decoding for 64-bit instructions

Does not help/apply to any currently supported ARM versions (since
all are 32-bit length or below), it's for future-proofing should
such an arch be supported.
2020-04-22 20:55:50 +01:00
MerryMage 7162f6f254 emit_x64_vector_floating_point: SSE4.1 implementation of EmitFPVectorToFixed 2020-04-22 20:55:50 +01:00
MerryMage e7a5592699 emit_x64_vector_floating_point: EmitFPVectorRoundInt: Use FCODE 2020-04-22 20:55:50 +01:00
MerryMage b8fde48732 emit_x64_vector: AVX implementation for EmitVectorCountLeadingZeros8 2020-04-22 20:55:50 +01:00
MerryMage fd37b637aa emit_x64_vector: SSE implementation of EmitVectorCountLeadingZeros16 2020-04-22 20:55:50 +01:00
MerryMage 09bf273bc8 A64: Implement SCVTF, UCVTF (vector, fixed-point), scalar variant 2020-04-22 20:55:06 +01:00
MerryMage 03ad2072a7 emit_x64_floating_point: Reduce fallback LUT code in EmitFPToFixed 2020-04-22 20:55:06 +01:00
MerryMage f9129db6fd A64: Implement FCVTZS, FCVTZU, UCVTF, SCVTF (vector, fixed-point), vector variant 2020-04-22 20:55:06 +01:00
Lioncash 48df9b9a7d A64: Implement UQSHL's vector immediate and register variants 2020-04-22 20:55:06 +01:00
Lioncash d426dfe942 ir: Add opcodes for unsigned saturating left shifts 2020-04-22 20:55:06 +01:00
Lioncash ab60720418 A64/translate/impl: Make signatures consistent for unimplemented by-element SIMD variants
Makes them all consistent, so it isn't necessary to change the
prototypes over when implementing them.
2020-04-22 20:55:06 +01:00
Lioncash 6b5ea6ee66 A64: Implement BRK
Currently, we can just implement this as part of the exception
interface, similar to how it's done for the A32 interface with BKPT.
2020-04-22 20:55:06 +01:00
Lioncash b915364c16 A64/imm: Add full range of comparison operators to Imm template
Makes the comparison interface consistent by providing all of the
relevant members. This also modifies the comparison operators to take
the Imm instance by value, as it's really only a u32 under the covers,
and it's cheaper to shuffle around a u32 than a 64-bit pointer address.
2020-04-22 20:55:06 +01:00
MerryMage 02150bc0b7 IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed 2020-04-22 20:55:06 +01:00
MerryMage 027b0ef725 A64: Implement SCVTF, UCVTF (scalar, fixed-point) 2020-04-22 20:55:06 +01:00
MerryMage 8051f60db0 opcodes.inc: Align columns to a tabstop of 4 2020-04-22 20:55:06 +01:00
MerryMage 90193b0e3d IR: Add fbits argument to FixedToFP-related opcodes 2020-04-22 20:55:06 +01:00
Lioncash 616a153c16 A64: Implement SQSHL's vector immediate variant 2020-04-22 20:55:06 +01:00
Lioncash e8b0f25dff A64: Implement SQSHL's vector register variant 2020-04-22 20:55:06 +01:00
Lioncash b14eaaec46 ir: Add opcodes for left signed saturated shifts 2020-04-22 20:55:06 +01:00
Lioncash da55ed7b31 branch: Make variables const where applicable 2020-04-22 20:55:06 +01:00
Lioncash 867b666285 move_wide: Make variables const where applicable 2020-04-22 20:55:06 +01:00
Lioncash 78024a9dc4 load_store_register_unprivileged: Make variables const where applicable 2020-04-22 20:55:06 +01:00
Lioncash e45e5da610 load_store_register_immediate: Place conditional bodies on their own line
Makes the conditionals visually consistent with the rest of the
codebase.
2020-04-22 20:55:06 +01:00
Lioncash b586cf3f56 load_store_load_literal: Make variables const where applicable 2020-04-22 20:55:06 +01:00
Lioncash c3a3b9687e data_processing_logical: Move datasize declarations after early-exit conditionals
While we're at it, make variables const where applicable.
2020-04-22 20:55:06 +01:00
Lioncash ed797e6540 data_processing_conditional_select: Make variables const where applicable
Makes CSEL's function consistent with all of the others.
2020-04-22 20:55:06 +01:00
Lioncash c82fa5ec5a data_processing_addsub: Move datasize declarations after early-exit conditionals
While we're at it, also make relevant variables const where applicable
2020-04-22 20:55:06 +01:00
Lioncash f4a66d2477 data_processing_bitfield: Move datasize variables after early-exit conditionals
Moves the declaration of datasize to the scope that it's used within.
This also takes the opportunity to apply const where applicable, and
make early-exits all vertically consistent with one another.
2020-04-22 20:55:06 +01:00
Lioncash 2e0fcd6161 A64: Implement CLS's vector variant
Leverages CLZ like the integral variant does.
2020-04-22 20:55:06 +01:00
Lioncash a2cd643525 emit_x64_vector: Make EmitVectorUnsignedSaturatedAccumulateSigned() internally linked
Given this is just an internal helper function, it can be marked static.
2020-04-22 20:55:06 +01:00
Lioncash c39ea2e3c9 perf_map: Use std::string_view instead of std::string for PerfMapRegister()
We can just use a non-owning view into a string in this case instead of
potentially allocating a std::string instance.
2020-04-22 20:55:06 +01:00
MerryMage 12243692f5 A64: Implement SQRDMULH (vector), vector variant 2020-04-22 20:55:06 +01:00
MerryMage a9ffcf08b1 A64: Implement SQDMULL (vector), vector variant 2020-04-22 20:55:06 +01:00
MerryMage 3e447614c6 IR: Add VectorSignedSaturatedDoublingMultiplyLong 2020-04-22 20:55:06 +01:00
MerryMage 06b31448aa emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply
* Return both the upper and lower parts of the multiply if required
* SSE2 does not support the pmuldq instruction, do sign correction to an unsigned result instead
* Improve port utilisation where possible (punpck instructions were a bottleneck)
2020-04-22 20:55:06 +01:00
MerryMage 08c0e017a5 IR: Implement Vector{Signed,Unsigned}Multiply{16,32} 2020-04-22 20:55:06 +01:00
Lioncash b6df34cdde backend_x64/a64_interface: Re-enable the constant folding pass
This was disabled for debugging, but never re-enabled. Just to be sure,
testing was done downstream in yuzu to make sure this didn't happen to
break anything (which seems to be the case).
2020-04-22 20:55:06 +01:00
MerryMage 06ba397af2 emit_x64_vector_floating_point: Hardware FMA implementation for RSqrtStepFused 2020-04-22 20:55:06 +01:00
MerryMage e553c4fe8d emit_x64_vector_floating_point: Hardware FMA implementation of FPVectorRecipStepFused 2020-04-22 20:55:06 +01:00
MerryMage 3caeb62ef1 emit_x64_floating_point: Hardware FMA implementation of FPRSqrtStepFused 2020-04-22 20:55:06 +01:00
MerryMage 344ee76aba emit_x64_floating_point: Hardware FMA implementation of FPRecipStepFused{32,64} 2020-04-22 20:55:06 +01:00