Commit graph

71 commits

Author SHA1 Message Date
Lioncash d6606deda2 A64: Implement half-precision vector variants of FMLA/FMLS 2020-04-22 21:01:44 +01:00
MerryMage c106d8cedf A64: Implement FMULX, vector single-precision and double-precision variant 2020-04-22 20:57:37 +01:00
MerryMage f0920c0ded Fix VShift terminology
An arithmetic shift is by definition a signed shift, and a logical shift is by definition an unsigned shift.

- Rename VectorLogicalVShiftS* -> VectorArithmeticVShift*
- Rename VectorLogicalVShiftU* -> VectorLogicalVShift*
2020-04-22 20:55:50 +01:00
Lioncash 48df9b9a7d A64: Implement UQSHL's vector immediate and register variants 2020-04-22 20:55:06 +01:00
Lioncash e8b0f25dff A64: Implement SQSHL's vector register variant 2020-04-22 20:55:06 +01:00
MerryMage 12243692f5 A64: Implement SQRDMULH (vector), vector variant 2020-04-22 20:55:06 +01:00
MerryMage 06b31448aa emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply
* Return both the upper and lower parts of the multiply if required
* SSE2 does not support the pmuldq instruction, do sign correction to an unsigned result instead
* Improve port utilisation where possible (punpck instructions were a bottleneck)
2020-04-22 20:55:06 +01:00
Lioncash 9c03311fed A64: Implement SQDMULH's vector variant 2020-04-22 20:53:46 +01:00
Lioncash 29f8b30634 A64: Implement SRSHL and URSHL
Implements both scalar and vector variants.
2020-04-22 20:53:45 +01:00
Lioncash 7eb6be7a6a A64: Implement FMAXNM and FMINNM vector variants.
Currently we can implement these in terms of the scalar IR variants.
2020-04-22 20:53:45 +01:00
Lioncash 8b65ea68c0 A64: Implement FMAXP, FMAXNMP, FMINP, and FMINNMP's vector variants
We can just implement these in terms of scalars for the time being.
2020-04-22 20:53:45 +01:00
Lioncash 7ef7def661 A64: Implement SQ{ADD, SUB}, and UQ{ADD, SUB}'s vector variants
Currently we implement these in terms of the scalar variants. Falling
back to the interpreter is slow enough to make it more effective than
doing that.
2020-04-22 20:46:23 +01:00
Lioncash d0fdd3c6e6 simd_three_same: Extract non-paired SMAX, SMIN, UMAX, UMIN code to a common function
Deduplicates a bit of code and makes its layout consistent with the
paired variants
2020-04-22 20:46:23 +01:00
Lioncash 2bea2d0512 A64: Implement SMAXP, SMINP, UMAXP, UMINP 2020-04-22 20:46:23 +01:00
Lioncash b48fb8ca6b A64: Implement PMUL 2020-04-22 20:46:22 +01:00
Lioncash c778c7b868 A64: Implement FMAX's vector single and double precision variants 2020-04-22 20:46:22 +01:00
Lioncash 009879d92b A64: Implement FMIN's vector single and double precision variants 2020-04-22 20:46:22 +01:00
MerryMage 10de36394e A64: Implement FRECPS, vector/scalar single/double variants 2020-04-22 20:46:22 +01:00
MerryMage 0de37b11ad A64: Implement FMLS (vector), single/double variant 2020-04-22 20:46:22 +01:00
MerryMage 934132e0c5 A64: Implement FMLA (vector), single/double variant 2020-04-22 20:46:22 +01:00
MerryMage b2e4c16ef8 A64: Implement FRSQRTS (vector), single/double variant 2020-04-22 20:46:22 +01:00
Lioncash 3447c82656 translate: Return by bool in helpers where applicable
Gets rid of a bit of duplication regarding the early-out cases and makes
all helpers functions consistent (previously some had a return type of
bool, while others had a return type of void).
2020-04-22 20:46:21 +01:00
MerryMage e18fca17dc A64: Implement FABD in terms of existing IR instructions
Fixes NaN issue. Closes #306.
2020-04-22 20:46:21 +01:00
MerryMage 33fa65de23 A64: Implement FADDP (vector) 2020-04-22 20:46:19 +01:00
Lioncash 245c903129 simd_three_same: Join FPAbsoluteComparison() into FPCompareRegister()
These are part of the same comparison family, so there's no real point
in keeping them separate.
2020-04-22 20:46:19 +01:00
Lioncash 53dbb6a92a A64: Implement FACGE's vector single/double precision variants 2020-04-22 20:46:18 +01:00
Lioncash 6912a02d9b A64: Implement FACGT's vector single/double precision variants 2020-04-22 20:46:18 +01:00
Lioncash d898d1779d A64: Implement FABD's vector single/double precision variant 2020-04-22 20:46:18 +01:00
Lioncash 24e3299276 A64: Implement FCMGT, FCMGE (register) vector double and single precision variants 2020-04-22 20:46:18 +01:00
Lioncash 9bec354791 A64: Implement FCMEQ (register)'s vector single and double precision variant 2020-04-22 20:46:18 +01:00
Lioncash 11a92eaaef A64: Implement SRHADD and URHADD 2020-04-22 20:46:18 +01:00
Lioncash 7c0250e9f8 A64: Implement SABA 2020-04-22 20:46:17 +01:00
Lioncash f00789e6f7 A64: Implement SABD 2020-04-22 20:46:17 +01:00
Lioncash f2a85d5601 A64: Implement UHSUB 2020-04-22 20:46:17 +01:00
Lioncash b33360a324 A64: Implement SHSUB 2020-04-22 20:46:17 +01:00
Lioncash 4dcc7724e0 A64: Implement UHADD 2020-04-22 20:46:17 +01:00
Lioncash f8714f7250 A64: Implement SHADD 2020-04-22 20:46:17 +01:00
Lioncash ef1e69a1e3 A64: Implement SSHL (vector) 2020-04-22 20:46:17 +01:00
Lioncash 21974ee57e backend_x64/ir: Amend generic LogicalVShift() template to also handle signed variants
Also adds IR opcodes to dispatch said variants
2020-04-22 20:46:17 +01:00
Lioncash ba1cc6366d A64: Implement RSUBHN/RSUBHN2 2020-04-22 20:46:17 +01:00
Lioncash e41640fe33 A64: Implement RADDHN/RADDHN2 2020-04-22 20:46:17 +01:00
Lioncash b595a68ffa A64: Implement CMTST (vector) 2020-04-22 20:46:17 +01:00
Lioncash 48c7f8630c A64: Implement ADDHN{2} and SUBHN{2} 2020-04-22 20:46:17 +01:00
MerryMage 5c47f03888 A64: Implement FMUL (vector) 2020-04-22 20:46:15 +01:00
Lioncash a6e264c2dd A64: Implement UABA
Now that we have unsigned absolute difference capabilities, we can just use this to
append onto the result via a vector add.
2020-04-22 20:46:15 +01:00
Lioncash c2e7364d3e A64: Implement UABD 2020-04-22 20:46:15 +01:00
MerryMage 49cc6d7fad A64: Implement FDIV (vector) 2020-04-22 20:46:15 +01:00
MerryMage 147284427b A64: Implement USHL 2020-04-22 20:46:15 +01:00
MerryMage ccf7df057b simd_three_same: Add VectorZeroUpper to CMGE (vector) and CMHS (vector) 2020-04-22 20:46:14 +01:00
MerryMage d5af052f06 A64: Implement CMGE (register) 2020-04-22 20:46:14 +01:00