* rsx: Add code to detect instanced draw commands
* rsx: Add GLSL support for instanced rendering
* rsx: Move draw call related functions to their own class
* rsx: Move more functions from rsx thread to the draw command processor
* rsx: Fix vertex program compiler crash
* vk: Add support for hardware instanced draws
* rsx: Fix instancing bug when indexed addressing is used to read constants
* rsx: Fix rare crash in vertex program decompiler
- This whole decompiler mess needs a rewrite
* rsx: Handle dangling execution barriers
* rsx: Do not use global registers object in logical "firmware" units
* Cosmetic improvements
* rsx: Test vertex program flags on each draw
* rsx: Properly track changes in instancing state
- Prefer lazy retire model. Sync commands are sent out and the reports will be
retired when they are available without forcing.
- To make this work with conditional rendering, hardware support is
required where the backend will automatically determine visibility by
itself during rendering.
- Sometimes program-point-size is enabled, but the vs does not actually
write to the point size register. In this case, pass the incoming point
size along instead of the default register init.
- Fix 2D coordinate sampling of W coordinate.
W is actually HPOS.w and not 1. Z is however always 0.
- Optimize register usage a bit
Disassembling compiled SPV shows that global declaration results in less ops than using inout modifiers. Modifiers generate extra mov instructions.
- Fix reading of varying registers in FP
Different registers have different behavior
- Always write to varying registers. If a register is not written to, it is initialized to (0, 0, 0, 1)
- Reimplements two-sided lighting correctly without hacks
- Also bumps shader cache version
- Remove string comparisons from the hot-path!
- Use attribute streaming and push constants to avoid forcing a descriptor block copy every other draw call/pass.
While this isn't so bad on nvidia cards, it makes AMD cards a slideshow.
- The hw generates inaccurate values when doing perspective-correct
interpolation of vertex output attributes and makes the comparison (a ==
b) fail even when they are a fixed constant value.
- Increase equality tolerance when doing comparisons in fragment
shaders for NV cards only to work around this issue.
- Teepo fix
- Index offset is ignored anyway and only used to calculate vertex attribute divisor index
- Specialized optimization for untouched xfer without primitive restart
- Compute is now used to assist in some parts of blit operations, since there are no format conversions with vulkan like OGL does
- TODO: Integrate this into all types of GPU memory conversion operations instead of downloading to CPU then converting
- Do not add unused subroutines in shaders unless necessary
-- makes shaders easier to read and disassembled spir-v has less clutter
- glsl: Replace switch block with lookup table