* rsx: Refactor vertex clip emit to avoid using f64 unnecessarily
- Fixes driver crash on intel
* vk: Add NVIDIA driver version check
- Warn if user has outdated drivers with known problems
- Significantly improves compilation speed by simplifying most of the code and doing something similar to LICM.
* Actual decoding is now vectorized and performed in one step rather than in a loop.
* Switches inside loops are removed and replaced with simple comparison. Generates much nicer (and smaller) GCN bytecode.
- Fix special case where n=f making (f-n) = 0
- Dynamically update depth range by setting dirty bits
- Fix depth bounds when n=f and bounds test is disabled
- Tracks which kind of raster was done (Z-ordered vs linear) throughout the application.
- This allows to identify if data is in the expected format or not.
- Avoids double expansion when both the exp_tex flag is set AND the texture also is sampled as signed
- Should fix missing eyeballs in Mass Effect 1 with the previous sign expansion fix
- Already partially supported via EXP option in the shader opcode, but format decoding was disabled.
- Noticed in some UE3 games which use _SNORM variants on PC but _UNORM on rpcs3
- SRC0, SRC1 and SRC2 have different bits for precision modifiers all stored inside SRC1
- This explains the strange observed behavior of the MAD instruction which has 3 inputs
- Sometimes, usually with shaders that do pack/unpack operations, there is a write-to-self operation.
For example r0.xy = r0.xy
Obviously no new data was introduced into "r0" by this, so we should not mark the register as having new data.
- TODO: Investigate on realhw if self-reference is needed to "cast" the overlapping half registers to their full register counterparts.
- Do not allocate too many objects. This is a problem in games using dynamic memory allocators that can make it rare for a surface to fall on the same address twice, keeping zombie RTVs and DSVs alive much longer than needed.
- Current limit used is 256M of virtual VRAM which is impossible on retail PS3
- Those strange offsets noted in some games seem to match to subpixel addressing.
For example, when scaling down by a factor of 4, a pixel offset of 2 will end up inside pixel 0 of the output
It works fine on Mesa iris
Fixes detection of Mesa as recent Mesa does not have "x.org" on vendor string, allowing vendor_MESA to become true instead of vendor_INTEL on Mesa Intel
- In blit engine logic there is a tendancy to over-allocate so as to avoid having to sticth together textures later
- Sometimes this can lead to out of bounds access and crash applications, so memory must be validated
- Noticed when debugging X-men origins: wolverine which has a bogus SCA op whilst writing vector to output
- It makes no sense for both SCA and VEC to both write to the same register in the same instruction as memory ordering becomes an issue
- Fix DIV instruction
- Add EXP_TEX modifier
- Implement WPOS register read
- Swap 3D and Cubemap enums to match RSX ids
- Adds two extra instruction classes: flow control and packing control
- Implement remaining FP instructions with exception of the rare projected texture lookups
- Fix typo causing output color index > 0 to not work
- Fix KIL instruction
- Implement conditional vertex program writes