Commit graph

6978 commits

Author SHA1 Message Date
Stefan Schmidt 4e9f2e81be
Merge 87ee12d407 into c7f61342d7 2026-01-20 07:26:40 -05:00
Triang3l c7f61342d7 [GPU/D3D12] Convert gamma-as-linear red to gamma in transfers to stencil
Co-authored-by: Herman S. <429230+has207@users.noreply.github.com>
2026-01-20 12:53:20 +03:00
Triang3l 0a19234b4e [GPU/D3D12] Add forgotten gamma conversion format check 2026-01-19 21:08:15 +03:00
Triang3l cec9ca0ef2 [GPU] 8-bit PWL gamma RT as linear 16-bit UNorm on the host
With render target HLE, directly store linear values as R16G16B16A16_UNORM
without gamma conversion, as this format provides more than enough bits
(need at least 11 per component due to the maximum scale being 2^3 in the
piecewise linear gamma curve) to represent linear values without precision
loss.
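
The precision argument above can be checked with a small sketch. The curve
below is an illustrative four-segment piecewise linear curve whose slopes
differ by up to a factor of 2^3; its breakpoints, and the names
`PwlGammaToLinear`, `LinearToPwlGamma` and `CountExactRoundTrips`, are
assumptions made for this example and are not taken from the commit.

```cpp
#include <cmath>
#include <cstdint>

// Illustrative PWL curve (assumed breakpoints, not copied from Xenia).
// Gamma-to-linear slopes are 1/4, 1/2, 1 and 2, a spread of 2^3.
float PwlGammaToLinear(float gamma) {
  if (gamma < 0.25f) return gamma * 0.25f;
  if (gamma < 0.375f) return gamma * 0.5f - 0.0625f;
  if (gamma < 0.75f) return gamma - 0.25f;
  return gamma * 2.0f - 1.0f;
}

// Inverse of the curve above.
float LinearToPwlGamma(float linear) {
  if (linear < 0.0625f) return linear * 4.0f;
  if (linear < 0.125f) return linear * 2.0f + 0.125f;
  if (linear < 0.5f) return linear + 0.25f;
  return linear * 0.5f + 0.5f;
}

// Counts how many of the 256 gamma codes come back unchanged after
// gamma -> linear -> UNorm storage with `storage_bits` bits -> gamma.
int CountExactRoundTrips(int storage_bits) {
  const float max_code = float((1u << storage_bits) - 1);
  int exact = 0;
  for (int code = 0; code < 256; ++code) {
    float linear = PwlGammaToLinear(code / 255.0f);
    float stored = std::round(linear * max_code) / max_code;
    int back = int(std::round(LinearToPwlGamma(stored) * 255.0f));
    exact += (back == code);
  }
  return exact;
}
```

With this assumed curve, `CountExactRoundTrips(16)` reports all 256 codes as
preserved, while storing the linear value in only 8 bits loses some of the
darkest codes.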

This makes blending work correctly in linear space, improving quality of
transparency, lighting passes, and fixing issues such as transparent parts
of impact and footstep decals in 4D5307E6 being bright instead.

The new behavior is enabled by default, as it hugely improves the accuracy
of emulation of this format, which is pretty commonplace in Xbox 360 games,
with likely just a small GPU memory and bandwidth usage increase compared
to the alternatives that were previously available on the HLE RB path.

It's currently implemented only on Direct3D 12, as most of the current GPU
emulation code is planned to be phased out and redone, and no methods other
than 8-bit with pre-conversion were implemented on Vulkan previously.

To implement on Vulkan later, the same conversion as in the Direct3D 12
implementation will need to be done in ownership transfer and resolve
shaders. Currently it's somewhat inconvenient to decouple the conversion
functions in `SpirvShaderTranslator` from an instance of the translator due
to vector constant usage. Later, simpler SPIR-V generation functions may be
added (`spv::Builder` usage in general is overly verbose).

The previously default method (8-bit storage with pre-conversion in shaders
and incorrect blending) can be re-enabled by setting the
"gamma_render_target_as_unorm16" configuration option to `false`. This may
be useful if the game, for instance, switches between 8_8_8_8_GAMMA and
8_8_8_8 formats for the same data frequently, as switching will result in
EDRAM range ownership transfer data copying now. Also, the old path is
preserved for Vulkan devices not supporting R16G16B16A16_UNORM with
blending.

The other workaround that was available previously, replacing the PWL
encoding with host hardware sRGB with linear-space blending in render
target management and in texture fetching, was also inherently inaccurate
in many ways (especially when games have their own PWL encoding math, like
4541080F that displayed incorrect colors on the loading screen), and
required tracking of the encoding needed for ranges in the memory.

The sRGB workaround therefore was deleted in this commit, greatly
simplifying the code in the parts of render target, texture and memory
management and shader generation that were involved in it.
2026-01-18 16:22:22 +03:00
Triang3l f2fabfdf04 [GPU] Change "bpe" to "bpb" (bytes per block) in comments
Forgotten in the other "element" to "block" change.
2026-01-15 20:48:26 +03:00
Triang3l 88f95a8bd6 [GPU] Change "element" name back to "block" in texture addressing
The 32_32_32_FLOAT format seems to be vertex-only, so it looks like there
can't be storage elements smaller than a single texel.

So, use a more precise name that can't be confused with "picture element"
(pixel) or "texture element" (texel) that represents a single logical pixel
rather than a storage block of pixels.
2026-01-15 11:56:45 +03:00
Triang3l c4e1242fa2 [D3D12] Include recent D3D12 headers from the Microsoft GitHub 2026-01-14 23:05:08 +03:00
Triang3l 9535e610b0 [GPU] Change texture address local X offsetting back to addition
More likely to be emitted as an immediate load/store offset in host
hardware shaders.
2026-01-14 00:15:08 +03:00
Triang3l 6db6192170 [GPU] Use XOR to flip X texel group in all load/resolve shaders
For visual consistency (missed in the commit that added LocalXAddressXor).
2026-01-13 23:37:01 +03:00
Triang3l 7db772bfeb [GPU] XeResolveLocalXAddressXor comment typo correction 2026-01-13 23:22:30 +03:00
Triang3l 0f23f05683 [GPU] Simplify local X offsetting with resolution scaling
Switch between even and odd 16-byte element sequences along X by simply
flipping a bit rather than going to a different resolution-scaled group of
pixels, by increasing the size of the group within the constraints imposed
by tiling.
2026-01-13 23:14:15 +03:00
Triang3l 76c531bff2 [GPU] Rename AddressTiled to TiledAddress in shaders
There will possibly be more `XenosTextureTiled*` functions in the future
where the word "Address" would make the names excessively long.
2026-01-13 21:20:26 +03:00
Triang3l 80a9af4277 [GPU] Add macro tile size constants to texture addressing headers 2026-01-13 21:18:25 +03:00
Triang3l ea8ae81bfd [GPU] Remove now-unused texture_conversion.cc/h 2026-01-06 17:59:20 +03:00
Triang3l 28b69c21be [GPU] Document tiled texture address bits
Will be useful for calculating memory extents more precisely in the future.
2026-01-06 17:47:19 +03:00
Triang3l cbbaae8ead [Vulkan] Recompile internal shaders with Vulkan SDK 1.4.335.0
spirv-remap was replaced with spirv-opt --canonicalize-ids, and debug
information is preserved now.
2026-01-04 17:03:37 +03:00
Triang3l ca34a022a5 [Build] Replace spirv-remap with spirv-opt --canonicalize-ids
`spirv-remap` is not present in modern Vulkan SDK versions, it was replaced
with the `--canonicalize-ids` pass in `spirv-opt`.

Overall, canonicalization provides a significant compression improvement,
which is important considering that currently Xenia is distributed in a ZIP
archive and contains many very similar shaders.

With normal DEFLATE compression, canonicalization reduced the size of a ZIP
with `xenia.exe` from 3.54 MB to 3.45 MB in a test done before committing.

Also disable stripping of debug information from shaders, which apparently
was one of the things `spirv-remap` did with `--do-everything`, as binding
and uniform buffer member names heavily aid in debugging in RenderDoc.

Partially integrated from #2329.

Co-authored-by: Herman S. <429230+has207@users.noreply.github.com>
Co-authored-by: Gliniak <Gliniak93@gmail.com>
2026-01-04 16:53:52 +03:00
Triang3l dfa1b3fae1 [Vulkan] Clamp device API version to one Xenia was tested on
Fixes the assertion failure in the Vulkan Memory Allocator library when a
driver for a new API version is released, but VMA hasn't been updated yet.
2025-12-14 22:59:35 +03:00
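
The version clamping described in the commit above might look roughly like
the following sketch. `MakeApiVersion` mirrors the bit layout of Vulkan's
`VK_MAKE_API_VERSION` with variant 0 so that the example needs no Vulkan
headers. `kHighestTestedApiVersion` and `ClampApiVersion` are hypothetical
names, and the 1.3 value is a placeholder rather than the version Xenia
actually uses.

```cpp
#include <algorithm>
#include <cstdint>

// Same packing as VK_MAKE_API_VERSION(0, major, minor, patch):
// major in bits 22-28, minor in bits 12-21, patch in bits 0-11.
constexpr uint32_t MakeApiVersion(uint32_t major, uint32_t minor,
                                  uint32_t patch) {
  return (major << 22) | (minor << 12) | patch;
}

// Placeholder for "the newest API version the application was tested on".
constexpr uint32_t kHighestTestedApiVersion = MakeApiVersion(1, 3, 0);

// Packed versions compare in the same order as (major, minor, patch), so a
// plain minimum is enough to clamp what the driver reports.
constexpr uint32_t ClampApiVersion(uint32_t device_api_version) {
  return std::min(device_api_version, kHighestTestedApiVersion);
}
```

The clamped value, rather than the raw value reported by the driver, would
then be what gets passed on to libraries that validate the version.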
Triang3l fe1fd36137 [D3D12/Vulkan] Simplify host GPU fence management
Replace the `SubmissionTracker`s with new `GPUCompletionTimeline`s with a
more unified interface (using a base class), and without the internal logic
for queue ownership transfers since that idea was scrapped during the
development of the `Presenter`.

Also use this fence management logic for GPU emulation, though without
architectural reworks for now, just on the bottom level.

Still very messy, but can be cleaned up in further GPU command processor
and presenter reworks.
2025-12-14 21:24:38 +03:00
guccigang420 01ae24e46e [Base/Memory] Fix VirtualQuery length parameter 2025-08-20 13:34:39 +03:00
Triang3l 0b2ffa3148 [GPU] Change texture load cbuffer to push constants
Simplify the code, eliminating the need for supporting requesting cbuffers
for anything other than guest draw command execution.
2025-08-20 12:46:26 +03:00
Triang3l 04d5c40d0d [GPU/UI] XeSL readability improvements + float suffix
Use the _xe suffix instead of the xesl_ prefix for quicker visual
recognition of identifiers, also switch to snake_case for consistency.

Also add the f suffix to float32 literals because the Metal Shading
Language is based on C++.
2025-08-19 21:36:06 +03:00
Triang3l 3b4b04c371 [Build] Locate FXC among Windows SDK architectures and versions 2025-08-19 20:48:26 +03:00
Triang3l 4234440681 [Vulkan] Fix VulkanInstance::Create return values 2025-08-15 17:19:23 +03:00
Triang3l b5432ab83f [Vulkan] Refactoring and fixes for VulkanProvider and related areas
Enable portability subset physical device enumeration.

Don't use Vulkan 1.1+ logical devices on Vulkan 1.0 instances due to the
VkApplicationInfo::apiVersion specification.

Make sure all extension dependencies are enabled when creating a device.

Prefer exposing feature support over extension support via the device
interface to avoid causing confusion with regard to promoted extensions
(especially those that required some features as extensions, but had those
features made optional when they were promoted).

Allow creating presentation-only devices, not demanding any optional
features beyond the basic Vulkan 1.0, for use cases such as internal tools
or CPU rendering.

Require the independentBlend feature for GPU emulation, as working around
its absence is complicated, while support is almost ubiquitous.

Move the graphics system initialization fatal error message to xenia_main
after attempting to initialize all implementations, for automatic fallback
to other implementations in the future.

Log Vulkan driver info.

Improve Vulkan debug message logging, enabled by default.

Refactor code, with simplified logic for enabling extensions and layers.
2025-08-14 23:44:21 +03:00
Triang3l a06be03f1b [GPU] Cleanup definitions of some registers
VS/PS_NUM_REG is 6-bit on Adreno 200, and games haven't been seen using
bit 7 to indicate that no GPRs are used. It's not clear why Freedreno
configures it this way.

Some texture fetch fields were deprecated or moved during the development
of the Xenos, reflect that in the comments.

Add definitions of the registers configuring the conversion of vertex
positions to fixed-point. Although there isn't much that can be done with
it when emulating using PC GPU APIs, there are some places in Xenia that
wrongly (though sometimes deliberately, for results closer to the behavior
of the host GPU) assume that the conversion works like in Direct3D 10+,
however the Xenos supports only up to 4 subpixel bits rather than 8. The
effects of this difference are largely negligible, though.
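
To give a feel for the difference between 4 and 8 subpixel bits, here is a
minimal sketch that snaps a screen-space coordinate to a fixed-point grid.
`SnapToSubpixelGrid` is a hypothetical helper, and round-to-nearest is an
assumption; the commit does not say which rounding the hardware applies.

```cpp
#include <cmath>

// Snaps `position` to a grid with `subpixel_bits` fractional bits.
// Round-to-nearest is assumed here for illustration only.
float SnapToSubpixelGrid(float position, int subpixel_bits) {
  float scale = std::ldexp(1.0f, subpixel_bits);  // 2^subpixel_bits
  return std::round(position * scale) / scale;
}
```

For example, 10.03 snaps to 10.0 on a 4-bit grid (steps of 1/16) and to
10.03125 on an 8-bit grid (steps of 1/256).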

Also add more detailed info about register references and differences from
other ATI/AMD GPUs for potential future contributors.
2025-08-06 13:21:19 +03:00
guccigang420 9ae3a72500 [CPU/HIR] Fixed MulHi in value.cc for Linux systems 2025-07-30 23:47:17 +03:00
Stefan Schmidt 87ee12d407 [Testing] Fix building on Linux 2024-08-06 02:39:42 +02:00
Triang3l 3d30b2eec3 [Vulkan] Shader memory export (#145) 2024-05-25 16:31:50 +03:00
Triang3l 210ac4b2d2 [GPU] Fix gamma ramp writing after RegisterFile API change (#2262) 2024-05-18 23:53:09 +03:00
Triang3l 8e7301f4d8 [SPIR-V] Use a helper class for most if/else branching
Simplifies emission of the blocks themselves (including inserting blocks
into the function's block list in the correct order), as well as phi after
the branching.

Also fixes 64bpp storing with blending in the fragment shader interlock
render backend implementation (had a typo that caused the high 32 bits to
overwrite the low ones).
2024-05-16 23:05:49 +03:00
Triang3l 3189a0e259 [GPU] Check memexport stream constant upper bits in range gathering 2024-05-12 20:26:14 +03:00
Triang3l a3304d252f [Base/GPU] Cleanup float comparisons and NaN and -0 in clamping
C++ relational operators are supposed to raise FE_INVALID if an argument is
NaN, use std::isless/greater[equal] instead where they were easy to locate
(though there are other places possibly, mostly min/max and clamp usage was
checked).
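
As an illustration of the quiet comparison functions from `<cmath>`
mentioned above, a clamp can be written as follows. `ClampQuiet` is a
hypothetical helper, and mapping NaN to the lower bound is a choice made for
this sketch, not necessarily what the commit does.

```cpp
#include <cmath>

// Clamps `value` to [lo, hi] without using the relational operators.
// std::isgreaterequal and std::isgreater return false when an argument is
// NaN and do not raise FE_INVALID for quiet NaNs.
float ClampQuiet(float value, float lo, float hi) {
  if (!std::isgreaterequal(value, lo)) {
    return lo;  // Below the range, or NaN (sketch-specific choice).
  }
  if (std::isgreater(value, hi)) {
    return hi;
  }
  return value;
}
```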

Also fixes a copy-paste error making the CPU shader interpreter execute
MINs as MAXs instead.
2024-05-12 19:21:37 +03:00
Triang3l f964290ea8 [Base] Relax the system clock difference allowance in the test
Hopefully should reduce the CI failure rate, although this testing
approach is fundamentally flawed as it depends on OS scheduling.
2024-05-12 17:44:52 +03:00
Triang3l 376bad5056 [GPU] Remove register reinterpret_casts + WAIT_REG_MEM volatility
Hopefully prevents some potential #1971-like situations.

WAIT_REG_MEM's implementation also allowed the compiler to load the value
only once, which caused an infinite loop with the other changes in the
commit (even in debug builds), so it's now accessed as volatile. Possibly
it would be even better to replace it with some (acquire/release?) atomic
load/store some day at least for the registers actually seen as
participating in those waits.
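
The atomic alternative suggested above could be sketched like this.
`PollUntilEqual` is a hypothetical helper that only illustrates the idea of
reloading the value on every iteration; it is not the WAIT_REG_MEM
implementation, and the poll limit exists only to keep the example bounded.

```cpp
#include <atomic>
#include <cstdint>

// Returns true once (reg & mask) == ref, or false after max_polls attempts.
bool PollUntilEqual(const std::atomic<uint32_t>& reg, uint32_t ref,
                    uint32_t mask, uint64_t max_polls) {
  for (uint64_t i = 0; i < max_polls; ++i) {
    // Each iteration performs a fresh atomic load, so a write made by
    // another thread can be observed here.
    uint32_t value = reg.load(std::memory_order_acquire);
    if ((value & mask) == ref) {
      return true;
    }
  }
  return false;
}
```

A writer would pair this with `reg.store(value, std::memory_order_release)`.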

Also fixes the endianness being handled only on the first wait iteration in
WAIT_REG_MEM.
2024-05-12 17:28:17 +03:00
Triang3l f0ad4f4587 [Base] Add aliasing-safe xe::memory::Reinterpret
Accessing the same memory as different types (other than char) using
reinterpret_cast or a union is undefined behavior that has already caused
issues like #1971.
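
A common aliasing-safe way to do this is to copy the bytes with
`std::memcpy`. The sketch below shows that technique under the hypothetical
name `ReinterpretBits`; the actual `xe::memory::Reinterpret` may differ in
signature and constraints.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Returns the bytes of `source` viewed as a Dst. Copying through memcpy is
// well-defined, unlike reading through a reinterpret_cast pointer or an
// inactive union member. Dst must be default-constructible in this sketch.
template <typename Dst, typename Src>
Dst ReinterpretBits(const Src& source) {
  static_assert(sizeof(Dst) == sizeof(Src), "Sizes must match");
  static_assert(std::is_trivially_copyable<Dst>::value &&
                    std::is_trivially_copyable<Src>::value,
                "Both types must be trivially copyable");
  Dst destination;
  std::memcpy(&destination, &source, sizeof(Dst));
  return destination;
}
```

Assuming IEEE 754 single precision, `ReinterpretBits<uint32_t>(1.0f)` yields
`0x3F800000`.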

Also adds a XE_RESTRICT_VAR definition for declaring non-aliasing pointers
in performance-critical areas in the future.
2024-05-12 17:28:16 +03:00
Triang3l a90f83d44c [Vulkan] Non-seamless cube map filtering 2024-05-05 15:20:23 +03:00
Triang3l e9f7a8bd48 [Vulkan] Optional functionality usage improvements
Functional changes:
- Enable only actually used features, as drivers may take more optimal
  paths when certain features are disabled.
- Support VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE.
- Fix the separateStencilMaskRef check doing the opposite.
- Support shaderRoundingModeRTEFloat32.
- Fix vkGetDeviceBufferMemoryRequirements pointer not passed to the Vulkan
  Memory Allocator.

Stylistic changes:
- Move all device extensions, properties and features to one structure,
  especially simplifying portability subset feature checks, and also making
  it easier to request new extension functionality in the future.
- Remove extension suffixes from usage of promoted extensions.
2024-05-04 22:47:14 +03:00
Triang3l f87c6afdeb [Vulkan] Update headers to 1.3.278 2024-05-04 19:59:28 +03:00
Triang3l 9ebe25fd77 [GPU] Declare unused register fields explicitly 2024-05-02 23:31:13 +03:00
Gliniak f6b5424a9f [VFS] Fixed invalid month decoding in decode_fat_timestamp 2023-09-14 12:32:51 +03:00
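
For context on the month decoding fixed above: a FAT date is a 16-bit value
with the day in bits 0-4, the month in bits 5-8 and the year (counted from
1980) in bits 9-15. The sketch below decodes those fields; `FatDate` and
`DecodeFatDate` are hypothetical names, and Xenia's `decode_fat_timestamp`
has its own signature and return type.

```cpp
#include <cstdint>

struct FatDate {
  int year;
  int month;
  int day;
};

// Splits a packed 16-bit FAT date into its three fields.
FatDate DecodeFatDate(uint16_t date) {
  FatDate result;
  result.day = date & 0x1F;           // Bits 0-4.
  result.month = (date >> 5) & 0x0F;  // Bits 5-8.
  result.year = 1980 + (date >> 9);   // Bits 9-15.
  return result;
}
```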
Gliniak 0f331b5313 [Testing] Added test project for vfs
- Added test case for: decode_fat_timestamp
- Changed location of: decode_fat_timestamp
2023-09-14 12:32:51 +03:00
Gliniak c5e6352c34 [CPU] Added constant propagation pass for: OPCODE_AND_NOT 2023-07-27 23:41:45 +03:00
Adriano Martins 1887ea0795 [Base] Add missing #include <cstdint> to utf8.cc 2023-07-27 13:02:54 +03:00
Gliniak 00aba94b98 [NET] NetDll___WSAFDIsSet: Fixed incorrect endianness of fd_count
Plus: limit it to 64 entries
Thanks to Bo98 for pointing that out
2023-06-09 19:47:56 -05:00
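
The two parts of the fix above (byte order and the 64-entry limit) can be
sketched as follows. The helper names are hypothetical, and the layout (a
32-bit big-endian `fd_count` at the start of the guest structure) is an
assumption made for the example.

```cpp
#include <algorithm>
#include <cstdint>

// Limit taken from the commit message.
constexpr uint32_t kMaxFdCount = 64;

// Reads a 32-bit big-endian value byte by byte, independent of host order.
uint32_t LoadBigEndian32(const uint8_t* bytes) {
  return (uint32_t(bytes[0]) << 24) | (uint32_t(bytes[1]) << 16) |
         (uint32_t(bytes[2]) << 8) | uint32_t(bytes[3]);
}

// Returns the guest fd_count in host byte order, clamped to the limit.
uint32_t ReadGuestFdCount(const uint8_t* guest_fd_set) {
  return std::min(LoadBigEndian32(guest_fd_set), kMaxFdCount);
}
```

Without the byte swap, a guest count of 3 would be read as 0x03000000 on a
little-endian host, which is why the clamp is a useful second safeguard.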
Roy Stewart 07e81fe172 [Base] Filter out relative directories on linux 2023-06-09 19:47:28 -05:00
Roy Stewart 41c423109f [Base] Set the path for posix file info 2023-06-09 19:43:49 -05:00
Adrian 4a3b04d4ee [XAM] Implemented XamGetCurrentTitleId 2023-06-09 19:43:15 -05:00
Gliniak 858af5ae75 [XAM] xeXamContentCreate - Disposition cleanup 2023-06-09 19:42:48 -05:00
Gliniak e110527bfe [Base] ListFiles: Prevent leakage of file descriptors 2023-06-09 19:41:27 -05:00