kd-11
3fe9aea5b5
rsx/overlays: Allow some basic communication from the UI components to the backend renderers
2022-10-11 23:13:12 +02:00
Megamouse
ab6ba848b8
overlays: simplify overlay_media_list_dialog
2022-10-11 23:13:12 +02:00
kd-11
65d20f2d08
gl: Add mesa support for polygon offset
2022-10-11 14:00:34 +03:00
kd-11
a229e30b08
rsx: Implement RSX-compliant polygon offset
2022-10-11 14:00:34 +03:00
kd-11
d246a37b11
rsx: Move fp16 toggle to a global shader precision option
2022-10-11 14:00:34 +03:00
Elad Ashkenazi
92b08a4faf
rsx: Fixup a bug after mfc list optimization ( #12782 )
2022-10-10 04:04:41 +03:00
Eladash
a6dfc3be2f
SPU: Enable the MFC list optimization for Atomic RSX FIFO
2022-10-09 19:27:46 +03:00
kd-11
d6d7ade6e3
vk: Reload state on dynamic state changed
2022-10-09 03:00:39 +03:00
Elad Ashkenazi
e0df2c584f
rsx: Attempt to fix frame limiter
2022-10-09 01:33:40 +03:00
kd-11
3c88477270
Fixup for scissor/viewport invalidation rules
2022-10-07 15:27:54 +03:00
kd-11
df46e5137c
gl: Fix texture reconstruction logic
...
- Use correct target types
- Fix key generation to apply differently for each target type
2022-10-07 11:53:34 +03:00
kd-11
ffe8133865
vk: Avoid unnecessary dynamic state updates
2022-10-07 11:53:34 +03:00
kd-11
7140e82189
rsx: Fix program invalidation rules
2022-10-07 11:53:34 +03:00
kd-11
87411da95f
gl: Explicitly declare gl_Position as invariant when using MESA
2022-10-06 06:41:24 +03:00
Eladash
9b5cc7cda7
System.cpp: Fix RSX thread abort
2022-10-04 14:14:38 +03:00
kd-11
73784b9e12
Fix GCC build
2022-10-03 12:57:16 +03:00
kd-11
533f960854
rsx: Handle some more corner cases
2022-10-03 12:57:16 +03:00
kd-11
765208a181
rsx: Avoid clobbering CELL memory when splitting fbos
2022-10-03 12:57:16 +03:00
kd-11
4417701ea7
rsx: Track orphaned surfaces' parent addresses
2022-10-03 12:57:16 +03:00
kd-11
f66eaf8f44
rsx: Add some handy util functions to simple_array
2022-10-03 12:57:16 +03:00
kd-11
a0e2a3db1d
Fix underflow in ZCULL sync
2022-09-30 23:44:37 +03:00
kd-11
102d30db2d
vk: Update support for framebuffer loops to comply with current spec
2022-09-28 12:55:31 +03:00
kd-11
5281a85b67
rsx: Fix compiler warnings
2022-09-28 12:55:31 +03:00
kd-11
de28c812e8
rsx: Re-evaluate color MRT setup when the surface target type changes
2022-09-28 12:55:31 +03:00
kd-11
67c02e3522
vk: Bump compute descriptor pool size to 8k
...
- TODO: This should be dynamic.
2022-09-27 14:58:47 +03:00
kd-11
19dd2a693b
gl: Fix transform job assert
2022-09-27 14:58:47 +03:00
Nekotekina
6ff6a4989a
Implement at32() util
...
Works like .at() but uses source location for "exception".
2022-09-26 18:04:15 +03:00
kd-11
dd8a337b14
rsx: Fix some more warnings
2022-09-22 23:46:48 +03:00
kd-11
0572d44996
gl: Fix enum collision
2022-09-22 23:46:48 +03:00
kd-11
38aa116c59
Fix build
2022-09-22 23:46:48 +03:00
kd-11
61666bae69
rsx: Fix hardware deswizzle not getting used when hardware deswizzle flag is not set
2022-09-22 23:46:48 +03:00
kd-11
362a26a404
gl: Fix D24X8 accelerated encode/decode
...
- PS3 D24X8 is swapped as a full word, unlike PC.
- Add missing paths to handle custom swap behavior.
2022-09-22 23:46:48 +03:00
kd-11
81fa3da101
gl: Minor optimization around test..set patterns in the state tracker
2022-09-22 23:46:48 +03:00
nastys
acc2fea7e3
Update MoltenVK to 250e1f9 and single queue ( #12620 )
2022-09-20 11:12:27 +03:00
kd-11
3dc7b64fa1
rsx: Fix initialization of null cubemap resources
2022-09-19 19:13:46 +03:00
kd-11
79f2c21dfb
gl: Restrict compute image bindings to [0-8]
...
NVIDIA only supports 8 compute image slots even on modern GPUs.
2022-09-19 01:37:10 +03:00
kd-11
df36c44bc2
gl: Avoid UBO/SSBO binding index collisions
...
- Some drivers don't like this. Actually only RADV.
- Almost all GPUs going back 15 years have a large number of UBO slots but limited SSBO slots.
Move UBO slots up as we have tons more headroom there.
2022-09-19 01:37:10 +03:00
Nekotekina
c4db65cc08
Fix one more warning
2022-09-18 18:35:17 +03:00
Nekotekina
b49a1f27eb
Warning fixes
2022-09-17 16:35:02 +03:00
Eladash
c8199de188
CPU preemption control: Improve stutter elimination
2022-09-16 18:57:55 +03:00
Eladash
2e9ee81dcd
CPU preemption control: Improve analysis
2022-09-16 18:57:55 +03:00
Eladash
cf4da5c4d1
CPU preemption control: bugfixes
2022-09-16 18:57:55 +03:00
Eladash
9c5108c1ca
CPU preemption control: Add one more debug variable
2022-09-16 18:57:55 +03:00
Eladash
ec7b18dab5
Implement independent CPU preemptions
2022-09-13 19:28:20 +03:00
kd-11
572a2a06d1
rsx: Properly reset occlusion counters even when the register is not in use.
2022-09-12 17:15:06 +03:00
kd-11
d686b48f65
rsx: Simplify FIFO concurrent access.
2022-09-09 23:17:27 +03:00
kd-11
f319362e35
vk: Fix queue concurrency behavior for images
2022-09-09 23:17:27 +03:00
kd-11
940e726754
rsx: Minor FIFO cleanup
2022-09-09 23:17:27 +03:00
kd-11
f43824762a
rsx: Get rid of an allocation in analyse_vertex_data that adds about 5% overhead.
...
This method is called many thousands of times per frame and that single allocation introduces a small perf hit.
Just get rid of it, it doesn't improve anything to have it there.
2022-09-09 23:17:27 +03:00
kd-11
cd53bb7eff
rsx: Avoid on-the-fly ZCULL allocations with unordered_map
2022-09-09 23:17:27 +03:00
Eladash
274386a078
rsx: Add some debugging information
2022-09-07 18:39:32 +03:00
Nekotekina
5985f0eefa
BufferUtils: cleanup regarding ARM64
2022-09-07 17:59:07 +03:00
Nekotekina
82258915da
BufferUtils: rewrite remaining intrinsic code with simd_builder
2022-09-07 17:59:07 +03:00
Nekotekina
11a1f090d3
BufferUtils: simd_builder refactoring
...
Some simplifications implemented.
2022-09-07 17:59:07 +03:00
Elad Ashkenazi
290226539f
Fix ARM build ( #12606 )
2022-09-04 21:11:04 +03:00
Eladash
11a197a387
Savestates/RSX: fix unintentional vblank thread spin after abort
2022-09-01 20:09:28 +03:00
Eladash
ee1384341e
rsx: Implement atomic vertex upload (with Strict Rendering Mode)
2022-09-01 20:09:28 +03:00
Nekotekina
58e3232710
BufferUtils: Fix regression in upload_untouched
2022-09-01 17:39:04 +03:00
Nekotekina
e28707055b
Implement simd_builder for x86
...
ASMJIT-based tool for building vectorized loops (such as ones in BufferUtils.cpp)
2022-08-28 18:38:52 +03:00
kd-11
1fc0191311
Fix build
2022-08-23 23:49:46 +03:00
kd-11
1f9e04f72d
rsx/vk: Implement flushing surface cache blocks to linear mem
2022-08-23 23:49:46 +03:00
kd-11
bca833dad7
Fix surface reuse
2022-08-20 01:23:15 +03:00
kd-11
f981e05908
rsx: Do not lie about surface details
2022-08-20 01:23:15 +03:00
kd-11
b5abd777b0
rsx: Allow longer dispatch queues to accommodate games with high draw call count
2022-08-19 20:29:32 +03:00
Elad Ashkenazi
b2c9add47e
rsx: Fix semaphore timeout on boot
...
Allow semaphore timeout to be disabled again.
2022-08-19 15:40:20 +03:00
kd-11
a401a192b8
Fixup for dst_stage
2022-08-19 14:29:20 +03:00
kd-11
ad1b007dd1
Fix whitespace
2022-08-19 14:29:20 +03:00
kd-11
71e35c8b4d
vk: Implement support for VK_EXT_attachment_feedback_loop_layout
2022-08-19 14:29:20 +03:00
kd-11
2e504b2dac
rsx: Silence some warnings
2022-08-19 14:29:20 +03:00
kd-11
bacf518189
rsx: Fix 2D intersection tests
2022-08-14 23:53:50 +03:00
kd-11
b960ce1426
vk: Align write length when pre-filling buffers with constant patterns
2022-08-14 23:53:50 +03:00
kd-11
c55a889c23
vk: Initialize buffer info blocks to avoid null descriptors
2022-08-14 23:53:50 +03:00
Eladash
4464a6c3f6
CG-Disasm: Name input/output vertex arrays
2022-08-12 15:20:48 +03:00
Elad Ashkenazi
c4cc0154be
LV2: Optimizations and fixes
...
Fix and optimize sys_ppu_thread_yield
Fix LV2 syscalls with timeout bug. (use ppu_thread::cancel_sleep instead)
Move timeout notification out of mutex scope
Allow g_waiting timeouts to be awakened in scope
2022-08-11 11:42:16 +03:00
kd-11
c51d3b5465
Workaround for msvc weirdness
2022-08-09 18:32:54 +03:00
kd-11
e179adc4a0
rsx: Refactor surface cache storage
2022-08-09 18:32:54 +03:00
kd-11
61a055a1c6
Tuning
2022-08-07 22:14:49 +03:00
kd-11
64b4cfa59f
rsx: Erase surface background when reloading after a pitch mismatch
2022-08-07 22:14:49 +03:00
kd-11
c799ffd223
rsx: Stubs for pitch conversion
2022-08-07 22:14:49 +03:00
kd-11
2445ab8d8e
Fix RSX capture playback
2022-08-04 19:01:45 +03:00
kd-11
3e923b4993
rsx: Optimize VTX_FMT_SNORM16 decoding
...
- Cuts down SNORM16 overhead by ~65%
2022-08-03 23:33:31 +03:00
kd-11
8181498d86
gl: Alias UBO/SSBO slots to avoid exceeding the available number of binding slots.
...
- The sets are different anyway and should not overwrite each other in a proper driver.
2022-08-03 23:33:31 +03:00
kd-11
57dd611111
gl: Fix incomplete stencil view of depth-stencil texture
...
- Samplers must use point sampling for stencil views
2022-08-03 23:33:31 +03:00
Eladash
b3162bd41c
rsx/vp: Fix SNORM16 vertex decoding
2022-08-03 18:11:46 +03:00
Elad Ashkenazi
cd2adbad9a
Update rsx_methods.cpp
2022-08-03 17:15:59 +03:00
Elad Ashkenazi
99730ac4f9
Update rsx_methods.cpp
2022-08-03 17:15:59 +03:00
Elad Ashkenazi
d2ab3383ad
Update rsx_methods.cpp
2022-08-03 17:15:59 +03:00
Elad Ashkenazi
3b15a6b39e
Update rsx_methods.cpp
2022-08-03 17:15:59 +03:00
Elad Ashkenazi
651e58f443
rsx: Trivial optimization
2022-08-03 17:15:59 +03:00
Eladash
769f9e33e9
Savestates/RSX: Fix fifo_control::restore_state
2022-08-03 15:35:41 +03:00
kd-11
052725fdc7
rsx: Do not require ZCULL buffer binding to enable ZPASS counting
...
- ZPASS data is still accessible in unbuffered mode.
The only thing that buffered ZCULL enables is something closer to early-Z where large blocks of pixels can be discarded earlier.
It is strictly a performance optimization and not required for ZPASS to work.
- Update ZCULL stat calculations to take into account unbuffered Z
2022-08-01 00:23:54 +03:00
Megamouse
f90b79791f
HLE: fix file not found errors in media functions
2022-07-31 16:45:05 +02:00
Megamouse
228844c017
overlays: fix line wrapping and position of lines
...
- Fix off-by-one issue when wrapping a line, caused by unnecessary zeroed whitespace.
- Fix centering of lines that end with carriage return caused by overzealous reset of counters.
- Remove fabs where there shouldn't be any
2022-07-29 09:26:45 +02:00
Megamouse
577f379a12
implement cellPhotoImport
2022-07-26 17:27:35 +02:00
kd-11
c9058280e0
vk: Fix a potential deadlock
2022-07-25 21:05:31 +03:00
kd-11
5af50cfd55
vk: Handle corner cases
...
- Fix up flush sequence in DMA handling (WCB)
- Do not request resource sharing if queue family is not different!
2022-07-25 21:05:31 +03:00
kd-11
d846142f0c
vk: Reimplement compliant async texture streaming
...
- Use CONCURRENT queue access instead of fighting with queue acquire/release via submit chains.
The minor benefits of forcing EXCLUSIVE mode are buried under the huge penalty of multiple vkQueueSubmit.
Batching submits does not help alleviate this situation. We simply must avoid interrupting execution.
2022-07-25 21:05:31 +03:00
Megamouse
c40439ae6b
cellMusic/Decode: implement playlist shuffle and repeat
2022-07-22 08:42:43 +02:00
kd-11
246bf1df64
Use C++17 ctor for string_view
2022-07-21 22:29:40 +03:00
kd-11
9a868e9239
gl: Silence compiler warning
2022-07-21 22:29:40 +03:00
kd-11
ab3cde1939
gl: Do some macro patching for intel driver
2022-07-21 22:29:40 +03:00
kd-11
bec3e156fb
vk: Disable robust buffer access for ANV
...
- Robust access is nice, but we don't actually need it
2022-07-21 22:29:40 +03:00
Megamouse
086afbbaa5
overlays: implement back and focus in media_list_dialog
2022-07-21 01:36:33 +02:00
kd-11
680f08c2b8
gl: Destroy barrier signals correctly
2022-07-18 18:58:22 +03:00
kd-11
82bac4173e
gl: Reuse scratch images
2022-07-18 18:58:22 +03:00
kd-11
8a8fda3e02
gl: Combine RGBA8/D24S8 readback and byteswap into one operation
2022-07-18 18:58:22 +03:00
kd-11
1c5b685398
gl: Only toggle state settings that are relevant to the current RSX state
2022-07-18 18:58:22 +03:00
kd-11
e95084f138
gl: Use DSA for imageview configuration and avoid needless bind operations
2022-07-18 18:58:22 +03:00
kd-11
e12d268662
gl: Implement support for texture1D decode
2022-07-18 18:58:22 +03:00
kd-11
6a3f17cd36
gl: Fix compute invocation counts for format handling code
2022-07-18 18:58:22 +03:00
Eladash
3e51426379
Savestates/SPU: Kill emulation when it's safe to save SPU state
2022-07-15 09:30:53 +03:00
Megamouse
105781fa76
overlays: properly align lines with leading or trailing whitespace
2022-07-14 23:32:20 +02:00
Megamouse
d2be12bb07
overlays: find missing characters lost during wrapped rendering
2022-07-14 23:32:20 +02:00
Megamouse
fdc15e12c4
overlays: properly calculate offsets for wrapped text
2022-07-14 23:32:20 +02:00
Eladash
e548743cbf
Fixup rsx captures
2022-07-14 18:50:31 +03:00
kd-11
cdef752a9c
gl: Fix 2D->3D splat in CopyBufferToImage
2022-07-13 02:09:58 +03:00
kd-11
1483941bea
gl: Implement row alignment in CopyBufferToImage routines
2022-07-13 02:09:58 +03:00
kd-11
453e1bfaec
gl: Silence compiler warning
2022-07-13 02:09:58 +03:00
kd-11
82439327fa
gl: Support loading data from SSBO using compute shaders
...
- Gives better performance than using raw draw calls.
- Does not work with all formats. The draw call version is still used when needed.
2022-07-13 02:09:58 +03:00
kd-11
f60002e87d
gl: Optimize memory barriers a bit
...
- Move waits to server side
- Increase the scratch buffer size to avoid waiting on barriers
2022-07-13 02:09:58 +03:00
kd-11
9fc6382909
gl: Finalize BGRA storage format internals
...
- Performance is terrible but it works properly now
2022-07-13 02:09:58 +03:00
kd-11
ebad08aa97
gl: Fix image creation for virtual formats
2022-07-13 02:09:58 +03:00
kd-11
599f1dd157
gl: Properly match BGRA RTT formats
2022-07-13 02:09:58 +03:00
kd-11
bb5ce67d57
gl: Handle corner cases for CopyBufferToImage
...
- Handle 3D textures and cubemaps
- Handle writing to mip > 0
2022-07-13 02:09:58 +03:00
kd-11
f948ce399e
gl: Implement CopyBufferToImage in software
...
- Overrides the driver's CopyBufferToImage handling where possible
2022-07-13 02:09:58 +03:00
kd-11
954c60947d
gl: Avoid calling gl functions without a context even if the object is GL_NONE
...
- While calling glDestroyXXXX with GL_NONE is a no-op, calling it without a context will crash some drivers.
2022-07-13 02:09:58 +03:00
kd-11
98b6783c05
gl: Fix image views broken after refactor
2022-07-13 02:09:58 +03:00
kd-11
0894d2886a
Fix build
2022-07-13 02:09:58 +03:00
kd-11
4995b4abe3
gl: Do not use raw GL image copy command for RSX data
2022-07-13 02:09:58 +03:00
kd-11
35ef19cfc8
gl: Refactor the rest of GLHelpers
2022-07-13 02:09:58 +03:00
kd-11
09824a718f
gl: Separate BGRA8 storage from RGBA8
2022-07-13 02:09:58 +03:00
Eladash
ab27ee4cf4
Savestates/RSX: Save NV406E semaphore waiting
2022-07-12 15:15:42 +03:00
Eladash
24fddf1ded
rsx: Fix emu stopping crash when using multi-threaded rsx
...
FXO signaled abort before it completed its work, leading to unsignalled vk::fence and deadlock. Fix it by deregistering it from FXO.
2022-07-10 14:19:59 +03:00
Eladash
87cd65ff03
Savestates: support game collections
2022-07-10 14:19:59 +03:00
Eladash
4ade06f36f
Savestates/RSX: Restore the ZCULL control state
...
And fix the ZCULL control state at the initial state of RSX.
2022-07-10 14:19:59 +03:00
Nekotekina
4b787b22c8
Implement FN (lambda shortener)
...
Useful for some higher order functions.
Allows making short lambdas even shorter.
2022-07-08 14:47:41 +03:00
Eladash
4ac88fa8d3
Savestates/RSX: Save drawing context
2022-07-08 12:57:43 +03:00
Eladash
5f8f9e33f1
RSX/Savestates: Replace GCM hack with a proper fix
2022-07-08 12:57:43 +03:00
Megamouse
b683110e72
cellGem/overlays: show cursor if necessary
2022-07-07 12:40:23 +02:00
Megamouse
4823d4c32a
input: add background input option
...
Adds an option to disable background input to the IO tab in the settings dialog.
This will disable pad input as well as ps move and overlays input when the window is unfocused.
2022-07-06 21:49:31 +02:00
Eladash
bd9ba7ef1f
Remove incorrect Emu.IsStopped() checks
2022-07-05 08:25:36 +02:00
kd-11
fddb6a31a7
Use utils::c_page_size
2022-07-04 22:35:05 +03:00
kd-11
5cafaef0a9
Aarch64 fixes for RSX
2022-07-04 22:35:05 +03:00
Elad Ashkenazi
fcd297ffb2
Savestates Support For PS3 Emulation ( #10478 )
2022-07-04 16:02:17 +03:00
Nekotekina
69912ba3c7
Partial revert for cf0fcf5a2a
2022-06-30 14:38:14 +03:00
Eladash
cf0fcf5a2a
SPU: Implement execution wake-up delay
2022-06-28 19:54:25 +03:00
Eladash
f5a55b3024
rsx: Fixup after #12052 for frame limiter off
2022-06-25 17:39:07 +03:00
Eladash
7422ab9e55
rsx: Do not discard flip notifications
2022-06-25 15:30:41 +02:00
Eladash
f66256cc13
rsx: PS3 Native frame limiter improvements, add Infinite frame limiter
...
* Do not wait on DEVICE 0x30 semaphore, it seems like it is something to do with queue command synchronization.
- This also fixes cellGcmSetFlipWithWaitLabel which is built specifically to enable accurate RSX flipping time, its waiting command is confirmed to be placed **AFTER** DEVICE 0x30 waiting.
* Fix default vsync state to be enabled. (and set it to enabled in cellGcmSetVBlankFrequency as well)
* Add experimental "Infinite" frame limiter mode.
* Fix spurious enabling of second vblank.
2022-06-25 15:30:41 +02:00
Eladash
5e01ffdfd8
Debugger: Optimize cpu_thread::dump_regs()
...
Reuse string buffer. Copies and reallocations are expensive with such large strings.
2022-06-23 22:41:32 +02:00
Eladash
3899248305
RSX Debugger: Stable NOP skipping
...
Allow addresses of NOP blocks to remain consistent in between debugger position changes except for the first which can shrink or grow.
2022-06-21 16:59:45 +03:00
Jeff Guo
cefc37a553
PPU LLVM arm64+macOS port ( #12115 )
...
* BufferUtils: use naive function pointer on Apple arm64
Use naive function pointer on Apple arm64 because ASLR breaks asmjit.
See BufferUtils.cpp comment for explanation on why this happens and how
to fix if you want to use asmjit.
* build-macos: fix source maps for Mac
Tell Qt not to strip debug symbols when we're in debug or relwithdebinfo
modes.
* LLVM PPU: fix aarch64 on macOS
Force MachO on macOS to fix LLVM being unable to patch relocations
during codegen. Adds Aarch64 NEON intrinsics for x86 intrinsics used by
PPUTranslator/Recompiler.
* virtual memory: use 16k pages on aarch64 macOS
Temporary hack to get things working by using 16k pages instead of 4k
pages in VM emulation.
* PPU/SPU: fix NEON intrinsics and compilation for arm64 macOS
Fixes some intrinsics usage and patches usages of asmjit to properly
emit absolute jmps so ASLR doesn't cause out of bounds rel jumps. Also
patches the SPU recompiler to properly work on arm64 by telling LLVM to
target arm64.
* virtual memory: fix W^X toggles on macOS aarch64
Fixes W^X on macOS aarch64 by setting all JIT mmap'd regions to default
to RW mode. For both SPU and PPU execution threads, when initialization
finishes we toggle to RX mode. This exploits Apple's per-thread setting
for RW/RX to let us be technically compliant with the OS's W^X
enforcement while not needing to actually separate the memory
allocated for code/data.
* PPU: implement aarch64 specific functions
Implements ppu_gateway for arm64 and patches LLVM initialization to use
the correct triple. Adds some fixes for macOS W^X JIT restrictions when
entering/exiting JITed code.
* PPU: Mark rpcs3 calls as non-tail
Strictly speaking, rpcs3 JIT -> C++ calls are not tail calls. If you
call a function inside e.g. an L2 syscall, it will clobber LR on arm64
and subtly break returns in emulated code. Only JIT -> JIT "calls"
should be tail.
* macOS/arm64: compatibility fixes
* vm: patch virtual memory for arm64 macOS
Tag mmap calls with MAP_JIT to allow W^X on macOS. Fix mmap calls to
existing mmap'd addresses that were tagged with MAP_JIT on macOS. Fix
memory unmapping on 16K page machines with a hack to mark "unmapped"
pages as RW.
* PPU: remove wrong comment
* PPU: fix a merge regression
* vm: remove 16k page hacks
* PPU: formatting fixes
* PPU: fix arm64 null function assembly
* ppu: clean up arch-specific instructions
2022-06-14 15:28:38 +03:00
Eladash
264253757c
rsx: Improve Null Renderer
2022-06-12 20:54:42 +03:00
Ani
2512e958fa
glsl: Avoid implicit int->uint conversions ( #12220 )
2022-06-12 18:05:43 +01:00
Elad Ashkenazi
280aa6da91
rsx: Fix NV406E semaphore_acquire timeout detection ( #12205 )
2022-06-12 12:34:29 +03:00
Malcolm Jestadt
0d022d420b
RSX: Add more wide paths for upload_untouched
...
- Adds AVX512 path for upload_untouched u16 with primitive restart, and
AVX2 and AVX512 paths for upload_untouched without restart
- The AVX512 paths handle the remainder in simd code with masking, which
provided a large speedup
- On my i5-1135G7, benchmarking a Demon's Souls scene in Boletaria with
a lot of geometry on screen, via perf:
SSE4_1 0.64%
AVX2 0.59%
AVX512 0.56%
AVX512 w/ remainder masking 0.51%
2022-06-12 06:23:55 +03:00
Elad Ashkenazi
ec530a2c91
rsx: Suggest to try setting RSX FIFO Accuracy to a higher mode of accuracy on crash ( #12204 )
2022-06-11 23:26:12 +02:00
kd-11
7530b3c971
vk: Fix image view search and destroy
2022-06-09 02:13:55 +03:00
Eladash
f9bc7458d4
rsx: Resurgence of HLE GCM
2022-06-06 12:56:25 +02:00
kd-11
6c315e8aee
gl: Disallow overlapping binding points
2022-06-05 10:13:41 +03:00
Elad Ashkenazi
88faac7bbc
rsx: Minor fixup ( #12165 )
2022-06-04 15:04:27 +01:00
Elad Ashkenazi
9bb7e8d614
rsx: Implement atomic FIFO fetching (stability improvement) (non-default setting) ( #12107 )
2022-06-04 15:35:06 +03:00
kd-11
286f97fad0
rsx: Reduce some error spam
2022-06-04 14:02:33 +03:00
kd-11
f0a02e0d9d
gl: Fix leaking texture views
2022-06-04 14:02:33 +03:00
kd-11
8185bfe893
gl: Track image destruction and remove handles from state tracker
...
- Handles are reused for different resources which can cause problems
2022-06-04 14:02:33 +03:00
kd-11
d577cebd89
gl: Refactor image and command-context handling
...
- Move texture object code out of the monolithic header
- All texture binds go through the shared state
- Transient texture binds use a dedicated temp image slot shared with native UI
2022-06-04 14:02:33 +03:00
kd-11
167161d8ce
rsx: Restore some accidentally removed depth-format conversion macros
2022-06-03 11:54:09 +03:00
kd-11
b8b0ecabd8
gl: Fix data pointer on the optimized AMD path
2022-06-03 11:54:09 +03:00
kd-11
bb05de2e80
gl: Fix copypasta
2022-06-03 11:54:09 +03:00
kd-11
7890e87234
gl: Fix warning
2022-06-03 11:54:09 +03:00
kd-11
25c05867d6
gl: Fix ring buffer remove() function
...
- Fixes crash on running a second game in the same session
2022-06-03 11:54:09 +03:00
kd-11
a421270c19
gl: Use new scratch buffer system
2022-06-03 11:54:09 +03:00
kd-11
764fb57fdc
gl: Implement scratch ring buffer with memory barriers
2022-06-03 11:54:09 +03:00
kd-11
3fd846687e
gl: Refactor buffer object code
2022-06-03 11:54:09 +03:00
kd-11
ff9c939720
gl: Assume decode buffer is to be used as SSBO as this seems to be a hint to the driver about where to put the buffer
...
Part of OpenGL's Achilles' heel - the API does not distinguish between VRAM and SYSTEM memory at all and relies on developers wrestling with the driver's heuristic algorithm for this.
2022-06-03 11:54:09 +03:00
kd-11
234db2be3f
gl: Fix texture binding in overlay renderer
2022-06-03 11:54:09 +03:00
kd-11
fc44d53bb0
gl: Reset buffer size on destroying the GPU handle
2022-06-03 11:54:09 +03:00
kd-11
555a4b5f5c
gl: Suggest readback buffer as ssbo if it is not provided
...
- We're likely to jump into a compute or readback pass anyway.
2022-06-03 11:54:09 +03:00
kd-11
a6e6df1445
gl: Implement fast texture readback for D24X8 and RGBA8/BGRA8
2022-06-03 11:54:09 +03:00
Nekotekina
76c72351a5
rsx_methods: fix warning
2022-06-02 12:56:49 +03:00
kd-11
eb52ac55a7
gl: Fix AMD buffer decode
2022-05-31 23:34:14 +03:00
kd-11
d167582f6b
gl: Implement on-chip buffer-to-d24x8 conversion
2022-05-31 23:34:14 +03:00
kd-11
dd6cb054a7
gl: Add missing viewport save
2022-05-31 23:34:14 +03:00
kd-11
b97557ce7b
gl: Use DSA for compressed texture upload
2022-05-31 23:34:14 +03:00
kd-11
964fd1095e
gl: Properly preserve texture state
...
- Remove rogue glBindTexture calls and use gl commandstate object instead
2022-05-31 23:34:14 +03:00
kd-11
fcc6c2384b
Fix linux build
2022-05-31 23:34:14 +03:00
kd-11
a5d73f41b5
gl: Remove debug message
2022-05-31 23:34:14 +03:00
kd-11
1b305bf789
gl: Workaround for poor AMD OpenGL performance
...
- Turns out the AMD driver really hates it if you render with a mapped index buffer.
The driver internally seems to make a copy of the consumed indices and uses that. Very slow.
I was able to isolate this after observing that glDrawArrays is not entirely shit, but glDrawElements duration scaled linearly with the number of vertices.
2022-05-31 23:34:14 +03:00
kd-11
943752db30
gl: Compute optimizations
...
- Keep buffers around longer to allow driver heuristics to work
- Properly initialize the shaders to allow optimal workgroup dispatch size
2022-05-31 23:34:14 +03:00
kd-11
60a2a39e88
gl: Deswizzle textures on the GPU
2022-05-31 23:34:14 +03:00
kd-11
532563e861
gl: Update some more buffer-object functions
2022-05-31 23:34:14 +03:00
kd-11
3ee27bd434
gl: Optimize consumption of buffer objects when uploading textures
2022-05-31 23:34:14 +03:00
kd-11
55e68441cb
gl: Commit to bindless framebuffer object management
2022-05-31 23:34:14 +03:00
kd-11
7ec481d99b
rsx: Allocate scratch memory using simple array with no default initialize
...
- This cuts down processing time significantly by eliminating calls to memset_stosb
2022-05-31 23:34:14 +03:00
kd-11
129e947720
gl: Improve CS throughput
...
- Avoids making too many invocations, especially given the 1D nature of some GPU dispatch handlers
2022-05-31 23:34:14 +03:00
kd-11
e964060a6a
gl: Handle texture binding using the global state tracker
2022-05-31 23:34:14 +03:00
kd-11
74696d2e44
gl: Commit to a consistent global state
2022-05-31 23:34:14 +03:00
kd-11
78746fdb6f
gl: Commit to using DSA for internal buffer management
...
- Gets rid of spammy BindBuffer calls on every draw
2022-05-31 23:34:14 +03:00
kd-11
ed2068fb03
gl: Rewrite buffer mapping
2022-05-31 23:34:14 +03:00
kd-11
b61c4d3693
gl: Fix stat counters
2022-05-31 23:34:14 +03:00
kd-11
81b9952e34
gl: Do not allow cross-aspect bitcasts
...
- There is special handling for some cross-aspect bitcasts in vulkan, but this is not possible using OpenGL
2022-05-31 23:34:14 +03:00
Elad Ashkenazi
95233b5299
rsx: Fix deadlock in vm::_page_unmap
2022-05-30 11:53:34 +03:00
Elad Ashkenazi
610d29dab0
rsx: Fix VBLANK time
2022-05-28 13:00:42 +02:00
Megamouse
345bda69ec
Overlays: Add screenshot message to queue
2022-05-26 08:52:12 +02:00
kd-11
9c824aa0b5
vk: Enable event scope hack for INTEL proprietary drivers
2022-05-24 20:11:31 +03:00
kd-11
efff2a78c8
vk: Restructure how the conditional render evaluation is done ( #12071 )
...
Fixes conditional render fast-path
2022-05-24 11:11:21 +03:00
RipleyTom
e68ffdbc81
Add a message overlay
2022-05-23 08:38:02 +02:00
kd-11
7c8fbc35bc
rsx: Move PS3-compliant behavior to a new option
2022-05-21 16:35:35 +03:00
kd-11
b637429e44
Fix display flickering
2022-05-21 16:35:35 +03:00
kd-11
d52bb78d2c
rsx: Trivial non-blocking display synchronization
2022-05-21 16:35:35 +03:00
kd-11
4e6be9172a
rsx: Asynchronously flush the pipelines when handling ZCULL memory access violations
2022-05-21 10:06:32 +03:00
kd-11
0e1333ed5f
rsx: Deadlock avoidance of accurate RSX reservations
2022-05-21 10:06:32 +03:00
Eladash
cd74fb6a6d
rsx: Implement HW accurate frame limiter
2022-05-20 22:40:48 +02:00
kd-11
ec2d529832
rsx: Separate loop interrupts from graphics state
...
- The interrupts are for multithreaded signals and make the main loop run more aggressively for the next cycle
2022-05-20 16:29:27 +03:00
kd-11
257556bbf5
rsx: Add eng lock before flagging memory unmap
...
- This is much better than polling on atomics every cycle for something that happens a few times during gameplay
2022-05-20 16:29:27 +03:00
kd-11
93d93b4805
rsx: Fix typo
2022-05-20 16:29:27 +03:00
kd-11
e368453751
rsx: Rework loop interrupts a bit
...
- Reset backend interrupt in core handler
- Separate memory config interrupt from regular backend interrupt
2022-05-20 16:29:27 +03:00
kd-11
d0dc095c84
rsx: Silence some log spam
2022-05-20 16:29:27 +03:00
kd-11
360fdca5ac
vk: Avoid multimap when handling image views
2022-05-20 16:29:27 +03:00
kd-11
e1b95913ea
rsx/zcull: Improve deadlock avoidance
...
- Do not acquire eng lock while holding the page lock
RSXThread may be waiting on the page lock and will never ack the pause request
2022-05-20 16:29:27 +03:00
kd-11
a3ea9e2985
rsx/zcull: Less aggressive disabling of optimizations
2022-05-20 16:29:27 +03:00
kd-11
e9bf3e13d0
rsx/zcull: Pause the main thread before flushing reports
2022-05-20 16:29:27 +03:00
kd-11
094fda0e73
Crash fix
2022-05-20 16:29:27 +03:00
kd-11
d2de560060
rsx: Improve sync_hint callback interface
2022-05-20 16:29:27 +03:00
kd-11
5315eb546f
rsx: Stop spamming ZCULL update method
...
- This has a negative impact when ZCULL is active due to spamming __rdtsc
- While the method is fast, it is not free and some checks are done before the instruction can be emitted
Let's use the saved time to actually get something useful done
2022-05-20 16:29:27 +03:00
kd-11
7fa521a046
rsx/vk: Redesign how conditional rendering hints work
...
- Pass a sync address to the backend
- Ignore the hint if the query is running in lazy mode
- Do not submit CBs too close to each other. Submits are expensive
2022-05-20 16:29:27 +03:00
kd-11
0244c4046e
rsx: Lower performance hit due to frequency fetch
2022-05-20 16:29:27 +03:00
kd-11
7e8c93bea2
Random optimization
2022-05-20 16:29:27 +03:00
kd-11
9a1e6cc3e8
rsx: Implement RSX reports area access detection and optimize around it
...
- If nobody is reading RSX reports, do not be in a hurry to write them
- Requires HLE of some methods (cellGcmGetTimestamp) to function correctly
2022-05-20 16:29:27 +03:00
kd-11
f0135a02f5
vk: Unconditionally enable hw acceleration for conditional evaluation
2022-05-20 16:29:27 +03:00
kd-11
0b7e013fbe
rsx: Simplify ZCULL logic a bit
2022-05-20 16:29:27 +03:00
kd-11
850eef0c1a
rsx: Move ZCULL logic to its own file
...
- It's over 1k lines of code in its own namespace; it really should be in its own file
2022-05-20 16:29:27 +03:00
Nekotekina
a2bfd5fcfc
Minor AArch64 support changes
2022-05-04 16:12:32 +03:00
RipleyTom
8316469cfc
Update libusb to v1.0.26
2022-04-29 02:04:52 +02:00
kd-11
7a434d19a6
rsx/vp: Zero-initialize temporary registers
2022-04-28 01:31:07 +03:00
kd-11
95ac7724a6
Fix typos
2022-04-28 01:31:07 +03:00
kd-11
e236ba4daf
rsx: Improve lowered precision comparison emulation
2022-04-28 01:31:07 +03:00
Megamouse
3183d73e4d
OSK/overlays: fix initial input interception
...
Don't use default interception if we already intercept with custom params.
2022-04-26 00:51:38 +02:00
Eladash
7329fa9cf5
TRPLoader: Use std::string_view
2022-04-25 20:15:10 +02:00
Megamouse
8d662e9327
overlays: enable key repeat by default
2022-04-25 19:44:56 +02:00
Megamouse
ff7636ea01
OSK/overlays: handle keyboard enter and escape
2022-04-25 19:44:56 +02:00
Megamouse
8f14f392fd
overlays: ignore input if kb pad handler is active
2022-04-25 19:44:56 +02:00
Megamouse
5fad7e1b87
OSK: flush key input to prevent key event spam
2022-04-25 19:44:56 +02:00
Megamouse
8864f944e2
cellOskDialog: implement dimmer_enabled
2022-04-25 19:44:56 +02:00
Megamouse
918984ee64
overlays: only log actual input loop errors
2022-04-25 19:44:56 +02:00
Megamouse
b29f106c51
cellOskDialog: implement base_color
2022-04-25 19:44:56 +02:00
Megamouse
71f8280c5e
cellOskDialog: implement KeyboardEventHookCallback
2022-04-25 19:44:56 +02:00
Megamouse
0ff293707a
OSK: allow device input during interception
2022-04-25 19:44:56 +02:00
Megamouse
9adab801ac
cellOskDialog: implement device mask and lock
2022-04-25 19:44:56 +02:00
Megamouse
aee91b4f6f
OSK: Ignore gamepad input if a key was pressed
2022-04-25 19:44:56 +02:00
Megamouse
ffd36ea662
OSK: handle keyboard input
2022-04-25 19:44:56 +02:00
nastys
f21b298e5e
Make MSL Fast Math and software vkSemaphore optional
2022-04-24 09:25:13 +02:00
Eladash
f92b487947
rsx: Allow NV0039 0x2100
2022-04-22 18:20:23 +03:00
kd-11
bca7b02ae9
Fix compressed pitch calculation
2022-04-19 22:58:29 +03:00
sguo35
e761b3235c
macos: fix build for arm64
...
Adds arm64 branches to some x86 specific code and modifies some casting
logic to make Clang happy
2022-04-18 17:53:54 +03:00
Eladash
6783bcd273
Log a snippet of guest thread code at crash
2022-04-15 22:34:51 +03:00
Eladash
1d51f3af0c
RSX-Debugger: Implement backwards scrolling
...
* Use 2 points of known true RSX code roots and follow them in order to peek at the current section of valid RSX code:
These roots are: current RSX instruction address and the last targeted address by a branch instruction.
2022-04-15 22:34:51 +03:00
kd-11
57aee92bfe
rsx: Separate guest flip timer from host timing operations
2022-04-13 23:39:01 +03:00
kd-11
89de1a8cf6
overlays: Fix frame timing
2022-04-13 23:39:01 +03:00
kd-11
60cbd7a88c
Automatically determine the epsilon value programmatically
2022-04-13 15:48:28 +03:00
kd-11
2db68acab9
rsx: Implement Z value snapping to account for precision errors
2022-04-13 15:48:28 +03:00
kd-11
e53bbd668b
rsx: Fix surface cache scanning and removal
2022-04-05 14:07:05 +03:00
kd-11
fc05511354
rsx: Optimize software sampling further for the 6-tap kernel
2022-04-04 16:51:03 +03:00
kd-11
ca35a75a7d
rework weighting scheme
2022-04-04 16:51:03 +03:00
kd-11
15b7e4f05e
6-tap experiment
2022-04-04 16:51:03 +03:00
kd-11
49c84f099a
rsx/glsl: Fixup
2022-04-04 16:51:03 +03:00
kd-11
43b267ea51
glsl: Rewrite MS sampling implementation
2022-04-04 16:51:03 +03:00
kd-11
a8441b28e8
rsx: Implement basic 2D bilinear filtering for MSAA images
2022-04-04 16:51:03 +03:00
kd-11
4a86638ce8
rsx: Avoid unnecessary memprotect syscalls
2022-03-29 12:35:32 +03:00
kd-11
e037b5c438
rsx: Handle in-place image swaps when locking data for WCB/WDB
...
- Rare, but possible if a surface address is switched from color to depth usage
- In such a case, deref the old image and ref the new one to avoid leaks
2022-03-29 12:35:32 +03:00
kd-11
f45343a345
rsx: Handle DMA block init where empty pages exist in the range
2022-03-29 12:35:32 +03:00
kd-11
94a7e52c1f
rsx: Disable ref count on exit
2022-03-28 19:55:34 +03:00
kd-11
2b42895bc7
rsx: Reduce log spam a bit
2022-03-28 19:55:34 +03:00
kd-11
d98d152d23
rsx: Fix leaking surface cache refs from texture cache
...
- Lock surfaces in use by texture cache to prevent complete deletion
- Remove discarded surfaces from the reprotect cache to avoid uaf
2022-03-28 19:55:34 +03:00
kd-11
b645a7faf5
vk: Rebuild swapchain in case of unexpected errors during present
2022-03-28 19:55:34 +03:00
kd-11
ffa841e7c1
vk: Force resolve explicitly for transfer operations
2022-03-28 19:55:34 +03:00
kd-11
e66d6a9399
Fix interpreter
2022-03-26 16:10:18 +03:00
kd-11
ef65c47592
vk: Restore UBO alignment
...
- NV requires some very large alignment thresholds
2022-03-26 16:10:18 +03:00
kd-11
1592ecdc55
rsx: Invalidate transform block on program change
...
- Since each program now does a remap of the outputs, we need to reupload the constants
- This is not a loss, constants are almost always changing between draw calls anyway
2022-03-26 16:10:18 +03:00
kd-11
96742852eb
Fix OGL
2022-03-26 16:10:18 +03:00
kd-11
de0e660d28
rsx: Handle vertex shaders with no constant references
...
- If no vc[] refs exist, do not upload anything!
2022-03-26 16:10:18 +03:00
kd-11
d057ffe80f
rsx: Fix program generation and compact referenced data blocks
2022-03-26 16:10:18 +03:00
kd-11
9a2d4fe46b
rsx: Relocatable transform constants
2022-03-26 16:10:18 +03:00
RipleyTom
a4d715e25d
Warning Fixes
2022-03-23 19:35:10 +01:00
kd-11
af0e1f609e
Fix vulkan compilation warnings
2022-03-23 11:26:06 +03:00
kd-11
1ab5b481ff
Fix ambiguous comparison operator warning
2022-03-23 11:26:06 +03:00
kd-11
26ee1246ae
rsx: Block size back down to 4MB
...
- 4M is a good compromise, a 720p surface occupies just under 4MB
2022-03-23 11:26:06 +03:00
kd-11
d0402332f7
rsx: Bump surface cache block size to 16M
2022-03-23 11:26:06 +03:00
kd-11
43c7417906
rsx: Rework ranged map
...
- Adds metadata lookup for intersecting range calculations
- Make fetch/put methods more explicit
2022-03-23 11:26:06 +03:00
kd-11
56540a55ec
Fix linux
2022-03-23 11:26:06 +03:00
kd-11
35ec4de776
rsx: Optimize surface store for faster scanning
2022-03-23 11:26:06 +03:00
kd-11
bc7ed8eaab
rsx/vk: Rework MSAA implementation
2022-03-17 22:02:20 +03:00
Megamouse
04df392866
Log cpu usage periodically
2022-03-16 19:42:06 +01:00
kd-11
78b8bd80e4
rsx: Unconditionally set MSAA flags if MSAA is active
2022-03-11 01:15:13 +03:00
kd-11
1943d9819f
rsx: Clean up surface cache routines around RTT invalidate
2022-03-10 20:43:58 +03:00
kd-11
59a0cf94ab
rsx: Fix msvc build
2022-03-08 22:06:26 +03:00
kd-11
3e4faf602a
rsx: Fix clang build
2022-03-08 22:06:26 +03:00
kd-11
454a724f4e
rsx: Reduce the performance impact of enabling the profiling timer
...
- Just use TSC if available
2022-03-08 22:06:26 +03:00
kd-11
cfecbb24ca
rsx: Avoid calling slow functions every draw call
...
- Use TSC for timing where interval duration matters.
- Use atomic counter for ordering timestamps otherwise.
2022-03-08 22:06:26 +03:00
kd-11
762b594927
rsx: Fully process texture if surface cache configuration changed
2022-03-08 22:06:26 +03:00
kd-11
8d3d290e33
rsx: Fix build
2022-03-08 22:06:26 +03:00
kd-11
0df903090d
rsx: Optimize metrics a bit
...
- For some reason this has a massive impact on performance above some arbitrary threshold of calls
Shows up under surface_cache::get_merged_memory_region when doing gathers.
2022-03-08 22:06:26 +03:00
kd-11
6812fa4764
rsx: Fix surface write coherency when MSAA is active
2022-03-08 22:06:26 +03:00
Megamouse
cd97d74f0f
cellMusic/Decode: add SelectContents functions
2022-03-08 09:02:59 +01:00
Megamouse
aafd74f9ea
cellMusicDecode: initial implementation
...
Implements the basic functionality of cellMusicDecode.
Works with Space Invaders (if you add the list selection from the other PR).
Probably fixes SSX custom music.
2022-03-05 18:34:27 +01:00
kd-11
0dbfe314a3
vk: Encode image type when caching resources
2022-03-01 21:51:55 +03:00
kd-11
00a1864a95
Revert "rsx: Downgrade depth-1 3D images to 2D (#11593)"
...
This reverts commit 6c096b72b5.
2022-03-01 21:51:55 +03:00
kd-11
6c096b72b5
rsx: Downgrade depth-1 3D images to 2D (#11593)
...
- Fixes problems with implicit view types derived from dimensions.
2022-03-01 10:45:50 +03:00
kd-11
e035000864
vk: Do not enable passthrough DMA unconditionally (yet)
...
- There are still some kinks to work out. Host labels do not fix all the bugs which means I missed something.
2022-02-26 10:28:46 +03:00
kd-11
6db5d83615
Flush dma offloader on texture read sema
2022-02-25 10:53:55 +03:00
kd-11
f3823232e0
Disable passthrough DMA for proprietary intel driver
2022-02-23 21:15:08 +03:00
kd-11
6b8b23c401
vk: Drain the label queue before using the CPU fallback to avoid out-of-order signals
...
- This avoids crashes in some game engines which expect RSX semaphores to signal in the order they are submitted.
2022-02-23 12:57:04 +03:00
kd-11
6fd2a9b677
rsx: Remove leftover dprints
2022-02-23 12:57:04 +03:00
kd-11
da559b5568
vk/rsx: Tuning and optimization for host labels
2022-02-23 12:57:04 +03:00
kd-11
24587ab459
rsx: Add the option to the advanced tab
2022-02-23 12:57:04 +03:00
kd-11
c7e49b58a8
rsx: Implement host GPU sync labels
2022-02-23 12:57:04 +03:00
kd-11
10e6b43a2f
Drop redundant declaration
2022-02-21 23:58:01 +03:00
kd-11
0809e7cf9f
Fix build
2022-02-21 23:58:01 +03:00
kd-11
12fd43e1c6
vk: Remove unused variables
2022-02-21 23:58:01 +03:00
kd-11
397a795e75
vk: Remove hardcoded command buffer list length
2022-02-21 23:58:01 +03:00
kd-11
1f9ade0ab6
vk: Remove pointless function (VKGSRender::open_command_buffer)
...
A relic of the past, back before we wrote wrappers for raw handles.
2022-02-21 23:58:01 +03:00
kd-11
83407c386c
vk: Move renderer types to a separate file
...
- Makes my life easier managing conflicts
2022-02-21 23:58:01 +03:00
kd-11
b791d90b35
vk: Rewrite command buffer chains
2022-02-21 23:58:01 +03:00
Megamouse
93e7988df7
rsx: add boost mode shortcut
2022-02-20 11:56:11 +01:00
nastys
7801e8368b
Add MoltenVK Semaphore setting
2022-02-20 08:47:16 +01:00
kd-11
254ddcad51
vk/dma: Initialize COW DMA block contents to avoid leaks
...
- It is possible to lose data when uploading since the result of map_dma can change types and handles.
- Consider sync-on-exit for inherited spans
Not a problem when using passthrough DMA, but this extension does not work properly on NVIDIA + windows
2022-02-16 16:33:27 +03:00
kd-11
2d5d5746d1
gl: Harmonize format conversion values
...
- Return values that are true to the PS3, not the host.
2022-02-13 15:31:39 +03:00
kd-11
314b63eebf
vk: Drop unused native format ABGR8
2022-02-13 15:31:39 +03:00
kd-11
f382d54e9a
gl: Remove pointless assert
2022-02-13 15:31:39 +03:00
kd-11
df5295ae85
vk: Per work-queue scratch resources
...
- Avoids parallel tasks from trampling over each other's data
2022-02-13 14:39:42 +03:00
kd-11
c8ad8b18bb
vk: Ignore queue transfer stuff when using 'fast' mode
2022-02-13 14:39:42 +03:00
kd-11
44cc254620
Fix linux build
2022-02-13 14:39:42 +03:00
kd-11
cef512a123
vk: Spec-compliant async compute
2022-02-13 14:39:42 +03:00
kd-11
ec3e8de780
rsx: End the current frame before performing cache cleanup to release in-flight data
2022-02-10 22:20:56 +03:00
kd-11
f667b52cca
vk: Rewrite resource management
2022-02-10 22:20:56 +03:00
kd-11
48b54131f6
vk: Fix up multiple resource allocation routines
...
- Originally part of async bringup. Imported to allow smoother transition.
2022-02-10 22:20:56 +03:00
Megamouse
d172b9add6
Rename CallAfter to CallFromMainThread
2022-02-07 19:42:08 +01:00
kd-11
2d9f21a2ea
rsx: Lower performance warnings to 'warn' level instead of 'error' level to avoid causing panic for users
2022-02-07 09:25:01 +03:00
kd-11
247759b75b
rsx: Fix memory tagging and add some security checks
2022-02-07 09:25:01 +03:00
kd-11
90d368ae30
vk: Speed up cached image search a bit
2022-02-06 15:49:50 +03:00
kd-11
a2d33a7d76
vk: Fix WCB crash
2022-02-06 15:49:50 +03:00
kd-11
51f9310b9f
vk: Silence compiler warnings
2022-02-06 15:49:50 +03:00
kd-11
dca3d477c9
vk: Use image hot-cache for faster allocation times
...
- Creating new images is expensive.
- We can keep around a set of images that have been recently discarded and use them instead of creating new ones from scratch each time.
2022-02-06 15:49:50 +03:00
nastys
6b370e85d5
Add overlay animations
2022-02-06 12:26:34 +01:00
Eladash
e951c619c5
Implement Emulator::GracefulShutdown()
2022-02-05 11:49:29 +01:00
kd-11
86919ec0e1
rsx: Validate requested images before attempting to upload them
...
- Do not allow dimensions of 0 to reach the backend APIs
2022-01-30 14:58:51 +03:00
kd-11
0e320d17c1
vk: Fix 'grow' behavior when we reach the size limit
...
- Just swap out the current heap ptr and spawn a fresh one. Chances are, we can spare 1GB of host memory.
2022-01-30 10:56:15 +03:00
kd-11
d063f0b335
vk: Fix working buffer calculation for emulated D16F operations
2022-01-30 10:56:15 +03:00
Eladash
781b2b4548
Implement fs::isfile (#11447)
2022-01-29 22:10:48 +03:00
Nekotekina
16aae4eb77
Fixup creating image path
2022-01-26 15:46:16 +03:00
Nekotekina
3a1082fe0d
Fix overlays::image_info constructor
2022-01-26 15:46:16 +03:00
kd-11
ffe00e8619
gl: Clean up format bitcast checks and register D32F type for FORMAT_CLASS16F
...
- Also hides a dangerous export for vulkan, same as GL
2022-01-26 12:08:36 +03:00
kd-11
3fa45ff994
Fix missing typeless info update
2022-01-26 12:08:36 +03:00
Eladash
73ff506b88
overlay_controls.cpp: Improve image_info ctor withstandability
2022-01-26 10:35:52 +03:00
kd-11
3a1676e558
vk: Fix float16 requirement issue
2022-01-25 21:34:21 +03:00
Nekotekina
0db9850a73
Add loop building utilities for ASMJIT
...
Refactor copy_data_swap_u32 a bit
2022-01-25 03:16:37 +03:00
Nekotekina
12c83b340d
Remove built_function
...
With today's branch prediction techniques, it's hardly useful.
2022-01-24 22:21:41 +03:00
kd-11
1fa82eec89
vk: Rework format feature validation
...
- Requirements have changed a lot over the years. We no longer blit Z formats around for example because they never support linear filtering
- Removing some unused requirements allows more hardware to be usable
2022-01-24 19:14:27 +03:00
kd-11
2f7d38bb81
rsx: Improve coverage checking logic to handle 3D and cubemap resources
2022-01-23 00:03:03 +03:00
kd-11
4f8b5849b7
rsx: Take depth into account when calculating coverage
2022-01-23 00:03:03 +03:00
kd-11
7f216f2581
rsx: Fix local slice height calculation
2022-01-23 00:03:03 +03:00
kd-11
6ffd38c393
vk: Only enable DCC workaround if the format features allow it
2022-01-22 13:16:48 +03:00
nastys
801e7f3c2f
macOS: Implement texture swizzling for 16-bit formats
2022-01-22 00:17:17 +01:00
nastys
c7140df5f8
Initial support for Apple GPUs
2022-01-22 00:17:17 +01:00
nastys
6b5f0957ce
Disable macOS swizzling workaround
2022-01-22 00:17:17 +01:00
kd-11
3942a464fe
vk: Avoid leaking descriptor copies
2022-01-20 19:21:24 +03:00
kd-11
2331dc3256
vk: Keep the total number of allocated samplers under control
2022-01-20 19:21:24 +03:00
Nekotekina
4704367382
Remove unnecessary asmjit::imm_ptr
2022-01-18 00:10:32 +03:00
Nekotekina
14cca55b50
PPU: refactor vector rounding instructions
...
Fix: nearbyint -> roundeven
2022-01-18 00:10:32 +03:00
kd-11
000ec71629
Fix invalid descriptor setup if subdraw0 has broken vertex setup
2022-01-17 12:38:10 +03:00
kd-11
3e794e7fdb
rsx: Optimize 8-bit rounding logic a bit
...
- NV hw does not like the raw use of round()
2022-01-17 10:28:23 +03:00
kd-11
c38ca21a81
rsx: Round up 8-bit ROP output on NVIDIA cards
...
- NV GPUs have a tendency to be off by a very small margin, breaking rendering when greaterThan/lessThan checks are used.
- NOTE: Currently this setting is using the sRGB flag which indicates 8-bit output.
Only one game is currently known to care about this behaviour so this is good enough for now.
2022-01-17 10:28:23 +03:00
kd-11
f923eaf09a
rsx: Surface format remapping enhancements
2022-01-17 10:28:23 +03:00
Nekotekina
580bd2b25e
Initial Linux Aarch64 support
...
* Update asmjit dependency (aarch64 branch)
* Disable USE_DISCORD_RPC by default
* Dump some JIT objects in rpcs3 cache dir
* Add SIGILL handler for all platforms
* Fix resetting zeroing denormals in thread pool
* Refactor most v128:: utils into global gv_** functions
* Refactor PPU interpreter (incomplete), remove "precise"
* - Instruction specializations with multiple accuracy flags
* - Adjust calling convention for speed
* - Removed precise/fast setting, replaced with static
* - Started refactoring interpreters for building at runtime JIT
* (I got tired of poor compiler optimizations)
* - Expose some accuracy settings (SAT, NJ, VNAN, FPCC)
* - Add exec_bytes PPU thread variable (akin to cycle count)
* PPU LLVM: fix VCTUXS+VCTSXS instruction NaN results
* SPU interpreter: remove "precise" for now (extremely non-portable)
* - As with PPU, settings changed to static/dynamic for interpreters.
* - Precise options will be implemented later
* Fix termination after fatal error dialog
2022-01-15 06:48:04 +03:00
kd-11
d6aa834b5f
vk: Enable shading rate hack for all GPUs
...
- This is a hack, ideally we should be using coverage-based masking when writing the exploded texture.
- We do not have access to the fragment coverage mask and it is non-trivial to integrate it in a competent manner.
2022-01-14 10:21:38 +03:00
kd-11
6d737e61fd
rsx: Use 32 bit integers for pitch
...
- RSX max pitch = 65536 which requires 17 bits
2022-01-10 12:27:30 +03:00
kd-11
83026fd263
rsx: use coverage ratio to determine when too much data is overlapping
2022-01-07 22:55:27 +03:00
kd-11
92824b6729
rsx: Rework invalidation tagging
2022-01-07 22:55:27 +03:00
kd-11
7563655221
rsx: Bump surface removal threshold values
...
- It is much slower to attempt surface removal than to render duplicates on the host GPU
2022-01-07 22:55:27 +03:00
kd-11
6889b48973
rsx: Add optimized version of section removal code
2022-01-07 22:55:27 +03:00
Eladash
bba528e2ae
rsx: Fix wrong fault report in initialization (#11323)
...
* rsx: Fix wrong fault report in initialization
* Ensure emu.isstopped() == true at RPCS3 startup
Based on zero initialization.
2022-01-05 20:41:01 +03:00
kd-11
7c47b0029c
gl: Fully drop alignment restriction for compressed textures
...
- This is just not part of spec, there is no enforcement for multiple of block size for width or height of s3tc compressed images.
- This restriction does indeed exist for ASTC and ETC but we're not using those formats.
2022-01-02 14:29:38 +03:00
Nekotekina
cb2748ae08
Update ASMJIT (new upstream API)
2021-12-29 02:45:00 +03:00
Nekotekina
d836033212
LLVM: enable some JIT events (Intel, Perf)
...
Made some related adjustments.
Currently incomplete.
2021-12-26 16:41:37 +03:00
Nekotekina
510041a873
rsx_methods.cpp: optimize compile time (120s to 10s)
...
Untemplate NV308A_COLOR
2021-12-26 14:40:21 +03:00
Nekotekina
8b4b6ba946
copy_data_swap_u32: build AVX-512 path
2021-12-26 14:40:21 +03:00
Nekotekina
599e00d6da
BufferUtils: remove dead code (vertex streaming)
...
RIP. It won't be useful.
2021-12-26 14:40:21 +03:00
Nekotekina
3cd8891ab8
Re-refactor copy_data_swap_u32 again
...
Drop AVX2 path for now, since it usually operates on small data.
Rely on automatic SSE vectorization on recent compilers.
Side refactoring on JIT.h to workaround weird conflict issue.
2021-12-26 14:40:21 +03:00
kd-11
a9303acfdf
rsx: Fix zclip w scaling
2021-12-26 12:50:31 +03:00
nastys
a0040e6fb1
macOS: Implement texture converter for Metal (2) (#11289)
...
* macOS: Implement texture converter for Metal (2)
* Fix texture conversion formatting
2021-12-24 15:46:37 +03:00
kd-11
28d7af313b
rsx: Remove noisy debug print
2021-12-24 15:13:33 +03:00
kd-11
39ef39aa4e
rsx: Exercise caution when testing for overlaps in invalidated sections
2021-12-24 15:13:33 +03:00
kd-11
56dd09f4fe
rsx: Handle floating point shenanigans
...
- If near and far clip are too close together, the API will not distinguish between them leading to out of bounds values
2021-12-22 22:08:53 +03:00
kd-11
de495952fd
rsx: Enable fallback for devices without wide integer Z buffers
2021-12-22 22:08:53 +03:00
kd-11
1ce5349199
rsx: Remove zclip hackery
...
- Calculates precise Z value as requested by the game
- Works properly if the underlying Z format matches the PS3 1:1 but may cause minor problems otherwise
2021-12-22 22:08:53 +03:00
Nekotekina
12e3c9e08b
Use PAUSE in vk::query_pool_manager::get_query_result
2021-12-21 23:28:09 +03:00
Nekotekina
262ff01619
Use aligned stores in write_index_array_data_to_buffer
...
Ensure that target buffer is cache line aligned.
Improve stx::make_single to support alignment.
2021-12-21 23:28:09 +03:00
Nekotekina
76ccaf5e6f
BufferUtils: refactoring
...
Optimize CPU capability tests for arch-tuned builds.
Separate streaming and non-streaming utilities.
Rewritten copy_data_swap_u32(_cmp) with AVX2 path.
2021-12-21 23:28:09 +03:00
nastys
47e4a95d8f
Fix remap_vector redefinition on macOS (#11271)
2021-12-21 10:36:09 +01:00
nastys
08333e0876
macOS moltenVK support and SIGBUS handling (#11252)
2021-12-12 21:35:56 +01:00