rpcsx

mirror of https://github.com/RPCSX/rpcsx.git synced 2026-02-19 14:15:28 +01:00

Author	SHA1	Message	Date
Nekotekina	69912ba3c7	Partial revert for `cf0fcf5a2a`	2022-06-30 14:38:14 +03:00
Eladash	cf0fcf5a2a	SPU: Implement execution wake-up delay	2022-06-28 19:54:25 +03:00
Eladash	f5a55b3024	rsx: Fixup after #12052 for frame limiter off	2022-06-25 17:39:07 +03:00
Eladash	7422ab9e55	rsx: Do not discard flip notifications	2022-06-25 15:30:41 +02:00
Eladash	f66256cc13	rsx: PS3 Native frame limiter improvements, add Infinite frame limiter * Do not wait on DEVICE 0x30 semaphore, it seems like it is something to do with queue command synchronization. - This also fixes cellGcmSetFlipWithWaitLabel which is built specifically to enable accurate RSX flipping time, its waiting command is confirmed to be placed AFTER DEVICE 0x30 waiting. * Fix default vsync state to be enabled. (and set it to enabled in cellGcmSetVBlankFrequency as well) * Add experimental "Infinite" frame limiter mode. * Fix spurious enabling of second vblank.	2022-06-25 15:30:41 +02:00
Eladash	5e01ffdfd8	Debugger: Optimize cpu_thread::dump_regs() Reuse string buffer. Copies and reallocations are expensive with such large strings.	2022-06-23 22:41:32 +02:00
Eladash	3899248305	RSX Debugger: Stable NOP skipping Allow addresses of NOP blocks to remain consistent in between debugger position changes except for the first which can shrink or grow.	2022-06-21 16:59:45 +03:00
Jeff Guo	cefc37a553	PPU LLVM arm64+macOS port (#12115) * BufferUtils: use naive function pointer on Apple arm64 Use naive function pointer on Apple arm64 because ASLR breaks asmjit. See BufferUtils.cpp comment for explanation on why this happens and how to fix if you want to use asmjit. * build-macos: fix source maps for Mac Tell Qt not to strip debug symbols when we're in debug or relwithdebinfo modes. * LLVM PPU: fix aarch64 on macOS Force MachO on macOS to fix LLVM being unable to patch relocations during codegen. Adds Aarch64 NEON intrinsics for x86 intrinsics used by PPUTranslator/Recompiler. * virtual memory: use 16k pages on aarch64 macOS Temporary hack to get things working by using 16k pages instead of 4k pages in VM emulation. * PPU/SPU: fix NEON intrinsics and compilation for arm64 macOS Fixes some intrinsics usage and patches usages of asmjit to properly emit absolute jmps so ASLR doesn't cause out of bounds rel jumps. Also patches the SPU recompiler to properly work on arm64 by telling LLVM to target arm64. * virtual memory: fix W^X toggles on macOS aarch64 Fixes W^X on macOS aarch64 by setting all JIT mmap'd regions to default to RW mode. For both SPU and PPU execution threads, when initialization finishes we toggle to RX mode. This exploits Apple's per-thread setting for RW/RX to let us be technically compliant with the OS's W^X enforcement while not needing to actually separate the memory allocated for code/data. * PPU: implement aarch64 specific functions Implements ppu_gateway for arm64 and patches LLVM initialization to use the correct triple. Adds some fixes for macOS W^X JIT restrictions when entering/exiting JITed code. * PPU: Mark rpcs3 calls as non-tail Strictly speaking, rpcs3 JIT -> C++ calls are not tail calls. If you call a function inside e.g. an L2 syscall, it will clobber LR on arm64 and subtly break returns in emulated code. Only JIT -> JIT "calls" should be tail. * macOS/arm64: compatibility fixes * vm: patch virtual memory for arm64 macOS Tag mmap calls with MAP_JIT to allow W^X on macOS. Fix mmap calls to existing mmap'd addresses that were tagged with MAP_JIT on macOS. Fix memory unmapping on 16K page machines with a hack to mark "unmapped" pages as RW. * PPU: remove wrong comment * PPU: fix a merge regression * vm: remove 16k page hacks * PPU: formatting fixes * PPU: fix arm64 null function assembly * ppu: clean up arch-specific instructions	2022-06-14 15:28:38 +03:00
Eladash	264253757c	rsx: Improve Null Renderer	2022-06-12 20:54:42 +03:00
Ani	2512e958fa	glsl: Avoid implicit int->uint conversions (#12220)	2022-06-12 18:05:43 +01:00
Elad Ashkenazi	280aa6da91	rsx: Fix NV406E semaphore_acquire timeout detection (#12205)	2022-06-12 12:34:29 +03:00
Malcolm Jestadt	0d022d420b	RSX: Add more wide paths for upload_untouched - Adds AVX512 path for upload_untouched u16 with primitive restart, and AVX2 and AVX512 paths for upload_untouched without restart - The AVX512 paths handle the remainder in SIMD code with masking, which provided a large speedup - On my i5-1135G7 in Demon's Souls, benchmarking a scene in Boletaria with a lot of geometry on screen via perf: SSE4_1 0.64% AVX2 0.59% AVX512 0.56% AVX512 w/ remainder masking 0.51%	2022-06-12 06:23:55 +03:00
Elad Ashkenazi	ec530a2c91	rsx: Suggest to try setting RSX FIFO Accuracy to a higher mode of accuracy on crash (#12204)	2022-06-11 23:26:12 +02:00
kd-11	7530b3c971	vk: Fix image view search and destroy	2022-06-09 02:13:55 +03:00
Eladash	f9bc7458d4	rsx: Resurgence of HLE GCM	2022-06-06 12:56:25 +02:00
kd-11	6c315e8aee	gl: Disallow overlapping binding points	2022-06-05 10:13:41 +03:00
Elad Ashkenazi	88faac7bbc	rsx: Minor fixup (#12165)	2022-06-04 15:04:27 +01:00
Elad Ashkenazi	9bb7e8d614	rsx: Implement atomic FIFO fetching (stability improvement) (non-default setting) (#12107)	2022-06-04 15:35:06 +03:00
kd-11	286f97fad0	rsx: Reduce some error spam	2022-06-04 14:02:33 +03:00
kd-11	f0a02e0d9d	gl: Fix leaking texture views	2022-06-04 14:02:33 +03:00
kd-11	8185bfe893	gl: Track image destruction and remove handles from state tracker - Handles are reused for different resources which can cause problems	2022-06-04 14:02:33 +03:00
kd-11	d577cebd89	gl: Refactor image and command-context handling - Move texture object code out of the monolithic header - All texture binds go through the shared state - Transient texture binds use a dedicated temp image slot shared with native UI	2022-06-04 14:02:33 +03:00
kd-11	167161d8ce	rsx: Restore some accidentally removed depth-format conversion macros	2022-06-03 11:54:09 +03:00
kd-11	b8b0ecabd8	gl: Fix data pointer on the optimized AMD path	2022-06-03 11:54:09 +03:00
kd-11	bb05de2e80	gl: Fix copypasta	2022-06-03 11:54:09 +03:00
kd-11	7890e87234	gl: Fix warning	2022-06-03 11:54:09 +03:00
kd-11	25c05867d6	gl: Fix ring buffer remove() function - Fixes crash on running a second game in the same session	2022-06-03 11:54:09 +03:00
kd-11	a421270c19	gl: Use new scratch buffer system	2022-06-03 11:54:09 +03:00
kd-11	764fb57fdc	gl: Implement scratch ring buffer with memory barriers	2022-06-03 11:54:09 +03:00
kd-11	3fd846687e	gl: Refactor buffer object code	2022-06-03 11:54:09 +03:00
kd-11	ff9c939720	gl: Assume decode buffer is to be used as SSBO as this seems to be a hint to the driver about where to put the buffer Part of OpenGL's Achilles' heel - the API does not distinguish between VRAM and SYSTEM memory at all and relies on developers wrestling with the driver's heuristic algorithm for this.	2022-06-03 11:54:09 +03:00
kd-11	234db2be3f	gl: Fix texture binding in overlay renderer	2022-06-03 11:54:09 +03:00
kd-11	fc44d53bb0	gl: Reset buffer size on destroying the GPU handle	2022-06-03 11:54:09 +03:00
kd-11	555a4b5f5c	gl: Suggest readback buffer as ssbo if it is not provided - We're likely to jump into a compute or readback pass anyway.	2022-06-03 11:54:09 +03:00
kd-11	a6e6df1445	gl: Implement fast texture readback for D24X8 and RGBA8/BGRA8	2022-06-03 11:54:09 +03:00
Nekotekina	76c72351a5	rsx_methods: fix warning	2022-06-02 12:56:49 +03:00
kd-11	eb52ac55a7	gl: Fix AMD buffer decode	2022-05-31 23:34:14 +03:00
kd-11	d167582f6b	gl: Implement on-chip buffer-to-d24x8 conversion	2022-05-31 23:34:14 +03:00
kd-11	dd6cb054a7	gl: Add missing viewport save	2022-05-31 23:34:14 +03:00
kd-11	b97557ce7b	gl: Use DSA for compressed texture upload	2022-05-31 23:34:14 +03:00
kd-11	964fd1095e	gl: Properly preserve texture state - Remove rogue glBindTexture calls and use gl commandstate object instead	2022-05-31 23:34:14 +03:00
kd-11	fcc6c2384b	Fix linux build	2022-05-31 23:34:14 +03:00
kd-11	a5d73f41b5	gl: Remove debug message	2022-05-31 23:34:14 +03:00
kd-11	1b305bf789	gl: Workaround for poor AMD OpenGL performance - Turns out the AMD driver really hates it if you render with a mapped index buffer. The driver internally seems to make a copy of the consumed indices and uses that. Very slow. I was able to isolate this after observing that glDrawArrays is not entirely shit, but glDrawElements duration scaled linearly with the number of vertices.	2022-05-31 23:34:14 +03:00
kd-11	943752db30	gl: Compute optimizations - Keep buffers around longer to allow driver heuristics to work - Properly initialize the shaders to allow optimal workgroup dispatch size	2022-05-31 23:34:14 +03:00
kd-11	60a2a39e88	gl: Deswizzle textures on the GPU	2022-05-31 23:34:14 +03:00
kd-11	532563e861	gl: Update some more buffer-object functions	2022-05-31 23:34:14 +03:00
kd-11	3ee27bd434	gl: Optimize consumption of buffer objects when uploading textures	2022-05-31 23:34:14 +03:00
kd-11	55e68441cb	gl: Commit to bindless framebuffer object management	2022-05-31 23:34:14 +03:00
kd-11	7ec481d99b	rsx: Allocate scratch memory using simple array with no default initialize - This cuts down processing time significantly by eliminating calls to memset_stosb	2022-05-31 23:34:14 +03:00

3762 commits