Commit graph

751 commits

Author SHA1 Message Date
Eladash 945abcc6cd rsx: Align down index array offset
* Also use improved to_be_t<> template (recetly ignoring one byte long types) for vm gsl::byte referencing, remove redundent narrow<> cast (same type)
2019-10-22 13:45:09 +03:00
kd-11 901942f24a rsx: Replace pointless f32[4] restriction on texture parameters.
- Use a struct instead to improve readability and remove pointless OpBitCast
2019-10-22 13:44:49 +03:00
kd-11 f7842b765f rsx: Implement packed format renormalization
- Renormalizes arbitrary N-bit values as 8-bit normalized.
- NV hardware performs integer normalization at 8 bits if the size is less than 8.
- This can cause significant arithmetic drift because the error is multiplied by a huge number when sampling.
2019-10-22 13:44:49 +03:00
Eladash 29cddc30f0 rsx: Fix vblank signals flood after Emu.Resume() 2019-10-21 15:31:45 +03:00
Eladash 5de0005f5a rsx: Report full method range on invalid methods
Also report full command on fifo desync event for the first time
2019-10-21 15:31:45 +03:00
eladash 730e9cde84 sys_rsx: Improve allocations and error checks
* allow sys_rsx_device_map to be called twice: in this case the DEVICE address retrived from the previous call returned
* Add ENOMEM checks for sys_rsx_memory_allocate and sys_rsx_context_allocate
* add EINVAL check for sys_rsx_context_allocate if memory handle is not found
* Separate sys_rsx_device_map allocation from sys_rsx_context_allocate's
* Implement sys_rsx_memory_free; used by cellGcmInit upon failure
* Added context_id checks
* Throw if sys_rsx_context_allocate was called twice.
2019-10-21 15:31:45 +03:00
Eladash 06017cb14e rsx: Recover from invalid writes to CELL_GCM_NV4097_SET_INDEX_ARRAY_DMA
Also: Trigger a FIFO recovery when encountering an invalid method.
2019-10-10 19:34:23 +03:00
Eladash 0b2fa6ffdc rsx: Flush FIFO GET before smeaphore_acquire 2019-09-30 17:30:15 +03:00
Eladash 822287b418 rsx: Avoid unsigned/signed mismatch with fifo ret addr 2019-09-29 13:05:24 +03:00
kd-11 1464069476 rsx: Restructure deferred flip queue handling
- Allows frameskipping to occur naturally if RSX thread is bombarded with flip requests but just jumping to the last one if possible
- See request_emu_flip() for async frame submission and implicit skipping
- Also allows display queue to fill faster than the flip thread can drain the queue
2019-09-28 21:13:56 +03:00
Nekotekina bd1a24b894 Tidy endianness support (se_t) implementation
Move se_t and se_storage to util/endian.hpp
Use single template instead of two specializations.
Add minor optimization for MSVC.
Remove v128 dependency.
Try to enable intrinsics for unaligned data.
Fix minor bug in u16/u32/u64 specializations.
2019-09-28 15:39:50 +03:00
Nekotekina 5f9c5e8765 Use g_fxo for rsx::thread 2019-09-26 23:26:36 +03:00
kd-11 1a892c6b1b rsx: Avoid recursion in flip handler 2019-09-20 15:08:41 +03:00
kd-11 e0005ec347 rsx: Refactoring and improvement
- Separate displayed statistics from actual backend statistics.
  Allows asynchronous flipping to work correctly as it just uses display stats.
  The real stats are used by the frame scope marker to determine behavior like engaging the FIFO optimizer or skipping draw calls correctly.
2019-09-19 23:10:09 +03:00
kd-11 2c76f47eec rsx: Restructure flip code and frame scoping
- Add an explicit frame scope marker tied in with the queue_prepare command
  Since queue_prepare is emitted at the end of a frame, it can be used as end-of-frame in games that emit this
- If this command is not emitted, fifo flatenner and frameskip will not work
2019-09-19 23:10:09 +03:00
kd-11 52e8747b83 rsx: Workaround for exit deadlock
- Avoids games locking up when the stop button is pressed
2019-09-12 23:32:21 +03:00
kd-11 7f99de36c1 rsx: Fixup for surface_target_a flag being broken
- While the mask for surface_a is at index 0, the surface cache expects the order to be maintained correctly!
  Set the correct mask since surface store now checks each RTT individually
2019-08-30 21:46:19 +03:00
kd-11 e0a7912d7c rsx: Check for stencil writes when determining zeta_write flag 2019-08-30 21:45:41 +03:00
kd-11 04c808b8ab rsx: Fixup for MRT color write lookup and surface_target_a 2019-08-28 16:12:10 +03:00
kd-11 2962e05f26 rsx: Implement per-RTT color masks
- Also refactors and simplifies some common code in surface store and rsx core
2019-08-27 21:59:02 +03:00
Nekotekina d2eba2387b Use g_fxo for display_manager 2019-08-27 03:50:15 +03:00
Nekotekina 38a06c4b14 Use g_fxo for SysRsxConfig
Rename to lv2_rsx_config
2019-08-27 03:50:15 +03:00
kd-11 3e28e4b1e0 rsx/decompiler: Restructure program register behavior
- Fix reading of varying registers in FP
  Different registers have different behavior
- Always write to varying registers. If a register is not written to, it is initialized to (0, 0, 0, 1)
- Reimplements two-sided lighting correctly without hacks
- Also bumps shader cache version
2019-08-26 20:03:31 +03:00
kd-11 f9aea076ae rsx: Implement depth_buffer_float support.
- Since this is transparent to the application at all time, it only becomes a problem when doing memory transfer or DEPTH->RGBA conversion in shaders.
2019-08-26 20:03:31 +03:00
kd-11 9d981de96d rsx: Fix offloader deadlock
- Do not allow offloader to handle its own faults. Serialize them on RSX instead.
  This approach introduces a GPU race condition that should be avoided with improved synchronization.
- TODO: Use proper GPU-side synchronization to avoid this situation
2019-08-25 22:09:20 +03:00
kd-11 f0bd0b5a7c rsx: Conditional render sync optimization
- ZCULL queue was updated to one-per-cb but the conditional render sync hint was not updated.
- Do not unconditionally flush the queue unless the upcoming ref is contained in the active CB.
- This avoids spamming queue flush, which frees up resources and improves performance
2019-07-30 21:13:42 +03:00
Eladash 230c3d55b6 Fixup 2019-07-27 04:03:29 +01:00
Eladash fcc75c8b0f rsx: Write atomically semaphore updates and fix zcull timestamp 2019-07-26 21:27:55 +03:00
Eladash c53f0dd7b5 rsx: Fix gcm unmap events 2019-07-26 21:27:55 +03:00
Eladash 85b1152e29 Timers scaling and fixes 2019-07-23 00:09:01 +01:00
kd-11 9a7c2784f0 rsx: Do not clip scissor to viewport when doing buffer clear 2019-07-20 16:39:32 +03:00
kd-11 e2574ff100 rsx: Support CSAA transparency without multiple rasterization samples enabled 2019-07-19 15:49:08 +03:00
kd-11 b5a2f0df68 rsx: Implement separate viewport raster clipping
- Merge viewport raster window and scissor into one clipping region
- Viewport raster clip is different from viewport geometry clipping in
hardware as the latter is configurable separately
2019-07-19 14:21:19 +03:00
Eladash c4d8ef4340 rsx: Allow to configure vblank rate
Removed "HLE protection" hack from sys_rsx_context_attribute
2019-07-12 00:19:56 +03:00
kd-11 d8f753f1e8 rsx: Do not allow framebuffer surfaces that exceed their allocated pitch dimensions
- Truncate surfaces to forcefully fit inside the declared region
2019-07-11 13:22:13 +03:00
kd-11 219a5382f7 rsx: If no array streams are enabled, mark inline array as disabled (null render) 2019-07-09 16:27:59 +03:00
Eladash 43f919c04b Fixup after #6143 (#6146)
vm::spu max address was overflowing resulting in issues, so cast to u64 where needed. Fixes #6145.
    Use vm::get_addr instead of manually substructing vm::base(0) from pointer in texture cache code.
    Prefer std::atomic_thread_fence over _mm_?fence(), adjust usage to be more correct.
    Used sequantially consistent ordering in semaphore_release for TSX path as well.
    Improved memory ordering for sys_rsx_context_iounmap/map.
    Fixed sync bugs in HLE gcm because of not using atomic instructions.
    Use release memory barrier in lwsync for PPU LLVM, according to this xbox360 programming guide lwsync is a hw release memory barrier.
    Also use release barrier where lwsync was originally used in liblv2 sys_lwmutex and cellSync.
    Use acquire barrier for isync instruction, see https://devblogs.microsoft.com/oldnewthing/20180814-00/?p=99485
2019-06-29 18:48:42 +03:00
Eladash 1ee7b91646 Refactoring (#6143)
Prefer vm::ptr<>::ptr over vm::get_addr.
    Prefer vm::_ptr/base over vm::g_base_addr with offset.
    Added methods atomic_t<>::bts and atomic_t<>::btr .
    Removed obsolute rsx:🧵:Read/WriteIO32 methods.
    Removed wrong check in semaphore_release.
    Added handling for PUTRx commands for RawSPU MFC proxy.
    Prefer overloaded methods of v128 instead of _mm_... in VPKSHUS ppu interpreter precise.
    Fixed more potential overflows that may result in wrong behaviour.
    Added io/size alignment check for sys_rsx_context_iounmap.
    Added rsx::constants::local_mem_base which represents RSX local memory base address.
    Removed obsolute rsx:🧵:main_mem_addr/ioSize/ioAddress members.
2019-06-29 01:27:49 +03:00
JohnHolmesII 23094b48bb Fix warnings related to -Wswitch
Add default cases.
Move default breaks to newline
Add proper handling in some instances.
Add missing enums to switches
2019-06-28 01:40:52 +03:00
kd-11 4ff77a8555 rsx: Improve balancing of the offloader thread
- Use two counters to avoid atomic operations
- Yield instead of sleeping because some games are very sensitive to timing
2019-06-25 20:50:54 +03:00
kd-11 d26b25816d rsx: Improve profiling setup
- Avoid spamming QPC when not needed
- Free performance when debug overlay is not enabled
2019-06-25 20:50:54 +03:00
kd-11 b893a75002 rsx: Rework RSX offloading
- Use a lockless queue
- Do not enqueue small transfers
2019-06-25 20:50:54 +03:00
kd-11 0fa3bcc336 rsx: Asynchronous data transfer 2019-06-25 20:50:54 +03:00
Lassi Hämäläinen c963c51a60 Remove unnecessary header includes
- Manually removed lot of unneeded #includes to clean code and reduce
  compilation time
- Reordered some of the #includes to be in more logical order
2019-06-25 17:11:10 +03:00
Eladash cd0ef99df5 Fix BE endianess arch support in semaphore_406e (#6116)
Add raw() methods for endianness support types and make use of it.
2019-06-21 19:29:49 +03:00
kd-11 bca5f94b3f rsx: Add option to toggle MSAA 2019-06-14 16:19:52 +03:00
kd-11 4a5bbba277 rsx: Enable MSAA
- vk: Enable depth buffer resolve+unresolve
- vk: Add AMD stenciling extension support
- rsx: Temporarily disables MSAA-compatible hacks such as transparency AA
- TODO: Add paths to optionally disable MSAA
2019-06-14 16:19:52 +03:00
kd-11 0d906d6974 rsx: Remove surface aa_mode hacks 2019-06-14 16:19:52 +03:00
scribam 13671d9684 rsx: Apply Clang-Tidy fix "modernize-loop-convert" + const when relevant 2019-06-12 15:11:52 +03:00
scribam 635695ac78 rsx: Apply Clang-Tidy fix "modernize-use-emplace" 2019-06-12 15:11:52 +03:00
scribam a555504142 rsx: Apply Clang-Tidy fix "modernize-deprecated-headers" 2019-06-12 15:11:52 +03:00
scribam db926ee671 rsx: Apply Clang-Tidy fix "performance-unnecessary-value-param" 2019-06-12 15:11:52 +03:00
scribam 81a3b49c2f rsx: Apply Clang-Tidy fix "readability-container-size-empty" 2019-06-12 15:11:52 +03:00
Nekotekina dfd50d0185 Implement std::bit_cast<>
Partial implementation of std::bit_cast from C++20.
Also fix most strict-aliasing rule break warnings (gcc).
2019-06-02 23:22:16 +03:00
scribam 78c7ef3039 rsx: Use clear() instead of resize(0)
The result is the same but clear [1] has slightly less code than resize [2] and signals better the intent IMHO.

[1] fb7fb646fa/libstdc%2B%2B-v3/include/bits/stl_vector.h (L1495)
[2] fb7fb646fa/libstdc%2B%2B-v3/include/bits/stl_vector.h (L934)
2019-06-01 22:59:23 +03:00
kd-11 463b1b220d rsx: Improve accuracy of shadow compare Ops when non-integer depth formats are used
- The fixed-point D24S8 format does special Z clamping during compare which matches PS3 behaviour
- D32S8 is a floating point format and comparison with Dref > 1 always fails causing black edges/borders
2019-04-25 16:23:05 +03:00
eladash 6f76e34104 rsx: Fix race on clearing native_ui vs emu_requested flag 2019-04-20 01:04:41 +03:00
eladash 888cb9d673 Remove reader_lock executed in every instruction by RSX
Use optimistic double check instead, use one load instruction for the check to be atomic

+ Read emu status once every FIFO iteration
2019-04-20 01:04:41 +03:00
kd-11 0f7af391d7 vk: Implement copy-to-buffer and copy-from-buffer for depth_stencil
formats
- Allows D24S8 and D32S8 transport via typeless channels
- Allows uploading and downloading D24S8 data easily
- TODO: Implement optional byteswapping to fix flushed readbacks with
the same method
2019-04-09 13:40:54 +03:00
eladash 8185ef7610 rsx: Improve vblank accuracy 2019-03-31 14:57:21 +03:00
kd-11 f4ebcb0029 rsx: Properly decode packed renders from the type flag
- Seems to occupy bits [8-9]
2019-03-10 16:09:05 +03:00
eladash 6f770c8e35 Fix potential crash in begin_occlusion_query() while closing the Emu 2019-01-30 18:44:29 +03:00
kd-11 fb778e4821 rsx: Reimplement attrib divisor 2019-01-25 14:34:22 +03:00
kd-11 6fdc0fd7f0 rsx: Reimplement MSAA transparency
- Apply dither to edges that almost fail the straight-up alpha test
- Significantly improves alpha tested geometry far from the camera
- Also removes blend factor overrides/hacks as they give incorrect results due to background bleeding
2019-01-25 14:34:22 +03:00
kd-11 0f64583c7a rsx: Reimplement pitch lookup
- Remove the required_xxx_pitch constraint as it makes no sense. The pitch controls what can be written per line.
- It is possible to have a huge surface width but only render to a small region at the beginning and have a smaller pitch than can fit the surface (NFS carbon)
2019-01-06 10:44:40 +03:00
kd-11 64a8829614 rsx: Minor cleanup 2019-01-06 10:44:40 +03:00
kd-11 15488eb247 rsx: Avoid unnecessarily touching framebuffer memory
- Do not bind companion framebuffer when clearing single aspect; let the
  contest mechanism sort it out instead
- Do not prematurely tag framebuffers, instead only do so at
  write-confirmation time. Should avoid false tagging if setup does not
  allow a render to occur.
2019-01-06 10:44:40 +03:00
eladash db784556aa rsx: Evaluate cond render test at set_render_enabled 2018-12-30 15:04:59 +01:00
kd-11 f48abde14b rsx: Fixups for immediate rendering mode
- Immediate mode is isolated from the rest of the vertex configuration
- TODO: Verify register behaviour when immediate mode is used
  Check if per-primitive const register values are supported (likely are)
2018-12-24 09:05:19 +03:00
eladash 45ed58cdaf Fix rsx capture replay
Allow to capture non-increment cmd flag that was missing in command.reg
2018-12-15 19:40:18 +03:00
eladash 87988e9da8 rsx fifo: Stability improvements
* Restore stack in fifo error handling

* Update get register after the cmd execution

* Fix put pause in the middle of command

* Add restore points when branching to self

* Precise nopcmd detection

* Test all invalid cmds for early treatment of queue corruption
2018-12-15 19:40:18 +03:00
eladash 415b995a54 log rsx get ctrl 2018-12-15 19:40:18 +03:00
Nekotekina 476090a747 Detach VBlank and RSX Decompiler threads
Should fix exception handling in RSX Thread
2018-12-04 23:41:54 +03:00
kd-11 ec768afbd9 rsx: Flip workarounds for applications that flip via syscall
- Do not assume flip marks end-of-frame if executed via syscall
- Also disables skip_frame for these applications as there is no frame boundary
- NOTE: QUEUE_HEAD cannot be relied on as it is seemingly possible to flip the same head and not need to queue it
2018-11-30 23:51:25 +03:00
kd-11 2168159d03 gl: Fix flip regression
- Restore graphics state after flip (including active fbo) because flip can be made through a syscall
2018-11-30 23:51:25 +03:00
kd-11 5b6e1420f3 rsx: Pipeline barriers fixed up
- Ensure barriers are invoked even if no draw occurs!
-- Ensures that deferred commands are executed eventually
2018-11-30 23:51:25 +03:00
kd-11 1d19f71a46 rsx: Re-enable fifo error reset 2018-11-30 23:51:25 +03:00
kd-11 718a04c84f fixup: Clear disabled attrib entries 2018-11-30 23:51:25 +03:00
kd-11 833c25894f [WIP] rsx: Rebase cleanup 2018-11-30 23:51:25 +03:00
kd-11 5193c99973 rsx: Enable dynamic FIFO preprocessing
- Tries to detect when FIFO preprocessing is beneficial and only enables optimizations if the benefit outweighs the cost
- Current threshold is at least 500 draw calls saved at over 2000 draw calls to justify the overhead
- TODO: More tuning for other CPUs
2018-11-30 23:51:25 +03:00
kd-11 7b065d7781 rsx: Fixup; input attributes blob decoding
- Use an unstructured blob and index into the vec4 structures to extract the real data
2018-11-30 23:51:25 +03:00
kd-11 846daadd5d rsx: Fixups
- Improve vertex attribute layout format. Allows for full 16-bit attribute divisor
- Use actual pitch when declaring framebuffer rsx pitch instead of register value in case of swizzle? rendering
2018-11-30 23:51:25 +03:00
kd-11 2e32777375 rsx: Scrap the prebuffered queue approach
- Basically starting over
- The cost of making command copies into the queue has a measurable impact
2018-11-30 23:51:25 +03:00
kd-11 1ad76ad331 rsx: Restructure programs
- Also re-enable pipeline optimizations
2018-11-30 23:51:25 +03:00
kd-11 677b16f5c6 rsx: Fixups
- Also fix visual corruption when using disjoint indexed draws

- Refactor draw call emit again (vk)

- Improve execution barrier resolve
  - Allow vertex/index rebase inside begin/end pair
  - Add ALPHA_TEST to list of excluded methods [TODO: defer raster state]

- gl bringup

- Simplify
  - using the simple_array gets back a few more fps :)
2018-11-30 23:51:25 +03:00
kd-11 e01d2f08c9 rsx: Refactor FIFO
- Removes fifo structures from common RSXThread
- Sets up a dedicated FIFO controller
- Allows for configurable queue optimizations
2018-11-30 23:51:25 +03:00
eladash 37b6afaf2c rsx: inlined array stride fix 2018-11-11 23:17:07 +03:00
eladash 75221a6078 rsx: Fix inlined vertex array validation 2018-11-04 22:57:18 +03:00
Megamouse d56c85fe01 RSX/Capture: fix filePath and remove strict mode check (#5283)
- Fixes regression introduced by kd-11 when merging in jarves' flip rework.
2018-10-27 13:06:50 +03:00
elad 6829fa0286 rsx: Improve inlined arrays (#5248)
* rsx: Implement register reads in inlined arrays

* rsx: Check for disabled streams in inlined arrays
2018-10-20 16:00:53 +03:00
Nekotekina 1b37e775be Migration to named_thread<>
Add atomic_t<>::try_dec instead of fetch_dec_sat
Add atomic_t<>::try_inc
GDBDebugServer is broken (needs rewrite)
Removed old_thread class (former named_thread)
Removed storing/rethrowing exceptions from thread
Emu.Stop doesn't inject an exception anymore
task_stack helper class removed
thread_base simplified (no shared_from_this)
thread_ctrl::spawn simplified (creates detached thread)
Implemented overrideable thread detaching logic
Disabled cellAdec, cellDmux, cellFsAio
SPUThread renamed to spu_thread
RawSPUThread removed, spu_thread used instead
Disabled deriving from ppu_thread
Partial support for thread renaming
lv2_timer... simplified, screw it
idm/fxm: butchered support for on_stop/on_init
vm: improved allocation structure (added size)
2018-10-19 22:22:35 +03:00
Nekotekina da6ce80f4f Make vm::get_super_ptr return contiguous memory
Cleanup RSX code complexity
2018-09-27 23:37:13 +03:00
eladash 72ba062b1a rsx: Fix pfifo ret opcode 2018-09-27 17:47:32 +03:00
eladash a47ebad24c rsx: Fix pfifo nop cmd 2018-09-27 17:47:32 +03:00
Nekotekina 306f95a9ae New named_thread template (preview)
Old class named_thread renamed to old_thread
It's too hard to move in a single commit
2018-09-27 14:04:16 +03:00
kd-11 6a9f234dc7 rsx: Fixup flip behaviour
- handle_emu_flip is very heavy, only fire
2018-09-26 19:41:50 +03:00
kd-11 f72157bcec rsx: Fix vertex attrib parsing 2018-09-25 22:03:35 +03:00
kd-11 a3d44b5e1f rsx: Cleanup changes for the flip patch 2018-09-24 16:44:02 +03:00
Jake 699eadc84f rsx: Move render flip from rsx queue command to flip command 2018-09-24 16:44:02 +03:00
Rui Pinheiro 35139ebf5d Texture cache cleanup, refactoring and fixes 2018-09-24 15:26:40 +03:00
Rui Pinheiro f3029b2b42 Change Cell->RSX map/unmap notifications
This allows for further flexibility on the RSX side, allowing us to fix
some bugs and crashes in later commits.
2018-09-24 15:26:40 +03:00
eladash e0a676a3fe rsx: Fix vertex arrays fetch with inlined draws 2018-09-24 13:25:05 +03:00
eladash e6b68b260a rsx: Improve FIFO mem faults handling
increase the delay between faults, reduce log spam by allowing the messages to stack up
2018-09-20 01:05:40 +03:00
eladash a8ea576b22 rsx/cellgcm: Implemet initialization registers reset 2018-09-20 01:05:40 +03:00
elad d24f9194f7 typo fix
shader's location is decremented by one to match cellGcm's constants.
2018-09-13 16:49:58 +03:00
eladash b9ad578b00 rsx: Add a default shader address state 2018-09-13 16:49:58 +03:00
scribam 4cb98014a2 rsx: tiny zcull optimizations 2018-09-13 12:43:40 +03:00
eladash efbd77deb4 rsx: dont silently ignore null shader address 2018-09-12 00:40:20 +03:00
Nekotekina ca5158a03e Cleanup semaphore<> (sema.h) and mutex.h (shared_mutex)
Remove semaphore_lock and writer_lock classes, replace with std::lock_guard
Change semaphore<> interface to Lockable (+ exotic try_unlock method)
2018-09-03 23:00:36 +03:00
kd-11 2e0ecb556c rsx: Possible fix for UB data type consistency 2018-09-03 18:24:20 +03:00
kd-11 6399833182 rsx: Fix endianness order when immediate mode register is updated, but used as register lookup
- Simplify the code by unifying all the register-backed memory
2018-09-03 18:24:20 +03:00
kd-11 c6e35706a3 vk: Support sw component swizzle decode because metal sucks 2018-08-23 22:54:56 +03:00
eladash 56d553f10d Rsx: fix cmd jump over put register 2018-08-22 12:20:31 +03:00
kd-11 dd21e43ed5 rsx: Force disable draw reordering when capturing a frame 2018-08-18 16:14:30 +03:00
kd-11 0f36e87010 zcull: Improve the delay algorithm to be more consistent
- Use proper time checking; depending on what is being done one 'tick' can
  be almost a millisecond long or several nanoseconds
- Avoid spamming the system timer unless necessary
2018-08-18 16:14:30 +03:00
kd-11 38191c3013 rsx: Avoid acquiring the vm lock; deadlock evasion
- A possible deadlock is still present if rsx is trying to get a super_ptr whilst the vm lock holder is in an access violation
  This patch makes this scenario very unlikely since each block need only be touched once
2018-08-18 16:14:30 +03:00
kd-11 373e02e91c rsx: Timestamp accuracy workaround 2018-08-18 16:14:30 +03:00
kd-11 1200ca8172 rsx: Optimize hash_struct; vk cleanup 2018-08-18 16:14:30 +03:00
kd-11 d0165290b6 rsx: Refactor and fix framebuffer layout checks
- Refactors shared code back into rsx core
- Adds extra check to avoid contest confusion
2018-08-18 16:14:30 +03:00
kd-11 9f0fada17a ZCULL: lower notice severity to avoid spam 2018-08-18 16:14:30 +03:00
kd-11 8800c10476 zcull synchronization tweaks
- Implement forced reading when calling update method to sync partial lists
- Defer conditional render evaluation and use a read barrier to avoid extra work
- Fix HLE gcm library when binding tiles & zcull RAM
2018-08-18 16:14:30 +03:00
kd-11 3b47e43380 rsx: Synchronization rewritten
- Do not do a full sync on a texture read barrier
- Avoid calling zcull sync in FIFO spin wait
- Do not flush memory to cache from the renderer side; this method is now obsolete
2018-08-18 16:14:30 +03:00
eladash f349695a75 Rsx: rewrite address translation 2018-08-13 16:16:34 +03:00
eladash 9e380a4a4a Rsx: avoid invalid cmds execution 2018-08-12 23:37:24 +03:00
eladash c80eb1ba02 Rsx: fix CALL and RET cmd 2018-08-12 23:37:24 +03:00
Megamouse eecb984689 RSX/Qt: add more performance overlay options to the gui 2018-07-28 23:10:45 +02:00
kd-11 19d808d378 rsx/gl: Minor cleanup and optimization
- Track register change status
- Remove unused gl classes
2018-07-22 17:19:59 +03:00
kd-11 e7f30640ef rsx: Async shader compilation
- Defer compilation process to worker threads
- vulkan: Fixup for graphics_pipeline_state.
  Never use struct assignment operator on vk** structs due to padding after sType member (4 bytes)
2018-07-14 15:19:56 +03:00
kd-11 fa55a8072c rsx: Improve vertex textures support
- Adds proper support for vertex textures, including dimensions other than 2D textures
- Minor analyser fixup, removes spurious 'analyser failed' errors
- Minor optimizations for program state tracking
2018-07-12 18:02:28 +03:00
kd-11 4d40ed9dbd rsx: Silence harmless warning 2018-07-07 16:20:33 +03:00
kd-11 2ca935a26b vp: Improve vertex program analyser
- Adds dead code elimination
- Fix absolute branch target addresses to take base address into account
- Patch branch targets relative to base address to improve hash matching
- Bumps shader cache version
- Enables shader logging option to write out vertex program binary,
  helpful when debugging problems.
2018-07-07 16:20:33 +03:00
eladash 345f92ab0a rsx: more efficient command reading 2018-06-27 21:59:34 +03:00
kd-11 1730708f47 rsx: Rework memory protection management for framebuffer access
- Avoid re-locking memory if there is no reason to do so (no draws issued)
- Actively bound regions should always get written to the backing cache
- Forcefully read memory during download if writes to the target have occured since last sync event
2018-06-26 20:07:20 +03:00
eladash b456955688 rsx: fix hardcoded rsx allocation address 2018-06-24 10:57:30 +03:00
VelocityRa dd0684b58a overlays/perf_overlay: Make pos, font, opacity, margin configurable
- Also some perf overlay refactoring
2018-06-18 22:34:26 +03:00
kd-11 6362942928 rsx: Avoid semaphore acquire deadlock 2018-05-30 13:30:23 +03:00
VelocityRa c8d8a81ccd overlays: Performance Overlay 2018-05-30 12:35:41 +03:00
eladash 23b380eb41 allow deallocations to unmap rsx mapped memory 2018-05-29 19:57:28 +03:00
kd-11 83f9be2524 rsx: Promote FIFO optimizations outside of strict mode
- The benefits of FIFO optimizations are huge in some cases.
  The optimizations also do not break any tested applications so no need to disable with strict mode
- A debug option is provided to disable this behaviour for testing
2018-05-29 13:54:30 +03:00
kd-11 493d4e8613 fixup - Improve invalidated region checks for performance 2018-05-24 10:36:04 +03:00
kd-11 92b5a705d8 fixup - locking 2018-05-23 19:07:08 +03:00
kd-11 b957eac6e8 rsx: Avoid calling any blocking callbacks from threads that are not rsx::thread
- Defers on_notity_memory_unmapped to only run from within rsx context
- Avoids passive_lock + writer_lock deadlock
2018-05-23 19:07:08 +03:00
kd-11 8fcd5c1e5a rsx: Texture cache fixes
1. rsx: Rework section synchronization using the new memory mirrors
2. rsx: Tweaks
    - Simplify peeking into the current rsx::thread instance.
      Use a simple rsx::get_current_renderer instead of asking fxm for the same
    - Fix global rsx super memory shm block management
3. rsx: Improve memory validation. test_framebuffer() and
tag_framebuffer() are simplified due to mirror support
4. rsx: Only write back confirmed memory range to avoid overapproximation errors in blit engine
5. rsx: Explicitly mark clobbered flushable sections as dirty to have them
removed
6. rsx: Cumulative fixes
    - Reimplement rsx::buffered_section management routines
    - blit engine subsections are not hit-tested against confirmed/committed memory range
      Not all applications are 'honest' about region bounds, making the real cpu range useless for blit ops
2018-05-23 19:07:08 +03:00
kd-11 f6f45b8699
Native UI refactored (#4623)
Refactor and improve native overlays
2018-05-20 23:05:00 +03:00
scribam 04ad49de4d typos 2018-05-14 21:14:39 +04:00
kd-11 bff6060bd6 rsx: Improve puller state management
- Properly identify puller spin primitives
- Add a small wake delay after exiting a spin delay. Fixes desynchronization
  It seems real hw has a small delay between cell edits to commandbuffer memory at the GET address and the changes becoming visible to the DMA puller
  Simulated with a short busy_wait, large values will improve sync but degrade performance
2018-05-13 14:44:14 +03:00
kd-11 b7979d3f57 rsx/vk: Improvements and minor optimizations
- Improve dirty state tracking affecting program state
- vk: Refactor out transform constants upload into a separate channel to avoid if possible
  transform data uploads are quite expensive
2018-05-13 14:44:14 +03:00
kd-11 440a31ef18 rsx: Optimizations for program management 2018-05-13 14:44:14 +03:00
kd-11 a52ea7f870 rsx: Improve fragment and vertex program usage
- Introduces a gpu program analyser step to examine shader contents before attempting compilation or cache search
  - Avoids detecting shader as being different because of unused textures having state changes
  - Adds better program size detection for vertex programs
- Improved vertex program decompiler
  - Properly support CAL type instructions
  - Support jumping over instructions marked with a termination marker with BRA/CAL class opcodes
  - Fix SRC checks and abort
  - Fix CC register initialization
  - NOTE: Even unused SRC registers have to be valid (usually referencing in.POS)
2018-05-13 14:44:14 +03:00
Jake 75b40931fc rsx: initial capture/replay functionality (#4510)
* rsx: initial capture/replay functionality
2018-05-13 12:18:05 +03:00
kd-11 c5d1f30e82 rsx: Fix performance counters
- Detect jump-to-self type idling
2018-04-25 19:14:36 +03:00
kd-11 a42b00488d rsx: Texture fixes
- gl/vk: Fix subresource copy/blit
- gl/vk: Fix default_component_map reading
- vk: Reimplement cell readback path and improve software channel decoder
- Properly name the subresource layout field - its in blocks not bytes!
- Implement d24s8 upload from memory correctly
- Do not ignore DEPTH_FLOAT textures - they are depth textures and abide by the depth compare rules
- NOTE: Redirection of 16-bit textures is not implemented yet
2018-04-25 19:14:36 +03:00
kd-11 cfd0b8a975 rsx: Fix alphakill 2018-04-05 01:06:50 +03:00
kd-11 93b2776604 rsx: Fix vertex input detection
- Properly detect inline array registers vs constant value registers
- Silence needless spam, 306E is 2D surface engiine, the assumption that y is multiplied by 306E pitch is not crazy
2018-04-05 01:06:50 +03:00
Jake 6d6d6fa827 dx12/vk/gl: implement use of vertex_data_base_index when calculating index 2018-03-30 13:30:04 +03:00
kd-11 7627ad04f1 rsx: Disable gamma control on WZYX textures
- Gamma is seemingly used for (D/X/A)RGB only. Data textures are unaffected
2018-03-29 13:52:11 +03:00
kd-11 321c360dcb rsx: Overhaul rendertarget sampling/shuffles
- Reimplements render target views used for sampling
- Optimizes access using an encoded control token
- Adds proper encoding for 24-bit textures (DRGB8 -> ORGB/OBGR)
- Adds proper encoding for ABGR textures (ABGR8 -> ARGB8)
- Silence some compiler warnings as well
- TODO: Real texture views for OGL current method is a hack
2018-03-25 13:31:06 +03:00
kd-11 9fc1740608 rsx/fp: Fragment program overhaul
- Separate TXB from TXL: They are completely different!
- Properly perform TMU emulation in the fragment shader. Implemens SRGB conversion and alphakill at the moment
- Properly perform ROP emulation in the fragment shader. Implements FRAMEBUFFER_SRGB. While support on the chip looks to be incomplete (and wierd), it does work
- Document some more bits in SHADER_CONTROL register
2018-03-25 13:31:06 +03:00
kd-11 27552891ad rsx/fp: Improvements
- Export some debug information in the free texture register space components zw
  Very useful when analysing renderdoc captures
- Enable shadow comparison on depth as long as compare function is active and texture is uploaded for depth read
  Some engines (UE3) read all the components in the shader and use mul/mad with the result
2018-03-25 13:31:06 +03:00
kd-11 5f047034ae rsx: Disable async count verification to avoid lockup due to zombie reports in ZCULL 2018-03-13 18:55:03 +03:00
kd-11 f00d9a7c7f rssx" Halfplement alpha-to-coverage AA transparency 2018-03-13 18:55:03 +03:00
kd-11 2dce55d036 rsx: ZCULL synchronization fixes
- Track asynchronous operations in RSX core
- Add read barriers to force pending writes to finish.
  Fixes zcull delay flicker in all UE3 titles without forcing hard stall
- Increase zcull latency as all writes should be synchronized now
2018-03-13 18:55:03 +03:00
kd-11 315798b1f4 rsx: ZCULL rewrite and other improvements
- ZCULL unit emulation rewritten
- ZCULL reports are now deferred avoiding pipeline stalls
- Minor optimizations; replaced std::mutex with shared_mutex where contention is rare
- Silence unnecessary error message
- Small improvement to out of memory handling for vulkan and slightly bump vertex buffer heap
2018-03-13 18:55:03 +03:00
kd-11 dece1e01f4 rsx: Improve transform constants management
- Removes the duplicate local_transform_constants
- Resets the transform constants on every context reset
- Simplifies the code abit which should make it faster
- NOTE: Transform constants are persistent across context re-init events (VF5)
2018-03-13 18:55:03 +03:00
kd-11 0c8e4c0887 rsx: Improve FIFO commandlist flattening
- TODO: Alot of work is still needed to execute draw commands out of order
  Thats the only solution to games sending many draw calls with high frequency of state changes
2018-03-13 18:55:03 +03:00
kd-11 84b8a08d26 rsx: Basic performance counters 2018-03-13 18:55:03 +03:00
kd-11 af1b13550b rsx/vk: More optimizations
- Do not bother rechecking the dirty sampler pool for hits. Its faster to create new sampler than to search the pool
- Reserve some memory on vertex layout struct to reduce reallocation penalty
2018-03-13 18:55:03 +03:00
Jake 7233640cf0 rsx: add vertex data base to offset and mask before translating address 2018-03-07 16:57:20 +03:00
kd-11 bd297d079d rsx: Minor optimizations 2018-02-16 16:14:54 +03:00
kd-11 a5500ebfa4 rsx: Fix disjoint draw range splitting
- Fixes flickering and missing draws in R&C and other games such as Motorstorm Apocalypse and Okami HD when strict mode is disabled
2018-02-16 16:14:54 +03:00
Nekotekina cce0ad0c35 Clean vm::ps3 namespace use 2018-02-09 17:49:37 +03:00
Jake 2f414f96bf rsx: fix potential hang during thread close 2018-01-24 16:28:09 +00:00
kd-11 3d9e3a16f1 rsx/gl/vk: Fixes and optimizations
- opengl driver optimization for nvidia. On nvidia glTextureBufferRange performance is horrendous
-- Initialize texture buffer to whole buffer at startup and use absolute offsets to read data instead
-- Over 2x performance in some cases (Resogun, TNT racers)
- gl/vk: Do not flip non-existent display buffers. Fixes spec violation at boot in TNT racers demo
- whitespace fixes for sys_rsx
2018-01-22 11:43:35 +03:00
kd-11 0a2992839b rsx/gl/vk: Simulate z clipping with selective depth clamp
- The scale offset matrix is fine but on real hardware the z results seem to be independent of near/far clipping distances
-- If depth falls within near/far, clamp depth value to [0,1]
2018-01-19 12:03:57 +03:00
kd-11 cbc8bf01a1 cell/scheduler: Manage thread placement depending on cpu hardware 2018-01-19 12:03:57 +03:00
kd-11 71f69d1d48
rsx/overlays: Introduce 'native' HUD UI and implement some common dialogs (#4011) 2018-01-17 19:14:00 +03:00
Jake 0477f8ed3c rsx: add log for potential source of error 2018-01-14 20:50:55 +03:00
Jake 7ca2c444cc rsx: Fix depth clipping 2018-01-14 20:50:55 +03:00
kd-11 ee009ec99c rsx: Robustness fixes
- Track last working state and reset to it if RSX starts to desync
-- This is especially useful when running vulkan since the renderer will easily outpace the rest of the system when merely recording draw commands
- Ignore empty sets
-- Mark empty/invalid IB sets as having 0 element counts.
2018-01-02 21:17:56 +03:00
kd-11 0d0821e914 rsx: Pause FIFO queue when changing ctrl registers 2017-12-18 10:45:37 +03:00
kd-11 90c2324e47 rsx: Program cache fixes
- Reorganize storage hash vs ucode hash
- Scan for actual fragment program start in case leading NOPed code precedes the actual instructions
-- e.g FEAR2 Demo has over 32k of padding before actual program code that messes up hashes
2017-12-04 18:22:18 +03:00
kd-11 90a3f3af30 rsx: Discard queue if RET is found without CALL 2017-12-01 21:00:50 +03:00
kd-11 de5a4fe083 rsx: Reimplement depth <-> RGBA reinterpretation code
- Implements proper channel order for fp24-ARGB8 conversion
- Takes swizzle remap into account when reconstructing source bytes
2017-12-01 21:00:50 +03:00
kd-11 ccc0383f75 vulkan: Implement overlay shader passes
- Implements vk::overlay_pass and vk::depth_convert_pass
- Also added a sanity check in RSX core for depth replace shaders
2017-12-01 21:00:50 +03:00
kd-11 680ca1d12a rsx: Zcull refactoring and vulkan implementation 2017-12-01 21:00:50 +03:00
kd-11 0aaae000b3 rsx: Minor improvements 2017-12-01 21:00:50 +03:00
Zion Nimchuk 3a9ae2df9e silence warnings in RSX stuff 2017-11-30 18:07:19 +03:00
Nekotekina dbc9bdfe02 Implement set_ideal_processor_core (linux) 2017-11-15 21:00:02 +03:00
kd-11 1fa18757fc rsx: Implement render-to-cubemap; Also simplify unnormalized samplers [WIP, DELETE SHADER CACHE, VERY SLOW]
- Enables real-time cubemap reflections
- TODO: Vulkan is broke; rsx is very slow with this feature
2017-11-08 13:15:34 +03:00
kd-11 0961a43997 rsx: Implement 1D<->2D image type casts 2017-11-08 13:15:34 +03:00
kd-11 bbcb6b6851 rsx: Fbo fixes 2
- Use AA mode to predict surface compression. Compression mode is useless without AA activated
- Rewrites most image subresource fetch routines to use the new heuristic
- Fix rsx:🧵:find_tile. FEED000(X) can be substituted for (X) in the code
-- Fixes alot of failures when looking for tiled regions

rsx: Fix antialiased unnormalized coords
- scaling factors are inverse to allow proper coordinates to be computed in fs
2017-11-08 13:15:34 +03:00
kd-11 173d05b54f rsx: Optimizations
- Reimplement fragment program fetch and rewrite texture upload mechanism
-- All of these steps should only be done at most once per draw call
-- Eliminates continously checking the surface store for overlapping addresses as well

addenda - critical fixes
- gl: Bind TIU before starting texture operations as they will affect the currently bound texture
- vk: Reuse sampler objects if possible
- rsx: Support for depth resampling for depth textures obtained via blit engine

vk/rsx: Minor fixes
- Fix accidental imageview dereference when using WCB if texture memory occupies FB memory
- Invalidate dirty framebuffers (strict mode only)
- Normalize line endings because VS is dumb
2017-11-08 13:15:34 +03:00
Jake e0d1ac676e rsx: invalidate surface store address when tile is unbound 2017-10-28 12:46:20 +03:00
Jake 626b9f93c4 rsx: make dmactrl get 'readonly' 2017-10-28 12:46:20 +03:00
kd-11 9c9495621c rsx: Fix critical bug concerning transient data layout in memory 2017-10-26 00:35:45 +03:00
kd-11 5db45c3699 rsx: More fixes
- Workaround for AMD glMultiDrawArrays bug
- Disable disjoint command submission when multidraw support is disabled
2017-10-19 12:22:52 +03:00
kd-11 86bf61ad35 rsx: Fix memory protection
- Fixes hanging when wcb is enabled
2017-10-14 14:19:14 +03:00
kd-11 c570410e06 rsx: Stop executing broken command queues if the application fails to recover in 3 retries 2017-10-12 13:51:29 +03:00
kd-11 9af71699a4 rsx: Minor fixes
- Dont skip cb if a problem occurs, just spin on it instead to allow possibility of recovery
- Vulkan cleanup for the die_with_error helper
2017-10-12 13:51:29 +03:00
kd-11 fc0f98b5db rsx: Res scaling improvements
- gl: Reintroduce the wcb hw downscaling
- rsx: Clamp scalable render target size with a config var
2017-10-09 20:25:41 +03:00
kd-11 12ab03b0b5 rsx/gl: Implement resolution scaling
rsx: Revise wpos calculation to take resolution scale into account
2017-10-09 20:25:41 +03:00
kd-11 3499d089e7 rsx: Texture cache fixes and improvements
rsx: Conditional lock hack removed
vulkan - Fixes
- Remove unused texture class
- Fix native pitch calculation (WCB)
rsx: Catch hanging begin/end pairs when flushing deferred draw calls
vulkan: Register DXT compressed formats
vulkan: Register depth formats
gl: Workaround for 'texture stitching' when gathering flip surface
- TODO: Add a proper flip hack option
rsx: Fix texture memory size calculation
- DXT textures dont have real pitch. Since pitch is used to calculate memory size, make sure it always evaluates to rsx_size
rsx: Fix cpu copy detection
rsx: Validate blit dst surface and dont make assumptions about region blit order
- Also relax restrictions on memory owned by the blit engine if strict rendering is not enabled
rsx: Fix depth texture detection
rsx: Do not manually offset into dst. The overlapped range check does so automatically
rsx: Minor optimizations
rsx: Minor fixes
- Fix to detect incompatible formats when using GPU texture scaling and show message
- Better 'is_depth_texture' algorithm to eliminate false positives
2017-09-21 16:17:06 +03:00
kd-11 3836b40bf7 rsx: Fixups 2017-09-21 16:17:06 +03:00
kd-11 571dbfb7b1 rsx: Texture cache improvements
- Limits buffer size to min 720 in the Y axis (1024 section causes conflicts in some cases - TODO)
rsx: Fixups to allow large textures for blit operation
- Also includes checks for both leaking sections and blit regions for vulkan
hotfix for hanging when using WCB
addendum - unlock both ro and no blocks before attempting to copy memory blocks
gl: Fixups for ARB_explicit_uniform_location
- Forces glsl v 430 to make use of the extension
rsx/vk: Rework texture cache to minimize recursive access violations
- Also modifies the vulkan commandbuffer begin/end/submit mechanism
gl: Fix cached_texture_section::is_flushable to take memory protection into account
rsx: Fix blit dst offset calculation
2017-09-21 16:17:06 +03:00
kd-11 10e25eb362 rsx: Fix multidraw range splits again
rsx: Hotfix for disjoint range detection
2017-09-21 16:17:06 +03:00
kd-11 45d0e821dc gl: Minor optimizations
rsx: Texture cache - improvements to locking
rsx: Minor optimizations to get_current_vertex_program and begin-end batch flushes
rsx: Optimize texture cache storage
- Manages storage in blocks of 16MB
rsx/vk/gl: Fix swizzled texture input
gl: Hotfix for compressed texture formats
2017-09-21 16:17:06 +03:00
kd-11 061824a7ec rsx: Add support for batched multidraw
gl: Fix multidraw [WIP]
rsx: Ignore vertex base when data source is generated using arithmetic
vk: Check pending flag before doing fence poke
vk/gl: Fix for inlined array and immediate draws
rsx: Collapse joined draws when batching
2017-09-21 16:17:06 +03:00
kd-11 abb56a354d rsx: Bug fixes and improvements
rsx: Try to skip unknown commands without discarding entire cb
2017-09-21 16:17:06 +03:00
kd-11 f71f67c4ff rsx: Make fragment state dynamic to reduce shader permutations 2017-08-26 21:53:54 +03:00
kd-11 142bfb5250 rsx: Fix immediate indexed drawing 2017-08-18 16:51:38 +03:00
kd-11 2fd4726919 rsx: Fix single vertex array input declarations 2017-08-16 23:58:30 +03:00
kd-11 c04aa05398 rsx: Shader pipeline fixes and improvements
- Do not set zfunc if alphakill is not enabled. This is because at the moment alphakill requires a different shader to be built

- use glsl loop-unroll friendly comparison; skip vertex input compare if either key requests it

- Minor tweaks to fp key generation
2017-08-16 23:58:30 +03:00
kd-11 bbf2a97d2e rsx: Zero-initialize the vertex register block
- Some games reference constant regs that they never initialize
2017-08-16 23:58:30 +03:00
kd-11 1da732bbf5 rsx/gl/vk: Invalidate texture regions when memory is unmapped
- Free GPU resources immediately if mappings change to avoid leaking VRAM
2017-08-16 23:58:30 +03:00
kd-11 b2b5f564a1 rsx: Add a few more depth format types to known behaviour paths 2017-08-16 23:58:30 +03:00
kd-11 d54c2dd39a gl: Move vertex processing to the GPU
- Significant gains from greatly reduced CPU work
- Also reorders command submission in end() to improve throughput

- Refactors most of the vertex buffer handling
- All vertex processing is moved GPU side
2017-08-16 23:58:30 +03:00
RipleyTom 2d7e91ba8a Yield instead of sleeping rsx thread. (#3158)
Another Yield
2017-08-06 01:46:01 +01:00
Jake 21dd715b42 sys_rsx: implement support for lle-gcm 2017-08-02 01:33:12 +03:00
Jake d9a693019b rsx/gcm: Implement rsx dma. Refactor gcm/rsx to not be as codependent 2017-08-02 01:33:12 +03:00
kd-11 4cd5624fa7 rsx/vk/gl: Refactoring - Also adds a vertex cache to openGL as well 2017-07-27 14:33:30 +03:00
kd-11 6557bf1b20 rsx: More aggressive thread scheduling for vertex processing
- Significantly helps vertex performance
- Not recommended as more threads will harm performance if the PC does not have the cores for it
2017-07-24 16:52:42 +03:00
kd-11 df8fa74e2a vulkan hotfix (#3046)
* Rework vertex attribute binding for vulkan. Allows always providing a buffer view to the pipeline even if the game has the attribute disabled as long as it is consumed by the vertex shader.
2017-07-22 01:54:28 +03:00
kd-11 05ffb50037 vk/rsx: Bug fixes and improvements
- Improvements to framebuffer usage; Avoid creating new resources every frame
- Handle null fragment program properly
- Collect vertex upload statistics

- vk: Pre-initialize 'unused' varying registers in the vertex shader in case it gets matched with a fs that consumes it
 -- Fixes a crash about fog_c not being declared

gl/dx12/vk: Handle null fragment program

- cleanup - use yield semantic instead of sleep(0) as yield is more cross-platform
 -- sleep(0) is a windows specific scheduler hint
2017-07-19 23:28:33 +03:00
kd-11 e9b8f94fb1 rsx/gl/vk: Enable frame skipping 2017-07-08 14:52:16 +03:00
kd-11 b95ffaf4dd rsx: Implement skip draw. Also, start working on MT vertex upload 2017-07-08 14:52:16 +03:00
kd-11 459a7ba5a2 rsx: Avoid using push_back/emplace_back on empty STL containers
- Reckless management of STL containers causes significant slowdown
- Also reorders vertex compare steps to fail quickly on simpler checks
2017-06-29 13:13:19 +03:00
kd-11 b2e906f4cc rsx: Code cleanup. Fixes several dozen warnings
- Wrap unused parameters as comments to prevent C1400
- Fix sized variable conversions with explicit casts
2017-06-22 23:36:15 +03:00
kd-11 11317acdbe rsx: Handle non-zero base vertex better
- Vertex buffer contents treat the base vertex as vertex 0 so we do the same for indices

rsx: Fix vertex base indexing

rsx: Properly fix non-zero offset indexed rendering
2017-06-22 23:36:15 +03:00
kd-11 98cf72e0fb rsx: Fix clip space computations 2017-06-22 23:36:15 +03:00
kd-11 69d3d47901 gl: Fix clip-space -> depth conversion. Fixes remaining depth read issues
- Also set some default values for samplers in a cleaner way using their 'natural' float values
2017-06-22 23:36:15 +03:00
kd-11 6a9eef0382 rsx/gl/vk: Enable use of native PCF shadows 2017-06-22 23:36:15 +03:00
kd-11 860b76452f vulkan bringup on linux
cleanup: drop unused stuff
2017-06-08 19:08:44 +03:00
Nekotekina f010b5b235 Configuration simplified 2017-05-20 16:01:48 +03:00
scribam 299f627321 Stub cell (#2785)
* Update cellGcmSys

* Update cellStorage

* Update cellSubdisplay

* Update sceNpTrophy
- Use error_code as return type
- Add few checks

* Update cellKey2char

* Update cellKb:
- Use error_code as return type
- Replace UNIMPLEMENTED_FUNC by .todo

* Update cellNetCtl

* Update cellSpudll

* Update cellSysutilAp

* Update cellUserInfo

* Stub sys_mempool_allocate_block (bad idea)
2017-05-15 14:30:14 +03:00
kd-11 c5975d5f66 rsx: Vertex program output fixes 2017-05-12 20:10:03 +03:00
kd-11 450d45354c rsx: Enable GPU texture scaling by default 2017-05-10 21:50:14 +03:00
Jake 60ce85f840 [Render] Userclip for d12/vk/ogl (#2719) 2017-04-25 18:32:39 +08:00
Ofek a5fd7abcf7 Trophy update (#2655)
* Added checksum check to TROPHY.TRP loader

* Implemented sceNpTrophyGetGameProgress, sceNpTrophyGetGameIcon & sceNpTrophyGetTrophyIcon

* Updates to up to date APIs and tiny changes

* Code style fixes for checksum verifier, and another fix for trophy functions

* Format fix
2017-04-13 20:29:47 +03:00
kd-11 adefd1fd63 rsx/ui: Add config toggle for GPU texture scaling/blit 2017-04-08 23:12:09 +03:00
kd-11 909f3e9b3e rsx: Support indexed immediate draw via ArrayElement method 2017-03-29 23:06:17 +03:00
kd-11 70d3a6d840 rsx: Support more base types for immediate rendering
fix alignment
2017-03-26 16:22:53 +03:00
kd-11 79d114cc06 rsx: Support immediate mode rendering 2017-03-26 16:22:53 +03:00
kd-11 34c2b8a55e rsx: recover from FIFO parse errors
- Validate FIFO registers before access

-- Validate the args ptr separate from the get ptr
2017-03-24 09:30:23 +03:00
kd-11 7c73c3b75c rsx/gl: Minor refactoring; prepare vulkan backend 2017-03-01 00:16:55 +03:00
kd-11 1e826f5ccf rsx: Minor optimization (tangible boost) 2017-03-01 00:16:55 +03:00
Nekotekina baf22527b0 Ditch fs::get_executable_dir 2017-02-22 17:17:26 +03:00
Ani 65104b5909 Rough implementation of GCM_CONTEXT_DMA methods
Rough implementation of GCM_CONTEXT_DMA methods.
Fixes #1487
2017-02-17 22:35:28 +03:00
Nekotekina 598c90f376 PPU thread scheduler 2017-02-13 22:26:11 +03:00
kd-11 d6159a35aa gl/vk/dx12: Fix texture scaling on unnormalized rtt access 2017-02-11 15:45:59 +03:00
Nekotekina 246b9f3182 CHECK_EMU_STATUS removal 2017-02-05 17:35:27 +03:00
Nekotekina a5a2d43d7c Thread.cpp refinement
Hide thread mutex
Safe notify() method
Other refactoring
2017-01-29 19:52:19 +03:00
kd-11 2c803dbe66 gl/vk: Bug fixes and improvements (#2206)
* gl: Only bind attrib textures on thread startup

* gl: Persistent mapped buffers

* gl: Fix emulated primitives in an inlined array

* gl: Do not re-update program information every draw call

* gl/vk: s1 type is signed normalized not unsigned normalized

* gl/rsx: Allow disabling of persistent buffers for debugging

gl: Large heap size is more practical

gl: Fix a bug with legacy opengl buffers

* gl/rsx: Allow emulation of unsupported attribute formats

* gl: Fix typos and remove dprints

gl: cleanup debug prints

* ui: Move the GL legacy buffer toggle to the left pane

* vk/gl: Fix cmp type, its range is [-1,1] not [0,1] SNORM_INT
2016-10-18 15:57:28 +08:00
Melissa Goad 22b1400018 Revamp PFIFO command submission emulation (#2179) 2016-10-01 22:13:15 +03:00
kd-11 5430e1d310 rsx/gl/vk/dx12: Add emulated texture fetch for depth read (#2173)
* rsx/gl/vk/dx12: Add emulated texture fetch for depth read

gl/vk/dx12: Simplify reinterpretation equation

* gl: Remove unnecessary re-swizzle

* glsl: explicitly cast uint to float
2016-09-29 14:54:32 +08:00
vlj 8d54bcbc0d rsx: Use variant based draw commands. 2016-09-17 23:37:52 +02:00
vlj 03c86ae43b rsx: Move inline array to draw_clause structure. 2016-09-17 23:37:52 +02:00
Oil 153a2d2b40 Fixed fog and alphakill implementation in glsl (based on DH's old commits) (#2137)
* Fixed NV4097_SET_COLOR_CLEAR_VALUE

* Fixed fog and alphakill implementation in glsl (based on DH's old commits)
2016-09-14 22:47:53 +08:00
vlj 11858dce1a rsx: Vertex array attributes don't need to be stored outside of regs. 2016-08-27 15:40:41 +02:00
vlj a64053fd68 rsx: Remove some unused code. 2016-08-27 15:40:41 +02:00
Vincent Lejeune 42b518cf7e rsx: use range for vertex buffer attribute. 2016-08-24 21:58:59 +02:00
Nekotekina 84d0d396ed EXPECTS usage removed 2016-08-15 16:29:38 +03:00
Nekotekina a7e808b35b EXCEPTION macro removed
fmt::throw_exception<> implemented
::narrow improved
Minor fixes
2016-08-08 19:19:32 +03:00
Vincent Lejeune eb1d4811de rsx: Use a "draw clause" object in rsx_state. 2016-08-05 23:33:40 +02:00
Vincent Lejeune 7a6f5b6ee5 rsx: Move index pointer generation in rsx::thread. 2016-08-05 17:54:44 +02:00
Nekotekina 5a36c57c57 Formatting system improved
`unveil<>` renamed to `fmt_unveil<>`, now packs args to u64 imitating va_args
`bijective...` removed, `cfg::enum_entry` now uses formatting system
`fmt_class_string<>` added, providing type-specific "%s" handler function
Added `fmt::append`, removed `fmt::narrow` (too obscure)
Utilities/cfmt.h: C-style format template function (WIP)
Minor formatting fixes and cleanup
2016-08-04 21:34:00 +03:00
Nekotekina 1c69eb2b73 rsx_method_t extended
rsx_methods.cpp cleanup
2016-07-31 18:16:49 +03:00
Nekotekina 6a9f3040e1 rsx_methods.cpp fix 2016-07-31 18:16:48 +03:00
Nekotekina f8719c1230 PPUThread refactoring
`CallbackManager` removed, added _gcm_intr_thread for cellGcmSys
`PPUThread` renamed to `ppu_thread`, inheritance allowed
Added lightweight command queue for `ppu_thread`
Implemented call stack dump for PPU
`get_current_thread_mutex` removed
`thread_ctrl::spawn`: minor initialization fix
`thread_ctrl::wait_for` added
`named_thread`: some methods added
`cpu_thread::run` added
Some bugs fixes, including SPU channels
2016-07-30 16:35:02 +03:00
Vincent Lejeune ac771f951d rsx: Copy state in capture frame call 2016-07-27 20:20:35 +02:00
Vincent Lejeune 8b12379eb3 rsx: Use bitfield template to decode values. 2016-07-27 18:38:36 +02:00
Nekotekina 7ccdea7822 Removed std::enable_shared_from_this
Minor ID manager refactoring
2016-07-24 21:06:05 +03:00
Nekotekina deeb4acbe5 Partial revert of 6ae54ae27b 2016-07-22 19:26:58 +03:00
raven02 8157e7cac8 Obsolete 3D monitor (#1955) 2016-07-20 23:45:26 +08:00
Nekotekina ae634bb87e RSX exception fix
VBlank thread management fix
2016-07-20 15:16:19 +03:00
Vincent Lejeune e9bee80f4b rsx: Use register_decoder for vertex attributes. 2016-07-19 20:28:32 +02:00
raven02 e1ff3f4674 rsx: use fragment_textures_count (#1948)
* rsx:  use fragment_textures_count

* Typo: unknow -> unknown
2016-07-19 22:50:40 +08:00
kd-11 2337bf204c vk/dx12: Enable/fix separate back and front lighting (#1927)
* vk: separate specular color

rsx: separate front color output from back color output

re-enable front-back diffuse lighting

vk: fix front face selection and actually enable face culling

* dx12: Hide constant-key blended visuals (by common use of factor, 1-factor)

* dx12: Fix 2 sided lighting when the shader does not compute both outputs

* vk/dx12: confirm that src register exists before copying for 2-sided lighting
2016-07-18 00:57:50 +08:00
Vincent Lejeune d97cdb9fbf rsx: Gather most rsx commands pretty printing and state modification function in a single file.
rsx_decode.h implements a "rsx_decoders" template class that is specialized for most GCM command
found in rsx command buffer. 3 static members are defined : a "decode" function that turns command
value into a more meaninfull type if applicable (for instance bool for _enabled* command, surface
formats for set_surface_format command...), a "commit_rsx_state" that modifies a given rsx_state
structure when the command is parsed, and a "dump" function used in rsx_debugger for pretty printing.
Hopefully having the 3 functions in a single place for every command will act as a self documenting
list of rsx command buffer opcode.

rsx_state is also expanded into several explicit variables instead of being stored into a u32 array.
This should makes debugging easier (Visual Studio will display the exact value of these member for instance)
as well as preparing rsx_state for serialisation/deserialisation.

The vertex array and textures opcode are not concerned atm for bisecting purpose.
2016-07-17 17:31:53 +02:00
kd-11 2c981cf940 rsx: mark register access with divider op enabled and frequency 1 (#1892) 2016-07-12 02:53:52 +08:00
Vincent Lejeune 772706ca4c Factorize rsx state 2016-07-07 21:38:57 +02:00
DH 989f954432 Added WIP vertex textures support 2016-06-28 12:58:44 +03:00
DH 44879dd9f3 Implemented alpha kill and fog 2016-06-27 01:52:08 +03:00
DH 6ae54ae27b RSX: Added legacy non-array vertex attributes support (if count of elements > 1)
Fixed ps1ght games
2016-06-26 21:32:50 +03:00
DH e296f81a37 Shaders decompiler: support non 2D textures
Do not validate programs with undefined textures uniforms
Minor fix
2016-06-26 21:32:48 +03:00
DH f30d71da6c OpenGL renderer: improved vertex attributes setup
Minor fixes
2016-06-22 22:46:47 +03:00
O1L 083c4fc855 Try to use new shaders decompiler in OpenGL backend 2016-06-21 19:56:00 +03:00
Nekotekina 266db1336d The rest 2016-05-23 16:22:25 +03:00
Ivan aafcf44581 Header optimizations (#1684)
Shouldn't break anything. I hope.
2016-04-27 01:27:24 +03:00
Ivan da7472fe81 Optimizations (#1680)
* Optimizations

1) Some headers simplified for better compilation time
2) Some templates simplified for smaller executable size
3) Eliminate std::future to fix compilation for mingw64
4) PKG installation can be cancelled now
5) cellGame fixes
6) XAudio2 fix for mingw64
7) PPUInterpreter bug fixed (Clang)

* any_pod<> implemented

Aliases: any16, any32, any64
rsx::make_command fixed
2016-04-25 13:49:12 +03:00
Ivan 75fe95eeb1 GSL moved from stdafx.h (#1676)
Added GSL.h helper for correct including
2016-04-20 02:32:27 +03:00
Nekotekina b85a68e8a1 Partial commit: RSX 2016-04-15 19:22:36 +03:00
Vincent Lejeune 2ae5a7ff39 rsx/common/d3d12/gl/vulkan: Use single overload for write_index_array_data_to_buffer. 2016-04-07 22:17:28 +02:00
Vincent Lejeune 91d0229bc5 rsx/common: Use an help texture_dimension_extended to handle cubemap more cleanly. 2016-03-30 22:19:29 +02:00
Vincent Lejeune f2c82d3cf4 rsx/common: Use a typed class for texture dimension. 2016-03-30 20:03:50 +02:00
Vincent Lejeune 4d71df70db rsx-debug: Record and display index buffer content. 2016-03-05 18:48:30 +01:00
Vincent Lejeune 52e2800fb5 rsx: Reset fog mode/param to linear/1.;
Fix After Burner Climax fog
2016-03-05 18:25:31 +01:00
Vincent Lejeune 32434dd848 rsx/common/d3d12/gl: Support for fog mode.
Fix hitman 2
2016-02-29 16:31:18 +01:00
Vincent Lejeune 35db227af4 rsx/common/d3d12: Separate int type buffer from float type buffer. 2016-02-27 00:21:14 +01:00
Vincent Lejeune a6ba47265f rsx/common/gl: s32k is actually signed short unormalized.
gl fix
2016-02-27 00:21:12 +01:00
Vincent Lejeune 5ef7f8bf3e rsx/common: Fix handling of UB256 2016-02-27 00:21:06 +01:00