Increase untouched buffer timeout when some of the buffers have been
touched. Might improve audio quality on games that suffered from
miniscule popping even when buffering was enabled (such as DeS).
In addition, made time stretching algorithm slightly more aggressive.
Includes some other tiny tweaks as well.
Also renames "AudioThread" to "AudioBackend". The new name is more
descriptive of what the class really is responsible for, since the
backends are not responsible for managing the audio thread.
NOTE: Right now only XAudio2 is supported
- Avoid tagging and rely on read/write barriers and the dirty flag mechanism. Testing is done with a weak 8-byte memory test
- Introducing new data when tagging breaks applications with race conditions where tags can overwrite flushed data
- D24S8 targets have 2 aspects that are dealt with separately; Forcefully initialize the remaining data if a partial init is done. Its 'free' anyway
- It seems that the stencil mask matters when clearing unlike the depth mask and color mask
- gl: Include an execution state wrapper to ensure state changes are consistent. Also removes a lot of required 'cleanup' for helper methods
- texture_cache: Make execition context a mandatory field as it is required for all operations. Also removes a lot of situations where duplicate argument is added in for both fixed and vararg fields
- Explicit read/write barrier for framebuffer resources depending on
usage. Allows for operations like optional memory initialization before
reading
- Remove the required_xxx_pitch constraint as it makes no sense. The pitch controls what can be written per line.
- It is possible to have a huge surface width but only render to a small region at the beginning and have a smaller pitch than can fit the surface (NFS carbon)
- If draw call resources consume memory that intersects with NA parts of the texture cache, we get a framebuffer test mismatch.
This mismatch is false and happens because the thread has not yet reached the point of relocking the pages
- Implicitly invoke a memory barrier if actively reading from an unsynchronized texture
- Simplify memory transfer operations
- Should allow more games to work without strict mode
- Do not bind companion framebuffer when clearing single aspect; let the
contest mechanism sort it out instead
- Do not prematurely tag framebuffers, instead only do so at
write-confirmation time. Should avoid false tagging if setup does not
allow a render to occur.
* Never return NULL (also apllies to similar functions)
* Base offset is 0x0e000000 for main location
* Default location is LOCAL
Info was taken from disasm of gcm
- Immediate mode is isolated from the rest of the vertex configuration
- TODO: Verify register behaviour when immediate mode is used
Check if per-primitive const register values are supported (likely are)
- Implements a mirror view of D24S8 data that accesses the stencil components.
Finishes the implementation of TEX2D_DEPTH_RGBA as the stencil component was previously missing from the reconstructed data
- Add a few missing destructors
Image classes are inherited a lot and I forgot to make the dtors virtual
- Per-channel conditional execution introduces RAW hazards all over the place
- Its cheaper to process both branches and select between the two
- Also improves ShaderVariable functionality to allow functionality such as match_size and taking complex variables as inputs
* Restore stack in fifo error handling
* Update get register after the cmd execution
* Fix put pause in the middle of command
* Add restore points when branching to self
* Precise nopcmd detection
* Test all invalid cmds for early treatment of queue corruption
To avoid the need (and performance hit) of Read Color/Depth Buffers, we
may not invalidate overlapping fbos inside lock_memory_region unless
they are guaranteed to be superseded by the new one.
This avoids e.g. issues with overblooming, among others.
Fixes VRAM leaks and incorrect destruction of resources, which could
lead to drivers crashes.
Additionally, lock_memory_region is now able to flush superseded
sections. However, due to the potential performance impact of this
for little gain, a new debug setting ("Strict Flushing") has been
added to config.yaml
The hw doesnt fix pitch, when specifying src pitch 0 it copies the same pixels line to dst. keep in mind out_pitch = 0 is not allowed in image_in.
Same goes for buffer_notify, though it allows out_pitch to be 0.
* Dont bother capturing 'destination' blocks with no data. instead premap all main memory to ensure allocated
* Capture zcull and tile state as their compressed gcm forms
* Fix index array capturing, ignore empty sets
* hle gcm: Fix byteswaping in cellGcmSetZcull
* Added a helper function for fetching game's PARAM.SFO path
This should properly get SFO path for unlocked C00 games
* Normalized line endings
* Refresh game list after installing a RAP file
- Do not assume flip marks end-of-frame if executed via syscall
- Also disables skip_frame for these applications as there is no frame boundary
- NOTE: QUEUE_HEAD cannot be relied on as it is seemingly possible to flip the same head and not need to queue it
- Ignore barriers inserted after BEGIN but before any draw commands are emitted
- Properly process tail barriers inserted before END but after draw commands are submitted
- Ignore execution barriers with no effect (same register value written)
- Tries to detect when FIFO preprocessing is beneficial and only enables optimizations if the benefit outweighs the cost
- Current threshold is at least 500 draw calls saved at over 2000 draw calls to justify the overhead
- TODO: More tuning for other CPUs
- Improve vertex attribute layout format. Allows for full 16-bit attribute divisor
- Use actual pitch when declaring framebuffer rsx pitch instead of register value in case of swizzle? rendering
- Replace a few more vectors with simple_array<T>
- Avoid unnecessary string comparisons in backends. We already know referenced textures from the program analysers!
- Also fix visual corruption when using disjoint indexed draws
- Refactor draw call emit again (vk)
- Improve execution barrier resolve
- Allow vertex/index rebase inside begin/end pair
- Add ALPHA_TEST to list of excluded methods [TODO: defer raster state]
- gl bringup
- Simplify
- using the simple_array gets back a few more fps :)
Implement helper functions balanced_wait_until and balanced_awaken
They include new path for Windows 8.1+ (WaitOnAddress)
shared_mutex, cond_variable, cond_one, cond_x16 modified to use it
Added helper function utils::popcnt16
Replace most semaphore<> with shared_mutex
Move lambda into a cpu_stop()
Use running thread counter to synchronize with sys_spu_thread_group_join()
Use SPU_STATUS_STOPPED_BY_STOP exclusively for sys_spu_thread_exit() as before
Remove unnecessary waiting in sys_spu_thread_group_exit()
Rollback some minor unnecessary changes
Use shared_mutex in SPU TG
* Remove complete buffer clear
* If pressure sensitivity option is not specified, write zeroes (should this be handled from our actual controller handler?)
* Check sensor setting before reporting changes
*Set priority under a lock
*Fix yield command making threads going out of scheduler control by removing it from the queue (not a bug that affects compatibility)
* gcm: Fix tile offset setting
highest bit signifyies location, so ignore that while reading the offset.
* rsx-capture: Fix tile binding
fixes division by zero when dividing by pitch when the tile is not bound.
* rsx-capture: Fix zcull binding
lf_queue<>: unbound FIFO queue with dynamic linked-list
lf_value<>: concurrently-assignable value readable without locking at the cost of memory (using dynamic linked list)
Add atomic_t<>::compare_exchange
- POS does not have to be fetched from ATTR[0]
- Confirmed with UC1 that uses WEIGHT for positions
- At least one POS stream has to exist to feed the position attribute which cannot repeat for a single triangle
- Matching attachments with resource id fails because drivers are reusing
handles!
- Properly sets up stale fbo ref counting and removal
- Properly sets up resource reference test with subsequent removal to
avoid using a broken fbo entry
- Orders flushing to preserve memory at all cost
- Avoids false positive where flushing overlapping sections can falsely invalidate another with head/tail test
- Forcefully downloads and reuploads data from the CPU in case of unexpected overlaps
- Properly detect correct size of newly created blit targets
- Remember to clear any existing views when changing the default component map!
- These dependencies have #defines which enable code related to them
in rpcs3 and rpcs3_ui targets. They are only used in rpcs3_emu but
HAVE_* defines have to be defined in rpcs3 and rpcs3_ui targets also,
so they have to have PUBLIC visibility so defines carried over
CMake: Fix Alsa and Pulse audios being disabled
- HAVE_PULSE and HAVE_ALSA were not defined in rpcs3 target
Fixup libevdev
* CMake: Refactor build to multiple libraries
- Refactor CMake build system by creating separate libraries for
different components
- Create interface libraries for most dependencies and add 3rdparty::*
ALIAS targets for ease of use and use them to try specifying correct
dependencies for each target
- Prefer 3rdparty:: ALIAS when linking dependencies
- Exclude xxHash subdirectory from ALL build target
- Add USE_SYSTEM_ZLIB option to select between using included ZLib and
the ZLib in CMake search path
* Add cstring include to Log.cpp
* CMake: Add 3rdparty::glew interface target
* Add Visual Studio CMakeSettings.json to gitignore
* CMake: Move building and finding LLVM to 3rdparty/llvm.cmake script
- LLVM is now built under 3rdparty/ directory in the binary directory
* CMake: Move finding Qt5 to 3rdparty/qt5.cmake script
- Script has to be included in rpcs3/CMakeLists.txt because it defines
Qt5::moc target which isn't available in that folder if it is
included in 3rdparty directory
- Set AUTOMOC and AUTOUIC properties for targets requiring them (rpcs3
and rpcs3_ui) instead of setting CMAKE_AUTOMOC and CMAKE_AUTOUIC so
those properties are not defined for all targets under rpcs3 dir
* CMake: Remove redundant code from rpcs3/CMakeLists.txt
* CMake: Add BUILD_LLVM_SUBMODULE option instead of hardcoded check
- Add BUILD_LLVM_SUBMODULE option (defaults to ON) to allow controlling
usage of the LLVM submodule.
- Move option definitions to root CMakeLists
* CMake: Remove separate Emu subtargets
- Based on discussion in pull request #5032, I decided to combine
subtargets under Emu folder back to a single rpcs3_emu target
* CMake: Remove utilities, loader and crypto targets: merge them to Emu
- Removed separate targets and merged them into rpcs3_emu target as
recommended in pull request (#5032) conversations. Separating targets
probably later in a separate pull request
* Fix relative includes in pad_thread.cpp
* Fix Travis-CI cloning all submodules needlessly
Preprocess . and .. correctly
Don't use recursive locking
Also use std::string_view
Fix format system for std::string and std::string_view
Fix fmt::merge for std::string_view
- NOTE: The address swizzle index is only for use as src. The address registers are only used one channel at a time.
- When the destination of ARL, the encoding is the same as the other temp registers
- Retag resources reprotected under flush_always rules
- Properly check for blit resource fitting taking into account format
mismatch, pitch mismatch and typeless transfers
Remove vm::ptr operator %
This was a bad idea but explicit_bool_t was created almost for it
Other removed types are unused and have little to no meaning
- The x value contains the VP output value interpolated across primitive surface
- The y coordinate contains the fog fraction according to the selected fog formula