This replaces the totally messed up PR #6728
Some games make heavy use of getllar() & putllc() without even changing data.
In this case avoid unneccesary heavy locking of the PPU threads on non-TSX
hosts.
Run another thread to collect profile data from SPU threads.
Use this data to prioritize compiling hot spot SPU blocks.
Implement stx::init_mutex::wait_for_initialized() helper.
This implementation optimises correctly on all relevant compilers,
unlike GSL’s which gave extremely slow code on any compiler other than
MSVC.
Supersedes #6948.
- When the point sprite flag is set, overrides the input similar to the
2D mask. The returned X and Y values are always the gl_PointCoord values
for the fragment.
- Stacks with the 2D mask to override the z and w coordinates.
- Adhere to workgroup count limits as exposed by the GPU vendor.
They already execute properly even when going beyond the limits but this removes validation noise.
- Fix invocation counts for deswizzle kernel. The count was incorrect if blocksize was not 4, causing a bunch of useless work to be done.
- Handles all LODs per layer meaning cubemaps are now fully handled in 6 passes instead of 6 * (log2(width)) passes.
- Handles all LODs of a 3D texture in one pass as well.
- The improvements do warrant dropping down the number of allowed compute invocations a bit
- Remove use of uniform buffers for compute static data. Use push
constants instead.
- Minor touchups to the deswizzle code to avoid redundant data copies.
- Allow delaying report flushes triggered by image_in or buffer_notify
- When the report is ready, all the delayed transfers will automatically
be done.
- TODO: Make this configurable?
* rsx: Optimise primitive_restart::upload_untouched() with SSE4.1
This optimisation is only applied when skip_restart is false.
I’ve only tested the u16 codepath, as it is the one used in NieR.
In some very unscientific profiling, this function used to take 2.76% of
the total frame time at the save point of the port town, it now takes
about 0.40%.
* rsx: Mark all SSE4.1 functions with attributes on gcc and clang
This assures the compiler we will take care of only calling these
functions after having checked that the CPU does support these
instructions.
* rsx: Add an AVX2 implementation of primitive restart ibo upload
* rsx: Remove redefinition of SSE4.1 instructions
Now that clang is aware that our functions are compiled with SSE4.1, it
lets us generate this code using its intrinsics.
* rsx: Optimise vector to scalar conversion
This is done using minpos and srli intrinsics and generate less code
than before.
Thanks Nekotekina for the suggestion!
- Simplify active instance management. While multicontext support will
be required in future, this is better done with multiple logical devices
rather than multiple instances.
- Destroy the WSI surface on exit
- Enable depthBoundsTest explicitly. TODO: Properly check for supported
features.
subresource_layout::dim_in_texel
- These two are not always linked when working with compressed textures.
The actual texels extend past the actual size of the image if the size
is not aligned. e.g if height is 1, the real height is 4, but its not
possible to determine this from the aligned size. It could be 1, 2, 3 or
4 for example.
- Fixes image out-of-bounds writes when uploading from CPU
This lowers the relative cost of this function from ~2.25% to ~1.80% on
gcc 9 which I found quite surprising, some of it probably gets inlined
better in the callers, but I haven’t been able to isolate which parts.
Clear() error checking simplified a bit
Mount() now clears cache if ID was changed from last or NULL specified.
Implemented vfs::host::remove_all():
Clear() now uses vfs::host::remove_all() to match behaviour on Windows with ps3
* Ignore more warnings
These are intentional
* Signed/unsigned mismatch when comparing
* Explictly cast values
* Intentionally discard a nodiscard value
* Change ppu_tid to u32
* Do not use POSIX function name on Windows
* Qt: Use horizontalAdvance instead of width
* Change progress variables to u32
* Normalize audio when downmixing to avoid clipping
Idea came from this topic:
https://hydrogenaud.io/index.php?topic=104214.msg855199#msg855199
Fixes very loud audio in Motorstorm (and probably other games
when playing over headphones/stereo speakers with
Downmix to Stereo option enabled)
Caused warnings. Not sure what the actual intention was, this may need to be inverted.
This commit assumes that erase() returning 0 is a sign that deletion
failed, or that there was corruption. There should be a port there.
The files already had a #ifdef _WIN32, but this avoid even trying to
compile their translation unit.
I was surprised to see XAudio2 being mentioned on Linux, this makes sure
no one else will get this surprise.
The current behaviour when going fullscreen from windowed was to keep
the previous size of the swapchain, with black borders on all sides,
which looks quite ugly.
The root of this issue is that rpcs3 only checks for frame resize if
vkQueuePresent() returns VK_SUBOPTIMAL_KHR, which drivers can’t do on
Wayland, see https://gitlab.freedesktop.org/mesa/mesa/issues/1979
- Noticed a glitch on AMD hw and windows drivers where discard seems to affect entire 4x4 cells.
- Dead fragments (outside the primitive boundary) could have their discards trigger as they do not have proper access to variables.
- This introduces dead fragments along triangle edges, causing a diagonal line pattern across the screen that is very annoying.
- Renormalizes arbitrary N-bit values as 8-bit normalized.
- NV hardware performs integer normalization at 8 bits if the size is less than 8.
- This can cause significant arithmetic drift because the error is multiplied by a huge number when sampling.
* allow sys_rsx_device_map to be called twice: in this case the DEVICE address retrived from the previous call returned
* Add ENOMEM checks for sys_rsx_memory_allocate and sys_rsx_context_allocate
* add EINVAL check for sys_rsx_context_allocate if memory handle is not found
* Separate sys_rsx_device_map allocation from sys_rsx_context_allocate's
* Implement sys_rsx_memory_free; used by cellGcmInit upon failure
* Added context_id checks
* Throw if sys_rsx_context_allocate was called twice.
- If either source data or dest is a render target, do image operations on the GPU same as before
- If swizzle is desired, use CPU fallback
- If no scaling and no format conversion is required, use CPU fallback
- If scaling is desired and the transfer target is in local memory, use the GPU
- When doing trivial copies, use the routine in rsx_methods instead of
duplicating code. Also has the benefit of better range checking.
- When commiting a block as fbo, keep blit_dst data as well.
- Avoids removing (and losing data from) blit targets that just happen to share a page with a framebuffer.
- Uncacheable resources can be reused as soon as they're made visible to the draw call.
- Since they're likely to be reused every draw call until the shader changes, it is important to reuse as much as possible
- Don't allow all args to be nullptr at once.
- Fill listBuf with zeroes for unwritten entries
- Fix userId set in listBuf
Similarly to what the firmware does
Unify internal code generation to make better use of GHC calling convention.
Ideally, it would just work on Windows as well, but some random bug appeared.
This bug was causing freezes on SPU LLVM compilation.
This commit desperately attempts to workaround it.
Adds support for preventing the display from sleeping while a game is
running. Supports Windows, Linux (with the org.freedesktop.ScreenSaver
D-Bus service), and macOS.
* Use Linux timers for sleeps up to 1ms (v3)
The current sleep timer implementation basically offers two variants. Either
wait the specified time exactly with a condition variable (as host) or use a
combination of it with a thread yielding busy loop afterwards (usleep timer).
While the second one is very precise it consumes CPU loops for each wait call
below 50us. Games like Bomberman Ultra spam 30us waits and the emulator hogs
low power CPUs. Switching to host mode reduces CPU consumption but gives a
~50us penalty for each wait call. Thus extending all sleeps by a factor of
more than two.
The following bugfix tries to improve the system timer for Linux by using
Linux native timers for small wait calls below 1ms. This has two effects.
- Host wait setting has much less wait overhead
- usleep wait setting produces lower CPU overhead
Move source files to Emu/GDB.cpp, GDB.h
Remove "WITH_GDB" option, enable GDB Server by default.
Change class name to gdb_thread.
Alias for external access gdb_server.
Change config option name to "GDB Server"
Bind on 127.0.0.1 by default.
* camera_frame = NULL is now checked for CELL_GEM_NO_VIDEO (applied both on start and finish)
* camera_frame = NULL on Start() still starts the update
* Implemented NOT_FINISHED/STARTED
All of those were reversed and hw tested.
* sys_cond fixes
sys_cond_wait is now signaled atomically (regression fix)
Fix a corner case with sys_cond_wait and sys_cond_destroy EBSUY check (waiter count was inaccurate if thread is not the owner of mutex)
Add not about EBUSY corner case (TODO)
* Fix inconcistency in sys_cond_destroy and ETIMEDOUT
.. event at sys_cond_wait regarding waiters count.
Now waiters count is properly connected to the mutex owner actions after ETIMEDOUT in sys_cond_wait.
- If a stale reference is left lying around (e.g the texture bound to
depth has been deleted and we attach a color image) no operations
actually take place. glCheckFramebufferStatus also does not catch this
problem.
- Sometimes program-point-size is enabled, but the vs does not actually
write to the point size register. In this case, pass the incoming point
size along instead of the default register init.
- When compiling LLVM objects, it is possible to starve the driver thread and cause the timeouts to trigger
- Observed in RE6 when using SPU LLVM since the game generates a very large number of objects "infinitely"
- Allows frameskipping to occur naturally if RSX thread is bombarded with flip requests but just jumping to the last one if possible
- See request_emu_flip() for async frame submission and implicit skipping
- Also allows display queue to fill faster than the flip thread can drain the queue
Move se_t and se_storage to util/endian.hpp
Use single template instead of two specializations.
Add minor optimization for MSVC.
Remove v128 dependency.
Try to enable intrinsics for unaligned data.
Fix minor bug in u16/u32/u64 specializations.
Possibly fixes sys_fs_rmdir and other cases of directory removal.
Make sure the directory with deleted files always becomes empty.
For this purpose, temp files are moved to the root of the device.
This routine:
1) Removes junk backup directories
2) Fixes interrupted save data process in edge case
This case can happen if emu terminates between two atomic renames.
Also use directory renaming technique for delete op.
Also rewrite recreate operation to be part of atomic process.
- Separate displayed statistics from actual backend statistics.
Allows asynchronous flipping to work correctly as it just uses display stats.
The real stats are used by the frame scope marker to determine behavior like engaging the FIFO optimizer or skipping draw calls correctly.
- Add an explicit frame scope marker tied in with the queue_prepare command
Since queue_prepare is emitted at the end of a frame, it can be used as end-of-frame in games that emit this
- If this command is not emitted, fifo flatenner and frameskip will not work
- Calculate exact sizes when doing hit tests to avoid false negatives
- Defer page checking until actually require to do memory setup
- Introduce align2 helper to do non-pow2 alignments
- Allow use of intrinsics when SSSE3 and SSSE4.1 are not available in the build target environment
- Properly separate SSE4.1 code from SSSE3 code for some older proceessors without SSE4.1
- With harmonization between all texture types implemented, there is no difference between blit_engine_src and shader_read for supported formats
- Adds extra format filtering to ensure no conflicts when copying data
- While the mask for surface_a is at index 0, the surface cache expects the order to be maintained correctly!
Set the correct mask since surface store now checks each RTT individually
- Avoid silly broken tests due to queue_tag being called before pitch is initialized.
- Return actual memory range covered and exclude trailing padding.
- Coordinates in src are to be calculated with src_pitch, not required_pitch.
- This allows creating buffers with no MAP bits set which should ensure they are created for VRAM usage only
- TODO: Implement compute kernels to avoid software fallback mode for pack/unpack operations
- Fix 2D coordinate sampling of W coordinate.
W is actually HPOS.w and not 1. Z is however always 0.
- Optimize register usage a bit
Disassembling compiled SPV shows that global declaration results in less ops than using inout modifiers. Modifiers generate extra mov instructions.
- Fix reading of varying registers in FP
Different registers have different behavior
- Always write to varying registers. If a register is not written to, it is initialized to (0, 0, 0, 1)
- Reimplements two-sided lighting correctly without hacks
- Also bumps shader cache version
- Do not allow offloader to handle its own faults. Serialize them on RSX instead.
This approach introduces a GPU race condition that should be avoided with improved synchronization.
- TODO: Use proper GPU-side synchronization to avoid this situation
- Avoids memory appearing older when used for depth test without depth write
The write_barrier before the call will inherit new data but the tag will not update as no new information is added.
- Properly commit orphaned blocks not invalidating existing cache structures
- Do not ignore overwritten objects when commiting as unprotected fbo. Avoids stale references to invalidated surface objects.
- Load into memory as straightforward BGRA
- Fixes a bug in vulkan caused by byte shuffling in blit engine vs shader access
- Removes the need for memory shuffling when transferring into a rendertarget
This is based off the upstream implementation in mbedTLS as well as an
external pull request [1] for MSVC support (using intrinsics).
1: https://github.com/ARMmbed/mbedtls/pull/1355
- Implements render target data load (aka Read Color Buffer/Read Depth Buffer)
- Refactors vulkan surface barrier to be much cleaner.
- Removes redundant surface barrier invocations after doing a merged load
from surface cache.
- Adds explicit access modes when gathering surfaces from cache.
- Further improve aliased data preservation by unconditionally scanning.
Its is possible for cache aliasing to occur when doing memory split.
- Also sets up for RCB/RDB implementation
With the option enabled GET commands are blocked until the current PUTLLC/PUTLLUC executer on that address finishes
Additional improvements:
- Minor race fix of sys_ppu_thread_exit (wait until the writer finishes)
- Max number of ppu threads bumped to 8
vkAcquireNextImageKHR can also return VK_SUBOPTIMAL_KHR and is non-fatal.
However, it's a good idea to still recreate the swap chain later to maintain
optimal presentation paths after temporary occlusion.
- ZCULL queue was updated to one-per-cb but the conditional render sync hint was not updated.
- Do not unconditionally flush the queue unless the upcoming ref is contained in the active CB.
- This avoids spamming queue flush, which frees up resources and improves performance
This is the only reason theres a need to specify lwmutex_sq id at creation. unlike sys_cond, lwcond isn't connected to lwmutex at the lv2 level.
SYS_SYNC_RETRY fix is done explicitly at the firmware level.
This fixes issues when the original lwcond and lwmutexol data got corrupted or deallocated, this can happen when the program simply memcpy the control data to somewhere else.
Or if it uses direct lv2 the lwcond conrol param can even be NULL which will make it access violate when dereferncing it. (This param is unchecked and can be anything)
- Merge viewport raster window and scissor into one clipping region
- Viewport raster clip is different from viewport geometry clipping in
hardware as the latter is configurable separately
- After splitting, the sections may not be referenced at all for anything other than just pixel storage
- In such cases, either merge down or sample from the upstream source instead
- Texel borders are no longer actually supported in modern APIs
- Removes the border texels and uses border color instead which is incorrect but should work fine
- Tagged eventIDs can be used to safely delete resources that are no
longer used
- TODO: Expand gc to collect images as well
- TODO: Fix the texture cache to avoid over-allocating image resources
- Fix a typo in OpenAL
- Fix typo in cellHttp.h
- Unused variables in catch
- Use 64-bit shifts
- Use use_count with shared pointers, unique is depracated and getting removed
- Explicitly cast boolean to int
- Signed/unsigned issues with loop variables
- Fix missing return statement (the code path is unreachable, but compiler wants a return)
- */ ouside of comment
- Fix duplicate layout name
vm::spu max address was overflowing resulting in issues, so cast to u64 where needed. Fixes#6145.
Use vm::get_addr instead of manually substructing vm::base(0) from pointer in texture cache code.
Prefer std::atomic_thread_fence over _mm_?fence(), adjust usage to be more correct.
Used sequantially consistent ordering in semaphore_release for TSX path as well.
Improved memory ordering for sys_rsx_context_iounmap/map.
Fixed sync bugs in HLE gcm because of not using atomic instructions.
Use release memory barrier in lwsync for PPU LLVM, according to this xbox360 programming guide lwsync is a hw release memory barrier.
Also use release barrier where lwsync was originally used in liblv2 sys_lwmutex and cellSync.
Use acquire barrier for isync instruction, see https://devblogs.microsoft.com/oldnewthing/20180814-00/?p=99485
Prefer vm::ptr<>::ptr over vm::get_addr.
Prefer vm::_ptr/base over vm::g_base_addr with offset.
Added methods atomic_t<>::bts and atomic_t<>::btr .
Removed obsolute rsx:🧵:Read/WriteIO32 methods.
Removed wrong check in semaphore_release.
Added handling for PUTRx commands for RawSPU MFC proxy.
Prefer overloaded methods of v128 instead of _mm_... in VPKSHUS ppu interpreter precise.
Fixed more potential overflows that may result in wrong behaviour.
Added io/size alignment check for sys_rsx_context_iounmap.
Added rsx::constants::local_mem_base which represents RSX local memory base address.
Removed obsolute rsx:🧵:main_mem_addr/ioSize/ioAddress members.
-Indentation warnings
-prevent shift overflow
-This was declared extern in all contexts. Remove this for initialization
-Fix main return types. OH CANADA!
-Silence extraneos 'unused expression' warning
-Force use return value (warning)
-Remove tautological compare copy-pasta (char always < 256)
- Do not consume a slot every draw call, instead batch as many draws as possible
- Since renderpasses are dispatched per-draw-clause, keeping occlusion queries outside the renderpasses works fine
- If renderpasses are reorganized, occlusion tasks will have to be reorganized again
- Remove string comparisons from the hot-path!
- Use attribute streaming and push constants to avoid forcing a descriptor block copy every other draw call/pass.
While this isn't so bad on nvidia cards, it makes AMD cards a slideshow.
- When multithreaded RSX is enabled, the vertex cache just lowers performance
- The small cost of upload is paid by the asynchronous thread, allowing RSX to work optimally
- Multiple header files where missing #includes to other headers that
where used in the header. Correct header was included in correct
order in source files which caused everything to compile.
- Added missing #includes so header files correctly include all their
dependencies and fixes problems with IDEs being unable to parse
headers correctly due to missing symbols
- Add vm_locking.h and vm_reservation.h and move relevant functions
and types to these headers.
- Change include order and make vm_ptr.h, vm_var.h and vm_ref.h headers
usable invidually and them including vm.h instead of other way around
- Because usage of vm::ptr now requires including vm_ptr.h instead of
vm.h updated multiple #includes
- Added additional #includes to vm_reservation.h and vm_locking to
where vm::reservation_* and locking related functions are used
- Refactor overlays and resolve passes to support use of push constants instead of relying buffer map/unmap
- Add support for nvidia resolve (NV is the only vendor not supporting shader_stencil_export)
- TODO: Option to completely skip clamping in some architectures as it is not needed in most games
- Mostly affects older GPUs that do not have access to native fp16
- Removes a lot of wm_event code that was used to perform window management and is no longer needed.
- Significantly simplifies the vulkan code.
- Implements resource management when vulkan window is minimized to allow resources to be freed.
- Just use a semaphore and let the driver handle it instead of manual framepacing.
We lose framepace control but drivers have matured in the past few years so it should work fine.
- Ensures the current renderpass matches the image properties even when a cyclic reference is detected
- Solves SDK debug output error spam due to mismatching layouts and renderpasses
Intel ANV has been tested and verified to work without workaround
AMDVLK and the proprietary AMD driver have been confirmed to require workaround for window resizing
- Add double-buffered descriptor pools to avoid use-after-free situations
- Make descriptor pools more configurable
- Also adds in a hack to allow renderdoc to capture properly
Make use of lock bits in reservation counters.
On PPU, fallback to compare_and_swap instead of desperate retry.
On SPU, lighten write set on retry by 'locking' outside of the transaction.
- Use a simple queue to avoid redundant checks over all the contexts
- Poll queue if RSX pipe is idle
- Only check the queue when the frame context is dirty (after a queue operation)
- Reset descriptors at the start of the frame context to avoid having to synchronize mid-frame
- Fully synchronize if a descriptor reset is required mid-frame (spec compliance; also fixes flickering verts on some hardware)
- Transition attachments to LAYOUT_GENERAL in case of a feedback loop
- Fixes appearance of garbage along polygon edges in some
post-processing passes.
- Also reverse this transition when rendering goes back to normal
- Allows render targets to behave like stacked 3D views same as shader inputs are resolved
- Basically implements most of 'Read Color/Depth Buffers" option for 'free'.
- Allows splitting RTV/DSV resources if they are superceded by a partial surface
- Also allows intersecting new resources through the surface cache for proper inheritance from other scattered data
- TODO: Refactor bind_surface_as_rtt and bind_surface_as_ds to reduce asinine code duplication
Misc: fix EH frame registration (LLVM, non-Windows).
Misc: constant-folding bitcast (cpu_translator).
Misc: add syntax for LLVM arrays (cpu_translator).
Misc: use function names for proper linkage (SPU LLVM).
Changed function search and verification in Giga mode.
Basic stack frame layout analysis.
Function detection in Giga mode.
Basic use of new information in SPU LLVM.
Fixed jump table compilation in SPU LLVM.
Disable broken optimization in Accurate xfloat mode.
Make compiled SPU modules position-independent in SPU LLVM.
Optimizations include but not limited to:
* Compiling SPU functions as native functions when eligible
* Avoiding register context write-out
* Aligned stack assumption (CWD alike instruction)
If the rwlock is currently acquired by a writer signaling readers is wrong and will lead to EPERM for wunlock!
Only signal blocked readers if the rwlock is currently acquired by readers
readers can wait on the sleep queue if a writer lock has been blocked before it, in this case after runlock: writer should acquire the lock but the r's sleep queue is still not empty!
TODO: Investigate the _s input modifier behaviour further, in case it can avoid generating zeroes from a MAD instruction.
x = MAD(+ve, -ve, -ve) with _s input modifier in BFBC expects result to be Non-zero
- Properly test for NaN and Inf when clamping down to fp16
- Optimize divsq a bit; mix(vec, vec, bvec) emits OpSelect which is what
we want here, instead of component-wise selection which is much slower.
- While mul(0, nan) = nan and 0 / 0 = nan, 0 / sqrt(0) = 0 because of hw
gremlins. normalize(0) is also nan so this behaviour does not work
around that particular case either which makes it even more baffling.
- The hw generates inaccurate values when doing perspective-correct
interpolation of vertex output attributes and makes the comparison (a ==
b) fail even when they are a fixed constant value.
- Increase equality tolerance when doing comparisons in fragment
shaders for NV cards only to work around this issue.
- Teepo fix
- The fixed-point D24S8 format does special Z clamping during compare which matches PS3 behaviour
- D32S8 is a floating point format and comparison with Dref > 1 always fails causing black edges/borders
- Improve support for float16_t by minimizing mixed inputs to functions
(ambiguous overloads)
- Minimize amount of downcasts in code by using opcode flags
- Re-enable float16_t support for vulkan
- Emulating f16 with f32 is not ideal and requires a lot of value clamping
- Using native data type can significantly improve performance and accuracy
- With openGL, check for the compatible extensions NV_gpu_shader5 and
AMD_gpu_shader_half_float
- With Vulkan, enable this functionality in the deviceFeatures if
applicable. (VK_KHR_shader_float16_int8 extension)
- Temporarily disable hw fp16 for vulkan
Are made composable in expressions similar to arithmetic ops.
Implement noncast in addition to bitcast (no-op case).
Implement bitcast constant folding.
Fixed some misuse of sext<>.
Properly forward value categories in expression structs.
Simplify SFINAE tests (is_llvm_expr, llvm_common_t) in global operators.
Add llvm_const_int and remove llvm_add_const, llvm_sub_const, etc.
Add llvm_ord and llvm_uno for FP comparison via >=< operators.
Replace cpu_translator::fcmp with fcmp_ord and fcmp_uno.
- Used IDs were not from the Guitar Hero instruments but in fact from the Rock Band ones. Sets the correct Guitar Hero IDs and adds the Rock Band ones on comments. TODO: Allow selecting the specific devices on the PAD Settings.
- Adds DJ Hero Turntable VID/PID.
- Adds Dance Dance Revolution Mat VID/PID.
* Check directory existence if setParam is NULL (dont create directory)
* Fix mask for reCreateMode
* Check a few setParam fields including reserved buffers.
* Fix sizeKb when the dir is empty except from PARAM.SFO
* Fix error checking when CELL_SAVEDATA_RECREATE_YES is specified but setParam is NULL (Doesnt do anything, simply errors)
TODO: There are probably more spots where we should yield.
A little more at the start because PacketRead is called twice.
Dont use sys_timer_usleep because it will just call this_thread::yield() repeatedly.
This handles hypothetical situation when AVX is disabled system-wise.
Also refactored register use, to match Windows path with Linux path.
This reduces read set a little at the cost of stack use.
- When reverse scanning, offsets are inverted and offset value of 0 is logically equivalent to an offset of -1
- Add an explicit message if clipping happens to avoid silent errors/bugs
DFCMGT instruction removed, it was wrong to add to begin with
ASMJIT: Fix compilation of double compare instructions, move exception to runtime instead of compiletime!
Jarves confirmed that he implemented this instruction because of that bug with asmjit only, affected God Of War 3
- Revert to using block metrics, but with optional per-channel decode
stage for the final transfer. Much cleaner than hacking in the width to
be in channels instead of blocks.
- Removes CPU-only transforms that broke GPU-side code.
-- Channels in GPU compute are laid out in cell-order, but CPU was uploading in favorable order and compensating with swizzles.
-- This leads to 2 different layouts depending on the location of the data (CPU vs GPU)
- Implement R8G8_R8B8 interleaved format decode
- General improvements
formats
- Allows D24S8 and D32S8 transport via typeless channels
- Allows uploading and downloading D24S8 data easily
- TODO: Implement optional byteswapping to fix flushed readbacks with
the same method
- Do not round up sub-pixel offsets, round down instead
- Do not allow incomplete sources for hw blit transfer
- Reimplement src clipping (slice_h)
- Check 'area' of incoming texels and correct for them before RTT lookup/transfer
- Filter out incomplete targets when performing RTT lookup (1 texel or less contribution)
* Hw tests show state is unaffected by external destruction of the event queue
* Minor race regarding state check fixed (can result in an undestroyable state)
- If a transfer writes to a RTT and depth mismatch happens, create a local target and the upload function will likely resolve between the two
- If a surface is rejected, reset the target region!
- Also refactors some bpp handling code
- Simplify texture intersection test to use a normalized/uniform coordinate space
- Fix broken bounds checking as well
- Batch dma transfers whenever possible and do them in one go
- vk: Always ensure that queued dma transfers are visible to the GPU before they are needed by the host
Requires a little refactoring to allow proper communication of the commandbuffer state
- vk: Code cleanup, the simplified mechanism makes it so that its not necessary to pass tons of args to methods
- vk: Fixup - do not forcefully do dma transfers on sections in an invalidation zone! They may have been speculated correctly already
- Properly wait for the buffer transfer operation to finish before map/readback!
- Change vkFence to vkEvent which works more like a GL fence which is what is needed.
- Implement supporting methods and functions
- Do not destroy fence by immediately waiting after copying to dma buffer
* Implement both RawSPU and threaded SPU page fault recovery
* Guard page_fault_notification_entries access with a mutex
* Add missing lock in sys_ppu_thread_recover_page_fault/get_page_fault_context
* Fix EINVAL check in sys_ppu_thread_recover_page_fault, previously when the event was not found begin() was erased and
CELL_OK was returned.
* Fixed page fault recovery waiting logic:
- Do not rely on a single thread_ctrl notification (unsafe)
- Avoided a race where ::awake(ppu) can be called before ::sleep(ppu) therefore nop-ing out the notification
* Avoid inconsistencies with vm flags on page fault cause detection
* Fix sys_mmapper_enable_page_fault_notification EBUSY check
from RE it's allowed to register the same queue twice (on a different area) but not to enable page fault notifications twice
- Avoids blindly reusing blit dst sections as they may contain garbage.
If a section was unlocked for a flush, just discard it as its reuse introduces potential data corruption.
Since the data needs to be reuploaded anyway (for now), its better to start afresh
- In case of format mismatch, reset the calculated dst block
- Add a bounds check to determine if data contained in an atlas is good enough for sampling the cache.
If not enough data is provided, fall back to full upload
- Blit operations do format conversion automatically which is NOT what we want!
- Scale onto temp buffer with similar format before performing data cast.
- Use a 5-point tap with an X pattern across the target's memory space to reduce chances of false positives
- TODO: Potential false positives identified, requires some minor
restructuring of surface_store
TODO: From hw testing, it seems like sys_memory_get_page_attribute and sys_rsx_context_iomap check page size a little differently
get_page_attribute() always go by area flags, sys_rsx_context_iomap checks page by the page granularity
This means that if the area page size 64k, but shared memory is mapped with SYS_MEMORY_GRANULARITY_1M
It can be mapped for rsxio, but the page attribute will indicate 64k page size :thonk:
rsxio memory is verified to need 1m pages.
- From RE, only protocols SYS_SYNC_FIFO and SYS_SYNC_PRIORITY are valid
- Use conditional atomic op store in a few places
- Properly revert changes in sys_event_flag_set when aomic op fails
* Fix 0 vm page flags to behave like 1m flags, follows c8a681e60
* check if address exists and valid for rsx io allcations (must be allocated on 1m pages)
- Properly synchronize when transitioning to/from GENERAL layout.
- General layout requires full pipeline dependency since its used in a 'general' sense. As such, its use is to be largely avoided.
- gl: Properly initialize and manage sampler states
- gl/vk: Snap overlay elements to pixel grid by aligning to pixel centers
- overlays: Disable grid snapping in stb since its now handled in the backend
- Make detail a separate text entity as it often contains a lot of noise
- Properly pad the entry if needed to avoid text sitting too close to the edge
- Use custom string conversion to ensure overlay deals with extended ascii whenever possible
- Improves language compatibility greatly and avoids empty spaces for unknown glyphs
- Adds all the major buttons to native dialog input options
- Adds more button options for the native osk
- Brighten osk cell backgrounds a bit to improve visibility
- NVIDIA drivers hook into the msq before our nativeEvent handler. This means NV is aware of events before rpcs3 is aware of them and sometimes stops until a new event is triggered.
If rpcs3 is inside a driver call at this time, the system will deadlock since the driver waits for msq which waits for the renderer which waits for the driver.
- Use explicit hook management to control window events
- Add fence timeout to attempt detection of surface loss events
Is a memory manager for ASMJIT, replaces asmjit::JitRuntime
Unified memory manager for ASMJIT and LLVM
Unified SPU trampoline generation
Remove previous workarounds
- Disable DEPTH<->RGBA typeless transfers for now as they require a lot more work to work for all vendors
- Do not allow switching layouts to UNDEFINED/PREINITIALIZED formats
- Apply dither to edges that almost fail the straight-up alpha test
- Significantly improves alpha tested geometry far from the camera
- Also removes blend factor overrides/hacks as they give incorrect results due to background bleeding
- Index offset is ignored anyway and only used to calculate vertex attribute divisor index
- Specialized optimization for untouched xfer without primitive restart
Allow parallel compilation of SPU code, both at startup and runtime
Remove 'SPU Shared Runtime' option (it became obsolete)
Refactor spu_runtime class (now is common for ASMJIT and LLVM)
Implement SPU ubertrampoline generation in raw assembly (LLVM)
Minor improvement of balanced_wait_until<> and balanced_awaken<>
Make JIT MemoryManager2 shared (global)
Fix wrong assertion in cond_variable
* Fix SPU LR event setting in atomic commands according to hw test
* MFC: increment timestamp for PUT cmd in non-tsx path
* MFC: fix reservation lost test on non-tsx path in regard to the lock bit
* Reservation notification moved out of writer_lock scope to reduce its lifetime
* Use passive_lock/unlock in ppu atomic inctrustions to reduce redundancy
* Lock only once for dma transfers (non-TSX)
* Don't use RDTSC in reservation update logic
* Remove MFC cmd args passing to process_mfc_cmd
* Reorder check_state cpu_flag::memory check for faster unlocking
* Specialization for 128-byte data copy in SPU dma transfers
* Implement memory range locks and isolate PPU and SPU passive lock logic
Increase untouched buffer timeout when some of the buffers have been
touched. Might improve audio quality on games that suffered from
miniscule popping even when buffering was enabled (such as DeS).
In addition, made time stretching algorithm slightly more aggressive.
Includes some other tiny tweaks as well.
Also renames "AudioThread" to "AudioBackend". The new name is more
descriptive of what the class really is responsible for, since the
backends are not responsible for managing the audio thread.
NOTE: Right now only XAudio2 is supported
- Avoid tagging and rely on read/write barriers and the dirty flag mechanism. Testing is done with a weak 8-byte memory test
- Introducing new data when tagging breaks applications with race conditions where tags can overwrite flushed data
- D24S8 targets have 2 aspects that are dealt with separately; Forcefully initialize the remaining data if a partial init is done. Its 'free' anyway
- It seems that the stencil mask matters when clearing unlike the depth mask and color mask
- gl: Include an execution state wrapper to ensure state changes are consistent. Also removes a lot of required 'cleanup' for helper methods
- texture_cache: Make execition context a mandatory field as it is required for all operations. Also removes a lot of situations where duplicate argument is added in for both fixed and vararg fields
- Explicit read/write barrier for framebuffer resources depending on
usage. Allows for operations like optional memory initialization before
reading
- Remove the required_xxx_pitch constraint as it makes no sense. The pitch controls what can be written per line.
- It is possible to have a huge surface width but only render to a small region at the beginning and have a smaller pitch than can fit the surface (NFS carbon)
- If draw call resources consume memory that intersects with NA parts of the texture cache, we get a framebuffer test mismatch.
This mismatch is false and happens because the thread has not yet reached the point of relocking the pages
- Implicitly invoke a memory barrier if actively reading from an unsynchronized texture
- Simplify memory transfer operations
- Should allow more games to work without strict mode
- Do not bind companion framebuffer when clearing single aspect; let the
contest mechanism sort it out instead
- Do not prematurely tag framebuffers, instead only do so at
write-confirmation time. Should avoid false tagging if setup does not
allow a render to occur.
* Never return NULL (also apllies to similar functions)
* Base offset is 0x0e000000 for main location
* Default location is LOCAL
Info was taken from disasm of gcm
- Immediate mode is isolated from the rest of the vertex configuration
- TODO: Verify register behaviour when immediate mode is used
Check if per-primitive const register values are supported (likely are)
- Implements a mirror view of D24S8 data that accesses the stencil components.
Finishes the implementation of TEX2D_DEPTH_RGBA as the stencil component was previously missing from the reconstructed data
- Add a few missing destructors
Image classes are inherited a lot and I forgot to make the dtors virtual
- Per-channel conditional execution introduces RAW hazards all over the place
- Its cheaper to process both branches and select between the two
- Also improves ShaderVariable functionality to allow functionality such as match_size and taking complex variables as inputs
* Restore stack in fifo error handling
* Update get register after the cmd execution
* Fix put pause in the middle of command
* Add restore points when branching to self
* Precise nopcmd detection
* Test all invalid cmds for early treatment of queue corruption
To avoid the need (and performance hit) of Read Color/Depth Buffers, we
may not invalidate overlapping fbos inside lock_memory_region unless
they are guaranteed to be superseded by the new one.
This avoids e.g. issues with overblooming, among others.
Fixes VRAM leaks and incorrect destruction of resources, which could
lead to drivers crashes.
Additionally, lock_memory_region is now able to flush superseded
sections. However, due to the potential performance impact of this
for little gain, a new debug setting ("Strict Flushing") has been
added to config.yaml
The hw doesnt fix pitch, when specifying src pitch 0 it copies the same pixels line to dst. keep in mind out_pitch = 0 is not allowed in image_in.
Same goes for buffer_notify, though it allows out_pitch to be 0.
* Dont bother capturing 'destination' blocks with no data. instead premap all main memory to ensure allocated
* Capture zcull and tile state as their compressed gcm forms
* Fix index array capturing, ignore empty sets
* hle gcm: Fix byteswaping in cellGcmSetZcull
* Added a helper function for fetching game's PARAM.SFO path
This should properly get SFO path for unlocked C00 games
* Normalized line endings
* Refresh game list after installing a RAP file
- Do not assume flip marks end-of-frame if executed via syscall
- Also disables skip_frame for these applications as there is no frame boundary
- NOTE: QUEUE_HEAD cannot be relied on as it is seemingly possible to flip the same head and not need to queue it
- Ignore barriers inserted after BEGIN but before any draw commands are emitted
- Properly process tail barriers inserted before END but after draw commands are submitted
- Ignore execution barriers with no effect (same register value written)
- Tries to detect when FIFO preprocessing is beneficial and only enables optimizations if the benefit outweighs the cost
- Current threshold is at least 500 draw calls saved at over 2000 draw calls to justify the overhead
- TODO: More tuning for other CPUs
- Improve vertex attribute layout format. Allows for full 16-bit attribute divisor
- Use actual pitch when declaring framebuffer rsx pitch instead of register value in case of swizzle? rendering
- Replace a few more vectors with simple_array<T>
- Avoid unnecessary string comparisons in backends. We already know referenced textures from the program analysers!
- Also fix visual corruption when using disjoint indexed draws
- Refactor draw call emit again (vk)
- Improve execution barrier resolve
- Allow vertex/index rebase inside begin/end pair
- Add ALPHA_TEST to list of excluded methods [TODO: defer raster state]
- gl bringup
- Simplify
- using the simple_array gets back a few more fps :)
Implement helper functions balanced_wait_until and balanced_awaken
They include new path for Windows 8.1+ (WaitOnAddress)
shared_mutex, cond_variable, cond_one, cond_x16 modified to use it
Added helper function utils::popcnt16
Replace most semaphore<> with shared_mutex
Move lambda into a cpu_stop()
Use running thread counter to synchronize with sys_spu_thread_group_join()
Use SPU_STATUS_STOPPED_BY_STOP exclusively for sys_spu_thread_exit() as before
Remove unnecessary waiting in sys_spu_thread_group_exit()
Rollback some minor unnecessary changes
Use shared_mutex in SPU TG
* Remove complete buffer clear
* If pressure sensitivity option is not specified, write zeroes (should this be handled from our actual controller handler?)
* Check sensor setting before reporting changes
*Set priority under a lock
*Fix yield command making threads going out of scheduler control by removing it from the queue (not a bug that affects compatibility)
* gcm: Fix tile offset setting
highest bit signifyies location, so ignore that while reading the offset.
* rsx-capture: Fix tile binding
fixes division by zero when dividing by pitch when the tile is not bound.
* rsx-capture: Fix zcull binding
lf_queue<>: unbound FIFO queue with dynamic linked-list
lf_value<>: concurrently-assignable value readable without locking at the cost of memory (using dynamic linked list)
Add atomic_t<>::compare_exchange
- POS does not have to be fetched from ATTR[0]
- Confirmed with UC1 that uses WEIGHT for positions
- At least one POS stream has to exist to feed the position attribute which cannot repeat for a single triangle
- Matching attachments with resource id fails because drivers are reusing
handles!
- Properly sets up stale fbo ref counting and removal
- Properly sets up resource reference test with subsequent removal to
avoid using a broken fbo entry
- Orders flushing to preserve memory at all cost
- Avoids false positive where flushing overlapping sections can falsely invalidate another with head/tail test
- Forcefully downloads and reuploads data from the CPU in case of unexpected overlaps
- Properly detect correct size of newly created blit targets
- Remember to clear any existing views when changing the default component map!
- These dependencies have #defines which enable code related to them
in rpcs3 and rpcs3_ui targets. They are only used in rpcs3_emu but
HAVE_* defines have to be defined in rpcs3 and rpcs3_ui targets also,
so they have to have PUBLIC visibility so defines carried over
CMake: Fix Alsa and Pulse audios being disabled
- HAVE_PULSE and HAVE_ALSA were not defined in rpcs3 target
Fixup libevdev
* CMake: Refactor build to multiple libraries
- Refactor CMake build system by creating separate libraries for
different components
- Create interface libraries for most dependencies and add 3rdparty::*
ALIAS targets for ease of use and use them to try specifying correct
dependencies for each target
- Prefer 3rdparty:: ALIAS when linking dependencies
- Exclude xxHash subdirectory from ALL build target
- Add USE_SYSTEM_ZLIB option to select between using included ZLib and
the ZLib in CMake search path
* Add cstring include to Log.cpp
* CMake: Add 3rdparty::glew interface target
* Add Visual Studio CMakeSettings.json to gitignore
* CMake: Move building and finding LLVM to 3rdparty/llvm.cmake script
- LLVM is now built under 3rdparty/ directory in the binary directory
* CMake: Move finding Qt5 to 3rdparty/qt5.cmake script
- Script has to be included in rpcs3/CMakeLists.txt because it defines
Qt5::moc target which isn't available in that folder if it is
included in 3rdparty directory
- Set AUTOMOC and AUTOUIC properties for targets requiring them (rpcs3
and rpcs3_ui) instead of setting CMAKE_AUTOMOC and CMAKE_AUTOUIC so
those properties are not defined for all targets under rpcs3 dir
* CMake: Remove redundant code from rpcs3/CMakeLists.txt
* CMake: Add BUILD_LLVM_SUBMODULE option instead of hardcoded check
- Add BUILD_LLVM_SUBMODULE option (defaults to ON) to allow controlling
usage of the LLVM submodule.
- Move option definitions to root CMakeLists
* CMake: Remove separate Emu subtargets
- Based on discussion in pull request #5032, I decided to combine
subtargets under Emu folder back to a single rpcs3_emu target
* CMake: Remove utilities, loader and crypto targets: merge them to Emu
- Removed separate targets and merged them into rpcs3_emu target as
recommended in pull request (#5032) conversations. Separating targets
probably later in a separate pull request
* Fix relative includes in pad_thread.cpp
* Fix Travis-CI cloning all submodules needlessly
Preprocess . and .. correctly
Don't use recursive locking
Also use std::string_view
Fix format system for std::string and std::string_view
Fix fmt::merge for std::string_view
- NOTE: The address swizzle index is only for use as src. The address registers are only used one channel at a time.
- When the destination of ARL, the encoding is the same as the other temp registers
- Retag resources reprotected under flush_always rules
- Properly check for blit resource fitting taking into account format
mismatch, pitch mismatch and typeless transfers
Remove vm::ptr operator %
This was a bad idea but explicit_bool_t was created almost for it
Other removed types are unused and have little to no meaning
- The x value contains the VP output value interpolated across primitive surface
- The y coordinate contains the fog fraction according to the selected fog formula
Remove "atomic operator" classes
Remove test, test_and_set, test_and_reset, test_and_complement global functions
Simplify atomic_t<> with constexpr if, remove some garbage
Redesign bs_t<> to use class, mark its methods constexpr
Implement atomic_bs_t<> for optimizations
Remove unused __bitwise_ops concept (should be in other header anyway)
Bitsets can now be tested via safe bool conversion
cellMicEnd would get stuck waiting for the cellMic thread to finish, but
it never does because micInited is still true.
Fixes Resistance: Fall of Mankind getting stuck after the menu
- Newer nvidia drivers are not exposing IMMEDIATE present mode unless you change options in nvidia control panel
This can cause severe performance degradation unless the vsync option is set to "off" in control panel
- Some games overflow the program buffer e.g Resistance games
The observed overflow is one instruction longer, likely an engine bug
with counting instructions
- Tags framebuffer resources on first use (when on_write is called to verify memory)
- Texture cache now selects the best match and even sorts atlas writes with memory write order to avoid older data showing over newer one
- Some formats are proven to ignore swizzle flag
- DXT compressed textures
- COMPRESSED_BG_GB class textures
- Some applications are using swizzled wide integer formats so those are confirmed to swizzle
- Use proper time checking; depending on what is being done one 'tick' can
be almost a millisecond long or several nanoseconds
- Avoid spamming the system timer unless necessary
- A possible deadlock is still present if rsx is trying to get a super_ptr whilst the vm lock holder is in an access violation
This patch makes this scenario very unlikely since each block need only be touched once
- Implement forced reading when calling update method to sync partial lists
- Defer conditional render evaluation and use a read barrier to avoid extra work
- Fix HLE gcm library when binding tiles & zcull RAM
- Do not do a full sync on a texture read barrier
- Avoid calling zcull sync in FIFO spin wait
- Do not flush memory to cache from the renderer side; this method is now obsolete
invalidate_range_impl_base does not mark all textures that will only be
unprotected as dirty when doing a deferred flush, since that is done by
flush_all.
However, if there are no sections to flush, the deferred flush will
use the same code path as non-deferred flushes for unprotecting textures
and forget to mark them as dirty.
This commit fixes this bug.
The existing implementation restarts the loop immediately after
finding a range_data instance that updates the trampled_range.
This commit refactors this method to continue the loop with the updated
trampled_range, and then repeat only those range_data instances that
were iterated through before the trampled_range was last updated.
As a result, the number of total iterations required is reduced.
When the trampled range changes, get_intersecting_set restarts the
outer loop. However, due to an off-by-one error, it skips the first
cache entry when doing so. This can cause a texture not to be
correctly unlocked, which could lead to issues or even deadlocks.
This commit fixes this off-by-one error.
If "address" is not page-aligned, the calculated end address for the
corresponding page is incorrect which can lead to false positives,
causing some textures to be re-uploaded unnecessarily.
This commit fixes this typo.
- Defer compilation process to worker threads
- vulkan: Fixup for graphics_pipeline_state.
Never use struct assignment operator on vk** structs due to padding after sType member (4 bytes)
- Adds proper support for vertex textures, including dimensions other than 2D textures
- Minor analyser fixup, removes spurious 'analyser failed' errors
- Minor optimizations for program state tracking
- Adds dead code elimination
- Fix absolute branch target addresses to take base address into account
- Patch branch targets relative to base address to improve hash matching
- Bumps shader cache version
- Enables shader logging option to write out vertex program binary,
helpful when debugging problems.
- Fix program abort logic to never abort before resolving later label addresses
Fixes jumping over broken code and jumping over END markers
- TEXTURE_CONTROL2 has indexing range of [0..15] without stride skipping!
This register does not have interleaving with other texture registers
- Track shader address poke as it seems to invalidate programs as well
- Avoid re-locking memory if there is no reason to do so (no draws issued)
- Actively bound regions should always get written to the backing cache
- Forcefully read memory during download if writes to the target have occured since last sync event
- Unroll main compute queue loop
- Do NOT run GPU cores on mappable memory! This has a dreadful impact on performance for obvious reasons
- Enable dynamic SSBO indexing (affects AMD)
- Make loop unrolling and loop length variable depending on hardware and find optimum
Build SPU cache after PPU, fix mixing progress
SPU ASMJIT: add support for Giga mode
SPU ASMJIT: use the same spu.log location as SPU LLVM
SPU: improve spu.log disasm
SPU: improve trampolines, unify with SPU ASMJIT
SPU: decode interrupt handler address from BR/BRA at 0x0
SPU LLVM: support Mega/Giga modes
SPU LLVM: implement function chunks
SPU LLVM: use PHI nodes, value visibility across basic blocks
SPU LLVM: implement function chunk table
New simple memory manager for LLVM (bugfix)
- Region pitch of 64 (disabled) can be used to indicate packed contents - do not assume it is the actual pitch!
- Also fixes interaction of AA factors with lockable_region size
- Use names for overlay command config and vertex data instead of std::pair.
- Make a couple of compiled_resource constructors explicitly named functions.
- Used to transfer D32S8 data where it makes sense to use this variant
- On nvidia cards, it is very slow to move aspects from D24S8 probably due to the format being faked.
For this reason, the unsafe variant is used for both D16 and D24S8 to avoid the heavy performance loss
- RADV does not keep a mapping ptr around for subsequent remap and falls back to heavy amdgpu methods every time
Explicitly manage pointer in the ring buffer structure to fix this
- Compute is now used to assist in some parts of blit operations, since there are no format conversions with vulkan like OGL does
- TODO: Integrate this into all types of GPU memory conversion operations instead of downloading to CPU then converting
- Removes the old depth scaling using an overlay.
It was never going to work properly due to per-pixel stencil writes being unavailable
- TODO: Preserve stencil buffer during ARGB8->D32S8 shader conversion pass
- Some applications (e.g Backbreaker) use an evil hack to resolve MSAA.
The application respecifies a formerly AA region as a region with no AA then performs a framebuffer feedback lookup.
The old memory keeps AA during read, but writes back to itself with AA resolved.
This is evil on several levels but it just happens to work on PS3
Returning CELL_OK in sceNpManagerRequestTicket2 makes NPEB01268 loop indefinitely trying to check the downloaded content.
Telling that the system is offline escapes the loop and make the game go further.
Moves NPEB01268/BLES01794 from Intro to Ingame.
Disable extra modes for SPU LLVM for now.
In Mega mode, SPU Analyser tries to determine complete functions.
Recompiler tries to speed up returns via 'stack mirror'.
* Support for using arbitrary firmware fonts
* Support for specifying font extension in `font` constructor (useful for most firmware fonts that use .TTF)
- The benefits of FIFO optimizations are huge in some cases.
The optimizations also do not break any tested applications so no need to disable with strict mode
- A debug option is provided to disable this behaviour for testing
- Matches ps3 accuracy for all tested values with few exceptions
- Do not enter the host OS kernel if waiting for less than 500us to avoid scheduler issues
- Improves cleanup code to consist of 2 parts, remove then dispose. Remove
does not deallocate the item until dispose is called on it, allowing the
backends to first deallocate external references.
- Caller is responsible for managing list locking and tracking disposable list
of items when external references have been cleaned up before using
dispose method.
1. rsx: Rework section synchronization using the new memory mirrors
2. rsx: Tweaks
- Simplify peeking into the current rsx::thread instance.
Use a simple rsx::get_current_renderer instead of asking fxm for the same
- Fix global rsx super memory shm block management
3. rsx: Improve memory validation. test_framebuffer() and
tag_framebuffer() are simplified due to mirror support
4. rsx: Only write back confirmed memory range to avoid overapproximation errors in blit engine
5. rsx: Explicitly mark clobbered flushable sections as dirty to have them
removed
6. rsx: Cumulative fixes
- Reimplement rsx::buffered_section management routines
- blit engine subsections are not hit-tested against confirmed/committed memory range
Not all applications are 'honest' about region bounds, making the real cpu range useless for blit ops
Drop _xbegin family intrinsics due to bad codegen
Implemented `notifier` class, replacing vm::notify
Minor optimization: detach transactions from global mutex on TSX path
Minor optimization: don't acquire vm::passive_lock on PPU on TSX path
- Properly identify puller spin primitives
- Add a small wake delay after exiting a spin delay. Fixes desynchronization
It seems real hw has a small delay between cell edits to commandbuffer memory at the GET address and the changes becoming visible to the DMA puller
Simulated with a short busy_wait, large values will improve sync but degrade performance
- Reimplements the AMD workaround using an identity buffer to avoid the performance hit of doing multiple glDrawArrays for every single compiled set
- Reimplements first/count allocation using a scratch buffer to reduce allocation overhead when large number of draw calls is used
- Improve dirty state tracking affecting program state
- vk: Refactor out transform constants upload into a separate channel to avoid if possible
transform data uploads are quite expensive
- Introduces a gpu program analyser step to examine shader contents before attempting compilation or cache search
- Avoids detecting shader as being different because of unused textures having state changes
- Adds better program size detection for vertex programs
- Improved vertex program decompiler
- Properly support CAL type instructions
- Support jumping over instructions marked with a termination marker with BRA/CAL class opcodes
- Fix SRC checks and abort
- Fix CC register initialization
- NOTE: Even unused SRC registers have to be valid (usually referencing in.POS)
Allow indirect calls within current function using a jumptable
This restores some functionality removed in SPU ASMJIT 2.0
Change SPUThread::get_ch_value prototype
Use conditional memory access to invalid address.
This approach can allow continue (for debugging);
but at the same time it doesn't add function call to recompiled code.
- Readback does not work at all with float textures on AMD openGL
Driver throws a bogus OUT_OF_MEMORY error regardless of amount of VRAM and system RAM available
- Formats support is linked to the physical device and by extension the logical device derived from it
It therefore makes no sense to track this as a separate object.
Simplifies parameter passing and template specialization.
Also avoids corner cases with AMD hardware (where D24S8 is not supported)
- vk: Clear dirty textures before copying 'old contents' in case the old data does not fill the new region
- rsx: Properly decode border color - seems to be in BGRA format
- vk: better approximation of border color to better choose between the presets
- vk: Individually clear color images outside render pass and without scissor
- vk: Fix renderpass selection for clear overlay pass
- vk: Include scissor region when emulating clear mask
NOTES:
- vk: Completely avoid using vkClearXXXXimage - its 'broken' on nvidia drivers
Spec is vague about the function so its not an actual bug
ClearAttachment is clearly defined as bypassing bound state which works correctly
- TODO: Implement memory sampling to simulate loading precleared memory if cell used memset to preinitialize the framebuffer
Autoclear depth to 1|255 and color to 0 is hacky!
- gl/vk: Fix subresource copy/blit
- gl/vk: Fix default_component_map reading
- vk: Reimplement cell readback path and improve software channel decoder
- Properly name the subresource layout field - its in blocks not bytes!
- Implement d24s8 upload from memory correctly
- Do not ignore DEPTH_FLOAT textures - they are depth textures and abide by the depth compare rules
- NOTE: Redirection of 16-bit textures is not implemented yet
Primary:
- Fix SET_SURFACE_CLEAR channel mask - it has been wrong for all these
years! Layout is RGBA not ARGB/BGRA like other registers
Other Fixes:
- vk: Implement subchannel clears using overla pass
- vk: Simplify and clean up state management
- gl: Fix nullptr deref in case of failed subresource copy
- vk/gl: Ignore float buffer clears as hardware seems to do
- Ignore unlocked blit sections [TODO]
- Do not attempt blit on hw if bytesize is unsupported
- gl: Implement typeless memory transfers
Uses pbo to handle type-agnostic memory transfer
- Mainly affected are colormasks and read swizzles
NOTES:
- Writes to G write to the second and fourth component (YW)
- Writes to B write to first and third component (XZ)
- This means the actual format layout is BGBG (RGBA) making RG mapping actually GR
- Clear does not seem to have any intended effect on this format (TLOU)
Use opt-out shared spu_runtime to save memory (Option: SPU Shared Runtime)
Implement "übertrampolines" for dispatching compiled blocks
Patch fixed branch points to use trampolines after check failure
- Properly detect inline array registers vs constant value registers
- Silence needless spam, 306E is 2D surface engiine, the assumption that y is multiplied by 306E pitch is not crazy
Bug fix: don't display new data entry when not asked for
Use icon/title provided by the game for the new data entry
Display new data entry at the beginning of list when necessary
Minor cellSaveData cleanup
Fixes the following issues on Tales of Vesperia which requires SRM.
- Blacked out scene after the sleeping dog now renders correctly
- Ghosting effect. The ghosting was most noticeable as a delay between the character rendering and the cell shading around the character. This appears to be gone with this change.
- GL queries share the target binding (not asynchronous!)
- Discard active queries by closing them, leave closed queries alone (nothing to be done for discard op)
- Reimplements render target views used for sampling
- Optimizes access using an encoded control token
- Adds proper encoding for 24-bit textures (DRGB8 -> ORGB/OBGR)
- Adds proper encoding for ABGR textures (ABGR8 -> ARGB8)
- Silence some compiler warnings as well
- TODO: Real texture views for OGL current method is a hack
- Separate TXB from TXL: They are completely different!
- Properly perform TMU emulation in the fragment shader. Implemens SRGB conversion and alphakill at the moment
- Properly perform ROP emulation in the fragment shader. Implements FRAMEBUFFER_SRGB. While support on the chip looks to be incomplete (and wierd), it does work
- Document some more bits in SHADER_CONTROL register
- Export some debug information in the free texture register space components zw
Very useful when analysing renderdoc captures
- Enable shadow comparison on depth as long as compare function is active and texture is uploaded for depth read
Some engines (UE3) read all the components in the shader and use mul/mad with the result
it's currently unknown whats the exact relationship between the prio and the group type SYS_SPU_THREAD_GROUP_TYPE_COOPERATE_WITH_SYSTEM (0x20).
tho we do know prio'es whom less than 16 are reserved for the system.
gl/vk: Bump shader cache version
gl/vk: Disable anisotropic override when strict mode enabled as it is proven to alter some games negatively
gl: Clamp buffer view range to not exceed the backing buffer size. Also add assert for the same condition
- Track asynchronous operations in RSX core
- Add read barriers to force pending writes to finish.
Fixes zcull delay flicker in all UE3 titles without forcing hard stall
- Increase zcull latency as all writes should be synchronized now
- ZCULL unit emulation rewritten
- ZCULL reports are now deferred avoiding pipeline stalls
- Minor optimizations; replaced std::mutex with shared_mutex where contention is rare
- Silence unnecessary error message
- Small improvement to out of memory handling for vulkan and slightly bump vertex buffer heap
- Removes the duplicate local_transform_constants
- Resets the transform constants on every context reset
- Simplifies the code abit which should make it faster
- NOTE: Transform constants are persistent across context re-init events (VF5)
- TODO: Alot of work is still needed to execute draw commands out of order
Thats the only solution to games sending many draw calls with high frequency of state changes
- Adds support for compilation on MAC with moltenVK. Note that vulkan does
not work on MacOS yet. There are two main blockers:-
1) Texture component swizzles are not supported except for
RGBA8_UNORM->BGRA8_UNORM.
2) There is a bug in their SPIR-V -> MSL generator.
GLSL.std.450.xxxx functions are not implemented which breaks rpcs3
functionality. Trying to compile a vertex shader will throw because
unpackHalf2x16 is missing.
- According to NV_fragment_program spec, registers are zero initialized always
- A program even without writing to these registers will have black (0, 0, 0, 0) output
Confirmed behaviour with MotorStorm games. Their engine uses this quirk to clear color buffers when doing depth replace
Might be an unfixed game bug
According to the NV_fragment_program spec, its not feasible to have 16-bit depth wries
NOTE: NV_fragement_program precedes NV_fragment_program2 which is very
close to what RSX consumes. It is hardware from that era afterall
- Forces Bitcast of texture data if input format cannot possibly be the
same as the existing texture format
- rsx: Other minor improvements to texture cache :-
- remove obsolete blit engine incompatibility warning. The texture will be re-uploaded if it is indeed incompatible
- Implement warn_once and err_once to avoid spamming the log with systemic errors
- Track mispredicted flushes
- Reswizzle bitcasted texture data to native layout
TODO: Also needs reshuffle according to input remap vector
- A new step is added between decompilation and pipeline object creation allowing for properties to be updated based on shader contents
- Allos masking off attachment writes that are unmodified in the shader
- gl: Do not call makeCurrent every flip - it is already called in set_current()
- gl: Improve ring buffer behaviour; use sliding window to view buffers larger than maximum viewable hardware range
NV hardware can only view 128M at a time
- gl/vk: Bump transform constant heap size When lots of draw calls are issued, the heap is exhaused very fast (8k per draw)
- gl: Remove CLIENT_STORAGE_BIT from ring buffers. Performance is marginally better without this flag (at least on windows)
- Identify depth textures reaching the gpu via shader_read upload path
- Use correct timestamp counter for opengl
- inline draw_state::test_property because msvc doesnt do it for us
- Do not bother rechecking the dirty sampler pool for hits. Its faster to create new sampler than to search the pool
- Reserve some memory on vertex layout struct to reduce reallocation penalty
* [sysutil] Add Magnetometer system param
* [ui] Add UI for Move handler
Current options are "Null" and "Fake".
* cellGem: Improvements
* cellCamera: Improvements
Use atomic variable to sync TTY size
Implement console_putc (liblv2)
Write plaintext instead of HTML
Slightly improve performance
Fix random line breaks in TTY
* Made GDB debugger working with IDA
* Added async interrupts support
* Report proper thread after pausing
* Support attaching debugger before running app
- Adds support for abstract implementations
- Adds native windowing implementations for WIN32 and X11 as fallbacks
when present support is lacking (headless configs)
- For some reason the hardware forgets that primitive restart is enabled and tries to actually read vertex index 65535
- Works correctly if uint32 vertex indices are used instead of uint16 for cases where primitive restart is active
- Fix for texture barriers
- vulkan: Rework texture cache handling of depth surfaces
- Support for scaled depth blit using overlay pass
- Support proper readback of D24S8 in both D32F_S8 and D24U_S8 variants
- Optimize the depth conversion routines with SSE
- vulkan: Replace slow single element copy with std::memcpy
- Check heap status before attempting blit operations
- Bump guard size on upload buffer as well
- Implement flush-always behaviour to partially fix readback from a currently bound fbo
- Without this, only the first read is correct, as more draws are added the results become 'wrong'
- Fixes WCB and cpublit behviour
- Synchronize blit_dst surfaces to avoid data loss when gpu texture scaling is used
- Its still faster in such cases to disable gpu texture scaling but some types cannot be disabled without force cpu blit (e.g framebuffer transfers)
- Memory management tuning
- rsx: on-demand texture cache rescanning for unprotected sections
- rsx: Only framebuffer resources are upscaled
- Do not resize regular blit engine resources
- Lazy initialize readback buffer when using opengl
-- These measures should help minimize vram usage
Six instructions changed to use xmm registers instead of gpr.
ROTQBII, ROTQMBII, SHLQBII look better (shifts by imm)
ROTQBI, ROTQMBI, SHLQBI changed for consistency (shifts by variable)
Only AVX-512 path is changed (third version).
This instruction is extremely rare.
And the code is probably not optimal.
So this commit is pretty useless.
- For some surfaces, dimensions are passed via the log2 bits rather than surface pitch
-- This is similar to the setup for nv406e and probably means the surfaces are padded and swizzled
- Do not assume texture2D when creating new textures
- Flag invalid texture cache if readonly texture is trampled by fbo memory.
Avoids binding a stale handle to the pipeline and is rare enough that it should not hurt performance
- Do not add unused subroutines in shaders unless necessary
-- makes shaders easier to read and disassembled spir-v has less clutter
- glsl: Replace switch block with lookup table
- vulkan: Do not assume an aux frame context must exist in a well defined state as set in init_buffers() since the request might be external (via overlays path)
- gl: Do not bother waiting for idle before servicing external flip requests
- gl: Queue overlay cleanup requests to ensure only glthread attempts touching the context
- overlays: Do not compute size metrics for invalid/unsupported glyphs
- opengl driver optimization for nvidia. On nvidia glTextureBufferRange performance is horrendous
-- Initialize texture buffer to whole buffer at startup and use absolute offsets to read data instead
-- Over 2x performance in some cases (Resogun, TNT racers)
- gl/vk: Do not flip non-existent display buffers. Fixes spec violation at boot in TNT racers demo
- whitespace fixes for sys_rsx
- Implement low bit decode override flags for 2-component textures
- Properly implement alot of texture remaps according to the autotest results
rsx: Do not unnecessarily shuffle WZYX->RGBA unless we have proof
- From looking at format swizzles, this is incorrect
- Use edges of depth range to map clamped stuff
Disable range compression on regular draws vs extended range draws
- Some applications require full 0-1 usage without compromises.
-- TODO: This leaves the extended range z values to fight with regular draws in the .99 - 1.0 range
- The scale offset matrix is fine but on real hardware the z results seem to be independent of near/far clipping distances
-- If depth falls within near/far, clamp depth value to [0,1]
- Always flush the primary queue and wait if not involking readback from rsx thread
-- Should fix some instances of device_lost when using WCB
-- Marked remaining case as TODO
-- TODO: optimize amount of time rsx waits for external threads trying to read
fix input regression and fix input for FIFA games
fix input in NASCAR [BLUS30932]
fix port status query -> disconnected devices don't cripple following devices by decreased now_connect
- Track last working state and reset to it if RSX starts to desync
-- This is especially useful when running vulkan since the renderer will easily outpace the rest of the system when merely recording draw commands
- Ignore empty sets
-- Mark empty/invalid IB sets as having 0 element counts.
- Doesn't account for dynamic libraries loaded after the fact,
but usually good enough since
1) Those aren't even present in some games
2) They usually only have about 1 or 2 fragments (dialogs) each.
- Make all texture access on non-existent textures return 0
- If border color is closer to 0, then set alpha to 0 as well (might break some corner cases with alpha test)
- Zero initialize null sampler
- Force mipmap count to 1 if sampling from an RTV/DSV
- TODO: Better wcb flush detection, it should be better to re-upload the texture after it has been dwnloaded if expected mipmaps are > 1
- Sometimes square renders are done to surfaces with pitch=64 and re-uploaded with swizzle scanning
-- This setup avoids discarding targets if they are square and pitch == 64
- Abort nv406e semaphore acquire if the rsx thread stalls/crashes
- Fix texture size approximation to take mipmaps into account. Fixes some games hanging with WCB
- Avoid unprotecting memory until just before we have to write the data
- Avoids race conditions where the caller thread takes too long to enter the second phase and another thread accesses the "bad" memory
- Reorganize storage hash vs ucode hash
- Scan for actual fragment program start in case leading NOPed code precedes the actual instructions
-- e.g FEAR2 Demo has over 32k of padding before actual program code that messes up hashes
- Fix VERTEX_DATA_3F_M element alignment (its 16 bytes per attribute)
- Fix DATA_2S_X interpretation type. Its signed 16-bit unnormalized (s32k) and not signed normalized (s1)
- Zeta pitch is ignored by real HW for some reason
- Monitor ptch value changes as well since they may affect disabled surfaces
- TODO: Verify if MRT pitch is really taken into consideration
- Discard intentionally invalidated framebuffer resources. These are created after a flush has happened, forcing reupload since contents cannot be guaranteed (strict mode only)
- Fix for blits using vulkan; dont use the copy method if formats do not match, use generic blit instead
- Handle aliased depth + color target by disabling depth writes. This looks to be the correct way
- Add support for generic passes that cannot be done using general imaging operations. Lays the framework for tons of features and effects
- Implement RGBA->D24D8 casting. Sometimes games will split depth texture into RGBA8 then use the new RGBA8 as a depth texture directly
-- This happens alot in ps3 games and I'm not sure why. Its likely the ps3 did not sample fp values with linear filtering so this is a workaround
-- Only implemented for openGL at the moment
-- Requires a workaround for an AMD driver bug
* Input: further work on remapping Xinput and begin work on remapping DS4
* Input: Improve pad_settings_dialog a bit and begin Remapping for XInput
* Input: begin evdev remapping and change all handlers to use cfg::string
* Input: finish work on remapping evdev
and some more crap
* Input: finish work on remapping Xinput and DS4
* Input: add DS4 Colors to DS4 config
* Input: Improve DS4 deadzone scaling
Jarves made some mistakes, so I'll fix them in the follow up commit
* Input: fix Jarves fixes on DS4 deadzone
and remove unnecessary usage of toUtf8
* Input: add primitive batterychecks to XInput and DS4
* Input: add mmjoystick remapping
* Input: Fix evdev and some Vibration issues
* Input: adjust capabilities to fix stick input for games like LoS 2
also fix threshold slider minimum
also add ps button to all the handlers
* Input: Further evdev work
based on danilaml code review and own debugging:
Fixed path issue, <= 0 issue, some captures, const, axis with same codes.
Adds a map to each device that differentiates negative and positive axis mappings.
adjusted rest of the file to tabs (ListDevices and beginning of threadProc)
* Input: use 20ms vibration update time for xbox one elite controllers.
* Input: Fix return type of Clamp()
* Input: Evdev Fix
* Input: Evdev Optional GetNextButtonPress
presumably better than the other
* Input: review changes
* Input: evdev: fix wrong index in axis handling
move bindpadtodevice down to keep consistency between handlers and not get crazy
* Input: evdev: fix expensive add_device in GetNextButtonPress
* cleanup
* Input: mmjoy: fix type
* Input: evdev: final fixes
* Input: evdev: exclude unnecessary buttons while mapping Xbox 360 or DS4
* Input: add deadzone preview by passing necessary values in callback
use 0.5 of max value for threshold in pad dialog
* Input: get rid of all-uppercase variables
Added force_boot to force boot on cmdline boot.
Load() caused a Stop() that exited the application with "Exit RPCS3 when process finishes" enabled. Now Stop is only called if the emu is not stopped
* Thread: unbreak on BSDs after dbc9bdfe02
Utilities/Thread.cpp:1920:2: error: unknown type name 'cpu_set_t'; did you mean 'cpusetid_t'?
cpu_set_t cs;
^~~~~~~~~
cpusetid_t
/usr/include/sys/types.h:84:22: note: 'cpusetid_t' declared here
typedef __cpusetid_t cpusetid_t;
^
Utilities/Thread.cpp:1921:2: error: use of undeclared identifier 'CPU_ZERO'
CPU_ZERO(&cs);
^
Utilities/Thread.cpp:1922:2: error: use of undeclared identifier 'CPU_SET'
CPU_SET(core, &cs);
^
Utilities/Thread.cpp:1923:48: error: unknown type name 'cpu_set_t'; did you mean 'cpusetid_t'?
pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cs);
^~~~~~~~~
cpusetid_t
* JIT: use MAP_32BIT on Linux and FreeBSD
Unless RLIMIT_DATA is low enough FreeBSD by default reserves lower 2Gb
for brk(2) style heap, ignoring mmap(2) address hint requested by RPCS3.
Passing MAP_32BIT fixes the following crash
Assertion failed: ((Type == ELF::R_X86_64_32 && (Value <= UINT32_MAX)) || (Type == ELF::R_X86_64_32S && ((int64_t)Value <= INT32_MAX && (int64_t)Value >= INT32_MIN))), function resolveX86_64Relocation, file /usr/ports/devel/llvm40/work/llvm-4.0.1.src/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp, line 287.
* build: unbreak -DVULKAN_PREBUILT with system glslang on Unix
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:4:10: fatal error: '../../../../Vulkan/glslang/SPIRV/GlslangToSpv.h' file not found
#include "../../../../Vulkan/glslang/SPIRV/GlslangToSpv.h"
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
rpcs3/CMakeFiles/rpcs3.dir/Emu/RSX/VK/VKCommonDecompiler.cpp.o: In function `vk::compile_glsl_to_spv(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, glsl::program_domain, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> >&)':
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x50e): undefined reference to `glslang::TProgram::TProgram()'
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x51d): undefined reference to `glslang::TShader::TShader(EShLanguage)'
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x542): undefined reference to `glslang::TShader::setStrings(char const* const*, int)'
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x581): undefined reference to `glslang::TShader::parse(TBuiltInResource const*, int, EProfile, bool, bool, EShMessages, glslang::TShader::Includer&)'
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x5d6): undefined reference to `glslang::TProgram::link(EShMessages)'
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x5f1): undefined reference to `glslang::GlslangToSpv(glslang::TIntermediate const&, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> >&, glslang::SpvOptions*)'
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x5ff): undefined reference to `glslang::TShader::getInfoLog()'
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x61a): undefined reference to `glslang::TShader::getInfoDebugLog()'
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x630): undefined reference to `glslang::TShader::~TShader()'
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x63c): undefined reference to `glslang::TProgram::~TProgram()'
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x6d2): undefined reference to `glslang::TShader::~TShader()'
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x6de): undefined reference to `glslang::TProgram::~TProgram()'
rpcs3/CMakeFiles/rpcs3.dir/Emu/RSX/VK/VKCommonDecompiler.cpp.o: In function `vk::initialize_compiler_context()':
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x6f5): undefined reference to `glslang::InitializeProcess()'
rpcs3/CMakeFiles/rpcs3.dir/Emu/RSX/VK/VKCommonDecompiler.cpp.o: In function `vk::finalize_compiler_context()':
rpcs3/Emu/RSX/VK/VKCommonDecompiler.cpp:(.text+0x856): undefined reference to `glslang::FinalizeProcess()'
* build/msvc: add missing glslang include directory after 6bb3f1b4d75c
"c:\projects\rpcs3\rpcs3\VKGSRender.vcxproj" (default target) (15) ->
(ClCompile target) ->
Emu\RSX\VK\VKCommonDecompiler.cpp(4): fatal error C1083: Cannot open include file: 'SPIRV/GlslangToSpv.h': No such file or directory [c:\projects\rpcs3\rpcs3\VKGSRender.vcxproj]
- Fix texture cache blit behaviour when src has AA enabled and dst is a blit dst texture with or without AA
-- This requires handling AA resolve by removing a half downscale on multisampled axes
- Return all ones when a vertex attribute is disabled.
-- Some games forget to enable vertex attributes actually needed by the fs
- texture_cache: Fix internal size calculation for subresources
- vk: Delay dynamic state updates until just about to draw to ensure no flush has discarded the cb state
- In rare cases the driver derefs a nullptr and dies, taking the emulator with it
- From testing, it seems the vram is indeed freed when this happens so its "safe" to continue
- Support for raster offsets in surface descriptors (looks to be unused)
- Do not tag disabled render targets when using MRT (pitch = 64)
- Add missing notify_surface_changed() call for openGL
- Use AA mode to predict surface compression. Compression mode is useless without AA activated
- Rewrites most image subresource fetch routines to use the new heuristic
- Fix rsx:🧵:find_tile. FEED000(X) can be substituted for (X) in the code
-- Fixes alot of failures when looking for tiled regions
rsx: Fix antialiased unnormalized coords
- scaling factors are inverse to allow proper coordinates to be computed in fs
- Remove generic throws from the rsx pipeline. Stops the rsx thread from silently dying leaving the emulator in a hung state
- Hackplement add_signed and reverse_subtract_signed blend modes
- Reimplement fragment program fetch and rewrite texture upload mechanism
-- All of these steps should only be done at most once per draw call
-- Eliminates continously checking the surface store for overlapping addresses as well
addenda - critical fixes
- gl: Bind TIU before starting texture operations as they will affect the currently bound texture
- vk: Reuse sampler objects if possible
- rsx: Support for depth resampling for depth textures obtained via blit engine
vk/rsx: Minor fixes
- Fix accidental imageview dereference when using WCB if texture memory occupies FB memory
- Invalidate dirty framebuffers (strict mode only)
- Normalize line endings because VS is dumb
to flush_all is up to date. Should prevent recursive exceptions
Partially revert Jarves' fix to invalidate cache on tile unbind. This will
need alot more work. Fixes hangs
- Identify minimize/restore events as separate from regular resize and do not react to them
- Enable message queue consumption after loading the shaders cache. Also hides the frame in this step
-- This fixes the 'start fullscreen' bug when running vulkan
- This allows for better handling of deferred flushes.
-- There's still no guarantee that cache contents will have changed between the set acquisition and following flush operation
-- Hopefully this is rare enough that it doesnt cause serious issues for now
- vk: Always reopen primary command buffers. They should only be closed in flush_command_queue
- If uploading a texture and there are collisions with protected buffers, do not rebuild the cache
- Perform writes via flush before reprotecting pages that were not trampled
- Only flush no pages once
- Refactor invalidate memory functions into one function
- Add cached object rebuilding functionality to avoid throwing away useful memory on an invalidate
- Added debug monitoring of texture unit VRAM usage
- Emulate primitive restart in software whenever we get the chance
- Ensure PRIMITIVE_RESTART is never active when LIST topologies are active
- Reimplement TRIANGLE_FAN, POLYGON and QUAD expansion
- This communication is important in communicating window events. Helps properly synchronize swapchain management on vulkan and stops nvidia crashing
- Do not block the message queue lest the driver detect window as not responding
Implement sys_net syscalls
Clean libnet functions
Use libnet.sprx
Use libhttp.sprx
Use libssl.sprx
Use librudp.sprx
Implement sys_ss_random_number_generator
* spu: Rewrite interpreter fast FM
- Partially implement accurate FM
- Fix FMA/FMS/FNMS by removing an optimization that does not work for INF (cmpunord)
- cmpunord does not catch all cases of an extended result/overflow
- NOTE: FM still does not handle corner cases well (e.g inf * 1.2 because SPU does not have concept of inf)
rsx: Conditional lock hack removed
vulkan - Fixes
- Remove unused texture class
- Fix native pitch calculation (WCB)
rsx: Catch hanging begin/end pairs when flushing deferred draw calls
vulkan: Register DXT compressed formats
vulkan: Register depth formats
gl: Workaround for 'texture stitching' when gathering flip surface
- TODO: Add a proper flip hack option
rsx: Fix texture memory size calculation
- DXT textures dont have real pitch. Since pitch is used to calculate memory size, make sure it always evaluates to rsx_size
rsx: Fix cpu copy detection
rsx: Validate blit dst surface and dont make assumptions about region blit order
- Also relax restrictions on memory owned by the blit engine if strict rendering is not enabled
rsx: Fix depth texture detection
rsx: Do not manually offset into dst. The overlapped range check does so automatically
rsx: Minor optimizations
rsx: Minor fixes
- Fix to detect incompatible formats when using GPU texture scaling and show message
- Better 'is_depth_texture' algorithm to eliminate false positives
- Limits buffer size to min 720 in the Y axis (1024 section causes conflicts in some cases - TODO)
rsx: Fixups to allow large textures for blit operation
- Also includes checks for both leaking sections and blit regions for vulkan
hotfix for hanging when using WCB
addendum - unlock both ro and no blocks before attempting to copy memory blocks
gl: Fixups for ARB_explicit_uniform_location
- Forces glsl v 430 to make use of the extension
rsx/vk: Rework texture cache to minimize recursive access violations
- Also modifies the vulkan commandbuffer begin/end/submit mechanism
gl: Fix cached_texture_section::is_flushable to take memory protection into account
rsx: Fix blit dst offset calculation
gl/vk/rsx: Refactoring; unify texture cache code
gl: Fixups
- Removes rsx::gl::texture class and leave gl::texture intact
- Simplify texture create and upload mechanisms
- Re-enable texture uploads with the new texture cache mechanism
rsx: texture cache - check if bit region fits into dst texture before attempting to copy
gl/vk: Cleanup
- Set initial texture layout to DST_OPTIMAL since it has no data in it anyway at the start
- Move structs outside of classes to avoid clutter
gl: Fix multidraw [WIP]
rsx: Ignore vertex base when data source is generated using arithmetic
vk: Check pending flag before doing fence poke
vk/gl: Fix for inlined array and immediate draws
rsx: Collapse joined draws when batching
rsx: Blit engine improvements
- Always handle blits to and from framebuffers through the GPU
- Handle depth surfaces properly when using GL
- Check for format mismatches when blitting to the surface store [WIP]
- Make each frame context own its own memory
- Fix GPU blit
- Fix image layout transitions in flip
vk: Improve frame-local memory usage tracking to prevent overwrites
- Also slightly bumps VRAM requirements for stream buffers to help with running out of storage
- Fixes flickering and missing graphics in some cases. Flickering is still there and needs more work
vk: Up vertex attribute heap size and increase the guard size on it
vulkan: Reorganize memory management
vulkan: blit cleanup
vulkan: blit engine improvements
- Override existing image mapping when conflicts detected
- Allow blitting of depth/stencil surfaces
Fix rsx offscreen-render-to-display-buffer-blit surface reads
- Also, properly scale display output height if reading from compressed tile
gl: Fix broken dst height computation
- The extra padding is only there to force power-of-2 sizes and isnt used
gl: Ignore compression scaling if output is rendered to in a renderpass
rsx/gl/vk: Cleanup for GPU texture scaling. Initial impl [WIP]
- TODO: Refactor more shared code into RSX/common
Implemented _sys_process_exit2 syscall
Rewritten sys_game_process_exitspawn
Rewritten sys_game_process_exitspawn2
Implemented _sys_process_atexitspawn
Implemented _sys_process_at_Exitspawn
And some other changes
* Fixed sys_get_random_number generating less bytes than needed, and ceiling the buffer size in 0x1000 instead of failing
* Corrected alignment check in libgcm
* Now calling callback of sceNpManagerRegisterCallback
* Fixed trophies
Removes libad from ignore list.
It will only affect games that call for it, making them progress further with the Recompiler instead of dying in Unregistered Function, as there's no HLE implementation of libad.
Tested with Guitar Hero 5 [BLES00576]
- QUAD_STRIP evaluates to TRIANGLE_STRIP in memory. The memory layout is identical.
- The only difference between the two modes would be the primitive_ID but that doesnt matter on RSX
- Its worth noting that results will be different between the two modes if input vertices are non-coplanar for every set of N verts
Games like Tales of Vesperia seem to be using a random memory allocator with very low collision chance.
This means objects are very unlikely to be reused in such games leading to pile-up
- Do not set zfunc if alphakill is not enabled. This is because at the moment alphakill requires a different shader to be built
- use glsl loop-unroll friendly comparison; skip vertex input compare if either key requests it
- Minor tweaks to fp key generation
rsx/vk/shaders_cache: Move vp control mask to dynamic state
rsx/vk/gl: adds a shader cache for GL. Also Separates pipeline storage for each backend
rsx: Add more texture state variables to the cache
- Updates vulkan to use GPU vertex processing
- Rewrites vulkan to buffer entire frames and present when first available to avoid stalls
- Move more state into dynamic descriptors to reduce progam cache misses; Fix render pass conflicts before texture access
- Discards incomplete cb at destruction to avoid refs to destroyed objects
- Move set_viewport to the uninterruptible block before drawing in case cb is switched before we're ready
- Manage frame contexts separately for easier async frame management
- Avoid wasteful create-destroy cycles when sampling rtts
- Significant gains from greatly reduced CPU work
- Also reorders command submission in end() to improve throughput
- Refactors most of the vertex buffer handling
- All vertex processing is moved GPU side
* Fix always-true conditions in sceNp module
* gl_render_targets: useless check on unsigned variable, possible bug
* fixed UB in crypto utility functions
* copy-paste error in vk::init_default_resources
* pass strings by const ref
* Dont copy vectors. Make sure copies are not needed because functions are used in a multi-threaded context.
* Rework vertex attribute binding for vulkan. Allows always providing a buffer view to the pipeline even if the game has the attribute disabled as long as it is consumed by the vertex shader.
* Add sceNpScoreGetFriendsRanking and sceNpScoreGetFriendsRankingAsync functions
* Add sceNpSnsFbGetLongAccessToken function
* Add new functions for the sceNpTus module
* Add new functions for the cellSailRec module
* Stub cellCrossControllerInitialize
* Add sceNpAuth* functions for the sceNp2 module
* Remove unnecessary call to c_str()
* Add missing module id "CELL_SYSMODULE_ADEC_AT3MULTI"
* Add Turkish keyboard mapping constant
* Add cellOskDialogExtRegisterKeyboardEventHookCallbackEx function
* Update cellSubDisplay
* Update cotire version to 1.7.10
* Replace cellSubdisplay by cellSubDisplay
* Update cellSysutil.cpp with new functions stubbed
- spus run a tight gpu-style kernel with no multitasking on the cores themselves
-- this does not map well to PC processor cores because they never sleep even when doing nothing
-- the poll detection hack tries to find a good place to insert a scheduler yield
-- RdDec is a good spot as it signifies the spu kernel is waiting on a timer
- Improvements to framebuffer usage; Avoid creating new resources every frame
- Handle null fragment program properly
- Collect vertex upload statistics
- vk: Pre-initialize 'unused' varying registers in the vertex shader in case it gets matched with a fs that consumes it
-- Fixes a crash about fog_c not being declared
gl/dx12/vk: Handle null fragment program
- cleanup - use yield semantic instead of sleep(0) as yield is more cross-platform
-- sleep(0) is a windows specific scheduler hint
- Delays threads by a predetermined amount to 'desync' spurs kernels.
Largely reduces lock contention issues as well as making spurs kernels
play nice with reservations
- Also reduces number of lost notifications (SPU_EVENT_LR)
* Add the save icons to the save data entry and manager.
* Simplify code slightly since I have an else now so no need for == false
* Move the icon to the top of the list because it looks better. Remove redundant settitle.
* Fix size. It's a bit forced but there wasn't any better way as far as I could see on stack overflow.
Also, add an error dialog if you have no entries.
Simplify the logic slightly for the selected since with the no data case handled, I can make more assumptions about the return value.
* save_data_utility: fix dialog sizes
* CELL_OK
* Retcon dialog to instead be error in log.
* Dangle92 and I had some fun. Everything should be good now.
* In dangle's code he disabled the icon, in mine I hide it if there is nothing. Having both isn't needed. Yay merges resulting in doing stupid things.
* Fix leek
* Default size to zero for sanity. Shared pointer is fine handling null (tested with disgaea and renaming icon file)
* Simplifying. Thanks for review and advice all
Store disc locations for disc games
Create /dev_hdd0/disc/ directory
Move disc games from /dev_hdd0/game/ to /disc/ automatically
Load disc game patches automatically
- vk: Do not select first available format when choosing a swapchain format
- gl/vk: Ignore rendering zero sized framebuffers/scissors
- fp: Re-enable range clamp on fp16 registers; fix fx12 clamping [-2, 2]
* The basic unstubbing. Save entries will be listed and you can select a save. If you select none, then it'll work as well. WIP
* Filled out the trivial parts of the info dialog.
* Finish implementation and clean up. No "maintain" dialog or context menu for now until the copy/delete functions are implemented.
* Fix crash
* Update cellSaveData.cpp
For now, compile only one block at time
Use tail calls to move between blocks
Fully write PPU context (except CIA)
This fixes many compatibility problems
- Vertex buffer contents treat the base vertex as vertex 0 so we do the same for indices
rsx: Fix vertex base indexing
rsx: Properly fix non-zero offset indexed rendering
* sys_net: don't use fds_bits from a system header on FreeBSD
rpcs3/Emu/Cell/Modules/sys_net.cpp:137:14: error: no member named '__fds_bits' in
'sys_net::fd_set'; did you mean 'fds_bits'?
if (src->fds_bits[i] & (1 << bit))
^~~~~~~~
fds_bits
/usr/include/sys/select.h:75:18: note: expanded from macro 'fds_bits'
#define fds_bits __fds_bits
^
rpcs3/Emu/Cell/Modules/sys_net.h:114:13: note: 'fds_bits' declared here
be_t<u32> fds_bits[32];
^
* GUI: fallback to xdg-open on other Unices
rpcs3/Gui/GameViewer.cpp:289:26: error: use of undeclared identifier 'command'
wxExecute(fmt::FromUTF8(command));
^
* File: FreeBSD never supported copyfile(3) but sendfile(2) works fine
Utilities/File.cpp:114:10: fatal error: 'copyfile.h' file not found
#include <copyfile.h>
^~~~~~~~~~~~
* Thread: add signal handling for BSDs
Utilities/Thread.cpp:761:23: error: use of undeclared identifier 'REG_RAX'
static const decltype(REG_RAX) reg_table[] =
^
Utilities/Thread.cpp:763:2: error: use of undeclared identifier 'REG_RAX'
REG_RAX, REG_RCX, REG_RDX, REG_RBX, REG_RSP, REG_RBP, REG_RSI, REG_RDI,
^
Utilities/Thread.cpp:763:11: error: use of undeclared identifier 'REG_RCX'
REG_RAX, REG_RCX, REG_RDX, REG_RBX, REG_RSP, REG_RBP, REG_RSI, REG_RDI,
^
Utilities/Thread.cpp:763:20: error: use of undeclared identifier 'REG_RDX'
REG_RAX, REG_RCX, REG_RDX, REG_RBX, REG_RSP, REG_RBP, REG_RSI, REG_RDI,
^
Utilities/Thread.cpp:763:29: error: use of undeclared identifier 'REG_RBX'
REG_RAX, REG_RCX, REG_RDX, REG_RBX, REG_RSP, REG_RBP, REG_RSI, REG_RDI,
^
Utilities/Thread.cpp:763:38: error: use of undeclared identifier 'REG_RSP'
REG_RAX, REG_RCX, REG_RDX, REG_RBX, REG_RSP, REG_RBP, REG_RSI, REG_RDI,
^
Utilities/Thread.cpp:763:47: error: use of undeclared identifier 'REG_RBP'
REG_RAX, REG_RCX, REG_RDX, REG_RBX, REG_RSP, REG_RBP, REG_RSI, REG_RDI,
^
Utilities/Thread.cpp:763:56: error: use of undeclared identifier 'REG_RSI'
REG_RAX, REG_RCX, REG_RDX, REG_RBX, REG_RSP, REG_RBP, REG_RSI, REG_RDI,
^
Utilities/Thread.cpp:763:65: error: use of undeclared identifier 'REG_RDI'
REG_RAX, REG_RCX, REG_RDX, REG_RBX, REG_RSP, REG_RBP, REG_RSI, REG_RDI,
^
Utilities/Thread.cpp:764:2: error: use of undeclared identifier 'REG_R8'
REG_R8, REG_R9, REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15, REG_RIP
^
Utilities/Thread.cpp:764:10: error: use of undeclared identifier 'REG_R9'
REG_R8, REG_R9, REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15, REG_RIP
^
Utilities/Thread.cpp:764:18: error: use of undeclared identifier 'REG_R10'
REG_R8, REG_R9, REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15, REG_RIP
^
Utilities/Thread.cpp:764:27: error: use of undeclared identifier 'REG_R11'
REG_R8, REG_R9, REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15, REG_RIP
^
Utilities/Thread.cpp:764:36: error: use of undeclared identifier 'REG_R12'
REG_R8, REG_R9, REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15, REG_RIP
^
Utilities/Thread.cpp:764:45: error: use of undeclared identifier 'REG_R13'
REG_R8, REG_R9, REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15, REG_RIP
^
Utilities/Thread.cpp:764:54: error: use of undeclared identifier 'REG_R14'
REG_R8, REG_R9, REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15, REG_RIP
^
Utilities/Thread.cpp:764:63: error: use of undeclared identifier 'REG_R15'
REG_R8, REG_R9, REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15, REG_RIP
^
Utilities/Thread.cpp:764:72: error: use of undeclared identifier 'REG_RIP'
REG_R8, REG_R9, REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15, REG_RIP
^
Utilities/Thread.cpp:792:26: error: no member named 'gregs' in '__mcontext'
const u64 reg_value = *X64REG(context, reg - X64R_RAX);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:804:21: error: no member named 'gregs' in '__mcontext'
out_value = (u8)(*X64REG(context, reg - X64R_AL));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:809:21: error: no member named 'gregs' in '__mcontext'
out_value = (u8)(*X64REG(context, reg - X64R_AH) >> 8);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:815:31: error: no member named 'gregs' in '__mcontext'
const s8 imm_value = *(s8*)(RIP(context) + i_size - 1);
^~~~~~~~~~~~
Utilities/Thread.cpp:784:18: note: expanded from macro 'RIP'
#define RIP(c) (*X64REG((c), 16))
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:827:33: error: no member named 'gregs' in '__mcontext'
const s16 imm_value = *(s16*)(RIP(context) + i_size - 2);
^~~~~~~~~~~~
Utilities/Thread.cpp:784:18: note: expanded from macro 'RIP'
#define RIP(c) (*X64REG((c), 16))
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:836:33: error: no member named 'gregs' in '__mcontext'
const s32 imm_value = *(s32*)(RIP(context) + i_size - 4);
^~~~~~~~~~~~
Utilities/Thread.cpp:784:18: note: expanded from macro 'RIP'
#define RIP(c) (*X64REG((c), 16))
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:846:20: error: no member named 'gregs' in '__mcontext'
out_value = (u32)RCX(context);
^~~~~~~~~~~~
Utilities/Thread.cpp:779:18: note: expanded from macro 'RCX'
#define RCX(c) (*X64REG((c), 1))
^~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:851:19: error: no member named 'gregs' in '__mcontext'
const u32 _cf = EFLAGS(context) & 0x1;
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:851:19: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:852:19: error: no member named 'gregs' in '__mcontext'
const u32 _zf = EFLAGS(context) & 0x40;
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:852:19: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:853:19: error: no member named 'gregs' in '__mcontext'
const u32 _sf = EFLAGS(context) & 0x80;
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:853:19: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:854:19: error: no member named 'gregs' in '__mcontext'
const u32 _of = EFLAGS(context) & 0x800;
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:854:19: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:855:19: error: no member named 'gregs' in '__mcontext'
const u32 _pf = EFLAGS(context) & 0x4;
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:855:19: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:885:12: error: no member named 'gregs' in '__mcontext'
case 1: *X64REG(context, reg - X64R_RAX) = value & 0xff | *X64REG(context, re...
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:885:62: error: no member named 'gregs' in '__mcontext'
...*X64REG(context, reg - X64R_RAX) = value & 0xff | *X64REG(context, reg - X64R_RAX) & 0xffffff...
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:886:12: error: no member named 'gregs' in '__mcontext'
case 2: *X64REG(context, reg - X64R_RAX) = value & 0xffff | *X64REG(context, ...
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:886:64: error: no member named 'gregs' in '__mcontext'
...reg - X64R_RAX) = value & 0xffff | *X64REG(context, reg - X64R_RAX) & 0xffff0000; return true;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:887:12: error: no member named 'gregs' in '__mcontext'
case 4: *X64REG(context, reg - X64R_RAX) = value & 0xffffffff; return true;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:888:12: error: no member named 'gregs' in '__mcontext'
case 8: *X64REG(context, reg - X64R_RAX) = value; return true;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:913:3: error: no member named 'gregs' in '__mcontext'
EFLAGS(context) |= 0x1; // set CF
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:913:3: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:917:3: error: no member named 'gregs' in '__mcontext'
EFLAGS(context) &= ~0x1; // clear CF
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:917:3: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:922:3: error: no member named 'gregs' in '__mcontext'
EFLAGS(context) |= 0x40; // set ZF
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:922:3: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:926:3: error: no member named 'gregs' in '__mcontext'
EFLAGS(context) &= ~0x40; // clear ZF
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:926:3: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:931:3: error: no member named 'gregs' in '__mcontext'
EFLAGS(context) |= 0x80; // set SF
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:931:3: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:935:3: error: no member named 'gregs' in '__mcontext'
EFLAGS(context) &= ~0x80; // clear SF
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:935:3: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:940:3: error: no member named 'gregs' in '__mcontext'
EFLAGS(context) |= 0x800; // set OF
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:940:3: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:944:3: error: no member named 'gregs' in '__mcontext'
EFLAGS(context) &= ~0x800; // clear OF
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:944:3: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:953:3: error: no member named 'gregs' in '__mcontext'
EFLAGS(context) |= 0x4; // set PF
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:953:3: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:957:3: error: no member named 'gregs' in '__mcontext'
EFLAGS(context) &= ~0x4; // clear PF
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:957:3: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:962:3: error: no member named 'gregs' in '__mcontext'
EFLAGS(context) |= 0x10; // set AF
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:962:3: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:966:3: error: no member named 'gregs' in '__mcontext'
EFLAGS(context) &= ~0x10; // clear AF
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:966:3: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:976:7: error: no member named 'gregs' in '__mcontext'
if (EFLAGS(context) & 0x400 /* direction flag */)
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:769:49: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:976:7: error: use of undeclared identifier 'REG_EFL'
Utilities/Thread.cpp:769:55: note: expanded from macro 'EFLAGS'
#define EFLAGS(context) ((context)->uc_mcontext.gregs[REG_EFL])
^
Utilities/Thread.cpp:1020:25: error: no member named 'gregs' in '__mcontext'
auto code = (const u8*)RIP(context);
^~~~~~~~~~~~
Utilities/Thread.cpp:784:18: note: expanded from macro 'RIP'
#define RIP(c) (*X64REG((c), 16))
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:1146:3: error: no member named 'gregs' in '__mcontext'
RIP(context) += i_size;
^~~~~~~~~~~~
Utilities/Thread.cpp:784:18: note: expanded from macro 'RIP'
#define RIP(c) (*X64REG((c), 16))
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:1368:47: error: no member named 'gregs' in '__mcontext'
const bool is_writing = context->uc_mcontext.gregs[REG_ERR] & 0x2;
~~~~~~~~~~~~~~~~~~~~ ^
Utilities/Thread.cpp:1368:53: error: use of undeclared identifier 'REG_ERR'
const bool is_writing = context->uc_mcontext.gregs[REG_ERR] & 0x2;
^
Utilities/Thread.cpp:1393:89: error: no member named 'gregs' in '__mcontext'
...%s location %p at %p.", cause, info->si_addr, RIP(context)));
^~~~~~~~~~~~
Utilities/Thread.cpp:784:18: note: expanded from macro 'RIP'
#define RIP(c) (*X64REG((c), 16))
^~~~~~~~~~~~~~~
Utilities/Thread.cpp:767:55: note: expanded from macro 'X64REG'
#define X64REG(context, reg) (&(context)->uc_mcontext.gregs[reg_table[reg]])
~~~~~~~~~~~~~~~~~~~~~~ ^
* Thread: add explict casts for incomplete pthread_t on some platforms
Utilities/Thread.cpp:1467:17: error: no viable overloaded '='
ctrl->m_thread = thread;
~~~~~~~~~~~~~~ ^ ~~~~~~
Utilities/Atomic.h:776:12: note: candidate function not viable: cannot convert argument of
incomplete type 'pthread_t' (aka 'pthread *') to 'const atomic_t<unsigned long>' for 1st
argument
atomic_t& operator =(const atomic_t&) = delete;
^
Utilities/Atomic.h:902:7: note: candidate function not viable: cannot convert argument of
incomplete type 'pthread_t' (aka 'pthread *') to 'const type' (aka 'const unsigned long') for
1st argument
type operator =(const type& rhs)
^
Utilities/Thread.cpp:1656:3: error: no matching function for call to 'pthread_detach'
pthread_detach(m_thread.raw());
^~~~~~~~~~~~~~
/usr/include/pthread.h:218:6: note: candidate function not viable: no known conversion from 'type'
(aka 'unsigned long') to 'pthread_t' (aka 'pthread *') for 1st argument
int pthread_detach(pthread_t);
^
* build: dlopen() maybe in libc
/usr/bin/ld: cannot find -ldl
c++: error: linker command failed with exit code 1 (use -v to see invocation)
* build: iconv() maybe available on some BSDs in libc
/usr/bin/ld: cannot find -liconv
c++: error: linker command failed with exit code 1 (use -v to see invocation)
* build: hidapi-hidraw is only built on Linux
/usr/bin/ld: cannot find -lhidapi-hidraw
c++: error: linker command failed with exit code 1 (use -v to see invocation)
* Thread: use getrusage() on more POSIX-like systems
* Qt: don't return NULL handle on other platforms
rpcs3/rpcs3qt/gs_frame.cpp:120:1: warning: control reaches end of non-void function [-Wreturn-type]
}
^
* build: properly disable Vulkan on other platforms
In file included from rpcs3/rpcs3_app.cpp:40:
In file included from rpcs3/Emu/RSX/VK/VKGSRender.h:3:
rpcs3/Emu/RSX/VK/VKHelpers.h:1209:42: error: unknown type name 'device_queues'
std::vector<VkBool32> supportsPresent(device_queues);
^
rpcs3/Emu/RSX/VK/VKHelpers.h:1211:4: error: expected member name or ';' after declaration specifiers
for (u32 index = 0; index < device_queues; index++)
^
rpcs3/Emu/RSX/VK/VKHelpers.h:1221:4: error: expected member name or ';' after declaration specifiers
for (u32 i = 0; i < device_queues; i++)
^
rpcs3/Emu/RSX/VK/VKHelpers.h:1256:4: error: expected member name or ';' after declaration specifiers
if (graphicsQueueNodeIndex != presentQueueNodeIndex)
^
rpcs3/Emu/RSX/VK/VKHelpers.h:1261:4: error: expected member name or ';' after declaration specifiers
CHECK_RESULT(vkGetPhysicalDeviceSurfaceFormatsKHR(dev, surface, &formatCount, nullptr));
^
[...]
/usr/bin/ld: cannot find -lvulkan
c++: error: linker command failed with exit code 1 (use -v to see invocation)
* build: make install/strip work by moving commands
* Qt: create surface for GL context if it wasn't ready
#0 strlen (str=0x0) at /usr/src/lib/libc/string/strlen.c:100
#1 0x000000000090f02e in std::__1::char_traits<char>::length (__s=0x0)
at /usr/include/c++/v1/__string:215
#2 std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string (__s=0x0, this=<optimized out>) at /usr/include/c++/v1/string:1547
#3 gl::capabilities::initialize (this=0x2ba32a0 <gl::g_driver_caps>)
at rpcs3/Emu/RSX/GL/GLHelpers.h:133
#4 0x000000000090d3dd in gl::get_driver_caps () at rpcs3/Emu/RSX/GL/GLHelpers.cpp:56
#5 0x00000000008fa511 in GLGSRender::on_init_thread (this=0x838d30018)
at rpcs3/Emu/RSX/GL/GLGSRender.cpp:484
#6 0x0000000000938f9e in rsx:🧵:on_task (this=0x838d30018)
at rpcs3/Emu/RSX/RSXThread.cpp:334
#7 0x0000000000abc329 in task_stack::task_type<named_thread::start_thread(std::__1::shared_ptr<void> const&)::$_10>::invoke() ()
#8 0x0000000000abc114 in thread_ctrl::start(std::__1::shared_ptr<thread_ctrl> const&, task_stack)::$_7::__invoke(void*) ()
#9 0x0000000801e60c35 in thread_start (curthread=0x843650a00)
at /usr/src/lib/libthr/thread/thr_create.c:289
#10 0x0000000000000000 in ?? ()
* build: don't abort without git metadata
-- Found Git: /usr/local/bin/git (found version "2.13.1")
fatal: Not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
CMake Warning at git-version.cmake:12 (message):
git rev-list failed, unable to include version.
* build: non-parallel needs git-version.h earlier
rpcs3/rpcs3_version.cpp:3:10: fatal error: 'git-version.h' file not found
#include "git-version.h"
^~~~~~~~~~~~~~~
1 error generated.
explicitly request anisotropic filtering and BC compression
clean up a leaking framebuffer handle reference when using debug overlay
Wait for device instead of queue to ensure no conflict during renderer shutdown
Clip scissor regions when doing surface clears
* Fix windows build. I made sure to do everything with a win32 prefix to not effect linux build.
* Make the window resizable instead of fixed in the corner.
* Ignore moc files and things in the debug/release folder. I might also ignore rpcs3qt.vcxproj and its filters as they're autogenerated by importing the qt project file. But, this helps clean out clutter for now.
* Add cmake. This doesn't interact with the rest of rpcs3 nor the main cmake file. That's the next thing I'm doing. I'll probably need to modify them so it'll take me time to figure out. But, this will build rpcs3qt on linux and build as is with using qt.
* The build works. I'd like to thank my friends, Google and Stackoverflow.
Setted up by importing rpcs3Qt project using Qt's visual studio plugin.
* Cleanup. Remove all the stuff in the rpcs3qt folder as its incorporated elsewhere. Remove the rpcs3qt project file as its now built into the solution and cmake doesn't care about pro files.
* Update readme to reflect getting Qt.
* Remove wxwidgets as submodule and add zlib instead. Wxwidgets was our old way of having zlib. I also added build dependencies to rpcs3qt so you should no longer get link errors on the first clean rebuild.
* Add rpcs3_version, few GUI tweaks
* Set defaultSize to 70% of screen size
* Add the view menu (#3)
* Added the view menu with the corresponding elements. Now, the debugger/log are hidden by default. The view menu has a checkbox which you click to show/hide the dock widgets.
* Make log visible by default
* Improve UI by making it into a checkbox that's easier to use.
* fix qt build for vs2017 (seems to work fine in 2015 with plugin but needs testing by other users)
* updated readme for qt
* update appveyor for qt
- cleaned formatting for the post build command
* fix build (#6)
* fix build legit this time i promise
* [Ready] Gamepadsettings (#4)
* WIP Gamepadsettings
pushbutton Eventhandling missing
* GamepadSettings should work except for cfg Init
Some KeyInputs are missing
* Update padsettingsdialog.h
* Update padsettingsdialog.cpp (#5)
* Update padsettingsdialog.cpp
removed silly tabs
* Update padsettingsdialog.cpp
* GetKeyCode simplified
* rename pad settings to keyboard settings o.O
* rename keyboard setting to input settings
* Remvoed the QT_UI defines.
* Readded new line at end of file. Replaced define in padsettings with constant.
* GUI fixes (Settings)
* Stub the logger UI. Nothing special besides a simple stub.
* Unstub the log. I haven't tested TTY but it should work.
Only thing to do, but this is in general, is add persistent settings.
* Minor refactoring to simplify code.
* Fix image loading. I'm 90% sure it works because it loads the path as expected and that's the same format I used in my gamelist implementation for the images.
* Made game lists much more functional than it was.
* mainwindow
* gamelist
* Please forgive me for I have lambdaed.
Added the ability to toggle showing columns via a context menu.
* Fix GameList further
* sort by name on init fixed
* Created the baseline refactoring. I'm going to start working on the callbacks now. May need to implement other classes in the process. Fun stuff, I know.
* adds InstallPkg (tested) and InstallPup (should work but makes unknown shenanigans) implementation
adds RefreshGameList
obliterates 10sec Refresh
* messages
* Rpcs3 gs frame (#16)
* Messing with project settings try to get trails of cold steel to boot.bluh
Definitely one change is needed in linker settings for RPCS3 to not crash immediately.
Can't even see how horribly botched my implementation of GSFrame is because we aren't booting lol. Something is gone awry with elf.
* remove random ! not that it matters much right now
* minor additions
* "Working" with debug mode though you have to ignore an assert reached from Qt. Qt is upset that the rsx thread is calling stuff on the UI thread despite not owning it. However, I can't do a thing to change that atm. (The fix would be to do what the TODO says in System.cpp-- making gsframe and stuff get initialized via system call)
Crashes due to needing pad callback to be done.
* With this build in debug mode, Trails of Cold steel will get FPS. (caveat. You have to ignore when Qt throws a debug assert lol)
* Fix release mode. Fix the Qt debug assert by using ancient occault rituals. I want to be able to remove the blocking connects but it won't work right now without it. It isn't perfect but it's good enough for now IMO.
* Add enters to the end of files.
* Removing target and setting source of events to be the application instead of the main window. The main window isn't the game window, and I don't really know what widget will be targetted for the game event. Works, though, it's admittedly probably not optimal by ANY means.
* Fix comment.
* Fix libpng wit zlib.
* Move Qt GUI into RPCS3Qt. (#17)
Restore wx GUI.
* fix install-progressdialogs randomly not showing
* install-progressdialog cosmetics
* add stylesheet file loading
* apply request
* Add stylesheet to git ignore.
* XInput..
* Joystick...
* Rpcs3 qt small fixes (#20)
* Small fixes. Have emulator stop when x button is pressed on game window. Have emulator/application stop when the main window is closed.
* If I forget another new line ending for a file.............................................
* Add CgDisasm (#21)
* fix install-progressdialogs randomly not showing
* install-progressdialog cosmetics
* add stylesheet file loading
* apply request
* add CgDisasm
add code to disable contextmenu options
fix gamelist issue
* missing proj changes
* Add ability to open stylesheets from menu.
* Mega searcher (#23)
* add MemoryStringSearcher
set minimum Sizes for mainwindow and CgDisasm
* minor fixes
* Since the system.cpp callbacks for emulator state were unused, I removed them. Then, I replaced them with callbacks for the Gui.
* added stylesheet options
setfocus on settings fixed
newline added
* added signals and slots for EmuRun and EmuStop
* update ui
update ui now works
added callback onReady
added EnableMenues
added ps3 commands
* added restart logic to menu
* newline
* event header removed
* Added graphic settings class. (#26)
* Added graphic settings class. First thing is to have the dock widgets and window size/location be stateful. Minor bug with debugger frame changing size on hide/show on default setup on second load. But, otherwise, fine. Also, the GUI doesn't update to accomodate the statefulness of the widgets. But, that'll come in time as I update this class.
* Add view debugger, logger, gamelist to settings and synchronize them.
* Separate initializing actions from connects
* Add invisible fullscreen cursor and double click event.
* Add the UI log settings.
* Add MemoryViewer (#30)
* Add Memoryviewer
Image Button crashes/not fully implemented
focus on some button annoying
minor changes for question dialogs
* GuiSettings Refactoring (#31)
* Add settings for columns shown and which one is saved
* I accidentally refactored the settings class. Added ability to reset to default GUI. Added statefulness to column widths.
* add gui tab
* Fix logging at startup.
* Preset settings.I think I ironed out MOST of the glitches. Will work on the rest of it soon. Should be a lot simpler as I won't have to use the so-called meta settings. Also, renamed all settings methods to CapitalCase.
* Removed dock widget controls.
* Added style sheets. Removed the option from the menu.
* Rewrite to use folder design. Much simpler! Yay! Simpler. Better, right?
* It's remarkable how tricky this is.
* Added convenience button to open up the settings folder in explorer
* Add newlines at end of file
* simplified logic. Fixed a bug.. hopefully not more bugs
* Fix the undocumented feature
* Make the dialog big enough to have entire text on title shown. If talkashie changes the font to size 1203482 I don't care lol
* Make warning messagebox instead of changing the title of the dialog.
* marking...
* Hcorion suggested changes.
* [WIP] autopause (#32)
* autopause added
needs fixing
headers do not show text
* fix compile stuff
* Add MsgDialog + edge widgets (#33)
* Add MsgDialog
needs magic
* add "Debugger" Buttons to menubar
* Adapt ds4 changes. I'm not sure if they work as I don't have a compatible controller. But, at the same time, it's kind of silly all I had to do was remove stdguiafx to get compilation.
* [Ready] Add KernelExplorer (#36)
* KernelExplorer added
* Fix build. Connect mainwindow to show explorer.
* qstr formatting added
hid header, fixed button size
* Taskbar Progress for install PUP/PKG (#37)
* Add Taskbar Progress for both PKG and PUP installer
* fix missing ifdefs for windows
* add mainwindow icon + thumbnail toolbar
* add game specific icons to the GSFrame
* fix icon crash
* fix appIcon's aspect ratio in SetAppIconFromPath
* Fix black borders in RGB32 icons
* rename thumbar related buttons
* EmuSettings (#35)
* Core tab done minus doing the library list.
* Graphics tab.
* Audio tab
* Input tab
* Added the other tabs
* LLE part one-- load existing libraries sorted. (I'd finish it but I'm going to look at a PR by mega)
* add search and add other libraries that aren't checked.
* Finish adding lle selecting things.
* marking my territory (#38)
fixed settingsdialog glitch and width
added groupbox to gui buttons
removed parents from layouts
* add debuggerframe + RSXDebugger (#34)
* Add Debuggerframe
* add RSXDebugger
* add RSXDebugger fo real
* RSXDebugger improved
minor adjustments
* add utf8 conversions like neko told me to
hopefully i did not utf8-ise too many things xD
* fix some variables
* maybe fix image buffers in RSXDebugger
* fixed image view (pretty sure)
* fixed image buffer (hopefully)
* QT Opengl frame (#41)
* fix RSX Debugger headers (#40)
* fix some debugger layout issues
fix RSX Debugger headers + some comments
* add kd-11's SPU options
fix D3D12 showing on non-compatible systems
tidy up coretab
* improve D3D12 behaviour in graphicstab:
adapter selection and D3D12 render won't show on non-compatible systems
add monospace font to cgDisasm
* enable update only on visibility
* Rpcs3 qt llvm build (#42)
* LLVM pushed so mega can test
* probably is what is needed with Release LLVM
* should probably have RPCS3-Qt be using release-llvm
* include zlib the same way.
* don't talk to me about how I made this happen.
* I applied the magical treatment to debug mode too. Though, it's entirely probably that doing it once in LLVM-release mode made this entirely redundant
* hack
* progress bar for LLVM spawns but doesn't close yet.
* fix msgDialog (#43)
fix oskDialog
* Minor bug fixzz
* fix osk and msgdialog for real (#44)
* fix msgDialog
fix oskDialog
* fix OskDialog part 2
fix MsgDialog part 2
* This bug is evil, and it should be ashamed of itself.
* Refactor YAML. Commented out gui options that aren't added to config yet (add em back later when we merge that in)
* Fix pad stuff.
* add SaveDataUtility (#45)
* add SaveDataUtility
* fix slots
* fix slots again
fix lists not showing stuff
fix dialogs not showing
add colClicked
refactor stuff and polish some layouts
* add SaveDataDialog.h and SaveDataDialog.cpp
* tidy up mainwindow
* add callback
* fix RegisterEditor (#47)
* fix RegisterEditor
* fix other dialogs' immortality (gasp...vampires)
* remove debug leftovers
* fix InstructionEditor (#46)
* fix InstructionEditor
* fix typo
* Fix MouseClickEvents in RSXDebugger (#50)
* Fix MouseClickEvents in RSXDebugger
Fix focus on MemoryViewer and RSXDebugger
Adjust PadButtonWidth
* fix another comment
* fix debuggerframe events (#49)
* Fix pad settings bro (#48)
* Fix pad settings bro
* fix comment
* Icons and Menu-Additions (#39)
* Add Icons and iconlogic to cornerWidget and actions
* add cornerWidget toggle
fix dockwidget action state on start
remove DoSettings
* fix game removal bug
remove tableitem focus rectangle
therefore add TableItemDelegate.h
* remove grid and focus rectangle from autopausedialog
* add fullscreen checkbox to misctab
minor padsettings layout improvements
* Add show category submenu to view menu
Add gamelist filter accordingly
fix minor bug where play icon was displayed despite pause label
add boolean b_fullscreen to mainwindow for later use in GSFrame
* fix headers in autopausesettings
fix remove bug in autopausesettings
add delete keypressevent in autopausesettings
fix missing tr() and minor refactoring in gamelist
* add default Icons for play/pause/stop/restart
* Fix fullscreen start. Some stuff was wrong with settings, just trust me.
* remove fullscreen leftovers and fix merge
* SPU stuff. (There was also a weird thing with config.h in GLGSFrame.h with an include that I removed to fix build)
* please neko's lambda fetishes (#53)
* please neko's lambda fetish in mainwindow
* please neko's lambda fetish in gamelistframe
* please neko's lambda fetish in logframe
* fix neko's lambda fetish in debuggerframe
* pleasefixdofetishsomething in Autopausesettingsdialog
* fix sth sth lambda in cg disasm
* lambda stuff in instructioneditor
* lambda kernelexplorer
* lambda-ise memoryviewer
* lambda rsxdebugger
* lambda savedatautil
this could be done even more, but the functions are not implemented
* Rpcs3 qt fixes -- shadow taskbar bug (#52)
* SShadow's bug of taskbar progress staying fixed on cancelling pkg install.
* other taskbar
* i'm still a baka
* Fix a warning
* qtQt refactoring (#54)
* fix neko's snake fetish
* File names should match headers. Are these the names I want? Not necessarily. But, this is much less confusing.
* i thought I committed everything with stage all.........................
* remove unused utilities
* The most important commit of them all.
* Disable legacy opengl buffers when not using opengl.
* fix code review comment
* Quick crash patch. Neko removed autopause. SO, I remove it too from emusettings/misc tab
* Merge lovely things from master (#55)
* Configuration simplified
* untrivial parts of the merge
* no need for these options anymore
* Minor change to fix column widths at startup (not sure why it doesn't work already, but adding the true makes it work so......... whatever)
* here ya go
* FIx hitting okay in settings causing graphics to messup (#57)
* fixes + msgdialog taskbarprogress (#56)
* fix ok button in taskbar
add taskicon progressbar for msgdialog
add tablewidgetitem to rsxdebugger
fix comments in save_data_utility.cpp
* fix d3d adapter default
* fix taskicon progressbar not being destroyed properly
* add last_path to filedialogs
* fix msgdialog crash on ok (#58)
* fix thread stopping in debbugerFrame (#59)
* Move Emu.init to be first. This will fix the qt stuff seeming to ignore the virtual filesystem in the config. (VFS to be made soon maybe) (#60)
* Fix full screen opening on double RIGHT click.
* fix other instances of double click ...
* Fix locaiton of gui config. (#61)
* fix d3d bug (#62)
* fix d3d bug
* small utf8 addition
* Fix cmake for qt (#64)
* Initial CMake fix
* Fix compilation with GCC
* Get rid of awful hack
* Update cotire with qt support
* Maybe fix travis
* Emergency Hack Relief Program Activated
* Fix travis build (#65)
* make about dialog great again (#67)
and add previous additions
* Fix library sort / smart gamelist context menu (#63)
* fix library sort
* add Title to custom game config dialog
* disable options on gamelist context menu
* use namespace for category Strings
* introduce sstr
* fix some tr nonsense
* Rpcs3qt Appveyor (#68)
Fix appyveyor build!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
* possible fix for gamelist icons (#69)
add warning for appicon
* Fix clang build (#66)
Hcorion, the build savior.
* Rpcs3 qt resources (#70)
* Resource files attempt 1
* Autorcc should probably be on?
* forgot the most important file lol
* Forgot an instance of the icon in the code...
* Patch fix for clang build.
* vulkan/d3d12 combobox merge (#71)
* add vulkan adapterbox and merge with d3d12 box
* fix adapter text on other renderer
* gather render strings
* attempt fix on gamelist row height
* adjust adapter behaviour to new guideline
* Compiler of Peace.
* High critical hit rate.
* Mugi eating strawberries is savage.
* Apply KD-11 Hotfix (#73)
* Most of Ani Adjusts (#72)
* Most of the adjustments are made here.
* fix gamelist rowheight
* fix msg dialog layout and disable_cancel
* cleanup
* fix disable cancle again
* fix debuggerframe buttons and doubleclick
* Add a fun little bonus feature :) (#74)
* category filters simplyfied (#75)
* Cleaning up cmake a bit.
* fixezzzzzz (#76)
* upgrade Info Boxes
* upgrade file explorer
* refactor GetSettings and SetSettings
* second refactoring
* cleanup
* travis is a grammar nazi
* second travis shenanigans
* third travis weirdo thingy
* travis 4 mega fun
* travis 5 default to def
* finish refactoring for settings
fix gamelist headers
* hotfix msgdialog and infobox (#77)
* msgdialog fix 1
* fix zombie infobox
* Rpcs3 Qt Welcome Page (#78)
* Add a welcome dialog.
* Add enter to end of file
* i'm an idiot
* last mistake i hope
* sponsored via --> funded by
* RPCS3 does not condone piracy.
* Mega Adjusts
* Ani Adjustments and a few refactorings
* Yay
* Add Gamelist Icon Sizes (#79)
* Reverting Mega's suggestion. If people can use alt-f4 to get around this dialog, they can probably use an emulator too.
* Fix firmware file choice dialog in QT GUI (#80)
* ani adjusts 2 + minor icon size simplifications (#81)
FPS Additions
* Update Travis to Qt 5.9 (#82)
* Update VKFragmentProgram.cpp
added missing exponent parameter
* fixed misplaced exponent in VKFragmentProgram.cpp
parameter that belonged to pow() was being passed to exp() instead, causing the shader compilation to fail
* fix for opengl fog_mode exponential2
same fix as the vulkan version
* directx fog_mode exponential2 fix
misplaced parameter
* directx fog_mode exponential2_abs fix
* vulkan fog_mode exponential2_abs fix
* opengl fog_mode exponential2 fix
vk: More surface fixes and debug stuff
vk: Crude thread sync implementation to prevent cb desync crashes due to resource usage
fix build
more fixes
vulkan: Do not flush command queue if address cannot be flushed
vk: More fixes for accuracy. Needs optimizations
vk: Batch all flush-to-buffer operations in the non-critical path
- More work is needed to make queue submission asynchronous
* Added checksum check to TROPHY.TRP loader
* Implemented sceNpTrophyGetGameProgress, sceNpTrophyGetGameIcon & sceNpTrophyGetTrophyIcon
* Updates to up to date APIs and tiny changes
* Code style fixes for checksum verifier, and another fix for trophy functions
* Format fix
Improve scaling and separate sampler state from texture state
gl: Unify all texture cache objects under one structure separate by use case
gl: Texture cache fixes
- Acquire lock when finding matching textures
- Account for swizzled surfaces when deciding whether to cpu memcpy
- Handle swizzled images on the GPU
Before
! LDR: **** cellSpurs export: [0x31F5196B] at 0x13ab56c
After
! LDR: **** cellSpurs export: [cellSpursRemoveSystemWorkloadForUtility] at 0x13ab56c
* cellSave fix plus bugfixes
allows allocation of last byte in memory block
prevents rpcs3 from crashing when closing non existent socket
* Fix overflow
* add more socket options
fix typo
prevent sys_net from operating on nullptr sockets
Motorstorm Apocalypse calls for cellMediatorGetSignatureLength,
cellMediatorCreateContext, cellMediatorGetProviderUrl,
cellMediatorGetStatus
LittleBigPlanet 2 and 3 may call for 0x37E1F502 (unknown name) on
cellCrossController
Resistance 3 and Uncharted 2 may call for the functions registered
on cellSysutilNpEula
cellSysmodule: Register 0xF044 module (cellSysutilNpEula)
Found by debugging Uncharted 2 Demo (NPEA90055)
Helps in all games that register sys module configuration 'multi-player'
cellSysmodule: Register 0x0054 module (libmedi)
Found on Motorstorm Apocalypse [NPEA00315] (thanks Zangetsu for the log)
cellSysmodule: Register 0x005C module (cellCrossController)
Found on LittleBigPlanet 2 [BCES00850] (thanks Zangetsu for the log)
- Add parameters to cellOvisInitializeOverlayTable, cellOvisFixSpuSegments and cellOvisInvalidateOverlappedSegments functions
- Modify return type for cellOvisFixSpuSegments and cellOvisInvalidateOverlappedSegments functions
- Replace UNIMPLEMENTED_FUNC by cellOvis.todo
Changed "fmt::throw_exception("Unimplemented" HERE); "
into:
"UNIMPLEMENTED_FUNC(cellAvconfExt); "
"return CELL_OK;"
Allow NPEB01283 to go further in boot (pass the intro and reaches the menu)...
* Handle directory creation in cellGameDataCheckCreate2
Stops some games from displaying information about not enough memory on
hdd
* Returning CELL_OK causes some games to loop on sceNp functions
for "shaman magic"
* cellGameDataCheckCreate2 added param.sfo creating/rewriting
* fix fs::file null
and one readability change
* For debugging purposes
When fs::file problem is located will be improved
* Fixed wrong operators
* Conversion from vfs to fs
Should take care of fs::null
* Cleanup
removed some unnecessary logging
* Fix successive function calls
second call was always ending in error since it didn't create the conent
permission
* Changes according to Neko's review
* Change to use u32 value
* PPUInterpreter: Fix undefined behavior of rol8 and rol16 with inline assembly
* PPUInterpreter: Fix undefined behavior of rol32 and rol64
* PPUInterpreter: Change left rotate functions to inline functions and move to types.h
cellDiscGameGetBootDiscInfo is called by non-disc games for some reason.
That wasn't accounted for and therefore it would try to read PARAM.SFO
from an unmounted path and throw an access violation.
Tested with NBA Live 08 Demo NPUB90029, probably fixes similar games as
well
* sceNp: Fix ExitSpawn and ExitSpawn2
Fixes sceNpDrmProcessExitSpawn and sceNpDrmProcessExitSpawn2
functions
The problem was that first argument klicensee was missing, therefore
shifting every other argument out of place and throwing an access
violation at the end.
* Use npDrmIsAvailable on sceNpDrmProcessExitSpawn
Tries to decrypt DRM file with provided klicensee
* Implement sceNpDrmVerifyUpgradeLicense
Implements sceNpDrmVerifyUpgradeLicense / sceNpDrmVerifyUpgradeLicense2