In some very unscientific benchmark:
spu_thread::do_dma_transfer() was taking 2.27% of my CPU before, now
0.07%, while __memmove_avx_unaligned_erms() was taking 1.47% and now
2.88%, which added makes about 0.8% saved.
I saw stvx v128{0} (aligned 16-bytes store) usage on the first 16-bytes of CellSaveDataCBResult before funcStat in fw.
Also I saw 4 stw of u32{0} on it as well before funcFile, funcFixed and funcList.
So just add the resets for result before all callbacks, and make all
members of savedata_ontext 16 -byte aligned in case there are more
members guaranteed to be aligned.
* Always memset 0 setBuf->buf (to bufSize) before funcStat if the direcory is not new.
* Always memset 0 setBuf->buf (to bufSize) if listGet->dirNum became non-zero (listGet->dirListNum can be zero yet memset still occurs) .
* Clear traces of setList setup before funcStat (after funcFixed/List, only if listGet->dirNum != 0, callback can hack this value and prevent the memset).
Fixes a corner case viewing unaligned memory at the end of spu memory.
Also unaligned view isn't suitable for the debugger, for these purposes the memory viewer should be used instead.
- Use native FMA to emulate VMADDFP, with a fallback for processors that don't support FMA
- Use native FMA to emulate VNMSUBFP as well, but note that it differs from the emulated path with regards to negative zero
FAudio is preferred, but if we didn't build with FAudio, we default to OpenAL
Also changed it to explicitly use xaudio backend on windows, rather than the static cast we had before.
'id' is not the idm id, also explicitly join the thread so a situation where the thread is still active and communicating other threads (e.g via MMIO or MFC) yet its ID is removed won't happen. (logic breakage, destroyed thread can't be active)
- Fixes a data leak that can happen when a surface is rejected due to aspect mismatch.
- Mismatch can lead to rejection due to area covered excluding the RTT and inevitable upload a texture from CPU at the same location.
- Overlapping fbo/shader_read resources are not allowed.