Instead of plain waiting while equal to some value,
it can be something like less, or greater, or even bitcount.
But it's a draft and untested. Hopefully doesn't break anything.
Hashtable increased and flatten, tree-alike extensions removed.
Some things simplified, so it can actually decrease perf a bit.
But most platforms shouldn't be affected.
Removed limit of 56 waiters per pointer.
Real limit now is about 65535.
Fix desired locking operation (to fix "sudo" memory).
It was discovered that some systems have outdated configuration.
With too tight limit, it's almost impossible to lock anything in memory.
Move bits to the highest, set RWX order.
Use only one reserved value (W = locked).
Assume lock size 128 for range_locked.
Add new "Size" template argument that replaces normal argument.
Fix logic error in callstacks handling code, always set first to false after first iteration.
Add explicit check for zero return addresses. Current code validity checks may not check for it properly when it sits on interrupt handler entry point (which may contain valid code).
Do not allow 0x3FFF0 to be a back chain address because it needs space for LR save area, only 0x3FFE0 and below satisfy this criteria.
Arbitrary maximum set to 8, but really we need 2, maybe 3.
Added atomic_wait::list object for multi-waiting.
Added atomic_wait::get_unique_tsc just in case.
Only compare masks for overlap for second overload (with mask provided).
Explicit "new value" can be provided in new 3-arg overloads.
Also rename atomic_storage_futex -> atomic_wait_engine.
Add pointer comparison to notifiers (to prevent spurious wakeups).
Fix a bug with a possible double notification in raw_notify().
Fix a bug with incorrect allocatin bit slots for cond_handle.
Add a semaphore counter to track max allowed number of threads.
Use #define for some constants to STRINGIZE them in errors.
Add some error messages when certain limits are reached.
Fix a bug with a wrong check simply throwing std::abort.
Use "special" notify_all patch with batch processing for every arch.
Fix Win7 bug who no one probably noticed.
Reduced static memory amount for waitable atomics.
Allow notifier to skip notifications if wait/notify masks don't overlap.
Improve raw_notify to wake up the thread by its id, add thread_id arg.
Add optional mask argument to notify_one() and notify_all().