Run a separate thread that collects profiling data from SPU threads, and use that data to prioritize compilation of hot SPU blocks. Implement the stx::init_mutex::wait_for_initialized() helper.
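The prioritization idea can be illustrated with a minimal sketch. It assumes the profiler thread simply counts how often each block is observed; `sample_counts`, `record_sample` and `compile_order` are hypothetical names for illustration and are not taken from the actual change.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sample table: block start address -> number of times the
// profiler thread observed an SPU thread executing inside that block.
using sample_counts = std::unordered_map<std::uint32_t, std::uint64_t>;

// Record one observation of a block (assumed to be called by the profiler thread).
inline void record_sample(sample_counts& counts, std::uint32_t block_addr)
{
    ++counts[block_addr];
}

// Return block addresses ordered hottest first.
// Ties are broken by lower address so the order is deterministic.
inline std::vector<std::uint32_t> compile_order(const sample_counts& counts)
{
    std::vector<std::pair<std::uint32_t, std::uint64_t>> entries(counts.begin(), counts.end());

    std::sort(entries.begin(), entries.end(), [](const auto& a, const auto& b)
    {
        if (a.second != b.second)
            return a.second > b.second; // more samples first
        return a.first < b.first;       // then lower address
    });

    std::vector<std::uint32_t> order;
    order.reserve(entries.size());
    for (const auto& entry : entries)
        order.push_back(entry.first);
    return order;
}
```

A compiler queue could then consume the result of `compile_order()` front to back, so the most frequently sampled blocks are compiled first.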
Pseudo-mutex to protect initialization and finalization
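As a rough illustration of what waiting for initialization can look like, here is a minimal sketch built on std::mutex and std::condition_variable. `init_guard_sketch` and its members are hypothetical stand-ins; this is not the stx::init_mutex implementation, which may use a different mechanism.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for an initialization guard (NOT stx::init_mutex).
class init_guard_sketch
{
    mutable std::mutex m_mutex;
    mutable std::condition_variable m_cond;
    bool m_ready = false;

public:
    // Mark initialization as complete and wake all waiters.
    // Returns false if the object was already initialized.
    bool init()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            if (m_ready)
                return false;
            m_ready = true;
        }
        m_cond.notify_all();
        return true;
    }

    // Mark the object as finalized (no longer initialized).
    // Returns false if it was not initialized.
    bool reset()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (!m_ready)
            return false;
        m_ready = false;
        return true;
    }

    // Block the calling thread until init() has been called.
    void wait_for_initialized() const
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cond.wait(lock, [this] { return m_ready; });
    }

    bool is_initialized() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_ready;
    }
};
```

In this sketch a consumer thread would call `wait_for_initialized()` before touching the guarded resource, while the owner calls `init()` once setup is done and `reset()` at shutdown.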