Asynchronous Job Launches
The asynchronous interface still needs to be implemented; Zynq will benefit most. Basic idea:
- On launch, enter the job id into a map from job id to slot id (and/or vice versa), then return immediately.
- At some point, user space calls wait on the job, which blocks on a pthread condition variable tied to the state in the job struct.
- Kernel-space interrupt handler adds the slot id to a queue (lock-free?).
- User-space management thread reads from a device file that is connected to the queue; each read pops an element (use OS mechanisms to support reading multiple elements at once, at least 128).
- User-space management thread updates the state in the job struct and signals the condition, which wakes the waiting user-space thread; normal handling resumes.
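The user-space side of these steps could look roughly like the sketch below. Everything in it is an assumption made for illustration: the names `job_t`, `jobs_init`, `job_launch`, `job_wait`, `jobs_complete` and `mgmt_drain` are hypothetical, the slot id is used directly as a table index in place of a real map, and the device file is assumed to deliver a packed array of `uint32_t` slot ids per read.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define MAX_SLOTS 128   /* assumed slot count, also used as read batch size */

typedef enum { JOB_FREE, JOB_RUNNING, JOB_FINISHED } job_state_t;

typedef struct {
    uint32_t        job_id;
    job_state_t     state;
    pthread_mutex_t lock;
    pthread_cond_t  done;   /* signalled when state becomes JOB_FINISHED */
} job_t;

/* Indexed by slot id, so slot id -> job is a plain table lookup. */
static job_t jobs[MAX_SLOTS];

void jobs_init(void) {
    for (size_t i = 0; i < MAX_SLOTS; ++i) {
        jobs[i].job_id = 0;
        jobs[i].state = JOB_FREE;
        pthread_mutex_init(&jobs[i].lock, NULL);
        pthread_cond_init(&jobs[i].done, NULL);
    }
}

/* Launch: record the mapping, mark the job running, return immediately. */
int job_launch(uint32_t job_id, uint32_t slot_id) {
    if (slot_id >= MAX_SLOTS) return -1;
    job_t *j = &jobs[slot_id];
    int rc = -1;
    pthread_mutex_lock(&j->lock);
    if (j->state == JOB_FREE) {
        j->job_id = job_id;
        j->state = JOB_RUNNING;
        rc = 0;             /* the hardware would be started here */
    }
    pthread_mutex_unlock(&j->lock);
    return rc;
}

/* Wait: block on the condition until the job is finished, then free the slot. */
int job_wait(uint32_t slot_id) {
    if (slot_id >= MAX_SLOTS) return -1;
    job_t *j = &jobs[slot_id];
    pthread_mutex_lock(&j->lock);
    if (j->state == JOB_FREE) {         /* nothing was launched in this slot */
        pthread_mutex_unlock(&j->lock);
        return -1;
    }
    while (j->state != JOB_FINISHED)    /* loop guards against spurious wakeups */
        pthread_cond_wait(&j->done, &j->lock);
    j->state = JOB_FREE;
    pthread_mutex_unlock(&j->lock);
    return 0;
}

/* Management thread: mark a batch of slots finished and wake their waiters. */
void jobs_complete(const uint32_t *slot_ids, size_t n) {
    assert(slot_ids != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (slot_ids[i] >= MAX_SLOTS) continue;   /* ignore bogus ids */
        job_t *j = &jobs[slot_ids[i]];
        pthread_mutex_lock(&j->lock);
        if (j->state == JOB_RUNNING) {
            j->state = JOB_FINISHED;
            pthread_cond_broadcast(&j->done);
        }
        pthread_mutex_unlock(&j->lock);
    }
}

/* One read from the (assumed) device file pops up to MAX_SLOTS slot ids. */
ssize_t mgmt_drain(int fd) {
    uint32_t ids[MAX_SLOTS];
    ssize_t n = read(fd, ids, sizeof ids);
    if (n <= 0) return n;
    size_t count = (size_t)n / sizeof ids[0];
    jobs_complete(ids, count);
    return (ssize_t)count;
}
```

The management thread would simply call `mgmt_drain` in a loop on the open device file. A per-job mutex and condition keep waiters on different jobs from contending with each other, at the cost of some memory per slot.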
This should give a substantial performance boost, since launches no longer block until the job completes, especially if we can provide lock-free implementations of the queue(s).
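As a starting point for the lock-free part, a single-producer/single-consumer ring buffer is probably the simplest option. The sketch below uses C11 atomics in user-space style to show the idea; it is not kernel code, and `slot_queue_t`, `slot_queue_push`, `slot_queue_pop_many` and the capacity of 256 are hypothetical choices. It assumes exactly one producer (the interrupt handler) and one consumer (the read path of the device file).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAP 256u  /* assumed capacity; must be a power of two */
static_assert((QUEUE_CAP & (QUEUE_CAP - 1u)) == 0, "capacity must be a power of two");

typedef struct {
    uint32_t         buf[QUEUE_CAP];
    _Atomic uint32_t head;  /* total pushes so far; written only by the producer */
    _Atomic uint32_t tail;  /* total pops so far; written only by the consumer */
} slot_queue_t;

/* Producer side: returns false if the queue is full. */
bool slot_queue_push(slot_queue_t *q, uint32_t slot_id) {
    uint32_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head - tail == QUEUE_CAP)
        return false;
    q->buf[head % QUEUE_CAP] = slot_id;
    /* Release: the element is written before the new head becomes visible. */
    atomic_store_explicit(&q->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side: pops up to max elements in one call, returns the count. */
size_t slot_queue_pop_many(slot_queue_t *q, uint32_t *out, size_t max) {
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    size_t n = (size_t)(head - tail);   /* unsigned wraparound is intended */
    if (n > max)
        n = max;
    for (size_t i = 0; i < n; ++i)
        out[i] = q->buf[(tail + (uint32_t)i) % QUEUE_CAP];
    /* Release: the slots are read before they are handed back to the producer. */
    atomic_store_explicit(&q->tail, tail + (uint32_t)n, memory_order_release);
    return n;
}
```

Popping in batches maps directly onto the "at least 128 per read" requirement: the read handler can call `slot_queue_pop_many` once with `max = 128` and copy the result to user space. With more than one producer or consumer this scheme is no longer safe and a different design would be needed.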