1. 23 Mar, 2022 1 commit
  2. 14 Mar, 2022 1 commit
  3. 25 Feb, 2022 1 commit
  4. 21 Feb, 2022 7 commits
  5. 11 Feb, 2022 1 commit
  6. 08 Feb, 2022 7 commits
  7. 19 Jan, 2022 1 commit
  8. 10 Jan, 2022 3 commits
  9. 05 Jan, 2022 3 commits
  10. 20 Dec, 2021 1 commit
  11. 17 Dec, 2021 1 commit
  12. 13 Dec, 2021 1 commit
  13. 06 Dec, 2021 1 commit
  14. 02 Dec, 2021 1 commit
  15. 01 Dec, 2021 1 commit
  16. 30 Nov, 2021 7 commits
    • Add include for version check · a9f339d1
      Torben authored
    • Move #if to exclude imports on older kernels · 0740ed64
      Torben authored
    • Fix typo in doc and readmes · 50d9a33d
      Torben authored
    • Add SVM docu and examples · 164bdf4d
      Torben authored
      Add an extra documentation article for the SVM feature in the documentation folder.
      Add new example software for the SVM feature for Rust and C++. Both programs contain four different tests using the arraysum, arrayinit and/or arrayupdate HLS example kernels. They demonstrate how to use the different page migration types provided with this feature.
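      The kinds of checks these example tests perform can be sketched in host-side Rust. Note these are illustrative software equivalents of the arraysum and arrayinit HLS kernels, not the kernels themselves; the exact init pattern is an assumption.

```rust
// Illustrative software reference for the example tests: arraysum sums an
// array, arrayinit fills one with a pattern (an incrementing sequence is
// assumed here -- the real HLS kernel may differ).
fn arraysum(data: &[i32]) -> i64 {
    data.iter().map(|&x| i64::from(x)).sum()
}

fn arrayinit(data: &mut [i32]) {
    for (i, x) in data.iter_mut().enumerate() {
        *x = i as i32;
    }
}

fn main() {
    let mut data = vec![0i32; 256];
    arrayinit(&mut data);
    // Sum of 0..=255 is 255 * 256 / 2 = 32640.
    assert_eq!(arraysum(&data), 32640);
    println!("arraysum = {}", arraysum(&data));
}
```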
    • Return all buffers after PE release · 1ec629c4
      Torben authored
      If SVM is enabled, 'pe.release()' returns all buffers that have been passed to 'pe.start()' as DataTransferAlloc parameters, no matter whether from_device has been set to 'true' or not. This way the user program can regain ownership of all buffers.
      See the SVM pipeline example for a use case.
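      The ownership contract described above can be sketched as follows. This is a minimal stand-in, not the actual TaPaSCo API: the `Pe` and `DataTransferAlloc` definitions here are assumptions modeled on the description.

```rust
// Hypothetical sketch of the SVM buffer-ownership contract: `Pe` and
// `DataTransferAlloc` mirror the runtime's names but are assumptions here.
struct DataTransferAlloc {
    buffer: Box<[u8]>,
    from_device: bool, // with SVM enabled, this flag no longer decides the return
}

struct Pe {
    pending: Vec<DataTransferAlloc>,
}

impl Pe {
    fn start(&mut self, transfers: Vec<DataTransferAlloc>) {
        // The runtime takes ownership of all buffers for the duration of the job.
        self.pending = transfers;
    }

    fn release(&mut self) -> Vec<Box<[u8]>> {
        // With SVM, every buffer is handed back regardless of `from_device`,
        // so the caller regains ownership of all of them.
        self.pending.drain(..).map(|t| t.buffer).collect()
    }
}

fn main() {
    let mut pe = Pe { pending: Vec::new() };
    let to_dev = DataTransferAlloc { buffer: vec![1u8; 8].into_boxed_slice(), from_device: false };
    let from_dev = DataTransferAlloc { buffer: vec![2u8; 8].into_boxed_slice(), from_device: true };
    pe.start(vec![to_dev, from_dev]);
    let buffers = pe.release();
    // Both buffers come back, not just the `from_device` one.
    assert_eq!(buffers.len(), 2);
    println!("regained {} buffers", buffers.len());
}
```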
    • Add VirtualAddress param · 3b68f500
      Torben authored
      Add VirtualAddress as an additional PEParameter. It is used to pass a virtual pointer to the PE without initiating a migration when using the SVM feature.
      Before the argument is passed to the PE, the runtime checks whether the loaded bitstream supports SVM.
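      The distinction this parameter introduces could be sketched like this. The variant names are assumptions modeled on the description above, not the runtime's actual definitions.

```rust
// Hedged sketch: a `PEParameter`-style enum distinguishing the two ways an
// address can reach the PE under SVM. Names are illustrative assumptions.
enum PEParameter {
    // Ownership-transferring buffer: the runtime may arrange a migration.
    DataTransferAlloc(Box<[u8]>),
    // Plain virtual pointer: passed through untouched, no migration started;
    // the on-FPGA IOMMU faults the pages in on demand instead.
    VirtualAddress(u64),
}

fn encode_argument(param: &PEParameter) -> u64 {
    match param {
        // For a buffer, the runtime would hand over its virtual base address
        // after arranging the migration; here we just take the pointer.
        PEParameter::DataTransferAlloc(buf) => buf.as_ptr() as u64,
        // A VirtualAddress is written to the PE argument register as-is.
        PEParameter::VirtualAddress(va) => *va,
    }
}

fn main() {
    let data = vec![0u8; 4096].into_boxed_slice();
    let va = data.as_ptr() as u64;
    let arg = encode_argument(&PEParameter::VirtualAddress(va));
    assert_eq!(arg, va);
    println!("PE argument register value: {:#x}", arg);
}
```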
    • Switch to write lock and add second try · bdd749c1
      Torben authored
      Hold the mmap_write_lock() instead of mmap_read_lock() for host-to-device migrations to prevent the CPU and device page fault handlers from running at the same time. Under concurrent page accesses causing a ping-pong effect, migrations to the device failed, and most accelerators do not handle the returned SLVERRs correctly.
      If the migration of pages fails for ODPMs in host-to-device direction, try a second time (this may happen if a race for a zero-page between device and host occurs).
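      Both fixes can be illustrated with a user-space analogy: a std RwLock stands in for the kernel's mmap lock (mmap_write_lock has no user-space equivalent), and a stand-in migration function loses the first attempt to a simulated zero-page race.

```rust
use std::sync::RwLock;

// User-space analogy, not driver code: the RwLock plays the role of the
// kernel's mmap lock.
fn migrate_to_device(attempt: u32) -> Result<(), &'static str> {
    // Stand-in for the real page migration; assume the first try can lose a
    // race for a zero-page between device and host.
    if attempt == 0 { Err("raced on zero-page") } else { Ok(()) }
}

fn handle_device_fault(mmap_lock: &RwLock<()>) -> Result<(), &'static str> {
    // Taking the *write* lock excludes the CPU fault handler entirely, so the
    // two handlers can no longer ping-pong the same pages.
    let _guard = mmap_lock.write().unwrap();
    // First attempt, and one retry if it fails.
    migrate_to_device(0).or_else(|_| migrate_to_device(1))
}

fn main() {
    let mmap_lock = RwLock::new(());
    assert!(handle_device_fault(&mmap_lock).is_ok());
    println!("migration succeeded after retry");
}
```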
  17. 25 Nov, 2021 2 commits
    • Add SVM support · d61fd2e2
      Torben authored
      This commit adds support for Shared Virtual Memory (SVM) between host and FPGA with physical page migrations. This means the on-FPGA accelerator uses the same virtual address space as the user program running on the host CPU. Required memory pages are migrated automatically or user-managed between host and device memory.
      On the hardware side, an on-FPGA IOMMU is placed between the user IPs and the memory controller to perform the virtual-to-physical address translations. It has multiple TLBs and issues page faults to the host on TLB misses.
      The additional PageDMA core has a very simple structure since it only copies whole pages. It can also be used just to clear pages in device memory. Interrupts are only issued if enabled.
      The physical migration of memory pages using the HMM API requires many steps to be done in kernel space (e.g. collecting/allocating/freeing pages, updating TLBs and page tables). Also, the memory allocator in the Rust runtime does not allow freeing formerly allocated memory sections in smaller fragments. To avoid context switches between user and kernel space, and to have a clear separation between SVM and conventional memory management, the SVM management is done completely in the device driver.
      Runtime and driver detect whether the bitstream uses SVM by searching for the on-FPGA IOMMU.
      The driver implements two types of page migrations: On-demand Page Migrations (ODPMs) which are triggered by a device IOMMU or CPU page fault, and User-Managed Page Migrations (UMPMs) which are triggered from the user space application.
      Device page faults are handled in a worker thread of the concurrency managed workqueue (cmwq). The IOMMU and driver allow handling multiple page faults in flight for higher efficiency. However, CPU page faults must be handled one by one.
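      The two fault-handling policies can be sketched with a user-space analogy: one thread per in-flight device fault (standing in for the cmwq worker), and a lock serializing CPU faults. All names here are illustrative, not driver code.

```rust
use std::sync::Mutex;
use std::thread;

fn main() {
    // Several device page faults handled in flight, one worker each -- a
    // user-space analogy for the cmwq worker described above.
    let device_workers: Vec<_> = (0..4)
        .map(|fault_id| {
            thread::spawn(move || {
                // ... resolve device fault `fault_id`: migrate pages to device ...
                fault_id
            })
        })
        .collect();
    let resolved: Vec<i32> = device_workers
        .into_iter()
        .map(|h| h.join().unwrap())
        .collect();
    assert_eq!(resolved.len(), 4);

    // CPU page faults are serialized: each one is handled while holding the
    // lock, so only one is in progress at a time.
    let cpu_faults_handled = Mutex::new(0u32);
    for _ in 0..3 {
        let mut handled = cpu_faults_handled.lock().unwrap();
        // ... resolve one CPU fault: migrate pages back to host ...
        *handled += 1;
    } // guard drops at the end of each iteration
    assert_eq!(*cpu_faults_handled.lock().unwrap(), 3);
    println!("device faults: {}, cpu faults: 3", resolved.len());
}
```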
      Device page structs and the corresponding device memory regions are allocated separately to guarantee physically contiguous memory sections in device memory. The physical address of a page is saved in the zone_device_data field of its struct page.
      The user space runtime has been changed as little as possible, and the overall workflow does not change. The allocator is exchanged with a dummy that does nothing. The DMA module forwards copy commands to the driver via ioctl(). This way the user can use the well-known WrappedPointer to trigger UMPMs. It is important that the base address of the array is nonetheless passed as a virtual address.
      If the user does not want to use UMPMs, they pass the virtual base address of the array (cast to uint64_t/u64) as a single argument to the accelerator, and the IOMMU will issue page faults to trigger the corresponding ODPMs at runtime. This way, no changes to the API are required at all!
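      The "no API change" path amounts to a single cast on the user's side. In this sketch, `launch_pe` is a hypothetical stand-in for the runtime call; a real PE would receive the value in its argument register and the IOMMU would fault the pages in.

```rust
// Hypothetical stand-in for launching a PE with a raw virtual address; a real
// accelerator would read the pages via the on-FPGA IOMMU, which faults them
// in on demand (ODPMs). Here we just echo the argument back.
fn launch_pe(arg: u64) -> u64 {
    arg
}

fn main() {
    let array = vec![1u64, 2, 3, 4];
    // Cast the virtual base address to u64 -- the only "special" step needed.
    let base = array.as_ptr() as u64;
    let received = launch_pe(base);
    assert_eq!(received, base);
    println!("PE received virtual address {:#x}", received);
}
```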
      Since not all Linux kernels are compatible with our SVM implementation (at least 5.10.x is required, and the CONFIG_DEVICE_PRIVATE flag must be set), an additional flag for 'tapasco-build-libs' has been introduced to enable SVM support.