tapasco issues
https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues
2019-01-15T16:48:33Z

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/4
OOC: Fix problem with multiple top-levels
2019-01-15T16:48:33Z
Jens Korinth

OOC has problems if top-levels in the synthesis files are ambiguous. Try to find a better inspection to find the _right_ top-level.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/7
Make new tutorial video series
2017-12-28T14:41:30Z
Jens Korinth

Start new video tutorials: Overview, Installation, Example, Detail How-Tos.
Ideas for series:
* [ ] Overview
* [ ] APIs
* [ ] HLS C/C++ Kernel (rot13?)
* [ ] Compose
* [ ] Design Space Exploration
* [ ] Multi-Platform Support
* [ ] VC709
* [ ] ZC706
* [ ] ZedBoard
* [ ] PyNQ
* [ ] Command Line
* [ ] Helper Tools

Jens Korinth

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/9
Regression Test for Release
2018-07-01T14:56:28Z
Jens Korinth

Need to define a standard test that can be executed on each `Platform` to check functionality. For simplicity, restrict it to a single bitstream.
Write script to perform all steps in batch?
* [ ] bitstream: counter, arraysum, arrayinit, arrayupdate (fill w/counter to max)
* [ ] status core: read all data, check bitstream
* [ ] transfer data to and from, check
* [ ] test interrupts via counter
* [ ] run arraysum
* [ ] run arrayinit
* [ ] run arrayupdate
* [ ] run arraysum MT
* [ ] run arrayinit MT
* [ ] run arrayupdate MT

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/11
Implement Software Environment singleton
2017-12-28T14:43:21Z
Jens Korinth

Need a central place for all external software tools and versions. E.g., Vivado, Vivado HLS, FlexLM, etc.
Implement methods to report software environment (and compute versions only once at start).

Jens Korinth

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/14
Features as Json
2017-07-31T14:30:28Z
Jens Korinth

Externalize Features as Json; automatically derive GUI + parsers (pure Map approach).
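A minimal sketch of how a parser could be derived from such a descriptor, assuming the parameter kinds used in this issue's JSON example (`Bool`, `String`, `Int`, `Range`, `Enum`). The names `Kind`, `Param` and `parseParam` are hypothetical and not existing TaPaSCo API, and treating `From`/`To` as inclusive bounds is an assumption:

```scala
import scala.util.Try

// Hypothetical model of the parameter kinds that appear in the JSON descriptor.
sealed trait Kind
case object BoolKind extends Kind
case object StringKind extends Kind
case object IntKind extends Kind
final case class RangeKind(from: Int, to: Int) extends Kind // bounds assumed inclusive
final case class EnumKind(values: Seq[String]) extends Kind

final case class Param(name: String, kind: Kind)

object FeatureParams {
  // Parse a raw user-supplied string according to the parameter's kind.
  def parseParam(p: Param, raw: String): Either[String, Any] = p.kind match {
    case BoolKind =>
      raw.toLowerCase match {
        case "true"  => Right(true)
        case "false" => Right(false)
        case _       => Left(s"${p.name}: '$raw' is not a Bool")
      }
    case StringKind => Right(raw)
    case IntKind =>
      Try(raw.toInt).toOption.toRight(s"${p.name}: '$raw' is not an Int")
    case RangeKind(from, to) =>
      Try(raw.toInt).toOption
        .filter(v => v >= from && v <= to)
        .toRight(s"${p.name}: '$raw' is not in [$from, $to]")
    case EnumKind(values) =>
      if (values.contains(raw)) Right(raw)
      else Left(s"${p.name}: '$raw' is not one of ${values.mkString(", ")}")
  }
}
```

With this shape, a GUI widget could be chosen by the same pattern match (checkbox for `BoolKind`, drop-down for `EnumKind`, and so on), so parser and GUI stay in sync with the descriptor.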
```
{
  "Name": "SomeFeature",
  "Description": "...",
  "Bit": 4, // use in TPC status core
  "Parameters": [
    {
      "Name": "BoolParam",
      "Kind": "Bool"
    },
    {
      "Name": "StringParam",
      "Kind": "String"
    },
    {
      "Name": "IntParam",
      "Kind": "Int"
    },
    {
      "Name": "Size",
      "Kind": {
        "Kind": "Range",
        "From": 1024,
        "To": 10240
      }
    },
    {
      "Name": "Mode",
      "Kind": {
        "Kind": "Enum",
        "Values": ["A", "B", "C"]
      }
    }
  ]
}
```

Jens Korinth

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/15
HLS: Move Synthesizer Implementation into kernel description
2017-05-10T09:12:06Z
Jens Korinth

The `kernel.description` / `kernel.json` should contain the HLS implementation; the source files determine which `HighLevelSynthesizer` must be used.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/18
Example code: Remove globals
2017-12-28T14:41:22Z
Jens Korinth

TPC Examples contain global TPC objects, which inspires bad code (see EVO use cases). Remove and introduce proper handling.

Jens Korinth

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/19
Cleanup: Kernels, Examples
2018-07-01T15:08:25Z
Jens Korinth

* [x] remove broken kernels
* [x] check that all kernels work in HLS
* [ ] check that all kernels have example programs
* [ ] check that example programs *actually work*
* [x] move through all directories, sift files

2018.2

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/20
Slurm HLS: Store evaluation temporaries in accessible directory
2019-01-15T16:48:40Z
Jens Korinth

Evaluation log and files are stored in `/tmp`, which is fine on a local system. In SLURM mode this means that the log file cannot be tracked in iTPC in many cases, since `/tmp` is node-local. Check if it is possible to move to a location that is at least accessible across the workgroup.

Jens Korinth

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/27
Tcl: Unified address map generation
2019-01-15T16:50:37Z
Jens Korinth

Use common code to instantiate address maps across all platforms. Make it re-usable for future `Architecture`s.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/28
Asynchronous Memory Transfers
2019-01-15T16:51:39Z
Jens Korinth

Similar to asynchronous job launches: Check if asynchronous memory transfers could be useful. I guess probably not so much, because we need to wait for the transfers to finish before we can launch the job in any case - worst/ideal case would be that the job starts immediately and data must be available.
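The ordering constraint described above can be sketched with Scala `Future`s. `copyToDevice` and `launchJob` are hypothetical stand-ins, not TaPaSCo API calls; the sketch only shows that the launch depends on every transfer, so asynchronous transfers help little unless the host has other work to overlap with them:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object AsyncTransferSketch {
  // Hypothetical stand-in for a host-to-device copy; yields the number of bytes copied.
  def copyToDevice(data: Array[Byte]): Future[Int] = Future { data.length }

  // Hypothetical stand-in for a job launch; needs all arguments on the device.
  def launchJob(bytesOnDevice: Int): Future[String] =
    Future { s"job ran on $bytesOnDevice bytes" }

  def run(args: Seq[Array[Byte]]): String = {
    // All transfers are started concurrently ...
    val transfers = Future.sequence(args.map(copyToDevice))
    // ... but the launch is chained behind all of them, which acts as a barrier.
    val job = transfers.flatMap(sizes => launchJob(sizes.sum))
    Await.result(job, 5.seconds)
  }
}
```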
It would be possible to add mem barriers based on the job struct, but I do not think this would be worth the effort.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/29
Tcl: Writes[T]
2019-01-15T16:53:41Z
Jens Korinth

Implement Tcl serialization support like Json: Define package `tcl` with `Writes[T]`. Default types should include `Writes[(String, Int)]` (which writes `set name 42`) and similar.
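A minimal sketch of what such default instances might look like. The trait and the `Tcl` object are repeated from this issue's interface so that the example compiles on its own; the instance names `pairIntWrites` and `pairStringWrites` are hypothetical, and the string case assumes that brace quoting is sufficient (braces inside the value are not escaped):

```scala
trait Writes[T] {
  def writes(t: T): String
}

object Tcl {
  def toTcl[T](t: T)(implicit w: Writes[T]): String = w.writes(t)
}

object TclDefaults {
  // A (name, value) pair becomes a Tcl variable assignment.
  implicit val pairIntWrites: Writes[(String, Int)] = new Writes[(String, Int)] {
    def writes(t: (String, Int)): String = s"set ${t._1} ${t._2}"
  }

  // String values are wrapped in braces so that whitespace survives.
  implicit val pairStringWrites: Writes[(String, String)] = new Writes[(String, String)] {
    def writes(t: (String, String)): String = s"set ${t._1} {${t._2}}"
  }
}
```

After `import TclDefaults._`, the call `Tcl.toTcl(("name", 42))` yields `set name 42`, matching the example in the issue text.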
_Interface_
```
trait Writes[T] {
  def writes(t: T): String
}
object Tcl {
  def toTcl[T](t: T)(implicit w: Writes[T]): String = w.writes(t)
}
```

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/30
Implement HLS kernels
2017-05-10T10:45:22Z
Jens Korinth

* [x] Countdown / Timer
* [x] arrayinit
* [x] arrayadd
* [x] arrayupdate
* [ ] varrayinit
* [ ] varrayadd
* [ ] varraysum
* [ ] vectoradddot
* [ ] sobel
* [ ] mandelbrot
* [ ] sudoku
* [x] rot13
* [ ] warraw (dep collision check)
* [ ] fir
* [ ] n-queens

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/31
Implement lock-free hash table in libtpc/libplatform
2019-01-15T16:55:01Z
Jens Korinth

https://github.com/mintomic/samples
Could be used to remove the fixed number of slots in the overall `Architecture`.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/33
Add "keep all runs" to `DesignSpaceExplorationJob`
2019-01-22T12:27:35Z
Jens Korinth

Might be useful for debugging.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/34
Improved "Replay" Mode for DSE log
2019-01-22T12:29:10Z
Jens Korinth

`tapasco-logviewer` can already display logs, but it would be nice to have a _timeline_ that allows the events to be played back step by step.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/35
Tcl: Extract Subsystems
2017-05-10T10:52:49Z
Jens Korinth

Have subsystem Tcl packages for generic subsystems, e.g., IRQ, Memory, PCIe. Each should create a pre-wired cell.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/36
Infrastructure: Tapasco Status Core
2019-01-22T15:51:46Z
Jens Korinth

Upgrade TPC Status Core to incorporate performance counters. Extend libtpc to gather statistics after the run, possibly writing to a file using an environment variable.
* [x] Version Register: Vivado
* [x] Version Register: TaPaSCo
* [ ] PerfCounter: # of IRQs/slot
* [ ] PerfCounter: busy cycles/slot
* [ ] PerfCounter: IRQ cycles (waiting for ACK)/slot

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/37
Architecture per ThreadUnit
2017-05-10T10:56:13Z
Jens Korinth

Extend TPC to use one Architecture per ThreadUnit; this allows combining different Architectures in one bitstream.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/38
Feature: OCM Memory
2017-05-10T10:56:50Z
Jens Korinth

Implement an optional `Feature` to map OCM memory into address space. Parameters: size + offset

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/39
Partial Reconfiguration: Speed up synthesis
2017-05-10T11:07:13Z
Jens Korinth

Synthesis of the MIG core and PCIe takes a very long time, a significant share of the overall time for a run. Idea: Use a pre-synthesized design, ideally even placed and routed, for the Platform, and only add the dynamic ThreadPool afterwards.
This approach could be achieved by wiring the Platform first, defining a general address mapping. Then synthesize, place and route the entire design and remove the Threadpool cell afterwards, replacing it with a black box. This design checkpoint could then be loaded and only the Threadpool would be added.
Since Design Checkpoints (DCPs) are specific to each Vivado version, there should be a command to initialize the DCPs once for each Platform.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/40
Feature: BRAM
2019-01-22T15:51:13Z
Jens Korinth

Implement a `Feature` to generate a chunk of BRAM and map it into address space for small allocations. Parameters: size + offset

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/41
Fix FlexLM status queries
2019-01-22T15:51:30Z
Jens Korinth

* [ ] Fix/check licence requirements of `Task` instances
* [ ] Fix/check `lmstat` support

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/47
LKM: man pages
2019-01-22T15:53:04Z
Jens Korinth

Write `man` pages for device driver files.

Jens Korinth

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/49
Tcl: Plugins passing arguments can lead to errors
2018-07-01T15:07:35Z
Jens Korinth

Check what the argument-passing mechanism is good for; depending on the plugin order it can lead to problems in consecutive calls.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/64
DSE: Abort runs after PlacerErrors
2019-01-22T15:54:09Z
Jens Korinth

When a run results in a `PlacerError` it is extremely unlikely that any run with the same (or a larger) `Composition` will succeed. Runs are already pruned after the batch finishes, but it could be useful to be even more aggressive and abort runs in the current batch if they would be pruned anyway. This could speed up batches and increase convergence speed.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/66
Improve LogTrackingPanel
2019-01-22T15:56:36Z
Jens Korinth

The `LogTrackingPanel` is not yet as useful as it could be. Ideas:
* [ ] highlight lines from different files in different colors
* [ ] prepend logfile name (probably unreadable)
* [ ] implement search field for free text / regex searching
* [ ] enable quick filters: ERROR, CRITICAL, WARNING

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/83
Boot: Replace Xilinx Root FS
2019-01-22T15:57:40Z
Jens Korinth

The rootfs is currently repurposed from the official PyNQ image (publicly available). This has been convenient, but in the long term it would be preferable to build a custom rootfs from scratch with less baggage. Replace it with an Ubuntu rootfs, or buildroot.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/84
Automate installation
2017-06-01T12:20:38Z
Jens Korinth

Platform requires some additional bring-up, e.g., customization and installation of udev rules. Should have a script that automates the process and can be run once as `sudo`.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/93
BlueDMA support in ZC706
2019-06-26T06:28:06Z
Jens Korinth

ZC706 could benefit from a DMA engine feature that allows using the on-board DDR banks. Port BlueDMA to Zynq and implement a Platform `Feature` for it.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/97
Improve SLURM job names
2019-01-22T15:59:24Z
Jens Korinth

SLURM compose jobs currently have names like `compose-0xd4c6a941-axi4mm-pynq-180.00`. Use new naming scheme instead to show full configuration. Possibly also change the comment to the working directory.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/98
Core Import: Add Synthesis and PnR parameters
2019-01-22T16:00:31Z
Jens Korinth

It would be useful to be able to control the parameters of synthesis and implementation directly from TaPaSCo. Maybe we should define modes, e.g.,
* **fastest** - lowest effort, minimal runtime
* **fastest** - lowest effort, minimal runtime
* **fast** - slightly slower, but still short runtime
* **normal** - default options
* **optimal** - slower, get as close to _real_ values as possible
* **aggressive_performance** - maximal optimization for performance
* **aggressive_area** - maximal optimization for area

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/103
Make synthesis and implementation effort configurable
2019-01-22T16:05:11Z
Jaco Hofmann

The default settings used at the moment are AlternateRoutability + Retiming for Synthesis and Explore + PHYS_OPT_DESIGN for Implementation. These settings could be considered to be very high effort. A switch could be added to let the user decide between different "effort levels". For most synthesis runs it is not necessary to go with very high effort and the user might be happy about the much lower run-time.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/106
TPC-Debug: Monitor Device Registers looks for INTC0
2019-01-22T16:04:27Z
Jaco Hofmann

With the changes to MSIx there is only one Interrupt Controller left, and that one does not expose any status registers right now, so the red warning "INTC0: 0xffffffff" might be misleading.

Jens Korinth

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/107
[VC709] Seperate addresses of different memory regions
2019-07-09T11:03:05Z
Jaco Hofmann

Currently TPC has different devices at the same address depending on the viewpoint. For example, the TPC configuration registers start at 0x0, which is visible from the host. The on-board DDR memory is also located at 0x0 but only visible to the DMA engine and the PEs. It might be advisable to split these memory regions. A new address map could look like
| Address | Device |
| --- | --- |
| 0x0001000000000000 | MIG |
| 0x0002000000000000 | Configuration |
| 0x0003000000000000 | PEs |
etc. Accordingly, Configuration and PEs would be separated into different BARs.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/113
Rewrite Getting Started Guides
2018-07-01T14:58:21Z
Jens Korinth

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/124
Evaluate 64 bit for platform_addr_t
2020-04-03T17:49:05Z
Jaco Hofmann

All platforms except legacy Zynq (e.g., the PCIe-based systems and MPSoC) use addresses wider than 32 bit. While we currently get by with smaller addresses, this might change in the future and we should consider a move to 64 bit addresses.
I currently don't see any problem with just changing the address width. The Zynq platform should continue to work with the required casts and all other platforms currently cast to 64 bit addresses anyway.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/127
[ZCU102] Evaluate is very slow
2019-10-16T13:46:04Z
Jaco Hofmann

When running evaluate on a small core (~6000 LUTs) the process takes about 3 to 4 minutes for the 7-Series devices. When run on the ZCU102 Zynq Ultrascale+ device the same process requires over an hour.
For example, Phase 3 Initial Routing requires more than 40 minutes instead of about a minute for the other platforms.
Testing was done with Vivado 2016.4.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/129
Fallback option of requested amount of MSI-X interrupts is not available
2019-10-23T11:56:06Z
Jaco Hofmann

The driver currently simply fails if the OS is not able/willing to provide the requested number of interrupts.
There should be a fallback option that gets enabled automatically if the requested number of interrupts cannot be provided.

Jaco Hofmann

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/137
TaPaSCo is stuck after all jobs finished in verbose mode
2019-10-30T15:34:08Z
Jens Korinth

When verbose-mode is activated (`-v`) and logs are tracked, the `MultiFileWatcher`s prevent TaPaSCo from exiting normally. Check that all watchers are properly terminated after their corresponding job has ended.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/144
Support PE-local memories in HLS
2019-01-22T16:12:26Z
Jens Korinth

Use new PE-local memory support to enable a new kind of HLS port pattern: `localmem`. A Tcl script should automatically wrap the PE with BRAM and also make the BRAM accessible via secondary S-AXI. Using the new PE-local memories, it should be possible to use BRAMs for HLS-based kernels, e.g., AES.

Jens Korinth

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/154
Allow direct view of the device memory on PCIe
2019-01-22T16:15:25Z
Jaco Hofmann

This can be implemented by using a sliding window and a second BAR. The Xilinx Core does not support this feature directly, though. Will use a little Bluespec Module that has one configuration register for the address offset and forwards the requests accordingly.

Jaco Hofmann

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/160
Local memory slots not considered in area estimation, causing DSE to fail
2019-12-18T09:54:11Z
Jens Korinth

If a PE has local memories (or more than one slave interface, for that matter), DSE will still try to build more instances than will fit in the current 128 slots limit. There are several possible solutions:
1. Have separate enumeration for memory slots (affects status core, `platform_info` and potentially requires a more sophisticated way to determine accessibility for each PE).
2. Fix the algorithms to account for each slave interface instead of just assuming one.
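Option 2 can be illustrated with a small sketch. It assumes that every slave interface of a PE instance occupies one slot, takes the 128-slot limit from the text above, and uses hypothetical names (`Pe`, `slotsNeeded`, `fits`) that are not part of TaPaSCo:

```scala
object SlotBudget {
  // Current slot limit mentioned in the issue.
  val MaxSlots = 128

  // A PE type with its number of slave interfaces (control plus local memories).
  final case class Pe(name: String, slaveInterfaces: Int)

  // Slots needed by a composition given as (PE, instance count) pairs,
  // counting every slave interface instead of assuming one per instance.
  def slotsNeeded(composition: Seq[(Pe, Int)]): Int =
    composition.map { case (pe, count) => pe.slaveInterfaces * count }.sum

  def fits(composition: Seq[(Pe, Int)]): Boolean =
    slotsNeeded(composition) <= MaxSlots
}
```

Under these assumptions, a PE with one control interface and one local memory needs two slots per instance, so 64 instances fit into the budget while 65 do not.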
Need to think about it some more; I guess each PE will always have exactly _one_ control slave interface. We could require a naming convention to identify it if more than one candidate is present on a PE, e.g., `S_AXI_CTRL` or similar. All other slave interfaces could be assigned a base address from a different pool, e.g., using the upper 64 base addresses already reserved for platform addresses. But we'd have to come up with some O(k) or at least O(n) scheme to find the base addresses of all slaves on a PE. :thinking:

2018.2

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/161
Allow PE Masters to have any valid AXI Data Width
2019-01-22T16:23:47Z
Jaco Hofmann

The data width of PE masters is currently limited to either 32 or 64 bit. Considering that most platforms outside of Zynq have much broader memory controllers, it is beneficial to support all valid AXI Data Widths up to 1024 bits. This might also be relevant for Zynq platforms if the designer of a PE wants to keep their logic simple and rely on data width converters to interface with the memories correctly.

Jens Korinth

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/162
LED feature on VC709 crashes Vivado
2020-03-04T22:47:08Z
Jens Korinth

Enabling the LED feature on VC709 compositions reproducibly crashes Vivado.
While this is certainly a Vivado bug, we should investigate a workaround.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/163
Implement tapasco_load_bitstream* functions
2019-01-22T16:25:12Z
Jens Korinth

Since its inception, the TaPaSCo/TPC API had two functions to load a new bitstream at runtime. This is meant to support complex use cases where an application switches between multiple bitstreams optimized for the specific stage of computation. This is arguably a useful thing and reasonably simple to implement on Zynq (given appropriate permissions on `/dev/xdevcfg`).
Is there a way to implement similar support on PCIe devices with reasonable effort? I suppose it would involve an ICAP as a platform component; however, I'm not sure if this works with non-partial bitstreams.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/164
Properly unregister the driver if a device is removed
2019-01-22T16:25:45Z
Jaco Hofmann

Fix errors occurring because the tlkm driver does not react properly to device remove requests.

2018.2
Jaco Hofmann

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/165
AXI Interconnect does not handle AXI4 -> AXI4 Lite properly for small transfers
2020-03-18T17:12:05Z
Jaco Hofmann

It seems like the AXI interconnect does not handle protocol conversion from AXI4 to AXI4-Lite properly and ignores the strb signal on reads. Accordingly, whenever a request that is larger than the AXI4-Lite slave data width comes in, e.g., through PCIe, it will result in superfluous transactions. That's not a big deal for writes as the strb signal is set properly. However, for reads there is no such signal in AXI4-Lite, and if the read has some effect on the state of the device it will result in hard-to-debug problems.
This is known to Xilinx but seems to be wont-fix: https://forums.xilinx.com/t5/Embedded-Development-Tools/AXI4-gt-AXI-Lite-wstrb-behavior/td-p/645535

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/166
Add PE to interrupt mapping in Status Core
2019-01-22T16:27:20Z
Jaco Hofmann

Interrupts are currently mapped iteratively to the corresponding interrupt line. To increase flexibility the status core can store the mapping used.
Advantages are flexible mappings that enable the use of more than one interrupt per PE.

https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/167
Interrupts go missing sometimes
2020-04-03T11:43:05Z
Jaco Hofmann

The PCIe MSIx interrupts coming from the DMA engine are received properly by the interrupt controller. The interrupt controller properly issues an AXI write request to the correct address in host memory. The PCIe AXI bridge does ACK the transfer and Bresp is OKAY. However, sometimes the interrupts do not reach the host for some reason. This can be confirmed by checking `/proc/interrupts`.
This might be related to the interrupt controller taking too long. However, the DMA interrupt simply increases a value and schedules the userspace. This should not take too long.
Another possibility is that the PCIe bridge loses data when it is under heavy pressure.
For now, as a quick fix, I will try to disable an interrupt whenever it has just fired and see if that fixes the problem at the cost of latency. If that doesn't help, it may be possible to remove the protocol converters between the interrupt handler and the PCIe bridge to avoid problems with those.
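
The /proc/interrupts check mentioned above can be scripted so that counts before and after a run are easy to compare. The following is a minimal sketch, assuming the usual Linux layout (one header line listing the online CPUs, then one line per interrupt with per-CPU counts followed by a description). The helper name `irq_counts`, the sample text, and the assumption that the relevant lines carry the name `tlkm` are illustrative, not taken from the driver.

```python
def irq_counts(text, name):
    """Sum the per-CPU counts of every interrupt whose description contains name."""
    lines = text.splitlines()
    ncpus = len(lines[0].split())  # header line: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        description = " ".join(fields[ncpus:])
        if name not in description:
            continue
        counts[label.strip()] = sum(int(f) for f in fields[:ncpus])
    return counts


# Made-up sample in the format of /proc/interrupts (values are not real data).
SAMPLE = """\
           CPU0       CPU1
 41:        120         30   PCI-MSI 524288-edge      tlkm
 42:          7          0   PCI-MSI 524289-edge      tlkm
NMI:          0          0   Non-maskable interrupts
ERR:          0
"""

print(irq_counts(SAMPLE, "tlkm"))
```

On a real system, pass the contents of `/proc/interrupts` instead of `SAMPLE`, take one snapshot before and one after a known number of DMA transfers, and compare the difference with the number of interrupts expected.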
Overall, there is no clear indication of what might be going wrong as long as we don't have the hardware to debug directly on the PCIe bus.https://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/169Investigate Logic Utilization reports2020-04-03T12:24:05ZCarsten HeinzInvestigate Logic Utilization reportsIt seems that the utilization report does not make sense for BRAM in the user logic. Sometimes utilization for user logic is higher than for the complete system logic.It seems that the utilization report does not make sense for BRAM in the user logic. Sometimes utilization for user logic is higher than for the complete system logic.Carsten HeinzCarsten Heinzhttps://git.esa.informatik.tu-darmstadt.de/tapasco/tapasco/-/issues/174Compose fails after HLS runs2019-01-22T16:31:55ZJaco HofmannCompose fails after HLS runsSometimes a compose job fails after successful HLS runs with the following error:
```bash
[16:22:41 <pool-1-thread-2: ImportTask> INFO] Import of 'arrayinit_axi4mm.zip' with target axi4mm@vc709
[16:22:41 <pool-1-thread-2: Import$> INFO] SynthesisReport for arrayinit not found, starting evaluation ...
[16:22:41 <pool-1-thread-2: EvaluateIP$> INFO] starting evaluation of /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arrayinit/axi4mm/vc709/ipcore/arrayinit_axi4mm.zip for xc7vx690tffg1761-2@1000,000 MHz, output in /tmp/372075065893313964/evaluate.log
[16:30:38 <pool-1-thread-2: EvaluateIP$> INFO] evaluation of /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arrayinit/axi4mm/vc709/ipcore/arrayinit_axi4mm.zip for xc7vx690tffg1761-2@1000,000 MHz finished successfully, report in /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arrayinit/axi4mm/vc709/ipcore/arrayinit_export.xml
[16:30:38 <pool-1-thread-3: VivadoHighLevelSynthesis$> INFO] starting run 'arraysum' for axi4mm@vc709: output in /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arraysum/axi4mm/vc709/hls/axi4mm.log
[16:31:16 <pool-1-thread-3: VivadoHighLevelSynthesis$> INFO] Vivado HLS finished successfully for 'arraysum' for axi4mm@vc709
[16:31:16 <main: HighLevelSynthesis$> INFO] all HLS tasks have finished.
[16:31:16 <main: HighLevelSynthesis$> WARN] executed HLS with co-sim for [Kernel @/home/wimi/jah/projects/tapasco/tapasco_2018.2/kernel/arraysum/kernel.json]
Name = arraysum
TopFunction = arraysum
Version = 1.0
Files = /home/wimi/jah/projects/tapasco/tapasco_2018.2/kernel/arraysum/arraysum.c
TestbenchFiles = /home/wimi/jah/projects/tapasco/tapasco_2018.2/kernel/arraysum/arraysum-tb.c
CompilerFlags =
TestbenchCompilerFlags =
Args = arr by reference
OtherDirectives = None, but no co-simulation report was found
[16:31:16 <pool-1-thread-2: ImportTask> INFO] Import of 'arraysum_axi4mm.zip' with target axi4mm@vc709
[16:31:16 <pool-1-thread-2: Import$> INFO] SynthesisReport for arraysum not found, starting evaluation ...
[16:31:16 <pool-1-thread-2: EvaluateIP$> INFO] starting evaluation of /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arraysum/axi4mm/vc709/ipcore/arraysum_axi4mm.zip for xc7vx690tffg1761-2@1000,000 MHz, output in /tmp/9791558545089762559/evaluate.log
[16:39:08 <pool-1-thread-2: EvaluateIP$> INFO] evaluation of /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arraysum/axi4mm/vc709/ipcore/arraysum_axi4mm.zip for xc7vx690tffg1761-2@1000,000 MHz finished successfully, report in /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arraysum/axi4mm/vc709/ipcore/arraysum_export.xml
[16:39:08 <main: Compose$> INFO] all HLS tasks finished successfully, beginning compose run...
[16:39:08 <pool-1-thread-4: ComposeTask> ERROR] java.lang.Exception: could not find all required cores for target axi4mm@vc709, missing: arrayinit, arraysum
```
Lukas SommerLukas Sommer