Interrupts go missing sometimes
The PCIe MSI-X interrupts coming from the DMA engine are received properly by the interrupt controller. The interrupt controller then issues an AXI write request to the correct address in host memory. The PCIe AXI bridge acknowledges the transfer and BRESP is OKAY. However, some interrupts never reach the host. This can be confirmed by checking /proc/interrupts.
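To make the /proc/interrupts check repeatable, the counters can be compared before and after a test run. The following is only a minimal sketch: the function names, the IRQ number and the expected count (for example, the number of DMA transfers issued) are assumptions for illustration and not part of the actual setup.

```python
def parse_interrupts(text):
    """Return {irq_label: count summed over all CPUs} from /proc/interrupts text."""
    lines = text.splitlines()
    if not lines:
        return {}
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not an "<irq>: counts..." line
        counts = []
        for field in rest.split()[:n_cpus]:
            if not field.isdigit():
                break  # reached the description columns
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def missing_interrupts(before, after, irq, expected):
    """Expected interrupts minus those actually counted between two snapshots."""
    received = after.get(irq, 0) - before.get(irq, 0)
    return expected - received
```

In practice the two snapshots would come from reading /proc/interrupts before and after the run, for example with `parse_interrupts(open("/proc/interrupts").read())`. A positive result from `missing_interrupts` means fewer interrupts arrived at the host than the hardware should have sent.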
This might be related to interrupt handling taking too long. However, the DMA interrupt handler only increments a value and schedules the userspace process, which should not take long.
Another possibility is that the PCIe bridge loses data when it is under heavy load.
For now, as a quick fix, I will try temporarily disabling an interrupt right after it has fired and see if that fixes the problem, at the cost of latency. If that doesn't help, it may be possible to remove the protocol converters between the interrupt controller and the PCIe bridge to rule them out as a cause.
Overall, there is no clear indication of what is going wrong, and there is unlikely to be one as long as we don't have the hardware to debug directly on the PCIe bus.