Innova-2 Flex XCKU15P XDMA PCIe DDR4 GPIO Demo

This is a Vivado 2023.2 starter project for the XCKU15P FPGA on the Innova-2 Flex SmartNIC MNV303212A-ADLT. It is a superset of the innova2_xdma_demo project that adds DDR4.

Refer to the innova2_flex_xcku15p_notes project for instructions on setting up an Innova-2 system with all drivers including Xilinx's PCIe XDMA Drivers.

Refer to this tutorial for detailed instructions on generating a similar project from scratch.

There is also a version for the 4GB MNV303212A-ADIT/MNV303212A-ADAT variant available.

Block Design

Block Design Diagram

AXI Addresses

| Block | Address (Hex) | Size |
| --- | --- | --- |
| M_AXI BRAM_CTRL_0 | 0x80000000 | 2M |
| M_AXI GPIO_0 | 0x70100000 | 64K |
| M_AXI GPIO_1 | 0x70110000 | 64K |
| M_AXI GPIO_2 | 0x70120000 | 64K |
| M_AXI GPIO_4 | 0x70140000 | 64K |
| M_AXI DDR4_0 | 0x200000000 | 8G |
| M_AXI DDR4_CTRL_0 | 0x70200000 | 1M |
| M_AXI_LITE BRAM_CTRL_1 | 0x00080000 | 8K |
| M_AXI_LITE GPIO_3 | 0x00090000 | 64K |

AXI Addresses

Program the Design into the XCKU15P Configuration Memory

Refer to the innova2_flex_xcku15p_notes project's instructions on Loading a User Image. Binary Memory Configuration Bitstream Files are included in this project's Releases.

wget https://github.com/mwrnd/innova2_8gb_adlt_xdma_ddr4_demo/releases/download/v0.2/innova2_8gb_adlt_xdma_ddr4_demo_bitstream.zip
unzip innova2_8gb_adlt_xdma_ddr4_demo_bitstream.zip
md5sum *bin
echo fd2fe52b344f46725ca083e4a108b6f8 should be md5sum of innova2_8gb_adlt_xdma_ddr4_demo_primary.bin
echo cfe2edd5c91cb6d7f41d00969b0041be should be md5sum of innova2_8gb_adlt_xdma_ddr4_demo_secondary.bin
echo 90055e5b1a28b98dea2f6703b68040fd should be md5sum of xdma_wrapper.bit
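The manual comparison above can also be scripted. The sketch below assumes md5sum and awk are available; verify_md5 is a hypothetical helper name and is not part of this project.

```shell
#!/bin/sh
# verify_md5 is a hypothetical helper: compare a file's MD5 against an expected value.
# $1 = expected MD5 hash, $2 = file to check
verify_md5() {
    expected="$1"
    file="$2"
    # md5sum prints "<hash>  <filename>"; keep only the hash field.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK   $file"
    else
        echo "FAIL $file (got $actual)"
        return 1
    fi
}

# Usage with the checksums listed above, run in the directory with the unzipped files:
# verify_md5 fd2fe52b344f46725ca083e4a108b6f8 innova2_8gb_adlt_xdma_ddr4_demo_primary.bin
# verify_md5 cfe2edd5c91cb6d7f41d00969b0041be innova2_8gb_adlt_xdma_ddr4_demo_secondary.bin
# verify_md5 90055e5b1a28b98dea2f6703b68040fd xdma_wrapper.bit
```

The helper returns a non-zero status on a mismatch, so it can gate a later programming step.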

Testing the Design

lspci

After programming the bitstream and rebooting, the design should show up as Memory controller: Xilinx Corporation Device 9038. It shows up at PCIe Bus Address 03:00 for me.

lspci -d 10ee:

lspci Xilinx Devices

The following lspci commands list all Mellanox and Xilinx devices and show their relation.

lspci -nn | grep "Mellanox\|Xilinx"
lspci -tv | grep "0000\|Mellanox\|Xilinx"

lspci Xilinx and Mellanox Devices

The FPGA is attached to PCIe Bridge 02:08.0. The two Ethernet Controllers are attached to a separate PCIe Bridge, 02:10.0.

01:00.0 PCI bridge [0604]: Mellanox Technologies MT28800 Family [ConnectX-5 PCIe Bridge] [15b3:1974]
02:08.0 PCI bridge [0604]: Mellanox Technologies MT28800 Family [ConnectX-5 PCIe Bridge] [15b3:1974]
02:10.0 PCI bridge [0604]: Mellanox Technologies MT28800 Family [ConnectX-5 PCIe Bridge] [15b3:1974]
03:00.0 Memory controller [0580]: Xilinx Corporation Device [10ee:9038]
04:00.0 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
04:00.1 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]

-[0]-+-00.0 Intel Corporation Device 3e0f
     +-1d.0-[01-04]----00.0-[02-04]--+-08.0-[03]----00.0 Xilinx Corporation Device 9038
     |                               \-10.0-[04]--+-00.0 Mellanox Technologies MT27800 Family [ConnectX-5]
     |                                            \-00.1 Mellanox Technologies MT27800 Family [ConnectX-5]

The current PCIe Link status shows the negotiated link speed and width. Note this is the link between the FPGA and the ConnectX-5 PCIe Bridge.

sudo lspci -nnvd 10ee:  ;  sudo lspci -nnvvd 10ee: | grep Lnk

lspci Link Status

dmesg | grep -i xdma provides details on how Xilinx's PCIe XDMA driver has loaded.

dmesg xdma

AXI BRAM Communication

The XDMA driver from dma_ip_drivers creates character device files for access to an AXI Bus. For DMA transfers to M_AXI blocks, /dev/xdma0_h2c_0 is Write-Only and /dev/xdma0_c2h_0 is Read-Only. To read from an AXI Block at address 0x12345600 you would read from address 0x12345600 of the /dev/xdma0_c2h_0 (Card-to-Host) file. To write you would write to the appropriate address of the /dev/xdma0_h2c_0 (Host-to-Card) file. For single word (32-Bit) register-like reads and writes to M_AXI_LITE blocks, /dev/xdma0_user is Read-Write.

The commands below generate 2MB of random data, then send it to a URAM Block (0x80000000) in the XCKU15P, then read it back and confirm the data is identical.

cd dma_ip_drivers/XDMA/linux-kernel/tools/
dd if=/dev/urandom bs=256 count=8192 of=TEST
sudo ./dma_to_device   --verbose --device /dev/xdma0_h2c_0 --address 0x80000000 --size 2097152  -f    TEST
sudo ./dma_from_device --verbose --device /dev/xdma0_c2h_0 --address 0x80000000 --size 2097152 --file RECV
md5sum TEST RECV

XDMA BRAM Test

AXI GPIO Control

User LED

The design includes an AXI GPIO block to control Pin B6, the D18 LED on the back of the Innova-2. The LED control is inverted on the board so the design includes a signal inverter. The LED can be turned on by writing a 0x01 to the GPIO_DATA Register. Only a single bit is enabled in the port so excess bit writes are ignored. No direction control writes are necessary as the port is set up for output-only (the GPIO_TRI Direction Control Register is fixed at 0xffffffff).

AXI GPIO

The LED GPIO Block is connected to the M_AXI_LITE port, so it is accessed with single-word (32-bit) reads and writes to the /dev/xdma0_user file using the reg_rw utility from dma_ip_drivers. In reg_rw the w argument selects word (32-bit) access, and omitting the data value performs a read. The commands below should turn the D18 LED on and then off, reading the GPIO register before and after each write.

cd dma_ip_drivers/XDMA/linux-kernel/tools/
sudo ./reg_rw /dev/xdma0_user 0x90000 w
sudo ./reg_rw /dev/xdma0_user 0x90000 w 0x0001
sudo ./reg_rw /dev/xdma0_user 0x90000 w
sudo ./reg_rw /dev/xdma0_user 0x90000 w 0x0000
sudo ./reg_rw /dev/xdma0_user 0x90000 w

XDMA GPIO Testing
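The on/off sequence above can be wrapped in small functions. This is a minimal sketch, assuming it is run from the dma_ip_drivers tools directory; led_set, blink, and the RUN variable are hypothetical names. By default it only prints the reg_rw commands so nothing touches the hardware.

```shell
#!/bin/sh
# RUN defaults to "echo" (dry run). Set RUN to an empty string to execute for real.
RUN="${RUN-echo}"
LED_REG=0x90000   # GPIO_3 offset on M_AXI_LITE, taken from the AXI address table

# led_set is a hypothetical helper: $1 is the value written to the GPIO_DATA register.
led_set() {
    $RUN sudo ./reg_rw /dev/xdma0_user "$LED_REG" w "$1"
}

# blink is a hypothetical helper: toggle the LED $1 times, pausing $2 seconds per state.
blink() {
    i=0
    while [ "$i" -lt "$1" ]; do
        led_set 0x0001; sleep "$2"
        led_set 0x0000; sleep "$2"
        i=$((i + 1))
    done
}

led_set 0x0001   # LED on
led_set 0x0000   # LED off
```

For example, `RUN= sh led.sh` would run the commands instead of printing them, where led.sh is whatever name the sketch is saved under.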

DDR4 Communication and Throughput

The DDR4 Controller prevents data reads from uninitialized memory. DDR4 must first be written before it can be read.

Your system must have enough free memory to test DDR4 DMA transfers. Run free -m to determine how much RAM you have available and keep the amount of data to transfer below that. The commands below generate 512MB of random data then transfer it to and from the Innova-2. The address of the DDR4 is 0x200000000 as noted earlier.

The dd command generates a file (of=DATA) from pseudo-random data (if=/dev/urandom). The output file size in bytes is the Block Size (bs) multiplied by count. For example, 8192*65536=536870912=0x20000000=512MiB. Use a block size (bs=) that is a multiple of your drive's block size. df . shows which drive holds your current directory and dumpe2fs reports that drive's block size.

df .
sudo dumpe2fs /dev/sda3 | grep "Block size"

Determine SSD or Hard Drive Block Size
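The bs and count arithmetic described above can be checked in the shell before creating a file. The sketch below is illustrative only; dd_size is a hypothetical helper, and the memory check assumes a Linux kernel that reports MemAvailable in /proc/meminfo.

```shell
#!/bin/sh
# dd_size is a hypothetical helper: bytes produced by dd for a given bs ($1) and count ($2).
dd_size() {
    echo $(( $1 * $2 ))
}

bytes=$(dd_size 8192 65536)
printf '%d bytes = 0x%x = %d MiB\n' "$bytes" "$bytes" $(( bytes / 1048576 ))

# MemAvailable is reported in KiB; compare it against the planned transfer size.
avail_kib=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo 2>/dev/null)
if [ -n "$avail_kib" ] && [ $(( bytes / 1024 < avail_kib )) -eq 1 ]; then
    echo "transfer fits in available memory"
else
    echo "not enough available memory, or MemAvailable could not be read"
fi
```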

To test the full 8GB of memory, repeat the transfer while incrementing the address by the data size until all 8GiB = 8589934592 = 0x200000000 bytes have been tested.
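The address stepping can be sketched as a loop. The example below is a dry run that prints the dd commands for each 512MiB chunk rather than executing them; ddr4_offsets is a hypothetical helper, and the base address and size come from the AXI address table.

```shell
#!/bin/sh
DDR4_BASE=$((0x200000000))   # DDR4_0 AXI address
DDR4_SIZE=$((0x200000000))   # 8GiB
CHUNK=536870912              # 512MiB, the size of the DATA file used in this section

# ddr4_offsets is a hypothetical helper: print one decimal byte address per chunk.
# $1 = base address, $2 = total size, $3 = chunk size (all in bytes)
ddr4_offsets() {
    off=0
    while [ $(( off < $2 )) -eq 1 ]; do
        echo $(( $1 + off ))
        off=$(( off + $3 ))
    done
}

for addr in $(ddr4_offsets "$DDR4_BASE" "$DDR4_SIZE" "$CHUNK"); do
    # Commands are printed, not run. Remove the leading "echo" to execute them.
    echo sudo dd if=DATA of=/dev/xdma0_h2c_0 count=1 bs=$CHUNK seek=$addr oflag=seek_bytes
    echo sudo dd if=/dev/xdma0_c2h_0 of=RECV count=1 bs=$CHUNK skip=$addr iflag=skip_bytes
    echo md5sum DATA RECV
done
```

With these values the loop covers 16 chunks, from 8589934592 up to a final chunk starting at 16642998272.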

If you have 8GB+ of free memory space, generate 8GB of random data with the dd command options bs=8192 count=1048576 and test the DDR4 in one go.

dd uses decimal numbers. Convert from hexadecimal using printf.

If checksums do not match, vbindiff DATA RECV can be used to determine differences between the sent and received data and the failing address locations.
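As a non-interactive alternative, cmp can report the first differing byte, which maps to an AXI address by adding the transfer's base address. This is a sketch under that assumption; first_mismatch is a hypothetical helper name.

```shell
#!/bin/sh
# first_mismatch is a hypothetical helper: print the AXI address of the first differing byte.
# $1 = AXI base address of the transfer, $2 = sent file, $3 = received file
first_mismatch() {
    # cmp -l lists "<1-based byte offset> <octal byte> <octal byte>" per difference.
    off=$(cmp -l "$2" "$3" 2>/dev/null | awk 'NR == 1 {print $1; exit}')
    if [ -z "$off" ]; then
        echo "no differences"
    else
        # Convert the 1-based file offset to a 0-based offset from the base address.
        printf '0x%x\n' $(( $1 + off - 1 ))
    fi
}

# Usage after a failed DDR4 test:
# first_mismatch 0x200000000 DATA RECV
```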

Note that data is loaded from your system drive into memory and then sent to the Innova-2 DDR4 over PCIe DMA. Likewise, it is read from the Innova-2's DDR4 into system RAM and then written to disk. The wall time of these commands can therefore be significantly longer than the memory-to-memory DMA transfer time over PCIe.

free -m
dd if=/dev/urandom bs=8192 count=65536 of=DATA
printf "%ld\n" 0x200000000
sudo dd if=DATA of=/dev/xdma0_h2c_0 count=1 bs=536870912 seek=8589934592 oflag=seek_bytes
sudo dd if=/dev/xdma0_c2h_0 of=RECV count=1 bs=536870912 skip=8589934592 iflag=skip_bytes
md5sum DATA RECV

DDR4 dd Random Data Test

The tools from dma_ip_drivers can also be used to run the same test:

cd dma_ip_drivers/XDMA/linux-kernel/tools  
free -m
dd if=/dev/urandom bs=8192 count=65536 of=DATA
sudo ./dma_to_device   --verbose --device /dev/xdma0_h2c_0 --address 0x200000000 --size 536870912 -f     DATA
sudo ./dma_from_device --verbose --device /dev/xdma0_c2h_0 --address 0x200000000 --size 536870912 --file RECV
md5sum DATA RECV

DDR4 XDMA Tools Random Data Test

Test DDR4 Correct Data Retention

Test the first 1GB = 1073741824 bytes of the DDR4 memory space using a binary all-zeros file.

cd dma_ip_drivers/XDMA/linux-kernel/tools/
dd if=/dev/zero of=DATA bs=8192 count=131072
printf "%ld\n" 0x200000000
sudo dd if=DATA of=/dev/xdma0_h2c_0 count=1 bs=1073741824 seek=8589934592 oflag=seek_bytes
sudo dd if=/dev/xdma0_c2h_0 of=RECV count=1 bs=1073741824 skip=8589934592 iflag=skip_bytes
md5sum DATA RECV

Test DDR4 With All-Zeros File

Test the first 1GB = 1073741824 bytes of the DDR4 memory space using a binary all-ones file.

cd dma_ip_drivers/XDMA/linux-kernel/tools/
tr '\0' '\377' </dev/zero | dd of=DATA bs=8192 count=131072 iflag=fullblock
printf "%ld\n" 0x200000000
sudo dd if=DATA of=/dev/xdma0_h2c_0 count=1 bs=1073741824 seek=8589934592 oflag=seek_bytes
sudo dd if=/dev/xdma0_c2h_0 of=RECV count=1 bs=1073741824 skip=8589934592 iflag=skip_bytes
md5sum DATA RECV

Test DDR4 With All-Ones File
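The same tr approach extends to other fixed byte patterns, such as alternating-bit values, which are not part of the tests above. The sketch below is an illustration only; make_pattern is a hypothetical helper and it assumes a head that supports -c.

```shell
#!/bin/sh
# make_pattern is a hypothetical helper: fill a file with one repeated byte value.
# $1 = byte value as a three-digit octal number (377 = 0xFF, 252 = 0xAA, 125 = 0x55)
# $2 = output file, $3 = size in bytes
make_pattern() {
    tr '\0' "\\$1" < /dev/zero | head -c "$3" > "$2"
}

# Usage: 1GiB files of 0xAA and 0x55 bytes for the dd transfer commands above.
# make_pattern 252 DATA 1073741824
# make_pattern 125 DATA 1073741824
```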

DDR4 Communication Error

If you attempt to send data to the DDR4 address but get write file: Unknown error 512 it means DDR4 did not initialize properly or the AXI bus has encountered an error and stalled. Reboot and try again. If that fails, proceed to the Innova-2 DDR4 Troubleshooting project.

sudo ./dma_to_device --verbose --device /dev/xdma0_h2c_0 --address 0x200000000 --size 8192 -f TEST

Error 512

Custom Software for Accessing AXI Blocks

pread/pwrite combine lseek and read/write. Note the Linux Kernel has a write limit of 0x7FFFF000=2147479552 bytes per call.

#include <unistd.h>

ssize_t pread(int fd, void *buf, size_t count, off_t offset);
ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset);
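Because of the per-call limit, a program moving more than 0x7FFFF000 bytes has to loop and advance the offset by the number of bytes each call returns. The arithmetic below estimates the minimum number of calls for the transfer sizes used in this document; calls_needed is a hypothetical helper.

```shell
#!/bin/sh
MAX_RW=$((0x7FFFF000))   # per-call limit quoted above, 2147479552 bytes

# calls_needed is a hypothetical helper: minimum read/write calls to move $1 bytes.
calls_needed() {
    # Integer ceiling division of the transfer size by the per-call limit.
    echo $(( ($1 + MAX_RW - 1) / MAX_RW ))
}

calls_needed 536870912     # 512MiB transfer, prints 1
calls_needed 1073741824    # 1GiB transfer, prints 1
calls_needed 8589934592    # full 8GiB of DDR4, prints 5
```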

innova2_xdma_ddr4_test.c is a simple program that demonstrates XDMA communication in C using pread and pwrite to communicate with AXI Blocks.

gcc -Wall innova2_xdma_ddr4_test.c -o innova2_xdma_ddr4_test -lm
sudo ./innova2_xdma_ddr4_test

innova2_xdma_test.c Run

Recreating the Design in Vivado

Run the source command from the main Vivado 2023.2 window.

cd innova2_8gb_adlt_xdma_ddr4_demo
dir
source innova2_8gb_adlt_xdma_ddr4_demo.tcl

Source Project Files

Click on Generate Bitstream.

Generate Bitstream

Once the Bitstream is generated, run Write Memory Configuration File, select bin, mt25qu512_x1_x2_x4_x8, SPIx8, Load bitstream files, and a location and name for the output binary files. The bitstream will end up in the innova2_8gb_adlt_xdma_ddr4_demo/innova2_8gb_adlt_xdma_ddr4_demo.runs/impl_1 directory as xdma_wrapper.bit. Vivado will add the _primary.bin and _secondary.bin extensions as the Innova-2 uses dual MT25QU512 FLASH ICs in x8 for high speed programming.

Write Memory Configuration File

Proceed to Loading a User Image

Resource Utilization

Design run details:

Design Run Output

Resource Utilization Chart:

Resource Utilization Chart

Block Design Customization Options

XDMA

The Innova-2's XCKU15P is wired for x8 PCIe at PCIe Block Location: X0Y2. It is capable of 8.0 GT/s Link Speed.

XDMA Basic Customizations

For this design I set the PCIe Base Class to Memory Controller and the Sub-Class to Other.

XDMA PCIe ID Customizations

I disable the Configuration Management Interface.

XDMA Misc Customizations

DDR4

The DDR4 Memory Part is selected as MT40A1G16WBU-083E which is compatible with the MT40A1G16KNR-075 x16 Twin Die ICs with D9WFR FBGA Code on the Innova-2.

DDR4 Memory Part

The DDR4 is configured for a Memory Speed of 833ps -> 1200MHz -> 2400 MT/s Transfer Rate. The DDR4 reference clock is 9996ps -> 100.04MHz. CAS Latency is set to 16 and CAS Write Latency is set to 12.

DDR4 Basic Configuration
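The period-to-frequency conversions quoted above can be reproduced with awk. This is a small sketch; ps_to_mhz is a hypothetical helper name.

```shell
#!/bin/sh
# ps_to_mhz is a hypothetical helper: convert a clock period in picoseconds to MHz.
# 1 MHz corresponds to a period of 1000000 ps.
ps_to_mhz() {
    awk -v ps="$1" 'BEGIN { printf "%.2f\n", 1000000 / ps }'
}

ps_to_mhz 833     # DDR4 memory clock, prints 1200.48
ps_to_mhz 9996    # DDR4 reference clock, prints 100.04
```

DDR4 transfers data on both clock edges, so the nominal 1200MHz memory clock corresponds to the 2400 MT/s transfer rate.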

Data Mask and DBI is set to NO DM NO DBI which automatically enables ECC on a 72-Bit interface.

When is ECC Enabled

The Arbitration Scheme is set to RD PRI REG under AXI Options.

DDR4 AXI Configuration