LeNet - Segmentation Fault - unhandled level 3 translation fault (11) [at runtime] #220

Open
davfre98 opened this issue Jun 23, 2021 · 0 comments


Hi, I am trying to run LeNet on the NVDLA AWS virtual platform with nv_small.

After several attempts, I keep getting a Segmentation Fault error.

All the details and results are shown below:

NVDLA AWS VP SETUP

(AMI: nvdla_vp_fpga_ami_ubuntu; AFI: agfi-05d68b424ef03f66e [nv_small]; EC2 instance: f1.2xlarge)

  • Setup using steps 2.1 - 2.5 from NVDLA AWS VP.
  • Sanity test ./nvdla_runtime --loadable kmd/CDP/CDP_L0_0_small_fbuf runs correctly.

LENET MODEL AND INPUT

These are the LeNet files from Caffe: lenet files.zip

Compilation is done by copying those Caffe files into the sw/prebuilt/x86-ubuntu directory and running:

./nvdla_compiler --prototxt lenet_mnist.prototxt --caffemodel lenet_mnist.caffemodel -o . \
    --profile fast-math --cprecision int8 --configtarget nv_small --calibtable lenet_mnist.json \
    --quantizationMode per-filter --informat nchw

These are the output files: compiler_output_files.zip

The compilation is done on my local computer, and the output is uploaded to the AWS VP.

The LeNet files can also be found at ESP Columbia NVDLA Tutorial. I reproduced the compilation myself, as described above.

EXECUTION, LOGS AND ERROR

In the AWS instance, the LeNet files and the compiled output are placed in the /usr/local/nvdla directory.
The image to be used for inference is also uploaded: seven.zip. It is taken from ESP Columbia NVDLA Tutorial.

Then, the Virtual Simulator is started:

sudo ./aarch64_toplevel -c aarch64_nvdla.lua --fpga
Log in to the kernel with account 'root' and password 'nvdla'

mount -t 9p -o trans=virtio r /mnt
cd /mnt/sw/prebuilt/linux/
insmod drm.ko
insmod opendla_small.ko

The LeNet files, the compiled outputs and the seven.pgm image are copied to /mnt/sw/prebuilt/linux, and inference is attempted:

./nvdla_runtime --loadable fast-math.nvdla --image seven.pgm

The log is as follows:

(image: nvdla_lenet_log)

(text)
creating new runtime context...
.Emulator starting
ppgminfo 1 28 28
pgm2dimg 1 28 28 1 32 896 896
[ 251.084271] nvdla_runtime[1276]: unhandled level 3 translation fault (11) at 0xffffaabe2000, esr 0x92000047, in libc-2.23.so[ffffab5a1000+12c000]
[ 251.085368] CPU: 0 PID: 1276 Comm: nvdla_runtime Tainted: G O 4.13.3 #1
[ 251.085852] Hardware name: linux,dummy-virt (DT)
[ 251.086198] task: ffff80003da83800 task.stack: ffff80003bf00000
[ 251.086746] PC is at 0xffffab619494
[ 251.088713] LR is at 0x409768
[ 251.088970] pc : [<0000ffffab619494>] lr : [<0000000000409768>] pstate: 20000000
[ 251.089400] sp : 0000ffffe07543a0
[ 251.089632] x29: 0000ffffe07543a0 x28: 0000000000000000
[ 251.090003] x27: 0000000000000000 x26: 0000000000000000
[ 251.090333] x25: 0000000000000000 x24: 0000000000000000
[ 251.090650] x23: 0000000000000000 x22: 0000000000000000
[ 251.092569] x21: 0000000000403408 x20: 0000000000000000
[ 251.092935] x19: 0000ffffe0754f11 x18: 0000000000000000
[ 251.093260] x17: 0000000000440348 x16: 0000ffffab619380
[ 251.093585] x15: 000000000000044e x14: 0000000000000000
[ 251.093909] x13: 0000000000000000 x12: 0000000000000000
[ 251.094265] x11: 0000000000003c00 x10: 0000000000000000
[ 251.094618] x9 : 0000000000000000 x8 : 0000000000000000
[ 251.096458] x7 : 0000000000003c00 x6 : 0000ffffaabe1ff0
[ 251.096825] x5 : 0000000000000040 x4 : 0000000000000000
[ 251.097147] x3 : 0000ffffaabe0000 x2 : 0000000000004180
[ 251.097463] x1 : 000000000f6155c0 x0 : 0000ffffaabe0000
Segmentation fault
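
For readers triaging the log above, the following is a minimal sketch that decodes the reported `esr` value and does some address arithmetic on the register dump. The field layout (EC in bits [31:26], WnR in bit 6, DFSC in bits [5:0]) follows the Arm architecture's ESR encoding for data aborts; the constant and function names are illustrative only, and reading x0 and x6 as buffer pointers is an assumption, not something the log states.

```python
# All values are copied verbatim from the kernel log in this issue.
ESR = 0x92000047
FAULT_ADDR = 0xFFFFAABE2000
PC = 0xFFFFAB619494
LIBC_BASE = 0xFFFFAB5A1000   # start of the libc-2.23.so mapping
LIBC_SIZE = 0x12C000         # size of the libc-2.23.so mapping
X0 = 0xFFFFAABE0000
X6 = 0xFFFFAABE1FF0

PAGE_SIZE = 4096  # assumption: 4 KiB pages in the guest kernel


def decode_data_abort_esr(esr):
    """Split an AArch64 ESR value into the fields relevant to a data abort."""
    return {
        "ec": (esr >> 26) & 0x3F,   # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,    # instruction length bit
        "wnr": (esr >> 6) & 0x1,    # 1 = fault caused by a write
        "dfsc": esr & 0x3F,         # data fault status code, bits [5:0]
    }


fields = decode_data_abort_esr(ESR)
print("EC   = 0x%02x" % fields["ec"])
print("WnR  = %d" % fields["wnr"])
print("DFSC = 0x%02x" % fields["dfsc"])

# Where the fault sits relative to the registers and the libc mapping.
print("fault - x0      = 0x%x" % (FAULT_ADDR - X0))
print("fault - x6      = 0x%x" % (FAULT_ADDR - X6))
print("fault page-aligned:", FAULT_ADDR % PAGE_SIZE == 0)
print("PC offset in libc = 0x%x" % (PC - LIBC_BASE))
print("PC inside libc mapping:", LIBC_BASE <= PC < LIBC_BASE + LIBC_SIZE)
```

EC 0x24 is a data abort from a lower exception level, DFSC 0x07 is a level 3 translation fault, and WnR = 1 means the faulting access was a write. The faulting address is page-aligned, 0x2000 bytes above x0 and 0x10 bytes above x6, which would be consistent with a libc routine writing forward past the end of a mapped buffer, though that interpretation is a guess. The PC offset into libc could be matched against the symbols of libc-2.23.so on the target to identify the routine.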

Notes

  • I have also tried the images in /mnt/sw/regression/images/digits, to no avail.
  • lenet_mnist.prototxt specifies a 28x28x1 input, and seven.pgm is also 28x28x1 as far as I can tell (the .pgm header says "28 28", and I converted the file to .jpg to confirm that its height and width are 28x28).
  • I have also tried running AlexNet with a correctly sized .jpg input and got the same Segmentation Fault error after about 112 minutes. This issue, however, focuses on LeNet.
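
The dimension check described in the notes above can be repeated without converting the image. This is a minimal sketch of a PGM header reader, assuming a standard Netpbm header (magic number, width, height, maxval, with optional `#` comments); `read_pgm_header` is a hypothetical helper written for this example, and the sample bytes are synthetic, not the contents of the real seven.pgm.

```python
def read_pgm_header(data: bytes):
    """Return (magic, width, height, maxval) parsed from the start of a PGM file."""
    tokens = []
    pos = 0
    while len(tokens) < 4 and pos < len(data):
        ch = data[pos:pos + 1]
        if ch == b"#":
            # A comment runs to the end of the current line.
            while pos < len(data) and data[pos:pos + 1] not in (b"\n", b"\r"):
                pos += 1
        elif ch.isspace():
            pos += 1
        else:
            start = pos
            while (pos < len(data)
                   and not data[pos:pos + 1].isspace()
                   and data[pos:pos + 1] != b"#"):
                pos += 1
            tokens.append(data[start:pos])
    if len(tokens) < 4 or tokens[0] not in (b"P2", b"P5"):
        raise ValueError("not a PGM file")
    width, height, maxval = (int(t) for t in tokens[1:4])
    return tokens[0].decode("ascii"), width, height, maxval


# Synthetic 28x28 binary PGM used only to exercise the parser.
sample = b"P5\n# synthetic example\n28 28\n255\n" + bytes(28 * 28)
magic, width, height, maxval = read_pgm_header(sample)
print(magic, width, height, maxval)

# To check the real file, something like this could be used instead:
# with open("seven.pgm", "rb") as f:
#     print(read_pgm_header(f.read()))
```

Besides width and height, the magic number (P2 for ASCII, P5 for binary) and maxval may be worth comparing against the images in the regression directory, in case the runtime expects a particular variant.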

I'd really appreciate any help to solve this Segmentation Fault issue.
