
Support TensorRT conversion and serving feature #32

Open
redrussianarmy opened this issue Nov 6, 2020 · 3 comments
redrussianarmy commented Nov 6, 2020

I realized that TensorFlow Lite does not support inference using an Nvidia GPU. I have an Nvidia Jetson Xavier. My current inference runs the unoptimized transformers model on the GPU, and it is faster than inference with the TF Lite model on the CPU.

After some research, I found two model-optimization options: TensorRT and TF-TRT. I made several attempts to convert a fine-tuned transformers model to TensorRT, but I could not get it working. It would be great if dialog-nlu supported TensorRT conversion and serving.
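For reference, the TF-TRT route I tried looks roughly like this (a minimal sketch; the SavedModel paths are placeholders, and it assumes a TF 2.x build with TensorRT support, e.g. from JetPack):

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert a fine-tuned SavedModel into a TF-TRT optimized SavedModel.
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16)  # FP16 suits Jetson GPUs
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="path/to/saved_model",  # placeholder path
    conversion_params=params)
converter.convert()
converter.save("path/to/trt_saved_model")  # placeholder path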

MahmoudWahdan commented Nov 6, 2020

Hi @redrussianarmy
Thank you for sharing your experience.
I'll give it a try and let you know.

TFLite doesn't support serving on PC GPUs, but it does support mobile GPUs. I don't know whether it supports the GPUs of all edge devices.
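For context, on platforms that ship the TFLite GPU delegate, loading it typically looks like this (a rough sketch; the delegate library name is platform-specific and is an assumption here, not something verified on Jetson):

import tensorflow as tf

# Load a tflite model with the GPU delegate (library name varies per platform).
delegate = tf.lite.experimental.load_delegate("libtensorflowlite_gpu_delegate.so")
interpreter = tf.lite.Interpreter(
    model_path="model.tflite",  # placeholder path
    experimental_delegates=[delegate])
interpreter.allocate_tensors()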

One question that came to my mind:
Did you try combining the transformers layer_pruning feature with tflite conversion using hybrid_quantization?

from dialognlu import TransformerNLU  # import path as used in the dialog-nlu README

k_layers_to_prune = 4  # try different values
config = {
    # ... rest of your usual config keys ...
    "layer_pruning": {
        "strategy": "top",      # prune layers from the top of the encoder
        "k": k_layers_to_prune  # number of layers to prune
    }
}

nlu = TransformerNLU.from_config(config)
nlu.train(train_dataset, val_dataset, epochs, batch_size)

# save the model together with a hybrid-quantized tflite version
nlu.save(save_path, save_tflite=True, conversion_mode="hybrid_quantization")

# load the quantized tflite model for inference
nlu = TransformerNLU.load(model_path, quantized=True, num_process=4)

utterance = "add sabrina salerno to the grime instrumentals playlist"
result = nlu.predict(utterance)
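For what it's worth, hybrid (dynamic-range) quantization stores the weights in 8-bit while keeping activations in float, so it shrinks the model and speeds up CPU inference, but it does not by itself enable GPU execution.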

@MahmoudWahdan MahmoudWahdan added the feature Request for new feature label Nov 6, 2020
@MahmoudWahdan MahmoudWahdan self-assigned this Nov 6, 2020
redrussianarmy commented Nov 6, 2020

Hi @MahmoudWahdan
Thank you for your quick reply.

I tried combining the transformers layer_pruning feature with tflite conversion using hybrid_quantization as you suggested. Unfortunately, the result is the same: prediction does not run on the GPU of the Nvidia Jetson Xavier.

I am looking forward to seeing the new TensorRT conversion feature :)

MahmoudWahdan commented

Hi @redrussianarmy
Sure, this is something new that I'll try, and it will certainly be useful.
I'll keep you updated.
