Quantize CDJ - Search News

ailia-ai/onnx-quantization

This is an example of quantizing an ONNX model. The input is a float ONNX model. Quantization is done using onnxruntime. The output is an int8 ONNX model. The default is to quantize using only 2 images, which is less ...

GitHub
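
The onnx-quantization snippet above describes converting float values to int8 with the help of a few calibration images. As a rough illustration of the arithmetic behind that kind of conversion, here is a minimal pure-Python sketch of affine int8 quantization. It is not the repository's code and does not call onnxruntime; the function names (`calibrate`, `quantize`, `dequantize`) and the min/max calibration rule are assumptions chosen for illustration, and the two short lists stand in for two calibration images.

```python
# Illustrative sketch only: affine (scale + zero point) int8 quantization.
# The names and the min/max calibration rule are assumptions for this example,
# not the API of onnxruntime or of the repository mentioned above.

QMIN, QMAX = -128, 127  # representable range of a signed 8-bit integer


def calibrate(batches):
    """Derive (scale, zero_point) from the value range seen in calibration data."""
    lo = min(min(batch) for batch in batches)
    hi = max(max(batch) for batch in batches)
    # Make sure 0.0 is inside the range so it maps to an exact integer.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / (QMAX - QMIN)
    if scale == 0.0:
        scale = 1.0  # all calibration values were zero; any scale works
    zero_point = round(QMIN - lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point):
    """Map floats to int8, clamping anything outside the calibrated range."""
    return [max(QMIN, min(QMAX, round(v / scale) + zero_point)) for v in values]


def dequantize(quantized, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]


if __name__ == "__main__":
    # Two tiny "images" stand in for the two calibration images.
    calibration = [[0.0, 0.5], [1.0, 2.0]]
    scale, zero_point = calibrate(calibration)
    q = quantize([0.0, 1.0, 2.0, 10.0], scale, zero_point)
    print(scale, zero_point, q, dequantize(q, scale, zero_point))
```

With so few calibration samples the observed range can be narrow, so values outside it (such as 10.0 in the demo) are clamped to the int8 limits and lose precision.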

Reduce T5 model size by 3X and increase the inference speed up to 5X.

T5 models can be used for several NLP tasks such as summarization, QA, QG, translation, text generation, and more. Sequential text generation is naturally slow, and for larger T5 models it gets even ...
