
vector_by_onnxmodel

Accelerate embedding (vector) generation by using an ONNX model instead of sentence_transformers.

install

conda create -n vo python=3.10
conda activate vo
pip install -r requirements.txt
pip install "optimum[onnxruntime-gpu]"

how to use

python generate.py          # plain ONNX Runtime
python generate_optimum.py  # ONNX Runtime via the optimum library
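
The scripts themselves are not shown here. As a rough guide, here is a minimal sketch of the optimum path, assuming the amu/tao-8k model linked below and a CUDA GPU; the mean-pooling step is an assumption, so check the model card for the pooling the model actually expects:

# Minimal sketch of ONNX inference via optimum; NOT the repo's generate_optimum.py.
# Assumes the amu/tao-8k model (linked below) and a CUDA-capable GPU.
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForFeatureExtraction

model_id = "amu/tao-8k"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# export=True converts the checkpoint to ONNX on the fly;
# CUDAExecutionProvider runs it on the GPU (needs onnxruntime-gpu).
model = ORTModelForFeatureExtraction.from_pretrained(
    model_id, export=True, provider="CUDAExecutionProvider"
)

inputs = tokenizer("an example sentence", return_tensors="pt").to("cuda")
outputs = model(**inputs)

# Mean-pool token embeddings into one sentence vector (pooling choice is
# an assumption; the model card may specify a different pooling).
embedding = outputs.last_hidden_state.mean(dim=1)
print(embedding.shape)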

result (~4.9x faster)

# Inference with the ONNX model is much faster than with sentence_transformers.
# Model used: https://huggingface.co/amu/tao-8k
OnnxModel Runtime gpu Inference time = 4.52 ms
Sentence Transformer gpu Inference time = 22.19 ms

result [optimum] (~5.5x faster)

# Inference with the optimum ONNX model is much faster than with sentence_transformers.
# Model used: https://huggingface.co/amu/tao-8k
# Measured on one A100 GPU.
[Optimum] OnnxModel Runtime gpu Inference time = 3.22 ms
Sentence Transformer gpu Inference time = 17.63 ms
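
The numbers above come from the repo's scripts. For illustration only, a comparison like this might be timed with a warm-up phase followed by averaged wall-clock time over repeated calls; the bench helper below is hypothetical, not code from this repository:

# Illustrative timing harness, not the repository's benchmark code.
import time
from sentence_transformers import SentenceTransformer

st_model = SentenceTransformer("amu/tao-8k", device="cuda")

def bench(fn, runs=100, warmup=10):
    for _ in range(warmup):      # warm-up: CUDA context, kernel caches
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs * 1000.0  # avg ms per call

ms = bench(lambda: st_model.encode("an example sentence"))
print(f"Sentence Transformer gpu Inference time = {ms:.2f} ms")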
