One possible way to reduce latency is to use a bigger machine, since Concrete makes very good use of parallelism: we measured around 40 seconds of inference time for that model on AWS hpc7a instances.
A second possible approach, depending on the use case, is to optimize your model by making it smaller or pruning it; structured pruning in particular is worth looking into. See this section of the documentation about such techniques: some are already used in the CIFAR example that you link.
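To illustrate what structured pruning looks like in practice, here is a minimal sketch using PyTorch's built-in `torch.nn.utils.prune` module. The tiny convolutional model is purely illustrative, not the actual CIFAR Brevitas network from the linked example, and the 50% pruning amount is an arbitrary choice for demonstration:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Illustrative toy model; the real CIFAR network is larger and quantized.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
)

# Structured pruning along dim=0 zeroes entire output filters (by L2 norm),
# which shrinks the effective computation rather than just scattering zeros.
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.ln_structured(module, name="weight", amount=0.5, n=2, dim=0)
        prune.remove(module, "weight")  # make the pruning permanent

# Half of the first layer's 16 filters are now entirely zero.
first_conv = model[0]
zero_filters = (first_conv.weight.abs().sum(dim=(1, 2, 3)) == 0).sum().item()
print(zero_filters)  # 8
```

Note that zeroed filters still occupy their tensor slots; to actually speed up inference you would then rebuild the model with the pruned channels physically removed before compiling it.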
Hi andrei, thanks for your quick reply. I am using a high-end workstation with 64 GB of RAM and a 24-core processor.
I will take a look at pruning.
Thanks
Hi, I have found this community very helpful. I have a question about inference time.
I am running the following use-case example:
https://github.com/zama-ai/concrete-ml/blob/main/use_case_examples/cifar/cifar_brevitas_training/evaluate_one_example_fhe.py
The issue is that FHE inference on a single image took more than 28 minutes.
Is there a way to optimize it or reduce the inference time?