Request: use TensorRT to accelerate training and inference #942

WSC741606 · 2024-05-16T03:11:26Z

Describe the feature
TensorRT 10 has been released, along with TensorRT-LLM. Could they be used to accelerate training and inference?

Paste any useful information
The following is from an NVIDIA promotional email:

The TensorRT ecosystem of API releases include TensorRT 10.0, TensorRT-LLM 0.10, and TensorRT Model Optimizer 0.11.

Highlights from this release include:
TensorRT 10: support for weight-stripped engines, weight offload for NVIDIA Grace Hopper™ systems, Python 3.12
TensorRT-LLM 0.10: Llama3, Phi3, Grok1, and more; FP8 MoEs; improved API simplicity and consolidation
TensorRT Model Optimizer 0.11: provides state-of-the-art techniques like quantization and sparsity to reduce model complexity
