Issues: NVIDIA/TensorRT-LLM
[Issue Template]Short one-line summary of the issue #270
#783
opened Jan 1, 2024 by
juney-nvidia
Open
Issues list
AWQ performance issue for higher batches
bug
Something isn't working
#1757
opened Jun 8, 2024 by
canamika27
2 of 4 tasks
server.cc:251] failed to enable peer access for some device pairs
bug
Something isn't working
#1754
opened Jun 8, 2024 by
Godlovecui
2 of 4 tasks
Enc-Dec C++ Runtime Paged KV - Inflight Batching output junks while inference with multiple input texts
#1753
opened Jun 7, 2024 by
thanhlt998
[Question] "Building from source code is necessary if you want the best performance"
question
Further information is requested
triaged
Issue has been triaged by maintainers
#1750
opened Jun 6, 2024 by
DreamGenX
Does TensorRT-LLM support high concurrent requests?
bug
Something isn't working
waiting for feedback
#1745
opened Jun 6, 2024 by
Godlovecui
2 of 4 tasks
Quantizing Phi-3 128k Instruct to FP8 fails.
feature request
New feature or request
Investigating
quantization
Issue about lower bit quantization, including int8, int4, fp8
#1741
opened Jun 5, 2024 by
kalradivyanshu
2 of 4 tasks
Performance issue at whisper in many aspects : latency, reproducibility, and more
bug
Something isn't working
Investigating
#1740
opened Jun 5, 2024 by
lionsheep24
3 of 4 tasks
[ERROR] Assertion failed: Can't free tmp workspace for GEMM tactics profiling.
duplicate
This issue or pull request already exists
feature request
New feature or request
Investigating
#1739
opened Jun 5, 2024 by
grvsh02
1 of 2 tasks
Inflight batching for fp8 Llama and Mixtral is broken
bug
Something isn't working
Investigating
quantization
Issue about lower bit quantization, including int8, int4, fp8
triaged
Issue has been triaged by maintainers
#1738
opened Jun 5, 2024 by
bprus
2 of 4 tasks
Can't convert-checkpoint Mistral 7B v0.3: safetensors_rust.SafetensorError: File does not contain tensor model.embed_tokens.weight
feature request
New feature or request
Investigating
triaged
Issue has been triaged by maintainers
#1732
opened Jun 5, 2024 by
Ace-RR
2 of 4 tasks
Warning: Function too large, generated debug information may not be accurate.
neeed more info
question
Further information is requested
#1730
opened Jun 5, 2024 by
nanmi
Conditionals seems to be evaluated eagerly
feature request
New feature or request
Investigating
#1724
opened Jun 4, 2024 by
CrimsonRadiator
2 of 4 tasks
Lora support with LLama3-70B and AWQ Quantization
feature request
New feature or request
Investigating
triaged
Issue has been triaged by maintainers
#1721
opened Jun 4, 2024 by
smehta2000
2 of 4 tasks
When the request is large, the Triton server has a very high TTFT
bug
Something isn't working
#1719
opened Jun 4, 2024 by
Godlovecui
2 of 4 tasks
Track input and output token count for every request
feature request
New feature or request
Investigating
#1718
opened Jun 3, 2024 by
brarj413
After deployment, each request exception generates a core.xxxx file
bug
Something isn't working
#1715
opened Jun 3, 2024 by
taorui-plus
2 of 4 tasks
AssertionError: Each dimension must specify a 3-elements tuple or list in the order of (min,opt,max), got {dim=}
Investigating
quantization
Issue about lower bit quantization, including int8, int4, fp8
triaged
Issue has been triaged by maintainers
#1714
opened Jun 3, 2024 by
doruksonmez
Llava multimodel example is giving segfault
bug
Something isn't working
triaged
Issue has been triaged by maintainers
#1709
opened Jun 1, 2024 by
buddhapuneeth
2 of 4 tasks
Diversity Search not resulting in diverse outputs
triaged
Issue has been triaged by maintainers
#1707
opened May 31, 2024 by
Bhuvanesh09
Support for Python 3.11 (+ windows)
bug
Something isn't working
#1706
opened May 30, 2024 by
Sharrnah
2 of 4 tasks
build docker images from source is too large
neeed more info
triaged
Issue has been triaged by maintainers
#1705
opened May 30, 2024 by
Fred-cell
24.05-trtllm-python-py3 image size
question
Further information is requested
triaged
Issue has been triaged by maintainers
#1704
opened May 30, 2024 by
Prots
High WER and Incomplete Transcription Issue with Whisper
bug
Something isn't working
triaged
Issue has been triaged by maintainers
#1697
opened May 29, 2024 by
teith
2 of 4 tasks