Pull requests: NVIDIA/TensorRT-LLM
Pull requests list
Bump gradio from 4.19.2 to 4.36.0 in /examples/qwen
dependencies
Pull requests that update a dependency file
#1751
opened Jun 6, 2024 by
dependabot
bot
Reference input randomSeeds by idx rather than batchSlot
#1742
opened Jun 5, 2024 by
pathorn
Bump transformers from 4.36.2 to 4.38.0 in /examples/multimodal
dependencies
triaged
Issue has been triaged by maintainers
#1689
opened May 28, 2024 by
dependabot
bot
add cached generation buffer
triaged
waiting for feedback
#1685
opened May 28, 2024 by
michael200892458
Make Executor timeout configurable
triaged
waiting for feedback
#1655
opened May 23, 2024 by
DreamGenX
Optimize python benchmark logging
triaged
#1646
opened May 22, 2024 by
michaelnny
Fix CUDA OOM when creating Mixtral checkpoint
triaged
waiting for feedback
#1629
opened May 19, 2024 by
VivekBits2210
Add support for non-power-of-two heads with Alibi
triaged
#1611
opened May 15, 2024 by
vmarkovtsev
[feat]: Support weight only gemm with 2bit
triaged
waiting for feedback
#1568
opened May 9, 2024 by
gavinchen430
Support SDXL and its distributed inference
waiting for feedback
#1514
opened Apr 28, 2024 by
Zars19
fix: correct cudaSetDevice error when GPUs per node are fewer than their ranks in inter-node inference
#1495
opened Apr 24, 2024 by
littlefatfat
llama convert add rotary_scaling param in cli_args
waiting for feedback
#1385
opened Apr 1, 2024 by
activezhao
Relax python dependencies
triaged
#1346
opened Mar 24, 2024 by
tdeboissiere
[fix] avoid the overflow issue when supporting 32k sequence length
#1076
opened Feb 11, 2024 by
llsj14
Update README.md account for new cuDNN version - installation instruction only works with Archive version
#1073
opened Feb 9, 2024 by
ewandel