NVIDIA / TensorRT-LLM Public

Pull requests: NVIDIA/TensorRT-LLM

69 Open 201 Closed

DeepSeek MoE support

#1758 opened Jun 9, 2024 by akhoroshev

Bump gradio from 4.19.2 to 4.36.0 in /examples/qwen

Labels: dependencies

#1751 opened Jun 6, 2024 by dependabot bot

Reference input randomSeeds by idx rather than batchSlot

#1742 opened Jun 5, 2024 by pathorn

Bump transformers from 4.36.2 to 4.38.0 in /examples/multimodal

Labels: dependencies, triaged

#1689 opened May 28, 2024 by dependabot bot

add cached generation buffer

Labels: triaged, waiting for feedback

#1685 opened May 28, 2024 by michael200892458

Make Executor timeout configurable

Labels: triaged, waiting for feedback

#1655 opened May 23, 2024 by DreamGenX

Optimize python benchmark logging

Labels: triaged

#1646 opened May 22, 2024 by michaelnny

Fix CUDA OOM when creating Mixtral checkpoint

Labels: triaged, waiting for feedback

#1629 opened May 19, 2024 by VivekBits2210

Add support for non-power-of-two heads with Alibi

Labels: triaged

#1611 opened May 15, 2024 by vmarkovtsev

[feat]: Support weight only gemm with 2bit

Labels: triaged, waiting for feedback

#1568 opened May 9, 2024 by gavinchen430

Use first bad_words as extra parameters, and implement min-p

#1536 opened May 2, 2024 by pathorn • Draft

Support SDXL and its distributed inference

Labels: waiting for feedback

#1514 opened Apr 28, 2024 by Zars19

Remove the <s> token from post_prompt of multimodal

#1508 opened Apr 26, 2024 by yupbank

fix: correct cudaSetDevice error when GPUs per node are fewer than their ranks in inter-node inference

#1495 opened Apr 24, 2024 by littlefatfat

llama convert add rotary_scaling param in cli_args

Labels: waiting for feedback

#1385 opened Apr 1, 2024 by activezhao

Add SmoothQuant for T5 (decoder only right now)

#1366 opened Mar 27, 2024 by eycheung

Relax python dependencies

Labels: triaged

#1346 opened Mar 24, 2024 by tdeboissiere

fix: wrong request processing order

#1177 opened Feb 28, 2024 by prnake

modify for main_0f041b7b57_jetson

#1115 opened Feb 20, 2024 by sunnyqgg

fix import error in parallel build

#1094 opened Feb 17, 2024 by llan-ml

[fix] avoid the overflow issue when supporting 32k sequence length

#1076 opened Feb 11, 2024 by llsj14

update einops in mpt requirements script

#1075 opened Feb 10, 2024 by hllj

Update README.md account for new cuDNN version - installation instruction only works with Archive version

#1073 opened Feb 9, 2024 by ewandel

Update README.md

#1050 opened Feb 5, 2024 by MustaphaU

Automate cuDNN setup

#1032 opened Feb 1, 2024 by Muhtasham
