Issues: triton-inference-server/tensorrtllm_backend
#493 [bug] Deepseek model streaming mode with Chinese character �? (opened Jun 7, 2024 by activezhao; 2 of 4 tasks)
#489 [bug, triaged] 2x docker image size increase for trtllm: from 8.38 GB (24.03) to 18.46 GB (24.04) (opened Jun 3, 2024 by lopuhin; 2 of 4 tasks)
#485 [feature request] Grafana Dashboard (Feature Request) (opened May 31, 2024 by hestabit-dev; 4 tasks)
#479 [bug] missing nv_trt_llm_request_metrics from python backend (opened May 28, 2024 by Hao-YunDeng; 2 of 4 tasks)
#475 [question, triaged] [Question] Best practices to track inputs and predictions? (opened May 24, 2024 by FernandoDorado)
#472 [bug] tensorrt_llm_bls disregards top_k / temperature setting (opened May 23, 2024 by janpetrov; 1 of 4 tasks)
#468 [bug] random_seed seems to be ignored (or at least inconsistent) for inflight_batcher_llm (opened May 21, 2024 by dyoshida-continua; 2 of 4 tasks)
#467 [bug, triaged] unexpected error when creating modelInstanceState: [json.exception.out_of_range.403] key 'name' not found (opened May 21, 2024 by Godlovecui; 2 of 4 tasks)
#464 [bug, triaged] [Bug] Zero temperature curl request affects non-zero temperature requests (opened May 20, 2024 by Hao-YunDeng; 2 of 4 tasks)
#463 [triaged] Can you provide an example of a visual language model or multimodal model launched by Triton server? (opened May 20, 2024 by lzcchl)
#462 [question, triaged] How to deploy one model instance across multiple GPUs to tackle the OOM problem? (opened May 16, 2024 by shil3754)
#461 [triaged] decoding_mode top_k_top_p does not take effect for llama2; output does not match Hugging Face (opened May 16, 2024 by yjjiang11; 1 of 4 tasks)
#460 Implement XC-Cache to improve long-context inference performance (opened May 15, 2024 by avianion)
#459 [bug, triaged] Tritonserver won't start up running Smaug 34b (opened May 15, 2024 by workuser12345; 2 of 4 tasks)
#458 [triaged] two seemingly identical functions in the same file (opened May 15, 2024 by dongluw)
#457 [bug] Mixtral 8x7B-v0.1 hangs after serving a few requests (opened May 15, 2024 by aaditya-srivathsan; 2 of 4 tasks)
#455 [question, triaged] [tensorrt-llm backend] A question about launch_triton_server.py (opened May 13, 2024 by victorsoda)
#448 [question] Example gpu_device_ids for multi-model usage? (opened May 9, 2024 by vnkc1; 2 of 4 tasks)
#442 [need more info, triaged] InFlightBatching seems not to be working (opened May 6, 2024 by larme; 2 of 4 tasks)
#440 [triaged] Deployment failed for BERT (opened May 3, 2024 by vivekjoshi556)
#438 [bug, triaged] Deploying Mixtral-8x7B-v0.1 with Triton 24.02 on A100 (160GB) raises "Cuda Runtime (out of memory)" exception (opened Apr 29, 2024 by kelkarn; 2 of 4 tasks)
#437 [feature request] GptManager's scalability issues with input & output parameters (opened Apr 28, 2024 by service-kit)
#435 [bug] Encountered an error in forward function: std::bad_cast (opened Apr 26, 2024 by wangqy1216; 1 of 4 tasks)
#429 [bug] max_batch_size seems to have no impact on model performance (opened Apr 23, 2024 by VitalyPetrov; 3 of 4 tasks)
#428 [bug] Performance Issue with return_context_logits Enabled in TensorRT-LLM (opened Apr 23, 2024 by gywlssww; 2 of 4 tasks)