
Ollama token counts #1179

Answered by marcklingen
aiseei asked this question in Support
Feb 15, 2024 · 1 comment · 5 replies

Hi @aiseei,
Tracking token usage when using Ollama works well with Langfuse.

Ollama returns token counts at the end of the stream, or together with the response when not streaming. I just tried this locally and it worked well.
Example from the API reference:

{
  "model": "llama2",
  "created_at": "2023-08-04T19:22:45.499127Z",
  "response": "",
  "done": true,
  "context": [1, 2, 3],
  "total_duration": 10706818083,
  "load_duration": 6338219291,
  "prompt_eval_count": 26, # <- input tokens
  "prompt_eval_duration": 130079000,
  "eval_count": 259, # <- output tokens
  "eval_duration": 4232710000
}
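
When streaming, the counts arrive in the final chunk (the one with "done": true), so you just read them off the last line of the NDJSON stream. A minimal sketch in Python, assuming a local Ollama server on the default port and the requests library:

import json
import requests

# With "stream": true (Ollama's default), /api/generate emits one JSON
# object per line; the final object has "done": true and carries the
# token counts shown above.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama2", "prompt": "Why is the sky blue?"},
    stream=True,
)
input_tokens = output_tokens = None
for line in resp.iter_lines():
    if not line:
        continue
    chunk = json.loads(line)
    if chunk.get("done"):
        input_tokens = chunk["prompt_eval_count"]  # <- input tokens
        output_tokens = chunk["eval_count"]        # <- output tokens
print(input_tokens, output_tokens)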

You can then add these token counts to the generation object in Langfuse to track them (docs).
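
As a rough sketch of that last step, assuming the Langfuse Python SDK and its generic usage format (field names may differ between SDK versions, so check the docs linked above):

from langfuse import Langfuse

langfuse = Langfuse()  # reads LANGFUSE_* credentials from the environment

# Example token counts taken from the API response above.
langfuse.generation(
    name="ollama-llama2",
    model="llama2",
    input="Why is the sky blue?",
    output="The sky is blue because...",
    usage={
        "input": 26,    # prompt_eval_count
        "output": 259,  # eval_count
        "unit": "TOKENS",
    },
)
langfuse.flush()  # make sure events are sent before the process exits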

Answer selected by marcklingen
This discussion was converted from issue #1178 on February 16, 2024 00:30.