But 6.55 T/s is the speed that would have been achieved if the model had generated 387 tokens. The model actually generated only 78 tokens, so the real generation speed is 78 / 59.05 = 1.32 tokens/s.
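To make the arithmetic easy to check, here is a minimal sketch in Python. It is not the project's actual code; the helper `tokens_per_second` and the variable names are hypothetical, and the only inputs are the numbers quoted in this report (387 tokens, 78 tokens, 59.05 s).

```python
def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Return throughput in tokens/s, or 0.0 if no time has elapsed."""
    if elapsed_seconds <= 0:
        return 0.0
    return token_count / elapsed_seconds


# Numbers taken from the report above; the names are illustrative only.
assumed_tokens = 387     # count the reported speed appears to be based on
generated_tokens = 78    # tokens the model actually produced
elapsed_seconds = 59.05  # generation time

reported_speed = tokens_per_second(assumed_tokens, elapsed_seconds)
actual_speed = tokens_per_second(generated_tokens, elapsed_seconds)

print(f"reported: {reported_speed:.2f} T/s")  # reported: 6.55 T/s
print(f"actual:   {actual_speed:.2f} T/s")    # actual:   1.32 T/s
```

Dividing by the number of tokens actually generated, rather than the larger count, gives the 1.32 tokens/s figure.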
Can you see if the latest version solves this issue?
It's good, thank you.
However, as you can see, if I abort the generation, a new log line appears, "Generating (301 / 300 tokens)", which is wrong. I don't know if it's related. Let me know if I should open a new issue for this.