✂️ go-tiktoken

A Go implementation of OpenAI's tiktoken tokenizer. The vocabularies are embedded in the package, so nothing has to be downloaded at runtime.

Installation

go get github.com/hupe1980/go-tiktoken

How to use

package main

import (
	"fmt"
	"log"

	"github.com/hupe1980/go-tiktoken"
)

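func main() {
	// Look up the encoding that belongs to the given model name.
	encoding, err := tiktoken.NewEncodingForModel("gpt-3.5-turbo")
	if err != nil {
		log.Fatal(err)
	}

	// Encode returns the token IDs and the matching token strings.
	// The two nil arguments are left as in the original example.
	ids, tokens, err := encoding.Encode("Hello World", nil, nil)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println("IDs:", ids)
	fmt.Println("Tokens:", tokens)
}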

Output:

IDs: [9906 4435]
Tokens: [Hello  World]

For more usage examples, see the _examples directory.

Supported Encodings

  • ✅ o200k_base
  • ✅ cl100k_base
  • ✅ p50k_base
  • ✅ p50k_edit
  • ✅ r50k_base
  • ✅ gpt2
  • ✅ claude
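
To see how a model name might resolve to one of the encodings above, here is a minimal, standalone sketch. It does not call this library: the helper encodingNameForModel and both lookup tables are hypothetical, and the model-to-encoding pairs are a small subset taken from the conventions of OpenAI's upstream tiktoken, so they may not match what NewEncodingForModel does internally. The claude encoding is left out because no model mapping for it is stated here.

```go
package main

import (
	"fmt"
	"strings"
)

// exactModels is an illustrative subset of model names and the encodings
// they use in OpenAI's upstream tiktoken. It is an assumption that this
// library follows the same table.
var exactModels = map[string]string{
	"gpt-4o":                "o200k_base",
	"gpt-4":                 "cl100k_base",
	"gpt-3.5-turbo":         "cl100k_base",
	"text-davinci-003":      "p50k_base",
	"text-davinci-edit-001": "p50k_edit",
	"davinci":               "r50k_base",
	"gpt2":                  "gpt2",
}

// prefixModels covers versioned names such as "gpt-4o-mini".
var prefixModels = []struct{ prefix, encoding string }{
	{"gpt-4o-", "o200k_base"},
	{"gpt-4-", "cl100k_base"},
	{"gpt-3.5-turbo-", "cl100k_base"},
}

// encodingNameForModel is a hypothetical helper: it tries an exact match
// first, then falls back to prefix matching, and reports whether the
// model was recognised.
func encodingNameForModel(model string) (string, bool) {
	if name, ok := exactModels[model]; ok {
		return name, true
	}
	for _, p := range prefixModels {
		if strings.HasPrefix(model, p.prefix) {
			return p.encoding, true
		}
	}
	return "", false
}

func main() {
	for _, m := range []string{"gpt-3.5-turbo", "gpt-4o-mini", "unknown-model"} {
		name, ok := encodingNameForModel(m)
		fmt.Println(m, name, ok)
	}
}
```

Returning a boolean alongside the name lets the caller decide how to handle unknown models, for example by falling back to a default encoding or reporting an error.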

License

MIT