revised readme and app notes based on jart suggestions
mofosyne committed Apr 5, 2024
1 parent b2598c9 commit 131432e
Showing 2 changed files with 45 additions and 16 deletions.
44 changes: 34 additions & 10 deletions APPLICATION.md
@@ -1,12 +1,12 @@
# Application Notes

- This Applicatio Notes is targeted at both model packagers and application developers and is for infomation that is not directly relevant to users who are just simply trying to use llamafiles in a standalone manner. Instead it is for developers who want their models to better integrate with other developers models or applications. (Think of this as an informal ad-hoc community standards page)
+ These Application Notes are targeted at both model packagers and application developers, and cover information that is not directly relevant to users who are simply trying to use llamafiles in a standalone manner. Instead, they are for developers who want their models to better integrate with other developers' models or applications. (Think of this as an informal, ad-hoc community standards page.)

## Finding Llamafiles

While we do not have a package manager for llamafiles, application developers are recommended
to search for AI models tagged as `llamafile` in the Hugging Face model repository.
- Be sure to display the publishing user or organisation and to sort by heart count.
+ Be sure to display the publishing user or organisation and to sort by trending.

Within a llamafile repository entry in Hugging Face, there may be multiple `*.llamafile` files
to choose from. The current convention to describe each sub-entry of llamafiles is to
@@ -26,17 +26,41 @@ For example a model card for a llamafile should have this section that you can p
<!-- README_llamafile.md-provided-files end -->
```
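
As a sketch of the kind of lookup an application might script, the snippet below lists models tagged `llamafile` via the `huggingface_hub` client, sorted by like count. The library choice and query parameters are our own illustration, not something llamafile itself prescribes:

```python
from huggingface_hub import HfApi  # third-party client: pip install huggingface_hub

api = HfApi()
# Fetch repositories tagged "llamafile", most-liked first.
for model in api.list_models(filter="llamafile", sort="likes", direction=-1, limit=10):
    print(model.id, model.likes)
```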

- ## Packaging A Llamafile
+ ## Llamafile Naming Convention

- The file naming convention for each llamafile is `<model-name>.<Quantisation Level>.llamafile` (e.g. `phi-2.Q2_K.llamafile`).
+ Llamafiles follow a naming convention of `<Model>-<Version>-<Parameters>-<Quantization>.llamafile`.

- ## Installing A Llamafile
+ The components are (a short parsing sketch follows this list):
+ 1. **Model**: A descriptive name for the model type or architecture.
+ 2. **Version (Optional)**: Denotes the model version number, formatted as `v<Major>.<Minor>`; if not specified, the model is assumed to be `v1`.
+    - Best practice is to include the model version number only if the model has multiple versions; treat an unversioned model as the first version, and/or check the model card.
+ 3. **Parameters**: Indicates the number of parameters and their scale, represented as `<count><scale-prefix>`:
+    - `T`: Trillion parameters.
+    - `B`: Billion parameters.
+    - `M`: Million parameters.
+    - `K`: Thousand parameters.
+ 4. **Quantization**: Specifies how the model parameters are quantized or compressed. The notation is influenced by the `./quantize --help` command in `llama.cpp`.
+    - Uncompressed formats:
+      - `F16`: 16-bit floats per weight
+      - `F32`: 32-bit floats per weight
+    - Quantization (compression) formats:
+      - `Q<X>`: X bits per weight, e.g. `4` (4 bits) or `8` (8 bits).
+      - Variants provide further details on how the quantized weights are interpreted:
+        - `_K`: k-quant models, which may carry the further specifiers `_S`, `_M`, and `_L` for small, medium, and large respectively; if none is specified, it defaults to medium.
+        - `_<num>`: Different approaches, with even numbers indicating the model weights as a scaling factor multiplied by the quantized weight, and odd numbers indicating the model weights as an offset factor plus a scaling factor multiplied by the quantized weight. This convention comes from this [llama.cpp issue ticket on QX_4](https://github.com/ggerganov/llama.cpp/issues/1240).
+          - Even number (0 or 2): `<model weight> = <scaling factor> * <quantised weight>`
+          - Odd number (1 or 3): `<model weight> = <offset factor> + <scaling factor> * <quantised weight>`
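
To make the convention concrete, here is a minimal, hypothetical Python sketch (our own illustration, not official llamafile tooling) that decomposes a conventionally named llamafile and restates the even/odd dequantization formulas above:

```python
import re

# Hypothetical pattern for <Model>-<Version>-<Parameters>-<Quantization>.llamafile
LLAMAFILE_NAME = re.compile(
    r"^(?P<model>.+?)"                       # descriptive model name
    r"(?:-(?P<version>v\d+(?:\.\d+)?))?"     # optional vMajor.Minor
    r"-(?P<parameters>\d+(?:\.\d+)?[TBMK])"  # parameter count and scale prefix
    r"-(?P<quantization>[A-Za-z0-9_]+)"      # e.g. F16, Q8_0, Q4_K_M
    r"\.llamafile$"
)

def parse_llamafile_name(filename: str) -> dict:
    match = LLAMAFILE_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a conventionally named llamafile: {filename}")
    parts = match.groupdict()
    parts["version"] = parts["version"] or "v1"  # unversioned models are assumed to be v1
    return parts

def dequantize(q: int, scale: float, offset: float = 0.0) -> float:
    # Even-numbered variants (e.g. Q4_0): weight = scale * q
    # Odd-numbered variants (e.g. Q4_1): weight = offset + scale * q
    return offset + scale * q

print(parse_llamafile_name("mistral-7B-Q4_K_M.llamafile"))
# {'model': 'mistral', 'version': 'v1', 'parameters': '7B', 'quantization': 'Q4_K_M'}
```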

- Llamafile are completely standalone and portable so does not require installation.
- However we have a path convention for ease of discovery by local applications scripts/program.
+ ## Installing A Llamafile And Making It Accessible To Other Local Applications

- - **Linux** : `~/.llamafile/*.llamafile`
+ Llamafiles are designed to be standalone and portable, eliminating the need for a traditional installation. For optimal discovery and integration with local application scripts/programs, we recommend the following search paths:

- <!-- TODO: Windows llamafile installation convention -->
+ - **System-wide Paths**:
+   - `/usr/share/llamafile` (Linux/MacOS/BSD): Ideal for developers creating packages, commonly populated via package managers like `apt-get install` on Debian-based Linux OSes.
+   - `/opt/llamafile` (Linux/MacOS/BSD): Positioned in the `/opt` directory, suitable for installers downloaded directly from the web.
+   - `C:\llamafile` (Windows): A direct path for Windows systems.

- <!-- TODO: Mac llamafile installation convention -->
+ - **User-specific Path**:
+   - `~/.llamafile` (Linux/MacOS/BSD): Located in the user's home directory, facilitating user-specific configurations in line with Unix-like conventions.

+ For applications or scripts referencing the llamafile path, setting the environment variable `$LLAMAFILE_PATH` to a single path simplifies configuration and keeps behaviour consistent across applications.
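
To illustrate, a local application might walk those locations roughly as follows; `find_llamafiles` is a hypothetical helper of our own, not an API that ships with llamafile:

```python
import os
from pathlib import Path

# Search locations from the convention above, in rough precedence order.
SEARCH_PATHS = [
    Path("~/.llamafile").expanduser(),  # user-specific (Linux/MacOS/BSD)
    Path("/opt/llamafile"),             # system-wide, web installers
    Path("/usr/share/llamafile"),       # system-wide, package managers
    Path(r"C:\llamafile"),              # Windows
]

def find_llamafiles():
    """Yield discovered *.llamafile files, honouring $LLAMAFILE_PATH if set."""
    override = os.environ.get("LLAMAFILE_PATH")
    paths = [Path(override)] if override else SEARCH_PATHS
    for directory in paths:
        if directory.is_dir():
            yield from sorted(directory.glob("*.llamafile"))

if __name__ == "__main__":
    for llamafile in find_llamafiles():
        print(llamafile)
```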
17 changes: 11 additions & 6 deletions README.md
@@ -53,16 +53,21 @@ chmod +x llava-v1.5-7b-q4.llamafile

**Having trouble? See the "Gotchas" section below.**

- ### Installing A Llamafile
+ ## Installing A Llamafile And Making It Accessible To Other Local Applications

- Llamafile are completely standalone and portable so does not require installation.
- However we have a path convention for ease of discovery by local applications scripts/program.
+ Llamafiles are designed to be standalone and portable, eliminating the need for a traditional installation. For optimal discovery and integration with local application scripts/programs, we recommend the following install paths:

- - **Linux** : `~/.llamafile/*.llamafile`
+ - **System-wide Paths**:
+   - `/opt/llamafile` (Linux/MacOS/BSD)
+   - `C:\llamafile` (Windows)

- <!-- TODO: Windows llamafile installation convention -->
+ - **User-specific Path**:
+   - `~/.llamafile` (Linux/MacOS/BSD)

- <!-- TODO: Mac llamafile installation convention -->
+ - **Additional Search Locations**: These paths serve as a reference for applications or scripts that might expect to find llamafiles here. However, direct installation to this directory is discouraged unless you know what you are doing.
+   - `/usr/share/llamafile` (Linux/MacOS/BSD)

+ For applications or scripts referencing the llamafile path, set the environment variable `$LLAMAFILE_PATH` to a single path.
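
For instance, a script might honour that variable with a fallback to the user-specific path (a hypothetical snippet; see APPLICATION.md for a fuller discovery sketch):

```python
import os

# Prefer an explicit $LLAMAFILE_PATH, falling back to the user-specific path.
llamafile_dir = os.environ.get("LLAMAFILE_PATH", os.path.expanduser("~/.llamafile"))
```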


### JSON API Quickstart
