WXY edited this page Feb 15, 2021 · 6 revisions

Think of this wiki as an "operations manual" for how to deploy the service to solve problems.

Below is a brief overview of the configuration concepts. For a full list of configuration properties and how to use them, see the Configurations Reference.

How it works

Typical fs-curator pipeline

  • Directories are nominated as hoppers, which are then watched for rename/move events and, for files, for close-after-write events.
  • The file or directory is then offered to transforms, which perform their changes in a colocated temporary directory.
  • Transforms are repeated until no further transforms are possible.
  • Files are then linked into a centralized "collection" directory. We take this opportunity to dedupe the new file and assign it a numeric ID for global ordering.
    • This mono-collection allows us to use your filesystem to its full potential, granting the ability to do efficient b-tree based searches without metadata files.
  • The resulting files are offered to stores, which may choose to create links or copies.
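
The collection step above can be sketched in a few lines of Python. This is a minimal illustration, not the service's actual code: the function name `add_to_collection`, the SHA-256 content hash used for deduplication, the in-memory `seen` index, and the zero-padded `ID-name` file naming are all assumptions made for the example.

```python
import hashlib
import os
import shutil


def add_to_collection(src, collection_dir, seen, next_id):
    """Link src into collection_dir unless identical content is already there.

    seen maps content digests to paths already in the collection.
    Returns (path_in_collection, next_id_to_use).
    """
    with open(src, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    if digest in seen:
        # Duplicate content: reuse the existing entry and consume no ID.
        return seen[digest], next_id
    # Zero-padding keeps lexical order identical to numeric order.
    name = f"{next_id:012d}-{os.path.basename(src)}"
    dest = os.path.join(collection_dir, name)
    try:
        os.link(src, dest)  # hard link: no data is copied
    except OSError:
        shutil.copy2(src, dest)  # e.g. src is on a different filesystem
    seen[digest] = dest
    return dest, next_id + 1
```

With names of this shape, a plain sorted directory listing already reflects the global ordering, so no separate metadata file is needed to recover it.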

Hoppers

A hopper tells the service which file and/or directory changes it should care about and where to find them. It also dictates the transforms that may be applied.

[hopper]
path = /home/user/Downloads
include = source_path /\.zip$/i
transform = unzip
transform = unrar
transform = untar
transform = normalize_name
store = programs
store = pictures
store = videos
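
To make the `include` rule easier to read, here is a rough Python sketch of how a rule of the form `field /regex/flags` could be evaluated. The helper names `compile_rule` and `matches` are hypothetical, and the use of Python's `re` dialect is an assumption; the service's real matching rules are described in the Configurations Reference.

```python
import re

# field name, then a /regex/ literal with optional trailing flags such as "i"
RULE = re.compile(r"^(?P<field>\w+)\s+/(?P<pattern>.*)/(?P<flags>[a-z]*)$")


def compile_rule(rule):
    """Split a rule string into (field_name, compiled_regex)."""
    m = RULE.match(rule.strip())
    if m is None:
        raise ValueError(f"unrecognized rule: {rule!r}")
    flags = re.IGNORECASE if "i" in m.group("flags") else 0
    return m.group("field"), re.compile(m.group("pattern"), flags)


def matches(rule, attributes):
    """Return True if the attribute named by the rule matches its regex."""
    field, regex = compile_rule(rule)
    value = attributes.get(field)
    return value is not None and regex.search(value) is not None
```

Under these assumptions, `matches(r"source_path /\.zip$/i", {"source_path": "/home/user/Downloads/Photos.ZIP"})` is true because of the `i` flag, while a path ending in `.txt` is rejected.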

Transforms

A transform invokes an external program to change files. This process is repeated until no further transforms are possible. Transforms can feed into each other, so be careful not to create infinite loops.

[transform]
name = unzip
include = file_path /\.zip$/i
exec = /usr/bin/unzip
args = -o {from} -d {to}
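
As an illustration of the `{from}` and `{to}` placeholders, the sketch below builds an argument vector from an `exec` path and an `args` template. The function `build_command` is hypothetical, and splitting the template into tokens before substituting is an assumption about how the service behaves, chosen here so that paths containing spaces stay a single argument.

```python
import re
import shlex

PLACEHOLDER = re.compile(r"\{(from|to)\}")


def build_command(exec_path, args_template, src, dest):
    """Expand {from} and {to} in args_template into an argv list."""
    values = {"from": src, "to": dest}
    argv = [exec_path]
    # Split first, substitute second: a path with spaces remains one token.
    for token in shlex.split(args_template):
        argv.append(PLACEHOLDER.sub(lambda m: values[m.group(1)], token))
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)` without going through a shell, which avoids quoting problems with unusual file names.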

When a transform matches a file or directory:

  1. If the change is a file, it is "boxed" in its own directory to avoid interfering with existing files and FS watchers
    • Note that this is not repeated for files created by the transform (e.g. archives in archives)
  2. A temporary directory is created and the executable is invoked with the path of the file and the path of the temporary directory
  3. The temporary directory is scanned for files, which are then moved back into the box
  4. The process repeats from step 2 with the next matching transform
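
The loop in the steps above can be sketched as follows. This is a simplified model under several assumptions: transforms are represented as hypothetical `(matches, run)` pairs of Python callables standing in for the external programs, the input file is removed once a transform has consumed it, and a `max_rounds` limit (not a documented setting) guards against transforms that feed each other forever.

```python
import os
import shutil
import tempfile


def run_transforms(box, transforms, max_rounds=20):
    """Apply transforms to the files in box until none of them match.

    transforms is a list of (matches, run) pairs, where matches(path)
    returns a bool and run(src_path, out_dir) writes results to out_dir.
    """
    parent = os.path.dirname(os.path.abspath(box))
    for _ in range(max_rounds):
        hit = None
        for name in sorted(os.listdir(box)):
            path = os.path.join(box, name)
            run = next((r for m, r in transforms if m(path)), None)
            if run is not None:
                hit = (path, run)
                break
        if hit is None:
            return  # nothing left to transform
        path, run = hit
        # The temporary directory is created next to the box (colocated).
        with tempfile.TemporaryDirectory(dir=parent) as tmp:
            run(path, tmp)
            os.remove(path)  # assumption: the input is consumed
            for produced in os.listdir(tmp):
                shutil.move(os.path.join(tmp, produced),
                            os.path.join(box, produced))
    raise RuntimeError("transforms did not settle; possible infinite loop")
```

In a real deployment `run` would call the configured executable; a callable keeps the sketch runnable without any archive tools installed.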

Stores

Stores are directories where the files will eventually be presented. Files that fail to land in any store are considered lost and are sent to the lost and found directory instead.

[store]
name = unzipped
path = /home/user/unzipped
include = source_path /\.zip$/i
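
The fallback to the lost and found directory can be pictured with the sketch below. It is illustrative only: `place_in_stores` is a hypothetical helper, each store is modeled as a `(directory, compiled regex on source_path)` pair, and files are copied rather than linked to keep the example short.

```python
import os
import re
import shutil


def place_in_stores(collected_path, source_path, stores, lost_and_found):
    """Copy a collected file into every matching store.

    stores is a list of (store_dir, include_regex) pairs. If no store
    accepts the file, it goes to lost_and_found instead.
    Returns the list of destination paths.
    """
    targets = [d for d, include in stores if include.search(source_path)]
    if not targets:
        targets = [lost_and_found]  # the file landed nowhere
    landed = []
    for directory in targets:
        os.makedirs(directory, exist_ok=True)
        dest = os.path.join(directory, os.path.basename(collected_path))
        shutil.copy2(collected_path, dest)
        landed.append(dest)
    return landed
```

With a single store whose rule is `re.compile(r"\.zip$", re.IGNORECASE)`, a file that originated from `a.ZIP` lands in that store, while one that originated from `a.rar` ends up in the lost and found directory.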