Filespooler provides the fspl queue-write command to easily add files to a queue. However, the design of Filespooler intentionally makes it easy for some other command to add files to the queue. For instance, Using Filespooler over Syncthing has Syncthing do the final write, the nncp-file (but not the nncp-exec) method in Using Filespooler over NNCP has NNCP do it, and so forth.
This page documents the requirements for a tool to write to the Filespooler queue. Note that fspl queue-write is designed to implement these requirements for local writes. When some other tool such as Syncthing or NNCP performs the direct write into the queue directory, care must be taken to make sure it is done correctly.
The fundamental thing we are concerned with is avoiding race conditions during file write. That is, we don’t want Filespooler to start processing a file that hasn’t yet been completely written.
The Requirements For Writing a File to the Queue
1. Once the file is completely written and closed, and ready for Filespooler to process, it MUST reside in the jobs directory and meet this pattern: fspl-*.fspl
2. Any files that reside in the jobs directory and are NOT yet completely written MUST NOT meet that pattern.
3. Every valid job file in the jobs directory MUST have a unique name that meets the pattern under item 1. Both random and non-random names are fine, so long as they are unique.
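As a rough illustration of items 1 and 2, the name check can be sketched as a small shell function. The function name is_job_name is made up for this example, and the pattern fspl-*.fspl is the one used in the testing section of this page; this is not Filespooler's own matching code.

```shell
# Hypothetical helper, not part of Filespooler.
# Return 0 if NAME (a bare filename, no directory part) matches fspl-*.fspl.
is_job_name() {
    case "$1" in
        fspl-*.fspl) return 0 ;;   # looks like a finished job file
        *)           return 1 ;;   # temporary or unrelated name
    esac
}
```

With this check, fspl-1234.fspl counts as a job file, while fspl-1234.fspl.tmp does not, which is what makes a .tmp suffix a usable temporary name.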
Methods of Compliance
For any write, the text between fspl- and .fspl needs to be unique for each job. fspl queue-write generates a UUID, but anything unique will do. You can call fspl gen-filename to generate a unique filename for this purpose.
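If fspl is not installed on the machine doing the write, something along these lines could stand in for it. This is only a sketch: gen_job_name is a hypothetical helper, it uses random hex digits rather than a real UUID, and it assumes /dev/urandom is available.

```shell
# Hypothetical stand-in for `fspl gen-filename`; not how Filespooler does it.
# Prints a name of the form fspl-<32 hex digits>.fspl.
gen_job_name() {
    # Read 16 random bytes and render them as 32 lowercase hex digits.
    id=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
    printf 'fspl-%s.fspl\n' "$id"
}
```

Any other source of uniqueness (a hostname plus a counter, for example) would serve equally well, as long as no two jobs ever share a name.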
Then, there is the question of how to make sure files that are not yet completely written are invisible to Filespooler (items 1 and 2 above). There are two common ways of doing this:
- Writing to a file with a temporary name in jobs
- Writing to a file outside jobs, then renaming or hard linking into jobs
Let’s discuss them both.
Writing to a file with a temporary name in the jobs directory
This is the approach used by fspl queue-write. You can easily do it with a shell script sort of like this, run from within the jobs directory:

    FILENAME="$(fspl gen-filename)"
    TMPNAME="$FILENAME.tmp"
    cat > "$TMPNAME"
    mv "$TMPNAME" "$FILENAME"

This is roughly what fspl queue-write does itself. Since a rename is atomic (that is, the file exists completely at either its old or its new name), this is a safe way to do it.
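One thing the short script glosses over is what happens when the write fails partway. A slightly more defensive sketch might look like the following; queue_put is a hypothetical helper name, and the caller is assumed to supply the jobs directory and an already-unique job name.

```shell
# Hypothetical helper, not part of Filespooler.
# Usage: queue_put JOBSDIR NAME < data
# Writes stdin to JOBSDIR/NAME.tmp, then renames it to JOBSDIR/NAME.
queue_put() {
    jobsdir=$1
    name=$2
    tmp="$jobsdir/$name.tmp"
    # If the write fails, remove the partial file and never rename it.
    if ! cat > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    # Rename within the same directory, so the final name appears all at once.
    mv "$tmp" "$jobsdir/$name"
}
```

It could be called as, say, queue_put queuedir/jobs "$(fspl gen-filename)" < payload. Because the temporary file lives in the same directory as the final one, the mv never has to cross a filesystem boundary.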
Writing to a temporary file outside the jobs directory
You can create other subdirectories besides jobs in your queue. You can write a file there, then use ln to hard link it into jobs, and then delete the link at the temporary location.
The problem with this is that mv is only a rename if both the source and target file are on the same filesystem. A hard link or rename can’t cross a filesystem boundary; when the two locations are on different filesystems, mv falls back to copying and deleting, which is not atomic. So some of the approaches documented here (for instance, using rclone mount as described at Using Filespooler over rclone and S3, rsync.net, etc.) will not work, because the destination is a different filesystem than the source. Therefore, this method must be used with much more care.
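The hard link variant might be sketched like this. The helper name queue_link and the staging directory name incoming are assumptions made for the example, not Filespooler conventions; the jobs directory is assumed to exist already.

```shell
# Hypothetical helper, not part of Filespooler.
# Usage: queue_link QUEUEDIR NAME < data
# Stages stdin in QUEUEDIR/incoming, then hard links it into QUEUEDIR/jobs.
queue_link() {
    queue=$1
    name=$2
    staging="$queue/incoming"   # assumed staging directory name
    mkdir -p "$staging" || return 1
    # If the write fails, discard the partial staged file.
    if ! cat > "$staging/$name"; then
        rm -f "$staging/$name"
        return 1
    fi
    # ln fails (it never copies) if staging and jobs are on different
    # filesystems, or if a job with this name already exists.
    if ! ln "$staging/$name" "$queue/jobs/$name"; then
        echo "queue_link: could not hard link $name into jobs" >&2
        return 1
    fi
    # The job is now visible in jobs; drop the staging link.
    rm -f "$staging/$name"
}
```

Using ln rather than mv here has a useful side effect: a cross-filesystem setup fails loudly instead of silently degrading into a non-atomic copy.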
Testing a setup
Before using any setup like this, it is good to test whether it works right. I advise generating a very large packet, then using ls -la queuedir/jobs on the destination to see how it appears while it is being written. If you notice a fspl-*.fspl file with a growing size, then it’s NOT working right. If the file only appears with its full, final size, then it is working right. A growing file with a temporary name is fine.
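Watching ls output by eye works, but the same check can be roughly automated. The sketch below takes a snapshot of the sizes of all fspl-*.fspl files, then later reports any whose size has changed. Both function names are hypothetical, and the approach only catches growth that happens between the two samples.

```shell
# Hypothetical helpers, not part of Filespooler.

# Print "SIZE NAME" for every file in DIR matching fspl-*.fspl.
snapshot_jobs() {
    for f in "$1"/fspl-*.fspl; do
        [ -f "$f" ] || continue   # skip the unexpanded glob when nothing matches
        printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "${f##*/}"
    done
}

# Usage: growing_jobs DIR < earlier-snapshot
# Prints each job file whose size differs from the snapshot; returns 1 if any.
growing_jobs() {
    dir=$1
    status=0
    while read -r size name; do
        [ -n "$name" ] || continue
        [ -f "$dir/$name" ] || continue   # already processed and removed
        now=$(wc -c < "$dir/$name" | tr -d ' ')
        if [ "$now" != "$size" ]; then
            echo "$name"
            status=1
        fi
    done
    return $status
}
```

A test run might look like snapshot_jobs queuedir/jobs > /tmp/before; sleep 5; growing_jobs queuedir/jobs < /tmp/before while the large packet is in transit. Any name it prints is a file that was visible under its final name before it was complete.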