Calliope Sounds: Using directories as queues

Software managed queues are great tools. Most enterprise system installations have some sort of message broker available. These are sophisticated tools. This posting is not about them. This posting is about using directories as queues.

The basic idea of using a directory as a queue is to place new "work" as a file in a "queue" directory. When the queue-processor is ready to process the new work-files it first moves the queue-directory's files to the queue-processor's "working" directory and then processes them.

The assumption here is that the work-files have all their content. But we all know that writing to a file takes time and during this writing time the queue-processor might awake and start processing the new, incomplete work-files. This is not good. If you are using a directory as a queue it is likely your queue-processor is a script and, unfortunately, most script solutions have poor error handling. The result is that the incomplete work-file gets ignored.

The good news is that there is an easy solution. I first saw this solution used within the qmail MTA. A feature of Unix file systems is that adding to or removing files from a directory is an atomic action. That is, all processes wanting to alter the set of files in a directory are queued up and only one at a time is allowed to make changes. So, when you write the work-content file make sure to write it first outside of the queue-directory and then move the completed work-file to the queue-directory. It is critical that the the work-content creator do the moving.

The archetype for this is to create a queue-directory containing two sub-directories

mkdir queue/new/
mkdir queue/tmp/

Add the work-content file to queue/tmp with a globally unique file name. Once writing is finished and the file closed, then move it from queue/tmp/ to queue/new/. This even works nicely across networks, for example

QUEUE=/x/y/z
SOURCE="/a/b/c/foo.txt"
TARGET="$QUEUE/tmp/$(basename $SOURCE)-$(uuid)"
scp -q $SOURCE user@host:$TARGET
ssh -q user@host "mv $TARGET $QUEUE/new/$(basename $SOURCE)"