Locking in bash scripts

Sometimes you need to make sure that no more than one instance of your bash script is running at a time. If your platform has a flock command, then this is quite simple to do:

#!/bin/bash

LOCK_FILE=/tmp/my-script.lock
LOCK_FD=9

get_lock() {
    # eval is needed so that $LOCK_FD is expanded before the redirection is parsed
    eval "exec $LOCK_FD>$LOCK_FILE"
    flock -n $LOCK_FD
}

get_lock || exit

# ...
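
The eval is only needed because the descriptor number lives in a variable. If your bash is 4.1 or newer, a sketch like the following avoids it by letting bash pick a free descriptor itself and store its number in LOCK_FD (the {varname} redirection syntax is the only assumption here):

#!/bin/bash

LOCK_FILE=/tmp/my-script.lock

get_lock() {
    # bash assigns an unused file descriptor to LOCK_FD
    exec {LOCK_FD}>"$LOCK_FILE"
    flock -n "$LOCK_FD"
}

get_lock || exit

# ...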

When using this approach, remember that child processes inherit the file descriptors opened by the parent. I had a script that ran from cron: it started ssh-agent if one wasn't already running and then executed commands on multiple servers via ssh. ssh-agent inherited the lock file descriptor and kept it open, so the lock was never released and the script effectively ran only once, on the run that started ssh-agent. To avoid this, you must explicitly close the lock descriptor when invoking a command that spawns a long-lived child process. In my case, I had to do this:

#!/bin/bash

LOCK_FILE=/tmp/my-script.lock
LOCK_FD=9
SSH_KEY=/root/.ssh/id_rsa.for.ssh-agent

get_lock() {
    # eval is needed so that $LOCK_FD is expanded before the redirection is parsed
    eval "exec $LOCK_FD>$LOCK_FILE"
    flock -n $LOCK_FD
}

get_lock || exit

socket=$(find /tmp/ssh-*/agent.* -user root 2>/dev/null || true)
if [ -z "$socket" ]; then
    # eval is needed so that $LOCK_FD is expanded before the redirections are parsed
    # the lock fd must be closed explicitly here, otherwise ssh-agent keeps it open
    # and the lock can't be acquired again until ssh-agent exits
    eval ". <(ssh-agent $LOCK_FD>&-)"
    ssh-add $SSH_KEY
    return
else
    # ...
fi
#...
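
If a lock like this "sticks" and you are not sure which process is still holding it, it helps to look at who keeps the lock file open. Assuming lsof or fuser is available, either of the following will show the offending process (in the situation described above it would be ssh-agent):

lsof /tmp/my-script.lock
# or
fuser -v /tmp/my-script.lock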

If for some reason you can't use flock, the same functionality can be implemented in pure bash:

#!/bin/bash

set -u

PID_LIST=/tmp/test-get-lock.pid

get_lock() {
    local pid
    while true; do
        # scan the list of candidate pids from the top
        while read pid; do
            # skip pids whose processes no longer exist
            kill -0 $pid || continue
            # the first live pid that isn't ours owns the lock
            [ "$pid" != "$BASHPID" ] && return 1
            # it's our own pid: atomically rewrite the list to contain only us
            echo $BASHPID >$PID_LIST.new && mv $PID_LIST.new $PID_LIST && return 0
        done < $PID_LIST
        # no live pid found (or the file doesn't exist yet): register ourselves and retry
        echo $BASHPID >>$PID_LIST
    done
}

if get_lock 2>/dev/null; then
    sleep 1
    pids="$(cat $PID_LIST)"
    pid=$(echo "$pids"|head -n1)
    [ "$BASHPID" != "$pid" ] && echo "pid: $BASHPID unexpected pid: $pid $pids"
    echo "pid: $BASHPID get_lock success"
else
    echo "pid: $BASHPID get_lock failed"
fi

Here’s how it works: each process tries to read the PID list from the top, skipping PIDs whose processes no longer exist (kill -0 fails). If the first live PID belongs to another process, that process owns the lock and get_lock returns failure. If the first live PID is the caller's own, the caller rewrites the list so that it contains only its PID (the mv of the temporary file is atomic) and returns success. If no live PID is found at all, or the file doesn't exist yet, the process appends its own $BASHPID to the list and tries again.

How reliable is this solution? While debugging, I used the following command for testing:

rm -f /tmp/*.log
for x in {0000..9999}; do ./lock-test.sh >/tmp/$x.log 2>&1 & done
wait
echo "success: $(grep success /tmp/*.log|wc -l), failure: $(grep failed /tmp/*.log|wc -l), unexpected pid: $(grep unexpected /tmp/*.log|wc -l)"

The absence of unexpected pids meant that the code worked correctly. For final testing, I used the following command:

for y in {000..999}; do
  echo -n " $y"
  bash -c 'rm -f /tmp/*.log
    for x in {0000..9999}; do ./lock-test.sh >/tmp/$x.log 2>&1 & done
    wait' 2>/dev/null; grep unexpected /tmp/*.log && break
done

I ran this test on my 4-core i7 laptop, a 2-core virtual machine, and a 24-core server, and no problems were found in any of these cases. I admit that the testing was not exhaustive and the proposed code may still misbehave under some circumstances. That said, if you only use it to make sure a script run from cron works in a single instance, you will most likely have no problems.
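
For completeness, the cron entry itself needs nothing special; with the locking in place, overlapping runs simply exit. The path and schedule below are only illustrative:

*/5 * * * * /usr/local/bin/my-script.sh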
