Remote task queue using bash & ssh for variable number of live workers


I want to distribute work from a master server to multiple worker servers using batches.

Ideally I would have a tasks.txt file with the list of tasks to execute:

    cmd args 1
    cmd args 2
    cmd args 3
    cmd args 4
    cmd args 5
    cmd args 6
    cmd args 7
    ...
    cmd args n

Each worker server would connect to the master using ssh, read the file, and mark each line as in progress or done:

    #cmd args 1  #worker1 - done
    #cmd args 2  #worker2 - in progress
    #cmd args 3  #worker3 - in progress
    #cmd args 4  #worker1 - in progress
    cmd args 5
    cmd args 6
    cmd args 7
    ...
    cmd args n

I know how to make the ssh connection, read the file, and execute commands remotely, but I don't know how to make the read and write an atomic operation, so that no two servers start the same task, nor how to update the line in place.

I would like each worker to go through the list of tasks and lock the next available one, rather than having the master actively command the workers; that way I can run a flexible number of worker clones and start or stop them depending on how fast I need the tasks completed.

Update:

An idea for the worker script:

    #!/bin/bash

    taskcmd=""
    taskline=0
    masterssh="ssh usr@masterhost"
    tasksfile="/path/to/tasks.txt"

    function gettask(){
        while [[ $taskcmd == "" ]]; do
            sleep 1
            # placeholder: the master is expected to reply "<taskline> <taskcmd>"
            taskcmd_and_taskline=$($masterssh "#read_and_lock_next_available_line $tasksfile;")
            taskline=${taskcmd_and_taskline%% *}  # first field: the line number
            taskcmd=${taskcmd_and_taskline#* }    # the rest: the command to run
        done
    }

    function updatetask(){
        message=$1
        $masterssh "#update_currenttask $tasksfile $taskline $message;"
    }

    function dotask(){
        eval "$taskcmd"  # run the task; its exit code is passed to the caller
    }

    while true; do
        gettask
        updatetask "in progress"
        dotask
        taskerrcode=$?
        if [[ $taskerrcode -eq 0 ]]; then
            updatetask "done, finished successfully"
        else
            updatetask "done, error $taskerrcode"
        fi
        taskcmd=""
        taskline=0
    done
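Since each worker pulls its own work, scaling is just a matter of starting or stopping copies of this script. A minimal sketch, assuming the script is saved as /path/to/worker.sh on each worker (hostnames and the path are placeholders):

    for h in worker1 worker2 worker3; do
        ssh "usr@$h" 'nohup /path/to/worker.sh >/dev/null 2>&1 &'
    done

To scale down, kill the worker.sh process on a host; the remaining workers keep draining the list.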

You can use flock to coordinate concurrent access to a file:

    exec 200>>/some/any/file  ## create a file descriptor on the file
    flock -w 30 200           ## take an exclusive lock, timing out after 30 sec.

You can point the file descriptor at the tasks list or at any other file, but of course it must be the same file everywhere for flock to work. The lock is removed automatically when the process that created it finishes or fails. You can also remove the lock yourself when you no longer need it:

flock -u 200 

A usage sample:

    ssh user@x.x.x.x '
        set -e
        exec 200>>f
        echo locking...
        flock -w 10 200
        echo working...
        sleep 5
    '

set -e makes the script fail as soon as any step fails. Play with the sleep time and execute the script several times in parallel: only one sleep will run at a time.
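For example, launching three copies at once (same host and lock file as above):

    for i in 1 2 3; do
        ssh user@x.x.x.x '
            set -e
            exec 200>>f
            echo locking...
            flock -w 10 200
            echo working...
            sleep 5
        ' &
    done
    wait

The first two copies run their sleep one after the other; the third has to wait about 10 seconds for the lock, so it may hit the -w 10 timeout and abort, which is flock and set -e doing their job.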


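Applied to the question, the two commented-out master-side helpers in the worker script can be sketched with the same flock pattern. This is only an illustration under assumptions: claimed lines get a leading # plus a "#workerN - status" marker as in the question's example, the lock is taken on a side file tasks.txt.lock rather than on tasks.txt itself (sed -i replaces the tasks file, so a lock held on the old copy would no longer serialize access to the new one), and the helper names and the "<line number> <command>" reply format are chosen to match the worker script above:

    #!/bin/bash
    # Hypothetical master-side helpers for the worker script in the question.
    # The lock goes on "$1.lock", not on the tasks file itself, because
    # sed -i replaces the tasks file and a lock on the old copy would
    # no longer serialize access to the new one.

    lock()   { exec 200>>"$1.lock" && flock -w 30 200; }
    unlock() { flock -u 200; }

    # Usage: read_and_lock_next_available_line <tasksfile> [workername]
    # Prints "<line number> <command>" for the first unclaimed line and
    # marks that line as claimed, as in the question's example.
    read_and_lock_next_available_line() {
        tasksfile=$1 worker=${2:-$(hostname)}
        lock "$tasksfile" || return 1
        n=0
        while IFS= read -r line; do
            n=$((n + 1))
            [[ $line == '#'* ]] && continue   # already claimed or done
            sed -i "${n}s/.*/#& #${worker} - in progress/" "$tasksfile"
            echo "$n $line"
            unlock
            return 0
        done < "$tasksfile"
        unlock
        return 1                              # no free task left
    }

    # Usage: update_currenttask <tasksfile> <taskline> <message...>
    # Naive: assumes the task command itself contains no " - ".
    update_currenttask() {
        tasksfile=$1 taskline=$2
        shift 2
        lock "$tasksfile" || return 1
        sed -i "${taskline}s/ - .*/ - $*/" "$tasksfile"
        unlock
    }

The worker would then call them over ssh, e.g. ssh usr@masterhost 'source /path/to/taskq.sh; read_and_lock_next_available_line /path/to/tasks.txt worker1' (the taskq.sh path is a placeholder).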