I want to distribute work from a master server to multiple worker servers, in batches.

Ideally I would have a tasks.txt file with the list of tasks to execute:

cmd args 1
cmd args 2
cmd args 3
cmd args 4
cmd args 5
cmd args 6
cmd args 7
...
cmd args n
Each worker server would connect over SSH, read the file, and mark each line as in progress or done:

#cmd args 1 #worker1 - done
#cmd args 2 #worker2 - in progress
#cmd args 3 #worker3 - in progress
#cmd args 4 #worker1 - in progress
cmd args 5
cmd args 6
cmd args 7
...
cmd args n
I know how to make the SSH connection, read the file, and execute a command remotely, but I don't know how to make the read and write a single atomic operation, so that no two servers start the same task, and how to update the right line in the file.
I would like each worker to go through the list of tasks and lock the next available one itself, rather than have the master actively command the workers; that way I can run a flexible number of worker clones and start or stop them depending on how fast I need the tasks completed.
Update:

An idea for the worker script:
#!/bin/bash
taskcmd=""
taskline=0
masterssh="ssh usr@masterhost"
tasksfile="/path/to/tasks.txt"

function gettask(){
    while [[ $taskcmd == "" ]]; do
        sleep 1
        taskcmd_and_taskline=( $($masterssh "#read_and_lock_next_available_line $tasksfile;") )
        taskcmd=${taskcmd_and_taskline[0]}
        taskline=${taskcmd_and_taskline[1]}
    done
}

function updatetask(){
    message=$1
    $masterssh "#update_currenttask $tasksfile $taskline $message;"
}

function dotask(){
    $taskcmd    # execute the claimed command
}

while [[ 1 -eq 1 ]]; do
    gettask
    updatetask "in progress"
    dotask
    taskerrcode=$?
    if [[ $taskerrcode -eq 0 ]]; then
        updatetask "done, finished successfully"
    else
        updatetask "done, error $taskerrcode"
    fi
    taskcmd=""
    taskline=0
done
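The two master-side commands in the script above are only placeholders. A minimal sketch of what the read-and-lock helper could look like, assuming a claimed or finished line starts with '#' and an available one does not (the function name `claim_next`, the lock-file path, and the marking scheme are my assumptions; it also relies on GNU sed -i):

```shell
#!/bin/bash
# Hypothetical master-side helper: atomically claim the first line of the
# tasks file that does not start with '#', print the command and then its
# line number, and mark the line as claimed. The lock file sits next to
# the tasks file so every caller serializes on the same lock.
claim_next() {
    local tasksfile=$1
    (
        flock -x 9                                           # block until the lock is free
        local n
        n=$(grep -n -m1 -v '^#' "$tasksfile" | cut -d: -f1)  # first unclaimed line
        [[ -z $n ]] && return 1                              # nothing left to do
        sed -n "${n}p" "$tasksfile"                          # print the command itself
        echo "$n"                                            # print its line number
        sed -i "${n}s/^/#/" "$tasksfile"                     # mark the line claimed
    ) 9>>"$tasksfile.lock"
}
```

A worker would call this over ssh and parse the two output lines; an update_currenttask counterpart could take the same lock around a similar sed -i edit of the claimed line.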
You can use flock to serialize concurrent access to a file:

exec 200>>/some/any/file    ## create a file descriptor
flock -w 30 200             ## take the lock on /some/any/file, with a timeout of 30 sec.
You can point the file descriptor at your tasks list or at any other file, but of course it must be the same file everywhere for flock to work. The lock is removed automatically when the process that created it finishes or fails, but you can also remove it explicitly when you no longer need it:
flock -u 200
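For the task list itself, the lock only needs to be held around the read-and-mark step; the task can then run unlocked. A minimal local sketch with an explicit unlock (a temp file stands in for the real lock file so this runs anywhere):

```shell
#!/bin/bash
lockfile=$(mktemp)          # stand-in for a lock file next to tasks.txt
exec 200>>"$lockfile"       # open the lock file on descriptor 200
flock -w 30 200             # enter the critical section (wait up to 30 s)
echo "claiming a task..."   # the read-and-mark of tasks.txt would go here
flock -u 200                # release early: the task itself runs unlocked
echo "running the task..."  # long work happens outside the lock
```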
A usage sample:

ssh user@x.x.x.x '
set -e
exec 200>>f
echo locking...
flock -w 10 200
echo working...
sleep 5
'
set -e makes the script fail if any step fails. Play with the sleep time and execute the script several times in parallel: only one sleep will execute at a time.
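You can check the "only one sleep at a time" claim locally, without ssh — a quick sketch that times two contending jobs (two 2-second critical sections should serialize to roughly 4 seconds total):

```shell
#!/bin/bash
lockfile=$(mktemp)
start=$SECONDS
for i in 1 2; do
    (
        flock -w 10 9     # wait up to 10 s for an exclusive lock on FD 9
        sleep 2           # critical section: only one job at a time gets here
    ) 9>>"$lockfile" &
done
wait                      # both background jobs finished
elapsed=$((SECONDS - start))
echo "elapsed: ${elapsed}s"
```

Without the flock lines the two sleeps overlap and the loop finishes in about 2 seconds instead.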