A script for running processes in parallel in Bash

In Bash you can start new processes in the background simply by appending an ampersand (&) to a command. The wait command waits until all background processes have finished (to wait for a particular process, use wait PID, where PID is its process ID). So here’s a simple pseudocode for parallel processing:

NPROC=0
for ARG in "$@"; do
    command "$ARG" &
    NPROC=$((NPROC + 1))
    if [ "$NPROC" -ge 4 ]; then wait; NPROC=0; fi
done

That is, you run four processes at a time and wait until all of them have finished before starting the next four. This is sufficient if all of the processes take about equally long to finish, but it is suboptimal if their running times vary a lot.

A better solution is to track the process IDs and poll whether each of them is still running. In Bash, $! expands to the PID of the most recently started background process. On Linux, a running process has a corresponding directory /proc/PID.
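As a rough illustration of that idea, the sketch below starts one background job, captures its PID from $!, and checks for the /proc entry before and after the job ends. It assumes a Linux system with /proc mounted; the helper name is_running and the sleep job are placeholders, not part of the script presented later.

```shell
#!/bin/bash
# Hypothetical helper: succeeds while a directory /proc/<PID> exists (Linux only).
is_running() {
    [ -d "/proc/$1" ]
}

sleep 1 &          # placeholder for a real job
PID=$!             # PID of the most recently started background process

if is_running "$PID"; then
    echo "process $PID is still running"
fi

wait "$PID"        # block until that specific process has finished
if ! is_running "$PID"; then
    echo "process $PID has finished"
fi
```

On systems without /proc, kill -0 PID is a common way to test whether a process still exists.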

Based on the ideas given in an Ubuntu forum thread and a template on command line parsing, I wrote a simple script “parallel” that allows you to run virtually any simple command concurrently.

Assume that you have a program proc and you want to run something like proc *.jpg using three concurrent processes. Then simply do

parallel -j 3 proc *.jpg

The script takes care of dividing the task. Obviously -j 3 stands for three simultaneous jobs.
If you need command line options, use quotes to separate the command from the variable arguments, e.g.

parallel -j 3 "proc -r -A=40" *.jpg

Furthermore, -r allows even more sophisticated commands by replacing each asterisk in the command string with the argument:

parallel -j 6 -r "convert -scale 50% * small/small_*" *.jpg

That is, this executes convert -scale 50% file1.jpg small/small_file1.jpg for each of the jpg files. This is a real-life example for scaling down images by 50% (requires ImageMagick).
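One way such a replacement can be done in Bash is pattern substitution in parameter expansion. The following is only a sketch of that idea: the function name expand_cmd is hypothetical, and it merely builds and prints the command string without running it.

```shell
#!/bin/bash
# Hypothetical helper: replace every asterisk in the command template ($1)
# with the argument ($2) and print the resulting command line.
expand_cmd() {
    local template=$1 arg=$2
    # ${var//pattern/replacement} replaces all matches; the escaped \*
    # is matched literally instead of acting as a glob wildcard.
    local result=${template//\*/$arg}
    printf '%s\n' "$result"
}

expand_cmd "convert -scale 50% * small/small_*" file1.jpg
# prints: convert -scale 50% file1.jpg small/small_file1.jpg
```

Because the template is passed in quotes, the shell does not expand the asterisks as a glob before the substitution takes place.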

Finally, here’s the script. It can be easily manipulated to handle different jobs, too. Just write your command between #DEFINE COMMAND and #DEFINE COMMAND END.

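#!/bin/bash
NUM=0      # number of PIDs currently in the queue
QUEUE=""   # space-separated list of PIDs
MAX_NPROC=2 # default
REPLACE_CMD=0 # no replacement by default
USAGE="A simple wrapper for running processes in parallel.
Usage: $(basename "$0") [-h] [-r] [-j nb_jobs] command arg_list
 	-h		Shows this help
	-r		Replace asterisk * in the command string with argument
	-j nb_jobs 	Set number of simultaneous jobs [2]
 Examples:
 	$(basename "$0") somecommand arg1 arg2 arg3
 	$(basename "$0") -j 3 \"somecommand -r -p\" arg1 arg2 arg3
 	$(basename "$0") -j 6 -r \"convert -scale 50% * small/small_*\" *.jpg"

# add a PID to the queue
function queue {
	QUEUE="$QUEUE $1"
	NUM=$((NUM + 1))
}

# rebuild the queue, keeping only PIDs that are still running
function regeneratequeue {
	OLDREQUEUE=$QUEUE
	QUEUE=""
	NUM=0
	for PID in $OLDREQUEUE; do
		if [ -d /proc/$PID  ] ; then
			QUEUE="$QUEUE $PID"
			NUM=$((NUM + 1))
		fi
	done
}

# regenerate the queue only if some PID in it has finished
function checkqueue {
	OLDCHQUEUE=$QUEUE
	for PID in $OLDCHQUEUE; do
		if [ ! -d /proc/$PID ] ; then
			regeneratequeue # at least one PID has finished
			break
		fi
	done
}

# parse command line
if [ $# -eq 0 ]; then #  must be at least one arg
	echo "$USAGE" >&2
	exit 1
fi

while getopts j:rh OPT; do # "j:" expects an argument, "r" and "h" don't
    case $OPT in
	h)	echo "$USAGE"
		exit 0 ;;
	j)	MAX_NPROC=$OPTARG ;;
	r)	REPLACE_CMD=1 ;;
	\?)	# getopts issues an error message
		echo "$USAGE" >&2
		exit 1 ;;
    esac
done

# Main program
echo "Using $MAX_NPROC parallel threads"
shift $((OPTIND - 1)) # shift input args, ignore processed args
COMMAND=$1
shift

for INS in "$@"; do # for the rest of the arguments
	# DEFINE COMMAND
	if [ $REPLACE_CMD -eq 1 ]; then
		CMD=${COMMAND//\*/$INS} # replace asterisks with the argument
	else
		CMD="$COMMAND $INS" #append args
	fi
	echo "Running $CMD"

	$CMD &
	# DEFINE COMMAND END

	PID=$!
	queue $PID

	while [ $NUM -ge $MAX_NPROC ]; do
		checkqueue
		sleep 0.4
	done
done
wait # wait for all processes to finish before exit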
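
For comparison, Bash 4.3 and later provide wait -n, which returns as soon as any one background job finishes. A job limit without polling /proc could be sketched as below. This is a different technique from the script above, shown only as an illustration under that version assumption; run_limited, MAX_JOBS and the sleep jobs are hypothetical names and placeholders.

```shell
#!/bin/bash
# Sketch: keep at most MAX_JOBS background jobs running, using wait -n
# (assumes Bash 4.3 or later) instead of polling /proc.
MAX_JOBS=2   # hypothetical limit

run_limited() {
    # Usage: run_limited command arg1 arg2 ...
    local cmd=$1
    shift
    local arg running=0
    for arg in "$@"; do
        if [ "$running" -ge "$MAX_JOBS" ]; then
            wait -n                  # block until any one job exits
            running=$((running - 1))
        fi
        "$cmd" "$arg" &              # start the next job in the background
        running=$((running + 1))
    done
    wait                             # wait for the remaining jobs
}

run_limited sleep 0.3 0.1 0.2
```

Unlike the queue functions above, this variant passes each argument as a single word, so it does not support option strings or asterisk replacement in the command.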

25 Responses to “A script for running processes in parallel in Bash”

  1. Paul Says:

    You might want to change

    $CMD &

    to

    eval "$CMD &"

    if you want to do things like:
    par.sh 'tr -d " " * > $(basename * .txt)-stripped.txt' *.txt

    Without the eval it’ll treat > and $(basename…) as arguments to tr.

  2. Leon Roy Says:

    Great script. Curiously when I use it to batch compress a folder of .wav files to .mp3 it doesn’t always take the same amount of time, sometimes finishing around 1m20s, sometimes 1m40s.

  3. kawakamasu Says:

    > Paul
    Good point, never thought of that.

    > Leon Roy
    Hmm, maybe you have some other processes running that occasionally steal your CPU time. There’s a command-line utility called htop that you can use to monitor what your CPUs are actually doing…

  4. Joe Says:

    Thank you, great script!
    Using it to spawn some server instances.
    Any idea how to keep track (in Bash) of the spawned
    processes and kill them after N seconds?


  5. kawakamasu Says:

    I don’t know, I never tried that. But since you have the PIDs, you could poll for the run times in the checkqueue routine and terminate processes if necessary. I suppose that there is a way for getting run times in Bash.

  6. Ole Tange Says:

    GNU Parallel http://www.gnu.org/software/parallel makes it possible to distribute the jobs to computers you have ssh access to.

    Watch the intro video http://www.youtube.com/watch?v=OpaiGYxkSuQ

  7. EmiNarcissus Says:

    This is very impressive. I was trying to line up a couple of tasks uploading files to hosting sites, but a single upload could not reach the network limit. So I would like to try this script anyway. Thx a lot~

  8. Angelo Says:

    Hmmm, I wish there was a way to not have to wait for all the jobs to finish… I want to be able to start a new job as soon as one finishes.

  9. Stefano Coniglio Says:

    Very, very much useful.

    Since I’m using a set of fairly heterogeneous commands, I slightly modified the script so that it takes as input a text file with a set of commands (one per line) to be executed –said file is compiled a priori via a dedicated script.

    A general-purpose modification: I suggest replacing
    $CMD &
    with
    eval $CMD &

    Without eval, any output redirection options that are present in the command being executed (e.g., echo "foo" > goo.text) are not correctly interpreted.

  10. Backtogeek's Journey » Optimizing Shell Scripts Says:

    […] From my testing it seems to produce varying results, you can use the techniques outlined here: https://pebblesinthesand.wordpress.com/2008/05/22/a-srcipt-for-running-processes-in-parallel-in-bash/ to see if your scripts can benefit from parallel […]

  11. Quick Tip: throttling batch processes in Terminal - Brett Terpstra Says:

    […] processes, and I know that someone out there must have long beat me to the solution. There it was: parallel. It’s a script you can download and make executable in your path, and then run it with a few […]

  12. GNU Parallell make best use of your multicore computer « zandyware Says:

    […] parallel comes to rescue. I first ran into this post https://pebblesinthesand.wordpress.com/2008/05/22/a-srcipt-for-running-processes-in-parallel-in-bash/. But I quickly found GNU parallel was a better choice. Fully loaded with detailed documentation. […]

  13. Jonathan H. Wage » Archive » A cool script for running PHPUnit tests in parallel processes Says:

    […] we’re exploring options for getting better build times. We came across this script found here for executing commands and running the processes in […]

  14. photo sharing script Says:

    Really helpful script for supporting multiprocessing systems.

  15. nemesis Says:

    Thank you for this nice piece of code!

    I know this thread is quite old but two questions come to my mind if somebody could kindly answer:

    1) Is copying $QUEUE to $OLDCHQUEUE really needed in line 36?

    2) Even more, is the function “checkqueue” worth it at all? Couldn’t we just call “regeneratequeue”? I see we save a couple of assignments by not going directly to “regeneratequeue”, but I don’t think that this represents a substantial gain. Is there any extra reason to justify having the “checkqueue” function, or is it just a matter of clear design so as to only regenerate the queue when it’s really necessary?

    Thank you!

  16. marrotte Says:

    If you want to keep your process pool full, consider doing something like this inside the for loop: while [ "$(pgrep -c command)" -gt "$MAX_PROCESS" ]; do sleep "$SLEEP_TIME"; done. Otherwise, you’ll be waiting on your slowest process to complete before spawning off NPROC threads again.

  17. Knigge Says:

    did someone notice that you misspelled “script” in the heading? ;)

    thanks for the nice counting script anyway, cheers!

  18. Pause all but x CPU-intensive tasks | CL-UAT Says:

    […] a bash script that looks as if it does something close to what you want to do — it starts up a number of […]

  19. Pause all but x CPU-intensive tasks | XL-UAT Says:

    […] a bash script that looks as if it does something close to what you want to do — it starts up a number of […]

  20. Bash脚本实现批量作业并行化 – 点滴记录 Says:

    […] A srcipt for running processes in parallel in Bash […]

  21. Joe Says:

    This is a great article! Tried a lot of different ways to throttle my parallel executions. I ended up using wait. Helped me immensely!


    + Joe
