Parallel computing is a broad topic and this article will focus on how Linux can be used to implement a parallel application. We will look at two models of parallel programming: message passing and shared memory constructs.
Consider the following scenario. I have two programs, A and B. Program A writes lines of text to stdout, while program B processes lines from stdin. The way to use these two programs is of course:

foo@bar:$ A | B

Now I've noticed that this eats up only one core, so I am wondering: are programs A and B sharing the same computational resources? If so, is there a way to run A and B concurrently? Another thing I've noticed is that A runs much, much faster than B, so I am wondering if I could somehow run more B programs and let them process the lines that A outputs in parallel. That is, A would output its lines, and there would be N instances of program B that would read these lines (whoever reads them first), process them, and print them on stdout.
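To make the scenario reproducible, here is a hedged pair of stand-ins (the names A and B mirror the question; the bodies are invented for illustration): A produces lines as fast as it can, while B sleeps briefly per line, so the pipeline crawls at B's pace.

# Hypothetical stand-ins for the question's A and B:
A() { seq 1 100000; }                                                    # fast producer
B() { while IFS= read -r l; do sleep 0.001; printf '%s\n' "$l"; done; }  # slow per-line consumer
A | B    # both run at once, but B is the bottleneck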
So my final question is: is there a way to pipe the output of A among several B processes without having to take care of race conditions and other inconsistencies that could potentially arise?

A problem with split --filter is that the output can be mixed up, so you get half a line from process 1 followed by half a line from process 2. GNU Parallel guarantees there will be no mixup. So assume you want to do:

A | B | C

but B is terribly slow, and thus you want to parallelize it. Then you can do:

A | parallel --pipe B | C

GNU Parallel by default splits on \n with a block size of 1 MB. This can be adjusted with --recend and --block.
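As a toy illustration of --pipe (seq stands in for A, wc -l for B, and the awk sum is added only to check the total; none of these names are from the original exchange):

seq 1000000 | parallel --pipe wc -l | awk '{s+=$1} END {print s}'

Each wc -l instance counts one roughly 1 MB block of lines, and the awk step adds the per-block counts back up to 1000000.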
You can find more about GNU Parallel at https://www.gnu.org/software/parallel/. You can install GNU Parallel in just 10 seconds with:

wget -O - pi.dk/3 | sh

Watch the intro video for a quick introduction.

When you write A | B, both processes already run in parallel. If you see them using only one core, that's probably either because of CPU affinity settings (perhaps some tool spawned a process with a restricted affinity) or because one process isn't enough to saturate a whole core and the system 'prefers' not to spread the computation out. To run several B's with one A, you need a tool such as split with the --filter option:

A | split OPTIONS --filter='B'

This, however, is liable to mess up the order of lines in the output, because the B jobs won't all be running at the same speed. If this is an issue, you might need to redirect the i-th B's output to an intermediate file and stitch them together at the end using cat.
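A hedged sketch of that intermediate-file variant, assuming B reads stdin and writes stdout: split exports the name it would have given each chunk to the filter as $FILE, so each B can be redirected there (the part. prefix is made up for this example).

A | split -n r/4 -u --filter='B > "$FILE"' - part.
cat part.aa part.ab part.ac part.ad    # stitch the four outputs back together

Note that this restores the output grouped by worker, not the original line order, exactly as cautioned above.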
This, in turn, may require considerable disk space. Other options exist with varying levels of efficiency: for example, you could limit each instance of B to a single, line-buffered output, wait until a whole 'round' of B's has finished, run the equivalent of a reduce step to split's map, and cat the temporary outputs together. The 'round' option just described will wait for the slowest instance of B in each round to finish, so it will be greatly dependent on the available buffering for B; mbuffer might help, or it might not, depending on what the operations are.
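A hedged sketch of that round scheme, assuming B reads one line on stdin and writes its result to stdout (A and B are the question's placeholder names; N and the tmp.* file names are invented for illustration):

# Feed N parallel B's per round, then stitch each round's outputs in order.
N=4
A | while :; do
    started=0
    for i in $(seq 1 "$N"); do
        IFS= read -r line || break           # stop launching when A's output ends
        printf '%s\n' "$line" | B > "tmp.$i" &
        started=$i
    done
    [ "$started" -eq 0 ] && break            # no input left at all
    wait                                     # a round ends when its slowest B finishes
    for i in $(seq 1 "$started"); do
        cat "tmp.$i" && rm -f "tmp.$i"       # reduce step: concatenate in order
    done
    [ "$started" -lt "$N" ] && break         # input was exhausted mid-round
done

This preserves the original line order, at the cost of idling N-1 workers while the slowest B in each round finishes.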
Examples

Generate the first 1000 numbers and count the lines in parallel:

seq 1 1000 | split -n r/10 -u --filter='wc -l'
100
100
100
100
100
100
100
100
100
100

If we were to 'mark' the lines, we'd see that the round-robin split sends lines 1, 11, 21, and so on to process #1; lines 5, 15, 25, and so on to process #5; and so forth.
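To actually see that distribution, a hedged variant of the same command tags each line with $FILE, the chunk name split exports to the filter (the sed tag is an addition, and the interleaving may vary from run to run because the filters execute concurrently):

seq 1 10 | split -n r/5 -u --filter='sed "s/^/$FILE: /"'
xaa: 1
xab: 2
xac: 3
xad: 4
xae: 5
xaa: 6
xab: 7
xac: 8
xad: 9
xae: 10

Lines 1 and 6 both carry the xaa tag, confirming they went to the same worker.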