Execution of external programs
Julia borrows backtick notation for commands from the shell, Perl, and Ruby. However, in Julia, writing
julia> `echo hello`
`echo hello`
differs in several respects from the behavior in various shells, Perl, or Ruby:
- Instead of immediately running the command, backticks create a `Cmd` object to represent the command. You can use this object to connect the command to others via pipes, run it with `run`, and read from or write to it with `read` or `write`.
- When the command is run, Julia does not capture its output unless you specifically arrange for it to. By default, the output of the command goes to `stdout`, as it would when using libc's `system` call.
- The command is never run with a shell. Instead, Julia parses the command syntax directly, interpolating variables appropriately and splitting on words as the shell would, respecting shell quoting syntax. The command is run as an immediate child process of `julia`, using `fork` and `exec` calls.
The following examples assume a POSIX environment, as on Linux or macOS. On Windows, many similar commands, such as …
The following is a simple example of executing an external command.
julia> mycommand = `echo hello`
`echo hello`
julia> typeof(mycommand)
Cmd
julia> run(mycommand);
hello
The `hello` is the output of the `echo` command, sent to `stdout`. If the external command fails to run successfully, the `run` method throws a `ProcessFailedException`.
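The failure behavior described above can be sketched as follows. This is a minimal illustration that assumes a POSIX system where the `false` utility exists and exits with a nonzero status; the variable names are only for illustration.

```julia
# Assumption: `false` is available on PATH and exits with a nonzero status.
failing = `false`

# run throws ProcessFailedException when the exit status is nonzero.
threw = try
    run(failing)
    false
catch err
    err isa ProcessFailedException
end

# success reports the outcome as a Bool instead of throwing.
ok = success(failing)

# ignorestatus marks the command so that run returns the process
# object even though the exit status is nonzero.
proc = run(ignorestatus(failing))
code = proc.exitcode
```

Which form to use depends on whether a nonzero exit status is an error for the caller or simply a result to inspect.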
julia> read(`echo hello`, String)
"hello\n"
julia> readchomp(`echo hello`)
"hello"
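As the calls above show, `read` and `readchomp` capture the output of a command instead of sending it to `stdout`. More generally, you can use `open` to read from or write to an external command.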
julia> open(`less`, "w", stdout) do io
           for i = 1:3
               println(io, i)
           end
       end
1
2
3
The program name and individual arguments in the command can be accessed and iterated as if the command were an array of strings.
julia> collect(`echo "foo bar"`)
2-element Vector{String}:
"echo"
"foo bar"
julia> `echo "foo bar"`[2]
"foo bar"
Interpolation
Suppose you want to do something a bit more complicated and use the name of a file stored in the variable `file` as an argument to a command. You can use `$` for interpolation much as you would in a string literal (see the Strings section):
julia> file = "/etc/passwd"
"/etc/passwd"
julia> `sort $file`
`sort /etc/passwd`
When running external programs via a shell, a common problem is that a file name containing characters that are special to the shell can cause undesirable behavior. Suppose, for example, that rather than `/etc/passwd`, we want to sort the contents of the file `/Volumes/External HD/data.csv`. Let's try it:
julia> file = "/Volumes/External HD/data.csv"
"/Volumes/External HD/data.csv"
julia> `sort $file`
`sort '/Volumes/External HD/data.csv'`
How did the file name get quoted? Julia knows that `file` is meant to be interpolated as a single argument, so it quotes the word for you. Actually, that is not quite accurate: the value of `file` is never interpreted by a shell, so no actual quoting is needed. The quotes are inserted only for presentation to the user. This works even if you interpolate a value as part of a shell word:
julia> path = "/Volumes/External HD"
"/Volumes/External HD"
julia> name = "data"
"data"
julia> ext = "csv"
"csv"
julia> `sort $path/$name.$ext`
`sort '/Volumes/External HD/data.csv'`
As you can see, the space in the `path` variable is appropriately escaped. But what if you want to interpolate multiple words? In that case, just use an array (or any other iterable container):
julia> files = ["/etc/passwd","/Volumes/External HD/data.csv"]
2-element Vector{String}:
"/etc/passwd"
"/Volumes/External HD/data.csv"
julia> `grep foo $files`
`grep foo /etc/passwd '/Volumes/External HD/data.csv'`
If you interpolate an array as part of a shell word, Julia emulates the shell's `{a,b,c}` argument generation:
julia> names = ["foo","bar","baz"]
3-element Vector{String}:
"foo"
"bar"
"baz"
julia> `grep xylophone $names.txt`
`grep xylophone foo.txt bar.txt baz.txt`
Moreover, when interpolating multiple arrays into a single word, the shell’s behavior of forming a Cartesian product is emulated.
julia> names = ["foo","bar","baz"]
3-element Vector{String}:
"foo"
"bar"
"baz"
julia> exts = ["aux","log"]
2-element Vector{String}:
"aux"
"log"
julia> `rm -f $names.$exts`
`rm -f foo.aux foo.log bar.aux bar.log baz.aux baz.log`
Since you can interpolate literal arrays, this generative functionality can be used without creating temporary array objects.
julia> `rm -rf $["foo","bar","baz","qux"].$["aux","log","pdf"]`
`rm -rf foo.aux foo.log foo.pdf bar.aux bar.log bar.pdf baz.aux baz.log baz.pdf qux.aux qux.log qux.pdf`
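One way to confirm how interpolation splits arguments, without running anything, is to collect the words of each command. The following sketch reuses the values from the examples above; the variable names `single`, `spliced`, and `product` are only for illustration.

```julia
file = "/Volumes/External HD/data.csv"
names = ["foo", "bar", "baz"]
exts = ["aux", "log"]

# A string containing a space is still one argument; the quotes shown
# at the REPL are for display only.
single = collect(`sort $file`)

# An array interpolated on its own contributes one argument per element.
spliced = collect(`grep foo $names`)

# Two arrays in one word expand to the Cartesian product, with the
# first array varying slowest.
product = collect(`rm -f $names.$exts`)
```

Because none of these commands is passed to `run`, the sketch is safe to evaluate even for destructive commands such as `rm`.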
Quoting
Inevitably, one wants to write commands that are not quite so simple, and quoting becomes necessary. Here is a simple example of a Perl one-liner at a shell prompt:
sh$ perl -le '$|=1; for (0..3) { print }'
0
1
2
3
The Perl expression needs to be in single quotes for two reasons: so that spaces do not break the expression into multiple shell words, and so that uses of Perl variables like `$|` (yes, that is the name of a variable in Perl) do not cause interpolation. In other cases, you may want to use double quotes so that interpolation does occur:
sh$ first="A"
sh$ second="B"
sh$ perl -le '$|=1; print for @ARGV' "1: $first" "2: $second"
1: A
2: B
In general, Julia's backtick syntax is carefully designed so that you can cut and paste shell commands as they are into backticks and they will work: the escaping, quoting, and interpolation behaviors are the same as the shell's. The only difference is that interpolation is integrated and aware of Julia's notion of what is a single string value and what is a container for multiple values. Let's try the two examples above in Julia:
julia> A = `perl -le '$|=1; for (0..3) { print }'`
`perl -le '$|=1; for (0..3) { print }'`
julia> run(A);
0
1
2
3
julia> first = "A"; second = "B";
julia> B = `perl -le 'print for @ARGV' "1: $first" "2: $second"`
`perl -le 'print for @ARGV' '1: A' '2: B'`
julia> run(B);
1: A
2: B
The results are identical, and Julia's interpolation behavior mimics the shell's, with some improvements because Julia supports first-class iterable objects, whereas most shells use strings split on spaces for this, which introduces ambiguities. When porting shell commands to Julia, try cutting and pasting first. Since Julia shows commands to you before running them, you can safely examine how it interprets them without doing any damage.
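The quoting rules can be checked in the same non-destructive way, by collecting the words of a command rather than running it. In this small sketch the variable name `label` is hypothetical.

```julia
label = "A"

# Single quotes keep the dollar sign literal; double quotes interpolate
# the variable and keep the result, spaces included, as one word.
words = collect(`echo '$label' "1: $label"`)
```

Here `words` holds three entries: the program name, the literal text `$label`, and the interpolated string `1: A`.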
Pipelines
Shell metacharacters, such as `|`, `&`, and `>`, need to be quoted (or escaped) inside Julia's backticks:
julia> run(`echo hello '|' sort`);
hello | sort
julia> run(`echo hello \| sort`);
hello | sort
This expression invokes the `echo` command with three words as arguments: `hello`, `|`, and `sort`. The result is that a single line is printed: `hello | sort`. How, then, does one construct a pipeline? Instead of using `'|'` inside backticks, one uses `pipeline`:
julia> run(pipeline(`echo hello`, `sort`));
hello
This pipes the output of the `echo` command to the `sort` command. Of course, this is not terribly interesting since there is only one line to sort, but we can certainly do much more interesting things:
julia> run(pipeline(`cut -d: -f3 /etc/passwd`, `sort -n`, `tail -n5`))
210
211
212
213
214
This prints the five highest user IDs on a UNIX system. The `cut`, `sort`, and `tail` commands are all spawned as immediate children of the current `julia` process, with no intervening shell process. Julia itself does the work of setting up pipes and connecting file descriptors that is normally done by the shell. Since Julia does this itself, it retains better control and can do some things that shells cannot.
Julia can execute multiple commands in parallel.
julia> run(`echo hello` & `echo world`);
world
hello
The order of the output here is non-deterministic because the two `echo` processes are started nearly simultaneously and race to make the first write to the `stdout` descriptor they share with each other and with the `julia` parent process. Julia also lets you pipe the output from both of these processes to another program:
julia> run(pipeline(`echo world` & `echo hello`, `sort`));
hello
world
In terms of UNIX plumbing, what happens here is that a single UNIX pipe object is created and written to by both `echo` processes, and the other end of the pipe is read from by the `sort` command.
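To use the result of a pipeline in a program rather than print it, the pipeline can be passed to `read`. The sketch below assumes a POSIX system with `echo` and `sort` available; the variable names are only for illustration.

```julia
# Capture what the last stage of the pipeline writes, instead of
# letting it go to stdout.
sorted_one = read(pipeline(`echo hello`, `sort`), String)

# Both echo processes write to the same pipe; sort orders the lines
# regardless of which process wrote first.
sorted_two = read(pipeline(`echo world` & `echo hello`, `sort`), String)
```

Because `sort` reads all of its input before writing, the second result does not depend on which `echo` process wins the race.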
I/O redirection can be accomplished by passing the keyword arguments `stdin`, `stdout`, and `stderr` to the `pipeline` function:
pipeline(`do_work`, stdout=pipeline(`sort`, "out.txt"), stderr="errs.txt")
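Since `do_work` is a placeholder, here is a runnable sketch of the same redirection pattern. It assumes a POSIX `sh` and `sort`, uses a small `sh` script as a stand-in for `do_work`, and writes into a temporary directory that is removed afterwards.

```julia
captured = mktempdir() do dir
    out = joinpath(dir, "out.txt")
    errs = joinpath(dir, "errs.txt")

    # Stand-in for do_work: one line on stdout and one on stderr.
    script = `sh -c 'echo result; echo warning >&2'`

    # stdout passes through sort into out.txt; stderr is written to errs.txt.
    run(pipeline(script, stdout=pipeline(`sort`, out), stderr=errs))

    # Read both files back before the temporary directory is deleted.
    (read(out, String), read(errs, String))
end
```

The tuple `captured` then holds the contents of the two files.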
Preventing deadlocks in pipelines
When reading from and writing to both ends of a pipeline from a single process, it is important to avoid forcing the kernel to buffer all of the data.
For example, when reading all of the output from a command, call `read(out, String)`, not `wait(process)`, since the former will actively consume all of the data written by the process, whereas the latter will attempt to store the data in the kernel's buffers while waiting for a reader to be connected.
Another common solution is to separate the reader and writer of the pipeline into separate tasks (`Task`):
writer = @async write(process, "data")
reader = @async do_compute(read(process, String))
wait(writer)
fetch(reader)
(Commonly, the reader is not a separate task, since we immediately `fetch` it anyway.)
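The snippet above leaves `process` and `do_compute` undefined. A self-contained sketch of the same idea, assuming a POSIX `cat` as the child process and an input larger than a typical kernel pipe buffer, might look like this:

```julia
# About 1 MB of input (assumption: this exceeds the kernel pipe buffer).
data = "x"^1_000_000

# Open cat with both its stdin and stdout connected to this process.
process = open(`cat`, "r+")

# Writer task: send the input, then close cat's stdin so that it sees
# end-of-file and exits.
writer = @async begin
    write(process, data)
    close(process.in)
end

# Reading in the main task drains cat's output while the writer task
# is still sending input, so neither side stalls on a full pipe.
echoed = read(process, String)
wait(writer)
wait(process)
```

Here the reader runs in the main task, in line with the parenthetical remark above.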
A complex example
The combination of a high-level programming language, a first-class command abstraction, and automatic setup of pipes between processes is a powerful one. To give some sense of the complex pipelines that can be created easily, here are some more sophisticated examples, with apologies for the excessive use of Perl one-liners:
julia> prefixer(prefix, sleep) = `perl -nle '$|=1; print "'$prefix' ", $_; sleep '$sleep';'`;
julia> run(pipeline(`perl -le '$|=1; for(0..5){ print; sleep 1 }'`, prefixer("A",2) & prefixer("B",2)));
B 0
A 1
B 2
A 3
B 4
A 5
This is a classic example of a single producer feeding two concurrent consumers: one `perl` process generates lines with the numbers 0 through 5 on them, while two parallel processes consume that output, one prefixing lines with the letter "A", the other with the letter "B". Which consumer gets the first line is non-deterministic, but once that race has been won, the lines are consumed alternately by one process and then the other. (Setting `$|=1` in Perl causes each print statement to flush the `stdout` handle, which is necessary for this example to work. Otherwise all the output is buffered and printed to the pipe at once, to be read by just one consumer process.)
Here is an even more complex multi-stage example of "producer-consumer".
julia> run(pipeline(`perl -le '$|=1; for(0..5){ print; sleep 1 }'`,
                    prefixer("X",3) & prefixer("Y",3) & prefixer("Z",3),
                    prefixer("A",2) & prefixer("B",2)));
A X 0
B Y 1
A Z 2
B X 3
A Y 4
B Z 5
This example is similar to the previous one, except that there are two stages of consumers, and the stages have different latency, so they use a different number of parallel workers to maintain saturated throughput.
Be sure to try out all these examples to see how they work.
Cmd objects
The backtick syntax creates an object of type `Cmd`. Such an object may also be constructed directly from an existing `Cmd` or a list of arguments:
run(Cmd(`pwd`, dir=".."))
run(Cmd(["pwd"], detach=true, ignorestatus=true))
This allows you to specify several aspects of the `Cmd`'s execution environment via keyword arguments. For example, the `dir` keyword controls the `Cmd`'s working directory:
julia> run(Cmd(`pwd`, dir="/"));
/
And the `env` keyword allows you to set execution environment variables:
julia> run(Cmd(`sh -c "echo foo \$HOWLONG"`, env=("HOWLONG" => "ever!",)));
foo ever!
See `Cmd` for additional keyword arguments. The `setenv` and `addenv` functions replace or add to the `Cmd` execution environment variables, respectively:
julia> run(setenv(`sh -c "echo foo \$HOWLONG"`, ("HOWLONG" => "ever!",)));
foo ever!
julia> run(addenv(`sh -c "echo foo \$HOWLONG"`, "HOWLONG" => "ever!"));
foo ever!
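These options can also be combined with `readchomp` to check their effect programmatically. The sketch below assumes a POSIX `sh` and `pwd`; the variable names are only for illustration.

```julia
cmd = `sh -c "echo foo \$HOWLONG"`

# addenv returns a new Cmd with one variable layered on top of the
# inherited environment; the original cmd is left as it was.
with_env = readchomp(addenv(cmd, "HOWLONG" => "ever!"))

# dir changes only the child's working directory, not the working
# directory of the Julia process itself.
before = pwd()
child_dir = readchomp(Cmd(`pwd`, dir="/"))
unchanged = pwd() == before
```

Since each call returns a new `Cmd`, a base command can be defined once and specialized at each call site.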