At the end of this self-learning lab, you should be able to:
- Understand the Unix Standard I/O
- Use common text-processing commands
- Understand process exit codes and use && and || for chaining in Bash
- Understand and use environment variables
Processes often produce a large amount of output. To extract useful information from it, we use text-processing commands in the terminal.
Stdin, stdout and stderr
Every process has one input stream called "stdin" and two output streams called "stdout" and "stderr". When a process is run directly from the command line, the shell displays both stdout and stderr together in the terminal.
Pipe from/to file
You can pipe the stdout or stderr of a command to a file.
The following syntax writes the stdout of cmd to the file at path file.txt.
The shell will only display the stderr output,
and the stdout output is written to file.txt:
$ cmd >file.txt
Alternatively, write the stderr of a command to error.txt:
$ cmd 2>error.txt
What does 2 mean here?
It is a standard that 0, 1 and 2 refer to the stdin, stdout and stderr of any process.
2>error.txt means to write the stream 2 (the stderr stream) to error.txt.
In Linux, there is a special file called /dev/null.
It is a black hole where the file always remains empty
no matter how much data you write into it;
all file writes are accepted successfully but silently discarded.
To indicate that we want to "ignore" some output, we can pipe it to /dev/null.
Similarly, you can pipe a file into the stdin of a command using the < syntax:
$ cmd <file.txt
Try it yourself
To begin with, let's create a command that outputs both stdout and stderr.
Run nano count.sh, which opens an interactive CLI text editor for the file count.sh.
Then copy the following:
#!/bin/bash
i=1
while [ $i -le 3 ]; do
    echo This line writes $i to stdout
    echo This line writes $i to stderr >&2
    ((i++))
done
Press ^X (that is, Ctrl+X), then Y, then Enter to confirm saving the file.
You can see hints for these keybindings at the bottom of the nano screen:
^G Get Help    ^O Write Out   ^W Where Is    ^K Cut Text    ^J Justify
^X Exit        ^R Read File   ^\ Replace     ^U Paste Text  ^T To Spell
Let's try making the file executable and running it:
$ chmod +x count.sh
$ ./count.sh
This line writes 1 to stdout
This line writes 1 to stderr
This line writes 2 to stdout
This line writes 2 to stderr
This line writes 3 to stdout
This line writes 3 to stderr
Let's verify whether the claims about writing to stdout/stderr are true:
$ ./count.sh >/dev/null
This line writes 1 to stderr
This line writes 2 to stderr
This line writes 3 to stderr
$ ./count.sh 2>/dev/null
This line writes 1 to stdout
This line writes 2 to stdout
This line writes 3 to stdout
Let's write the stderr to another file called error.txt:
$ ./count.sh 2>error.txt
This line writes 1 to stdout
This line writes 2 to stdout
This line writes 3 to stdout
$ cat error.txt
This line writes 1 to stderr
This line writes 2 to stderr
This line writes 3 to stderr
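Both redirections can also be combined on a single command. The following is a minimal sketch rather than part of the original exercise: emit is a hypothetical function standing in for count.sh, and the files live in a scratch directory created with mktemp so that nothing else is overwritten.

```shell
# Hypothetical stand-in for count.sh: one line to stdout, one to stderr.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Scratch directory for the output files.
workdir=$(mktemp -d)

# Send stdout and stderr to two different files in one go.
emit >"$workdir/out.txt" 2>"$workdir/err.txt"

# Feed each file back into the stdin of cat with <.
out_content=$(cat <"$workdir/out.txt")
err_content=$(cat <"$workdir/err.txt")

echo "out.txt contains: $out_content"
echo "err.txt contains: $err_content"
```

Nothing from emit reaches the terminal here, because both of its output streams were redirected to files.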
Pipe to process
We can pipe the stdout of one process to the stdin of another process using the | syntax:
$ command1 | command2
This will result in the following data flow:
+-------------+                         +--------------+
| shell input |  +--------------------->| shell output |
+-------------+  |                      +--------------+
      |          |                         ^         ^
      |    stderr|                   stderr|         |
      |     +----+-----+              +----+-----+   |
      +---->| command1 |------------->| command2 |---+
            +----------+              +----------+
      stdin             stdout   stdin            stdout
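The data flow above can be checked with a small experiment. This is only a sketch: produce is a hypothetical function that plays the role of command1, and wc -l plays the role of command2.

```shell
# Hypothetical producer: two lines to stdout, one line to stderr.
produce() {
    echo "first"
    echo "second"
    echo "warning" >&2
}

# wc -l only counts what arrives through the pipe, which is stdout.
# 2>/dev/null hides the stderr line that would otherwise go straight
# to the terminal without passing through the pipe.
piped_lines=$(produce 2>/dev/null | wc -l)

echo "lines that went through the pipe: $piped_lines"
```

The count is 2, not 3, since the stderr line never enters the pipe.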
The tee command, as its name suggests, pipes data in a T shape.
This is what the command tee file.txt does:
+-------+     +-----+     +--------+
| stdin |---->| tee |---->| stdout |
+-------+     +-----+     +--------+
                 |
                 |       +----------+
                 +------>| file.txt |
                         +----------+
In modern versions of Bash, you can use the >(another cmd) syntax (called process substitution),
which is resolved into a file path;
data written to that path goes into the stdin of another cmd.
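Going back to the plain tee file.txt form, the T-shaped flow can be observed with a short pipeline. This is a sketch under assumptions: the three input words are made up, and the file is written to a scratch directory from mktemp.

```shell
# Scratch directory for the copy written by tee.
workdir=$(mktemp -d)

# tee copies its stdin to file.txt AND to its own stdout,
# so the next command in the pipeline still receives the data.
line_count=$(printf 'alpha\nbeta\ngamma\n' | tee "$workdir/file.txt" | wc -l)

# The same three lines were also saved to the file.
saved=$(cat "$workdir/file.txt")

echo "lines passed on to wc: $line_count"
echo "saved copy:"
echo "$saved"
```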
grep means Global Regular Expression Print.
... Is that confusing?
Maybe just memorize it as "
grep grabs occurrences of a search".
It filters the input and outputs only the lines that match the search
(or only the lines that don't, if you provide the -v flag).
You can use
grep to find all occurrences of a word:
$ grep robot /usr/share/dict/words
robot
robot's
robotic
robotics
robotics's
robots
/usr/share/dict/words is a file where each line contains a valid English word.
See man 5 american-english for more information.
grep treats the first argument as a "pattern",
which is interpreted as a regular expression by default.
- If you want grep to treat your argument as-is (so that it does not treat symbols like . as special characters), use the -F flag.
- If you want grep to treat your argument as a PCRE ("Perl-compatible Regular Expression"), which is close to the regular expression flavour used by Python's re module, use the -P flag (GNU grep).
A few useful flags in
grep to pay attention to:
-v: invert selection
-i: case insensitive search
-r: search all files in a directory recursively
- Very useful for checking how other people are using a certain function when working on an unfamiliar project!
Remember to quote your regex pattern when you run
grep in a shell.
Single quotes ('') are a great choice to wrap your argument
since backslashes inside
'' do not get processed by the shell.
In Git repositories, you can also use
git grep instead of grep -r
to search all files tracked by Git.
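The flags above can be compared side by side on a tiny input. This is a sketch, not output from the lab machine: sample.txt and its four lines are made up for illustration, and -F is the standard grep flag for fixed-string matching.

```shell
# Made-up sample input in a scratch directory.
workdir=$(mktemp -d)
printf 'robot\nRobotics\na.b\naxb\n' >"$workdir/sample.txt"

# Plain search is case-sensitive, so only "robot" matches.
plain=$(grep robot "$workdir/sample.txt")

# -i ignores case, so "Robotics" matches too.
nocase=$(grep -i robot "$workdir/sample.txt")

# -v keeps the lines that do NOT match.
inverted=$(grep -v -i robot "$workdir/sample.txt")

# As a regex, '.' matches any character: both "a.b" and "axb" match.
regex=$(grep 'a.b' "$workdir/sample.txt")

# With -F the pattern is taken literally: only "a.b" matches.
literal=$(grep -F 'a.b' "$workdir/sample.txt")

echo "plain:    $plain"
echo "literal:  $literal"
```

The pattern is wrapped in single quotes throughout so that the shell passes it to grep unchanged.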
Other text-processing commands
These commands all have man pages. Read the man page to see their precise usage!
wc: count the number of lines, words and bytes in the input
cut: truncates each line
-c: select specific character columns
-f: split the line by a character (only one character!) and get a specific field
sort: sorts the input
uniq: removes adjacent duplicate lines (sort the input before using uniq)
-c: count number of duplicate lines
diff: compare two files
- You can also use colordiff to get deleted lines in red and added lines in green.
- You can also use
head / tail: take only the first/last lines of the input
Many commands accept a file as an argument to read from,
but if no file is provided, they use stdin as the source.
For example, head file.txt and
head <file.txt do the same thing.
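A few of these commands can be combined as follows. This is a sketch on made-up data: fruits.txt is a hypothetical file, and -d is the cut flag that chooses the single delimiter character used by -f.

```shell
# Made-up sample input in a scratch directory.
workdir=$(mktemp -d)
printf 'pear:3\napple:1\npear:3\napple:2\n' >"$workdir/fruits.txt"

# cut -d: -f1 splits each line on ':' and keeps the first field.
names=$(cut -d: -f1 "$workdir/fruits.txt")

# uniq only merges ADJACENT duplicates, so sort first.
unique_names=$(echo "$names" | sort | uniq)

# wc -l counts lines; reading from stdin avoids printing the file name.
total=$(wc -l <"$workdir/fruits.txt")

# head accepts the file as an argument or on stdin.
first_by_arg=$(head -n1 "$workdir/fruits.txt")
first_by_stdin=$(head -n1 <"$workdir/fruits.txt")

echo "unique names:"
echo "$unique_names"
echo "total lines: $total"
```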
Checkpoint: How do you read lines 5-8 from the file /usr/share/dict/words?
$ head -n8 </usr/share/dict/words | tail -n4
First take the first 8 lines, then keep the last 4 of those 8. This approach works as long as the file is at least 8 lines long.
If you want a reliable approach that works properly
even when the file has fewer than 8 lines,
have a look at commands such as
sed, which are more complex.
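As one possible illustration of the difference, the sketch below compares the head/tail pipeline with sed -n '5,8p', which prints only lines 5 to 8 of its input. The input is generated with seq as a stand-in for a real file; this is an assumption made for the example, not the checkpoint's official answer.

```shell
# seq 1 10 prints the numbers 1 to 10, one per line, as stand-in input.
with_head_tail=$(seq 1 10 | head -n8 | tail -n4)
with_sed=$(seq 1 10 | sed -n '5,8p')

# With only 6 lines of input the two approaches disagree.
# head/tail returns the last 4 of the 6 lines (3, 4, 5, 6).
short_head_tail=$(seq 1 6 | head -n8 | tail -n4)
# sed returns only the lines numbered 5 and 6.
short_sed=$(seq 1 6 | sed -n '5,8p')

echo "10 lines, head/tail: $(echo $with_head_tail)"
echo "10 lines, sed:       $(echo $with_sed)"
echo "6 lines, head/tail:  $(echo $short_head_tail)"
echo "6 lines, sed:        $(echo $short_sed)"
```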
Every command and process exits with an "exit code", which is a (usually small) integer.
A command that terminated successfully exits with the exit code 0.
A non-zero exit code indicates an error in the process.
Shell builtins and functions like
cd also have exit codes,
but these are emulated by the shell rather than produced by a separate process.
Consider this sentence in English:
Find a line with the word "needle" or exit.
What does this mean? This means you should exit if you can't find a line with the word "needle". That's equivalent to the following line:
$ grep needle <file || exit
In many programming languages,
|| means "or" and
&& means "and".
If the first command is false (exits with a non-zero code),
the command behind
|| is run.
Similarly, if the first command is
true (exits with 0),
the command behind
&& is run.
The command true always exits with 0,
and the command
false always exits with 1.
They are useful dummy commands to use when you want to coerce a certain exit code.
For example, some other command || true coerces the exit code to 0
regardless of whether some other command is successful or not.
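The chaining rules can be observed directly by inspecting $?, the shell variable that holds the exit code of the most recent command. A minimal sketch, using only true, false and echo:

```shell
# $? holds the exit code of the most recent command.
true
code_true=$?
false
code_false=$?

# && runs the right-hand side only if the left-hand side succeeded.
and_result=$(true && echo "ran")

# || runs the right-hand side only if the left-hand side failed.
or_result=$(false || echo "ran")

# Here the left-hand side failed, so echo is skipped and nothing is printed.
skipped=$(false && echo "ran")

# "|| true" coerces the overall exit code to 0 even though false failed.
false || true
coerced=$?

echo "true: $code_true, false: $code_false, coerced: $coerced"
```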
Environment variables (env var) are used to pass string data to child processes. Env vars set in a process will be inherited by the child process (unless otherwise specified). This allows optionally passing certain settings at a global level.
In a shell, environment variables can be added/updated by running the export command:
export ARG_NAME="arg value"
You can set variables local to the shell (not inherited by child processes)
if you omit the export keyword.
In this case, it is a shell variable rather than an environment variable.
In Bash, you can pass a variable as an argument to a command using the $ syntax:
ARG_NAME="arg value"
echo "$ARG_NAME"
Note that the variable must be wrapped with "" (double quotes);
otherwise the argument is expanded directly and spaces in the variable would lead to separate arguments.
You can also temporarily set env vars for a single command using the syntax
ARG_NAME="arg value" cmd line,
which runs the command
cmd line with the env var
ARG_NAME set to arg value.
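The three ways of setting a variable can be compared by asking a child process what it sees. This is a sketch: the LAB_* names are hypothetical, sh -c starts the child process, and ${NAME:-unset} is shell syntax that prints "unset" when the variable is missing.

```shell
# A plain assignment creates a shell variable: child processes do not see it.
LAB_LOCAL="local value"
child_sees_local=$(sh -c 'echo "${LAB_LOCAL:-unset}"')

# export turns it into an environment variable that children inherit.
export LAB_EXPORTED="exported value"
child_sees_exported=$(sh -c 'echo "${LAB_EXPORTED:-unset}"')

# An assignment in front of a command only applies to that one command.
child_sees_oneoff=$(LAB_ONEOFF="one-off value" sh -c 'echo "${LAB_ONEOFF:-unset}"')
# Back in the current shell, LAB_ONEOFF was never set.
after_oneoff="${LAB_ONEOFF:-unset}"

echo "child sees LAB_LOCAL:    $child_sees_local"
echo "child sees LAB_EXPORTED: $child_sees_exported"
echo "child sees LAB_ONEOFF:   $child_sees_oneoff"
echo "shell sees LAB_ONEOFF:   $after_oneoff"
```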
Try it yourself
Run the env command to see the environment variables in your current shell.
Pipe it into grep FOO_BAR to see that there is no env var called FOO_BAR.
Run FOO_BAR="qux" in the shell (do not use export).
Run env | grep FOO_BAR again.
You can see that
FOO_BAR is still not an env var for env.
Run export FOO_BAR="qux" and grep again.
There is a line FOO_BAR=qux now.
Now let's check the behaviour of quoted variables.
Set FOO="bar qux" and run mkdir $FOO.
Run ls to see that two directories, bar and
qux, are created.
rmdir them and try
mkdir "$FOO" instead.
Now there is just one directory called bar qux.
Typically, we put env vars we always want predefined in the ~/.bashrc file, which is run automatically when you start an interactive Bash shell.
Interpolating command output
The output of a command can be embedded in another command using the $(...) syntax.
As with environment variables, remember to wrap the syntax with
"" to encapsulate spaces.
Try it yourself
## This command shows your Ubuntu version
$ lsb_release -cs
focal

## Let's try passing it to echo
$ echo Ubuntu "$(lsb_release -cs)"
Ubuntu focal

## What if we used `<()` instead?
$ echo Ubuntu <(lsb_release -cs)
Ubuntu /dev/fd/63
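In the last command, <(lsb_release -cs) is replaced by a file path (/dev/fd/63) rather than by the output of the command, and echo simply prints that path.
We are passing the output of
lsb_release to somewhere
that does not actually read its contents,
so you may get a "Broken pipe" error.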
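The effect of the double quotes around $(...) can be seen by counting arguments. A minimal sketch, assuming a hypothetical helper count_args that prints how many arguments it received, with printf standing in for a command whose output contains a space:

```shell
# Hypothetical helper: prints the number of arguments it was given.
count_args() {
    echo "$#"
}

# Unquoted: the output "two words" is split on the space into two arguments.
unquoted=$(count_args $(printf 'two words'))

# Quoted: the whole output stays together as one argument.
quoted=$(count_args "$(printf 'two words')")

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

This mirrors the mkdir $FOO versus mkdir "$FOO" experiment: without quotes, a space in the output leads to separate arguments.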