Bash by Example: Piping
Connecting commands together using the pipe operator to create data processing pipelines, understanding how standard output flows to standard input, using xargs to convert stdin to command arguments, leveraging process substitution for advanced redirection, and building complex multi-stage text processing workflows.
Code
#!/bin/bash
# Basic Pipe: Output of 'ls' -> Input of 'grep'
echo "--- Finding .txt files ---"
# Escape the dot and anchor at end of line so only names ending in .txt match
ls -l | grep '\.txt$'
# Chaining multiple pipes
echo -e "\n--- Counting unique lines in .txt files ---"
# cat all txt files -> sort them -> drop duplicate lines -> count what remains
cat *.txt 2>/dev/null | sort | uniq | wc -l
# Piping with xargs
echo -e "\n--- Deleting .tmp files ---"
# find outputs filenames -> xargs passes them as arguments to rm
# (Dry run with echo)
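# -print0 / -0 separate names with NUL bytes, so spaces or newlines in filenames are safe
find . -name "*.tmp" -print0 | xargs -0 -I {} echo "Would delete: {}"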
# Process Substitution (Treating output as a file)
echo -e "\n--- Diffing Outputs ---"
# Compare the output of two commands without temp files
diff <(ls demo/a) <(ls demo/b) 2>/dev/null
Explanation
The pipe operator | is one of the most powerful features of the Unix shell and a direct expression of the Unix philosophy, often summarized as "write programs that do one thing well and work together." A pipe connects the standard output (stdout) of the command on its left to the standard input (stdin) of the command on its right, so data flows from one program to the next. Standard error (stderr) is not piped unless you redirect it, for example with 2>&1. This lets you build sophisticated data processing pipelines by chaining simple, specialized tools such as grep, sed, awk, sort, uniq, and wc.
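As a small illustration of chaining specialized tools, the sketch below counts how often each word occurs and reports the most frequent one. The word list is made-up sample data and the variable names are only for illustration.

```bash
#!/bin/bash
# Hypothetical sample input: one word per line
words=$(printf '%s\n' apple banana apple cherry banana apple)

# sort groups identical words -> uniq -c counts each group ->
# sort -rn puts the largest count first -> head keeps the top line ->
# awk reorders the fields to "word count"
top=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2, $1}')

echo "$top"   # prints: apple 3
```

Each stage does one small job, and no stage needs to know what comes before or after it.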
When you run a pipeline like ls | grep pattern | sort | head, the shell asks the kernel for anonymous pipes (in-memory FIFO buffers) and uses them to connect each command's output to the next command's input. No temporary files are created; data flows as a stream of bytes. The kernel handles flow control automatically: a writer blocks when the pipe buffer is full, and a reader blocks when it is empty. All commands in a pipeline run concurrently, so a downstream stage can start working before an upstream stage has finished, which can significantly improve performance for large data sets. Some tools, such as sort, still have to read all of their input before they can emit anything.
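The streaming behaviour described above can be observed with a producer that never finishes on its own. This is a minimal sketch, assuming the common yes, seq, and head utilities are installed; the variable names are illustrative.

```bash
#!/bin/bash
# 'yes' prints "y" forever. If the shell had to collect all of its output
# before starting 'head', this line would never return. Because both run
# at once, head reads three lines and exits, and yes stops on its next
# write to the closed pipe. 2>/dev/null hides any "broken pipe" message.
first_three=$(yes 2>/dev/null | head -n 3)
echo "$first_three"

# seq is asked for a million numbers, but the pipeline ends as soon as
# head has the two lines it wants.
first_two=$(seq 1 1000000 2>/dev/null | head -n 2)
echo "$first_two"
```

Both pipelines return almost immediately, because data is consumed as it is produced rather than being stored first.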
Advanced piping techniques include xargs, which converts standard input into command-line arguments for programs that take their operands as arguments rather than from stdin (like rm, cp, or chmod). Process substitution, written <(...), is a Bash feature (not part of POSIX sh) that makes a command's output available under a file name, which is invaluable for tools like diff that expect file arguments. For example, diff <(ls dir1) <(ls dir2) compares two directory listings without creating temporary files. Understanding pipes unlocks the full power of command-line composition.
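To make the xargs behaviour concrete, here is a short sketch that uses echo as a harmless stand-in for a command like rm. The inputs are invented, and the -0 option is assumed to be available (it is in GNU and BSD xargs).

```bash
#!/bin/bash
# With no options, xargs packs as many inputs as fit into one command line,
# so echo is invoked a single time with three arguments.
batched=$(printf '%s\n' one two three | xargs echo)
echo "$batched"      # prints: one two three

# -n 1 runs the command once per argument instead.
per_item=$(printf '%s\n' one two three | xargs -n 1 echo "item:")
echo "$per_item"

# NUL-delimited input (-0) keeps a name containing a space in one piece.
safe=$(printf '%s\0' "a b.tmp" "c.tmp" | xargs -0 -n 1 echo "Would delete:")
echo "$safe"
```

Swapping echo for the real command is the usual last step once the dry run prints what you expect.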
Code Breakdown
ls -l produces text output. The pipe sends that text to grep, which keeps only the lines that match the .txt pattern. The filtered output is displayed.
cat → sort → uniq → wc -l. sort places identical lines next to each other so that uniq can drop the duplicates, and the final command counts the unique lines that remain.
xargs converts find output (a list of filenames) into arguments for echo (or rm in real use). The -I {} flag uses {} as a placeholder and runs the command once per input item.
<(ls ...) expands to a path such as /dev/fd/63 that refers to a pipe carrying the command's output, which diff can open and read like a file. It avoids creating actual temp files on disk.
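The path that process substitution hands to a command can be inspected directly. The sketch below assumes Bash; the exact path varies by system (typically /dev/fd/N, or a named pipe where /dev/fd is unavailable), and the sample lines passed to diff are invented.

```bash
#!/bin/bash
# echo receives the substituted path as an ordinary argument and prints it.
fd_path=$(echo <(true))
echo "diff would be given a path like: $fd_path"

# Compare two generated streams as if they were files.
# diff exits with status 1 when the inputs differ, hence the '|| true'.
diff_out=$(diff <(printf 'a\nb\n') <(printf 'a\nc\n')) || true
echo "$diff_out"
```

The second command reports that line 2 changed from b to c, without either input ever being written to disk.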
