Command-Line Stream Editors

Last modified: May 06, 2025

This article is written in: 🇺🇸

Command-Line Stream Editors

sed (Stream Editor) and awk are powerful command-line utilities that originated from Unix and have become indispensable tools in Unix-like operating systems, including Linux and macOS. They are designed for processing and transforming text, allowing users to perform complex text manipulations with simple commands. This guide provides a comprehensive overview of both utilities, including their history, usage, syntax, options, and practical examples.

Sed

Developed in the 1970s by Lee E. McMahon of Bell Labs, sed is a non-interactive stream editor used to perform basic text transformations on an input stream (a file or input from a pipeline). It was designed to support scripting and command-line usage, automating repetitive editing tasks.

Main idea:

Performs editing operations automatically without user interaction.
Processes input line by line, making it efficient for large files.
Supports powerful pattern matching using regular expressions.
Allows the use of scripts for complex editing tasks.

Syntax

The basic syntax of sed is:

sed [OPTIONS] 'SCRIPT' INPUTFILE...

OPTIONS are used to modify the behavior of sed, allowing customization of how the command processes and interprets input.
The SCRIPT consists of one or more editing commands that sed applies to the input, defining specific transformations or modifications to be executed.
INPUTFILE refers to one or more files that sed processes, enabling batch or single-file editing based on the specified script commands.

How `sed` Works

User
 |
 | Uses 'sed' with a script and input file(s)
 v
+-------------------------------+
| sed Command                   |
|  - Reads Input Line by Line   |
|  - Applies Script to Each Line|
|  - Outputs Modified Lines     |
+-------------------------------+
 |
 | Outputs to Terminal or File
 v

sed reads the input text line by line, applies the specified commands to each line, and outputs the result. If no input file is specified, sed reads from standard input.

Common Operations with `sed`

Substitution

The substitution command replaces text matching a pattern with new text.

sed 's/pattern/replacement/flags' inputfile

The s command initiates a substitution, signaling sed to replace matched patterns within the input.
The pattern represents the regular expression or specific text that sed searches for in each line.
The replacement is the text that replaces any matches found by the pattern, allowing for content modification.
flags are optional modifiers that alter the behavior of the substitution, such as making it global with g or applying other specific actions.

Common Flags:

Flag	Description
`g`	Global replacement in the line.
`i`	Case-insensitive matching.
`p`	Prints the line if a substitution occurred.

Example: Replace 'apple' with 'orange' globally:

sed 's/apple/orange/g' fruits.txt

Deletion

Delete lines matching a pattern or at a specific line number.

sed '/pattern/d' inputfile

Example: Remove empty lines:

sed '/^$/d' file.txt

Explanation:

^$ matches empty lines.
d deletes the matching lines.

Insertion and Appending

Insert or append text before or after a line matching a pattern.

Insert (i):

sed '/pattern/i\text to insert' inputfile

Example: Insert a header before the first line:

sed '1i\Header Text' file.txt

Append (a):

sed '/pattern/a\text to append' inputfile

Transformation

Transform characters using the y command, similar to tr.

sed 'y/source/destination/' inputfile

Example: Convert lowercase to uppercase:

sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/' file.txt

Advanced `sed` Techniques

Regular Expressions in `sed`

sed supports regular expressions for pattern matching.

Metacharacters:

Symbol	Description
`.`	Matches any single character.
`*`	Matches zero or more occurrences of the preceding character.
`[]`	Character class; matches any one character inside the brackets.
`^`	Matches the start of a line.
`$`	Matches the end of a line.
`\`	Escapes a metacharacter.

Example: Replace lines starting with 'Error':

sed '/^Error/s/^Error/Warning/' logs.txt

Addressing

Specify lines to apply commands to using line numbers or patterns.

sed 'address command' inputfile

Address Types:

Using single line deletion, the command sed '5d' file.txt deletes only line 5 from the specified file.
For line range deletion, the command sed '2,4d' file.txt removes lines 2 through 4, allowing targeted multi-line deletion.
With a pattern range, the command sed '/start/,/end/d' file.txt deletes all lines from the first occurrence of 'start' up to and including 'end', enabling deletion based on matching patterns.

Holding Space

sed has a pattern space and a hold space for complex text manipulations.

Example: Swap adjacent lines:

sed 'N; s/\(.*\)\n\(.*\)/\2\n\1/' file.txt

Explanation:

N reads the next line into the pattern space.
The s command swaps the two lines.

Practical Examples and Use Cases

I. Replace All Occurrences of a String:

sed 's/old/new/g' file.txt

II. Delete Lines Containing a Pattern:

sed '/unwanted_pattern/d' file.txt

III. Insert Text After a Line Matching a Pattern:

sed '/pattern/a\New line of text' file.txt

IV. Edit Files In-Place with Backup:

sed -i.bak 's/foo/bar/g' file.txt

-i.bak edits the file in-place and creates a backup with .bak extension.

V. Change Delimiters in CSV Files:

sed 's/,/|/g' data.csv > data.psv

Converts comma-separated values to pipe-separated values.

Tips and Best Practices

Enclose scripts in single quotes to prevent shell interpretation.
Use backslashes to escape characters like /, &, and \.
Use without -i to test commands before modifying files in-place.
With -E (or sed -r in GNU sed), you can use extended regex syntax.

Example:

sed -E 's/([0-9]{3})-([0-9]{2})-([0-9]{4})/XXX-XX-\3/' ssn.txt

Awk

Developed in the 1970s by Alfred Aho, Peter Weinberger, and Brian Kernighan at Bell Labs (hence the name awk), awk is a powerful text-processing language. It is designed for data extraction and reporting, offering a programming language with C-like syntax and features.

Main idea:

Treats each line as a record and fields separated by delimiters.
Executes actions based on pattern matches.
Supports variables, arrays, and control flow statements.
Provides functions for string manipulation, arithmetic, and more.
Allows user-defined functions.

Syntax

The basic syntax of awk is:

awk 'PATTERN { ACTION }' INPUTFILE

The PATTERN represents a regular expression or condition that awk uses to identify matching lines or data segments within the input.
The ACTION includes the commands that awk executes whenever it finds a match for the specified pattern, determining how matching data is processed.
The INPUTFILE is the file that awk processes, allowing it to apply the pattern and action rules across the contents of the specified file.

How `awk` Works

User
 |
 | Uses 'awk' with a program and input file(s)
 v
+-------------------------------+
| awk Command                   |
|  - Reads Input Line by Line   |
|  - Splits Line into Fields    |
|  - Applies Pattern and Action |
+-------------------------------+
 |
 | Outputs to Terminal or File
 v

awk reads the input file line by line, splits each line into fields based on a delimiter (default is whitespace), and then executes the specified actions on lines matching the pattern.

Common Operations with `awk`

Field and Record Processing

Fields are individual data segments within a line, accessed using positional variables like $1, $2, and so forth up to $NF, where $NF represents the total number of fields.
Records refer to each line within the input, with each record accessible using the NR variable, which denotes the current record number in the sequence.

Example: Print the first and third fields:

awk '{ print $1, $3 }' data.txt

Patterns and Actions

Execute actions only on lines matching a pattern.

Example: Print lines where the second field equals 'Error':

awk '$2 == "Error" { print }' logs.txt

Variables and Operators

awk supports arithmetic and string operations.

Example: Sum values in the third field:

awk '{ sum += $3 } END { print "Total:", sum }' data.txt

Advanced `awk` Techniques

Control Structures

awk supports if, while, for, and other control structures.

Example: Conditional Processing

awk '{
  if ($3 > 100) {
    print $1, $2, "High"
  } else {
    print $1, $2, "Low"
  }
}' data.txt

Built-in Functions

awk provides numerous built-in functions for mathematical and string operations.

String Functions:

Function	Description
`length(str)`	Returns the length of `str`.
`substr(str, start, length)`	Extracts substring.
`tolower(str)`	Converts to lowercase.
`toupper(str)`	Converts to uppercase.

Example: Convert the second field to uppercase:

awk '{ $2 = toupper($2); print }' data.txt

User-Defined Functions

Define custom functions for reuse.

Example: Define a function to calculate the square:

awk 'function square(x) { return x * x }
     { print $1, square($2) }' data.txt

Practical Examples and Use Cases

I. Calculate Average of a Column:

awk '{ total += $3; count++ } END { print "Average:", total/count }' data.txt

II. Filter Rows Based on Field Value:

awk '$4 >= 50 { print }' scores.txt

III. Reformat Output:

awk '{ printf "%-10s %-10s %5.2f\n", $1, $2, $3 }' data.txt

Formats output with fixed-width columns and two decimal places.

IV. Count Occurrences of Unique Values:

awk '{ count[$1]++ } END { for (word in count) print word, count[word] }' words.txt

V. Process Delimited Data with Custom Field Separator:

awk -F ':' '{ print $1, $3 }' /etc/passwd

: sets the field separator to colon.
Prints username and UID.

Tips and Best Practices

Enclose scripts in single quotes to prevent shell interpretation.
Use backslashes to escape characters like /, &, and \.
Use without -i to test commands before modifying files in-place.
With -E (or sed -r in GNU sed), you can use extended regex syntax.

Example:

sed -E 's/([0-9]{3})-([0-9]{2})-([0-9]{4})/XXX-XX-\3/' ssn.txt

Ensure variables are initialized to avoid unexpected results.
Use -F to set custom field separators.
Use BEGIN and END Blocks for actions before processing starts or after it ends.

Example:

awk 'BEGIN { print "Start Processing" } { print $0 } END { print "End Processing" }' file.txt

Challenges

Research and describe the core differences between sed and awk, focusing on their primary functionalities. Compare how each tool handles text streams and structured data, and discuss when it might be more appropriate to use sed versus awk.
Describe the sequence of operations sed performs on a text stream, including how it reads input, processes it in the pattern space, and outputs results. Explain the purpose of the pattern space and how sed uses it to manage the text transformations on each line.
By default, awk uses whitespace as a delimiter to separate columns. Explain how to modify this default setting to use other delimiters, such as a comma (,) or a colon (:). Provide examples demonstrating how to set these delimiters with the -F option in awk.
Demonstrate how to extract data from multiple columns in awk with a specific example. Show how to retrieve data from the first, third, and fifth columns simultaneously, and explain how this syntax differs from extracting each column individually.
Use awk to filter lines where a particular column does not match a specific pattern. Provide an example that demonstrates this process, such as displaying lines from a file where the second column does not contain the word "error."
Explain how awk can perform data aggregation tasks, such as calculating the sum of values in a specific column. Provide an example of using awk to read a file with multiple rows of numeric data and compute the sum of all values in a given column.
Show how to use sed to replace all occurrences of a word in a text file with another word. Demonstrate how sed can be used both to perform a global replacement within each line and to limit replacements to the first occurrence of the word on each line.
Explain how to use awk to format and print data in a specific way. For instance, given a file with names and scores, demonstrate how you could use awk to print the names and scores in a formatted table with aligned columns.
Use sed to delete lines containing a specific pattern from a text file. Describe the command you used and explain how sed processes the file to selectively remove lines based on pattern matching.
Combine sed and awk in a pipeline to perform more complex text transformations. For example, use sed to remove blank lines from a file, and then use awk to calculate the average of numeric values in a specific column. Explain how combining these tools in a pipeline can solve more advanced text processing tasks.