Last modified: October 10, 2024

This article is written in: 🇺🇸

Grep

The grep command is one of the most powerful and versatile tools in the Unix and Unix-like operating systems, including Linux and macOS. Its name stands for global regular expression print, and it is primarily used for searching plain-text data sets for lines that match a regular expression or a fixed string. grep is an essential utility for system administrators, developers, and anyone who works with text processing on the command line.

grep was originally developed in the early 1970s by Ken Thompson, one of the creators of Unix. It was inspired by the ed editor's search command, which used regular expressions to search for patterns in text files. The name "grep" comes from the ed command g/re/p, which stands for "global search for regular expression and print matching lines."

grep is invaluable for a variety of tasks, including:

Syntax

At its core, grep searches input files for lines that match a given pattern and outputs the matching lines. The basic syntax is:

grep [OPTIONS] PATTERN [FILE...]

How It Works Internally

User
 |
 | Provides 'grep' with a pattern and file(s)
 v
+-------------------------------+
| grep Command                  |
|  - Reads input file(s)        |
|  - Scans each line            |
|  - Compares lines to pattern  |
|  - Collects matching lines    |
+-------------------------------+
 |
 | Outputs matching lines
 v
Terminal/Shell

grep reads the specified file(s) line by line, applies the search pattern to each line, and outputs lines that match the pattern.

Example: Searching for a Word in a File

To search for the word "key" in a file named file_name.txt, use:

grep 'key' file_name.txt

Example Output:

10: This is a sample line with the word key.
35: Another line that contains the key term.
57: Yet another example where key appears.

In the output above, each line contains two main elements: the line number and the content where the term "key" is found.

Common Options

grep offers a plethora of options to refine and control the search behavior. Understanding these options can greatly enhance the effectiveness of your searches.

Frequently Used Flags:

Flag Description
-c Count: Prints only a count of matching lines per input file.
-i Ignore Case: Makes the search case-insensitive.
-r Recursive: Recursively searches through directories and subdirectories.
-v Invert Match: Inverts the search, displaying lines that do not match the pattern.
-n Line Number: Displays the line numbers of matching lines.
-e Pattern: Allows the specification of multiple patterns to search for.
-E Extended Regex: Enables extended regular expressions, allowing the use of advanced regex operators without escaping them.
-l Files with Matches: Lists the names of files that contain at least one matching line.
--color Highlight: Highlights the matching strings within the output for better visibility.
-w Word Match: Matches whole words only, not substrings.
-x Line Match: Matches only those lines that exactly match the entire pattern.
-A NUM After Context: Prints NUM lines of trailing context after matching lines.
-B NUM Before Context: Prints NUM lines of leading context before matching lines.
-C NUM Context: Prints NUM lines of output context, both before and after matching lines.
--exclude Exclude Files: Skips files whose base name matches the given GLOB pattern.
--include Include Files: Searches only files whose base name matches the given GLOB pattern.

Using the -e Flag for Multiple Patterns

The -e option allows you to specify multiple patterns in a single grep command.

grep -e 'abc' -e 'def' file_name.txt

Suppose file_name.txt contains:

This is a line containing abc.
This is another line with def.
This line does not contain either pattern.
This line contains both abc and def.

Example Output:

This is a line containing abc.
This is another line with def.
This line contains both abc and def.

Using Extended Regular Expressions with -E

The -E flag enables extended regular expressions, which provide more powerful pattern matching capabilities.

To find lines containing either 'abc' or 'def':

grep -E 'abc|def' file_name.txt

Sample Input (file_name.txt):

This is a line containing abc.
This is another line with def.
This line contains both abc and def.
Nothing relevant here.

Example Output:

This is a line containing abc.
This is another line with def.
This line contains both abc and def.

Recursive Search with -r

The -r option allows you to search through all files in a directory and its subdirectories.

Find where does the following ip address 192.45.92.0 appear:

grep -r '192.45.92.0' /var/log/

Sample Directory Structure:

/var/log/syslog
/var/log/auth.log

Sample Content (/var/log/syslog):

Connecting to server at 192.45.92.0
Service started successfully.

Example Output:

/var/log/syslog:Connecting to server at 192.45.92.0
/var/log/auth.log:Failed login attempt from 192.45.92.0

Using the -l Flag to List Filenames

The -l option lists only the names of files containing matches, not the matching lines themselves.

Find all files containing the word 'nginx' in the /var/log/ directory:

grep -l 'nginx' /var/log/*

Example Output:

/var/log/nginx/error.log
/var/log/nginx/access.log

Excluding Patterns with the -v Flag

The -v option inverts the match, displaying lines that do not match the specified pattern.

Exclude comment and empty lines from the SSH configuration file:

grep -v '^#' /etc/ssh/sshd_config | grep -v '^$'

Sample Input (/etc/ssh/sshd_config):

# This is a comment
Port 22

# Another comment
PermitRootLogin no

Example Output:

Port 22
PermitRootLogin no

Combining Exclusions with Extended Regex

You can achieve the same result as above more efficiently using extended regular expressions:

grep -Ev '^(#|$)' /etc/ssh/sshd_config

Case-Insensitive Recursive Search with Error Suppression

To perform a case-insensitive search and suppress error messages:

grep -ri '10.0.0.1' /var/log/ 2>/dev/null

Searching for Patterns in System Files

To find group names in /etc/group containing 'rw.':

grep 'rw.' /etc/group

Sample Input (/etc/group):

rwuser:x:1001:
rwgroup:x:1002:
reader:x:1003:

Example Output:

rwuser:x:1001:
rwgroup:x:1002:

Highlighting Matches with --color

To highlight matching patterns in the output:

grep --color ':[0-9]*:' /etc/group

Sample Input (/etc/group):

user:x:1001:1002:user
admin:x:1000:

Example Output:

user:x:1001:1002:user
admin:x:1000:

In this example, :[0-9]*: matches a colon, followed by zero or more digits, followed by another colon. The --color flag highlights the matching part in color.

Understanding Regular Expressions (RegEx)

Regular expressions are patterns used to match character combinations in strings. In grep, regular expressions allow for complex pattern matching, providing flexibility and precision.

Basic RegEx Symbols

Symbol Description
. Matches any single character except a newline.
^ Matches the start of a line.
$ Matches the end of a line.
* Matches zero or more occurrences of the preceding element.
+ Matches one or more occurrences of the preceding element (when using -E for extended regex).
? Matches zero or one occurrence of the preceding element.
[] Denotes a character class. Matches any one character within the brackets.
[^] Negated character class. Matches any character not in the brackets.
() Groups multiple tokens together and remembers the matched text (capturing group).
| Logical OR operator. Matches patterns on either side of the |.
\ Escapes the following character, removing its special meaning.

Practical Examples of RegEx Symbols

Using ^ to Match Start of Line

Find all lines that begin with # (commonly used for comments):

grep '^#' /opt/test.txt

Sample Input (/opt/test.txt):

# This is a comment
echo "Hello, World!"
# Another comment line
echo "Goodbye!"

Example Output:

# This is a comment
# Another comment line

In this example, the ^# expression matches any line where # is the first character. This is useful for extracting comments or lines with specific starting characters.

Using $ to Match End of Line

Search for lines that end with ; (commonly used in code):

grep ';$' script.sh

Sample Input (script.sh):

echo "Hello, World!";
echo "This line has no semicolon"
echo "Goodbye!";

Example Output:

echo "Hello, World!";
echo "Goodbye!";

Here, ;$ matches lines that end with a semicolon, which helps identify complete statements in code files.

Using . to Match Any Character

Find lines containing "a" followed by any character and then "c":

grep 'a.c' /opt/test.txt

Sample Input (/opt/test.txt):

abc
a-c
a3c
ac

Example Output:

abc
a-c
a3c

The . matches any single character except a newline, so it finds sequences like "abc", "a-c", or "a3c" but skips "ac."

Using \ to Escape Special Characters

To search for a literal period .:

grep '\.' /opt/test.txt

Sample Input (/opt/test.txt):

example.com
another_example
this.line

Example Output:

example.com
this.line

Here, \. treats the dot as a regular character, matching lines containing a period.

Using () for Grouping

Search for lines containing repeated sequences:

grep -E '(abc){2}' /opt/test.txt

Sample Input (/opt/test.txt):

abcabc
abc
xyzabcabcxyz

Example Output:

abcabc
xyzabcabcxyz

The (abc){2} pattern matches two consecutive occurrences of "abc." This requires the -E flag for extended regex.

Using ? for Optional Elements

Find lines containing "color" or "colour":

grep 'colou?r' /opt/test.txt

Sample Input (/opt/test.txt):

color
colour
colored
colourful

Example Output:

color
colour

The u? makes the character u optional, so it matches both "color" and "colour."

Using Character Classes []

Search for lines containing "grey" or "gray":

grep 'gr[ae]y' colors.txt

Sample Input (colors.txt):

grey
gray
grEy
grAy

Example Output:

grey
gray

The [ae] matches either a or e, allowing the command to find both "grey" and "gray."

Negated Character Classes [^]

Find lines that contain any character except digits:

grep '[^0-9]' file.txt

Sample Input (file.txt):

12345
hello123
no_digits_here

Example Output:

hello123
no_digits_here

The [^0-9] matches any character that is not a digit, filtering out purely numerical lines.

Advanced RegEx Quantifiers

Quantifiers specify the number of times an element must occur for a match.

Common Quantifiers:

Quantifier Description
* Matches zero or more occurrences of the preceding element.
+ Matches one or more occurrences of the preceding element.
? Matches zero or one occurrence of the preceding element.
{n} Matches exactly n occurrences of the preceding element.
{n,} Matches n or more occurrences of the preceding element.
{,m} Matches zero to m occurrences of the preceding element.
{n,m} Matches between n and m occurrences of the preceding element.

Using * (Zero or More)

Find lines with zero or more o characters following hell:

grep -E 'hello*' greetings.txt

Sample Input (greetings.txt):

hell
hello
hellooooo
hi there

Example Output:

hell
hello
hellooooo

The o* pattern matches zero or more occurrences of the letter o after hell, so it will match "hell", "hello", and any variant with additional os.

Using + (One or More)

Search for lines containing one or more digits:

grep -E '[0-9]+' data.txt

Sample Input (data.txt):

User ID: 123
Order number: 4567
Reference code: AB12
No numbers here.

Example Output:

User ID: 123
Order number: 4567

The [0-9]+ pattern matches sequences of one or more digits, finding "123" and "4567".

Using {n} (Exactly n Occurrences)

Find lines with exactly four-letter words:

grep -E '\b\w{4}\b' words.txt

Sample Input (words.txt):

This is a test.
Many words have four characters.
Time flies.

Example Output:

This is a test.
Time flies.

The \w{4} pattern matches words of exactly four characters. \b denotes word boundaries, ensuring only whole four-letter words are matched.

Using {n,} (At Least n Occurrences)

Search for lines with words containing at least 8 characters:

grep -E '\b\w{8,}\b' large_words.txt

Sample Input (large_words.txt):

supercalifragilisticexpialidocious
complexity
simplify

Example Output:

supercalifragilisticexpialidocious
complexity

The \w{8,} pattern matches words with at least eight characters.

Using {,m} (Up to m Occurrences)

Find lines with words of up to 5 characters:

grep -E '\b\w{,5}\b' words.txt

Sample Input (words.txt):

tiny big vast enormous small huge

Example Output:

tiny big small huge

The \w{,5} pattern matches words with up to five characters, filtering shorter words.

Using {n,m} (Between n and m Occurrences)

Extract lines with words containing between 3 and 6 characters:

grep -E '\b\w{3,6}\b' words.txt

Sample Input (words.txt):

sunlight
blue sky
green
horizon
earth

Example Output:

blue sky green earth

The \w{3,6} pattern matches words with lengths from three to six characters, identifying "blue", "sky", "green", and "earth".

Contextual Searches with grep

Sometimes, you may need more context around the matching lines. grep provides options to display lines before and after a match.

Options for Contextual Searches:

Option Description
-A n Displays n lines After the matching line.
-B n Displays n lines Before the matching line.
-C n Displays n lines of context (Before and After) the match.

Displaying Lines After a Match

grep -A 2 'ERROR' logfile.txt

Sample Input (logfile.txt):

INFO: Process started
INFO: Connection established
ERROR: Unable to retrieve data
WARNING: Retry attempt 1
INFO: Data retrieval successful
INFO: Process completed

Example Output:

ERROR: Unable to retrieve data
WARNING: Retry attempt 1
INFO: Data retrieval successful

Displaying Lines Before a Match

grep -B 3 'failed' auth.log

Sample Input (auth.log):

INFO: User login attempt
INFO: Password entered
WARNING: Account locked
ERROR: Login failed
INFO: User logout

Example Output:

INFO: User login attempt
INFO: Password entered
WARNING: Account locked
ERROR: Login failed

Displaying Lines Before and After a Match

grep -C 1 'timeout' server.log

Sample Input (server.log):

INFO: Request received
INFO: Processing request
ERROR: Connection timeout
INFO: Retrying connection
INFO: Connection successful

Example Output:

INFO: Processing request
ERROR: Connection timeout
INFO: Retrying connection

Excluding and Including Files in Searches

When performing recursive searches, you might want to include or exclude certain files or file types.

Excluding Files

Example:

Exclude all .log files from the search:

grep -r --exclude='*.log' 'pattern' /path/to/search/

--exclude='*.log' tells grep to skip files ending with .log.

Including Only Specific Files

Search only within .txt files:

grep -r --include='*.txt' 'pattern' /path/to/search/

--include='*.txt' restricts the search to files ending with .txt.

Combining Multiple Options

You can combine multiple options to tailor the grep command for specific needs. This example demonstrates searching for the case-insensitive pattern 'error' in all .log files under /var/log/, excluding files larger than 1MB, and displaying line numbers with context:

find /var/log/ -type f -name '*.log' -size -1M -exec grep -inC 2 'error' {} +

Breaking down the command:

Command/Option Description
find Locates files that match specified criteria.
/var/log/ Specifies the directory to search within.
-type f Limits the search to files only, excluding directories.
-name '*.log' Searches for files with the .log extension.
-size -1M Includes files smaller than 1MB.
-exec grep -inC 2 'error' Executes grep on each found file with specified options.
grep -i Performs a case-insensitive search for the term 'error'.
grep -n Displays line numbers alongside matched lines.
grep -C 2 Shows two lines of context before and after each match.
{} Placeholder for each file found by find, passed to grep.
+ Appends all found files to a single grep command, optimizing performance.

Suppose you have the following .log files within /var/log/:

/var/log/syslog.log:

INFO: System started.
WARNING: Low memory detected.
ERROR: Disk not found.
INFO: Process halted.
INFO: Restarting services.
ERROR: Network failure.

/var/log/server.log:

INFO: Server initialized.
ERROR: Connection timed out.
WARNING: High CPU usage.
INFO: Server shutdown.

Example Output:

/var/log/syslog.log:3-ERROR: Disk not found.
  INFO: Process halted.
  INFO: Restarting services.
  ERROR: Network failure.

/var/log/server.log:2-ERROR: Connection timed out.
  WARNING: High CPU usage.
  INFO: Server shutdown.

Tips and Best Practices

Challenges

  1. List all visible files and directories within the current working directory, excluding any that contain the word "test" in their names. Explain the use of grep for pattern matching and how to use it in combination with other commands like ls to filter results.
  2. Search through the command history for all commands containing the word "clone" that were executed within the past week. Discuss how different shells (like Bash or Zsh) handle history files and how grep can be used to narrow down specific timeframes and patterns.
  3. Check if a user named "adam" exists on the system by searching the /etc/passwd file. If found, display their home directory and default shell information. Discuss how user information is stored in /etc/passwd and the role of grep in filtering out relevant lines.
  4. In a file named file.txt, count the number of lines that contain the word "apple." Explain the use of grep -c for counting occurrences and discuss how regular expressions can make searching more precise.
  5. Using file.txt, locate all lines containing a pattern of three consecutive uppercase letters (e.g., [A-Z]{3}). Then, convert these matched lines to uppercase using tr or another suitable tool. Explore how grep can be paired with text transformation commands for more complex tasks.
  6. Search through the system log files (e.g., /var/log/syslog) for lines containing the word "error" and save the results to a new file named errors.log. Discuss how grep can aid in monitoring and troubleshooting by filtering logs for relevant keywords.
  7. In a directory containing multiple text files, find and list all files that contain the exact phrase "network connection failed." Explain how grep -l helps locate files based on specific content without showing the matched lines themselves.
  8. Use grep to locate lines in a script file that contain comments (#), then count the number of commented lines. Discuss the usefulness of grep for analyzing code and configuration files, particularly for identifying specific patterns like comments or TODOs.
  9. Search a file named data.csv for rows where the third column contains the value "pending." Discuss how grep can be combined with awk or cut to target specific columns in structured files, making it easier to filter data by fields.
  10. Find all occurrences of IP addresses in a file named access.log using a regular expression pattern that matches IPv4 addresses. Save the matched IPs to a new file called ip_addresses.txt. Discuss how grep with regular expressions enables pattern-specific searches, useful for tasks like log analysis.

Table of Contents

  1. Grep
    1. Syntax
    2. How It Works Internally
    3. Example: Searching for a Word in a File
    4. Common Options
      1. Using the -e Flag for Multiple Patterns
      2. Using Extended Regular Expressions with -E
      3. Recursive Search with -r
      4. Using the -l Flag to List Filenames
      5. Excluding Patterns with the -v Flag
      6. Combining Exclusions with Extended Regex
      7. Case-Insensitive Recursive Search with Error Suppression
      8. Searching for Patterns in System Files
      9. Highlighting Matches with --color
  2. Understanding Regular Expressions (RegEx)
    1. Basic RegEx Symbols
    2. Practical Examples of RegEx Symbols
      1. Using ^ to Match Start of Line
      2. Using $ to Match End of Line
      3. Using . to Match Any Character
      4. Using \ to Escape Special Characters
      5. Using () for Grouping
      6. Using ? for Optional Elements
      7. Using Character Classes []
      8. Negated Character Classes [^]
    3. Advanced RegEx Quantifiers
      1. Using * (Zero or More)
      2. Using + (One or More)
      3. Using {n} (Exactly n Occurrences)
      4. Using {n,} (At Least n Occurrences)
      5. Using {,m} (Up to m Occurrences)
      6. Using {n,m} (Between n and m Occurrences)
    4. Contextual Searches with grep
      1. Displaying Lines After a Match
      2. Displaying Lines Before a Match
      3. Displaying Lines Before and After a Match
    5. Excluding and Including Files in Searches
      1. Excluding Files
      2. Including Only Specific Files
    6. Combining Multiple Options
    7. Tips and Best Practices
    8. Challenges