Last modified: July 30, 2024

Disk Usage Management

The ability to manage and monitor disk usage is crucial when maintaining servers. Disk usage is often checked when diagnosing system issues, planning for future storage requirements, or cleaning up unused files and directories.

Understanding the df command

The df (disk free) command reports usage for the mounted filesystems on your machine: total size, used space, available space, and the percentage of space used. By default the sizes are printed as raw block counts. The -h (human-readable) option prints them with unit suffixes such as K, M, or G instead.

For example, executing df -h might produce an output like the following:

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       2.0T  1.0T  1.0T  50% /
/dev/sda2       500G  200G  300G  40% /boot

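From left to right, each row shows the device, its total size, the space used, the space still available, the percentage used, and the mount point. Here the root filesystem (/) is half full and /boot is at 40%.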
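
When only one figure is needed, for example inside a script, the df output can be filtered. The following is a minimal sketch rather than a canonical recipe: use_percent is a hypothetical helper name, and it assumes the POSIX output format (df -P), where the mount point is the sixth column and contains no spaces.

```shell
#!/bin/bash
# use_percent (hypothetical helper): print the Use% value, without the % sign,
# for one mount point. Reads `df -P` style output on stdin.
use_percent() {
    local mount="$1"
    awk -v m="$mount" 'NR > 1 && $6 == m { sub(/%/, "", $5); print $5 }'
}

# -P selects the POSIX format, which keeps every filesystem on a single line.
df -P | use_percent /
```

Reading from stdin keeps the helper usable on saved df output as well as on a live system.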

Exploring the du Command

The du (disk usage) command is used to estimate the space occupied by files or directories. To display the output in a human-readable format, you can use the -h option. The -s option provides a summarized result for directories. For example, running du -sh . will show the total size of the current directory in a human-readable format.

To find the top 10 largest directories starting from the root directory (/), you can use the following command:

du -x / | sort -nr | head -10

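An example output might look like this. GNU du reports sizes in 1 KiB blocks by default, and only top-level directories are shown here for brevity; real output also lists / itself and nested subdirectories: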

10485760    /usr
5120000     /var
2097152     /lib
1024000     /opt
524288      /boot
256000      /home
128000      /bin
64000       /sbin
32000       /etc
16000       /tmp

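In this command, du -x / reports the size of every directory under / while staying on the root filesystem (-x skips other mounted filesystems), sort -nr sorts the lines numerically in descending order, and head -10 keeps the first ten.

This command sequence helps you quickly identify the directories consuming the most space on your system.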
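
The pipeline above ranks directories at every depth, so a large subdirectory and each of its parents all take a slot in the top 10. A sketch that limits du to one level below a starting directory, assuming GNU du (--max-depth) and using a hypothetical function name, top_dirs:

```shell
#!/bin/bash
# top_dirs (hypothetical helper): list the N largest entries in du's output
# for DIR, descending no more than one level. Sizes are in KiB (-k).
# --max-depth is a GNU du option; -x keeps du on DIR's filesystem.
top_dirs() {
    local dir="$1" count="${2:-10}"
    du -x -k --max-depth=1 "$dir" 2>/dev/null | sort -nr | head -n "$count"
}
```

For example, top_dirs /var 5 prints the five largest lines for /var. The total for /var itself is part of the list, alongside its immediate subdirectories.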

To speed up du when there are many subdirectories, you can use xargs -P to run several du processes in parallel, one per directory. How much this helps depends on the storage: du is often limited by disk I/O rather than CPU, so the gain tends to be largest on fast storage or when filesystem metadata is already cached. Additionally, combining it with awk can help format the output more cleanly.

Here’s an enhanced example that finds the top 10 largest directories and uses xargs to speed up the process:

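find / -mindepth 1 -maxdepth 1 -type d | xargs -I{} -P 4 du -sh {} 2>/dev/null | sort -hr | head -10 | awk '{printf "%-10s %s\n", $1, $2}'

Explanation:

I. find / -mindepth 1 -maxdepth 1 -type d: Finds the directories directly under the root (/). -maxdepth 1 limits the search to the top level, and -mindepth 1 excludes / itself, which would otherwise make one du process scan the entire tree.

II. xargs -I{} -P 4 du -sh {} 2>/dev/null: Runs du -sh once per directory, substituting each path for {} (-I{}) and keeping up to four processes running at a time (-P 4). Error messages, such as permission-denied warnings, are discarded (2>/dev/null).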

III. sort -hr: Sorts by size, comparing human-readable values such as 2K, 5M, or 1G (-h), in reverse order (-r), so the largest directories come first.

IV. head -10: Limits the output to the top 10 largest directories.

V. awk '{printf "%-10s %s\n", $1, $2}': Formats the output so the size and directory name align neatly. The %-10s gives the size column a fixed width, making the output more readable. Because awk splits fields on whitespace, a directory name that contains spaces is cut off at the first space.

By using xargs -P, you can reduce the time it takes to compute the disk usage of directories, especially on systems with many directories, multiple CPU cores, and fast storage. On a single slow disk the parallel scans may compete for I/O, so it is worth timing both variants before adopting one.
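
Directory names containing spaces, quotes, or newlines can trip up the xargs -I{} and awk steps. A sketch of a null-delimited variant, assuming GNU find and xargs (-print0, -0, -r); parallel_du is a hypothetical name, and sizes are printed in KiB so that a plain numeric sort works:

```shell
#!/bin/bash
# parallel_du (hypothetical helper): size of each directory directly under DIR,
# in KiB, largest first, running up to NJOBS du processes at once.
parallel_du() {
    local dir="$1" njobs="${2:-4}"
    # -print0 / -0 pass names separated by NUL bytes, so spaces are preserved.
    # -n 1 hands one directory to each du; -r skips du if nothing was found.
    find "$dir" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -r -n 1 -P "$njobs" du -sk 2>/dev/null |
        sort -nr
}
```

Calling parallel_du / 4 | head -10 gives a result comparable to the one-liner above, with sizes in KiB instead of human-readable units.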

The ncdu Command

For a more visual and interactive representation of disk usage, you can use ncdu (NCurses Disk Usage), an ncurses-based tool that provides a user-friendly interface for quickly assessing which directories consume the most disk space. If it is not already installed, you can install it via your package manager, such as apt for Debian-based systems or yum for Red Hat-based systems.

Running the command ncdu -x / will start the program at the root directory (/) and present an interactive interface. Here, you can navigate through directories using arrow keys and view their sizes, making it easier to identify space hogs.

Here’s an example of what the output might look like in a non-interactive, textual representation:

ncdu 1.15 ~ Use the arrow keys to navigate, press ? for help
--- / -----------------------------------------------------------------------
    4.6 GiB [##########] /usr
    2.1 GiB [####      ] /var
  600.0 MiB [#         ] /lib
  500.0 MiB [#         ] /opt
  400.0 MiB [          ] /boot
  300.0 MiB [          ] /sbin
  200.0 MiB [          ] /bin
  100.0 MiB [          ] /etc
   50.0 MiB [          ] /tmp
   20.0 MiB [          ] /home
   10.0 MiB [          ] /root
    5.0 MiB [          ] /run
    1.0 MiB [          ] /srv
    0.5 MiB [          ] /dev
    0.1 MiB [          ] /mnt
    0.0 MiB [          ] /proc
    0.0 MiB [          ] /sys
 Total disk usage: 8.8 GiB  Apparent size: 8.8 GiB  Items: 123456

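In this output, each row shows a directory's size, a bar that scales that size against the largest entry in the list, and the directory name. The footer shows the total disk usage, the apparent size, and the number of items scanned.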

ncdu is especially useful for quickly finding large directories and files, thanks to its intuitive interface. The ability to easily navigate through directories makes it a powerful tool for managing disk space on your system.

Cleaning Up Disk Space

Once you've identified what's using your disk space, the next step is often to free up space, for example by removing unused files and directories.
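
Before deleting anything, it helps to list the candidates and review them. A minimal sketch, assuming GNU or BSD find and du; list_large_files is a hypothetical helper, and the 100 MiB default is an arbitrary illustration rather than a recommendation:

```shell
#!/bin/bash
# list_large_files (hypothetical helper): show files under DIR larger than
# MIN_KIB kibibytes, biggest first. It only lists; nothing is deleted.
list_large_files() {
    local dir="$1" min_kib="${2:-102400}" count="${3:-10}"
    # -xdev keeps find on DIR's filesystem; du -k prints "size<TAB>path".
    find "$dir" -xdev -type f -size +"${min_kib}k" -exec du -k {} + 2>/dev/null |
        sort -nr | head -n "$count"
}
```

Running list_large_files /var/log 10240 5, for instance, would show up to five files above 10 MiB under /var/log, which can then be inspected before any cleanup.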

Automating Disk Usage Checks

For ongoing disk usage monitoring, consider setting up automated tasks. For instance, you can schedule a cron job that runs df and du at regular intervals and sends reports via email or logs them for later review.

Monitoring disk usage proactively can prevent potential issues related to low disk space, such as application errors, slow performance, or system crashes.
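
A report is easier to act on when it only speaks up for filesystems that are nearly full. The following sketch flags filesystems at or above a usage limit; over_threshold and the 90% limit are assumptions for illustration, and the parsing again relies on the POSIX format of df -P with mount points that contain no spaces.

```shell
#!/bin/bash
# Hypothetical limit in percent; adjust to suit the system.
THRESHOLD=90

# over_threshold (hypothetical helper): print "mount percent%" for every
# filesystem whose usage is at or above LIMIT. Reads `df -P` output on stdin.
over_threshold() {
    local limit="$1"
    awk -v limit="$limit" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= limit) print $6, pct "%"
    }'
}

df -P | over_threshold "$THRESHOLD"
```

If the script prints nothing, no filesystem has reached the limit, which makes it convenient to run from cron, since cron typically mails only non-empty output.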

Bash Script Example for Disk Usage Monitoring

#!/bin/bash

# Script to monitor disk usage and report

# Set the path for the log file
LOG_FILE="/var/log/disk_usage_report.log"

# Get disk usage with df
echo "Disk Usage Report - $(date)" >> "$LOG_FILE"
echo "---------------------------------" >> "$LOG_FILE"
df -h >> "$LOG_FILE"

# Get top 10 directories consuming space
echo "" >> "$LOG_FILE"
echo "Top 10 Directories by Size:" >> "$LOG_FILE"
du -x / 2>/dev/null | sort -nr | head -10 >> "$LOG_FILE"

# Optionally, you can send this log via email instead of writing to a file
# For email, you can use: mail -s "Disk Usage Report" recipient@example.com < "$LOG_FILE"

# End of script

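To run the script every day, make it executable and move it into /etc/cron.daily/. On Debian-based systems, run-parts skips files whose names contain a dot, so the .sh extension is dropped in the move:

sudo chmod +x /path/to/disk_usage_monitor.sh && sudo mv /path/to/disk_usage_monitor.sh /etc/cron.daily/disk_usage_monitor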

Challenges

  1. Display the free space available on the root filesystem (/).
  2. For each mounted filesystem, show the percentage of space used.
  3. Provide information about all filesystems, including those that are not currently mounted.
  4. Determine the size of the directory you're currently in.
  5. Check and report the size of the /home directory.
  6. Identify the 10 largest directories in the system.
  7. Track and report the amount of data being written to the disk in real-time.
  8. Locate individual files that are taking up the most space on the disk.
  9. Take snapshots of disk usage at different times and compare them to identify growth trends.
  10. Break down disk usage statistics by the types of files (e.g., .txt, .jpg, .log).
