BudiBadu Logo
Samplebadu

Bash by Example: Find Large Files

Bash 5.0+

Locating files larger than a specific size using find command with -size operator, implementing disk space analysis by identifying large files, using human-readable size units like M for megabytes and G for gigabytes, sorting results by size for prioritization, and automating large file cleanup or archival processes.

Code

#!/bin/bash

SEARCH_DIR="."
SIZE="+10M" # Files larger than 10 Megabytes

echo "Searching for files larger than $SIZE in $SEARCH_DIR..."

# Create a dummy large file (15MB) for demo
# fallocate is faster than dd for this
if command -v fallocate >/dev/null; then
    fallocate -l 15M large_test_file.dat
else
    dd if=/dev/zero of=large_test_file.dat bs=1M count=15 status=none
fi

# Find and display size
# -size +10M matches files > 10MB
# -exec ls -lh {} shows details
find "$SEARCH_DIR" -type f -size "$SIZE" -exec ls -lh {} \; | awk '{ print $5, $9 }'

echo "Done."
rm large_test_file.dat

Explanation

Disk space analysis often requires finding the "heavy hitters"—the few large files consuming most of the space. The find command excels at this with its -size predicate.

The size format supports suffixes like k (kilobytes), M (megabytes), and G (gigabytes). The + prefix (e.g., +100M) means "larger than", while - means "smaller than". Without a prefix, it searches for files of exactly that size (rounded up to the block size).

This script combines finding with listing. By executing ls -lh on the results, we get a human-readable confirmation of the file sizes immediately. This is often more useful than just a list of paths when you are deciding what to delete.

Code Breakdown

11
fallocate pre-allocates space instantly without writing zeros. It's the fastest way to create large dummy files for testing.
19
-size +10M. Note the capitalization. In standard find, k is lowercase but M and G are uppercase.