Introduction
Manually checking your server’s health is a repetitive and inefficient task. A much better approach is to automate these checks with a simple Bash script. This allows you to get a quick, consistent overview of your system’s vital signs—like CPU load, memory, and disk space—anytime you need it.
This guide will walk you through creating a straightforward yet powerful health monitoring script. You’ll learn to check critical metrics and set custom thresholds that flag a metric in the report when it needs attention.
Step 1: Create and Prepare the Script File
First, create a new file for your script. We’ll name it health_check.sh.
touch health_check.sh
Next, make the file executable. This permission allows you to run it directly from your terminal.
chmod +x health_check.sh
Now, open health_check.sh in your favorite text editor (like nano, vim, or VS Code) to start building the script.
Step 2: Build the Health Check Script
We will build the script piece by piece. The final result will be a single file that checks multiple system metrics.
The Basic Structure
Every Bash script should start with a “shebang” (#!) that tells the system which interpreter to use. We will also define some variables for color-coded output and set warning thresholds.
Copy this foundational code into your health_check.sh file:
#!/bin/bash
# ------------------------------------------------------------------
# A simple script to monitor basic system health metrics.
#
# Author: ShellFu
# ------------------------------------------------------------------
# --- Configuration ---
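# Set the warning thresholds for memory and disk usage (in percent).
# Note: CPU_THRESHOLD is defined for future use; the load check in this
# script compares the load average against the number of CPU cores instead.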
CPU_THRESHOLD=80
MEM_THRESHOLD=80
DISK_THRESHOLD=90
# --- Colors for Output ---
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# --- Functions ---
# Function to print a separator line
print_separator() {
    printf -- '-%.0s' {1..60}
    printf "\n"
}
# --- Main Logic ---
echo "System Health Report - $(date)"
print_separator
This setup defines thresholds and color codes, making the final report easy to read.
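The color codes above are ANSI escape sequences, which show up as stray characters if the report is redirected to a file. A minimal sketch of one way to handle that, assuming a hypothetical helper named setup_colors that is not part of the original script, is to enable colors only when the output is a terminal:

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): set the color variables
# only when the given file descriptor (default: 1, stdout) is a terminal.
setup_colors() {
    local fd="${1:-1}"
    if [ -t "$fd" ]; then
        RED='\033[0;31m'
        GREEN='\033[0;32m'
        YELLOW='\033[1;33m'
        NC='\033[0m'
    else
        # Not a terminal (pipe, file, cron): use empty strings instead.
        RED='' GREEN='' YELLOW='' NC=''
    fi
}

setup_colors 1
echo -e "${GREEN}Status: example message${NC}"
```

If you adopt something like this, call setup_colors once near the top of the script in place of the fixed color assignments.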
Step 3: Check CPU Load
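The system’s load average reflects how many processes were running or waiting to run over the last 1, 5, and 15 minutes (on Linux it also counts processes blocked on I/O), which makes it a rough indicator of how busy the machine has been. We can get this from the uptime command.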
Add this function to your script below the --- Functions --- comment:
# Function to check system load average
check_load() {
    echo -e "${YELLOW}System Load:${NC}"
    # Get load average from the 'uptime' command
    LOAD_AVG=$(uptime | awk -F'load average: ' '{print $2}')
    echo " Load Average (1m, 5m, 15m): $LOAD_AVG"
    # For a simple CPU check, we'll use the 1-minute load average.
    # We compare it to the number of CPU cores.
    LOAD_1MIN=$(echo "$LOAD_AVG" | awk -F, '{print $1}')
    CPU_CORES=$(nproc)
    # bc is used for floating-point comparison
    if (( $(echo "$LOAD_1MIN > $CPU_CORES" | bc -l) )); then
        echo -e " ${RED}Status: High 1-minute load average detected.${NC}"
    else
        echo -e " ${GREEN}Status: Load average is within normal limits.${NC}"
    fi
    print_separator
}
uptime | awk -F'load average: ' '{print $2}': This command gets the output of uptime and uses awk to print only the load average values.
nproc: This command returns the number of available processing units (CPU cores). A common rule of thumb is that a load average higher than the number of cores indicates a performance bottleneck.
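The comparison above depends on bc, which is not installed on every minimal system. As a hedged alternative, the sketch below does the floating-point comparison with awk and reads the 1-minute value from /proc/loadavg on Linux. The helper name load_exceeds and the fallback values are assumptions made for this example, not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) when the first number is strictly
# greater than the second. awk handles the floating-point comparison.
load_exceeds() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 > limit + 0) }'
}

# On Linux, the first field of /proc/loadavg is the 1-minute load average.
if [ -r /proc/loadavg ]; then
    read -r load_1min _ < /proc/loadavg
else
    load_1min=0   # assumption for this sketch: fall back to 0 elsewhere
fi

# Fall back to a single core if nproc is unavailable.
cores=$(nproc 2>/dev/null || echo 1)

if load_exceeds "$load_1min" "$cores"; then
    echo "High 1-minute load: $load_1min (cores: $cores)"
else
    echo "Load OK: $load_1min (cores: $cores)"
fi
```

Passing the values with -v keeps them out of the awk program text, so an unexpected string in the load value cannot be interpreted as code.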
Step 4: Check Memory Usage
Next, let’s check how much RAM is being used. The free command provides this information.
Add this function to your script:
# Function to check memory usage
check_memory() {
    echo -e "${YELLOW}Memory Usage:${NC}"
    # Get memory usage and format it. 'free -m' shows values in megabytes.
    MEM_INFO=$(free -m)
    TOTAL_MEM=$(echo "$MEM_INFO" | awk 'NR==2 {print $2}')
    USED_MEM=$(echo "$MEM_INFO" | awk 'NR==2 {print $3}')
    MEM_PERCENT=$((USED_MEM * 100 / TOTAL_MEM))
    echo " Total Memory: ${TOTAL_MEM}MB"
    echo " Used Memory: ${USED_MEM}MB (${MEM_PERCENT}%)"
    if [ "$MEM_PERCENT" -gt "$MEM_THRESHOLD" ]; then
        echo -e " ${RED}Status: High memory usage detected.${NC}"
    else
        echo -e " ${GREEN}Status: Memory usage is normal.${NC}"
    fi
    print_separator
}
free -m: Displays system memory in megabytes.
awk 'NR==2 {print $2}': awk processes the output line by line. NR==2 tells it to only look at the second line (the “Mem:” line), and {print $2} prints the second column (total memory).
$((...)): This is Bash’s arithmetic expansion, used here to calculate the usage percentage.
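How the "used" column is calculated can differ between versions of free, so the percentage above may not match what you expect on every system. As a rough alternative on Linux, the sketch below derives the figure from the kernel's MemAvailable estimate in /proc/meminfo. The helper name mem_used_percent is hypothetical, and the approach assumes a kernel recent enough to report MemAvailable.

```shell
#!/bin/bash
# Hypothetical helper: reads /proc/meminfo-style text on stdin and prints the
# percentage of memory that is not available, truncated to an integer.
mem_used_percent() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) printf "%d\n", (total - avail) * 100 / total
            else exit 1   # no MemTotal line found
        }
    '
}

if [ -r /proc/meminfo ]; then
    echo "Memory in use: $(mem_used_percent < /proc/meminfo)%"
fi
```

Because the function reads from stdin, it can be tried against sample text without touching the live system.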
Step 5: Check Disk Space
Running out of disk space is a common problem. This function will check the usage of the root filesystem (/).
Add this function to your script:
# Function to check disk space usage for the root partition
check_disk() {
    echo -e "${YELLOW}Disk Usage (/):${NC}"
    # 'df -h /' gets disk usage for the root filesystem in human-readable format.
    DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | sed 's/%//')
    echo " Usage: ${DISK_USAGE}%"
    if [ "$DISK_USAGE" -gt "$DISK_THRESHOLD" ]; then
        echo -e " ${RED}Status: Low disk space detected.${NC}"
    else
        echo -e " ${GREEN}Status: Disk space is sufficient.${NC}"
    fi
    print_separator
}
df -h /: Reports the disk space for the root (/) filesystem in a “human-readable” format (e.g., GB, MB).
awk 'NR==2 {print $5}': This grabs the usage percentage from the output.
sed 's/%//': This command removes the % symbol so we can perform a numerical comparison.
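The same parsing idea extends to more than one filesystem. The sketch below is one possible variation, not the original script's method: it uses df -P, the POSIX output format that keeps each filesystem on a single line, and a hypothetical helper named disks_over_threshold. The mount points / and /home are example values only, and mount points containing spaces are not handled.

```shell
#!/bin/bash
# Hypothetical helper: reads `df -P` output on stdin and prints
# "<mount point> <percent>" for each filesystem above the given threshold.
disks_over_threshold() {
    local threshold="$1"
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header line
            pct = $5
            sub(/%$/, "", pct)        # strip the trailing percent sign
            if (pct + 0 > limit + 0) print $6, pct
        }
    '
}

# Example call: report any of the listed filesystems above 90% usage.
df -P / /home 2>/dev/null | disks_over_threshold 90
```

No output from the example call means none of the listed filesystems is over the threshold.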
Step 6: Putting It All Together
Now, let’s call these functions from the main part of our script. Replace the --- Main Logic --- section with the following:
# --- Main Logic ---
echo "System Health Report - $(date)"
print_separator
check_load
check_memory
check_disk
echo "Health check complete."
Your final health_check.sh file should now look like this:
#!/bin/bash
# ------------------------------------------------------------------
# A simple script to monitor basic system health metrics.
#
# Author: ShellFu
# ------------------------------------------------------------------
# --- Configuration ---
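# Set the warning thresholds for memory and disk usage (in percent).
# Note: CPU_THRESHOLD is defined for future use; the load check in this
# script compares the load average against the number of CPU cores instead.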
CPU_THRESHOLD=80
MEM_THRESHOLD=80
DISK_THRESHOLD=90
# --- Colors for Output ---
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# --- Functions ---
# Function to print a separator line
print_separator() {
    printf -- '-%.0s' {1..60}
    printf "\n"
}
# Function to check system load average
check_load() {
    echo -e "${YELLOW}System Load:${NC}"
    # Get load average from the 'uptime' command
    LOAD_AVG=$(uptime | awk -F'load average: ' '{print $2}')
    echo " Load Average (1m, 5m, 15m): $LOAD_AVG"
    # For a simple CPU check, we'll use the 1-minute load average.
    # We compare it to the number of CPU cores.
    LOAD_1MIN=$(echo "$LOAD_AVG" | awk -F, '{print $1}')
    CPU_CORES=$(nproc)
    # bc is used for floating-point comparison
    if (( $(echo "$LOAD_1MIN > $CPU_CORES" | bc -l) )); then
        echo -e " ${RED}Status: High 1-minute load average detected.${NC}"
    else
        echo -e " ${GREEN}Status: Load average is within normal limits.${NC}"
    fi
    print_separator
}
# Function to check memory usage
check_memory() {
    echo -e "${YELLOW}Memory Usage:${NC}"
    # Get memory usage and format it. 'free -m' shows values in megabytes.
    MEM_INFO=$(free -m)
    TOTAL_MEM=$(echo "$MEM_INFO" | awk 'NR==2 {print $2}')
    USED_MEM=$(echo "$MEM_INFO" | awk 'NR==2 {print $3}')
    MEM_PERCENT=$((USED_MEM * 100 / TOTAL_MEM))
    echo " Total Memory: ${TOTAL_MEM}MB"
    echo " Used Memory: ${USED_MEM}MB (${MEM_PERCENT}%)"
    if [ "$MEM_PERCENT" -gt "$MEM_THRESHOLD" ]; then
        echo -e " ${RED}Status: High memory usage detected.${NC}"
    else
        echo -e " ${GREEN}Status: Memory usage is normal.${NC}"
    fi
    print_separator
}
# Function to check disk space usage for the root partition
check_disk() {
    echo -e "${YELLOW}Disk Usage (/):${NC}"
    # 'df -h /' gets disk usage for the root filesystem in human-readable format.
    DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | sed 's/%//')
    echo " Usage: ${DISK_USAGE}%"
    if [ "$DISK_USAGE" -gt "$DISK_THRESHOLD" ]; then
        echo -e " ${RED}Status: Low disk space detected.${NC}"
    else
        echo -e " ${GREEN}Status: Disk space is sufficient.${NC}"
    fi
    print_separator
}
# --- Main Logic ---
echo "System Health Report - $(date)"
print_separator
check_load
check_memory
check_disk
echo "Health check complete."
Step 7: Run the Script
Execute the script from your terminal to see the report.
./health_check.sh
You’ll see a clean, color-coded report. If all metrics are within the thresholds, the output will look something like this:
System Health Report - Sun Nov 16 22:31:38 UTC 2025
------------------------------------------------------------
System Load:
Load Average (1m, 5m, 15m): 0.05, 0.15, 0.20
Status: Load average is within normal limits.
------------------------------------------------------------
Memory Usage:
Total Memory: 7950MB
Used Memory: 2150MB (27%)
Status: Memory usage is normal.
------------------------------------------------------------
Disk Usage (/):
Usage: 45%
Status: Disk space is sufficient.
------------------------------------------------------------
Health check complete.
If disk usage were above 90%, that section would show a red warning message instead.
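A colored message is only useful to someone reading the report. If another tool needs to react to a failed check, the script can also signal the result through its exit status. The following is a minimal sketch of that idea under stated assumptions: check_value and run_checks are hypothetical names, the percentages are passed in as arguments rather than measured, and the thresholds mirror the 80 and 90 used in this guide.

```shell
#!/bin/bash
# Hypothetical helper: returns 1 when the value exceeds the threshold.
check_value() {
    local name="$1" value="$2" threshold="$3"
    if [ "$value" -gt "$threshold" ]; then
        echo "$name: ${value}% exceeds ${threshold}%"
        return 1
    fi
    echo "$name: ${value}% OK"
}

# Hypothetical wrapper: runs every check, then succeeds only if none failed.
run_checks() {
    local failures=0
    check_value "memory" "$1" 80 || failures=$((failures + 1))
    check_value "disk" "$2" 90 || failures=$((failures + 1))
    [ "$failures" -eq 0 ]
}

# Example values taken from the sample report (27% memory, 45% disk).
if run_checks 27 45; then
    echo "All checks passed."
else
    echo "At least one check failed."
fi
```

Counting failures instead of stopping at the first one keeps the full report intact while still giving the caller a single pass or fail result.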
Conclusion
You now have a simple, effective Bash script for monitoring your system’s health. You can run it anytime for a quick status check or customize it further. For example, you could add checks for running services (systemctl is-active nginx) or network connectivity.
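As an illustration of the service check mentioned above, here is a hedged sketch built around systemctl is-active. The function name check_service is hypothetical, nginx is only an example unit, and the code assumes a systemd-based system; where systemctl is missing, the unit is simply reported as not active.

```shell
#!/bin/bash
# Hypothetical helper: reports whether a systemd unit is active.
# Returns 1 when the unit is not active or systemctl is unavailable.
check_service() {
    local unit="$1"
    if systemctl is-active --quiet "$unit" 2>/dev/null; then
        echo "Service $unit: active"
    else
        echo "Service $unit: NOT active"
        return 1
    fi
}

# Example call; "|| true" keeps the script going if the service is down.
check_service nginx || true
```

The --quiet flag suppresses systemctl's own output so that only the exit status decides which message is printed.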
As a next step, automate this script using cron to run it on a schedule. To run it every hour, edit your crontab:
crontab -e
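And add the following line, making sure to use the full path to your script. The log file must be writable by the user who owns the crontab; /var/log usually requires root, so pick a path such as one in your home directory if you are not editing root’s crontab. Note that > overwrites the log on every run: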
0 * * * * /home/youruser/health_check.sh > /var/log/health_report.log 2>&1
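If you would rather keep a history of reports than a single snapshot, one option is a small wrapper that timestamps each run and appends to the log. This is a sketch, not part of the original script: the function name run_and_log and both paths in the usage comment are placeholders.

```shell
#!/bin/bash
# Hypothetical wrapper: runs the given script and appends its output
# (stdout and stderr) to the log file under a timestamp header.
run_and_log() {
    local script="$1" log="$2"
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        "$script"
    } >> "$log" 2>&1
}

# Example usage (placeholder paths):
# run_and_log "$HOME/health_check.sh" "$HOME/health_report.log"
```

An append-only log grows without bound, so a setup like this is usually paired with some form of log rotation.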
This simple automation is a cornerstone of effective Linux system administration.