Posted on
Administration

Creating a System Health Check Bash Script

Author
  • User
    Linux Bash
    Posts by this author
    Posts by this author

A system health check Bash script can be used to monitor the status of critical components like CPU, memory, disk usage, and services. Here's how to create one:


Step-by-Step Guide

1. Define the Script Purpose

The script will:

  • Check CPU usage.

  • Monitor memory usage.

  • Report disk space usage.

  • Verify running services.

  • Log the results.

  • Optionally, send notifications.


2. Create the Script

Here’s an example of a health check script:

#!/bin/bash

# Configuration
LOGFILE="/var/log/system_health.log"
THRESHOLD_CPU=80
THRESHOLD_MEM=80
THRESHOLD_DISK=90
SERVICES=("nginx" "mysql" "ssh")

# Ensure the log file exists
touch $LOGFILE

echo "System Health Check - $(date)" >> $LOGFILE
echo "---------------------------------" >> $LOGFILE

# CPU Usage
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2 + $4}')
echo "CPU Usage: $CPU_USAGE%" >> $LOGFILE
if (( $(echo "$CPU_USAGE > $THRESHOLD_CPU" | bc -l) )); then
    echo "WARNING: High CPU usage detected!" >> $LOGFILE
fi

# Memory Usage
MEM_USAGE=$(free | awk '/Mem:/ {printf("%.2f"), $3/$2 * 100.0}')
echo "Memory Usage: $MEM_USAGE%" >> $LOGFILE
if (( $(echo "$MEM_USAGE > $THRESHOLD_MEM" | bc -l) )); then
    echo "WARNING: High memory usage detected!" >> $LOGFILE
fi

# Disk Usage
DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | sed 's/%//')
echo "Disk Usage: $DISK_USAGE%" >> $LOGFILE
if (( DISK_USAGE > THRESHOLD_DISK )); then
    echo "WARNING: High disk usage detected!" >> $LOGFILE
fi

# Check Services
echo "Service Status:" >> $LOGFILE
for SERVICE in "${SERVICES[@]}"; do
    if systemctl is-active --quiet $SERVICE; then
        echo "$SERVICE is running." >> $LOGFILE
    else
        echo "WARNING: $SERVICE is not running!" >> $LOGFILE
    fi
done

# Log completion
echo "Health check completed at $(date)." >> $LOGFILE
echo "---------------------------------" >> $LOGFILE

# Optional: Send an alert if there are warnings
if grep -q "WARNING" $LOGFILE; then
    echo "System Health Check completed with warnings. Check the log at $LOGFILE"
    # Uncomment to send email (requires mail setup)
    # mail -s "System Health Warning" admin@example.com < $LOGFILE
fi

3. Make the Script Executable

Save the script as health_check.sh and make it executable:

chmod +x health_check.sh

4. Automate with Cron

Schedule the script to run periodically: 1. Edit the crontab: bash crontab -e 2. Add a cron job (e.g., every hour): bash 0 * * * * /path/to/health_check.sh


5. Customizations

Email Notifications

Send an email if warnings are detected:

grep "WARNING" $LOGFILE | mail -s "System Health Alert" admin@example.com

Check Additional Metrics

  • Network Latency:

    ping -c 1 google.com | awk -F'=' '/time=/{print $NF}'
    
  • Specific Port Availability:

    nc -zv localhost 80
    
  • Custom Services or Processes: Replace systemctl checks with pgrep for non-systemd environments:

    pgrep -x "service_name" > /dev/null && echo "Running" || echo "Not running"
    

Log Rotation

Rotate logs to prevent excessive file growth:

LOGFILE_MAX_LINES=1000
if [ $(wc -l < $LOGFILE) -gt $LOGFILE_MAX_LINES ]; then
    mv $LOGFILE "${LOGFILE}_$(date +%Y%m%d%H%M%S)"
    touch $LOGFILE
fi

6. Test the Script

Run the script manually to ensure it works:

./health_check.sh

7. Example Output

Sample log:

System Health Check - Fri Jan 5 15:45:23 UTC 2025

---------------------------------
CPU Usage: 45.5%
Memory Usage: 76.34%
Disk Usage: 85%
Service Status:
nginx is running.
mysql is not running!
ssh is running.
Health check completed at Fri Jan 5 15:45:23 UTC 2025.

---------------------------------

This script can be expanded and tailored to your specific monitoring needs. Let me know if you’d like to add any additional checks or features!

Further Reading

For those interested in expanding their knowledge on Bash scripting and system monitoring, the following resources provide additional information and more advanced guidance:

These resources will provide both foundational and intricate knowledge applicable to various scenarios in system monitoring and management using Bash scripting.