How to Monitor and Restart Failed Services with Bash
Author: Linux Bash
Monitoring and restarting failed services with a Bash script is a practical way to maintain service uptime. Here's a step-by-step guide:
1. Check Service Status
The systemctl command is used to monitor services:
- Check if a service is active:
      systemctl is-active <service_name>
  Returns active if the service is running, or inactive/failed otherwise.
- Check if a service has failed:
      systemctl is-failed <service_name>
  Returns failed if the service has failed, or active/inactive otherwise.
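Both subcommands also set their exit status (0 when the queried state matches), so scripts usually test the exit code instead of parsing the output. When the printed state itself matters, a small helper can map it to an action. The following is a minimal sketch: `next_action` is a hypothetical name, and the mapping from state to action is an assumption you may want to adjust.

```shell
#!/usr/bin/env bash
# next_action: hypothetical helper that maps the state printed by
# `systemctl is-active` to what a monitor might do next.
next_action() {
    case "$1" in
        active)                             echo "none" ;;
        activating|reloading|deactivating)  echo "wait" ;;
        failed|inactive)                    echo "restart" ;;
        *)                                  echo "investigate" ;;
    esac
}

# Example (not run here):
#   next_action "$(systemctl is-active nginx)"
```

Treating transitional states such as activating as "wait" avoids restarting a service that is still starting up.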
2. Create a Monitoring Script
Here’s a simple script to monitor and restart services:
Example Script: Monitor and Restart Services
    #!/bin/bash
    # List of services to monitor
    SERVICES=("nginx" "mysql" "ssh")

    # Loop through each service
    for SERVICE in "${SERVICES[@]}"; do
        # Check if the service is active (--quiet suppresses output; only the exit code is used)
        if ! systemctl is-active --quiet "$SERVICE"; then
            echo "$(date): $SERVICE is down. Attempting to restart..."
            # sudo is unnecessary if the script already runs as root
            # (for example from root's crontab or a systemd unit)
            sudo systemctl restart "$SERVICE"
            # Verify if the service restarted successfully
            if systemctl is-active --quiet "$SERVICE"; then
                echo "$(date): $SERVICE restarted successfully."
            else
                echo "$(date): Failed to restart $SERVICE. Manual intervention required."
            fi
        else
            echo "$(date): $SERVICE is running."
        fi
    done
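One possible variation, shown here as a sketch rather than a replacement, wraps the same check in functions so that the service names come from arguments and the script can exit non-zero when a restart fails. The names `check_service` and `check_all` are hypothetical, and the sketch assumes it runs as root, so it omits sudo.

```shell
#!/usr/bin/env bash
# check_service: hypothetical helper. Returns 0 if the service is active
# (possibly after a restart) and 1 if the restart did not bring it back.
check_service() {
    local service=$1
    if systemctl is-active --quiet "$service"; then
        echo "$(date): $service is running."
        return 0
    fi
    echo "$(date): $service is down. Attempting to restart..."
    systemctl restart "$service"
    if systemctl is-active --quiet "$service"; then
        echo "$(date): $service restarted successfully."
        return 0
    fi
    echo "$(date): Failed to restart $service. Manual intervention required." >&2
    return 1
}

# check_all: runs check_service for every argument and returns non-zero
# if any service could not be recovered.
check_all() {
    local status=0 service
    for service in "$@"; do
        check_service "$service" || status=1
    done
    return "$status"
}

# Example (not run here; assumes root):
#   check_all nginx mysql ssh || exit 1
```

A non-zero exit status lets whatever schedules the script record that a run ended with an unrecovered service.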
3. Automate the Script
To run this script periodically:
Option 1: Cron Job
- Edit the crontab:
      crontab -e
- Add a cron job to execute the script every 5 minutes (or your preferred interval):
      */5 * * * * /path/to/your/script.sh >> /path/to/logfile.log 2>&1
Option 2: Systemd Timer
Create a Service File:
/etc/systemd/system/monitor-services.service
    [Unit]
    Description=Monitor and Restart Failed Services

    [Service]
    ExecStart=/path/to/your/script.sh
Create a Timer File:
/etc/systemd/system/monitor-services.timer
    [Unit]
    Description=Run Service Monitoring Script Periodically

    [Timer]
    OnBootSec=1min
    OnUnitActiveSec=5min

    [Install]
    WantedBy=timers.target
Enable and Start the Timer:
    sudo systemctl enable monitor-services.timer
    sudo systemctl start monitor-services.timer
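After enabling the timer, it helps to confirm that it is scheduled and that the script's output reaches the journal. The sketch below bundles the usual checks into one function. `show_timer_status` is a hypothetical name, the unit name defaults to the one used above, and root privileges are assumed.

```shell
#!/usr/bin/env bash
# show_timer_status: hypothetical helper that reloads unit files and shows
# when the timer last ran and will run next, plus recent script output.
show_timer_status() {
    local unit=${1:-monitor-services}
    # Pick up any edits made to the unit files
    systemctl daemon-reload
    # Show the next and last trigger times for the timer
    systemctl list-timers "$unit.timer"
    # Show the last 20 journal lines written by the service
    journalctl -u "$unit.service" -n 20 --no-pager
}

# Example (not run here):
#   show_timer_status
```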
4. Enhance the Script
Send Notifications on Failure
Use email or messaging systems to alert admins:
- Email Notification:
      echo "Service $SERVICE failed at $(date)" | mail -s "Service Alert" admin@example.com
- Integrate Messaging APIs (e.g., Slack, Telegram) for instant alerts.
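As an illustration of the messaging option, the sketch below posts a one-line alert to a Slack incoming webhook with curl. The function names `json_escape` and `notify_slack` are hypothetical, the webhook URL is a placeholder you must replace, and the escaping only covers single-line messages.

```shell
#!/usr/bin/env bash
# Placeholder: replace with your own incoming-webhook URL.
SLACK_WEBHOOK_URL="${SLACK_WEBHOOK_URL:-https://hooks.slack.com/services/REPLACE/ME}"

# json_escape: escapes backslashes and double quotes so the text can be
# embedded in a JSON string (single-line messages only).
json_escape() {
    local s=$1
    s=${s//\\/\\\\}
    s=${s//\"/\\\"}
    printf '%s' "$s"
}

# notify_slack: hypothetical helper that posts a one-line alert.
notify_slack() {
    local text
    text=$(json_escape "$1")
    curl -fsS -X POST -H 'Content-Type: application/json' \
        --data "{\"text\":\"$text\"}" "$SLACK_WEBHOOK_URL"
}

# Example (not run here):
#   notify_slack "Service $SERVICE failed at $(date)"
```

The `-f` flag makes curl return a non-zero status on HTTP errors, so a failed notification can itself be logged.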
Log Failures
Log service status and restart attempts:
    LOGFILE="/var/log/service-monitor.log"

    # Inside the monitoring loop, where $SERVICE is set:
    echo "$(date): Checking $SERVICE..." >> "$LOGFILE"
    if ! systemctl is-active --quiet "$SERVICE"; then
        echo "$(date): $SERVICE is down. Restarting..." >> "$LOGFILE"
        # Capture both stdout and stderr of the restart attempt
        sudo systemctl restart "$SERVICE" >> "$LOGFILE" 2>&1
    fi
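To avoid repeating the timestamp and redirection on every line, the logging can be collected in one function. This is a minimal sketch: `log` is a hypothetical helper, and the timestamp format is an assumption chosen because it sorts well.

```shell
#!/usr/bin/env bash
# Default log location from the text; override LOGFILE for testing.
LOGFILE="${LOGFILE:-/var/log/service-monitor.log}"

# log: hypothetical helper that appends one timestamped line to $LOGFILE.
log() {
    printf '%s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$LOGFILE"
}

# Example (not run here):
#   log "Checking $SERVICE..."
```

Writing to /var/log normally requires root, so point LOGFILE at a writable path when trying this as a regular user.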
5. Test the Script
- Simulate a service failure:
      sudo systemctl stop <service_name>
- Run the script manually:
      bash /path/to/your/script.sh
- Verify the service restarts and logs/alerts are generated.
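A service can take a moment to come back after a restart, so a single status check during testing may report a false failure. The sketch below polls until the service is active or a timeout expires. `wait_until_active` is a hypothetical helper, and the 30-second default is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# wait_until_active: hypothetical helper. Polls `systemctl is-active` once
# per second until the service is active or the timeout (seconds) expires.
# Returns 0 if the service became active, 1 on timeout.
wait_until_active() {
    local service=$1 timeout=${2:-30} elapsed=0
    until systemctl is-active --quiet "$service"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example (not run here):
#   wait_until_active nginx 10 && echo "nginx recovered"
```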
With this setup, a failed service is detected and restarted within one polling interval (5 minutes in the examples above), and logs and notifications keep you informed of any issues.