Automated VPS Cleanup: How We Recovered 40% Disk Space Efficiently

The Problem: A 40GB VPS Crying for Help

It started with a simple alert: our production server was at 85% disk usage, with only 5.9GB free on a 40GB drive. For a busy web hosting server running WHM/cPanel, this wasn't just an inconvenience; it was a ticking time bomb. A full disk can cause anything from failed backups to crashed databases and complete service outages.

What made this particularly concerning was that the server had been running for over a year without systematic cleanup. Temporary files, old logs, and forgotten backups had accumulated like digital hoarding, slowly eating into our disk space.

Our Approach: Safe, Systematic, and Documented

We followed three principles:

  1. Never delete without analysis 
  2. Log everything for accountability 
  3. Automate prevention for the future
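
The first two principles can be sketched as a small helper that lists what a cleanup pass would remove before anything is deleted. This is an illustrative sketch rather than part of the scripts below; the function name `purge_old_files` and the `DRY_RUN` variable are assumptions introduced for the example, and `touch -d` assumes GNU coreutils.

```shell
#!/bin/bash
# Hypothetical helper: list (and optionally delete) regular files older than N days.
# DRY_RUN=1 (the default here) only prints candidates; DRY_RUN=0 deletes them.
purge_old_files() {
    local dir=$1 days=$2
    if [ "${DRY_RUN:-1}" -eq 1 ]; then
        find "$dir" -type f -mtime +"$days" -print
    else
        # -print before -delete records each removed path so it can be logged
        find "$dir" -type f -mtime +"$days" -print -delete
    fi
}

# Demo on a throwaway directory
demo_dir=$(mktemp -d)
touch -d '10 days ago' "$demo_dir/old.tmp"
touch "$demo_dir/new.tmp"

DRY_RUN=1 purge_old_files "$demo_dir" 7   # lists old.tmp, deletes nothing
```

Running with `DRY_RUN=1` first and reviewing the list is one way to put "never delete without analysis" into practice before switching to `DRY_RUN=0`.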

The Complete Cleanup Solution 

1. The Master Script: server-cleanup-master.sh

cat > /usr/local/bin/server-cleanup-master.sh << 'MASTER_EOF'
#!/bin/bash
#
# iT-werX Server Cleanup Master Script
# Purpose: Systematic disk space recovery with full logging
# Author: iT-werX Admin Team
# Version: 1.0

# Configuration
LOG_DIR="/var/log/server-cleanup"
LOG_FILE="${LOG_DIR}/cleanup_$(date +%Y%m%d_%H%M%S).log"
EMAIL_ADMIN="admin@it-werx.ca"
SERVER_NAME=$(hostname)
TEMP_DIR="/tmp/cleanup_$(date +%s)"
THRESHOLD_PERCENT=80  # Alert threshold

# Create required directories
mkdir -p "$LOG_DIR"
mkdir -p "$TEMP_DIR"

# Logging functions
log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

log_section() {
    echo "" | tee -a "$LOG_FILE"
    echo "=== $1 ===" | tee -a "$LOG_FILE"
    echo "" | tee -a "$LOG_FILE"
}

error_exit() {
    log "ERROR: $1"
    log "Cleanup process aborted!"
    send_report "FAILED"
    cleanup_temp
    exit 1
}

check_error() {
    if [ $? -ne 0 ]; then
        error_exit "$1"
    fi
}

# Email reporting
send_report() {
    local status=$1
    local subject="[iT-werX] Server Cleanup ${status} on ${SERVER_NAME} - $(date)"
    
    # Create report
    local report_file="${TEMP_DIR}/final_report.txt"
    echo "iT-werX Server Cleanup Report" > "$report_file"
    echo "==============================" >> "$report_file"
    echo "Server: ${SERVER_NAME}" >> "$report_file"
    echo "Date: $(date)" >> "$report_file"
    echo "Status: ${status}" >> "$report_file"
    echo "" >> "$report_file"
    cat "$LOG_FILE" >> "$report_file"
    
    # Send email
    mail -s "$subject" "$EMAIL_ADMIN" < "$report_file"
    log "Report emailed to ${EMAIL_ADMIN}"
}

# Cleanup temporary files
cleanup_temp() {
    log "Cleaning up temporary files..."
    rm -rf "$TEMP_DIR"
}

# Pre-flight checks
preflight_checks() {
    log_section "PRE-FLIGHT CHECKS"
    
    # Check if running as root
    if [ "$EUID" -ne 0 ]; then
        error_exit "This script must be run as root"
    fi
    
    # Check disk space before starting
    local usage=$(df / --output=pcent | tail -1 | tr -d ' %')
    log "Current disk usage: ${usage}%"
    
    if [ "$usage" -lt "$THRESHOLD_PERCENT" ]; then
        log "Disk usage below ${THRESHOLD_PERCENT}%, cleanup not urgently needed"
        send_report "NOT_NEEDED"
        cleanup_temp
        exit 0
    fi
    
    # Backup critical files
    log "Backing up critical configurations..."
    mkdir -p "${TEMP_DIR}/backups"
    cp -a /etc/my.cnf /etc/passwd /etc/group "${TEMP_DIR}/backups/" 2>/dev/null
    crontab -l > "${TEMP_DIR}/backups/crontab_backup.txt" 2>/dev/null
    
    log "Pre-flight checks passed"
}

# Disk analysis functions
analyze_disk_usage() {
    log_section "DISK USAGE ANALYSIS"
    
    # Store initial state
    log "Initial disk status:"
    df -h | tee -a "$LOG_FILE"
    
    # Analyze top-level directories
    log "Analyzing directory sizes..."
    local analysis_file="${TEMP_DIR}/disk_analysis.txt"
    du -xh / --max-depth=1 2>/dev/null | sort -rh | head -15 > "$analysis_file"
    
    log "Top space-consuming directories:"
    cat "$analysis_file" | tee -a "$LOG_FILE"
    
    # Extract problematic directories for later use
    grep -E '[0-9]G\s+/' "$analysis_file" | awk '{print $2}' > "${TEMP_DIR}/large_dirs.txt"
}

analyze_home_directories() {
    log_section "HOME DIRECTORY ANALYSIS"
    
    local home_analysis="${TEMP_DIR}/home_analysis.txt"
    
    log "Analyzing user home directories..."
    du -sh /home/* 2>/dev/null | sort -hr > "$home_analysis"
    
    log "Top user accounts by size:"
    head -10 "$home_analysis" | tee -a "$LOG_FILE"
    
    # Store top 5 users for detailed cleanup
    # -r stops xargs from calling basename with no arguments when /home is empty
    head -5 "$home_analysis" | awk '{print $2}' | xargs -r -n1 basename > "${TEMP_DIR}/top_users.txt"
}

# Cleanup phases
phase1_system_cleanup() {
    log_section "PHASE 1: SYSTEM CLEANUP"
    
    log "1. Cleaning package manager cache..."
    if command -v yum >/dev/null 2>&1; then
        yum clean all
    elif command -v apt-get >/dev/null 2>&1; then
        apt-get clean
    fi
    check_error "Package cache cleanup failed"
    
    log "2. Removing old kernel versions..."
    if command -v package-cleanup >/dev/null 2>&1; then
        package-cleanup --oldkernels --count=1 -y
    fi
    
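    log "3. Cleaning system temporary files (older than 7 days)..."
    # Age-based removal instead of 'rm -rf /tmp/*': a blanket delete would also
    # wipe ${TEMP_DIR} (which lives in /tmp) along with sockets and files in active use.
    find /tmp /var/tmp -xdev -mindepth 1 -type f -mtime +7 ! -path "${TEMP_DIR}/*" -delete 2>/dev/null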
}

phase2_log_management() {
    log_section "PHASE 2: LOG MANAGEMENT"
    
    local large_logs="${TEMP_DIR}/large_logs.txt"
    
    log "1. Identifying large log files (>50MB)..."
    find /var/log -type f -name "*.log" -size +50M 2>/dev/null > "$large_logs"
    
    if [ -s "$large_logs" ]; then
        log_count=$(wc -l < "$large_logs")
        log "Found ${log_count} large log files"
        
        while read -r logfile; do
            size=$(du -h "$logfile" | cut -f1)
            log "Rotating: ${logfile} (${size})"
            
            # Copy-truncate instead of mv: daemons keep the original file open,
            # so a renamed log would keep growing and free no space.
            if [ -f "$logfile" ]; then
                gzip -c "$logfile" > "${logfile}.old.gz" && truncate -s 0 "$logfile"
            fi
        done < "$large_logs"
    else
        log "No abnormally large log files found"
    fi
    
    log "2. Compressing rotated log files (*.old)..."
    find /var/log -type f -name "*.old" -exec gzip {} \; 2>/dev/null
    
    log "3. Removing very old compressed logs (>90 days)..."
    find /var/log -name "*.gz" -mtime +90 -delete 2>/dev/null
}

phase3_user_space_cleanup() {
    log_section "PHASE 3: USER SPACE CLEANUP"
    
    if [ ! -f "${TEMP_DIR}/top_users.txt" ]; then
        log "No user analysis found, skipping user cleanup"
        return 0
    fi
    
    local users=$(cat "${TEMP_DIR}/top_users.txt")
    
    for user in $users; do
        log "Processing user: ${user}"
        local user_log="${TEMP_DIR}/user_${user}_cleanup.log"
        
        # Clean PHP sessions (-print records each removed path in the per-user log,
        # which the action count below relies on)
        log "  Cleaning PHP sessions..."
        find "/home/${user}" -type f -name "sess_*" -mtime +1 -print -delete 2>/dev/null >> "$user_log"
        
        # Clean statistics cache
        log "  Cleaning statistics cache..."
        find "/home/${user}/tmp" -path "*/analog/*" -type f -name "cache" -print -delete 2>/dev/null >> "$user_log"
        find "/home/${user}/tmp" -path "*/webalizer/*" -name "*.png" -mtime +30 -print -delete 2>/dev/null >> "$user_log"
        
        # Clean old backups (matches every archive older than 30 days under the home directory)
        log "  Cleaning old backups..."
        find "/home/${user}" -type f \( -name "*.tar.gz" -o -name "*.zip" \) -mtime +30 -print -delete 2>/dev/null >> "$user_log"
        
        # Clean error logs
        log "  Truncating large error logs..."
        find "/home/${user}/public_html" -name "error_log" -size +1M -print -exec truncate -s 0 {} \; 2>/dev/null >> "$user_log"
        
        # Report actions
        local actions=$(wc -l < "$user_log" 2>/dev/null || echo "0")
        log "  Performed ${actions} cleanup actions for ${user}"
    done
}

phase4_mysql_maintenance() {
    log_section "PHASE 4: MYSQL MAINTENANCE"
    
    log "1. Optimizing MySQL tables..."
    mysqlcheck -o --all-databases 2>&1 | tee -a "$LOG_FILE"
    
    log "2. Cleaning binary logs (if enabled)..."
    mysql -e "PURGE BINARY LOGS BEFORE DATE_SUB(NOW(), INTERVAL 7 DAY);" 2>&1 | tee -a "$LOG_FILE"
    
    log "3. Analyzing database sizes..."
    local db_analysis="${TEMP_DIR}/mysql_analysis.txt"
    mysql -e "SELECT table_schema AS 'Database', 
              ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS 'Size (MB)'
              FROM information_schema.tables 
              GROUP BY table_schema
              ORDER BY ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) DESC;" > "$db_analysis"
    
    log "Database sizes:"
    cat "$db_analysis" | tee -a "$LOG_FILE"
}

# Main execution
main() {
    log_section "iT-werX SERVER CLEANUP STARTED"
    log "Process ID: $$"
    log "Temporary directory: ${TEMP_DIR}"
    log "Log file: ${LOG_FILE}"
    
    # Execute phases
    preflight_checks
    analyze_disk_usage
    analyze_home_directories
    
    phase1_system_cleanup
    phase2_log_management
    phase3_user_space_cleanup
    phase4_mysql_maintenance
    
    # Final report
    log_section "CLEANUP COMPLETE"
    
    log "Final disk status:"
    df -h | tee -a "$LOG_FILE"
    
    # Log final usage (compare with the pre-flight figure to see what was recovered)
    local final_usage=$(df / --output=pcent | tail -1 | tr -d ' %')
    log "Final disk usage: ${final_usage}%"
    
    send_report "SUCCESS"
    cleanup_temp
    
    log "Cleanup process completed successfully"
    log_section "END OF CLEANUP"
}

# Trap signals for graceful exit
trap 'log "Interrupt received, cleaning up..."; cleanup_temp; exit 1' INT TERM

# Run main function
main "$@"

MASTER_EOF

chmod +x /usr/local/bin/server-cleanup-master.sh
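
The master script logs disk usage before and after the run but leaves the difference for the reader to work out. A minimal sketch of how the recovered space could be calculated is shown here; the helper names `used_kb` and `recovered_mb` are hypothetical and not part of the script above.

```shell
#!/bin/bash
# Hypothetical helpers for reporting how much space a cleanup run recovered.

# Used space on the filesystem containing $1, in KB
# (-P forces one output line per filesystem, -k forces 1 KB blocks)
used_kb() {
    df -Pk "$1" | awk 'NR==2 {print $3}'
}

# Difference between two KB readings, formatted in MB with one decimal place
recovered_mb() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f", (before - after) / 1024 }'
}

before=$(used_kb /)
# ... cleanup phases would run here ...
after=$(used_kb /)
echo "Recovered: $(recovered_mb "$before" "$after") MB"
```

In the master script, `before` would be captured in the pre-flight checks and `after` just before the final report, assuming both readings are taken on the same filesystem.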

 

2. Daily Maintenance Script: itwerx-daily-cleanup.sh

cat > /etc/cron.daily/itwerx-daily-cleanup.sh << 'DAILY_EOF'
#!/bin/bash
#
# iT-werX Daily Maintenance Script
# Runs safe, non-destructive cleanup daily
LOG_FILE="/var/log/server-cleanup/daily_$(date +%Y%m%d).log"
mkdir -p "$(dirname "$LOG_FILE")"  # log() cannot write if the directory is missing
EMAIL_ADMIN="admin@it-werx.ca"
log() {
   echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" >> "$LOG_FILE"
}
# Start logging
log "=== iT-werX Daily Maintenance Started ==="
# 1. Clean PHP sessions (older than 1 day)
log "Cleaning PHP sessions..."
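# Count removed files; the original stored find's exit status in session_count
session_count=$(find /home -type f -name "sess_*" -mtime +1 -print -delete 2>/dev/null | wc -l)
log "PHP session cleanup completed (${session_count} files removed)"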
# 2. Clean statistics cache
log "Cleaning statistics cache..."
find /home -path "*/tmp/analog/*" -type f -name "cache" -mtime +1 -delete 2>/dev/null
find /home -path "*/tmp/webalizer/*" -name "*.png" -mtime +30 -delete 2>/dev/null
# 3. Clean user temporary files
log "Cleaning user temp files..."
for user_dir in /home/*; do
   if [ -d "${user_dir}/tmp" ]; then
       find "${user_dir}/tmp" -type f -mtime +7 -delete 2>/dev/null
   fi
done
# 4. Clean package cache
log "Cleaning package cache..."
if command -v yum >/dev/null 2>&1; then
   yum clean all >> "$LOG_FILE" 2>&1
elif command -v apt-get >/dev/null 2>&1; then
   apt-get clean >> "$LOG_FILE" 2>&1
fi
# 5. Check disk usage
log "Checking disk usage..."
DISK_USAGE=$(df / --output=pcent | tail -1 | tr -d ' %')
log "Current disk usage: ${DISK_USAGE}%"
# 6. Send alert if above threshold
if [ "$DISK_USAGE" -gt 80 ]; then
   echo "Disk usage at ${DISK_USAGE}% on $(hostname)" | \
   mail -s "[iT-werX Alert] High Disk Usage on $(hostname)" "$EMAIL_ADMIN"
   log "High disk usage alert sent"
fi
# 7. Rotate this log file if too large
LOG_SIZE=$(stat -c%s "$LOG_FILE" 2>/dev/null || echo "0")
if [ "$LOG_SIZE" -gt 10485760 ]; then  # 10MB
   mv "$LOG_FILE" "${LOG_FILE}.old"
   log "Log file rotated"
fi
log "=== Daily Maintenance Completed ==="
# Keep only last 30 days of logs
find /var/log/server-cleanup -name "daily_*.log" -mtime +30 -delete 2>/dev/null
find /var/log/server-cleanup -name "*.old" -mtime +90 -delete 2>/dev/null
DAILY_EOF
chmod +x /etc/cron.daily/itwerx-daily-cleanup.sh
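
One caveat applies if this script is deployed on Debian or Ubuntu (the script itself handles `apt-get`): Debian's `run-parts`, which executes `/etc/cron.daily`, by default only runs files whose names consist of letters, digits, underscores, and hyphens, so a name ending in `.sh` would be silently skipped there. The RHEL-family `run-parts` typically found on cPanel servers does not, as far as we know, apply this rule. The check below is a sketch of that naming rule; `runparts_name_ok` is a hypothetical helper, not a system command.

```shell
#!/bin/bash
# Hypothetical check: would Debian's run-parts (default naming rule) accept this filename?
runparts_name_ok() {
    case $1 in
        # Reject empty names and names containing any character outside A-Z a-z 0-9 _ -
        ''|*[!A-Za-z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}

for name in itwerx-daily-cleanup.sh itwerx-daily-cleanup; do
    if runparts_name_ok "$name"; then
        echo "$name: accepted"
    else
        echo "$name: skipped by Debian run-parts"
    fi
done
```

On a Debian-based system, `run-parts --test /etc/cron.daily` lists the files that would actually be executed.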

3. Emergency Cleanup Script: emergency-cleanup.sh

cat > /usr/local/bin/emergency-cleanup.sh << 'EMERGENCY_EOF'
#!/bin/bash
#
# iT-werX Emergency Cleanup Script
# Use when disk is critically full (>90%)
LOG_FILE="/tmp/emergency_cleanup_$(date +%s).log"
EMAIL_ADMIN="admin@it-werx.ca"
echo "=== iT-werX EMERGENCY CLEANUP ===" | tee "$LOG_FILE"
echo "Started: $(date)" | tee -a "$LOG_FILE"
echo "" | tee -a "$LOG_FILE"
# Check current usage
CURRENT_USAGE=$(df / --output=pcent | tail -1 | tr -d ' %')
echo "Current disk usage: ${CURRENT_USAGE}%" | tee -a "$LOG_FILE"
if [ "$CURRENT_USAGE" -lt 90 ]; then
   echo "Disk usage below 90%, emergency cleanup not needed." | tee -a "$LOG_FILE"
   echo "Consider running server-cleanup-master.sh instead." | tee -a "$LOG_FILE"
   exit 0
fi
echo "" | tee -a "$LOG_FILE"
echo "WARNING: Performing aggressive cleanup!" | tee -a "$LOG_FILE"
echo "" | tee -a "$LOG_FILE"
# 1. Remove all package cache
echo "1. Removing ALL package cache..." | tee -a "$LOG_FILE"
rm -rf /var/cache/yum/* 2>/dev/null
rm -rf /var/cache/apt/archives/* 2>/dev/null
# 2. Truncate large log files to 1 MB (truncate keeps the first 1 MB, i.e. the oldest entries)
echo "2. Truncating large log files..." | tee -a "$LOG_FILE"
find /var/log -type f -name "*.log" -size +10M -exec truncate -s 1M {} \; 2>/dev/null
find /usr/local/apache/logs -type f -name "*.log" -size +10M -exec truncate -s 1M {} \; 2>/dev/null
# 3. Remove ALL statistics cache
echo "3. Removing ALL statistics cache..." | tee -a "$LOG_FILE"
find /home -path "*/tmp/*stats*" -type f -delete 2>/dev/null
find /home -path "*/tmp/analog/*" -type f -delete 2>/dev/null
find /home -path "*/tmp/webalizer/*" -type f -delete 2>/dev/null
# 4. Clean ALL PHP sessions
echo "4. Removing ALL PHP sessions..." | tee -a "$LOG_FILE"
find /home -type f -name "sess_*" -delete 2>/dev/null
# 5. Clean backup directories
echo "5. Cleaning backup directories..." | tee -a "$LOG_FILE"
find /backup -type f -mtime +1 -delete 2>/dev/null
find /home -name "*backup*.tar.gz" -type f -mtime +3 -delete 2>/dev/null
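# 6. Remove frozen messages from the mail queue
# (exiqgrep reads the queue itself; -z selects frozen messages, -i prints only message IDs)
echo "6. Removing frozen messages from mail queue..." | tee -a "$LOG_FILE"
/usr/sbin/exiqgrep -z -i | xargs -r /usr/sbin/exim -Mrm 2>/dev/null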
# 7. Final status
echo "" | tee -a "$LOG_FILE"
echo "Emergency cleanup completed!" | tee -a "$LOG_FILE"
echo "" | tee -a "$LOG_FILE"
echo "Final disk status:" | tee -a "$LOG_FILE"
df -h | tee -a "$LOG_FILE"
# Send report
mail -s "[iT-werX Emergency] Cleanup completed on $(hostname)" "$EMAIL_ADMIN" < "$LOG_FILE"
echo "Report sent to $EMAIL_ADMIN" | tee -a "$LOG_FILE"
echo "Log file saved to: $LOG_FILE" | tee -a "$LOG_FILE"
EMERGENCY_EOF
chmod +x /usr/local/bin/emergency-cleanup.sh
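
Note that `truncate -s 1M` keeps the first megabyte of each log, which holds the oldest entries. If the most recent entries matter more during an incident, a sketch like the following keeps the end of the file instead. The function name `keep_log_tail` is an assumption for illustration; it needs temporary space equal to the number of bytes retained.

```shell
#!/bin/bash
# Hypothetical helper: shrink a log in place, keeping only its last N bytes.
keep_log_tail() {
    local file=$1 bytes=$2 tmp
    tmp=$(mktemp) || return 1
    # Writing back with '>' truncates the existing file rather than replacing it,
    # so a daemon that has the log open keeps writing to the same file.
    if tail -c "$bytes" "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

# Demo on a throwaway file
demo_log=$(mktemp)
printf 'line1\nline2\nline3\n' > "$demo_log"
keep_log_tail "$demo_log" 6
cat "$demo_log"   # line3
```

Lines written by a daemon between the `tail` and the write-back would be lost, so this is a trade-off suited to emergencies rather than routine rotation.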

4. Analysis Tool: disk-analyzer.sh

cat > /usr/local/bin/disk-analyzer.sh << 'ANALYZER_EOF'
#!/bin/bash
#
# iT-werX Disk Space Analyzer
# Provides detailed analysis without making changes
REPORT_FILE="/tmp/disk_analysis_$(date +%Y%m%d_%H%M%S).txt"
EMAIL_ADMIN="admin@it-werx.ca"
echo "iT-werX Disk Space Analysis Report" > "$REPORT_FILE"
echo "==================================" >> "$REPORT_FILE"
echo "Server: $(hostname)" >> "$REPORT_FILE"
echo "Date: $(date)" >> "$REPORT_FILE"
echo "" >> "$REPORT_FILE"
# Overall disk status
echo "=== OVERALL DISK STATUS ===" >> "$REPORT_FILE"
df -h >> "$REPORT_FILE"
echo "" >> "$REPORT_FILE"
# Top-level directory analysis
echo "=== TOP-LEVEL DIRECTORY ANALYSIS ===" >> "$REPORT_FILE"
du -xh / --max-depth=1 2>/dev/null | sort -rh | head -20 >> "$REPORT_FILE"
echo "" >> "$REPORT_FILE"
# Large files (>100MB)
echo "=== LARGE FILES (>100MB) ===" >> "$REPORT_FILE"
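# -prune skips pseudo-filesystems; -exec ... + avoids xargs running a bare 'ls' on empty input
find / \( -path /proc -o -path /sys -o -path /run \) -prune -o -type f -size +100M -exec ls -lh {} + 2>/dev/null | head -20 >> "$REPORT_FILE"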
echo "" >> "$REPORT_FILE"
# Home directory analysis
echo "=== HOME DIRECTORY ANALYSIS ===" >> "$REPORT_FILE"
echo "Total home usage: $(du -sh /home 2>/dev/null | cut -f1)" >> "$REPORT_FILE"
echo "" >> "$REPORT_FILE"
echo "Per-user breakdown:" >> "$REPORT_FILE"
du -sh /home/* 2>/dev/null | sort -hr >> "$REPORT_FILE"
echo "" >> "$REPORT_FILE"
# Log directory analysis
echo "=== LOG DIRECTORY ANALYSIS ===" >> "$REPORT_FILE"
echo "/var/log size: $(du -sh /var/log 2>/dev/null | cut -f1)" >> "$REPORT_FILE"
echo "Large log files:" >> "$REPORT_FILE"
find /var/log -type f -name "*.log" -size +50M -exec ls -lh {} + 2>/dev/null >> "$REPORT_FILE"
echo "" >> "$REPORT_FILE"
# MySQL database sizes
echo "=== MYSQL DATABASE SIZES ===" >> "$REPORT_FILE"
mysql -e "SELECT table_schema AS 'Database', 
         ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS 'Size (MB)'
         FROM information_schema.tables 
         GROUP BY table_schema
         ORDER BY ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) DESC;" 2>/dev/null >> "$REPORT_FILE"
echo "" >> "$REPORT_FILE"
# Recommendations
echo "=== RECOMMENDATIONS ===" >> "$REPORT_FILE"
USAGE=$(df / --output=pcent | tail -1 | tr -d ' %')
if [ "$USAGE" -gt 90 ]; then
   echo "CRITICAL: Run emergency-cleanup.sh immediately!" >> "$REPORT_FILE"
elif [ "$USAGE" -gt 80 ]; then
   echo "HIGH: Run server-cleanup-master.sh soon" >> "$REPORT_FILE"
elif [ "$USAGE" -gt 70 ]; then
   echo "MODERATE: Consider cleanup in next maintenance window" >> "$REPORT_FILE"
else
   echo "OK: Disk usage is at healthy level" >> "$REPORT_FILE"
fi
echo "" >> "$REPORT_FILE"
echo "=== END OF REPORT ===" >> "$REPORT_FILE"
# Display and optionally email (prompt only when run from a terminal, not from cron)
cat "$REPORT_FILE"
if [ -t 0 ]; then
   read -p "Send this report to $EMAIL_ADMIN? (y/n): " -n 1 -r
   echo
   if [[ $REPLY =~ ^[Yy]$ ]]; then
      mail -s "[iT-werX Analysis] Disk Report for $(hostname)" "$EMAIL_ADMIN" < "$REPORT_FILE"
      echo "Report sent to $EMAIL_ADMIN"
   fi
fi
echo "Report saved to: $REPORT_FILE"
ANALYZER_EOF
chmod +x /usr/local/bin/disk-analyzer.sh
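
The recommendation thresholds at the end of the analyzer (above 90%, above 80%, above 70%) also appear in hard-coded form in the daily and emergency scripts. As a sketch, they could be factored into one function that is easy to test in isolation; `usage_level` is a hypothetical name, and the `df -P` pipeline is a portable stand-in for the `--output=pcent` form used above.

```shell
#!/bin/bash
# Hypothetical refactor of the analyzer's recommendation thresholds.
usage_level() {
    local pct=$1
    if [ "$pct" -gt 90 ]; then
        echo "CRITICAL"
    elif [ "$pct" -gt 80 ]; then
        echo "HIGH"
    elif [ "$pct" -gt 70 ]; then
        echo "MODERATE"
    else
        echo "OK"
    fi
}

# Usage percentage of the root filesystem (column 5 of POSIX df output, '%' stripped)
current=$(df -P / | awk 'NR==2 {gsub("%", "", $5); print $5}')
echo "Root filesystem at ${current}%: $(usage_level "$current")"
```

Because the comparisons are strict, a reading of exactly 90% is classed as HIGH rather than CRITICAL, matching the analyzer's behaviour.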

Implementation and Results 

Setting Up the System

# Create log directory
mkdir -p /var/log/server-cleanup
# Make all scripts executable
chmod +x /usr/local/bin/server-cleanup-master.sh
chmod +x /usr/local/bin/emergency-cleanup.sh
chmod +x /usr/local/bin/disk-analyzer.sh
chmod +x /etc/cron.daily/itwerx-daily-cleanup.sh
# Test the analyzer first
echo "Testing disk analyzer..."
/usr/local/bin/disk-analyzer.sh
# Set up weekly full cleanup (Sundays at 2 AM)
cat > /etc/cron.d/itwerx-cleanup << 'CRON_EOF'
# iT-werX Weekly Server Cleanup
0 2 * * 0 root /usr/local/bin/server-cleanup-master.sh >/dev/null 2>&1
# Daily disk check
0 8 * * * root /usr/local/bin/disk-analyzer.sh | tail -20 >/tmp/daily_disk_check.txt
# Monthly deep analysis
0 3 1 * * root /usr/local/bin/disk-analyzer.sh | mail -s "Monthly Disk Analysis" admin@it-werx.ca
CRON_EOF
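
With several cron entries touching the same files, it can help to make sure two cleanup runs never overlap (for example, a manual run started while the weekly job is still working). The sketch below uses `flock` from util-linux for this; the wrapper name `run_locked` and exit code 75 are assumptions chosen for the example.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command only if no other instance holds the lock file.
run_locked() {
    local lock=$1
    shift
    (
        # -n: fail immediately instead of waiting if the lock is already held
        flock -n 9 || { echo "another cleanup is running, skipping" >&2; exit 75; }
        "$@"
    ) 9>"$lock"
}

# Demo with a throwaway lock file
lock_file=$(mktemp)
run_locked "$lock_file" echo "cleanup would run here"
```

In the cron file, the same effect could be had by prefixing the command with `flock -n` and a lock file path of your choosing.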

The Results: A Self-Healing Server

After implementing this system, we achieved:

  • Automatic Space Management: Daily cleanup keeps temporary files in check
  • Proactive Monitoring: Alerts trigger before problems become critical
  • Documented Processes: Every action is logged and reportable
  • Scalable Solution: Works for servers of any size

Key Metrics Achieved:

  • Regular cleanup: 1-2GB recovered weekly
  • Emergency readiness: Can free 3-5GB in minutes if needed
  • Zero downtime: All cleanup happens without service interruption
  • Full transparency: Every action logged and reported

Lessons Learned

  • Prevention is cheaper than cure: Regular maintenance prevents emergency situations
  • Know your data: Analysis before deletion prevents mistakes
  • Automate responsibly: Scripts must include safety checks and logging
  • Communicate clearly: Email reports keep everyone informed

Get the Code

All scripts are available in our GitHub repository. They're free to use under the GNU General Public License, with proper attribution.

Need help with your server? Contact iT-werX for professional system administration services that keep your infrastructure running smoothly.