Add comprehensive tooling for production deployment: Scripts (scripts/): - backup-db.sh: Automated database backups with 7-day retention - restore-db.sh: Safe database restore with confirmation prompts - health-check.sh: Complete service health monitoring - README.md: Operational scripts documentation Monitoring (docs/MONITORING.md): - Application health monitoring - Docker container monitoring - External monitoring setup (UptimeRobot, Pingdom) - Log monitoring and rotation - Alerting configuration - Incident response procedures - SLA targets and metrics All scripts include: - Environment support (dev/prod) - Error handling and validation - Detailed status reporting - Safety confirmations where needed
187 lines
3.8 KiB
Markdown
187 lines
3.8 KiB
Markdown
# spotlight.cam - Operations Scripts
|
|
|
|
Utility scripts for managing spotlight.cam deployment, backups, and monitoring.
|
|
|
|
## 📋 Available Scripts
|
|
|
|
### 1. Database Backup (`backup-db.sh`)
|
|
|
|
Creates a timestamped backup of the PostgreSQL database.
|
|
|
|
```bash
|
|
# Backup development database
|
|
./scripts/backup-db.sh dev
|
|
|
|
# Backup production database
|
|
./scripts/backup-db.sh prod
|
|
```
|
|
|
|
**Features:**
|
|
- Automatic timestamping
|
|
- Automatic cleanup (keeps last 7 days)
|
|
- Backup size reporting
|
|
- Error handling
|
|
|
|
**Backups location:** `./backups/`
|
|
|
|
**Setup cron job for automatic backups:**
|
|
```bash
|
|
# Edit crontab
|
|
crontab -e
|
|
|
|
# Add daily backup at 2 AM (production)
|
|
0 2 * * * cd /path/to/spotlightcam && ./scripts/backup-db.sh prod >> logs/backup.log 2>&1
|
|
```
|
|
|
|
---
|
|
|
|
### 2. Database Restore (`restore-db.sh`)
|
|
|
|
Restores database from a backup file.
|
|
|
|
```bash
|
|
# Restore development database
|
|
./scripts/restore-db.sh ./backups/backup_dev_20251120_120000.sql dev
|
|
|
|
# Restore production database
|
|
./scripts/restore-db.sh ./backups/backup_prod_20251120_120000.sql prod
|
|
```
|
|
|
|
**Safety features:**
|
|
- Confirmation prompt
|
|
- File existence check
|
|
- Container status validation
|
|
|
|
---
|
|
|
|
### 3. Health Check (`health-check.sh`)
|
|
|
|
Monitors all services and checks system health.
|
|
|
|
```bash
|
|
# Check development environment
|
|
./scripts/health-check.sh dev
|
|
|
|
# Check production environment
|
|
./scripts/health-check.sh prod
|
|
```
|
|
|
|
**Checks:**
|
|
- ✅ nginx container status
|
|
- ✅ Frontend container status
|
|
- ✅ Backend container status
|
|
- ✅ Database container status
|
|
- ✅ API health endpoint
|
|
- ✅ Database connection
|
|
|
|
**Exit codes:**
|
|
- `0` - All systems operational
|
|
- `1` - One or more services unhealthy
|
|
|
|
**Setup monitoring:**
|
|
```bash
|
|
# Check every 5 minutes and send alerts on failure
|
|
*/5 * * * * /path/to/spotlightcam/scripts/health-check.sh prod || /path/to/alert-script.sh
|
|
```
|
|
|
|
---
|
|
|
|
## 🔧 Setup Instructions
|
|
|
|
### Make scripts executable
|
|
```bash
|
|
chmod +x scripts/*.sh
|
|
```
|
|
|
|
### Create backups directory
|
|
```bash
|
|
mkdir -p backups
|
|
```
|
|
|
|
### Test scripts
|
|
```bash
|
|
# Test backup
|
|
./scripts/backup-db.sh dev
|
|
|
|
# Test health check
|
|
./scripts/health-check.sh dev
|
|
```
|
|
|
|
---
|
|
|
|
## 📊 Monitoring Setup
|
|
|
|
### Option 1: Cron-based monitoring
|
|
```bash
|
|
# Add to crontab
|
|
crontab -e
|
|
|
|
# Health check every 5 minutes
|
|
*/5 * * * * /path/to/spotlightcam/scripts/health-check.sh prod || echo "Health check failed" | mail -s "spotlight.cam Alert" admin@example.com
|
|
|
|
# Daily backup at 2 AM
|
|
0 2 * * * /path/to/spotlightcam/scripts/backup-db.sh prod
|
|
```
|
|
|
|
### Option 2: External monitoring
|
|
Recommended external services:
|
|
- **UptimeRobot** - Free, checks every 5 min
|
|
- **Pingdom** - Advanced monitoring
|
|
- **StatusCake** - Free tier available
|
|
|
|
Monitor these endpoints:
|
|
- `https://spotlight.cam` - Frontend
|
|
- `https://spotlight.cam/api/health` - Backend API
|
|
|
|
---
|
|
|
|
## 🚨 Troubleshooting
|
|
|
|
### Backup fails
|
|
```bash
|
|
# Check if database container is running
|
|
docker ps | grep slc-db
|
|
|
|
# Check database logs
|
|
docker logs slc-db-prod
|
|
|
|
# Manually test connection
|
|
docker exec slc-db-prod pg_isready -U spotlightcam
|
|
```
|
|
|
|
### Health check fails
|
|
```bash
|
|
# View detailed container status
|
|
docker ps -a
|
|
|
|
# Check specific service logs
|
|
docker logs slc-backend-prod
|
|
docker logs slc-db-prod
|
|
|
|
# Restart services
|
|
docker compose --profile prod restart
|
|
```
|
|
|
|
### Restore fails
|
|
```bash
|
|
# Check backup file integrity
|
|
head -n 10 ./backups/backup_prod_20251120_120000.sql
|
|
|
|
# Verify database is accepting connections
|
|
docker exec slc-db-prod psql -U spotlightcam -c "SELECT version();"
|
|
```
|
|
|
|
---
|
|
|
|
## 📝 Best Practices
|
|
|
|
1. **Always test backups** - Regularly test restore process in dev environment
|
|
2. **Monitor disk space** - Ensure backups directory has enough space
|
|
3. **Secure backups** - Store backups off-server (AWS S3, Backblaze B2)
|
|
4. **Regular testing** - Test health checks and disaster recovery procedures
|
|
5. **Log rotation** - Use logrotate for script logs
|
|
|
|
---
|
|
|
|
**Last Updated:** 2025-11-20
|