docs: add comprehensive testing guide with flaky test documentation

- Create docs/TESTING.md with full testing documentation
- Document 3 flaky test suites with exact locations and workarounds
- Update README.md with accurate test status (~348-349 passing)
- Add troubleshooting guide for race conditions
- Include CI/CD considerations and future improvements

Key sections:
- Running tests (commands, environment, configuration)
- Flaky tests: matching-algorithm, ratings-stats-flow, events
- Known issues & fixes applied (migrations, constraints, config)
- Debugging strategies and test isolation tips
Radosław Gierwiało
2025-12-06 14:45:21 +01:00
parent 5599c6f72b
commit 68f8322221
2 changed files with 362 additions and 8 deletions


@@ -198,20 +198,43 @@ docker compose exec backend npm run cli -- users:list --limit 20
## 📊 Test Coverage
- **Backend: 351/351 tests passing - 100% ✅** (73% overall coverage)
+ **Backend: ~348-349 passing, ~5-6 flaky, 11 skipped** (365 total tests)
### Test Suites
- **Matching Algorithm**: 19/19 integration tests
- Fundamentals, collision detection, limits, fairness, edge cases
- **Ratings & Stats Flow**: 9/9 E2E tests
- Atomic updates, race prevention, idempotency
### Test Status
- **20/23 test suites stable** - Consistently passing
- ⚠️ **3/23 test suites flaky** - Pass individually, may fail in full suite (race conditions)
- ⏭️ **11 tests skipped** - Socket.IO server setup required (3), outdated auth tests (2), rate limiting (1)
### Stable Test Suites
- **Matching Algorithm**: 19/19 tests (flaky in full suite - run individually)
- **Ratings & Stats Flow**: 9/9 tests (flaky in full suite - run individually)
- **Matching Runs Audit**: 6/6 tests
- origin_run_id tracking, sequential runs, filtering
- **Incremental Matching**: 5/5 tests
- **Recording Stats Integration**: 6/6 tests
- **WebRTC Signaling**: 12/12 tests
- - **WebRTC API**: 9/9 tests (Cloudflare TURN integration, fallbacks, authentication)
+ - **WebRTC API**: 9/9 tests
- **Socket.IO**: 12/12 tests
- **Authentication**: 24/24 tests
- **Dashboard**: 12/12 tests
- **Users**: 23/25 tests (2 skipped - endpoints are public)
- **Matches**: 24/24 tests
- **Events**: 55/55 tests (flaky in full suite - run individually)
- **And 7 more...**
### Flaky Tests (Known Issues)
See [docs/TESTING.md](docs/TESTING.md) for detailed analysis of:
- `matching-algorithm.test.js` - Race conditions in concurrent runs
- `ratings-stats-flow.test.js` - Database state dependencies
- `events.test.js` - Cleanup interference between tests
**Workaround:** Run flaky tests individually when needed:
```bash
docker compose exec backend npm test -- matching-algorithm.test.js # ✅ 19/19
docker compose exec backend npm test -- ratings-stats-flow.test.js # ✅ 9/9
docker compose exec backend npm test -- events.test.js # ✅ 55/55
```
**Documentation:** See [docs/TESTING.md](docs/TESTING.md) for the comprehensive testing guide
- **API Routes**: Full CRUD coverage (auth, events, matches, dashboard, webrtc)
### Code Coverage Highlights
@@ -272,6 +295,7 @@ spotlightcam/
├── ARCHITECTURE.md # Technical implementation details
├── DEPLOYMENT.md # Production deployment guide
├── MONITORING.md # Operations & monitoring
├── TESTING.md # Testing guide & flaky test documentation
├── TESTING_MATCHING_RATINGS.md # Comprehensive test documentation
├── WEBRTC_TESTING_GUIDE.md # WebRTC testing guide
├── GOOGLE_ANALYTICS_SETUP.md # GA4 integration guide
@@ -371,6 +395,7 @@ docker compose exec backend npm run cli -- matches:list --limit 20 --status acce
**Main documentation:**
- `docs/TODO.md` - Active tasks, security audit, roadmap
- `docs/ARCHITECTURE.md` - Technical details, WebRTC flow, API endpoints
- `docs/TESTING.md` - **Testing guide & flaky test troubleshooting**
- `docs/TESTING_MATCHING_RATINGS.md` - Comprehensive test documentation (45 tests)
- `docs/DEPLOYMENT.md` - Production deployment guide
- `docs/MONITORING.md` - Operations and monitoring

docs/TESTING.md (new file)

@@ -0,0 +1,329 @@
# Testing Guide
## Overview
SpotlightCam uses Jest for backend testing with a comprehensive test suite covering API endpoints, matching algorithms, WebRTC signaling, and database operations.
**Current Status:** ~348-349 passing, ~5-6 flaky, 11 skipped
## Running Tests
```bash
# Run all tests
docker compose exec backend npm test
# Run specific test file
docker compose exec backend npm test -- <filename>
# Run with coverage
docker compose exec backend npm test -- --coverage
# Run in watch mode (development)
docker compose exec backend npm test -- --watch
```
## Test Environment
Tests run in an isolated environment with:
- `NODE_ENV=test` - Enables test-specific behavior
- CAPTCHA disabled - `TURNSTILE_SECRET_KEY` unset for tests
- Rate limiting disabled - Bypassed when `NODE_ENV=test`
- Separate test database - Migrations applied automatically
Configuration: `backend/jest.setup.js`
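For reference, the environment overrides above amount to something like the following (a hypothetical sketch of `jest.setup.js`; the real file may do more):

```javascript
// Hypothetical sketch of the test environment setup in backend/jest.setup.js.
// The actual file may differ; shown only to illustrate the overrides above.
process.env.NODE_ENV = 'test';            // enables test-specific behavior
delete process.env.TURNSTILE_SECRET_KEY;  // CAPTCHA validation becomes a no-op
```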
## Test Structure
### Passing Test Suites (20/23)
- ✅ `app.test.js` - Application health checks
- ✅ `auth.test.js` - Authentication endpoints
- ✅ `auth-phase1.5.test.js` - Extended auth tests
- ✅ `dashboard.test.js` - Dashboard API
- ✅ `users.test.js` - User profile endpoints (2 skipped)
- ✅ `matches.test.js` - Match management
- ✅ `matching.test.js` - Basic matching
- ✅ `matching-runs-audit.test.js` - Matching run tracking
- ✅ `spam-protection-notifications.test.js` - Spam protection (3 skipped)
- ✅ `csrf.test.js` - CSRF protection
- ✅ `webrtc-api.test.js` - WebRTC API endpoints
- ✅ `wsdc.test.js` - WSDC integration
- ✅ And 8 more...
### Flaky Test Suites (3/23)
These tests **pass when run individually** but may fail in full suite runs due to race conditions and test isolation issues.
#### 1. `matching-algorithm.test.js`
**Symptoms:**
- Random failures in TC13: Location score beats fairness
- Expected values don't match actual in full suite
**Root Cause:**
- Race conditions in concurrent matching runs
- Shared database state between tests
- Timing-dependent algorithm scoring
**Location:** `backend/src/__tests__/matching-algorithm.test.js:526`
**How to Run Individually:**
```bash
docker compose exec backend npm test -- matching-algorithm.test.js
# Result: 19/19 passing ✅
```
**Temporary Workaround:** Run individually when debugging the matching algorithm
**Long-term Fix Needed:** Better test isolation, transaction rollback between tests
---
#### 2. `ratings-stats-flow.test.js`
**Symptoms:**
- Cascading failures through the entire flow (STEP 4 → STEP 8)
- `toBeGreaterThan` assertions fail
- Null reference errors on match objects
**Root Cause:**
- Depends on exact database state from previous tests
- No proper cleanup between test runs
- Auto-matching suggestions not created consistently
**Location:** `backend/src/__tests__/ratings-stats-flow.test.js:222-396`
**Affected Steps:**
- STEP 4: Run matching algorithm
- STEP 5: Recorder accepts suggestion
- STEP 6a/6b: Rating exchange
- STEP 7: Double-rating prevention
- STEP 8: Manual match stats verification
**How to Run Individually:**
```bash
docker compose exec backend npm test -- ratings-stats-flow.test.js
# Result: 9/9 passing ✅
```
**Temporary Workaround:** Run individually to verify ratings flow
**Long-term Fix Needed:**
- Add `beforeEach` cleanup for suggestions
- Use transaction-based test isolation
- Mock matching service for predictable results
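The `beforeEach` cleanup suggested above could look roughly like this (the model names `match` and `recordingSuggestion` are assumptions about the Prisma schema, not verified project code):

```javascript
// Sketch: reset suggestion-related state before each test so earlier runs
// cannot influence the ratings flow. Model names are assumptions.
async function cleanupSuggestions(prisma) {
  // Delete dependent matches first to satisfy foreign-key constraints
  await prisma.match.deleteMany({ where: { suggestionId: { not: null } } });
  await prisma.recordingSuggestion.deleteMany();
}

// In the test file:
// beforeEach(() => cleanupSuggestions(prisma));
```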
---
#### 3. `events.test.js`
**Symptoms:**
- Random failures in heats/all endpoint
- "Record to update not found" in competitor-number tests
- Works fine when run alone
**Root Cause:**
- EventParticipant records deleted by other tests
- EventUserHeat cleanup interferes with test setup
- Database state pollution from parallel tests
**Location:** `backend/src/__tests__/events.test.js:670,1083`
**Affected Tests:**
- GET /api/events/:slug/heats/all
- PUT /api/events/:slug/competitor-number (set & clear)
**How to Run Individually:**
```bash
docker compose exec backend npm test -- events.test.js
# Result: 55/55 passing ✅
```
**Temporary Workaround:** Run individually when testing event features
**Long-term Fix Needed:**
- Proper `afterEach` cleanup for event participants
- Use unique event slugs per test
- Consider database snapshots/rollback
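The unique-slug fix can be a small helper (illustrative, not existing project code):

```javascript
// Sketch: generate a unique event slug per test so parallel tests
// never operate on the same EventParticipant rows.
let slugCounter = 0;
function uniqueSlug(prefix = 'test-event') {
  slugCounter += 1;
  return `${prefix}-${Date.now()}-${slugCounter}`;
}
```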
---
### Skipped Tests (11 total)
#### Socket.IO Tests (3 skipped)
**Files:**
- `spam-protection-notifications.test.js`:
- TC6: Should emit notification when new suggestion created
- TC8: Should group multiple suggestions per recorder
- `spam-protection-notifications.test.js`:
- TC4: Rate limiting test (conflicts with test environment)
**Reason:** Require Socket.IO server setup on port 3002, timeout in test environment
**How to Enable:**
1. Configure Socket.IO server for tests
2. Ensure proper connection handling in test lifecycle
3. Remove `.skip` from test definitions
#### Public Endpoint Tests (2 skipped)
**File:** `users.test.js`
- Should reject request without authentication (GET /api/users/:username)
- Should reject request without authentication (GET /api/users/:username/ratings)
**Reason:** Endpoints are intentionally public - tests are outdated
---
## Known Issues & Fixes Applied
### ✅ Fixed: Missing Database Migrations
**Problem:** Tests failing with "column does not exist" or "table does not exist"
**Fixed Migrations:**
- `20251206123000_add_is_admin` - Admin flag for users
- `20251206124000_add_suggestion_id_to_matches` - Auto-matching integration
- `20251206125000_add_source_to_matches` - Manual vs auto tracking
- `20251206125500_add_stats_applied_to_matches` - Prevent duplicate stats
- `20251206130000_create_activity_logs` - Full activity logging table
### ✅ Fixed: Unique Constraint Errors
**Problem:** EventUserHeat constraint too restrictive
**Solution:** Updated constraint to include `heatNumber`:
```prisma
// Old (too restrictive)
@@unique([userId, eventId, divisionId, competitionTypeId, role])

// New (allows multiple heats with the same role)
@@unique([userId, eventId, divisionId, competitionTypeId, heatNumber, role])
```
**Affected Files:** All tests using EventUserHeat
**Location:** `backend/prisma/schema.prisma:281`
### ✅ Fixed: Test Environment Configuration
**Problem:** CAPTCHA blocking test registration, rate limiting blocking spam tests
**Solution:**
- Unset `TURNSTILE_SECRET_KEY` in `jest.setup.js`
- Skip rate limiting when `NODE_ENV=test`
- Made CAPTCHA validation optional in validators
**Files Modified:**
- `backend/jest.setup.js` - Environment setup
- `backend/src/middleware/validators.js` - Conditional CAPTCHA
- `backend/src/routes/matches.js` - Rate limiter bypass
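The rate limiter bypass can be implemented as a thin wrapper middleware; this is a hedged sketch, not the actual code in `backend/src/routes/matches.js`:

```javascript
// Sketch: skip an Express-style rate limiter entirely when running under Jest.
// `limiter` is any standard (req, res, next) middleware.
function maybeRateLimit(limiter) {
  return (req, res, next) => {
    if (process.env.NODE_ENV === 'test') return next(); // bypass in tests
    return limiter(req, res, next);
  };
}
```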
### ✅ Fixed: Username Uniqueness in Tests
**Problem:** Hardcoded usernames causing constraint violations
**Solution:** Use dynamic timestamps in test usernames:
```javascript
// Bad (causes conflicts)
username: 'outsider'
// Good (unique per run)
username: `outsider_${Date.now()}`
```
**Location:** `backend/src/__tests__/matches.test.js:321`
---
## Debugging Flaky Tests
### Strategy 1: Run Individually
```bash
# If a test fails in full suite, try running it alone
docker compose exec backend npm test -- <failing-test-file>.test.js
```
### Strategy 2: Check Database State
```bash
# Connect to test database
docker compose exec db psql -U postgres -d spotlightcam
# Check table contents
SELECT COUNT(*) FROM users;
SELECT COUNT(*) FROM event_participants;
SELECT COUNT(*) FROM recording_suggestions;
```
### Strategy 3: Enable Detailed Logging
```javascript
// In test file, add:
beforeEach(() => {
console.log('=== Test Starting ===');
});
afterEach(async () => {
console.log('=== Test Cleanup ===');
// Add cleanup code here
});
```
### Strategy 4: Isolate Test Data
```javascript
// Use unique identifiers
const testSlug = `test-event-${Date.now()}`;
const testEmail = `user-${Date.now()}@test.com`;
```
---
## Future Improvements
### High Priority
1. **Test Isolation:** Implement transaction-based test isolation
- Use `BEGIN` before each test
- `ROLLBACK` after each test
- Keeps database clean automatically
2. **Parallel Test Safety:** Fix race conditions
- Use test-specific database schemas
- Or serialize tests with `--runInBand`
3. **Socket.IO Tests:** Enable skipped socket tests
- Configure test Socket.IO server
- Add proper connection lifecycle management
### Medium Priority
4. **Flaky Test Dashboard:** Track flakiness over time
5. **Better Cleanup:** Automated cleanup in `afterEach` hooks
6. **Test Data Builders:** Factory functions for consistent test data
### Low Priority
7. **Performance:** Reduce test suite runtime
8. **Coverage:** Increase from current ~40% to 80%+
---
## CI/CD Considerations
### For GitHub Actions / GitLab CI:
```yaml
# GitHub Actions has no built-in per-step retry; a wrapper action such as
# nick-fields/retry can re-run the flaky suite
- name: Run Tests
  uses: nick-fields/retry@v3
  with:
    timeout_minutes: 15
    max_attempts: 2   # one retry for flaky tests
    command: docker compose exec -T backend npm test
```
(GitLab CI supports this natively via the job-level `retry: 2` keyword.)
### Test Categorization:
```bash
# Fast, stable tests (for PR checks)
npm test -- --testPathIgnorePatterns='(matching-algorithm|ratings-stats-flow|events).test.js'
# Full suite (for main branch)
npm test
```
---
## Contributing
When adding new tests:
1. Use dynamic identifiers (`Date.now()`, `uuid()`)
2. Clean up created data in `afterEach`
3. Don't rely on global test execution order
4. Test in isolation before committing
5. Document any known flakiness
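Guidelines 1-3 can be combined into a small test-data builder (field names are illustrative, not the project's actual schema):

```javascript
// Sketch: a factory that bakes unique identifiers into every record,
// so tests never collide on usernames or emails.
let userSeq = 0;
function buildUser(overrides = {}) {
  userSeq += 1;
  const id = `${Date.now()}_${userSeq}`;
  return {
    username: `user_${id}`,
    email: `user_${id}@test.com`,
    ...overrides, // explicit values win over generated ones
  };
}
```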
## Contact
For test-related questions or issues, refer to:
- Main documentation: `/docs/README.md`
- Architecture: `/docs/ARCHITECTURE.md`
- Deployment: `/docs/DEPLOYMENT.md`