feat: Add litestream-test harness for comprehensive database testing #748

corylanou · 2025-09-15T22:44:01Z

Summary

This PR introduces a comprehensive testing harness (litestream-test) for validating Litestream's behavior with large databases, various write patterns, and edge cases. This is the first implementation based on discussions with @benbjohnson about the need for thorough testing, particularly for databases larger than 1GB and various operational scenarios.

Background

Based on requirements outlined in [internal planning documents], we need comprehensive testing for:

Databases larger than 1GB (critical SQLite lock page edge case at 0x40000000)
Various write patterns and frequencies
Database shrinking scenarios (grow then delete data)
Interruption and recovery scenarios (checkpoint during Litestream downtime)
Multi-level compaction validation
LTX file continuity checking
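
The continuity requirement in the last item can be sketched in Go. This is a minimal illustration, not the harness's code: it assumes each LTX file covers an inclusive transaction ID range and that files within a single compaction level are continuous when each range starts exactly one past the end of the previous one. The `txRange` type and `findGap` function are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// txRange is a hypothetical stand-in for the transaction ID range
// covered by one LTX file (both ends inclusive).
type txRange struct {
	Min, Max uint64
}

// findGap sorts a copy of the ranges by Min and returns the index (in
// sorted order) of the first range that does not begin at the previous
// range's Max+1. It returns -1 when the sequence is continuous.
func findGap(ranges []txRange) int {
	sorted := append([]txRange(nil), ranges...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Min < sorted[j].Min })
	for i := 1; i < len(sorted); i++ {
		if sorted[i].Min != sorted[i-1].Max+1 {
			return i
		}
	}
	return -1
}

func main() {
	continuous := []txRange{{1, 4}, {5, 5}, {6, 9}}
	broken := []txRange{{1, 4}, {6, 9}} // transaction 5 is missing
	fmt.Println(findGap(continuous), findGap(broken))
}
```

A check like this only makes sense per compaction level, since higher levels intentionally cover the same transactions as the lower-level files they were built from.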

Implementation

New Binary: `/cmd/litestream-test/`

Four primary commands implemented:

populate - Quickly create test databases to target sizes
- Configurable page sizes (critical for 1GB lock page testing)
- Multiple tables with indexes
- Batch inserts for efficiency
- Support for databases from MB to GB scale
load - Generate continuous workload on databases
- Write patterns: constant, burst, random, wave
- Configurable read/write ratios
- Multiple concurrent workers
- Real-time statistics reporting
validate - Verify replication integrity
- Quick check, integrity check, checksum comparison
- LTX file continuity checking
- Full data validation between source and restored DBs
- Integration with existing Litestream restore
shrink - Test database shrinking scenarios
- Configurable delete percentage
- Optional VACUUM and checkpoint operations
- All SQLite checkpoint modes supported (PASSIVE, FULL, RESTART, TRUNCATE)
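
The four checkpoint modes map directly onto SQLite's `PRAGMA wal_checkpoint(MODE)` statement. The Go sketch below shows one way a mode string could be validated and turned into that statement; `checkpointSQL` is a hypothetical helper, and how the shrink command actually issues checkpoints is not shown in this PR description.

```go
package main

import (
	"fmt"
	"strings"
)

// checkpointModes lists the modes accepted by PRAGMA wal_checkpoint.
var checkpointModes = map[string]bool{
	"PASSIVE":  true,
	"FULL":     true,
	"RESTART":  true,
	"TRUNCATE": true,
}

// checkpointSQL (hypothetical helper) normalizes the mode name and
// returns the PRAGMA statement, or an error for an unknown mode.
func checkpointSQL(mode string) (string, error) {
	m := strings.ToUpper(strings.TrimSpace(mode))
	if !checkpointModes[m] {
		return "", fmt.Errorf("unknown checkpoint mode %q", mode)
	}
	return fmt.Sprintf("PRAGMA wal_checkpoint(%s);", m), nil
}

func main() {
	for _, mode := range []string{"passive", "FULL", "bogus"} {
		stmt, err := checkpointSQL(mode)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(stmt)
	}
}
```

Validating against a fixed set before formatting keeps arbitrary input out of the SQL text, which matters because PRAGMA arguments cannot be bound as query parameters.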

Testing

Comprehensive testing was performed using the demo script available here:
Test Harness Demo Script (Gist)

Test Results

✅ Database Creation

Created databases from 5MB to 50MB successfully
Verified correct page size configuration (4KB, 8KB tested)
Tables with proper indexes created

✅ Write Patterns

Burst pattern: Correctly generated bursts (205 writes in 5s, then 0, then 204)
Random pattern: Generated variable rates as expected
Wave pattern: Smooth oscillating pattern confirmed
Load generation successfully added 600+ rows during tests
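
To make the pattern shapes concrete, here is a small Go sketch of a target write rate as a function of elapsed time. It is an illustration under stated assumptions, not the load command's implementation: the `targetRate` function, the cycle length, the half-on/half-off burst shape, and the sine-based wave are all hypothetical. The random pattern is left out because its output is not deterministic.

```go
package main

import (
	"fmt"
	"math"
)

// targetRate (hypothetical) returns the desired writes per second for a
// pattern at elapsedSec, where periodSec is the assumed cycle length.
func targetRate(pattern string, base, elapsedSec, periodSec float64) float64 {
	if periodSec <= 0 {
		return base
	}
	// phase is the position within the current cycle, in [0, 1).
	phase := math.Mod(elapsedSec, periodSec) / periodSec
	switch pattern {
	case "burst":
		// Assumed shape: write at twice the base rate for the first
		// half of each cycle, then stay idle for the second half.
		if phase < 0.5 {
			return 2 * base
		}
		return 0
	case "wave":
		// Assumed shape: a sine wave oscillating between 0 and 2*base.
		return base * (1 + math.Sin(2*math.Pi*phase))
	default:
		// "constant" and any unrecognized pattern use the base rate.
		return base
	}
}

func main() {
	for _, p := range []string{"constant", "burst", "wave"} {
		fmt.Printf("%-8s", p)
		for sec := 0.0; sec < 10; sec += 2.5 {
			fmt.Printf(" %6.1f", targetRate(p, 100, sec, 10))
		}
		fmt.Println()
	}
}
```

Both non-constant shapes average out to the base rate over a full cycle, so total write volume stays comparable across patterns while the timing differs.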

✅ Page Size Configuration

Successfully created database with 8KB pages (verified: 8192 bytes)
Ready for 1GB lock page boundary testing
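
Page size matters here because it determines which page number holds the lock page. SQLite reserves the page containing byte offset 0x40000000, so the page number is that offset divided by the page size, plus one (pages are 1-based). A minimal Go sketch of the calculation follows; the `lockPgno` name is illustrative and is not taken from the harness.

```go
package main

import "fmt"

// pendingByte is the file offset of SQLite's locking region (1 GiB).
const pendingByte = 0x40000000

// lockPgno returns the 1-based number of the page that contains the
// pending byte for the given page size.
func lockPgno(pageSize uint32) uint32 {
	return uint32(pendingByte/int64(pageSize)) + 1
}

func main() {
	for _, ps := range []uint32{4096, 8192, 65536} {
		fmt.Printf("page size %d: lock page %d\n", ps, lockPgno(ps))
	}
}
```

With 4KB pages the lock page is 262145 and with 8KB pages it is 131073, so a database has to grow past 1GB before the page is ever reached.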

✅ Shrink Operations

Deleted 40% of data (10,239 rows)
FULL checkpoint executed successfully
Database size reduced from 66MB to 45MB

Example Usage

# Create a 1GB database for lock page testing
./bin/litestream-test populate -db /tmp/test.db -target-size 1GB -page-size 8192

# Generate continuous load
./bin/litestream-test load -db /tmp/test.db -write-rate 100 -duration 1m -pattern burst

# Test shrinking
./bin/litestream-test shrink -db /tmp/test.db -delete-percentage 50 -checkpoint

# Validate replication
./bin/litestream-test validate -source-db /tmp/test.db -replica-url s3://bucket/test

Key Features

✅ Handles SQLite lock page at 1GB boundary (page calculation included)
✅ Supports interruption/recovery test scenarios
✅ Uses crypto/rand for secure random data generation
✅ Comprehensive error handling and structured logging
✅ Compatible with existing Litestream binaries
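
As an illustration of the crypto/rand point above, the following Go sketch fills a row payload with random bytes. The `randomPayload` helper is hypothetical; the harness's actual data generation code is not shown in this description.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomPayload (hypothetical helper) returns n bytes read from the
// operating system's cryptographically secure random source.
func randomPayload(n int) ([]byte, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return nil, err
	}
	return buf, nil
}

func main() {
	p, err := randomPayload(16)
	if err != nil {
		panic(err)
	}
	// Hex-encode so the payload is printable.
	fmt.Println(hex.EncodeToString(p))
}
```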

Next Steps

This is the first pass of the testing framework. Future enhancements could include:

Automated test suites for CI/CD
Multi-database concurrent testing
Network failure simulation
Performance regression detection
VFS testing capabilities

Notes

All code follows existing Litestream patterns and conventions
Uses same dependencies (go-sqlite3, slog)
Passes all pre-commit hooks (goimports, go-vet, staticcheck)
No changes to existing Litestream code

cc: @benbjohnson @corylanou

- Implement populate command for quickly creating test databases
- Add load command for generating continuous read/write workloads
- Create validate command for integrity and checksum verification
- Add shrink command for testing database shrinking scenarios
- Support configurable page sizes, write patterns, and validation modes
- Handle databases crossing 1GB boundary (SQLite lock page edge case)

Part of comprehensive testing framework for validating Litestream behavior with large databases, various write patterns, and edge cases.

- Move all test scripts to cmd/litestream-test/scripts/ directory
- Create comprehensive README.md documenting all test scripts
- Move test results documentation to .local/test-results/ (gitignored)
- Clean up root directory test artifacts

Test scripts include:
- reproduce-critical-bug.sh: Reproduces checkpoint during downtime bug
- test-1gb-boundary.sh: Tests SQLite 1GB lock page handling
- test-fresh-start.sh: Tests fresh database creation workflow
- test-rapid-checkpoints.sh: Stress tests rapid checkpoint cycling
- test-wal-growth.sh: Tests 100MB+ WAL file handling
- test-concurrent-operations.sh: Tests 5 concurrent database replications
- verify-test-setup.sh: Ensures local builds are used

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

…entation

- Add S3 LTX file retention cleanup testing scripts:
- test-s3-retention-small-db.sh: Tests 50MB database with 2min retention
- test-s3-retention-large-db.sh: Tests 1.5GB database crossing SQLite lock page
- test-s3-retention-comprehensive.sh: Master script with comparative analysis
- S3-RETENTION-TESTING.md: Complete documentation and usage guide
- Scripts use local Python S3 mock for isolated testing
- Validate Ben's concern about LTX file cleanup after retention period
- Include critical SQLite 1GB lock page boundary testing
- Update scripts/README.md with new S3 retention test documentation
- Clean up temporary markdown files and analysis artifacts

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

- Disable MD031 (blanks-around-fences) and MD032 (blanks-around-lists)
- Fix README.md heading style to use consistent setext format
- Convert bold text to proper headings in S3-RETENTION-TESTING.md
- Add required blank lines around headings

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

Add four new testing scripts for automated validation:
- analyze-test-results.sh: Parse and analyze test output logs
- test-overnight-s3.sh: Extended S3 replication stress testing
- test-overnight.sh: General overnight stress testing suite
- test-quick-validation.sh: Fast validation for common scenarios

These scripts provide comprehensive test coverage for replication, retention, and recovery scenarios across different storage backends.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

- Populate database before starting litestream to ensure data exists
- Adjust aggressive test parameters for shorter test duration
- Improve monitoring with WAL size tracking and replica metrics
- Fix table detection for accurate row counting
- Better error filtering to exclude non-critical warnings
- Enhanced summary with clearer success/failure indicators

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>

corylanou mentioned this pull request Sep 16, 2025

CRITICAL: Restore fails with 'nonsequential page numbers' after checkpoint during Litestream downtime #752

Open

corylanou force-pushed the feat/litestream-test-harness branch from c526be1 to 3744986 Compare September 17, 2025 13:31

corylanou and others added 5 commits September 24, 2025 14:42

remove misc file

a7f105b

corylanou force-pushed the feat/litestream-test-harness branch from a92760a to b817367 Compare September 24, 2025 19:43

corylanou and others added 3 commits September 25, 2025 07:48

Detect full checkpoints (#761)

d5c4608

corylanou merged commit ee36d3e into main Sep 25, 2025
9 checks passed

corylanou deleted the feat/litestream-test-harness branch September 25, 2025 21:51
