@corylanou corylanou commented Aug 12, 2025

Summary

Problem 1: Concurrent Map Write (Critical)

The db.maxLTXFileInfos.m map was being accessed without mutex protection in two locations:

  • Line 893: delete(db.maxLTXFileInfos.m, 0)
  • Lines 898-904: Writing to db.maxLTXFileInfos.m[0]

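This caused sporadic "fatal error: concurrent map writes" crashes during tests. The Go runtime raises this as a fatal error that cannot be recovered, so it takes down the whole process.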
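
As a rough illustration of the kind of fix involved, the sketch below pairs a map with the mutex that guards it and performs the delete and the write inside one critical section. The type and method names (`fileInfo`, `levelInfos`, `resetLevel`) are hypothetical and are not taken from the actual db.go; whether the real change uses one critical section or two is also an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// fileInfo is a hypothetical stand-in for per-level LTX file metadata.
type fileInfo struct {
	MinTXID, MaxTXID uint64
}

// levelInfos pairs a map with the mutex that guards it (illustrative only).
type levelInfos struct {
	mu sync.Mutex
	m  map[int]fileInfo
}

// resetLevel deletes the entry for a level and writes a new one while
// holding the lock, so no other goroutine can touch the map in between.
func (l *levelInfos) resetLevel(level int, info fileInfo) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.m, level)
	l.m[level] = info
}

// get reads an entry under the same lock.
func (l *levelInfos) get(level int) (fileInfo, bool) {
	l.mu.Lock()
	defer l.mu.Unlock()
	info, ok := l.m[level]
	return info, ok
}

func main() {
	infos := &levelInfos{m: make(map[int]fileInfo)}

	// Several goroutines rewrite level 0 at once; the mutex serializes them.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(n uint64) {
			defer wg.Done()
			infos.resetLevel(0, fileInfo{MinTXID: n, MaxTXID: n})
		}(uint64(i))
	}
	wg.Wait()

	info, ok := infos.get(0)
	fmt.Println(ok, info.MinTXID == info.MaxTXID)
}
```

Without the `Lock`/`Unlock` pair in `resetLevel`, the same program would be expected to trip the race detector and could abort with the fatal error quoted above.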

Problem 2: PageSize Race (Minor)

The db.pageSize field was being read directly in SnapshotReader() without synchronization while it could be written during init(). This is unlikely to occur in practice since:

  • pageSize is only written once during initialization
  • It's an int, and aligned word-sized reads and writes are atomic in hardware on most architectures (although Go itself does not guarantee this)
  • The value never changes after being set

However, it's technically a race according to Go's memory model, and the fix is trivial (use the existing thread-safe PageSize() method), so it's worth fixing for correctness.
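
A minimal sketch of the accessor pattern, assuming the thread-safe getter takes a read lock on the same mutex that the initialization path holds while writing. The `DB` struct here is a reduced stand-in, and `initialize` is a hypothetical name for the write path; the real `PageSize()` may be implemented differently.

```go
package main

import (
	"fmt"
	"sync"
)

// DB is a reduced, hypothetical stand-in holding only the fields needed here.
type DB struct {
	mu       sync.RWMutex
	pageSize int
}

// PageSize returns the page size under a read lock, so it is safe to call
// while another goroutine is still initializing the database.
func (db *DB) PageSize() int {
	db.mu.RLock()
	defer db.mu.RUnlock()
	return db.pageSize
}

// initialize sets the page size once, under the write lock.
func (db *DB) initialize(pageSize int) {
	db.mu.Lock()
	defer db.mu.Unlock()
	db.pageSize = pageSize
}

func main() {
	db := &DB{}

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); db.initialize(4096) }()
	go func() { defer wg.Done(); _ = db.PageSize() }() // may observe 0 or 4096, but never races
	wg.Wait()

	fmt.Println(db.PageSize())
}
```

Reading `db.pageSize` directly in the second goroutine would usually return the same values, but the race detector would flag it because nothing orders the read against the write.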

Solution

  1. Added mutex Lock/Unlock calls around the unprotected map operations
  2. Changed SnapshotReader() to use db.PageSize() instead of direct field access

Test plan

  • Added TestDB_ConcurrentMapWrite to reproduce the race conditions
  • Verified test detects races when run with go test -race
  • Confirmed fixes resolve both race conditions
  • All existing tests pass
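
For readers who want to reproduce this class of bug outside the project, the following is a standalone sketch of a stress routine, not the actual TestDB_ConcurrentMapWrite. The names `guardedMap` and `hammer` are hypothetical. It can be run with `go run -race`, or wrapped in a `TestXxx` function and run with `go test -race`.

```go
package main

import (
	"fmt"
	"sync"
)

// guardedMap is a hypothetical stand-in for a mutex-protected map.
type guardedMap struct {
	mu sync.Mutex
	m  map[int]int
}

// hammer starts `writers` goroutines that each delete and rewrite key 0
// `rounds` times, and returns the total number of writes performed.
func hammer(g *guardedMap, writers, rounds int) int {
	var wg sync.WaitGroup
	total := 0 // guarded by g.mu
	for w := 0; w < writers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := 0; r < rounds; r++ {
				g.mu.Lock()
				delete(g.m, 0)
				g.m[0] = r
				total++
				g.mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return total
}

func main() {
	g := &guardedMap{m: make(map[int]int)}
	fmt.Println(hammer(g, 8, 1000))
}
```

Commenting out the `Lock` and `Unlock` calls turns this into a reproduction: the race detector should then report the conflicting map accesses, which is the same signal the test plan relies on.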

🤖 Generated with Claude Code

corylanou and others added 4 commits August 12, 2025 15:08
The sync() method in db.go was accessing db.maxLTXFileInfos.m without
proper mutex protection at lines 893 and 898, while other parts of the
code properly used locks. This caused a "fatal error: concurrent map writes"
panic when multiple goroutines accessed the map concurrently during
compaction and snapshot operations.

Added mutex Lock/Unlock around map operations to prevent race conditions.
Also added a race test to help detect similar issues in the future.

Fixes #696

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

Use the thread-safe PageSize() method instead of direct field access
to prevent race condition when SnapshotReader is called concurrently
with init() during initial Sync.

Run pre-commit hooks to fix formatting issues that would fail CI

@corylanou corylanou requested a review from benbjohnson August 12, 2025 21:08
@corylanou corylanou merged commit 81fdd8f into main Aug 12, 2025
16 checks passed
@corylanou corylanou deleted the fix/concurrent-map-write-696 branch August 12, 2025 22:49