Skip to content

Conversation

Sh1ragami
Copy link
Member

@Sh1ragami Sh1ragami commented Sep 18, 2025

Description

  • Added WithServerCertHotReload(bool) option (default: off).
    When enabled, the server reloads the TLS keypair on each handshake via GetCertificate.

  • Replaced certStore with atomic.Pointer for simplicity and safety.

  • Added hot reload support for both server (GetCertificate) and client (GetClientCertificate).

  • Ensured fallback to the last known good certificate in case reload fails.

  • Added micro-benchmarks to evaluate runtime overhead.


TLS Hot Reload Benchmarks (10 runs average, benchstat)
Benchmark sec/op B/op allocs/op
TLS_HandshakePerOp_Static 627.7µ ± 2% 260.3Ki ± 0% 1.927k ± 0%
TLS_HandshakePerOp_HotReload 663.1µ ± 13% 275.4Ki ± 0% 2.118k ± 0%

Related Issue

Versions

  • Vald Version: v1.7.17
  • Go Version: v1.24.5
  • Rust Version: v1.88.0
  • Docker Version: v28.3.2
  • Kubernetes Version: v1.33.3
  • Helm Version: v3.18.4
  • NGT Version: v2.4.3
  • Faiss Version: v1.11.0

Checklist

Special notes for your reviewer

Summary by CodeRabbit

  • New Features

    • Optional TLS certificate hot-reload for servers and clients to pick up updated certs without restarts.
    • Safe fallback to the last known good certificate if a reload fails.
    • Hot-reload is disabled by default and must be explicitly enabled.
  • Tests

    • Added benchmarks and an in-process TLS test server to measure performance and stability with/without hot-reload.
    • Automated certificate rotation to validate reload behavior.

- Add WithServerCertHotReload(bool) (default off); when enabled, reload
keypair
  on each handshake via GetCertificate
- Add lightweight certStore and micro-benchmarks
- Bench (ref): Static ~0.49 ns/op; HotReload ~51 µs/op, ~171 allocs/op
Copy link
Contributor

coderabbitai bot commented Sep 18, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds a server-side TLS hot-reload option and implements reload-on-handshake/request certificate loading with an atomic last-known-good fallback for server and client credentials. Also adds an in-package benchmark test harness that starts a TLS gRPC server and can rotate certificate files to exercise hot-reload.

Changes

Cohort / File(s) Summary of changes
TLS hot-reload option & defaults
internal/tls/option.go
Adds WithServerCertHotReload(enabled bool) Option and includes WithServerCertHotReload(false) among default options so hot-reload remains opt-in (disabled by default).
TLS credentials: reload logic
internal/tls/tls.go
Adds hotReload bool and certPtr atomic.Pointer[tls.Certificate] to credentials; imports sync/atomic. Implements preload and two paths: when hotReload is true, sets GetCertificate/GetClientCertificate and GetConfigForClient/GetClientConfig to reload certs per handshake/request, update certPtr, and fall back to last-known-good on load errors; when false, loads certs once and stores them in certPtr.
Benchmarks & test scaffolding
internal/tls/tls_bench_test.go
Adds TLS benchmark helpers: serverStarter to start an in-process TLS gRPC server (optional hot-reload), reloadTLSCerts to rotate certificate files periodically, and a mock Index server used by benchmarks.
Repo metadata
.gitfiles
Adds the new benchmark test file to tracked files.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant Server
  participant TLSCreds as TLS_Credentials
  participant FS as Cert_Files

  Note over Server,TLSCreds: Server configures TLS (hotReload=true/false)
  Server->>TLSCreds: Configure(certPath, keyPath, hotReload)
  TLSCreds->>FS: loadKeyPair(certPath, keyPath)
  FS-->>TLSCreds: tls.Certificate
  TLSCreds->>TLSCreds: store in atomic certPtr

  rect rgba(230,245,255,0.6)
  Note over Client,Server: Handshake / request (hotReload=true path)
  Client->>Server: ClientHello / request
  Server->>TLSCreds: GetCertificate() / GetClientCertificate()
  TLSCreds->>FS: loadKeyPair(certPath, keyPath)
  alt load ok
    FS-->>TLSCreds: tls.Certificate
    TLSCreds->>TLSCreds: update atomic certPtr
    TLSCreds-->>Server: current certificate
  else load fails
    TLSCreds->>TLSCreds: read last-known-good from certPtr
    TLSCreds-->>Server: fallback certificate
  end
  Server-->>Client: ServerHello / response
  end

  Note over FS: External rotation process may replace cert files between handshakes
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed While the title references hot-reload, it lacks context by not specifying that it applies to TLS certificate handling and omits mention of the new Option and test scaffolding, making it only partially descriptive. It also includes internal shorthand (“minimal test impl”) that may not clearly communicate scope to all readers. The title is not unrelated but could better summarize the addition of TLS certificate hot reload support.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/tls/tls-cert-hot-reload

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b27c3e5 and 02bf17d.

📒 Files selected for processing (4)
  • .gitfiles (1 hunks)
  • internal/tls/option.go (2 hunks)
  • internal/tls/tls.go (4 hunks)
  • internal/tls/tls_bench_test.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitfiles
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/tls/tls_bench_test.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (78)
  • GitHub Check: Run tests for internal packages
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: runner / textlint
  • GitHub Check: Run tests for internal packages
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: runner / textlint
  • GitHub Check: runner / golangci-lint
  • GitHub Check: Run tests for internal packages
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: runner / textlint
  • GitHub Check: runner / golangci-lint
  • GitHub Check: Run tests for internal packages
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: runner / textlint
  • GitHub Check: runner / golangci-lint
  • GitHub Check: Run tests for internal packages
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: runner / textlint
  • GitHub Check: runner / golangci-lint
  • GitHub Check: Run tests for internal packages
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: runner / textlint
  • GitHub Check: runner / golangci-lint
  • GitHub Check: Run tests for internal packages
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: runner / textlint
  • GitHub Check: runner / golangci-lint
  • GitHub Check: Run tests for internal packages
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: runner / textlint
  • GitHub Check: runner / golangci-lint
  • GitHub Check: Run tests for internal packages
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: runner / textlint
  • GitHub Check: runner / golangci-lint
  • GitHub Check: Run tests for internal packages
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: runner / textlint
🔇 Additional comments (4)
internal/tls/tls.go (2)

167-168: Return the freshly loaded server cert.

Line 168 returns c.certPtr.Load() immediately after storing &kp2. A concurrent handshake can race between the store and load, causing this goroutine to serve a certificate it never validated while also paying an extra atomic read. Returning &kp2 keeps the handshake aligned with the cert we just loaded; the Store already preserves it for future fallbacks.

 			c.certPtr.Store(&kp2)
-			return c.certPtr.Load(), nil
+			return &kp2, nil

253-254: Return the freshly loaded client cert.

Same pattern on Line 254: reloading via c.certPtr.Load() invites a race with other handshakes and adds redundant atomic work. Return &kp2 so this handshake always uses the certificate it just obtained while the stored pointer keeps the fallback path working.

 				c.certPtr.Store(&kp2)
-				return c.certPtr.Load(), nil
+				return &kp2, nil
internal/tls/option.go (2)

42-45: Opt-in default keeps legacy consumers safe

Explicitly defaulting hot reload to off (and documenting why) preserves existing TLS behavior while teeing up the new capability for adopters with zero surprise overhead. Nicely done.


150-155: Option wiring matches the existing pattern

This mirrors the other option helpers and cleanly threads the flag into the credentials builder, making it trivial for callers to toggle hot reload when they actually need it. Looks great.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@vdaas-ci
Copy link
Collaborator

[CHATOPS:HELP] ChatOps commands.

  • 🙆‍♀️ /approve - approve
  • 🍱 /format - format codes and add licenses
  • /gen-test - generate test codes
  • 🏷️ /label - add labels
  • 🔚 2️⃣ 🔚 /label actions/e2e-deploy - run E2E deploy & integration test

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
internal/tls/option.go (1)

150-156: Add GoDoc for exported option.

Public symbol without a doc string hurts discoverability and linting.

Apply this diff:

+// WithServerCertHotReload enables per-handshake certificate reload via
+// tls.Config.GetCertificate. When true, the server reads cert/key files
+// on each handshake. Default is false (opt-in).
 func WithServerCertHotReload(enabled bool) Option {
     return func(c *credentials) error {
         c.hotReload = enabled
         return nil
     }
 }
internal/tls/store.go (1)

24-27: Prefer atomic.Pointer for type safety and fewer branches.

atomic.Pointer[tls.Certificate] is simpler and faster than atomic.Value with type assertions.

Apply this diff:

-import (
-    "crypto/tls"
-    "sync/atomic"
-)
+import (
+    "crypto/tls"
+    "sync/atomic"
+)

 // certStore stores and serves the latest tls.Certificate pointer safely.
 type certStore struct {
-    v atomic.Value // *tls.Certificate
+    p atomic.Pointer[tls.Certificate]
 }

 func newCertStore() *certStore { return &certStore{} }

 func (s *certStore) Set(c *tls.Certificate) {
     if s == nil || c == nil {
         return
     }
-    s.v.Store(c)
+    s.p.Store(c)
 }

 func (s *certStore) Get() *tls.Certificate {
     if s == nil {
         return nil
     }
-    if v := s.v.Load(); v != nil {
-        if c, ok := v.(*tls.Certificate); ok {
-            return c
-        }
-    }
-    return nil
+    return s.p.Load()
 }

Also applies to: 31-36, 38-48

internal/tls/tls_bench_test.go (3)

13-35: Benchmark name is misleading; it doesn’t call GetCertificate.

Rename for clarity.

Apply this diff:

-func Benchmark_GetCertificate_Static(b *testing.B) {
+func Benchmark_StaticCertAccess(b *testing.B) {

52-55: Fail fast if prewarm reload fails.

Bench results are meaningless if the first reload already failed.

Apply this diff:

-    if c, err := cfg.GetCertificate(chi); err == nil {
-        certSink = c
-    }
+    c, err := cfg.GetCertificate(chi)
+    if err != nil {
+        b.Fatalf("prewarm GetCertificate failed: %v", err)
+    }
+    certSink = c

37-66: Add a parallel benchmark to model concurrent handshakes.

Hot-reload reads cert/key each call; concurrency matters.

Add alongside the existing benches:

func Benchmark_GetCertificate_HotReload_Parallel(b *testing.B) {
    cfg, err := NewServerConfig(
        WithCert(testdata.GetTestdataPath("tls/server.crt")),
        WithKey(testdata.GetTestdataPath("tls/server.key")),
        WithServerCertHotReload(true),
    )
    if err != nil {
        b.Fatalf("failed to create hot-reload server config: %v", err)
    }
    if cfg.GetCertificate == nil {
        b.Fatalf("GetCertificate not set for hot reload config")
    }
    chi := &gotls.ClientHelloInfo{ServerName: "bench"}
    if _, err := cfg.GetCertificate(chi); err != nil {
        b.Fatalf("prewarm GetCertificate failed: %v", err)
    }
    b.ReportAllocs()
    b.ResetTimer()
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            c, err := cfg.GetCertificate(chi)
            if err != nil {
                b.Fatalf("GetCertificate failed: %v", err)
            }
            certSink = c
        }
    })
    runtime.KeepAlive(certSink)
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ffcca50 and 811a3f7.

📒 Files selected for processing (4)
  • internal/tls/option.go (2 hunks)
  • internal/tls/store.go (1 hunks)
  • internal/tls/tls.go (3 hunks)
  • internal/tls/tls_bench_test.go (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (130)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
🔇 Additional comments (3)
internal/tls/option.go (1)

43-45: Defaulting hot‑reload to off is the right call. Please document the overhead.

Opt‑out by default avoids surprise latency on handshakes. Ensure README/Helm values note the perf trade‑offs and how to enable.

internal/tls/tls.go (2)

71-74: State additions look good.

hotReload flag and cert store are scoped correctly to credentials.


87-89: Good: initialize store eagerly.

Avoids nil checks in fast paths.

Copy link

codecov bot commented Sep 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 17.32%. Comparing base (ffcca50) to head (02bf17d).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3247      +/-   ##
==========================================
- Coverage   17.34%   17.32%   -0.03%     
==========================================
  Files         129      129              
  Lines       14557    14557              
==========================================
- Hits         2525     2522       -3     
- Misses      11742    11744       +2     
- Partials      290      291       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Collaborator

@kpango kpango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the Pull Request regarding dynamic TLS certificate updates! I have reviewed it carefully and found that multiple fixes are required.

I've provided feedback on specific sections, so please review them. Also, you probably addressed the Server Cert, but are you forgetting about the Client Cert?

tls.Config ↓
GetClientCertificate func(CertificateRequestInfo) (Certificate, error)
GetConfigForClient func(ClientHelloInfo) (Config, error)

// hotReload toggles per-handshake reload using GetCertificate.
hotReload bool
// store keeps the latest loaded certificate.
store *certStore
Copy link
Collaborator

@kpango kpango Sep 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the certStore is essentially unused elsewhere, there is no need for store.go and certStore implementation. It would be simpler to just use the atomic.Pointer[tls.Certificate] field here.

}
c.cfg.Certificates = []tls.Certificate{kp}
c.store.Set(&kp)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using GetCertificate, cfg.Certificates should still be set for NameToCertificate. So this conditional branch only sets one or the other, which causes problems.

Comment on lines 13 to 66
func Benchmark_GetCertificate_Static(b *testing.B) {
cfg, err := NewServerConfig(
WithCert(testdata.GetTestdataPath("tls/server.crt")),
WithKey(testdata.GetTestdataPath("tls/server.key")),
WithServerCertHotReload(false),
)
if err != nil {
b.Fatalf("failed to create static server config: %v", err)
}
if len(cfg.Certificates) == 0 {
b.Fatalf("no static certificate loaded")
}
get := func() *gotls.Certificate { return &cfg.Certificates[0] }

_ = get()

b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
certSink = get()
}
runtime.KeepAlive(certSink)
}

func Benchmark_GetCertificate_HotReload(b *testing.B) {
cfg, err := NewServerConfig(
WithCert(testdata.GetTestdataPath("tls/server.crt")),
WithKey(testdata.GetTestdataPath("tls/server.key")),
WithServerCertHotReload(true),
)
if err != nil {
b.Fatalf("failed to create hot-reload server config: %v", err)
}
if cfg.GetCertificate == nil {
b.Fatalf("GetCertificate not set for hot reload config")
}

chi := &gotls.ClientHelloInfo{ServerName: "bench"}

if c, err := cfg.GetCertificate(chi); err == nil {
certSink = c
}

b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
c, err := cfg.GetCertificate(chi)
if err != nil {
b.Fatalf("GetCertificate failed: %v", err)
}
certSink = c
}
runtime.KeepAlive(certSink)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This benchmark is used differently from the actual case flow, and since it compares simple function calls with memory calls, it's obvious without benchmarking whether the HotReload process is slow.

The focus at this time should be building a simple Echo server in a Client-Server environment and accurately evaluating its QPS and latency using testing.B based code.

- Normal communication (Static vs. HotReload)

- Handshake included (Static vs. HotReload)

- Certificate rotation in progress
Copy link

cloudflare-workers-and-pages bot commented Sep 19, 2025

Deploying vald with  Cloudflare Pages  Cloudflare Pages

Latest commit: 02bf17d
Status: ✅  Deploy successful!
Preview URL: https://9b0f106a.vald.pages.dev
Branch Preview URL: https://feature-tls-tls-cert-hot-rel.vald.pages.dev

View logs

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal/tls/tls.go (2)

135-142: Server must not “fallback” to insecure; reject missing cert/key.

A TLS server cannot operate without a certificate. Returning a config with InsecureSkipVerify is meaningless on the server side and will fail at handshake time. Always error if cert or key is missing.

Apply:

-	if c.cert == "" || c.key == "" {
-		if c.insecure {
-			return &Config{
-				InsecureSkipVerify: c.insecure,
-			}, nil
-		}
-		return nil, errors.ErrTLSCertOrKeyNotFound
-	}
+	if c.cert == "" || c.key == "" {
+		return nil, errors.ErrTLSCertOrKeyNotFound
+	}

208-215: Client should allow system roots by default (no hard error).

If neither CA nor client cert/key is provided, the client should proceed with system RootCAs (and optional InsecureSkipVerify if explicitly set). The current early return forces an error.

Apply:

-	if c.cert == "" && c.key == "" && c.ca == "" {
-		if c.insecure {
-			return &Config{
-				InsecureSkipVerify: c.insecure,
-			}, nil
-		}
-		return nil, errors.ErrTLSCertOrKeyNotFound
-	}
+	if c.cert == "" && c.key == "" && c.ca == "" {
+		// Use system roots by default. If c.insecure is true, newCredential already set it.
+		// No early return needed.
+	}
♻️ Duplicate comments (1)
internal/tls/tls.go (1)

158-169: Log reload failures and avoid extra atomic on success.

  • Add a warn on reload failure when falling back to last good cert.
  • Return &kp2 directly to skip an extra atomic.Load.

Apply:

-        c.cfg.GetCertificate = func(chi *tls.ClientHelloInfo) (*tls.Certificate, error) {
+        c.cfg.GetCertificate = func(chi *tls.ClientHelloInfo) (*tls.Certificate, error) {
             kp2, err := loadKeyPair(c.sn, c.cert, c.key)
             if err != nil {
                 // fall back to last good certificate
                 if cur := c.certPtr.Load(); cur != nil {
-                    return cur, nil
+                    log.Warnf("GetCertificate reload failed; serving last good cert: %v", err)
+                    return cur, nil
                 }
                 return nil, err
             }
             c.certPtr.Store(&kp2)
-            return c.certPtr.Load(), nil
+            return &kp2, nil
         }
🧹 Nitpick comments (7)
internal/tls/tls.go (3)

171-179: Drop GetConfigForClient cloning; it’s unnecessary with GetCertificate and adds allocs.

When GetCertificate is set, NameToCertificate is ignored. Keeping Certificates current via Clone here adds overhead without benefit. Simplify by removing this hook.

Apply:

-        // Ensure NameToCertificate stays sensible by cloning config with latest cert.
-        c.cfg.GetConfigForClient = func(chi *tls.ClientHelloInfo) (*tls.Config, error) {
-            cfg := c.cfg.Clone()
-            cfg.GetConfigForClient = nil
-            if cur := c.certPtr.Load(); cur != nil {
-                cfg.Certificates = []tls.Certificate{*cur}
-            }
-            return cfg, nil
-        }

143-146: Don’t auto-set ServerName defaults; let the client infer or require explicit override.

Auto-setting ServerName to “vald-server” (server) or “vald-client” (client) risks hostname verification failures. Leave empty unless explicitly configured.

Apply:

-	if c.sn == "" {
-		c.sn = "vald-server"
-		c.cfg.ServerName = c.sn
-	}
+	// Do not force ServerName on server configs.

@@
-        if c.sn == "" {
-            c.sn = "vald-client"
-            c.cfg.ServerName = c.sn
-        }
+        // Do not force ServerName on client; tls.Dial infers it from the host.

Also applies to: 232-235


245-254: Client reload: add warn on fallback and avoid extra atomic.

Mirror the server path improvements.

Apply:

-            c.cfg.GetClientCertificate = func(cri *tls.CertificateRequestInfo) (*tls.Certificate, error) {
+            c.cfg.GetClientCertificate = func(cri *tls.CertificateRequestInfo) (*tls.Certificate, error) {
                 kp2, err := loadKeyPair(c.sn, c.cert, c.key)
                 if err != nil {
                     if cur := c.certPtr.Load(); cur != nil {
-                        return cur, nil
+                        log.Warnf("GetClientCertificate reload failed; serving last good cert: %v", err)
+                        return cur, nil
                     }
                     return nil, err
                 }
                 c.certPtr.Store(&kp2)
-                return c.certPtr.Load(), nil
+                return &kp2, nil
             }
internal/tls/tls_bench_test.go (4)

21-23: Silence logs only for benchmarks.

init() affects the whole package. Consider guarding with testing.Short or moving level change into a helper to avoid altering non-benchmark runs.


108-141: ns/op vs qps reporting is fine; also report payload throughput.

Set bytes per op for a clearer throughput metric.

Apply:

     payload := []byte("0123456789abcdef0123456789abcdef")
@@
-    b.ReportAllocs()
+    b.ReportAllocs()
+    b.SetBytes(int64(len(payload)))

171-204: RunParallel + b.Fatalf from worker goroutines.

b.Fatalf in worker goroutines can abruptly end other workers mid-iteration. Prefer b.Errorf + pb.ResetTimer guard or panic (testing catches) for clarity. Not a blocker.


241-279: Rotation toggles only cert; good. Consider jitter to avoid periodic lockstep.

Add small random jitter to tick to avoid synchronization artifacts with handshake bursts.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 811a3f7 and fb7b3c7.

📒 Files selected for processing (2)
  • internal/tls/tls.go (4 hunks)
  • internal/tls/tls_bench_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
internal/tls/tls_bench_test.go (4)
internal/log/level/level.go (1)
  • ERROR (36-36)
internal/tls/tls.go (3)
  • Config (38-38)
  • NewServerConfig (129-199)
  • NewClientConfig (202-265)
internal/tls/option.go (4)
  • WithCert (106-111)
  • WithKey (113-118)
  • WithServerCertHotReload (150-155)
  • WithCa (120-125)
internal/test/testdata.go (1)
  • GetTestdataPath (29-31)
internal/tls/tls.go (2)
internal/tls/option.go (1)
  • Option (26-26)
internal/errors/errors.go (1)
  • ErrOptionFailed (42-48)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (158)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: coverage
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: coverage
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: coverage
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: coverage
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: coverage
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: coverage
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: coverage
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: coverage
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: coverage
  • GitHub Check: runner / go build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: coverage
  • GitHub Check: runner / go build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
🔇 Additional comments (1)
internal/tls/tls_bench_test.go (1)

81-101: ServerName inference relies on dial target — use "localhost" or set ccfg.ServerName to match the cert SANs

If the test cert lacks 127.0.0.1 in its SANs the TLS handshake will fail. Update internal/tls/tls_bench_test.go (lines 81–101) to dial via hostname (e.g., localhost) or explicitly set ccfg.ServerName to the certificate's SAN. Verify the cert SANs locally: openssl x509 -in <path-to-server.crt> -noout -text | rg -nC2 'Subject:|DNS:|IP Address:'.

coderabbitai[bot]
coderabbitai bot previously approved these changes Sep 19, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (5)
internal/tls/tls_bench_test.go (5)

238-248: Make atomicReplace portable (Windows rename over existing file fails).

os.Rename(path+".tmp", path) fails on Windows when path exists. Handle replace by removing the destination first (best-effort) before rename.

 func atomicReplace(path string, data []byte) error {
   dir := filepath.Dir(path)
   if err := os.MkdirAll(dir, 0o755); err != nil {
     return err
   }
   tmp := path + ".tmp"
   if err := os.WriteFile(tmp, data, 0o644); err != nil {
     return err
   }
-  return os.Rename(tmp, path)
+  // Best-effort remove on platforms that don't allow overwrite.
+  _ = os.Remove(path)
+  return os.Rename(tmp, path)
 }

220-235: Preserve source file permissions (key files should stay 0600).

copyFile writes with 0644, which is too permissive for private keys. Preserve src mode so server.key stays 0600 if the fixture has it.

 func copyFile(src, dst string) error {
   in, err := os.ReadFile(src)
   if err != nil {
     return err
   }
+  fi, err := os.Stat(src)
+  if err != nil {
+    return err
+  }
   dir := filepath.Dir(dst)
   if err := os.MkdirAll(dir, 0o755); err != nil {
     return err
   }
   tmp := dst + ".tmp"
-  if err := os.WriteFile(tmp, in, 0o644); err != nil {
+  if err := os.WriteFile(tmp, in, fi.Mode().Perm()); err != nil {
     return err
   }
   return os.Rename(tmp, dst)
 }

419-438: Remove unused reloadTLSCerts scaffold (or wire it to rotate files).

Current function spins a ticker and never mutates certs; it only adds complexity and imports. Prefer removing it in this PR and relying on the explicit rotation benchmark above.

-func reloadTLSCerts(b *testing.B) (stop context.CancelFunc) {
-  b.Helper()
-
-  var ctx context.Context
-  ctx, stop = context.WithCancel(b.Context())
-  eg, egctx := errgroup.New(ctx)
-  eg.Go(safety.RecoverFunc(func() error {
-    tick := time.NewTicker(time.Second * 5)
-    // todo reload certs frequently
-    for {
-      select {
-      case <-egctx.Done():
-      case <-tick.C:
-        // reload here using internal/file package
-      }
-    }
-  }))
-
-  return stop
-}
+// (intentionally removed; use Benchmark_EchoTLS_HotReload_WithRotation for rotation behavior)

29-32: Global logger init in tests — ok, but keep it minimal.

Setting ERROR level is fine to quiet benches; consider restoring level in a test cleanup if other tests in the same package rely on different verbosity.


144-151: Optional: built-in ns/op already exists.

testing.B already reports ns/op. Keeping ns/op_avg is fine since it measures just the critical section; if you want less noise, you can drop it and rely on built-in metrics.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fb7b3c7 and 0dad688.

📒 Files selected for processing (1)
  • internal/tls/tls_bench_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/tls/tls_bench_test.go (6)
internal/tls/tls.go (3)
  • Config (38-38)
  • NewServerConfig (129-199)
  • NewClientConfig (202-265)
internal/sync/errgroup/group.go (1)
  • Wait (253-258)
internal/test/testdata.go (1)
  • GetTestdataPath (29-31)
internal/config/tls.go (1)
  • TLS (22-43)
internal/servers/server/option.go (1)
  • WithGRPCRegistFunc (404-414)
internal/safety/safety.go (1)
  • RecoverFunc (29-31)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (188)
  • GitHub Check: coverage
  • GitHub Check: Run tests for internal packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: Run tests for internal packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: build / dump-contexts-to-log
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: Run tests for internal packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: build / dump-contexts-to-log
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: Run tests for internal packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: build / dump-contexts-to-log
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: Run tests for internal packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: Run tests for internal packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: Run tests for internal packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: Run tests for internal packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: Run tests for internal packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: Run tests for internal packages
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
🔇 Additional comments (2)
internal/tls/tls_bench_test.go (2)

120-141: Fatalf inside RunParallel is acceptable; ensure no partial hangs.

Current usage is fine. Just ensure all write/flush/read error paths call Fatalf to avoid goroutines hanging on a stalled conn.

Also applies to: 186-205, 321-343


100-107: Confirm certificate SANs match 127.0.0.1 or set ServerName explicitly.

Dials target 127.0.0.1. If ca.pem/server.crt don’t include an IP SAN for 127.0.0.1, name verification will fail. If needed, set ccfg.ServerName to a SAN present in server.crt (e.g., “localhost” or “vald-server”).

Also applies to: 168-175, 306-314

@github-actions github-actions bot added size/L and removed size/L labels Sep 22, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0dad688 and a424de8.

📒 Files selected for processing (1)
  • internal/tls/tls_bench_test.go (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-06-10T19:25:49.832Z
Learnt from: kpango
PR: vdaas/vald#2491
File: pkg/gateway/lb/handler/grpc/handler.go:306-316
Timestamp: 2024-06-10T19:25:49.832Z
Learning: User: kpango
PR: vdaas/vald#2491
File: pkg/gateway/lb/handler/grpc/handler.go:306-316
Timestamp: 2024-05-07T04:15:11.886Z
Learning: In this project, vtprotobuf is used, which provides enhanced methods for protobuf objects: CloneVT (object cloning), EqualVT (object comparison), SizeVT (returns object size), MarshalVT (faster version of proto.Marshal), and UnmarshalVT (faster version of proto.Unmarshal). These methods offer better performance than usual protobuf usage.

Applied to files:

  • internal/tls/tls_bench_test.go
🧬 Code graph analysis (1)
internal/tls/tls_bench_test.go (7)
internal/tls/tls.go (2)
  • NewServerConfig (129-199)
  • NewClientConfig (202-265)
internal/net/net.go (2)
  • SplitHostPort (274-299)
  • JoinHostPort (265-267)
internal/test/testdata.go (1)
  • GetTestdataPath (29-31)
internal/file/file.go (1)
  • CopyFile (214-216)
internal/tls/option.go (7)
  • WithCert (106-111)
  • WithKey (113-118)
  • WithClientAuth (158-173)
  • WithServerCertHotReload (150-155)
  • Option (26-26)
  • WithCa (120-125)
  • WithServerName (136-141)
internal/servers/server/option.go (2)
  • WithGRPCRegistFunc (404-414)
  • WithGRPCOption (388-402)
internal/safety/safety.go (1)
  • RecoverFunc (29-31)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (200)
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: coverage
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: coverage
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: coverage
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: coverage
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: coverage
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: coverage
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: coverage
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: coverage
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: coverage
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: coverage
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
🔇 Additional comments (3)
internal/tls/tls_bench_test.go (3)

3-24: Fix imports to use vtls alias and remove unused deps.

Several imports are unused and the internal TLS package needs to be aliased to avoid namespace conflicts with crypto/tls. Based on past reviews, this code needs the vtls alias and several unused imports should be removed.

Apply this import fix:

 import (
     "context"
+    "crypto/tls"
     "path/filepath"
     "testing"
     "time"

     "github.com/vdaas/vald/apis/grpc/v1/payload"
     "github.com/vdaas/vald/apis/grpc/v1/vald"
-    "github.com/vdaas/vald/internal/config"
     "github.com/vdaas/vald/internal/file"
     "github.com/vdaas/vald/internal/log"
     "github.com/vdaas/vald/internal/log/level"
     "github.com/vdaas/vald/internal/net"
-    "github.com/vdaas/vald/internal/safety"
-    "github.com/vdaas/vald/internal/servers/server"
-    "github.com/vdaas/vald/internal/servers/starter"
-    "github.com/vdaas/vald/internal/sync/errgroup"
     "github.com/vdaas/vald/internal/test"
-    "github.com/vdaas/vald/internal/tls"
+    vtls "github.com/vdaas/vald/internal/tls"
     "google.golang.org/grpc"
     "google.golang.org/grpc/credentials"
 )

35-110: Replace complex server starter with minimal gRPC TLS implementation.

The current serverStarter uses internal starter/server infrastructure that requires many unused imports. For a benchmark focused on TLS hot-reload performance, a simpler self-contained gRPC server would be more appropriate and easier to maintain.

Replace the complex serverStarter implementation:

-func serverStarter(b *testing.B, hot bool) (ctx context.Context, stop context.CancelFunc, addr string) {
-    b.Helper()
-    ctx, stop = context.WithCancel(b.Context())
-
-    ln, err := net.Listen(net.TCP.String(), "127.0.0.1:0")
-    if err != nil {
-        b.Fatalf("listen: %v", err)
-    }
-    _, port, _ := net.SplitHostPort(ln.Addr().String())
-    _ = ln.Close()
-
-    certPath := test.GetTestdataPath("tls/server.crt")
-    keyPath := test.GetTestdataPath("tls/server.key")
-    if hot {
-        dir := b.TempDir()
-        activeCertPath = filepath.Join(dir, "active.crt")
-        activeKeyPath = filepath.Join(dir, "active.key")
-        _, _ = file.CopyFile(ctx, certPath, activeCertPath)
-        _, _ = file.CopyFile(ctx, keyPath, activeKeyPath)
-        certPath, keyPath = activeCertPath, activeKeyPath
-    } else {
-        activeCertPath, activeKeyPath = "", ""
-    }
-
-    stls, err := tls.NewServerConfig(
-        tls.WithCert(certPath),
-        tls.WithKey(keyPath),
-        tls.WithClientAuth("noclientcert"),
-        tls.WithServerCertHotReload(hot),
-    )
-    if err != nil {
-        b.Fatalf("server TLS config: %v", err)
-    }
-
-    srv, err := starter.New(
-        starter.WithConfig((&config.Servers{
-            Servers: []*config.Server{{
-                Name: "bench-grpc",
-                Mode: server.GRPC.String(),
-                Host: "127.0.0.1",
-                Port: port,
-                GRPC: &config.GRPC{},
-            }},
-        }).Bind()),
-        starter.WithGRPC(func(sc *config.Server) []server.Option {
-            return []server.Option{
-                server.WithGRPCRegistFunc(func(gs *grpc.Server) {
-                    vald.RegisterIndexServer(gs, mockIndexInfoServer{})
-                }),
-                server.WithGRPCOption(grpc.Creds(credentials.NewTLS(stls))),
-            }
-        }),
-    )
-    if err != nil {
-        b.Error(err)
-    }
-
-    go func() { _ = srv.ListenAndServe(ctx) }()
-
-    addr = net.JoinHostPort("127.0.0.1", port)
-    deadline := time.Now().Add(3 * time.Second)
-    for {
-        dctx, cancel := context.WithTimeout(ctx, 200*time.Millisecond)
-        c, err := net.DialContext(dctx, net.TCP.String(), addr)
-        cancel()
-        if err == nil {
-            _ = c.Close()
-            break
-        }
-        if time.Now().After(deadline) {
-            break
-        }
-        time.Sleep(50 * time.Millisecond)
-    }
-    return ctx, stop, addr
-}
+func serverStarter(b *testing.B, hot bool) (ctx context.Context, stop context.CancelFunc, addr string) {
+    b.Helper()
+    ctx, cancel := context.WithCancel(b.Context())
+
+    certPath := test.GetTestdataPath("tls/server.crt")
+    keyPath := test.GetTestdataPath("tls/server.key")
+    if hot {
+        dir := b.TempDir()
+        activeCertPath = filepath.Join(dir, "active.crt")
+        activeKeyPath = filepath.Join(dir, "active.key")
+        _, _ = file.CopyFile(ctx, certPath, activeCertPath)
+        _, _ = file.CopyFile(ctx, keyPath, activeKeyPath)
+        certPath, keyPath = activeCertPath, activeKeyPath
+    } else {
+        activeCertPath, activeKeyPath = "", ""
+    }
+
+    // TLS server config
+    scfg, err := vtls.NewServerConfig(
+        vtls.WithCert(certPath),
+        vtls.WithKey(keyPath),
+        vtls.WithClientAuth("noclientcert"),
+        vtls.WithServerCertHotReload(hot),
+    )
+    if err != nil {
+        b.Fatal(err)
+    }
+
+    lis, err := net.Listen("tcp", "127.0.0.1:0")
+    if err != nil {
+        b.Fatal(err)
+    }
+    s := grpc.NewServer(grpc.Creds(credentials.NewTLS(scfg)))
+    vald.RegisterIndexServer(s, new(mockIndexInfoServer))
+
+    go func() {
+        _ = s.Serve(lis)
+    }()
+    stop = func() {
+        s.GracefulStop()
+        _ = lis.Close()
+        cancel()
+    }
+    
+    // Wait briefly for server to be ready
+    time.Sleep(100 * time.Millisecond)
+    return ctx, stop, lis.Addr().String()
+}

160-166: Fix client TLS config to use vtls alias.

The client configuration calls need the vtls package prefix to resolve correctly.

Apply the vtls prefix fix:

-    ccfg, err := tls.NewClientConfig(
-        tls.WithCa(test.GetTestdataPath("tls/ca.pem")),
-        tls.WithServerName("vald.vdaas.org"),
+    ccfg, err := vtls.NewClientConfig(
+        vtls.WithCa(test.GetTestdataPath("tls/ca.pem")),
+        vtls.WithServerName("vald.vdaas.org"),
     )

@Matts966 Matts966 requested a review from Copilot September 22, 2025 09:54
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request implements opt-in TLS certificate hot-reload functionality to allow servers and clients to refresh certificates on each handshake without restarts. The feature is disabled by default to avoid performance overhead and includes fallback to the last known good certificate if reload fails.

Key Changes:

  • Added WithServerCertHotReload(bool) option with default disabled state
  • Replaced certificate storage with atomic.Pointer for thread-safe access
  • Implemented per-handshake certificate reloading via GetCertificate and GetClientCertificate callbacks

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
internal/tls/option.go Adds new WithServerCertHotReload option with disabled default
internal/tls/tls.go Implements hot-reload logic with atomic certificate storage and fallback mechanisms
internal/tls/tls_bench_test.go Adds comprehensive benchmarks comparing static vs hot-reload TLS handshake performance

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

return nil, err
}
c.certPtr.Store(&kp2)
return c.certPtr.Load(), nil
Copy link
Preview

Copilot AI Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function should return the newly loaded certificate &kp2 directly instead of calling c.certPtr.Load() which could potentially return a different certificate if another goroutine updates it between the Store and Load operations.

Suggested change
return c.certPtr.Load(), nil
return &kp2, nil

Copilot uses AI. Check for mistakes.

return nil, err
}
c.certPtr.Store(&kp2)
return c.certPtr.Load(), nil
Copy link
Preview

Copilot AI Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function should return the newly loaded certificate &kp2 directly instead of calling c.certPtr.Load() which could potentially return a different certificate if another goroutine updates it between the Store and Load operations.

Suggested change
return c.certPtr.Load(), nil
return &kp2, nil

Copilot uses AI. Check for mistakes.

@kmrmt
Copy link
Contributor

kmrmt commented Sep 23, 2025

/format

@vdaas-ci
Copy link
Collaborator

[FORMAT] Updating license headers and formatting go codes triggered by kmrmt.

@github-actions github-actions bot removed the size/L label Sep 23, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
internal/tls/tls.go (2)

160-166: Log when serving the last known good certificate on reload failure.

Adds observability for transient reload errors without failing handshakes.

Apply these diffs:

-				if cur := c.certPtr.Load(); cur != nil {
-					return cur, nil
+				if cur := c.certPtr.Load(); cur != nil {
+					log.Warnf("TLS hot-reload failed; serving last known good certificate: %v", err)
+					return cur, nil
-					if cur := c.certPtr.Load(); cur != nil {
-						return cur, nil
+					if cur := c.certPtr.Load(); cur != nil {
+						log.Warnf("Client TLS hot-reload failed; serving last known good certificate: %v", err)
+						return cur, nil

Also applies to: 246-251


236-255: Confirm option semantics: server‑named flag also toggles client hot‑reload.

hotReload is applied for client cert reloads too. If the public option is WithServerCertHotReload, verify this is intended; otherwise introduce a client‑specific option.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a424de8 and 78ff71a.

📒 Files selected for processing (4)
  • .gitfiles (1 hunks)
  • internal/tls/option.go (2 hunks)
  • internal/tls/tls.go (4 hunks)
  • internal/tls/tls_bench_test.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitfiles
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal/tls/tls_bench_test.go
  • internal/tls/option.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (170)
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for cmd packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / go build
  • GitHub Check: runner / golangci-lint
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
🔇 Additional comments (3)
internal/tls/tls.go (3)

71-75: LGTM: atomic pointer + opt-in flag are appropriate.

Using atomic.Pointer[tls.Certificate] with an opt-in hotReload flag is a clean, thread-safe choice.


147-156: LGTM: Preloading and setting Certificates for NameToCertificate.

Preloading once and populating cfg.Certificates while storing the pointer for fallback is correct.


167-169: Return the freshly loaded cert pointer (&kp2) instead of re‑loading atomically.

Avoids an unnecessary atomic Load and a tiny race window returning a different pointer.

Apply these diffs:

-			return c.certPtr.Load(), nil
+			return &kp2, nil
-				return c.certPtr.Load(), nil
+				return &kp2, nil

Also applies to: 252-254

Comment on lines +171 to +179
// Ensure NameToCertificate stays sensible by cloning config with latest cert.
c.cfg.GetConfigForClient = func(chi *tls.ClientHelloInfo) (*tls.Config, error) {
cfg := c.cfg.Clone()
cfg.GetConfigForClient = nil
if cur := c.certPtr.Load(); cur != nil {
cfg.Certificates = []tls.Certificate{*cur}
}
return cfg, nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

GetConfigForClient clears GetCertificate, breaking hot‑reload.

The returned clone lacks GetCertificate, so the handshake won’t invoke the reload callback, effectively disabling hot‑reload. Copy the callback onto the clone (or drop GetConfigForClient).

Apply this diff:

 		c.cfg.GetConfigForClient = func(chi *tls.ClientHelloInfo) (*tls.Config, error) {
 			cfg := c.cfg.Clone()
 			cfg.GetConfigForClient = nil
+			// Preserve per-handshake reload on the clone.
+			cfg.GetCertificate = c.cfg.GetCertificate
 			if cur := c.certPtr.Load(); cur != nil {
 				cfg.Certificates = []tls.Certificate{*cur}
 			}
 			return cfg, nil
 		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Ensure NameToCertificate stays sensible by cloning config with latest cert.
c.cfg.GetConfigForClient = func(chi *tls.ClientHelloInfo) (*tls.Config, error) {
cfg := c.cfg.Clone()
cfg.GetConfigForClient = nil
if cur := c.certPtr.Load(); cur != nil {
cfg.Certificates = []tls.Certificate{*cur}
}
return cfg, nil
}
// Ensure NameToCertificate stays sensible by cloning config with latest cert.
c.cfg.GetConfigForClient = func(chi *tls.ClientHelloInfo) (*tls.Config, error) {
cfg := c.cfg.Clone()
cfg.GetConfigForClient = nil
// Preserve per-handshake reload on the clone.
cfg.GetCertificate = c.cfg.GetCertificate
if cur := c.certPtr.Load(); cur != nil {
cfg.Certificates = []tls.Certificate{*cur}
}
return cfg, nil
}
🤖 Prompt for AI Agents
internal/tls/tls.go around lines 171 to 179: the clone returned by
GetConfigForClient clears GetCertificate which prevents the handshake from
invoking the hot-reload callback; restore the original callback on the cloned
config before returning (e.g., set cfg.GetCertificate = c.cfg.GetCertificate) so
hot-reload continues to work, or alternatively remove the custom
GetConfigForClient and rely on the original config if that meets requirements.

if err != nil {
b.Fatalf("listen: %v", err)
}
_, port, _ := net.SplitHostPort(ln.Addr().String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to check for errors in all sections as well.

Suggested change
_, port, _ := net.SplitHostPort(ln.Addr().String())
_, port, err := net.SplitHostPort(ln.Addr().String())
if err != nil {
b.Fatal(err)
}

@Sh1ragami Sh1ragami force-pushed the feature/tls/tls-cert-hot-reload branch from 78ff71a to 6fddf0c Compare September 24, 2025 05:16
@github-actions github-actions bot added size/L and removed size/L labels Sep 24, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
internal/tls/tls_bench_test.go (2)

101-139: Ticker not stopped; potential goroutine leak.

Ensure the ticker is stopped when the goroutine exits.

Apply either the larger refactor above or minimally add:

- tick := time.NewTicker(time.Second * 5)
+ tick := time.NewTicker(5 * time.Second)
+ defer tick.Stop()

29-37: Check and handle SplitHostPort error.

Error is ignored; if parsing fails, host/port may be bogus.

- host, port, _ := net.SplitHostPort(ln.Addr().String())
+ host, port, err := vnet.SplitHostPort(ln.Addr().String())
+ if err != nil {
+   b.Fatal(err)
+ }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 78ff71a and 6fddf0c.

📒 Files selected for processing (1)
  • internal/tls/tls_bench_test.go (1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2024-06-10T19:25:49.832Z
Learnt from: kpango
PR: vdaas/vald#2491
File: pkg/gateway/lb/handler/grpc/handler.go:306-316
Timestamp: 2024-06-10T19:25:49.832Z
Learning: User: kpango
PR: vdaas/vald#2491
File: pkg/gateway/lb/handler/grpc/handler.go:306-316
Timestamp: 2024-05-07T04:15:11.886Z
Learning: In this project, vtprotobuf is used, which provides enhanced methods for protobuf objects: CloneVT (object cloning), EqualVT (object comparison), SizeVT (returns object size), MarshalVT (faster version of proto.Marshal), and UnmarshalVT (faster version of proto.Unmarshal). These methods offer better performance than usual protobuf usage.

Applied to files:

  • internal/tls/tls_bench_test.go
📚 Learning: 2024-09-21T05:35:36.818Z
Learnt from: kpango
PR: vdaas/vald#2638
File: docs/tutorial/vald-agent-standalone-on-k8s.md:239-239
Timestamp: 2024-09-21T05:35:36.818Z
Learning: In grpc-go, `grpc.DialContext` has been deprecated and `grpc.NewClient` should be used instead.

Applied to files:

  • internal/tls/tls_bench_test.go
📚 Learning: 2024-10-09T05:40:03.660Z
Learnt from: kpango
PR: vdaas/vald#2685
File: internal/config/grpc.go:244-244
Timestamp: 2024-10-09T05:40:03.660Z
Learning: `grpc.WithIdleTimeout` is a client-side `DialOption` in gRPC and can be used in client configurations.

Applied to files:

  • internal/tls/tls_bench_test.go
📚 Learning: 2025-05-01T08:44:07.726Z
Learnt from: kpango
PR: vdaas/vald#0
File: :0-0
Timestamp: 2025-05-01T08:44:07.726Z
Learning: PR #2927 addressed gRPC connection panic issues by removing the singleflight.Group[pool.Conn] field from the gRPCClient struct in internal/net/grpc/client.go and introducing a centralized handleError function for consistent error processing in both Connect and Disconnect methods.

Applied to files:

  • internal/tls/tls_bench_test.go
📚 Learning: 2025-05-01T08:44:07.726Z
Learnt from: kpango
PR: vdaas/vald#0
File: :0-0
Timestamp: 2025-05-01T08:44:07.726Z
Learning: PR #2927 addressed gRPC connection panic issues by removing the singleflight.Group[pool.Conn] field from the gRPCClient struct in internal/net/grpc/client.go and introducing a centralized handleError function for more consistent error processing in both Connect and Disconnect methods.

Applied to files:

  • internal/tls/tls_bench_test.go
🧬 Code graph analysis (1)
internal/tls/tls_bench_test.go (6)
internal/tls/tls.go (1)
  • NewServerConfig (129-199)
internal/net/net.go (2)
  • SplitHostPort (274-299)
  • JoinHostPort (265-267)
internal/test/testdata.go (1)
  • GetTestdataPath (29-31)
internal/tls/option.go (5)
  • Option (26-26)
  • WithCert (106-111)
  • WithKey (113-118)
  • WithClientAuth (158-173)
  • WithServerCertHotReload (150-155)
internal/servers/server/option.go (2)
  • WithGRPCRegistFunc (404-414)
  • WithGRPCOption (388-402)
internal/file/file.go (2)
  • ReadFile (357-371)
  • WriteFile (265-269)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (200)
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / golangci-lint
  • GitHub Check: runner / go build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / golangci-lint
  • GitHub Check: runner / go build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / golangci-lint
  • GitHub Check: runner / go build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / golangci-lint
  • GitHub Check: runner / go build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / golangci-lint
  • GitHub Check: runner / go build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / golangci-lint
  • GitHub Check: runner / go build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / golangci-lint
  • GitHub Check: runner / go build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / golangci-lint
  • GitHub Check: runner / go build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / golangci-lint
  • GitHub Check: runner / go build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: coverage
  • GitHub Check: CodeQL Analyze (go)
  • GitHub Check: CodeQL Analyze (rust)
  • GitHub Check: CodeQL Analyze (actions)
  • GitHub Check: Run tests for pkg packages
  • GitHub Check: Run tests for internal packages
  • GitHub Check: runner / golangci-lint
  • GitHub Check: runner / go build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: check-format-diff
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
  • GitHub Check: build / build
🔇 Additional comments (5)
internal/tls/tls_bench_test.go (5)

29-37: Bind/close/rebind race (TOCTOU).

Listening just to grab a free port, closing it, then starting the server later invites races. The refactor above removes this by serving directly on the acquired listener.


141-149: LGTM: mock service implementation is sufficient for the benchmarks.


3-22: Avoid shadowing std net; alias internal packages; keep imports minimal.

Using internal/net as net likely breaks net.Listen usage; also tls alias collides conceptually with crypto/tls. Alias them and drop heavy deps you no longer need after in-process server refactor.

 import (
-	"bytes"
-	"context"
-	"testing"
-	"time"
-
-	"github.com/vdaas/vald/apis/grpc/v1/payload"
-	"github.com/vdaas/vald/apis/grpc/v1/vald"
-	"github.com/vdaas/vald/internal/config"
-	"github.com/vdaas/vald/internal/file"
-	"github.com/vdaas/vald/internal/net"
-	"github.com/vdaas/vald/internal/safety"
-	"github.com/vdaas/vald/internal/servers/server"
-	"github.com/vdaas/vald/internal/servers/starter"
-	"github.com/vdaas/vald/internal/sync/errgroup"
-	"github.com/vdaas/vald/internal/test"
-	"github.com/vdaas/vald/internal/tls"
-	"google.golang.org/grpc"
-	"google.golang.org/grpc/credentials"
+	"bytes"
+	"context"
+	"net"
+	"path/filepath"
+	"testing"
+	"time"
+
+	"github.com/vdaas/vald/apis/grpc/v1/payload"
+	"github.com/vdaas/vald/apis/grpc/v1/vald"
+	"github.com/vdaas/vald/internal/file"
+	vnet "github.com/vdaas/vald/internal/net"
+	"github.com/vdaas/vald/internal/test"
+	vtls "github.com/vdaas/vald/internal/tls"
+	"google.golang.org/grpc"
+	"google.golang.org/grpc/credentials"
 )

3-22: No changes required — internal/net exports Listen and TCP.

internal/net/net.go defines Listen = net.Listen and the TCP NetworkType constant, so importing internal/net as net is fine.


24-99: Blocker: serverStarter never returns; replace with minimal in‑process gRPC TLS server.

ListenAndServe is called synchronously and blocks. Also, binding to a port, closing it, then reusing creates TOCTOU races. Start gRPC directly on a real listener and return immediately.

-func serverStarter(b *testing.B, hotReloadEnabled bool) (ctx context.Context, stop context.CancelFunc, addr string) {
-	b.Helper()
-	// TODO start server and return close func
-	ctx, stop = context.WithCancel(b.Context())
-
-	// Bind to an ephemeral port to obtain a free port, close it, then start the server on that port.
-	ln, err := net.Listen(net.TCP.String(), "127.0.0.1:0")
-	if err != nil {
-		b.Fatalf("listen: %v", err)
-	}
-	host, port, _ := net.SplitHostPort(ln.Addr().String())
-	_ = ln.Close()
-	addr = net.JoinHostPort(host, port)
-
-	srv, err := starter.New(
-		starter.WithConfig((&config.Servers{
-			TLS: func() *config.TLS {
-				if hotReloadEnabled {
-					return &config.TLS{
-						// write hot reload version here
-						Cert:               test.GetTestdataPath("tls/server.crt"),
-						Key:                test.GetTestdataPath("tls/server.key"),
-						ClientAuth:         "noclientcert",
-						InsecureSkipVerify: false,
-						ServerName:         "vald.vdaas.org",
-						// HotReload:          true,
-					}
-				}
-				return &config.TLS{
-					// write something here
-					Cert:               test.GetTestdataPath("tls/server.crt"),
-					Key:                test.GetTestdataPath("tls/server.key"),
-					ClientAuth:         "noclientcert",
-					InsecureSkipVerify: false,
-					ServerName:         "vald.vdaas.org",
-					// HotReload:        false,
-				}
-			}(),
-			// configure this
-			Servers: []*config.Server{{
-				Name: "bench-grpc",
-				Mode: server.GRPC.String(),
-				Host: host,
-				Port: port,
-				GRPC: &config.GRPC{},
-			}},
-		}).Bind()), // set server configuration here
-		starter.WithGRPC(func(sc *config.Server) []server.Option {
-			// Build server TLS via internal/tls (toggle hot-reload here since config.TLS does not expose the flag).
-			stls, serr := tls.NewServerConfig(
-				tls.WithCert(test.GetTestdataPath("tls/server.crt")),
-				tls.WithKey(test.GetTestdataPath("tls/server.key")),
-				tls.WithClientAuth("noclientcert"),
-				tls.WithServerCertHotReload(hotReloadEnabled),
-			)
-			opts := []server.Option{
-				server.WithGRPCRegistFunc(func(s *grpc.Server) {
-					vald.RegisterIndexServer(s, new(mockIndexInfoServer))
-				}),
-			}
-			if serr == nil {
-				opts = append(opts, server.WithGRPCOption(grpc.Creds(credentials.NewTLS(stls))))
-			} else {
-				// Keep server running without TLS if building TLS failed (adjust to b.Error if desired).
-				b.Logf("failed to build TLS config for server: %v", serr)
-			}
-			return opts
-		}),
-	)
-	if err != nil {
-		b.Error(err)
-	}
-	_ = srv.ListenAndServe(ctx)
-
-	return ctx, stop, addr
-}
+func serverStarter(b *testing.B, hotReloadEnabled bool) (ctx context.Context, stop context.CancelFunc, addr string) {
+	b.Helper()
+	ctx, cancel := context.WithCancel(b.Context())
+
+	lis, err := net.Listen("tcp", "127.0.0.1:0")
+	if err != nil {
+		b.Fatalf("listen: %v", err)
+	}
+	addr = lis.Addr().String()
+
+	// Prepare active certificate path in a temp dir to avoid mutating repository files.
+	tmpDir := b.TempDir()
+	active := filepath.Join(tmpDir, "server_active.crt")
+	if _, err := file.CopyFile(ctx, test.GetTestdataPath("tls/server.crt"), active); err != nil {
+		b.Fatalf("init cert: %v", err)
+	}
+
+	scfg, err := vtls.NewServerConfig(
+		vtls.WithCert(active),
+		vtls.WithKey(test.GetTestdataPath("tls/server.key")),
+		vtls.WithClientAuth("noclientcert"),
+		vtls.WithServerCertHotReload(hotReloadEnabled),
+	)
+	if err != nil {
+		b.Fatalf("tls server config: %v", err)
+	}
+
+	s := grpc.NewServer(grpc.Creds(credentials.NewTLS(scfg)))
+	vald.RegisterIndexServer(s, new(mockIndexInfoServer))
+
+	go func() { _ = s.Serve(lis) }()
+	stop = func() {
+		s.GracefulStop()
+		_ = lis.Close()
+		cancel()
+	}
+	return ctx, stop, addr
+}

Comment on lines 101 to 139
func reloadTLSCerts(b *testing.B) (stop context.CancelFunc) {
b.Helper()

var ctx context.Context
ctx, stop = context.WithCancel(b.Context())
eg, egctx := errgroup.New(ctx)
eg.Go(safety.RecoverFunc(func() error {
tick := time.NewTicker(time.Second * 5)
// TODO: reload certs periodically
for {
select {
case <-egctx.Done():
return nil
case <-tick.C:
// reload here using internal/file package
// Read server.crt and server2.crt alternately, and overwrite the same destination path (no rename/copy).
dst := test.GetTestdataPath("tls/server.crt")
srcA := test.GetTestdataPath("tls/server.crt")
srcB := test.GetTestdataPath("tls/server2.crt")
if !file.Exists(srcB) {
continue
}
next := srcB
if cur, err := file.ReadFile(dst); err == nil {
if a, err := file.ReadFile(srcA); err == nil && bytes.Equal(cur, a) {
next = srcB
} else {
next = srcA
}
}
if bs, err := file.ReadFile(next); err == nil {
_, _ = file.WriteFile(egctx, dst, bytes.NewReader(bs), 0o600)
}
}
}
}))

return stop
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Don’t mutate repo testdata; simplify rotation and avoid partial writes.

Current code overwrites internal/test/data certs, lacks ticker stop, and risks partial writes. Rotate into a temp “active” cert file via copy, and run without errgroup/safety for a benchmark helper.

-func reloadTLSCerts(b *testing.B) (stop context.CancelFunc) {
-	b.Helper()
-
-	var ctx context.Context
-	ctx, stop = context.WithCancel(b.Context())
-	eg, egctx := errgroup.New(ctx)
-	eg.Go(safety.RecoverFunc(func() error {
-		tick := time.NewTicker(time.Second * 5)
-		// TODO: reload certs periodically
-		for {
-			select {
-			case <-egctx.Done():
-				return nil
-			case <-tick.C:
-				// reload here using internal/file package
-				// Read server.crt and server2.crt alternately, and overwrite the same destination path (no rename/copy).
-				dst := test.GetTestdataPath("tls/server.crt")
-				srcA := test.GetTestdataPath("tls/server.crt")
-				srcB := test.GetTestdataPath("tls/server2.crt")
-				if !file.Exists(srcB) {
-					continue
-				}
-				next := srcB
-				if cur, err := file.ReadFile(dst); err == nil {
-					if a, err := file.ReadFile(srcA); err == nil && bytes.Equal(cur, a) {
-						next = srcB
-					} else {
-						next = srcA
-					}
-				}
-				if bs, err := file.ReadFile(next); err == nil {
-					_, _ = file.WriteFile(egctx, dst, bytes.NewReader(bs), 0o600)
-				}
-			}
-		}
-	}))
-
-	return stop
-}
+func reloadTLSCerts(b *testing.B) (stop context.CancelFunc) {
+	b.Helper()
+
+	ctx, cancel := context.WithCancel(b.Context())
+	stop = cancel
+
+	// Source certs to alternate.
+	srcA := test.GetTestdataPath("tls/server.crt")
+	srcB := test.GetTestdataPath("tls/server2.crt")
+	if !file.Exists(srcB) {
+		// no rotation available
+		return stop
+	}
+
+	// Discover the active cert path from serverStarter’s temp dir by looking up the latest one.
+	// If you keep it in a var, prefer wiring it in; otherwise, compute alongside serverStarter and capture it in a closure.
+	activeFinder := func() string {
+		// Replace with the active path you used in serverStarter (e.g., pass it back via a channel or store in a package var).
+		// For now, assume the active file is under the benchmark's TempDir with name server_active.crt.
+		// b.TempDir() can’t be called here; wire the path from serverStarter for correctness.
+		return "" // TODO: pass the active path from serverStarter
+	}
+
+	go func() {
+		tick := time.NewTicker(200 * time.Millisecond)
+		defer tick.Stop()
+		useA := false
+		for {
+			select {
+			case <-ctx.Done():
+				return
+			case <-tick.C:
+				dst := activeFinder()
+				if dst == "" {
+					continue
+				}
+				var src string
+				if useA {
+					src = srcA
+				} else {
+					src = srcB
+				}
+				useA = !useA
+				_, _ = file.CopyFile(ctx, src, dst)
+			}
+		}
+	}()
+	return stop
+}

Add a package-level var to wire the active cert path from serverStarter:

// add near the imports
var activeCertPath string

And in serverStarter (after computing ‘active’): set ‘activeCertPath = active’.

@github-actions github-actions bot added size/L and removed size/L labels Sep 26, 2025
@kpango
Copy link
Collaborator

kpango commented Sep 26, 2025

/format

@vdaas-ci
Copy link
Collaborator

[FORMAT] Updating license headers and formatting go codes triggered by kpango.

@github-actions github-actions bot added size/L and removed size/L labels Sep 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants