Skip to content

Conversation

michaelawyu
Copy link
Member

Description of your changes

This PR adds additional integration tests for verifying the work applier exponential backoff behavior.

Fixes #

I have:

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

N/A

Special notes for your reviewer

N/A

Signed-off-by: michaelawyu <[email protected]>
// Track the observation timestamp of the diff details.
lastDiffObservedTime = &work.Status.ManifestConditions[0].DiffDetails.ObservationTime
return nil
}, eventuallyDuration, eventuallyInterval).Should(Succeed(), "Failed to update Work object status")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eventuallyInterval is 1 second, I wonder if it will make it flaky?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ryan! These tests run with a special exponential backoff setup (10 sec delay first, then 20 secs with 1.5x slow backoff, and eventually 30 seconds with 2x fast backoff). The results are expected to stay the same for at least 10 seconds after first apply, so it should be alright.

// Need to poll a bit longer to avoid flakiness.
Eventually(diffObservationTimeChangedActual(workName, wantWorkStatus, curDiffObservedTime, lastDiffObservedTime), eventuallyDuration*2, eventuallyInterval).Should(Or(Succeed()))

// Fixed delay is set to 10 seconds. Give a one-sec leeway to avoid flakiness.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was 5 sec? Where is 20 second set?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ryan! Yeah, for the actual setup the default value is 5 seconds; this one has to use a longer one as the default values (not the initial 5 second part, but the slow/fast backoffs do not play well with our integration tests -> can be flaky sometimes).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ryan! I've pinpointed the test-specific setup in the comment down below.

Expect(err).ToNot(HaveOccurred())

workApplier = NewReconciler(
superLongExponentialBackoffRateLimiter := NewRequeueMultiStageWithExponentialBackoffRateLimiter(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test specific setup is added here; PTAL.

@michaelawyu
Copy link
Member Author

Resolved the conflict.

Signed-off-by: michaelawyu <[email protected]>
Copy link

codecov bot commented Aug 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

michaelawyu added a commit to michaelawyu/kubefleet that referenced this pull request Sep 9, 2025
Signed-off-by: michaelawyu <[email protected]>
## workaround to bypass the pkg/controllers/workv1alpha1 tests failure
## rollout controller tests need a bit longer to complete, so we increase the timeout
##
# Set up the timeout parameters as some of the test lengths have exceeded the default 10 minute mark.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is the timeout set in this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ryan! A previous PR has already set the timeout to 20 mins, so the changes have been absolved.

Copy link
Member Author

@michaelawyu michaelawyu Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default is 10 minutes, #182 has set it to 20 minutes and got merged before this PR

StopTrying("the work object status has changed unexpectedly").Wrap(fmt.Errorf("work status diff (-got, +want):\n%s", diff)).Now()
}

curDriftObservedTime.Time = work.Status.ManifestConditions[0].DriftDetails.ObservationTime.Time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why pass in the curDriftObservedTime when its value is set inside the function. Looks like this is more of an "out" parameter or return value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ryan! Some of test steps later verify the currently observed timestamp:

It("should wait for the fixed delay period of time before next reconciliation", func() {
			curDiffObservedTime := &metav1.Time{}

			// Need to poll a bit longer to avoid flakiness.
			Eventually(diffObservationTimeChangedActual(workName, wantWorkStatus, curDiffObservedTime, lastDiffObservedTime), eventuallyDuration*2, eventuallyInterval).Should(Or(Succeed()))

			// Fixed delay is set to 10 seconds. Give a one-sec leeway to avoid flakiness.
			Expect(curDiffObservedTime.Sub(lastDiffObservedTime.Time)).To(And(
				BeNumerically(">=", time.Second*9),
				BeNumerically("<=", time.Second*11),
			), "the interval between two observations is not as expected")

			// Update the tracked observation time.
			lastDiffObservedTime = curDiffObservedTime
		})

You are right that it works more like a returned result rather than an passed-in arguement, but the Ginkgo Eventually blocks make it a bit difficult to format that way 😞

@michaelawyu michaelawyu merged commit 93905a0 into kubefleet-dev:main Sep 22, 2025
27 of 30 checks passed
@michaelawyu
Copy link
Member Author

Merged the PR to unblock progress -> can open new PRs if changes are needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants