Skip to content

Conversation

pnilan
Copy link
Contributor

@pnilan pnilan commented Sep 15, 2025

What

After exhausting request attempts, the backoff exception handled by the retry decorator in HttpClient._send_with_retry would bubble up w/o a defined failure type. This would cause backoff exceptions that exhaust the all retries to be categorized as FailureType.system_error. This PR adds exception handling in the _send_with_retry method to catch backoff exceptions after all request attempts are exhausted and re-raise as an AirbyteTracedException w/ the failure_type defined in the ErrorResolution that is returned from the error handler's interpret_response method. If failure_type is None then defaults to system_error.

  • Updates the BaseBackoffException, DefaultBackoffException, UserDefinedBackoffException, and RateLimitedBackoffException to take an optional failure_type property.
  • Additionally, I updated the default status error mapping for clearer error messaging.

Impact

  • Fewer non-actionable pages/sentry alerts to the Maintain Team.
  • Clearer error messaging in Cloud UI

Recommended Reading Order

  • http_client.py
  • exceptions.py
  • Tests

Summary by CodeRabbit

  • Improvements

    • HTTP error messages now include status codes and an "Error:" prefix for clearer diagnostics.
    • Connection checks report unavailable streams including the stream name for easier troubleshooting.
    • Exhausted retries now raise a traced error, log contextual details, and preserve failure classification for accurate handling.
    • Token refresh backoff now classifies transient errors explicitly.
  • Bug Fixes

    • Standardized user-facing error text across HTTP handlers for consistency.
  • Tests

    • Unit tests updated to expect revised messages and traced-exception behavior.

@github-actions github-actions bot added the bug Something isn't working label Sep 15, 2025
Copy link

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@pnilan/fix/update-backoff-exception-handling#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch pnilan/fix/update-backoff-exception-handling

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment

📝 Edit this welcome message.

Copy link

github-actions bot commented Sep 15, 2025

PyTest Results (Fast)

3 756 tests  +9   3 744 ✅ +9   6m 11s ⏱️ -8s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 0462263. ± Comparison against base commit 1b42145.

This pull request removes 4 and adds 13 tests. Note that renamed tests count towards both.
unit_tests.sources.declarative.requesters.error_handlers.test_default_error_handler ‑ test_default_error_handler_with_custom_response_filter[_with_http_response_status_400_fail_with_default_failure_type-400-test_response_filter0-ResponseAction.RETRY-FailureType.system_error-Bad request. Please check your request parameters.]
unit_tests.sources.declarative.requesters.error_handlers.test_default_error_handler ‑ test_default_error_handler_with_custom_response_filter[_with_http_response_status_403_fail_with_default_failure_type-403-test_response_filter2-ResponseAction.FAIL-FailureType.config_error-Forbidden. You don't have permission to access this resource.]
unit_tests.sources.streams.http.error_handlers.test_http_status_error_handler ‑ test_given_error_code_in_response_http_status_error_handler_returns_expected_actions[403-ResponseAction.FAIL-FailureType.config_error-Forbidden. You don't have permission to access this resource.]
unit_tests.sources.streams.http.error_handlers.test_http_status_error_handler ‑ test_given_error_code_in_response_http_status_error_handler_returns_expected_actions[404-ResponseAction.FAIL-FailureType.system_error-Not found. The requested resource was not found on the server.]
unit_tests.sources.declarative.requesters.error_handlers.test_default_error_handler ‑ test_default_error_handler_with_custom_response_filter[_with_http_response_status_400_fail_with_default_failure_type-400-test_response_filter0-ResponseAction.RETRY-FailureType.system_error-HTTP Status Code: 400. Error: Bad request. Please check your request parameters.]
unit_tests.sources.declarative.requesters.error_handlers.test_default_error_handler ‑ test_default_error_handler_with_custom_response_filter[_with_http_response_status_403_fail_with_default_failure_type-403-test_response_filter2-ResponseAction.FAIL-FailureType.config_error-HTTP Status Code: 403. Error: Forbidden. You don't have permission to access this resource.]
unit_tests.sources.streams.http.error_handlers.test_http_status_error_handler ‑ test_given_error_code_in_response_http_status_error_handler_returns_expected_actions[403-ResponseAction.FAIL-FailureType.config_error-HTTP Status Code: 403. Error: Forbidden. You don't have permission to access this resource.]
unit_tests.sources.streams.http.error_handlers.test_http_status_error_handler ‑ test_given_error_code_in_response_http_status_error_handler_returns_expected_actions[404-ResponseAction.FAIL-FailureType.system_error-HTTP Status Code: 404. Error: Not found. The requested resource was not found on the server.]
unit_tests.sources.streams.http.test_http_client ‑ test_send_with_retry_raises_airbyte_traced_exception_with_failure_type[400-FailureType.system_error-test error message-DefaultBackoffException]
unit_tests.sources.streams.http.test_http_client ‑ test_send_with_retry_raises_airbyte_traced_exception_with_failure_type[400-FailureType.system_error-test error message-RateLimitBackoffException]
unit_tests.sources.streams.http.test_http_client ‑ test_send_with_retry_raises_airbyte_traced_exception_with_failure_type[400-FailureType.system_error-test error message-UserDefinedBackoffException]
unit_tests.sources.streams.http.test_http_client ‑ test_send_with_retry_raises_airbyte_traced_exception_with_failure_type[401-FailureType.config_error-test error message-DefaultBackoffException]
unit_tests.sources.streams.http.test_http_client ‑ test_send_with_retry_raises_airbyte_traced_exception_with_failure_type[401-FailureType.config_error-test error message-RateLimitBackoffException]
unit_tests.sources.streams.http.test_http_client ‑ test_send_with_retry_raises_airbyte_traced_exception_with_failure_type[401-FailureType.config_error-test error message-UserDefinedBackoffException]
…

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Sep 15, 2025

PyTest Results (Full)

3 759 tests   3 747 ✅  10m 57s ⏱️
    1 suites     12 💤
    1 files        0 ❌

Results for commit 0462263.

♻️ This comment has been updated with latest results.

@pnilan pnilan changed the title fix(http_client): transient_error is raised after exhausting request attempts and failing w/ backoff exception feat(http_client): transient_error is raised after exhausting request attempts and failing w/ backoff exception Sep 15, 2025
@pnilan pnilan removed the bug Something isn't working label Sep 15, 2025
@pnilan pnilan marked this pull request as ready for review September 15, 2025 23:17
@Copilot Copilot AI review requested due to automatic review settings September 15, 2025 23:17
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves error handling in the HTTP client by ensuring backoff exceptions that exhaust all retry attempts are categorized as transient_error instead of system_error, and enhances error message clarity by including HTTP status codes.

  • Adds exception handling for BaseBackoffException in _send_with_retry to catch exhausted retries and re-raise as MessageRepresentationAirbyteTracedErrors with transient_error
  • Updates default HTTP status error messages to include status codes for better debugging
  • Updates tests to expect the new exception type instead of raw backoff exceptions

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
airbyte_cdk/sources/streams/http/http_client.py Adds try-catch block in _send_with_retry to handle exhausted backoff exceptions
airbyte_cdk/sources/streams/http/error_handlers/default_error_mapping.py Updates error messages to include HTTP status codes
unit_tests/sources/streams/http/test_http.py Updates test expectations to catch new exception type and adds retry configuration
unit_tests/sources/streams/http/error_handlers/test_http_status_error_handler.py Updates expected error messages to include status codes
unit_tests/sources/declarative/requesters/test_http_requester.py Updates test expectations for new exception handling
unit_tests/sources/declarative/requesters/error_handlers/test_http_response_filter.py Updates expected error messages in test assertions
unit_tests/sources/declarative/requesters/error_handlers/test_default_error_handler.py Updates expected error messages to match new format
airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py Minor code refactoring to extract message variable

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@github-actions github-actions bot added the enhancement New feature or request label Sep 15, 2025
Copy link
Contributor

coderabbitai bot commented Sep 15, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

HttpClient now raises traced Airbyte errors when backoff retries are exhausted and preserves failure_type through backoff exceptions; HTTP error messages were prefixed with status codes; declarative dynamic-stream availability now composes/logs a stream-specific unavailable message; unit tests updated accordingly.

Changes

Cohort / File(s) Summary
Declarative availability check
airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py
When a stream is unavailable, compose message = f"Stream {stream.name} is not available: {reason}", log it, and return (False, message) instead of returning the raw reason.
HTTP error mapping
airbyte_cdk/sources/streams/http/error_handlers/default_error_mapping.py
Prepended HTTP Status Code: <code>. Error: to error_message strings for status codes (400, 401, 403, 404, 405, 408, 429, 500, 502, 503, 504).
HttpClient retry/backoff behavior
airbyte_cdk/sources/streams/http/http_client.py
Catch BaseBackoffException from backoff wrappers, log exhaustion, and raise a traced Airbyte error (MessageRepresentationAirbyteTracedErrors) including failure_type and StreamDescriptor instead of returning a Response. Propagate failure_type when constructing backoff exceptions in error-resolution paths.
Backoff exceptions
airbyte_cdk/sources/streams/http/exceptions.py
BaseBackoffException.__init__ and UserDefinedBackoffException.__init__ now accept optional failure_type: Optional[FailureType] and store/forward it.
OAuth token refresh backoff
airbyte_cdk/sources/streams/http/requests_native_auth/abstract_oauth.py
On 429/5xx during token refresh, raise DefaultBackoffException(..., failure_type=FailureType.transient_error) to classify transient errors explicitly.
Unit tests — error message updates
unit_tests/.../error_handlers/test_default_error_handler.py, unit_tests/.../error_handlers/test_http_response_filter.py, unit_tests/.../error_handlers/test_http_status_error_handler.py, unit_tests/.../test_availability_strategy.py
Updated expected error_message strings to include HTTP Status Code: {code}. Error: ... prefixes where applicable.
Unit tests — backoff/traced error expectations & wiring
unit_tests/.../test_http_requester.py, unit_tests/.../async_job/test_job_orchestrator.py, unit_tests/.../streams/http/test_http.py, unit_tests/.../streams/http/test_http_client.py
Tests updated to expect AirbyteTracedException on exhausted retries; wired HttpClient into test streams (added APIBudget/MessageRepository mocks); added AutoFailTrueHttpStream.__init__ to default disable_retries=True; added parametrized test verifying failure_type propagation.

Sequence Diagram(s)

sequenceDiagram
    actor Stream
    participant HttpClient
    participant Backoff as BackoffWrappers
    participant Logger
    participant Traced as AirbyteTracedException

    Note over Stream,HttpClient: Request with retry/backoff
    Stream->>HttpClient: send request
    HttpClient->>Backoff: invoke wrapped attempts
    alt Success
        Backoff-->>HttpClient: HTTP Response
        HttpClient-->>Stream: return Response
    else Exhausted (BaseBackoffException)
        Backoff--x HttpClient: raises BaseBackoffException (contains failure_type)
        HttpClient->>Logger: error "Retries exhausted..." (exc_info=True)
        HttpClient-->>Traced: raise traced error (internal/message, failure_type, stream descriptor)
        Traced--x Stream: exception propagated
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • maxi297
  • bnchrch

Want a short checklist of tests focused on traced-error propagation and the updated error-message formatting, wdyt?

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 10.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title accurately and succinctly captures the primary change: HttpClient now surfaces/preserves the appropriate failure_type when backoff retries are exhausted and a backoff exception occurs. It names the impacted component (http_client) and describes the core behavior change, so a reviewer scanning history can understand the main intent. The phrasing is clear and not noisy or misleading.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch pnilan/fix/update-backoff-exception-handling

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (1)

52-60: Guard against method-vs-property name usage and unbound stream in except.

Some stream types expose name() (method) while others expose name (property). Also, the except block can reference stream before assignment if the failure happens pre-iteration. Can we normalize the stream name and make the except safe, wdyt?

-                    message = f"Stream {stream.name} is not available: {reason}"
-                    logger.warning(message)
-                    return False, message
+                    stream_name = (
+                        stream.name() if callable(getattr(stream, "name", None)) else getattr(stream, "name", type(stream).__name__)
+                    )
+                    message = f"Stream {stream_name} is not available: {reason}"
+                    logger.warning(message)
+                    return False, message
@@
-            error_message = (
-                f"Encountered an error trying to connect to stream {stream.name}. Error: {error}"
-            )
+            stream_name = (
+                stream.name() if "stream" in locals() and callable(getattr(stream, "name", None))
+                else (getattr(stream, "name", type(stream).__name__) if "stream" in locals() else "<unknown>")
+            )
+            error_message = f"Encountered an error trying to connect to stream {stream_name}. Error: {error}"
♻️ Duplicate comments (1)
unit_tests/sources/streams/http/test_http.py (1)

575-577: Forwarding kwargs and forcing disable_retries=True looks good

This addresses the prior feedback about passing kwargs up to the parent. LGTM.

🧹 Nitpick comments (7)
airbyte_cdk/sources/streams/http/error_handlers/default_error_mapping.py (1)

31-85: Unify error-message prefix across handlers for consistency.

These messages now use “HTTP Status Code: ...”, while HttpStatusErrorHandler tests expect “Status Code: ...”. For operator UX and grep-ability, do you want to standardize on one prefix across handlers and tests (I'd lean “HTTP Status Code” everywhere), wdyt?

unit_tests/sources/declarative/requesters/test_http_requester.py (2)

883-885: LGTM – tests now expect transient traced error on exhaustion.

Asserting the raised type is appropriate; do you also want to assert .failure_type == transient_error for completeness, wdyt?


909-911: LGTM – exponential backoff exhaustion path covered.

Same optional check for failure_type could add signal.

unit_tests/sources/streams/http/test_http.py (4)

49-63: Pop consumed kwargs before calling super().init to avoid leaking test-only args

If a caller passes api_budget or message_repository, they’ll still be forwarded to HttpStream.init, which likely doesn’t accept them and could raise a TypeError. Can we pop those (like disable_retries) before super().init and pass the popped values into HttpClient, wdyt?

 def __init__(self, deduplicate_query_params: bool = False, **kwargs):
-        disable_retries = False
-        if "disable_retries" in kwargs:
-            disable_retries = kwargs.pop("disable_retries")
+        disable_retries = kwargs.pop("disable_retries", False)
+        # Avoid leaking test-only kwargs to HttpStream.__init__
+        api_budget = kwargs.pop("api_budget", Mock(spec=APIBudget))
+        message_repository = kwargs.pop("message_repository", Mock(spec=MessageRepository))
         super().__init__(**kwargs)
         self._http_client = HttpClient(
             name=self.name,
             logger=self.logger,
             error_handler=self.get_error_handler(),
-            api_budget=kwargs.get("api_budget", Mock(spec=APIBudget)),
-            authenticator=kwargs.get("authenticator", None),
+            api_budget=api_budget,
+            authenticator=kwargs.get("authenticator", None),
             use_cache=self.use_cache,
             backoff_strategy=self.get_backoff_strategy(),
-            message_repository=kwargs.get("message_repository", Mock(spec=MessageRepository)),
+            message_repository=message_repository,
             disable_retries=disable_retries,
         )

193-196: Also assert the failure_type is transient_error?

Since the intent of this PR is to reclassify as transient_error on exhausted retries, asserting that here would harden the test, wdyt?

-    with pytest.raises(MessageRepresentationAirbyteTracedErrors):
-        list(stream.read_records(SyncMode.full_refresh))
+    with pytest.raises(MessageRepresentationAirbyteTracedErrors) as exc:
+        list(stream.read_records(SyncMode.full_refresh))
+    assert exc.value.failure_type == FailureType.transient_error

310-314: Make the matcher less brittle and assert failure_type as well

Exact-message matching is fragile. Could we match on key fragments and verify failure_type, wdyt?

-    with pytest.raises(
-        MessageRepresentationAirbyteTracedErrors,
-        match="Exhausted available request attempts. Please see logs for more details. Exception: HTTP Status Code: 429. Error: Too many requests.",
-    ):
+    with pytest.raises(
+        MessageRepresentationAirbyteTracedErrors,
+        match=r"Exhausted available request attempts\..*HTTP Status Code:\s*429\b.*Too many requests",
+    ) as exc:
         stream.exit_on_rate_limit = True
         list(stream.read_records(SyncMode.full_refresh))
+    assert exc.value.failure_type == FailureType.transient_error

326-329: Also assert transient_error on exhausted 5xx retries?

This would validate the classification change across 5xx as well, wdyt?

-    with pytest.raises(MessageRepresentationAirbyteTracedErrors):
-        list(stream.read_records(SyncMode.full_refresh))
+    with pytest.raises(MessageRepresentationAirbyteTracedErrors) as exc:
+        list(stream.read_records(SyncMode.full_refresh))
+    assert exc.value.failure_type == FailureType.transient_error
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b42145556698a138010f5435c26fd6b4bd5ddd9 and e699bb1d37e58c8ac4983f0c041e4b21bef5348e.

📒 Files selected for processing (8)
  • airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (1 hunks)
  • airbyte_cdk/sources/streams/http/error_handlers/default_error_mapping.py (1 hunks)
  • airbyte_cdk/sources/streams/http/http_client.py (3 hunks)
  • unit_tests/sources/declarative/requesters/error_handlers/test_default_error_handler.py (2 hunks)
  • unit_tests/sources/declarative/requesters/error_handlers/test_http_response_filter.py (3 hunks)
  • unit_tests/sources/declarative/requesters/test_http_requester.py (4 hunks)
  • unit_tests/sources/streams/http/error_handlers/test_http_status_error_handler.py (1 hunks)
  • unit_tests/sources/streams/http/test_http.py (9 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
unit_tests/sources/streams/http/error_handlers/test_http_status_error_handler.py (2)
airbyte_cdk/sources/streams/http/error_handlers/response_models.py (1)
  • ResponseAction (14-19)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • FailureType (743-746)
airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (2)
airbyte_cdk/sources/streams/http/http_client.py (1)
  • name (530-531)
airbyte_cdk/legacy/sources/declarative/declarative_stream.py (2)
  • name (93-97)
  • name (100-102)
airbyte_cdk/sources/streams/http/error_handlers/default_error_mapping.py (2)
airbyte_cdk/sources/streams/http/error_handlers/response_models.py (2)
  • ErrorResolution (23-26)
  • ResponseAction (14-19)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • FailureType (743-746)
airbyte_cdk/sources/streams/http/http_client.py (1)
airbyte_cdk/sources/streams/http/exceptions.py (1)
  • BaseBackoffException (11-26)
unit_tests/sources/declarative/requesters/test_http_requester.py (1)
airbyte_cdk/sources/streams/http/http_client.py (1)
  • MessageRepresentationAirbyteTracedErrors (82-96)
unit_tests/sources/streams/http/test_http.py (3)
airbyte_cdk/sources/message/repository.py (1)
  • MessageRepository (45-60)
airbyte_cdk/sources/streams/call_rate.py (1)
  • APIBudget (513-627)
airbyte_cdk/sources/streams/http/http_client.py (4)
  • HttpClient (99-567)
  • MessageRepresentationAirbyteTracedErrors (82-96)
  • name (530-531)
  • send_request (533-567)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Check: source-shopify
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (9)
unit_tests/sources/declarative/requesters/error_handlers/test_default_error_handler.py (2)

95-95: LGTM – assertions updated to new message shape.

Matches DEFAULT_ERROR_MAPPING change.


121-121: LGTM – assertion reflects 403 mapped text.

Consistent with mapping and handler behavior.

unit_tests/sources/streams/http/error_handlers/test_http_status_error_handler.py (2)

37-37: LGTM – expected 403 message updated.

Aligned with HttpStatusErrorHandler format.


43-43: LGTM – expected 404 message updated.

Consistent and precise.

unit_tests/sources/declarative/requesters/error_handlers/test_http_response_filter.py (3)

47-47: LGTM – ignore-action message now matches mapping.


62-62: LGTM – retry-action message now matches mapping.


108-108: LGTM – predicate case updated to new mapped 403 message.

unit_tests/sources/declarative/requesters/test_http_requester.py (2)

42-42: LGTM – switch to MessageRepresentationAirbyteTracedErrors.

This aligns with the new transient-error path on retry exhaustion.


940-942: LGTM – manifest-provided backoff respected and surfaced as transient.

Covers config-based backoff path nicely.

pnilan and others added 2 commits September 16, 2025 07:55
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
unit_tests/sources/streams/http/test_http.py (1)

213-221: This test will now fail; update to expect AirbyteTracedException and drop request/response asserts

Given HttpClient now converts backoff exhaustion into a traced error, this should expect AirbyteTracedException (not UserDefinedBackoffException) and the request/response attributes won’t exist on that exception. Shall we adjust, wdyt?

-    with pytest.raises(UserDefinedBackoffException, match="Too many requests") as excinfo:
+    with pytest.raises(AirbyteTracedException, match="Too many requests"):
         list(stream.read_records(SyncMode.full_refresh))
-    assert isinstance(excinfo.value.request, requests.PreparedRequest)
-    assert isinstance(excinfo.value.response, requests.Response)
♻️ Duplicate comments (1)
unit_tests/sources/streams/http/test_http.py (1)

604-616: Assertions inside pytest.raises are unreachable; move them outside and assert on the logged message

This mirrors an earlier review—those lines won’t run after the raise, and HttpClient stores the logger on _logger. Can we restructure like below, wdyt?

-    with pytest.raises(AirbyteTracedException):
-        _, response = stream._http_client.send_request(
-            "GET", "https://g", {}, exit_on_rate_limit=True
-        )
-        stream._http_client.logger.error.assert_called_with(response.text)
-        assert response.status_code == status_code
+    with pytest.raises(AirbyteTracedException):
+        stream._http_client.send_request("GET", "https://g", {}, exit_on_rate_limit=True)
+    logged_msg = stream._http_client._logger.error.call_args[0][0]
+    assert f"HTTP Status Code: {status_code}" in logged_msg
🧹 Nitpick comments (6)
unit_tests/sources/declarative/async_job/test_job_orchestrator.py (1)

246-248: Assert the failure_type to lock in the contract

Nice switch to AirbyteTracedException. Could we also assert the exception’s failure_type is config_error to harden the test, wdyt?

-        with pytest.raises(AirbyteTracedException):
+        with pytest.raises(AirbyteTracedException) as excinfo:
             list(orchestrator.create_and_get_completed_partitions())
+        assert excinfo.value.failure_type == FailureType.config_error
unit_tests/sources/declarative/requesters/test_http_requester.py (3)

32-32: Imports cleanup LGTM; consider import path consistency

LGTM on trimming imports and moving to AirbyteTracedException. For consistency with other tests (e.g., test_http.py), do you want to import it via airbyte_cdk.utils as well, wdyt?

Also applies to: 37-37


877-879: Capture and assert FailureType on retry exhaustion

Given the new behavior, can we assert that exhausted retries surface as transient_error to prevent regressions, wdyt?

-    with pytest.raises(AirbyteTracedException):
+    with pytest.raises(AirbyteTracedException) as excinfo:
         http_requester._http_client._send_with_retry(request=request_mock, request_kwargs={})
+    assert excinfo.value.failure_type.name == "transient_error"

Add the missing import once at the top of the file:

from airbyte_cdk.models import FailureType  # for stronger type assertions if desired

Also applies to: 903-905, 934-936


944-963: Patch the concrete session instance instead of the base class

Monkey‑patching requests.Session.send may miss calls if LimiterSession overrides send. Would you switch to patching requester._http_client._session.send directly to avoid surprises, wdyt?

-    send_mock = MagicMock(return_value=dummy_response)
-    monkeypatch.setattr(requests.Session, "send", send_mock)
+    send_mock = MagicMock(return_value=dummy_response)
+    monkeypatch.setattr(requester._http_client._session, "send", send_mock, raising=True)
unit_tests/sources/streams/http/test_http.py (2)

189-192: Expectation updated to AirbyteTracedException: consider asserting FailureType

Switch to AirbyteTracedException for 429 exhaustion looks right. Do you also want to assert failure_type is transient_error to cement intent, wdyt?

-    with pytest.raises(AirbyteTracedException):
+    with pytest.raises(AirbyteTracedException) as excinfo:
         list(stream.read_records(SyncMode.full_refresh))
+    assert getattr(excinfo.value, "failure_type").name == "transient_error"

978-981: Bug: set literal instead of dict in test payload

This returns a set, not a dict, and will break consumers expecting mappings. Shall we fix it, wdyt?

-        return [
-            {"id": "abc", "parent": stream_slice.get("id")},
-            {"id", "def", "parent", stream_slice.get("id")},
-        ]
+        return [
+            {"id": "abc", "parent": stream_slice.get("id")},
+            {"id": "def", "parent": stream_slice.get("id")},
+        ]
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f2939f and 30d51b1.

📒 Files selected for processing (3)
  • unit_tests/sources/declarative/async_job/test_job_orchestrator.py (1 hunks)
  • unit_tests/sources/declarative/requesters/test_http_requester.py (4 hunks)
  • unit_tests/sources/streams/http/test_http.py (11 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
unit_tests/sources/declarative/async_job/test_job_orchestrator.py (2)
airbyte_cdk/sources/declarative/async_job/repository.py (1)
  • start (12-13)
airbyte_cdk/utils/traced_exception.py (1)
  • AirbyteTracedException (25-145)
unit_tests/sources/streams/http/test_http.py (5)
airbyte_cdk/sources/message/repository.py (1)
  • MessageRepository (45-60)
airbyte_cdk/sources/streams/call_rate.py (1)
  • APIBudget (513-627)
airbyte_cdk/sources/streams/http/http_client.py (3)
  • HttpClient (99-567)
  • name (530-531)
  • send_request (533-567)
airbyte_cdk/utils/traced_exception.py (1)
  • AirbyteTracedException (25-145)
airbyte_cdk/sources/streams/http/http.py (4)
  • get_error_handler (287-304)
  • use_cache (101-106)
  • exit_on_rate_limit (82-86)
  • exit_on_rate_limit (89-90)
unit_tests/sources/declarative/requesters/test_http_requester.py (3)
airbyte_cdk/sources/streams/call_rate.py (1)
  • HttpAPIBudget (630-677)
airbyte_cdk/sources/streams/http/exceptions.py (1)
  • RequestBodyException (29-32)
airbyte_cdk/utils/traced_exception.py (1)
  • AirbyteTracedException (25-145)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: Analyze (python)
🔇 Additional comments (3)
unit_tests/sources/streams/http/test_http.py (3)

45-59: Wiring HttpClient into the test stream is solid

Constructing HttpClient explicitly (with optional disable_retries) makes retry/backoff behavior testable. Looks good to me.


571-573: Passing disable_retries through the constructor: nice

Forwarding kwargs and forcing disable_retries=True for this test stream fixes the earlier initialization gap. LGTM.


299-309: Message assertion matches the new format—good

Nice update to assert the clearer “HTTP Status Code: … Error: …” string.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
airbyte_cdk/sources/streams/http/http_client.py (1)

295-313: Evict attempt counter on retry exhaustion to prevent unbounded growth.

When raising after exhaustion, _evict_key(request) isn’t called, so _request_attempt_count retains the key. Can we evict before re‑raising, wdyt?

         except BaseBackoffException as e:
             self._logger.error(f"Retries exhausted with backoff exception.", exc_info=True)
+            # Avoid unbounded growth of _request_attempt_count on exhausted retries
+            try:
+                self._evict_key(request)
+            except Exception:
+                self._logger.debug("Eviction of request attempt counter failed; continuing re-raise.", exc_info=True)
             raise MessageRepresentationAirbyteTracedErrors(
                 internal_message=f"Exhausted available request attempts. Exception: {e}",
                 message=f"Exhausted available request attempts. Please see logs for more details. Exception: {e}",
-                failure_type=e.failure_type or FailureType.system_error,
+                failure_type=e.failure_type or FailureType.transient_error,
                 exception=e,
                 stream_descriptor=StreamDescriptor(name=self._name),
             )
🧹 Nitpick comments (2)
unit_tests/sources/streams/http/test_http_client.py (1)

749-799: Add a missing “no-failure_type” fallback case to lock in desired default behavior.

Could we add a test where the backoff exception has failure_type=None (or the mapping omits it) and assert that _send_with_retry wraps it as an AirbyteTracedException with the default failure type you intend (see my comment in http_client.py), wdyt?

Example addition:

@@
 @pytest.mark.usefixtures("mock_sleep")
 @pytest.mark.parametrize(
-    "response_code, expected_failure_type, error_message, exception_class",
+    "response_code, expected_failure_type, error_message, exception_class",
     [
@@
         (403, FailureType.transient_error, "test error message", RateLimitBackoffException),
+        # No explicit failure_type in mapping/exception → should fall back to transient_error
+        (429, None, "test error message", DefaultBackoffException),
     ],
 )
 def test_send_with_retry_raises_airbyte_traced_exception_with_failure_type(
     response_code, expected_failure_type, error_message, exception_class
 ):
@@
-    assert e.value.failure_type == expected_failure_type
+    # If the mapping provided None, expect the default (transient) per intended behavior
+    assert e.value.failure_type == (expected_failure_type or FailureType.transient_error)
airbyte_cdk/sources/streams/http/http_client.py (1)

14-14: Fix Ruff I001 by consolidating imports from airbyte_cdk.models.

Ruff flagged the import block ordering. Shall we fold FailureType into the multi‑import to satisfy the linter, wdyt?

-from airbyte_cdk.models import FailureType
 from requests.auth import AuthBase

-from airbyte_cdk.models import (
+from airbyte_cdk.models import (
     AirbyteMessageSerializer,
     AirbyteStreamStatus,
     AirbyteStreamStatusReason,
     AirbyteStreamStatusReasonType,
+    FailureType,
     Level,
     StreamDescriptor,
 )

Also applies to: 17-24

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 30d51b1 and b73eb3c.

📒 Files selected for processing (3)
  • airbyte_cdk/sources/streams/http/exceptions.py (2 hunks)
  • airbyte_cdk/sources/streams/http/http_client.py (4 hunks)
  • unit_tests/sources/streams/http/test_http_client.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
unit_tests/sources/streams/http/test_http_client.py (5)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • FailureType (743-746)
airbyte_cdk/sources/streams/http/exceptions.py (3)
  • UserDefinedBackoffException (39-63)
  • DefaultBackoffException (66-79)
  • RateLimitBackoffException (82-95)
airbyte_cdk/sources/streams/http/error_handlers/response_models.py (2)
  • ErrorResolution (23-26)
  • ResponseAction (14-19)
airbyte_cdk/sources/streams/http/error_handlers/http_status_error_handler.py (2)
  • HttpStatusErrorHandler (22-110)
  • max_retries (41-42)
airbyte_cdk/utils/traced_exception.py (1)
  • AirbyteTracedException (25-145)
airbyte_cdk/sources/streams/http/exceptions.py (1)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • FailureType (743-746)
🪛 GitHub Actions: Linters
airbyte_cdk/sources/streams/http/http_client.py

[error] 5-5: Ruff I001: Import block is un-sorted or un-formatted. Command: 'poetry run ruff check .'

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Pytest (Fast)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (2)
airbyte_cdk/sources/streams/http/exceptions.py (1)

10-11: LGTM: failure_type is now first‑class on backoff exceptions.

Propagating failure_type through BaseBackoffException and subclasses looks correct and keeps signatures backward‑compatible. Nice work.

Also applies to: 19-22, 58-63, 74-79, 90-95

airbyte_cdk/sources/streams/http/http_client.py (1)

507-508: Nice: failure_type threaded into all backoff exceptions.

Passing failure_type=error_resolution.failure_type into UserDefinedBackoffException, RateLimitBackoffException, and DefaultBackoffException aligns with the new propagation. 👍

Also applies to: 511-516, 518-523

@pnilan pnilan requested a review from maxi297 September 16, 2025 21:47
@pnilan pnilan changed the title feat(http_client): transient_error is raised after exhausting request attempts and failing w/ backoff exception feat(http_client): respective failure_type is raised after exhausting request attempts and failing w/ backoff exception Sep 16, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
unit_tests/sources/streams/http/test_availability_strategy.py (1)

108-111: Reduce brittleness by asserting substrings/prefix instead of full tuple equality

Comparing the entire error string makes the test fragile to benign copy tweaks in the default error mapping. Would you switch to checking the boolean plus stable substrings/prefix so the intent is preserved while keeping the test resilient, wdyt?

-    assert (
-        False,
-        "HTTP Status Code: 404. Error: Not found. The requested resource was not found on the server.",
-    ) == HttpAvailabilityStrategy().check_availability(http_stream, logger)
+    is_available, reason = HttpAvailabilityStrategy().check_availability(http_stream, logger)
+    assert is_available is False
+    assert reason.startswith("HTTP Status Code: 404. Error: Not found.")
+    assert "requested resource was not found" in reason
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f104e70 and f1df8af.

📒 Files selected for processing (1)
  • unit_tests/sources/streams/http/test_availability_strategy.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Manifest Server Docker Image Build

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
unit_tests/sources/streams/http/test_http.py (1)

607-617: Fix unreachable assertions and target the right logger attribute.

Assertions inside pytest.raises won’t execute, and HttpClient uses _logger (not logger). Move checks after the context and assert on message content to avoid brittle equality. Wdyt?

-    mocker.patch.object(stream._http_client, "_logger")
-    with pytest.raises(AirbyteTracedException):
-        _, response = stream._http_client.send_request(
-            "GET", "https://g", {}, exit_on_rate_limit=True
-        )
-        stream._http_client.logger.error.assert_called_with(response.text)
-        assert response.status_code == status_code
+    mocker.patch.object(stream._http_client, "_logger")
+    with pytest.raises(AirbyteTracedException):
+        stream._http_client.send_request("GET", "https://g", {}, exit_on_rate_limit=True)
+    logged_msg = stream._http_client._logger.error.call_args[0][0]
+    assert str(status_code) in logged_msg
🧹 Nitpick comments (5)
unit_tests/sources/streams/http/test_http_client.py (3)

682-690: Make RATE_LIMITED status print-count robust (splitlines instead of split).

split() counts whitespace tokens and can miscount JSON payloads; splitlines() matches how we print one line per status. Wdyt?

-        trace_messages = capsys.readouterr().out.split()
-        assert len(trace_messages) == mocked_send.call_count
+        out = capsys.readouterr().out
+        trace_lines = [line for line in out.splitlines() if line.strip()]
+        assert len(trace_lines) == mocked_send.call_count

693-720: Remove unused expected_error param in endless backoff test.

Since exhaustion now always surfaces as AirbyteTracedException, expected_error is dead weight. Shall we drop it for clarity, wdyt?

-@pytest.mark.parametrize(
-    "exit_on_rate_limit, expected_call_count, expected_error",
-    [[True, 6, DefaultBackoffException], [False, 6, RateLimitBackoffException]],
-)
+@pytest.mark.parametrize(
+    "exit_on_rate_limit, expected_call_count",
+    [[True, 6], [False, 6]],
+)
 def test_backoff_strategy_endless(
-    exit_on_rate_limit: bool, expected_call_count: int, expected_error: Exception
+    exit_on_rate_limit: bool, expected_call_count: int
 ):
@@
-        assert mocked_send.call_count == expected_call_count
+        assert mocked_send.call_count == expected_call_count

749-800: Add a test for “missing failure_type → defaults to system_error.”

Right now we cover explicit failure_type propagation. Could we also assert the fallback when the backoff exception doesn’t set it, to lock in the contract, wdyt?

@@
     assert e.value.failure_type == expected_failure_type
+
+def test_send_with_retry_defaults_failure_type_to_system_error_when_missing():
+    http_client = HttpClient(
+        name="test",
+        logger=MagicMock(spec=logging.Logger),
+        error_handler=HttpStatusErrorHandler(logger=MagicMock(), max_retries=1),
+    )
+    http_client._send = MagicMock()
+    http_client._send.__name__ = "_send"
+    prepared_request = MagicMock(spec=requests.PreparedRequest)
+    mocked_response = MagicMock(spec=requests.Response)
+    mocked_response.status_code = 400
+    http_client._send.side_effect = DefaultBackoffException(
+        request=prepared_request,
+        response=mocked_response,
+        error_message="test error message",
+        failure_type=None,
+    )
+    with pytest.raises(AirbyteTracedException) as e:
+        http_client._send_with_retry(prepared_request, {})
+    assert e.value.failure_type == FailureType.system_error
unit_tests/sources/streams/http/test_http.py (2)

36-36: Prefer canonical import path for AirbyteTracedException.

To avoid relying on re-exports from airbyte_cdk.utils, can we import from traced_exception directly for consistency with other tests, wdyt?

-from airbyte_cdk.utils import AirbyteTracedException
+from airbyte_cdk.utils.traced_exception import AirbyteTracedException

45-59: Pop non-HttpStream kwargs before super().init to avoid leaking unknown args.

If callers pass api_budget/message_repository, those would be forwarded to HttpStream.init, which may reject them. Shall we pop them up-front and pass them only to HttpClient, wdyt?

-    def __init__(self, deduplicate_query_params: bool = False, **kwargs):
-        disable_retries = False
-        if "disable_retries" in kwargs:
-            disable_retries = kwargs.pop("disable_retries")
-        super().__init__(**kwargs)
-        self._http_client = HttpClient(
+    def __init__(self, deduplicate_query_params: bool = False, **kwargs):
+        disable_retries = bool(kwargs.pop("disable_retries", False))
+        # Pop extras that parent may not accept; keep authenticator for parent.
+        api_budget = kwargs.pop("api_budget", Mock(spec=APIBudget))
+        message_repository = kwargs.pop("message_repository", Mock(spec=MessageRepository))
+        authenticator = kwargs.get("authenticator", None)
+        super().__init__(**kwargs)
+        self._http_client = HttpClient(
             name=self.name,
             logger=self.logger,
             error_handler=self.get_error_handler(),
-            api_budget=kwargs.get("api_budget", Mock(spec=APIBudget)),
-            authenticator=kwargs.get("authenticator", None),
+            api_budget=api_budget,
+            authenticator=authenticator,
             use_cache=self.use_cache,
             backoff_strategy=self.get_backoff_strategy(),
-            message_repository=kwargs.get("message_repository", Mock(spec=MessageRepository)),
-            disable_retries=disable_retries,
+            message_repository=message_repository,
+            disable_retries=disable_retries,
         )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f1df8af and c86a055.

📒 Files selected for processing (3)
  • unit_tests/sources/streams/http/error_handlers/test_http_status_error_handler.py (1 hunks)
  • unit_tests/sources/streams/http/test_http.py (12 hunks)
  • unit_tests/sources/streams/http/test_http_client.py (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • unit_tests/sources/streams/http/error_handlers/test_http_status_error_handler.py
🧰 Additional context used
🧬 Code graph analysis (2)
unit_tests/sources/streams/http/test_http_client.py (5)
airbyte_cdk/utils/traced_exception.py (1)
  • AirbyteTracedException (25-145)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • FailureType (743-746)
airbyte_cdk/sources/streams/http/exceptions.py (3)
  • UserDefinedBackoffException (39-63)
  • DefaultBackoffException (66-67)
  • RateLimitBackoffException (70-71)
airbyte_cdk/sources/streams/http/error_handlers/response_models.py (2)
  • ErrorResolution (23-26)
  • ResponseAction (14-19)
airbyte_cdk/sources/streams/http/http_client.py (3)
  • name (533-534)
  • _send (315-384)
  • _send_with_retry (266-313)
unit_tests/sources/streams/http/test_http.py (4)
airbyte_cdk/sources/message/repository.py (1)
  • MessageRepository (45-60)
airbyte_cdk/sources/streams/call_rate.py (1)
  • APIBudget (513-627)
airbyte_cdk/sources/streams/http/http_client.py (3)
  • HttpClient (99-570)
  • name (533-534)
  • send_request (536-570)
airbyte_cdk/utils/traced_exception.py (1)
  • AirbyteTracedException (25-145)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Analyze (python)
🔇 Additional comments (9)
unit_tests/sources/streams/http/test_http_client.py (4)

558-563: LGTM: disable_retries → single attempt + AirbyteTracedException

This aligns with the new exhaustion behavior. Nice.


585-591: LGTM: default max_retries honored (6 calls total)

Matches the decorator behavior (5 retries + 1).


616-621: LGTM: custom max_retries honored

Assertion is precise and stable.


655-660: LGTM: max_time short-circuits retries

Good guard on time-bounded retries.

unit_tests/sources/streams/http/test_http.py (5)

189-192: LGTM: exhaustion surfaced as AirbyteTracedException

Good coverage for custom backoff_time path.


213-215: LGTM: precise message match for 429 exhaustion path

Assertion is crisp and exercises the new message format.


250-252: LGTM: 4xx mapped to AirbyteTracedException

Matches the new FAIL path.


282-289: LGTM: secrets filtering assertions

Nice verification of header/body redaction.


571-573: LGTM: AutoFailTrue disables retries at construction

This cleanly forces single-attempt behavior for these tests.

Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two comments but nothing blocking. Thanks @pnilan !

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
unit_tests/sources/streams/http/test_http_client.py (1)

795-802: Consider HttpMocker for consistency across http tests

Nice move to use the public send_request here. For consistency with the rest of the suite, would you consider HttpMocker instead of requests_mock, wdyt?

🧹 Nitpick comments (9)
airbyte_cdk/sources/streams/http/requests_native_auth/abstract_oauth.py (3)

241-247: Consider treating HTTP 408 as transient and backoff‑worthy as well

408 Request Timeout is typically transient (network hiccup, gateway timeout behavior upstream). Including it here would keep semantics consistent with 429/5xx for the refresh flow. Wdyt?

-                if e.response.status_code == 429 or e.response.status_code >= 500:
+                if e.response.status_code in (408, 429) or e.response.status_code >= 500:
                     raise DefaultBackoffException(
                         request=e.response.request,
                         response=e.response,
                         failure_type=FailureType.transient_error,
                     )

187-194: Add jitter to backoff to avoid synchronized retries

Would you consider enabling jitter to reduce retry stampedes, especially in shared auth endpoints? Backoff supports full_jitter out‑of‑the‑box. Wdyt?

 @backoff.on_exception(
-    backoff.expo,
+    backoff.expo,
     DefaultBackoffException,
     on_backoff=lambda details: logger.info(
         f"Caught retryable error after {details['tries']} tries. Waiting {details['wait']} seconds then retrying..."
     ),
-    max_time=300,
+    max_time=300,
+    jitter=backoff.full_jitter,
 )

28-33: Nit: typo in exception name (“Recurtion” → “Recursion”)

Minor naming nit that can trip readers. If you ever touch this area, would you consider renaming to ResponseKeysMaxRecursionReached and keeping a temporary alias for BC? Wdyt?

unit_tests/sources/streams/http/test_http_client.py (6)

558-563: Right expectation; small assertion can harden this test

Since we now surface an AirbyteTracedException, would you add a quick check that the failure_type is set (e.g., assert isinstance(e.value.failure_type, FailureType)) to guard regressions, wdyt?


586-591: Avoid relying on the default retry count

The “6 calls” assertion ties this test to today’s default. Would you set max_retries explicitly (e.g., 5) and assert retries + 1 instead, to keep the test stable if defaults change, wdyt?


683-690: Count TRACE messages by lines, not words

split() counts tokens and can be flaky. Prefer splitting by lines and filtering for TRACE entries, wdyt?

-        trace_messages = capsys.readouterr().out.split()
-        assert len(trace_messages) == mocked_send.call_count
+        out = capsys.readouterr().out
+        trace_lines = [ln for ln in out.splitlines() if '"type"' in ln and 'TRACE' in ln]
+        assert len(trace_lines) == mocked_send.call_count

769-773: Subclass BackoffStrategy for type consistency

Would you inherit from BackoffStrategy to match the production interface and aid IDE/type tools, wdyt?

-        class CustomBackoffStrategy:
+        class CustomBackoffStrategy(BackoffStrategy):
             def backoff_time(self, response_or_exception, attempt_count):
                 return 0.1

786-794: Avoid passing None for backoff_strategy

Passing None may shadow defaults. Would you only include the kwarg when defined, wdyt?

-    http_client = HttpClient(
-        name="test",
-        logger=MagicMock(spec=logging.Logger),
-        error_handler=HttpStatusErrorHandler(
-            logger=MagicMock(), error_mapping=error_mapping, max_retries=1
-        ),
-        backoff_strategy=backoff_strategy,
-    )
+    kwargs = {}
+    if backoff_strategy is not None:
+        kwargs["backoff_strategy"] = backoff_strategy
+    http_client = HttpClient(
+        name="test",
+        logger=MagicMock(spec=logging.Logger),
+        error_handler=HttpStatusErrorHandler(
+            logger=MagicMock(), error_mapping=error_mapping, max_retries=1
+        ),
+        **kwargs,
+    )

803-806: Strengthen assertion with request attempt count

Since max_retries=1, asserting two attempts ensures we exercised the retry path. Would you add this line, wdyt?

     with pytest.raises(AirbyteTracedException) as e:
         http_client.send_request(http_method="get", url="https://airbyte.io/", request_kwargs={})
-    assert e.value.failure_type == expected_failure_type
+    assert requests_mock.call_count == 2
+    assert e.value.failure_type == expected_failure_type
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c86a055 and 0462263.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/streams/http/requests_native_auth/abstract_oauth.py (1 hunks)
  • unit_tests/sources/streams/http/test_http_client.py (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
airbyte_cdk/sources/streams/http/requests_native_auth/abstract_oauth.py (1)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • FailureType (743-746)
unit_tests/sources/streams/http/test_http_client.py (5)
airbyte_cdk/utils/traced_exception.py (1)
  • AirbyteTracedException (25-145)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
  • FailureType (743-746)
airbyte_cdk/sources/streams/http/exceptions.py (3)
  • UserDefinedBackoffException (39-63)
  • DefaultBackoffException (66-67)
  • RateLimitBackoffException (70-71)
airbyte_cdk/sources/streams/http/error_handlers/response_models.py (2)
  • ResponseAction (14-19)
  • ErrorResolution (23-26)
airbyte_cdk/sources/streams/http/http_client.py (1)
  • name (533-534)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Check: source-shopify
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
airbyte_cdk/sources/streams/http/requests_native_auth/abstract_oauth.py (1)

243-247: Preserve failure_type=FailureType.transient_error on 429/5xx refresh — add tests & verify other call sites

Preserving failure_type=FailureType.transient_error ensures exhausted retries surface as a transient error. Could you add tests asserting this path for both 429 and a representative 5xx during token refresh? Wdyt?

Quick pointers I observed from a scan:

  • airbyte_cdk/sources/streams/http/requests_native_auth/abstract_oauth.py (≈241–247)
  • airbyte_cdk/sources/streams/http/http_client.py (≈518)
  • DefaultBackoffException is defined in airbyte_cdk/sources/streams/http/exceptions.py

Can you confirm other DefaultBackoffException invocations also pass failure_type? If helpful, run this AST-based check (avoids multiline regex pitfalls) to find invocations missing failure_type:

#!/bin/bash
set -euo pipefail
python3 - <<'PY'
import ast, pathlib
repo_root = pathlib.Path('.')
py_files = [p for p in repo_root.rglob('*.py') if '.venv' not in str(p) and 'site-packages' not in str(p)]
missing = []
for p in py_files:
    try:
        src = p.read_text()
        tree = ast.parse(src)
    except Exception:
        continue
    lines = src.splitlines()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            name = None
            if isinstance(func, ast.Name):
                name = func.id
            elif isinstance(func, ast.Attribute):
                name = func.attr
            if name == 'DefaultBackoffException':
                has_failure = any(kw.arg == 'failure_type' for kw in node.keywords if kw.arg)
                start = max(0, getattr(node, 'lineno', 1) - 3)
                end = getattr(node, 'end_lineno', getattr(node, 'lineno', 1) + 5)
                end = min(len(lines), end + 2)
                snippet = '\\n'.join(lines[start:end])
                if not has_failure:
                    missing.append((str(p), getattr(node, 'lineno', '?'), snippet))
if missing:
    print('DefaultBackoffException call sites missing failure_type:')
    for f,l,s in missing:
        print(f'{f}:{l}')
        print(s)
        print('-'*80)
else:
    print('All DefaultBackoffException call sites include failure_type (or none found).')
PY

Wdyt?

unit_tests/sources/streams/http/test_http_client.py (2)

616-621: LGTM on explicit max_retries path

Clear and robust verification of retries + 1.


655-660: Mocked sleep advances frozen time — no change required

unit_tests/conftest.py's mock_sleep uses freezegun.freeze_time and monkeypatch.setattr("time.sleep", lambda x: frozen_datetime.tick(x)), so the fixture advances frozen time (time.time and time.monotonic) and the max_time budgeting should be deterministic — wdyt?

@pnilan pnilan merged commit e7f1e4c into main Sep 17, 2025
28 of 29 checks passed
@pnilan pnilan deleted the pnilan/fix/update-backoff-exception-handling branch September 17, 2025 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants