Skip to content

Conversation

ChristoGrab
Copy link
Collaborator

@ChristoGrab ChristoGrab commented Sep 24, 2025

What

Connector builder error messages are bulky and hard to parse for average users because they include the full config and catalog by default in the user-facing message. By moving those details to the error's internal message and only surfacing the original issue we can make our messaging clearer and more actionable to our users

I am currently looking into testing this locally in the builder, as the old flow for running the server against a local CDK was broken when we removed the airbyte-builder-server.

Summary by CodeRabbit

  • Bug Fixes

    • Improved error handling for stream reading: user-facing messages are shorter and no longer include configuration or catalog details, reducing noise and preventing sensitive data exposure. Detailed context is still available internally for troubleshooting.
  • Tests

    • Added a unit test to verify that error responses separate concise user-facing messages from detailed internal diagnostics, including proper inclusion of the original error and affected stream.

@github-actions github-actions bot added the chore label Sep 24, 2025
Copy link

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@christo/sanitize-user-error#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch christo/sanitize-user-error

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment

📝 Edit this welcome message.

Copy link

github-actions bot commented Sep 24, 2025

PyTest Results (Fast)

3 758 tests  +1   3 746 ✅ +1   6m 17s ⏱️ -2s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 4382381. ± Comparison against base commit a1428bf.

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Sep 24, 2025

PyTest Results (Full)

3 761 tests  +1   3 749 ✅ +2   10m 58s ⏱️ ±0s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌  - 1 

Results for commit 4382381. ± Comparison against base commit a1428bf.

♻️ This comment has been updated with latest results.

@ChristoGrab ChristoGrab marked this pull request as ready for review September 24, 2025 19:59
@Copilot Copilot AI review requested due to automatic review settings September 24, 2025 19:59
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR sanitizes connector builder error messages by separating user-facing messages from technical debugging information. The change improves user experience by providing cleaner, more actionable error messages while preserving detailed context for debugging purposes.

Key changes:

  • Modified error handling to use concise user-facing messages without config/catalog dumps
  • Added internal_message field to preserve full technical context for debugging
  • Added comprehensive test coverage to verify the error message sanitization behavior

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
airbyte_cdk/connector_builder/connector_builder_handler.py Updated error handling to separate user-friendly messages from technical debugging details
unit_tests/connector_builder/test_connector_builder_handler.py Added test to verify error messages are properly sanitized and technical details preserved

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Contributor

coderabbitai bot commented Sep 24, 2025

📝 Walkthrough

Walkthrough

Exception handling in read_stream now separates user-facing and internal messages: the public error includes only the stream name and the underlying error, while a new internal_message captures config and catalog context. A corresponding unit test verifies this behavior. No public interfaces changed.

Changes

Cohort / File(s) Summary
Handler error messaging update
airbyte_cdk/connector_builder/connector_builder_handler.py
Adjusted read_stream error handling to create an AirbyteTracedException with a concise user-facing message and a detailed internal_message containing config/catalog context; overall control flow unchanged.
Unit tests
unit_tests/connector_builder/test_connector_builder_handler.py
Added test ensuring the error TRACE includes a short user message without config/catalog details and a verbose internal_message with full context.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Caller
  participant H as ConnectorBuilderHandler.read_stream
  participant R as Reader
  participant E as AirbyteTracedException

  C->>H: read_stream(stream_name, config, catalog)
  H->>R: read(stream_name, config, catalog)
  R--xH: throws Exception(err)

  rect rgba(255,235,230,0.6)
  note right of H: Error path (updated)
  H->>E: Build with user_message="Error reading stream {name}: {err}"<br/>internal_message="...stream/config/catalog..."
  H-->>C: TRACE message (type=ERROR) from E
  end
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

security

Suggested reviewers

  • bnchrch
  • brianjlai

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly describes the main change by indicating that connector builder error messages are being sanitized and aligns with the PR’s objective to improve user-facing messaging clarity without including unnecessary technical details.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch christo/sanitize-user-error

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1428bf and 4382381.

📒 Files selected for processing (2)
  • airbyte_cdk/connector_builder/connector_builder_handler.py (1 hunks)
  • unit_tests/connector_builder/test_connector_builder_handler.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
unit_tests/connector_builder/test_connector_builder_handler.py (2)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (2)
  • streams (388-420)
  • TestLimits (96-107)
airbyte_cdk/connector_builder/connector_builder_handler.py (1)
  • read_stream (90-130)
airbyte_cdk/connector_builder/connector_builder_handler.py (2)
airbyte_cdk/utils/traced_exception.py (2)
  • AirbyteTracedException (25-145)
  • from_exception (103-121)
airbyte_cdk/utils/airbyte_secrets_utils.py (1)
  • filter_secrets (73-80)
🔇 Additional comments (1)
unit_tests/connector_builder/test_connector_builder_handler.py (1)

1411-1478: Nice coverage for sanitized builder errors.

The assertions clearly lock in the intended split between user-facing and internal messages, and the mocked failure mode feels representative. Looks great.

Comment on lines +120 to 129
# - message: user-friendly error for display
# - internal_message: technical details for debugging (including config/catalog)
error = AirbyteTracedException.from_exception(
exc,
message=filter_secrets(
f"Error reading stream with config={config} and catalog={configured_catalog}: {str(exc)}"
),
message=filter_secrets(f"Error reading stream {stream_name}: {str(exc)}"),
)
# Override internal_message to include context for debugging
error.internal_message = filter_secrets(
f"Error reading stream {stream_name} with config={config} and catalog={configured_catalog}: {str(exc)}"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Prevent UnboundLocalError when stream_name is unavailable.

Lines 124-128 assume stream_name was set earlier, but if the exception fires before that assignment (e.g., the catalog has zero streams or run_test_read raises during evaluation of the name) the except block will hit an UnboundLocalError, masking the real failure instead of emitting the sanitized trace. Could we stash the name ahead of the try/except and fall back to a sentinel when it's missing so the handler still returns a useful error message, wdyt?

 def read_stream(
     source: ConcurrentDeclarativeSource,
     config: Mapping[str, Any],
     configured_catalog: ConfiguredAirbyteCatalog,
     state: List[AirbyteStateMessage],
     limits: TestLimits,
 ) -> AirbyteMessage:
-    try:
+    stream_name: str | None = None
+    try:
         test_read_handler = TestReader(
             limits.max_pages_per_slice, limits.max_slices, limits.max_records
         )
         # The connector builder only supports a single stream
         stream_name = configured_catalog.streams[0].stream.name
@@
-    except Exception as exc:
+    except Exception as exc:
+        safe_stream_name = stream_name or "<unknown stream>"
         # - message: user-friendly error for display
         # - internal_message: technical details for debugging (including config/catalog)
         error = AirbyteTracedException.from_exception(
             exc,
-            message=filter_secrets(f"Error reading stream {stream_name}: {str(exc)}"),
+            message=filter_secrets(f"Error reading stream {safe_stream_name}: {str(exc)}"),
         )
         # Override internal_message to include context for debugging
         error.internal_message = filter_secrets(
-            f"Error reading stream {stream_name} with config={config} and catalog={configured_catalog}: {str(exc)}"
+            f"Error reading stream {safe_stream_name} with config={config} and catalog={configured_catalog}: {str(exc)}"
         )
         return error.as_airbyte_message()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# - message: user-friendly error for display
# - internal_message: technical details for debugging (including config/catalog)
error = AirbyteTracedException.from_exception(
exc,
message=filter_secrets(
f"Error reading stream with config={config} and catalog={configured_catalog}: {str(exc)}"
),
message=filter_secrets(f"Error reading stream {stream_name}: {str(exc)}"),
)
# Override internal_message to include context for debugging
error.internal_message = filter_secrets(
f"Error reading stream {stream_name} with config={config} and catalog={configured_catalog}: {str(exc)}"
)
def read_stream(
source: ConcurrentDeclarativeSource,
config: Mapping[str, Any],
configured_catalog: ConfiguredAirbyteCatalog,
state: List[AirbyteStateMessage],
limits: TestLimits,
) -> AirbyteMessage:
stream_name: str | None = None
try:
test_read_handler = TestReader(
limits.max_pages_per_slice, limits.max_slices, limits.max_records
)
# The connector builder only supports a single stream
stream_name = configured_catalog.streams[0].stream.name
# ... any additional logic here ...
except Exception as exc:
safe_stream_name = stream_name or "<unknown stream>"
# - message: user-friendly error for display
# - internal_message: technical details for debugging (including config/catalog)
error = AirbyteTracedException.from_exception(
exc,
message=filter_secrets(f"Error reading stream {safe_stream_name}: {str(exc)}"),
)
# Override internal_message to include context for debugging
error.internal_message = filter_secrets(
f"Error reading stream {safe_stream_name} with config={config} and catalog={configured_catalog}: {str(exc)}"
)
return error.as_airbyte_message()
🤖 Prompt for AI Agents
In airbyte_cdk/connector_builder/connector_builder_handler.py around lines 120
to 129, the except block references stream_name which can be unassigned if the
exception occurs before it is set; stash a local sentinel variable (e.g.,
safe_stream_name = "<unknown_stream>") before the try/except and update any
references in the AirbyteTracedException.from_exception call and
error.internal_message to use safe_stream_name so an UnboundLocalError cannot
mask the original exception and the handler still returns a sanitized,
contextual error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant