-
Notifications
You must be signed in to change notification settings - Fork 30
chore: sanitize connector builder error messages #771
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@christo/sanitize-user-error#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch christo/sanitize-user-error Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR sanitizes connector builder error messages by separating user-facing messages from technical debugging information. The change improves user experience by providing cleaner, more actionable error messages while preserving detailed context for debugging purposes.
Key changes:
- Modified error handling to use concise user-facing messages without config/catalog dumps
- Added internal_message field to preserve full technical context for debugging
- Added comprehensive test coverage to verify the error message sanitization behavior
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
File | Description |
---|---|
airbyte_cdk/connector_builder/connector_builder_handler.py | Updated error handling to separate user-friendly messages from technical debugging details |
unit_tests/connector_builder/test_connector_builder_handler.py | Added test to verify error messages are properly sanitized and technical details preserved |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
📝 WalkthroughWalkthroughException handling in read_stream now separates user-facing and internal messages: the public error includes only the stream name and the underlying error, while a new internal_message captures config and catalog context. A corresponding unit test verifies this behavior. No public interfaces changed. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant C as Caller
participant H as ConnectorBuilderHandler.read_stream
participant R as Reader
participant E as AirbyteTracedException
C->>H: read_stream(stream_name, config, catalog)
H->>R: read(stream_name, config, catalog)
R--xH: throws Exception(err)
rect rgba(255,235,230,0.6)
note right of H: Error path (updated)
H->>E: Build with user_message="Error reading stream {name}: {err}"<br/>internal_message="...stream/config/catalog..."
H-->>C: TRACE message (type=ERROR) from E
end
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal). Please share your feedback with us on this Discord post. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte_cdk/connector_builder/connector_builder_handler.py
(1 hunks)unit_tests/connector_builder/test_connector_builder_handler.py
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
unit_tests/connector_builder/test_connector_builder_handler.py (2)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (2)
streams
(388-420)TestLimits
(96-107)airbyte_cdk/connector_builder/connector_builder_handler.py (1)
read_stream
(90-130)
airbyte_cdk/connector_builder/connector_builder_handler.py (2)
airbyte_cdk/utils/traced_exception.py (2)
AirbyteTracedException
(25-145)from_exception
(103-121)airbyte_cdk/utils/airbyte_secrets_utils.py (1)
filter_secrets
(73-80)
🔇 Additional comments (1)
unit_tests/connector_builder/test_connector_builder_handler.py (1)
1411-1478
: Nice coverage for sanitized builder errors.The assertions clearly lock in the intended split between user-facing and internal messages, and the mocked failure mode feels representative. Looks great.
# - message: user-friendly error for display | ||
# - internal_message: technical details for debugging (including config/catalog) | ||
error = AirbyteTracedException.from_exception( | ||
exc, | ||
message=filter_secrets( | ||
f"Error reading stream with config={config} and catalog={configured_catalog}: {str(exc)}" | ||
), | ||
message=filter_secrets(f"Error reading stream {stream_name}: {str(exc)}"), | ||
) | ||
# Override internal_message to include context for debugging | ||
error.internal_message = filter_secrets( | ||
f"Error reading stream {stream_name} with config={config} and catalog={configured_catalog}: {str(exc)}" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prevent UnboundLocalError when stream_name
is unavailable.
Lines 124-128 assume stream_name
was set earlier, but if the exception fires before that assignment (e.g., the catalog has zero streams or run_test_read
raises during evaluation of the name) the except block will hit an UnboundLocalError
, masking the real failure instead of emitting the sanitized trace. Could we stash the name ahead of the try/except and fall back to a sentinel when it's missing so the handler still returns a useful error message, wdyt?
def read_stream(
source: ConcurrentDeclarativeSource,
config: Mapping[str, Any],
configured_catalog: ConfiguredAirbyteCatalog,
state: List[AirbyteStateMessage],
limits: TestLimits,
) -> AirbyteMessage:
- try:
+ stream_name: str | None = None
+ try:
test_read_handler = TestReader(
limits.max_pages_per_slice, limits.max_slices, limits.max_records
)
# The connector builder only supports a single stream
stream_name = configured_catalog.streams[0].stream.name
@@
- except Exception as exc:
+ except Exception as exc:
+ safe_stream_name = stream_name or "<unknown stream>"
# - message: user-friendly error for display
# - internal_message: technical details for debugging (including config/catalog)
error = AirbyteTracedException.from_exception(
exc,
- message=filter_secrets(f"Error reading stream {stream_name}: {str(exc)}"),
+ message=filter_secrets(f"Error reading stream {safe_stream_name}: {str(exc)}"),
)
# Override internal_message to include context for debugging
error.internal_message = filter_secrets(
- f"Error reading stream {stream_name} with config={config} and catalog={configured_catalog}: {str(exc)}"
+ f"Error reading stream {safe_stream_name} with config={config} and catalog={configured_catalog}: {str(exc)}"
)
return error.as_airbyte_message()
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# - message: user-friendly error for display | |
# - internal_message: technical details for debugging (including config/catalog) | |
error = AirbyteTracedException.from_exception( | |
exc, | |
message=filter_secrets( | |
f"Error reading stream with config={config} and catalog={configured_catalog}: {str(exc)}" | |
), | |
message=filter_secrets(f"Error reading stream {stream_name}: {str(exc)}"), | |
) | |
# Override internal_message to include context for debugging | |
error.internal_message = filter_secrets( | |
f"Error reading stream {stream_name} with config={config} and catalog={configured_catalog}: {str(exc)}" | |
) | |
def read_stream( | |
source: ConcurrentDeclarativeSource, | |
config: Mapping[str, Any], | |
configured_catalog: ConfiguredAirbyteCatalog, | |
state: List[AirbyteStateMessage], | |
limits: TestLimits, | |
) -> AirbyteMessage: | |
stream_name: str | None = None | |
try: | |
test_read_handler = TestReader( | |
limits.max_pages_per_slice, limits.max_slices, limits.max_records | |
) | |
# The connector builder only supports a single stream | |
stream_name = configured_catalog.streams[0].stream.name | |
# ... any additional logic here ... | |
except Exception as exc: | |
safe_stream_name = stream_name or "<unknown stream>" | |
# - message: user-friendly error for display | |
# - internal_message: technical details for debugging (including config/catalog) | |
error = AirbyteTracedException.from_exception( | |
exc, | |
message=filter_secrets(f"Error reading stream {safe_stream_name}: {str(exc)}"), | |
) | |
# Override internal_message to include context for debugging | |
error.internal_message = filter_secrets( | |
f"Error reading stream {safe_stream_name} with config={config} and catalog={configured_catalog}: {str(exc)}" | |
) | |
return error.as_airbyte_message() |
🤖 Prompt for AI Agents
In airbyte_cdk/connector_builder/connector_builder_handler.py around lines 120
to 129, the except block references stream_name which can be unassigned if the
exception occurs before it is set; stash a local sentinel variable (e.g.,
safe_stream_name = "<unknown_stream>") before the try/except and update any
references in the AirbyteTracedException.from_exception call and
error.internal_message to use safe_stream_name so an UnboundLocalError cannot
mask the original exception and the handler still returns a sanitized,
contextual error.
What
Connector builder error messages are bulky and hard to parse for average users because they include the full config and catalog by default in the user-facing message. By moving those details to the error's internal message and only surfacing the original issue we can make our messaging clearer and more actionable to our users
I am currently looking into testing this locally in the builder, as the old flow for running the server against a local CDK was broken when we removed the airbyte-builder-server.
Summary by CodeRabbit
Bug Fixes
Tests