-
Notifications
You must be signed in to change notification settings - Fork 30
fix: make concurrent declarative source test order-agnostic #742
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
The test_read_concurrent_declarative_source test was failing on Python 3.13 due to race conditions in concurrent processing that caused records to be returned in different orders. This change replaces the strict list equality assertion with a sorted comparison that validates the same records are present without caring about their order. Fixes the flaky test failure: test_no_pagination_with_partition_router-manifest4-pages4-expected_records4-expected_calls4 Co-Authored-By: AJ Steers <[email protected]>
Original prompt from AJ Steers
FAILED unit_tests/sources/declarative/test_concurrent_declarative_source.py::test_read_concurrent_declarative_source[test_no_pagination_with_partition_router-manifest4-pages4-expected_records4-expected_calls4] - AssertionError: assert ['ABC': 2, '...artition': 0] == ['ABC': 0, '...artition': 1]
Brian Lai (U03APJQ895J): damn, well that's a new one. i'll give this a quick look Brian Lai (U03APJQ895J): interesting, that's new i didn't realize we were testing 12 and 13 on CI, but that tracks since AJ has been working on bumping to newer versions Brian Lai (U03APJQ895J): the output is correct on both side, just ordering by the looks of it. I'm curious if changing python versions would change anything about the threading or which partition ends up processed first, so i'll kick off another run to see. The test could also stand to be less brittle and check existence instead of strict ordering since we don't technically guarantee that an arbitrary parent partition is processed in order Brian Lai (U03APJQ895J): it def doesn't have anything to do with you change tho obviously AJ Steers (U05AKF1BCC9): I ran into into this also. Looks like maybe flaky on 3.13? (Race condition ordering?) https://github.com/airbytehq/airbyte-python-cdk/actions/runs/17332099415/job/49210272368?pr=738 AJ Steers (U05AKF1BCC9): <@U03APJQ895J> - We do have the option to mark that as a con... (1085 chars truncated...)
|
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1756509957-fix-concurrent-test-ordering#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1756509957-fix-concurrent-test-ordering Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Fixes a flaky test in the concurrent declarative source test suite by making the test order-agnostic. The test was failing intermittently on Python 3.13 due to race conditions where concurrent threads returned records in different orders.
- Replaces strict list equality check with sorted comparison using a composite sort key
- Maintains test validity while accommodating non-deterministic ordering in concurrent processing
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
message.record.data for message in _run_read(manifest, _stream_name) if message.record | ||
] | ||
assert output_data == expected_records | ||
assert sorted(output_data, key=lambda x: (x.get("partition", 0), x.get("ABC", 0), x.get("AED", 0))) == sorted(expected_records, key=lambda x: (x.get("partition", 0), x.get("ABC", 0), x.get("AED", 0))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sorting key logic is duplicated and creates a very long line. Consider extracting the key function to improve readability and reduce duplication: sort_key = lambda x: (x.get(\"partition\", 0), x.get(\"ABC\", 0), x.get(\"AED\", 0))
then use assert sorted(output_data, key=sort_key) == sorted(expected_records, key=sort_key)
Copilot uses AI. Check for mistakes.
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughThe test for concurrent declarative source was changed to perform an order-insensitive comparison: both actual and expected record lists are sorted by a composite key (partition default 0, ABC default 0, AED default 0) before asserting equality. Changes
Sequence Diagram(s)Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 💡 Knowledge Base configuration:
You can enable these sources in your CodeRabbit configuration. 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
Fixes the Ruff Format Check CI failure by breaking the long assertion line into multiple lines for better readability. Co-Authored-By: AJ Steers <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one adjustment then can ✅
unit_tests/sources/declarative/test_concurrent_declarative_source.py
Outdated
Show resolved
Hide resolved
Addresses PR comment about duplicated lambda function and long line. Extracts the sorting key logic into a reusable variable to eliminate duplication and improve code readability while maintaining the same order-agnostic comparison functionality. Co-Authored-By: AJ Steers <[email protected]>
unit_tests/sources/declarative/test_concurrent_declarative_source.py
Outdated
Show resolved
Hide resolved
/format-fix |
fix: make concurrent declarative source test order-agnostic
Summary
Fixes a flaky test failure in
test_read_concurrent_declarative_source
that was occurring on Python 3.13 due to race conditions in concurrent processing. The test was using strict list equality (assert output_data == expected_records
) which failed when concurrent threads returned records in different orders.The fix replaces the strict equality check with a sorted comparison that validates the same records are present without caring about their order. This is appropriate for concurrent processing where order is not guaranteed.
The failing test case was:
test_no_pagination_with_partition_router-manifest4-pages4-expected_records4-expected_calls4
Review & Testing Checklist for Human
(partition, ABC, AED)
with default values of 0 is appropriate for all test scenarios.pytest unit_tests/sources/declarative/test_concurrent_declarative_source.py::test_read_concurrent_declarative_source -v
to ensure all test cases still pass.Notes
Link to Devin run: https://app.devin.ai/sessions/3251fb4d62c54774ab4d264d4ffe8b9a
Requested by: @aaronsteers
Summary by CodeRabbit