Conversation

aaronsteers
Contributor

@aaronsteers aaronsteers commented Aug 29, 2025

fix: make concurrent declarative source test order-agnostic

Summary

Fixes a flaky test failure in test_read_concurrent_declarative_source that was occurring on Python 3.13 due to race conditions in concurrent processing. The test was using strict list equality (assert output_data == expected_records) which failed when concurrent threads returned records in different orders.

The fix replaces the strict equality check with a sorted comparison that validates the same records are present without caring about their order. This is appropriate for concurrent processing where order is not guaranteed.

The failing test case was: test_no_pagination_with_partition_router-manifest4-pages4-expected_records4-expected_calls4
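
A minimal sketch of the difference between the two assertions, using the three records as they appear in the failure log quoted in this thread. The variable names mirror the test, but the two orderings are illustrative assumptions, not output captured from a real run:

```python
# Expected order as written in the test parametrization (per the failure log).
expected_records = [
    {"ABC": 0, "partition": 0},
    {"AED": 1, "partition": 0},
    {"ABC": 2, "partition": 1},
]

# One possible arrival order when the thread handling partition 1 finishes first.
output_data = [
    {"ABC": 2, "partition": 1},
    {"ABC": 0, "partition": 0},
    {"AED": 1, "partition": 0},
]

# Strict list equality depends on arrival order, so it fails here.
strict_match = output_data == expected_records

# Sorting both sides by the same composite key removes the order dependency.
sorted_match = sorted(
    output_data,
    key=lambda x: (x.get("partition", 0), x.get("ABC", 0), x.get("AED", 0)),
) == sorted(
    expected_records,
    key=lambda x: (x.get("partition", 0), x.get("ABC", 0), x.get("AED", 0)),
)

print(strict_match, sorted_match)
```

Both lists hold the same records, so only the order-sensitive comparison reports a mismatch.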

Review & Testing Checklist for Human

  • Verify sorting key works for all parametrized test cases - The change affects all test cases in the parametrized test, not just the failing partition router one. Check that the sorting key (partition, ABC, AED) with default values of 0 is appropriate for all test scenarios.
  • Run the full parametrized test suite - Execute pytest unit_tests/sources/declarative/test_concurrent_declarative_source.py::test_read_concurrent_declarative_source -v to ensure all test cases still pass.
  • Consider scope appropriateness - Evaluate if this change is too broad. The original issue was specific to partition router tests, but this change affects all concurrent declarative source tests.
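
One way to probe the first checklist item: the sketch below shows a case where the three-field key with defaults of 0 cannot distinguish two different records, which leaves the comparison order-dependent again because `sorted()` is stable. The records and the `canonical_key` helper are hypothetical and are not part of this PR; serializing the whole record is only one possible alternative.

```python
import json


def pr_sort_key(record):
    # The composite key described in this PR: missing fields default to 0.
    return (record.get("partition", 0), record.get("ABC", 0), record.get("AED", 0))


def canonical_key(record):
    # Hypothetical alternative: order by the full record, serialized with sorted keys.
    return json.dumps(record, sort_keys=True, default=str)


# Two distinct (made-up) records: a missing "ABC" looks the same as "ABC": 0.
a = {"partition": 0}
b = {"partition": 0, "ABC": 0}
keys_collide = pr_sort_key(a) == pr_sort_key(b)

# With tied keys, sorted() preserves input order, so the lists still differ.
left, right = [a, b], [b, a]
tied_equal = sorted(left, key=pr_sort_key) == sorted(right, key=pr_sort_key)

# A key built from the whole record gives each distinct record its own position.
canonical_equal = sorted(left, key=canonical_key) == sorted(right, key=canonical_key)

print(keys_collide, tied_equal, canonical_equal)
```

If none of the parametrized cases contain records that tie on all three fields, the key used in this PR is sufficient; the sketch only shows what to look for when checking them.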

Notes

  • Successfully tested the specific failing test case and it now passes
  • Lint checks pass with no issues
  • This addresses the race condition mentioned in the Slack thread where Brian Lai noted "ordering should not matter in the case of this test"

Link to Devin run: https://app.devin.ai/sessions/3251fb4d62c54774ab4d264d4ffe8b9a
Requested by: @aaronsteers

Summary by CodeRabbit

  • Tests
    • Made concurrency-related tests order-insensitive so they validate content regardless of record ordering.
    • Switched assertions to compare normalized, order-agnostic record sets to prevent flaky failures from non-deterministic ordering.
    • Improves CI stability and developer confidence with no changes to user-facing behavior.

The test_read_concurrent_declarative_source test was failing on Python 3.13
due to race conditions in concurrent processing that caused records to be
returned in different orders. This change replaces the strict list equality
assertion with a sorted comparison that validates the same records are
present without caring about their order.

Fixes the flaky test failure:
test_no_pagination_with_partition_router-manifest4-pages4-expected_records4-expected_calls4

Co-Authored-By: AJ Steers <[email protected]>
@Copilot Copilot AI review requested due to automatic review settings August 29, 2025 23:26
Contributor

Original prompt from AJ Steers
SYSTEM:
=== BEGIN THREAD HISTORY (in #proj-rip-legacy-python-cdk) ===
Pedro Lopez (U038J3BLW3C): Saw this one now on master, looks like an ordering issue:
<https://github.com/airbytehq/airbyte-python-cdk/actions/runs/17331775755/job/49209201586#step:8:33549>

FAILED unit_tests/sources/declarative/test_concurrent_declarative_source.py::test_read_concurrent_declarative_source[test_no_pagination_with_partition_router-manifest4-pages4-expected_records4-expected_calls4] - AssertionError: assert [{'ABC': 2, '...artition': 0}] == [{'ABC': 0, '...artition': 1}]
At index 0 diff: {'ABC': 2, 'partition': 1} != {'ABC': 0, 'partition': 0}
Full diff:
  [
+     {'ABC': 2, 'partition': 1},
      {'ABC': 0, 'partition': 0},
      {'AED': 1, 'partition': 0},
-     {'ABC': 2, 'partition': 1},
  ]
==== 1 failed, 3754 passed, 12 skipped, 2336 warnings in 680.00s (0:11:19) =====

Brian Lai (U03APJQ895J): damn, well that's a new one. i'll give this a quick look

Brian Lai (U03APJQ895J): interesting, that's new. i didn't realize we were testing 3.12 and 3.13 on CI, but that tracks since AJ has been working on bumping to newer versions

Brian Lai (U03APJQ895J): the output is correct on both sides, just ordering by the looks of it. I'm curious if changing python versions would change anything about the threading or which partition ends up processed first, so i'll kick off another run to see.

The test could also stand to be less brittle and check existence instead of strict ordering since we don't technically guarantee that an arbitrary parent partition is processed in order

Brian Lai (U03APJQ895J): it def doesn't have anything to do with your change tho obviously

AJ Steers (U05AKF1BCC9): I ran into this also. Looks like maybe flaky on 3.13? (Race condition ordering?)

https://github.com/airbytehq/airbyte-python-cdk/actions/runs/17332099415/job/49210272368?pr=738

AJ Steers (U05AKF1BCC9): <@U03APJQ895J> - We do have the option to mark that as a con... (1085 chars truncated...)


Contributor

devin-ai-integration bot commented Aug 29, 2025

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions github-actions bot added the bug and security labels Aug 29, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1756509957-fix-concurrent-test-ordering#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1756509957-fix-concurrent-test-ordering

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Fixes a flaky test in the concurrent declarative source test suite by making the test order-agnostic. The test was failing intermittently on Python 3.13 due to race conditions where concurrent threads returned records in different orders.

  • Replaces strict list equality check with sorted comparison using a composite sort key
  • Maintains test validity while accommodating non-deterministic ordering in concurrent processing

message.record.data for message in _run_read(manifest, _stream_name) if message.record
]
- assert output_data == expected_records
+ assert sorted(output_data, key=lambda x: (x.get("partition", 0), x.get("ABC", 0), x.get("AED", 0))) == sorted(expected_records, key=lambda x: (x.get("partition", 0), x.get("ABC", 0), x.get("AED", 0)))

Copilot AI Aug 29, 2025

The sorting key logic is duplicated and creates a very long line. Consider extracting the key function to improve readability and reduce duplication: sort_key = lambda x: (x.get("partition", 0), x.get("ABC", 0), x.get("AED", 0)) then use assert sorted(output_data, key=sort_key) == sorted(expected_records, key=sort_key)
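
A sketch of how the extracted key might look. It uses a named function instead of assigning a lambda to a variable, which is a style assumption on my part rather than something requested in the review; `record_sort_key` and the sample records are hypothetical stand-ins for the values built inside the test.

```python
def record_sort_key(record):
    """Composite key from the PR: partition, then ABC, then AED, each defaulting to 0."""
    return (
        record.get("partition", 0),
        record.get("ABC", 0),
        record.get("AED", 0),
    )


# Illustrative stand-ins for the values the test computes.
output_data = [{"ABC": 2, "partition": 1}, {"ABC": 0, "partition": 0}]
expected_records = [{"ABC": 0, "partition": 0}, {"ABC": 2, "partition": 1}]

# The key is defined once, so the assertion fits on a short line.
assert sorted(output_data, key=record_sort_key) == sorted(
    expected_records, key=record_sort_key
)
```

Defining the key once keeps both `sorted()` calls in sync if the fields ever change.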

Copilot uses AI. Check for mistakes.

Contributor

coderabbitai bot commented Aug 29, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

The test for concurrent declarative source was changed to perform an order-insensitive comparison: both actual and expected record lists are sorted by a composite key (partition default 0, ABC default 0, AED default 0) before asserting equality.

Changes

Cohort / File(s) | Summary
Tests: concurrency order-insensitive assertion (unit_tests/sources/declarative/test_concurrent_declarative_source.py) | Assertion updated to sort actual and expected records by composite key (partition→default 0, ABC→default 0, AED→default 0) before comparison to remove dependency on record ordering under concurrency.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • pedroslopez — wdyt?
  • brianjlai — wdyt?

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6d66480 and c70821e.

📒 Files selected for processing (1)
  • unit_tests/sources/declarative/test_concurrent_declarative_source.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • unit_tests/sources/declarative/test_concurrent_declarative_source.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Pytest (All, Python 3.12, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.13, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Manifest Server Docker Image Build
  • GitHub Check: SDM Docker Image Build

github-actions bot commented Aug 29, 2025

PyTest Results (Fast)

3 764 tests  ±0   3 752 ✅ ±0   6m 41s ⏱️ -2s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit c70821e. ± Comparison against base commit e5a1fc2.

♻️ This comment has been updated with latest results.


github-actions bot commented Aug 29, 2025

PyTest Results (Full)

3 767 tests  ±0   3 755 ✅ ±0   11m 18s ⏱️ -7s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit c70821e. ± Comparison against base commit e5a1fc2.

♻️ This comment has been updated with latest results.

Fixes the Ruff Format Check CI failure by breaking the long assertion
line into multiple lines for better readability.

Co-Authored-By: AJ Steers <[email protected]>
Contributor

@brianjlai brianjlai left a comment

one adjustment then can ✅

Addresses PR comment about duplicated lambda function and long line.
Extracts the sorting key logic into a reusable variable to eliminate
duplication and improve code readability while maintaining the same
order-agnostic comparison functionality.

Co-Authored-By: AJ Steers <[email protected]>
@aaronsteers
Contributor Author

/format-fix

Labels
bug Something isn't working security