Skip to content

Conversation

maxi297
Copy link
Contributor

@maxi297 maxi297 commented Jul 31, 2025

What

Following the removal of unused stuff regarding availability, we want to support availability checks as much as possible as is for the new AbstractStream interface until we migrate fully to it.

What might change when we're done with the migration?

  • The streams might not be provided through AbstractSource.streams
  • We can remove the DeclarativeStream stuff

How

Assuming steams can return AbstractStream within the list and if it is the case, availability is done through AbstractStream.check_availability.

Note that some of the exit_on_rate_limit logic wasn't ported from the legacy CDK because nothing sets this in low code and we've only seen one instance of a source playing with this concept (source-github). For more information, see this Slack thread

Also note that as part of this PR, the most of the added code will not run in prod because we never return an AbstractStream in AbstractSource.streams. In order to test it though, I replace the return statement in ModelToComponentFactory.create_declarative_stream` by the following:

        stream_name = model.name or ""
        cursor = combined_slicers if combined_slicers and isinstance(combined_slicers, Cursor) else FinalStateCursor(stream_name, None, self._message_repository)
        partition_generator = StreamSlicerPartitionGenerator(
            DeclarativePartitionFactory(
                stream_name,
                schema_loader.get_json_schema(),
                retriever,
                self._message_repository,
            ),
            cursor,
        )

        return DefaultStream(
            partition_generator=partition_generator,
            name=stream_name,
            json_schema=schema_loader.get_json_schema(),
            # FIXME it seems this is what we do in the ConcurrentDeclarativeSource but it feels wrong
            primary_key=get_primary_key_from_stream(primary_key),
            cursor_field=cursor.cursor_field.cursor_field_key if hasattr(cursor, "cursor_field") else "",
            # FIXME we should have the cursor field has part of the interface of cursor
            logger=logging.getLogger(f"airbyte.{stream_name}"),
            # FIXME this is a breaking change compared to the old implementation,
            cursor=cursor,
        )

Summary by CodeRabbit

  • Refactor

    • Streamlined stream availability checks by consolidating and simplifying availability logic across core and file-based streams.
    • Removed deprecated and redundant availability strategy classes and related wrappers.
    • Updated internal interfaces to use a unified availability representation and direct checks within streams.
    • Eliminated availability strategy usage from concurrent stream adapters and declarative sources.
  • Tests

    • Updated and added tests to reflect the new availability check mechanisms, removing tests for deprecated strategies and adding coverage for new behaviors.
    • Enhanced test mocks with explicit type specifications for improved reliability.
  • Chores

    • Cleaned up unused imports and deprecated decorators for improved maintainability.

@github-actions github-actions bot added the chore label Jul 31, 2025
Copy link

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@maxi297/availability_strategy_to_support_abstract_stream#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch maxi297/availability_strategy_to_support_abstract_stream

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe <command> - Runs any poe command in the CDK environment

📝 Edit this welcome message.

@maxi297 maxi297 requested review from tolik0 and brianjlai July 31, 2025 20:50
@maxi297
Copy link
Contributor Author

maxi297 commented Jul 31, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

Copy link

github-actions bot commented Jul 31, 2025

PyTest Results (Fast)

3 697 tests  +2   3 686 ✅ +2   6m 38s ⏱️ +3s
    1 suites ±0      11 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 2bc4b30. ± Comparison against base commit f976c72.

This pull request removes 5 and adds 7 tests. Note that renamed tests count towards both.
unit_tests.sources.streams.concurrent.test_adapters ‑ test_availability_strategy_facade[test_stream_is_available0]
unit_tests.sources.streams.concurrent.test_adapters ‑ test_availability_strategy_facade[test_stream_is_available1]
unit_tests.sources.streams.concurrent.test_adapters ‑ test_availability_strategy_facade[test_stream_is_available_using_singleton]
unit_tests.sources.streams.concurrent.test_adapters.StreamFacadeTest ‑ test_check_availability_is_delegated_to_wrapped_stream
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_check_availability
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_AirbyteTracedException_when_generating_partitions_when_get_availability_then_unavailable
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_AirbyteTracedException_when_reading_records_when_get_availability_then_unavailable
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_no_partitions_when_get_availability_then_unavailable
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_no_records_when_get_availability_then_available
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_records_when_get_availability_then_available
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_unknown_error_when_generating_partitions_when_get_availability_then_raise
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_unknown_error_when_reading_record_when_get_availability_then_raise

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Jul 31, 2025

PyTest Results (Full)

3 700 tests  +2   3 689 ✅ +2   11m 39s ⏱️ +6s
    1 suites ±0      11 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 2bc4b30. ± Comparison against base commit f976c72.

This pull request removes 5 and adds 7 tests. Note that renamed tests count towards both.
unit_tests.sources.streams.concurrent.test_adapters ‑ test_availability_strategy_facade[test_stream_is_available0]
unit_tests.sources.streams.concurrent.test_adapters ‑ test_availability_strategy_facade[test_stream_is_available1]
unit_tests.sources.streams.concurrent.test_adapters ‑ test_availability_strategy_facade[test_stream_is_available_using_singleton]
unit_tests.sources.streams.concurrent.test_adapters.StreamFacadeTest ‑ test_check_availability_is_delegated_to_wrapped_stream
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_check_availability
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_AirbyteTracedException_when_generating_partitions_when_get_availability_then_unavailable
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_AirbyteTracedException_when_reading_records_when_get_availability_then_unavailable
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_no_partitions_when_get_availability_then_unavailable
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_no_records_when_get_availability_then_available
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_records_when_get_availability_then_available
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_unknown_error_when_generating_partitions_when_get_availability_then_raise
unit_tests.sources.streams.concurrent.test_default_stream.ThreadBasedConcurrentStreamTest ‑ test_given_unknown_error_when_reading_record_when_get_availability_then_raise

♻️ This comment has been updated with latest results.

@maxi297
Copy link
Contributor Author

maxi297 commented Jul 31, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

octavia-squidington-iii and others added 3 commits July 31, 2025 21:18
@maxi297 maxi297 changed the base branch from maxi297/remove-availability-strategy-except-for-filebased to main August 5, 2025 10:57
Copy link
Contributor

coderabbitai bot commented Aug 5, 2025

📝 Walkthrough

Walkthrough

This set of changes refactors the availability checking mechanism across several modules. It removes deprecated and strategy-based availability classes, consolidates availability representation, updates type hints, and migrates to direct, inline, or standalone function-based availability checks for streams. Related test suites are updated to match the new approach.

Changes

Cohort / File(s) Change Summary
Declarative Stream Availability Checks
airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py, airbyte_cdk/sources/declarative/checks/check_stream.py
Refactored to use a new evaluate_availability function for checking stream availability, updated type hints for streams, and removed direct dependencies on HttpAvailabilityStrategy.
Concurrent Stream Availability & Adapters
airbyte_cdk/sources/streams/concurrent/availability_strategy.py, airbyte_cdk/sources/streams/concurrent/adapters.py, airbyte_cdk/sources/streams/concurrent/default_stream.py, airbyte_cdk/sources/streams/concurrent/abstract_stream.py
Consolidated availability representation into a single StreamAvailability class, removed deprecated strategy classes and adapter logic, and implemented direct availability checks in DefaultStream.
File-Based Stream Availability
airbyte_cdk/sources/file_based/availability_strategy/__init__.py, airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py, airbyte_cdk/sources/file_based/stream/abstract_file_based_stream.py, airbyte_cdk/sources/file_based/stream/concurrent/adapters.py
Removed wrapper and deprecated availability strategy classes, updated method signatures, and removed related imports and decorators.
Concurrent Declarative Source
airbyte_cdk/sources/declarative/concurrent_declarative_source.py
Removed usage of AlwaysAvailableAvailabilityStrategy when creating DefaultStream instances.
Core Streams Availability
airbyte_cdk/sources/streams/availability_strategy.py
Added a single comment line; no functional changes.
Declarative and Concurrent Stream Tests
unit_tests/sources/declarative/checks/test_check_stream.py, unit_tests/sources/streams/concurrent/test_adapters.py, unit_tests/sources/streams/concurrent/test_default_stream.py, unit_tests/sources/streams/concurrent/scenarios/thread_based_concurrent_stream_scenarios.py
Updated and refactored tests to align with the new availability check logic, removed deprecated strategy tests, and added new tests for direct availability checking in streams.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CheckStream
    participant Stream (or AbstractStream)

    User->>CheckStream: check_connection()
    CheckStream->>Stream: evaluate_availability(stream, logger)
    alt Stream is legacy
        CheckStream->>Stream: HttpAvailabilityStrategy().check_availability()
    else Stream is new
        CheckStream->>Stream: stream.check_availability()
    end
    Stream-->>CheckStream: (is_available, reason)
    CheckStream-->>User: (result)
Loading
sequenceDiagram
    participant DefaultStream
    participant PartitionGenerator
    participant Partition

    DefaultStream->>PartitionGenerator: generate_partitions()
    alt No partitions
        DefaultStream-->>Caller: StreamAvailability.unavailable("No partitions generated")
    else Exception during partitioning
        DefaultStream-->>Caller: StreamAvailability.unavailable(error_message)
    else
        DefaultStream->>Partition: read_records()
        alt Exception during reading
            DefaultStream-->>Caller: StreamAvailability.unavailable(error_message)
        else
            DefaultStream-->>Caller: StreamAvailability.available()
        end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~18 minutes

Possibly related PRs

  • airbytehq/airbyte-python-cdk#450: Refactors availability checking by introducing evaluate_availability, directly related to the abstraction now used in this PR.
  • airbytehq/airbyte-python-cdk#347: Introduces evaluate_availability and updates dynamic stream check configuration, closely related to availability logic changes here.
  • airbytehq/airbyte-python-cdk#293: Modifies CheckDynamicStream availability checks and adds a flag to control them, related to the refactoring of availability checking in this PR.

Would you like to cross-reference any of these PRs in your summary or changelog, wdyt?

Suggested labels

enhancement, airbyte-python-cdk

Would you like to add any additional labels for visibility, wdyt?

Suggested reviewers

  • aldogonzalez8

Anyone else you'd like to include for extra eyes on these refactors, wdyt?

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e8edc4b and 2bc4b30.

📒 Files selected for processing (1)
  • unit_tests/sources/streams/concurrent/test_default_stream.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • unit_tests/sources/streams/concurrent/test_default_stream.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Check: source-google-drive
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-shopify
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Analyze (python)
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch maxi297/availability_strategy_to_support_abstract_stream

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
airbyte_cdk/sources/streams/concurrent/availability_strategy.py (1)

10-16: Address naming consistency

As mentioned in a previous review, for consistency both available and unavailable methods should use the same convention - either both use cls or both use StreamAvailability. Currently one uses StreamAvailability and one uses cls.

     @classmethod
     def available(cls) -> "StreamAvailability":
-        return cls(True)
+        return StreamAvailability(True)

Or alternatively:

     @classmethod
     def unavailable(cls, reason: str) -> "StreamAvailability":
-        return cls(False, reason)
+        return StreamAvailability(False, reason)
🧹 Nitpick comments (3)
airbyte_cdk/sources/streams/availability_strategy.py (1)

17-17: Make the FIXME comment more specific, wdyt?

The current comment # FIXME this is quite vague. Since this appears to be part of the availability strategy refactoring mentioned in the PR, could you make it more descriptive? Something like # FIXME: Consider deprecating this class as part of AbstractStream migration would be more actionable.

-# FIXME this
+# FIXME: Consider deprecating this class as part of AbstractStream migration
airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (1)

13-13: Remove unused import, wdyt?

It looks like HttpAvailabilityStrategy is no longer used since the code now uses evaluate_availability function on line 47. Should we remove this import to clean up the unused reference?

-from airbyte_cdk.sources.streams.http.availability_strategy import HttpAvailabilityStrategy
unit_tests/sources/streams/concurrent/test_default_stream.py (1)

259-259: Consider using is False for boolean comparison

For more Pythonic code, consider using is False instead of == False for boolean comparisons. The same applies to is True comparisons throughout the test file, wdyt?

-        assert availability.is_available == False
+        assert availability.is_available is False
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f976c72 and e8edc4b.

📒 Files selected for processing (16)
  • airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (2 hunks)
  • airbyte_cdk/sources/declarative/checks/check_stream.py (5 hunks)
  • airbyte_cdk/sources/declarative/concurrent_declarative_source.py (0 hunks)
  • airbyte_cdk/sources/file_based/availability_strategy/__init__.py (1 hunks)
  • airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py (1 hunks)
  • airbyte_cdk/sources/file_based/stream/abstract_file_based_stream.py (0 hunks)
  • airbyte_cdk/sources/file_based/stream/concurrent/adapters.py (0 hunks)
  • airbyte_cdk/sources/streams/availability_strategy.py (1 hunks)
  • airbyte_cdk/sources/streams/concurrent/abstract_stream.py (1 hunks)
  • airbyte_cdk/sources/streams/concurrent/adapters.py (0 hunks)
  • airbyte_cdk/sources/streams/concurrent/availability_strategy.py (1 hunks)
  • airbyte_cdk/sources/streams/concurrent/default_stream.py (2 hunks)
  • unit_tests/sources/declarative/checks/test_check_stream.py (4 hunks)
  • unit_tests/sources/streams/concurrent/scenarios/thread_based_concurrent_stream_scenarios.py (0 hunks)
  • unit_tests/sources/streams/concurrent/test_adapters.py (0 hunks)
  • unit_tests/sources/streams/concurrent/test_default_stream.py (2 hunks)
💤 Files with no reviewable changes (6)
  • airbyte_cdk/sources/file_based/stream/abstract_file_based_stream.py
  • unit_tests/sources/streams/concurrent/test_adapters.py
  • unit_tests/sources/streams/concurrent/scenarios/thread_based_concurrent_stream_scenarios.py
  • airbyte_cdk/sources/declarative/concurrent_declarative_source.py
  • airbyte_cdk/sources/file_based/stream/concurrent/adapters.py
  • airbyte_cdk/sources/streams/concurrent/adapters.py
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the `declarative_component_schema.py` file is auto-generated from `declarative_component_schema.yaml` and should be ignored in the recommended reviewing order.
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.
📚 Learning: the files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from ...
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/_run.py:62-65
Timestamp: 2024-11-15T01:04:21.272Z
Learning: The files in `airbyte_cdk/cli/source_declarative_manifest/`, including `_run.py`, are imported from another repository, and changes to these files should be minimized or avoided when possible to maintain consistency.

Applied to files:

  • airbyte_cdk/sources/file_based/availability_strategy/__init__.py
  • airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py
  • unit_tests/sources/declarative/checks/test_check_stream.py
  • airbyte_cdk/sources/streams/concurrent/default_stream.py
  • airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py
📚 Learning: when code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repositor...
Learnt from: aaronsteers
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/cli/source_declarative_manifest/spec.json:9-15
Timestamp: 2024-11-15T00:59:08.154Z
Learning: When code in `airbyte_cdk/cli/source_declarative_manifest/` is being imported from another repository, avoid suggesting modifications to it during the import process.

Applied to files:

  • airbyte_cdk/sources/file_based/availability_strategy/__init__.py
  • airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py
  • airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py
📚 Learning: when modifying the `yamldeclarativesource` class in `airbyte_cdk/sources/declarative/yaml_declarativ...
Learnt from: ChristoGrab
PR: airbytehq/airbyte-python-cdk#58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the `YamlDeclarativeSource` class in `airbyte_cdk/sources/declarative/yaml_declarative_source.py`, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.

Applied to files:

  • airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py
  • unit_tests/sources/declarative/checks/test_check_stream.py
  • airbyte_cdk/sources/streams/concurrent/availability_strategy.py
  • airbyte_cdk/sources/streams/concurrent/default_stream.py
  • airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py
  • airbyte_cdk/sources/declarative/checks/check_stream.py
🧬 Code Graph Analysis (3)
airbyte_cdk/sources/file_based/availability_strategy/__init__.py (1)
airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py (1)
  • AbstractFileBasedAvailabilityStrategy (19-47)
unit_tests/sources/declarative/checks/test_check_stream.py (4)
unit_tests/sources/mock_server_tests/mock_source_fixture.py (2)
  • streams (406-413)
  • spec (415-430)
airbyte_cdk/sources/abstract_source.py (1)
  • streams (74-79)
unit_tests/sources/test_source.py (2)
  • streams (58-61)
  • streams (148-150)
airbyte_cdk/sources/streams/core.py (1)
  • Stream (118-703)
airbyte_cdk/sources/streams/concurrent/abstract_stream.py (2)
airbyte_cdk/sources/streams/concurrent/default_stream.py (1)
  • check_availability (97-125)
airbyte_cdk/sources/streams/concurrent/availability_strategy.py (1)
  • StreamAvailability (9-37)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Check: source-pokeapi
  • GitHub Check: Check: source-shopify
  • GitHub Check: Check: source-intercom
  • GitHub Check: Check: destination-motherduck
  • GitHub Check: Check: source-google-drive
  • GitHub Check: Check: source-hardcoded-records
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Analyze (python)
🔇 Additional comments (11)
airbyte_cdk/sources/file_based/availability_strategy/__init__.py (1)

1-7: LGTM on the cleanup!

The removal of AbstractFileBasedAvailabilityStrategyWrapper from the exports is consistent with the broader refactoring to eliminate deprecated availability strategy wrappers. The remaining exports for the base abstract class and default implementation look appropriate.

airbyte_cdk/sources/streams/concurrent/abstract_stream.py (1)

92-96: Good reorganization of the abstract methods!

Moving check_availability to be closer to the other core properties like cursor makes the interface organization more logical. The docstring update also makes the return value clearer.

unit_tests/sources/declarative/checks/test_check_stream.py (2)

19-19: Great addition for better test mocking!

Adding the Stream import to support the spec parameter in MagicMock is a good improvement for type safety.


49-49: Excellent improvement to mock specifications!

Using MagicMock(spec=Stream) instead of plain MagicMock() provides much better type safety in tests. This will catch attribute access errors early and ensures the mocks conform to the actual Stream interface. Nice consistency across all the test functions!

Also applies to: 81-81, 95-95

airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (3)

7-7: Good updates for AbstractStream support!

The import changes look good - adding the Union type and importing the new evaluate_availability function aligns well with the migration goals. The more specific import path for AbstractSource is also an improvement.

Also applies to: 9-12


38-38: Appropriate migration typing!

The Union type hint for supporting both Stream and AbstractStream with the type: ignore comment properly acknowledges this is a transitional state during the migration. This is exactly the kind of pragmatic approach needed for incremental migrations.


47-47: Excellent refactoring to unified availability checking!

Replacing the strategy pattern with the evaluate_availability function is much cleaner and aligns perfectly with the PR objective of supporting AbstractStream. This function can handle both stream types appropriately, which is exactly what's needed during the migration phase.

airbyte_cdk/sources/streams/concurrent/default_stream.py (1)

97-126: Well-implemented availability check!

The implementation correctly handles various scenarios and gracefully manages exceptions. The approach of checking the first record is efficient. Just one minor thought - the comment on lines 104-107 about legacy behavior could be clearer about what specific legacy pattern this addresses, wdyt?

airbyte_cdk/sources/file_based/availability_strategy/abstract_file_based_availability_strategy.py (1)

19-48: LGTM! Clean refactoring

The changes appropriately remove deprecated dependencies and the optional source parameter provides good backward compatibility during the migration period.

airbyte_cdk/sources/declarative/checks/check_stream.py (2)

17-30: Excellent migration abstraction!

The evaluate_availability function provides a clean way to handle both stream types during the transition period. The clear documentation and appropriate error handling for unsupported types make this a solid implementation.


71-71: Type ignore comment is well-justified

Good to see the type: ignore comment includes clear reasoning about this being a migration step. This helps future maintainers understand why it's there.

@maxi297 maxi297 merged commit e1664ec into main Aug 6, 2025
24 of 26 checks passed
@maxi297 maxi297 deleted the maxi297/availability_strategy_to_support_abstract_stream branch August 6, 2025 15:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants