@llbbl llbbl commented Sep 1, 2025

Set up Python Testing Infrastructure

Summary

This PR establishes a complete testing infrastructure for the ML project using Poetry as the package manager and pytest as the testing framework.

Changes Made

Package Management

  • Set up Poetry as the package manager with pyproject.toml configuration
  • Migrated dependencies from requirements.txt to Poetry format with proper version constraints
  • Added testing dependencies: pytest, pytest-cov, pytest-mock
  • Set the Python version constraint to >=3.8,<3.12 for TensorFlow compatibility

Testing Configuration

  • Configured pytest with comprehensive settings in pyproject.toml:
    • Test discovery patterns for test_*.py and *_test.py files
    • Coverage reporting with 80% threshold requirement
    • HTML and XML coverage report generation
    • Custom markers: unit, integration, slow
    • Strict configuration and marker validation

Directory Structure

  • Created testing directories:
    tests/
    ├── __init__.py
    ├── conftest.py           # Shared fixtures
    ├── unit/
    │   └── __init__.py
    ├── integration/
    │   └── __init__.py
    └── test_infrastructure.py  # Validation tests
    

Shared Testing Fixtures

  • Created conftest.py with ML/AI-specific fixtures:
    • temp_dir, temp_file - Temporary filesystem utilities
    • mock_wandb - Mock Weights & Biases logging
    • mock_tensorflow - Mock TensorFlow imports
    • mock_huggingface_hub - Mock Hugging Face Hub operations
    • mock_dvc - Mock DVC operations
    • sample_data - Sample ML training/test data
    • sample_params, sample_config_file - Configuration utilities
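A minimal sketch of how a few of these fixtures might look in conftest.py. The fixture names come from the list above; the fixture bodies, the helper names make_sample_data and install_fake_module, and the shape of the sample data are assumptions for illustration, not the PR's actual code.

```python
# tests/conftest.py (illustrative sketch, not the PR's actual file)
import shutil
import sys
import tempfile
from pathlib import Path
from unittest.mock import MagicMock

import pytest


def make_sample_data(n_train=8, n_test=2, n_features=3):
    """Hypothetical helper: a small deterministic train/test split as plain lists."""

    def rows(n, offset):
        return [[float(offset + i + j) for j in range(n_features)] for i in range(n)]

    return {
        "x_train": rows(n_train, 0),
        "y_train": [i % 2 for i in range(n_train)],
        "x_test": rows(n_test, n_train),
        "y_test": [i % 2 for i in range(n_test)],
    }


def install_fake_module(monkeypatch, name):
    """Hypothetical helper: register a MagicMock under `name` in sys.modules."""
    fake = MagicMock(name=name)
    monkeypatch.setitem(sys.modules, name, fake)
    return fake


@pytest.fixture
def temp_dir():
    """Yield a temporary directory that is removed after the test."""
    path = Path(tempfile.mkdtemp())
    yield path
    shutil.rmtree(path, ignore_errors=True)


@pytest.fixture
def sample_data():
    """Provide the small in-memory dataset built by make_sample_data."""
    return make_sample_data()


@pytest.fixture
def mock_wandb(monkeypatch):
    """Make `import wandb` resolve to a mock so no run is ever logged."""
    return install_fake_module(monkeypatch, "wandb")
```

Keeping the logic in plain helpers and the fixtures as thin wrappers means the helpers can also be exercised outside of pytest's fixture machinery.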

Development Commands

  • Set up Poetry scripts:
    • poetry run test - Run all tests
    • poetry run tests - Alternative command
    • Both commands support all standard pytest options
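Poetry script entries map a command name to a Python callable, so poetry run test needs an entry point somewhere in the project. The PR description does not name the module or function it uses, so the sketch below assumes a hypothetical run_tests function that forwards its command-line arguments to pytest.

```python
# Hypothetical entry point that a [tool.poetry.scripts] entry could reference.
import subprocess
import sys


def build_pytest_command(extra_args):
    """Build the pytest invocation, passing through any extra CLI options."""
    return [sys.executable, "-m", "pytest", *extra_args]


def run_tests():
    """Run pytest with whatever options followed `poetry run test`."""
    command = build_pytest_command(sys.argv[1:])
    sys.exit(subprocess.call(command))
```

Forwarding sys.argv[1:] unchanged is what would let options such as -v or -m unit reach pytest.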

Project Configuration

  • Updated .gitignore with testing patterns:
    • Test artifacts: .pytest_cache/, .coverage, htmlcov/
    • Python artifacts: __pycache__/, *.pyc, virtual environments
    • IDE and OS files
    • Claude Code settings: .claude/*

Validation

  • Created infrastructure validation tests that verify:
    • Python version compatibility (3.8+)
    • Testing framework imports work correctly
    • Project directory structure exists
    • Shared fixtures are available and functional
    • Custom pytest markers work
    • Temporary file/directory utilities work

Testing Instructions

Running Tests

# Install dependencies
poetry install

# Run all tests
poetry run test

# Run tests with specific options
poetry run test -v                   # Verbose output
poetry run test --no-cov             # Skip coverage
poetry run test -m unit              # Only unit tests
poetry run test -k "infrastructure"  # Tests matching pattern

Coverage Reports

  • Terminal output: Shows coverage percentage and missing lines
  • HTML report: Generated in htmlcov/ directory
  • XML report: Generated as coverage.xml for CI/CD integration
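For CI/CD use, the XML report can be read programmatically. A minimal sketch, assuming coverage.xml is the Cobertura-style report that coverage.py writes, with the overall rate in the root element's line-rate attribute. The function names and the THRESHOLD constant are hypothetical; the 0.80 value mirrors the threshold described in this PR.

```python
import xml.etree.ElementTree as ET

THRESHOLD = 0.80  # mirrors the 80% threshold described in this PR


def line_coverage(xml_text):
    """Return overall line coverage (0.0 to 1.0) from a Cobertura-style report."""
    root = ET.fromstring(xml_text)
    return float(root.attrib["line-rate"])


def meets_threshold(xml_text, threshold=THRESHOLD):
    """Report whether the coverage in xml_text reaches the given threshold."""
    return line_coverage(xml_text) >= threshold
```

A CI step could call meets_threshold on the text of coverage.xml and fail the job when it returns False.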

Writing New Tests

  1. Unit tests: Place in tests/unit/
  2. Integration tests: Place in tests/integration/
  3. Use shared fixtures: Available from conftest.py
  4. Mark tests: Use @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.slow

Dependencies

Production Dependencies (from requirements.txt)

  • dvc[gdrive]==2.10.2 - Data version control
  • wandb==0.12.19 - Experiment tracking
  • tensorflow==2.8 - ML framework
  • typer==0.4.1 - CLI framework
  • docopt==0.6.2 - Command line parsing
  • huggingface-hub - Model hub integration

Testing Dependencies (new)

  • pytest ^7.4.0 - Testing framework
  • pytest-cov ^4.1.0 - Coverage reporting
  • pytest-mock ^3.11.0 - Mocking utilities

Notes

  • Python version constraint set to >=3.8,<3.12 due to TensorFlow compatibility requirements
  • Coverage threshold set to 80% - can be adjusted in pyproject.toml
  • Testing infrastructure only - no actual unit tests for the codebase are included
  • Ready for development - developers can immediately start writing tests using the provided fixtures and configuration

Future Improvements

  • Add CI/CD integration with coverage reporting
  • Set up automated testing on pull requests
  • Add performance testing framework for ML model benchmarking
  • Integrate with code quality tools (black, flake8, mypy)

- Set up Poetry as package manager with pyproject.toml configuration
- Migrated existing dependencies from requirements.txt to Poetry format
- Added pytest, pytest-cov, and pytest-mock as testing dependencies
- Created comprehensive testing directory structure (tests/unit/, tests/integration/)
- Configured pytest with coverage reporting (80% threshold, HTML/XML output)
- Added custom markers for unit, integration, and slow tests
- Created conftest.py with shared fixtures for ML/AI testing (mocks for wandb, tensorflow, etc.)
- Set up Poetry scripts for `poetry run test` and `poetry run tests` commands
- Updated .gitignore with testing-related patterns and build artifacts
- Added validation tests to verify infrastructure setup

The testing infrastructure is now ready for developers to start writing tests.
@kkweon kkweon requested a review from deep-diver September 2, 2025 15:45