Conversation

@karangattu karangattu commented Jul 24, 2025

This pull request introduces an AI-powered test generator for Shiny applications, allowing users to automatically generate Playwright tests for their apps using Anthropic or OpenAI models. The implementation includes a new CLI workflow for test generation, supporting both interactive and non-interactive modes, and adds robust validation and error handling. Additionally, the PR introduces new developer workflows to ensure documentation and test prompt quality, and updates the project dependencies and documentation accordingly.

Changes include:

  • Added a new `shiny add test` CLI command that generates comprehensive Playwright test files for Shiny apps using AI models from Anthropic or OpenAI, with options for provider and model selection, and improved validation and user guidance.

  • Removed the old test file creation logic and replaced it with the new AI-based generator, including improved error handling and interactive prompts. (shiny/_main_add_test.py)

  • Added a GitHub Actions workflow to validate that Playwright test generation prompts and results are up to date, running test generation and evaluation, and commenting on PRs with results.

  • Introduced a workflow to check that documentation for Playwright controller changes is kept in sync, automatically prompting contributors to update docs as needed.

  • Added Makefile targets and scripts for generating and processing testing documentation using repomix and a custom Python script, ensuring that AI test generation has access to the latest controller API docs.

  • Updated pyproject.toml with a new testgen dependency group for AI test generation, including required packages.
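The provider and model options described above could be modeled with stdlib `argparse`. This is a minimal sketch only: the flag names, defaults, and helper name (`parse_testgen_args`) are hypothetical, and the real option handling lives in `shiny/_main_add_test.py`.

```python
import argparse

# Hypothetical sketch of provider/model selection for an AI test-generation
# command. Flag names and defaults are illustrative, not the actual CLI.
def parse_testgen_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="shiny add test")
    parser.add_argument("--app", required=True, help="Path to the Shiny app file")
    parser.add_argument(
        "--provider",
        choices=["anthropic", "openai"],  # unknown values are rejected by argparse
        default="anthropic",
        help="AI provider used to generate the Playwright test",
    )
    parser.add_argument(
        "--model",
        default=None,
        help="Model name; falls back to the provider's default when omitted",
    )
    return parser.parse_args(argv)

args = parse_testgen_args(["--app", "app.py", "--provider", "openai"])
print(args.provider)  # openai
```

Using `choices=` gives the "improved validation" behavior for free: an unsupported provider fails at parse time with a clear usage message.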

karangattu and others added 23 commits July 24, 2025 19:57
Introduces an AI-powered test generator (with CLI integration), evaluation suite with sample apps and scripts, and utility tools for documentation and quality control. Updates the CLI to support 'shiny generate test' for automated test creation using Anthropic or OpenAI models. Adds extensive documentation and example apps for robust evaluation and development workflows.
The workflow now checks for changes in documentation_testing.json and, if detected, creates a pull request using the peter-evans/create-pull-request action instead of pushing directly.
This update adds environment variable checks for ANTHROPIC_API_KEY and OPENAI_API_KEY when the respective provider is selected. If the required API key is not set, a clear error message is shown and the process exits, improving user guidance and preventing runtime errors.
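The API-key guard this commit describes might look like the following sketch (the function name and message wording are assumptions; only the environment variable names come from the commit message):

```python
import os
import sys

# Maps each supported provider to the environment variable it requires.
REQUIRED_KEYS = {"anthropic": "ANTHROPIC_API_KEY", "openai": "OPENAI_API_KEY"}

def check_api_key(provider: str) -> str:
    """Exit with a clear message if the provider's API key is not set."""
    env_var = REQUIRED_KEYS[provider]
    key = os.environ.get(env_var)
    if not key:
        # sys.exit with a string prints it to stderr and exits with status 1
        sys.exit(f"Error: {env_var} is not set; export it before running the generator.")
    return key
```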
This refactor improves maintainability and user experience when generating AI-powered test files.
Changed import of ShinyTestGenerator to use a relative path in create_test_metadata.py. Updated test_shiny_import.py to exclude '/testing/evaluation/apps/' from the tested paths.
Deleted the add_test command and its implementation, consolidating test file creation under the AI-powered test generation command. Updated CLI options and refactored parameter names for consistency. Also adjusted MAX_TOKENS in the test generator config.
Removed the auto-update workflow for testing documentation, added a new workflow to validate changes in the controller directory and prompt for documentation updates, and renamed the conventional commits workflow for clarity.
Renamed and relocated all files from shiny/testing/evaluation/ to tests/inspect-ai/ to better organize evaluation test assets and scripts under the tests directory.
Changed .gitignore, pyrightconfig.json, and test_shiny_import.py to reference 'tests/inspect-ai' instead of 'shiny/testing/evaluation'. This aligns configuration and test filtering with the new directory structure.
Moved all testing generator and utility modules from shiny/testing to shiny/pytest/generate for improved organization and clarity. Updated imports, workflow paths, and resource references accordingly. Removed obsolete shiny/testing/__init__.py and README.md.
Renamed and relocated utility scripts and related files from shiny/pytest/generate/utils to tests/inspect-ai/utils for improved organization. Updated workflow references to match new paths.
Introduces new Makefile targets to automate the process of updating testing documentation, including installing repomix, generating repomix output, processing documentation, and cleaning up temporary files. Also renames and updates the GitHub workflow to instruct contributors to use the new Makefile command for documentation updates.
Changed Makefile to check for repomix using 'command -v' instead of 'npm list -g'. Expanded and corrected the documentation_testing.json API definitions, adding new controller methods, fixing parameter type formatting, and improving descriptions for clarity and completeness.
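The shell idiom `command -v repomix` checks PATH rather than the npm global package list; `shutil.which` is the Python analogue. A sketch of the same guard (function name and install hint are illustrative):

```python
import shutil
import sys

def require_tool(name: str) -> str:
    """Return the tool's path, or exit if it is not on PATH."""
    path = shutil.which(name)  # None when the executable cannot be found
    if path is None:
        sys.exit(f"{name} is not installed; install it and re-run.")
    return path
```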
Added instructions to skip testing icon and plot functionality in SYSTEM_PROMPT_testing.md. This clarifies the scope of tests and avoids unnecessary coverage for icons and plots.
Added 'chatlas[anthropic]', 'chatlas[openai]', and 'inspect-ai' to the test dependencies to support additional testing capabilities.
Introduces a new GitHub Actions workflow to validate test generation prompts in the 'shiny/pytest/generate' directory. Also renames workflow files for consistency, updates .gitignore to exclude new result and metadata files, and improves path handling in test metadata and evaluation scripts for robustness.
Replaces pip install with pip upgrade and Makefile targets for installing dependencies in the validate-test-generation-prompts GitHub Actions workflow.
Introduces a new step to set up py-shiny in the validate-test-generation-prompts GitHub Actions workflow before installing dependencies.
Replaces custom py-shiny setup and Makefile commands with pip install for test dependencies. Refactors comment formatting in evaluation results for improved readability and consistency.
Set fetch-depth to 0 for full git history, install dev dependencies along with test dependencies, and fix formatting in error comment for evaluation results. These changes improve workflow reliability and ensure all required packages are available.
Replaces the use of actions/github-script with marocchino/sticky-pull-request-comment for posting AI evaluation results on pull requests. Adds a step to prepare the comment body and writes it to a file, improving error handling and ensuring comments are updated rather than duplicated.
Adds environment variables for Python version and attempt count, implements caching for Python dependencies and Playwright browsers, and improves Playwright installation steps. These changes reduce redundant installs and speed up workflow execution.
@karangattu karangattu marked this pull request as ready for review July 25, 2025 15:26
@karangattu karangattu requested a review from schloerke July 25, 2025 15:26
Moved inspect-ai from pyproject.toml test dependencies to explicit installation in the GitHub Actions workflow. This change ensures inspect-ai is installed only during CI runs and not as a default test dependency.
Standardized quotes to double quotes in the workflow YAML and updated the pip cache key to use only 'pyproject.toml'. Removed unnecessary blank lines for improved readability.
Documented new feature: `shiny add test` command now uses AI models from Anthropic or OpenAI to automatically generate Playwright tests for Shiny applications.
@posit-dev posit-dev deleted a comment from github-actions bot Jul 25, 2025
Renamed GitHub Actions workflow files to use 'verify' instead of 'validate' in their filenames for consistency and clarity.
@posit-dev posit-dev deleted a comment from github-actions bot Aug 22, 2025
Pytest results handling and reporting have been removed from the prepare_comment.py script as they are not working properly. The overall result now only reflects the Inspect AI quality gate status.
@posit-dev posit-dev deleted a comment from github-actions bot Aug 22, 2025
Eliminates reading and handling of pytest and combined summary results in prepare_comment.py, as these features are currently not working properly.

github-actions bot commented Aug 22, 2025

Test Generation Evaluation Results (Averaged across 3 attempts)

🔍 Inspect AI Test Quality Evaluation

  • Complete (C): 7.3
  • Partial (P): 1.7
  • Incomplete (I): 0.0
  • Passing Rate: 9.0/9.0 (100.0%)
  • Quality Gate: ✅ PASSED (≥80% required)

🎯 Overall Result

✅ PASSED - Quality gate based on Inspect AI results


Results are averaged across 3 evaluation attempts for improved reliability.
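The averaging above can be illustrated with a small sketch. The per-attempt values below are invented to reproduce the reported averages; they are not the real run data.

```python
# Illustrative per-attempt scores (not the actual evaluation output).
attempts = [
    {"C": 8, "P": 1, "I": 0},
    {"C": 7, "P": 2, "I": 0},
    {"C": 7, "P": 2, "I": 0},
]

def average(key: str) -> float:
    """Mean of one score across attempts, rounded to one decimal place."""
    return round(sum(a[key] for a in attempts) / len(attempts), 1)

print(average("C"), average("P"), average("I"))  # 7.3 1.7 0.0
```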

karangattu and others added 21 commits August 29, 2025 11:16
Move dotenv loading to shiny/_main_generate_test.py to ensure environment variables are loaded before API key validation, without requiring the generator to manage dotenv or logging. Remove dotenv and logging setup from ShinyTestGenerator in shiny/pytest/_generate/_main.py for cleaner separation of concerns.
* main:
  fix: Make outside users able to read tmp files (#2070)
  docs: update module server and ui to incorporate #705 (#2044)
  fix: errors on bookmark are now surfaced in the Python console (#2076)
  Add Connect Cloud as a hosting option in README (#2074)
  Update changelog
  Update changelog
  fix: include_css and include_js can use files in same dir (#2069)
`ERROR: OpenAI API requires at least version 1.104.1 of package openai (you have version 1.102.0 installed).`

Be sure anthropic/inspect-ai are up to date
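A minimum-version check like the one behind that error can be sketched with plain tuple comparison on dotted version strings. This is a simplification (the `openai` package uses proper version parsing, and this sketch does not handle pre-release suffixes):

```python
def version_tuple(v: str) -> tuple[int, ...]:
    """Convert '1.104.1' into (1, 104, 1) for lexicographic comparison."""
    return tuple(int(part) for part in v.split("."))

def meets_minimum(installed: str, required: str) -> bool:
    return version_tuple(installed) >= version_tuple(required)

print(meets_minimum("1.102.0", "1.104.1"))  # False: 102 < 104
print(meets_minimum("1.104.1", "1.104.1"))  # True
```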
@schloerke schloerke changed the title chore(test-generation): Integrate test generator feat(cli): Add AI support to shiny add test Sep 6, 2025
@schloerke schloerke merged commit 83d3952 into main Sep 6, 2025
130 of 131 checks passed
@schloerke schloerke deleted the integrate-test-generator branch September 6, 2025 01:33