feat(cli): Add AI support to shiny add test #2041
Merged
Conversation
Introduces an AI-powered test generator (with CLI integration), evaluation suite with sample apps and scripts, and utility tools for documentation and quality control. Updates the CLI to support 'shiny generate test' for automated test creation using Anthropic or OpenAI models. Adds extensive documentation and example apps for robust evaluation and development workflows.
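For orientation, here is a minimal sketch of how the generator described above could be driven programmatically. `ShinyTestGenerator` and its module path are named later in this PR, but the constructor arguments and method call below are illustrative assumptions, not the actual API:

```python
# Sketch only: the constructor arguments and generate_test() call are assumptions,
# not the actual ShinyTestGenerator API introduced in this PR.
from pathlib import Path

from shiny.pytest._generate._main import ShinyTestGenerator

generator = ShinyTestGenerator(provider="anthropic")    # or "openai" (assumed parameter)
test_code = generator.generate_test(app_file="app.py")  # assumed method name and signature
Path("test_app.py").write_text(test_code)               # save the generated Playwright test
```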
The workflow now checks for changes in documentation_testing.json and, if detected, creates a pull request using the peter-evans/create-pull-request action instead of pushing directly.
This update adds environment variable checks for ANTHROPIC_API_KEY and OPENAI_API_KEY when the respective provider is selected. If the required API key is not set, a clear error message is shown and the process exits, improving user guidance and preventing runtime errors.
This refactor improves maintainability and user experience when generating AI-powered test files.
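A minimal sketch of the kind of check this commit describes; the environment variable names come from the commit message, while the surrounding helper function is illustrative:

```python
import os
import sys


def check_api_key(provider: str) -> None:
    """Exit with a clear message if the selected provider's API key is missing (illustrative helper)."""
    env_var = "ANTHROPIC_API_KEY" if provider == "anthropic" else "OPENAI_API_KEY"
    if not os.environ.get(env_var):
        sys.exit(
            f"Error: {env_var} is not set. Set it in your environment "
            "(or a .env file) before running AI test generation."
        )
```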
Changed import of ShinyTestGenerator to use a relative path in create_test_metadata.py. Updated test_shiny_import.py to exclude '/testing/evaluation/apps/' from the tested paths.
Deleted the add_test command and its implementation, consolidating test file creation under the AI-powered test generation command. Updated CLI options and refactored parameter names for consistency. Also adjusted MAX_TOKENS in the test generator config.
Removed the auto-update workflow for testing documentation, added a new workflow to validate changes in the controller directory and prompt for documentation updates, and renamed the conventional commits workflow for clarity.
Renamed and relocated all files from shiny/testing/evaluation/ to tests/inspect-ai/ to better organize evaluation test assets and scripts under the tests directory.
Changed .gitignore, pyrightconfig.json, and test_shiny_import.py to reference 'tests/inspect-ai' instead of 'shiny/testing/evaluation'. This aligns configuration and test filtering with the new directory structure.
Moved all testing generator and utility modules from shiny/testing to shiny/pytest/generate for improved organization and clarity. Updated imports, workflow paths, and resource references accordingly. Removed obsolete shiny/testing/__init__.py and README.md.
Renamed and relocated utility scripts and related files from shiny/pytest/generate/utils to tests/inspect-ai/utils for improved organization. Updated workflow references to match new paths.
Introduces new Makefile targets to automate the process of updating testing documentation, including installing repomix, generating repomix output, processing documentation, and cleaning up temporary files. Also renames and updates the GitHub workflow to instruct contributors to use the new Makefile command for documentation updates.
Changed Makefile to check for repomix using 'command -v' instead of 'npm list -g'. Expanded and corrected the documentation_testing.json API definitions, adding new controller methods, fixing parameter type formatting, and improving descriptions for clarity and completeness.
Added instructions to skip testing icon and plot functionality in SYSTEM_PROMPT_testing.md. This clarifies the scope of tests and avoids unnecessary coverage for icons and plots.
Added 'chatlas[anthropic]', 'chatlas[openai]', and 'inspect-ai' to the test dependencies to support additional testing capabilities.
Introduces a new GitHub Actions workflow to validate test generation prompts in the 'shiny/pytest/generate' directory. Also renames workflow files for consistency, updates .gitignore to exclude new result and metadata files, and improves path handling in test metadata and evaluation scripts for robustness.
Replaces pip install with pip upgrade and Makefile targets for installing dependencies in the validate-test-generation-prompts GitHub Actions workflow.
Introduces a new step to set up py-shiny in the validate-test-generation-prompts GitHub Actions workflow before installing dependencies.
Replaces custom py-shiny setup and Makefile commands with pip install for test dependencies. Refactors comment formatting in evaluation results for improved readability and consistency.
Set fetch-depth to 0 for full git history, install dev dependencies along with test dependencies, and fix formatting in error comment for evaluation results. These changes improve workflow reliability and ensure all required packages are available.
Replaces the use of actions/github-script with marocchino/sticky-pull-request-comment for posting AI evaluation results on pull requests. Adds a step to prepare the comment body and writes it to a file, improving error handling and ensuring comments are updated rather than duplicated.
Adds environment variables for Python version and attempt count, implements caching for Python dependencies and Playwright browsers, and improves Playwright installation steps. These changes reduce redundant installs and speed up workflow execution.
Moved inspect-ai from pyproject.toml test dependencies to explicit installation in the GitHub Actions workflow. This change ensures inspect-ai is installed only during CI runs and not as a default test dependency.
Standardized quotes to double quotes in the workflow YAML and updated the pip cache key to use only 'pyproject.toml'. Removed unnecessary blank lines for improved readability.
Documented new feature: `shiny add test` command now uses AI models from Anthropic or OpenAI to automatically generate Playwright tests for Shiny applications.
Renamed GitHub Actions workflow files to use 'verify' instead of 'validate' in their filenames for consistency and clarity.
Pytest results handling and reporting have been removed from the prepare_comment.py script as they are not working properly. The overall result now only reflects the Inspect AI quality gate status.
Eliminates reading and handling of pytest and combined summary results in prepare_comment.py, as these features are currently not working properly.
Test Generation Evaluation Results (Averaged across 3 attempts)

🔍 Inspect AI Test Quality Evaluation

🎯 Overall Result: ✅ PASSED - Quality gate based on Inspect AI results. Results are averaged across 3 evaluation attempts for improved reliability.
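A rough sketch of what the simplified comment-preparation logic could look like after this change. The results file name and JSON fields below are assumptions; only the behavior of basing the overall status solely on the Inspect AI quality gate comes from the commits above:

```python
import json
from pathlib import Path

# Hypothetical results file and field names; the real prepare_comment.py may differ.
results = json.loads(Path("inspect_ai_results.json").read_text())
passed = bool(results.get("quality_gate_passed"))

status = "✅ PASSED" if passed else "❌ FAILED"
body = (
    "## Test Generation Evaluation Results\n\n"
    f"🎯 Overall Result: {status} - Quality gate based on Inspect AI results\n"
)
Path("comment_body.md").write_text(body)  # later posted via sticky-pull-request-comment
```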
Move dotenv loading to shiny/_main_generate_test.py to ensure environment variables are loaded before API key validation, without requiring the generator to manage dotenv or logging. Remove dotenv and logging setup from ShinyTestGenerator in shiny/pytest/_generate/_main.py for cleaner separation of concerns.
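A minimal sketch of the ordering this describes: load the .env file at the CLI entry point, then validate the key, and only then hand off to the generator. Apart from `load_dotenv`, the names below are illustrative:

```python
# Illustrative ordering only; the actual code in shiny/_main_generate_test.py may differ.
import os
import sys

from dotenv import load_dotenv


def generate_test_command(provider: str) -> None:
    # Load a local .env first so API keys defined there are visible to the check below.
    load_dotenv()
    env_var = "ANTHROPIC_API_KEY" if provider == "anthropic" else "OPENAI_API_KEY"
    if not os.environ.get(env_var):
        sys.exit(f"Error: {env_var} is not set.")
    # ... hand off to ShinyTestGenerator, which no longer manages dotenv or logging
```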
* main:
  fix: Make outside users able to read tmp files (#2070)
  docs: update module server and ui to incorporate #705 (#2044)
  fix: errors on bookmark are now surfaced in the Python console (#2076)
  Add Connect Cloud as a hosting option in README (#2074)
  Update changelog
  Update changelog
  fix: include_css and include_js can use files in same dir (#2069)
`ERROR: OpenAI API requires at least version 1.104.1 of package openai (you have version 1.102.0 installed).` Be sure anthropic/inspect-ai are up to date
schloerke approved these changes on Sep 6, 2025
This pull request introduces an AI-powered test generator for Shiny applications, allowing users to automatically generate Playwright tests for their apps using Anthropic or OpenAI models. The implementation includes a new CLI workflow for test generation, supporting both interactive and non-interactive modes, and adds robust validation and error handling. Additionally, the PR introduces new developer workflows to ensure documentation and test prompt quality, and updates the project dependencies and documentation accordingly.
Changes include:

- Added a new `shiny add test` CLI command that generates comprehensive Playwright test files for Shiny apps using AI models from Anthropic or OpenAI, with options for provider and model selection, and improved validation and user guidance.
- Removed the old test file creation logic and replaced it with the new AI-based generator, including improved error handling and interactive prompts (`shiny/_main_add_test.py`).
- Added a GitHub Actions workflow to validate that Playwright test generation prompts and results are up to date, running test generation and evaluation, and commenting on PRs with results.
- Introduced a workflow to check that documentation for Playwright controller changes is kept in sync, automatically prompting contributors to update docs as needed.
- Added Makefile targets and scripts for generating and processing testing documentation using `repomix` and a custom Python script, ensuring that AI test generation has access to the latest controller API docs.
- Updated `pyproject.toml` with a new `testgen` dependency group for AI test generation, including required packages.