Conversation

@karangattu karangattu commented Jul 24, 2025

This pull request introduces an AI-powered test generator for Shiny applications, allowing users to automatically generate Playwright tests for their apps using Anthropic or OpenAI models. The implementation includes a new CLI workflow for test generation, supporting both interactive and non-interactive modes, and adds robust validation and error handling. Additionally, the PR introduces new developer workflows to ensure documentation and test prompt quality, and updates the project dependencies and documentation accordingly.

Changes include:

  • Added a new `shiny add test` CLI command that generates comprehensive Playwright test files for Shiny apps using AI models from Anthropic or OpenAI, with options for provider and model selection, and improved validation and user guidance.

  • Removed the old test file creation logic and replaced it with the new AI-based generator, including improved error handling and interactive prompts. (shiny/_main_add_test.py)

  • Added a GitHub Actions workflow to validate that Playwright test generation prompts and results are up to date, running test generation and evaluation, and commenting on PRs with results.

  • Introduced a workflow to check that documentation for Playwright controller changes is kept in sync, automatically prompting contributors to update docs as needed.

  • Added Makefile targets and scripts for generating and processing testing documentation using repomix and a custom Python script, ensuring that AI test generation has access to the latest controller API docs.

  • Updated pyproject.toml with a new testgen dependency group for AI test generation, including required packages.
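The provider and model options described above could be modeled with stdlib `argparse`. This is a minimal sketch only: the flag names, defaults, and helper name (`parse_testgen_args`) are hypothetical, and the real option handling lives in `shiny/_main_add_test.py`.

```python
import argparse

# Hypothetical sketch of provider/model selection for an AI test-generation
# command. Flag names and defaults are illustrative, not the actual CLI.
def parse_testgen_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="shiny add test")
    parser.add_argument("--app", required=True, help="Path to the Shiny app file")
    parser.add_argument(
        "--provider",
        choices=["anthropic", "openai"],  # unknown values are rejected by argparse
        default="anthropic",
        help="AI provider used to generate the Playwright test",
    )
    parser.add_argument(
        "--model",
        default=None,
        help="Model name; falls back to the provider's default when omitted",
    )
    return parser.parse_args(argv)

args = parse_testgen_args(["--app", "app.py", "--provider", "openai"])
print(args.provider)  # openai
```

Using `choices=` gives the "improved validation" behavior for free: an unsupported provider fails at parse time with a clear usage message.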

karangattu and others added 23 commits July 24, 2025 19:57
Introduces an AI-powered test generator (with CLI integration), evaluation suite with sample apps and scripts, and utility tools for documentation and quality control. Updates the CLI to support 'shiny generate test' for automated test creation using Anthropic or OpenAI models. Adds extensive documentation and example apps for robust evaluation and development workflows.
The workflow now checks for changes in documentation_testing.json and, if detected, creates a pull request using the peter-evans/create-pull-request action instead of pushing directly.
This update adds environment variable checks for ANTHROPIC_API_KEY and OPENAI_API_KEY when the respective provider is selected. If the required API key is not set, a clear error message is shown and the process exits, improving user guidance and preventing runtime errors.
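The API-key guard this commit describes might look like the following sketch (the function name and message wording are assumptions; only the environment variable names come from the commit message):

```python
import os
import sys

# Maps each supported provider to the environment variable it requires.
REQUIRED_KEYS = {"anthropic": "ANTHROPIC_API_KEY", "openai": "OPENAI_API_KEY"}

def check_api_key(provider: str) -> str:
    """Exit with a clear message if the provider's API key is not set."""
    env_var = REQUIRED_KEYS[provider]
    key = os.environ.get(env_var)
    if not key:
        # sys.exit with a string prints it to stderr and exits with status 1
        sys.exit(f"Error: {env_var} is not set; export it before running the generator.")
    return key
```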
This refactor improves maintainability and user experience when generating AI-powered test files.
Changed import of ShinyTestGenerator to use a relative path in create_test_metadata.py. Updated test_shiny_import.py to exclude '/testing/evaluation/apps/' from the tested paths.
Deleted the add_test command and its implementation, consolidating test file creation under the AI-powered test generation command. Updated CLI options and refactored parameter names for consistency. Also adjusted MAX_TOKENS in the test generator config.
Removed the auto-update workflow for testing documentation, added a new workflow to validate changes in the controller directory and prompt for documentation updates, and renamed the conventional commits workflow for clarity.
Renamed and relocated all files from shiny/testing/evaluation/ to tests/inspect-ai/ to better organize evaluation test assets and scripts under the tests directory.
Changed .gitignore, pyrightconfig.json, and test_shiny_import.py to reference 'tests/inspect-ai' instead of 'shiny/testing/evaluation'. This aligns configuration and test filtering with the new directory structure.
Moved all testing generator and utility modules from shiny/testing to shiny/pytest/generate for improved organization and clarity. Updated imports, workflow paths, and resource references accordingly. Removed obsolete shiny/testing/__init__.py and README.md.
Renamed and relocated utility scripts and related files from shiny/pytest/generate/utils to tests/inspect-ai/utils for improved organization. Updated workflow references to match new paths.
Introduces new Makefile targets to automate the process of updating testing documentation, including installing repomix, generating repomix output, processing documentation, and cleaning up temporary files. Also renames and updates the GitHub workflow to instruct contributors to use the new Makefile command for documentation updates.
Changed Makefile to check for repomix using 'command -v' instead of 'npm list -g'. Expanded and corrected the documentation_testing.json API definitions, adding new controller methods, fixing parameter type formatting, and improving descriptions for clarity and completeness.
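The shell idiom `command -v repomix` checks PATH rather than the npm global package list; `shutil.which` is the Python analogue. A sketch of the same guard (function name and install hint are illustrative):

```python
import shutil
import sys

def require_tool(name: str) -> str:
    """Return the tool's path, or exit if it is not on PATH."""
    path = shutil.which(name)  # None when the executable cannot be found
    if path is None:
        sys.exit(f"{name} is not installed; install it and re-run.")
    return path
```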
Added instructions to skip testing icon and plot functionality in SYSTEM_PROMPT_testing.md. This clarifies the scope of tests and avoids unnecessary coverage for icons and plots.
Added 'chatlas[anthropic]', 'chatlas[openai]', and 'inspect-ai' to the test dependencies to support additional testing capabilities.
Introduces a new GitHub Actions workflow to validate test generation prompts in the 'shiny/pytest/generate' directory. Also renames workflow files for consistency, updates .gitignore to exclude new result and metadata files, and improves path handling in test metadata and evaluation scripts for robustness.
Replaces pip install with pip upgrade and Makefile targets for installing dependencies in the validate-test-generation-prompts GitHub Actions workflow.
Introduces a new step to set up py-shiny in the validate-test-generation-prompts GitHub Actions workflow before installing dependencies.
Replaces custom py-shiny setup and Makefile commands with pip install for test dependencies. Refactors comment formatting in evaluation results for improved readability and consistency.
Set fetch-depth to 0 for full git history, install dev dependencies along with test dependencies, and fix formatting in error comment for evaluation results. These changes improve workflow reliability and ensure all required packages are available.
Replaces the use of actions/github-script with marocchino/sticky-pull-request-comment for posting AI evaluation results on pull requests. Adds a step to prepare the comment body and writes it to a file, improving error handling and ensuring comments are updated rather than duplicated.
Adds environment variables for Python version and attempt count, implements caching for Python dependencies and Playwright browsers, and improves Playwright installation steps. These changes reduce redundant installs and speed up workflow execution.
@karangattu karangattu marked this pull request as ready for review July 25, 2025 15:26
@karangattu karangattu requested a review from schloerke July 25, 2025 15:26
Moved inspect-ai from pyproject.toml test dependencies to explicit installation in the GitHub Actions workflow. This change ensures inspect-ai is installed only during CI runs and not as a default test dependency.
Standardized quotes to double quotes in the workflow YAML and updated the pip cache key to use only 'pyproject.toml'. Removed unnecessary blank lines for improved readability.
Documented new feature: `shiny add test` command now uses AI models from Anthropic or OpenAI to automatically generate Playwright tests for Shiny applications.
@posit-dev posit-dev deleted a comment from github-actions bot Jul 25, 2025
Renamed GitHub Actions workflow files to use 'verify' instead of 'validate' in their filenames for consistency and clarity.
@posit-dev posit-dev deleted a comment from github-actions bot Aug 22, 2025
Pytest results handling and reporting have been removed from the prepare_comment.py script as they are not working properly. The overall result now only reflects the Inspect AI quality gate status.
@posit-dev posit-dev deleted a comment from github-actions bot Aug 22, 2025
Eliminates reading and handling of pytest and combined summary results in prepare_comment.py, as these features are currently not working properly.

github-actions bot commented Aug 22, 2025

Test Generation Evaluation Results (Averaged across 3 attempts)

🔍 Inspect AI Test Quality Evaluation

  • Complete (C): 7.3
  • Partial (P): 1.7
  • Incomplete (I): 0.0
  • Passing Rate: 9.0/9.0 (100.0%)
  • Quality Gate: ✅ PASSED (≥80% required)

🎯 Overall Result

✅ PASSED - Quality gate based on Inspect AI results


Results are averaged across 3 evaluation attempts for improved reliability.
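The averaging above can be illustrated with a small sketch. The per-attempt values below are invented to reproduce the reported averages; they are not the real run data.

```python
# Illustrative per-attempt scores (not the actual evaluation output).
attempts = [
    {"C": 8, "P": 1, "I": 0},
    {"C": 7, "P": 2, "I": 0},
    {"C": 7, "P": 2, "I": 0},
]

def average(key: str) -> float:
    """Mean of one score across attempts, rounded to one decimal place."""
    return round(sum(a[key] for a in attempts) / len(attempts), 1)

print(average("C"), average("P"), average("I"))  # 7.3 1.7 0.0
```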

karangattu and others added 21 commits August 29, 2025 11:16
Move dotenv loading to shiny/_main_generate_test.py to ensure environment variables are loaded before API key validation, without requiring the generator to manage dotenv or logging. Remove dotenv and logging setup from ShinyTestGenerator in shiny/pytest/_generate/_main.py for cleaner separation of concerns.
* main:
  fix: Make outside users able to read tmp files (#2070)
  docs: update module server and ui to incorporate #705 (#2044)
  fix: errors on bookmark are now surfaced in the Python console (#2076)
  Add Connect Cloud as a hosting option in README (#2074)
  Update changelog
  Update changelog
  fix: include_css and include_js can use files in same dir (#2069)
`ERROR: OpenAI API requires at least version 1.104.1 of package openai (you have version 1.102.0 installed).`

Be sure anthropic/inspect-ai are up to date
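A minimum-version check like the one behind that error can be sketched with plain tuple comparison on dotted version strings. This is a simplification (the `openai` package uses proper version parsing, and this sketch does not handle pre-release suffixes):

```python
def version_tuple(v: str) -> tuple[int, ...]:
    """Convert '1.104.1' into (1, 104, 1) for lexicographic comparison."""
    return tuple(int(part) for part in v.split("."))

def meets_minimum(installed: str, required: str) -> bool:
    return version_tuple(installed) >= version_tuple(required)

print(meets_minimum("1.102.0", "1.104.1"))  # False: 102 < 104
print(meets_minimum("1.104.1", "1.104.1"))  # True
```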
@schloerke schloerke changed the title chore(test-generation): Integrate test generator feat(cli): Add AI support to shiny add test Sep 6, 2025
@schloerke schloerke merged commit 83d3952 into main Sep 6, 2025
130 of 131 checks passed
@schloerke schloerke deleted the integrate-test-generator branch September 6, 2025 01:33