KumarLabJax/NextFlow_NTGDataAnalysisApp_v0.2

Nitroglycerin Data Analysis App

A comprehensive Shiny application for analyzing behavioral data from nitroglycerin (NTG) studies, featuring advanced preprocessing, machine learning, and statistical analysis capabilities. This application is designed for researchers studying the effects of different NTG doses on mouse behavioral patterns.

📁 Project Structure

NitroglycerinDataAnalysisApp_v0.1/
├── 📊 data/                          # Data files
│   ├── NTG_final_curated_nextflow_dataset.csv  # Main NTG dataset
│   ├── RettApp_ControlData.csv       # Control data for testing
│   └── RettApp_ComprehensiveTestData.csv  # Generated test data
├── 📚 docs/                          # Documentation
│   ├── README.md                     # This file
│   ├── COMPLETE_TESTING_DOCUMENTATION.md  # Comprehensive testing guide
│   ├── Software Requirements Specification.md
│   └── To Do list.md
├── 🧪 tests/                         # Comprehensive testing framework
│   ├── data_generation/              # Test data creation scripts
│   │   ├── create_comprehensive_test_data_DOCUMENTED.R
│   │   ├── create_comprehensive_test_data.R
│   │   └── create_test_data.py
│   ├── unit/                         # Unit tests
│   │   ├── run_simple_tests_DOCUMENTED.R
│   │   ├── run_detailed_tests_DOCUMENTED.R
│   │   └── setup_continuous_testing_DOCUMENTED.R
│   ├── integration/                  # Integration tests
│   │   ├── test-server.R
│   │   ├── test-preprocessing.R
│   │   ├── test-lda-sanity.R
│   │   ├── test-server-edge-cases.R
│   │   └── test-integration.R
│   └── testthat.R                    # Test runner
├── 🗂️ Archive/                       # Archived/old files
│   ├── IntermediateCode/             # Previous versions
│   └── NextflowViewer_v0.3.R
├── 🎨 www/                           # Static assets
│   ├── app_icon.png
│   └── 3dmouse.png
├── 📱 server_tabs/                   # Modular server components
│   ├── tab_data_exploration_server.R
│   ├── tab_lda_server.R
│   ├── tab_pca_server.R
│   ├── tab_ml_server.R
│   ├── tab_correlation_server.R
│   └── tab_summary_server.R
├── 🌐 ui.R                           # User interface
├── ⚙️ server.R                       # Main server logic
├── 🔧 global.R                       # Global configuration
├── 📄 LICENSE                        # License file
├── 📊 VisualizeTestChamber.R         # Test chamber visualization
├── 📋 LL test location.pdf           # Test location documentation
└── 📋 LLUnit.pdf                     # Unit documentation

🚀 Quick Start

1. Prerequisites

  • R (version 4.0 or higher)
  • Required R packages (automatically installed by the app)

2. Running the App

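# Load Shiny, then launch the app
# (assumes the `ui` and `server` objects are already defined in the session)
library(shiny)
shinyApp(ui = ui, server = server)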
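
If `ui` and `server` are not yet defined in the session, a multi-file Shiny app can usually be launched by pointing `shiny::runApp()` at its directory. The following is a minimal sketch, assuming the working directory is the project root; `is_shiny_app_dir` is a hypothetical helper written for this example and is not part of the app.

```r
# Hypothetical helper (not part of the app): TRUE if a directory
# contains the ui.R and server.R files of a multi-file Shiny app.
is_shiny_app_dir <- function(app_dir) {
  all(file.exists(file.path(app_dir, c("ui.R", "server.R"))))
}

app_dir <- "."  # assumption: the project root is the working directory

# Launch only in an interactive session, and only if both files are present.
if (interactive() && is_shiny_app_dir(app_dir)) {
  shiny::runApp(app_dir)
}
```

The guard keeps the snippet safe to source from scripts or tests, where opening a browser session is not wanted.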

3. Data Analysis Workflow

The application follows a structured workflow across 5 tabs:

Tab 1: Data Exploration & Preprocessing

  1. Load Data: Upload your CSV file and select feature columns
  2. Select Subsets: Choose treatment doses (Tx) and timepoints to analyze
  3. Preprocessing Options:
    • Remove features with zero variance
    • Choose outlier detection method (None, IQR, Z-score)
    • Decide on outlier action (flag or remove)
    • Select data transformation (None, Log, Square Root, Box-Cox)
  4. Run Preprocessing: Click "Run Preprocessing" to apply changes
  5. Review Results: Check preprocessing log and visualization plots
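
The preprocessing options above can be sketched in base R. This is an illustration on toy data, not the app's own implementation: the feature names, the `flag_iqr` helper, and the 1.5 × IQR fence are assumptions made for the example.

```r
# Toy feature table (made-up values): one feature has an extreme value,
# one feature is constant.
features <- data.frame(
  distance = c(12, 15, 11, 14, 13, 16, 12, 15, 14, 90),
  rearing  = c(5, 7, 6, 8, 5, 9, 6, 7, 8, 6),
  constant = rep(3, 10)
)

# 1. Remove features with zero variance.
keep <- vapply(features, function(x) stats::var(x, na.rm = TRUE) > 0, logical(1))
features <- features[, keep, drop = FALSE]

# 2. Flag outliers with the IQR rule (hypothetical helper; k = 1.5 is assumed).
flag_iqr <- function(x, k = 1.5) {
  q <- unname(stats::quantile(x, c(0.25, 0.75), na.rm = TRUE))
  fence <- k * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}
outlier_flags <- as.data.frame(lapply(features, flag_iqr))

# 3. Apply a log transformation (log1p tolerates zeros).
transformed <- as.data.frame(lapply(features, log1p))
```

Flagging rather than removing keeps every row available for review; rows can be dropped afterwards with `features[!rowSums(outlier_flags), ]` if removal is preferred.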

Tab 2: Correlation Analysis

  • Explore feature correlations with interactive heatmaps
  • Create two-feature scatter plots with customizable faceting
  • Download correlation plots as PDF
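
As a rough illustration of what a clustered correlation heatmap is built from, the sketch below computes a correlation matrix and reorders it by hierarchical clustering. The simulated features and the choice of `1 - |r|` as the clustering distance are assumptions for the example, not details taken from the app.

```r
# Simulated features: `speed` and `distance` share a common signal,
# `grooming` is independent.
set.seed(42)
n <- 50
base <- rnorm(n)
features <- data.frame(
  speed    = base + rnorm(n, sd = 0.1),
  distance = base + rnorm(n, sd = 0.1),
  grooming = rnorm(n)
)

# Pairwise Pearson correlations between features.
cor_mat <- stats::cor(features, use = "pairwise.complete.obs")

# Order features so that strongly correlated ones sit next to each other:
# cluster on the distance 1 - |r| and reuse the dendrogram order.
ord <- stats::hclust(stats::as.dist(1 - abs(cor_mat)))$order
cor_ordered <- cor_mat[ord, ord]
```

`cor_ordered` can then be passed to any heatmap function; reordering rows and columns with the same index keeps the matrix symmetric.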

Tab 3: Data Summary

  • View animal count summaries by treatment and timepoint
  • Access data overview statistics

Tab 4: Principal Component Analysis (PCA)

  • Perform PCA on preprocessed features for each timepoint
  • Choose feature selection method (variance explained or number of components)
  • Generate scree plots, PC scatter plots, and loadings plots
  • Download PCA results as PDF
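
The "variance explained" selection can be illustrated with `stats::prcomp()`. This is a sketch on simulated data for a single timepoint; the 95% threshold is taken from the LDA description, and everything else (feature names, scaling choice) is assumed.

```r
# Simulated features for one timepoint: f1 and f2 share a latent signal.
set.seed(7)
n <- 60
latent <- rnorm(n)
features <- data.frame(
  f1 = latent + rnorm(n, sd = 0.3),
  f2 = latent + rnorm(n, sd = 0.3),
  f3 = rnorm(n),
  f4 = rnorm(n)
)

# PCA on centered, unit-variance features.
pca <- stats::prcomp(features, center = TRUE, scale. = TRUE)

# Proportion of variance per component (the values shown in a scree plot).
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Smallest number of components whose cumulative variance reaches 95%.
n_pcs <- which(cum_var >= 0.95)[1]
scores <- pca$x[, seq_len(n_pcs), drop = FALSE]

# Features ranked by absolute loading on PC1.
pc1_ranking <- names(sort(abs(pca$rotation[, 1]), decreasing = TRUE))
```

Selecting by a fixed number of components instead would replace `n_pcs` with a user-supplied integer no larger than `ncol(features)`.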

Tab 5: Linear Discriminant Analysis (LDA)

  • Use PCs from PCA tab to discriminate between treatment doses
  • Automatically uses the PCs that account for 95% of the variance
  • Generate 1D/2D scatter plots, univariate plots, and accuracy comparisons
  • Cross-validation with configurable folds
  • Download LDA results as PDF
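
A cross-validated LDA on principal component scores might look like the sketch below. It uses `MASS::lda()` on simulated stand-ins for PC scores with two dose groups; the fold assignment, the group separation, and the variable names are assumptions, and the app's own procedure may differ.

```r
# Simulated stand-ins for PC scores: the two dose groups differ on PC1.
set.seed(123)
n_per_group <- 30
dat <- data.frame(
  Tx  = factor(rep(c("0", "10"), each = n_per_group)),
  PC1 = c(rnorm(n_per_group, mean = 0), rnorm(n_per_group, mean = 2)),
  PC2 = rnorm(2 * n_per_group)
)

# Randomly assign each row to one of k folds.
k <- 5
folds <- sample(rep(seq_len(k), length.out = nrow(dat)))

# Fit on k - 1 folds, predict the held-out fold, record correctness.
correct <- logical(nrow(dat))
for (i in seq_len(k)) {
  held_out <- folds == i
  fit <- MASS::lda(Tx ~ PC1 + PC2, data = dat[!held_out, ])
  pred <- predict(fit, newdata = dat[held_out, ])$class
  correct[held_out] <- pred == dat$Tx[held_out]
}

cv_accuracy <- mean(correct)

# Chance level: always guessing the most frequent dose group.
chance <- max(table(dat$Tx)) / nrow(dat)
```

Comparing `cv_accuracy` with `chance` gives a quick sense of whether the discriminant carries any information about dose.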

4. Using Test Data

# Generate comprehensive test data
source("tests/data_generation/create_comprehensive_test_data_DOCUMENTED.R")

# Run the app and upload: data/RettApp_ComprehensiveTestData.csv
# Or use the main NTG dataset: data/NTG_final_curated_nextflow_dataset.csv

🧪 Testing

Quick Tests

# Run basic functionality tests
source("tests/unit/run_simple_tests_DOCUMENTED.R")

# Run comprehensive tests
source("tests/unit/run_detailed_tests_DOCUMENTED.R")

Continuous Testing

# Set up automatic testing (monitors files for changes)
source("tests/unit/setup_continuous_testing_DOCUMENTED.R")

📊 Features

Data Analysis Tabs (New Order)

  1. Data Exploration & Preprocessing

    • Comprehensive workflow instructions and guidance
    • Intelligent column name fuzzy matching for data compatibility
    • Zero-variance feature removal
    • Outlier detection and handling (IQR, Z-score methods)
    • Data transformation options (Log, Square Root, Box-Cox)
    • Interactive feature visualization with EDA plots
    • Preprocessing log and summary reporting
  2. Correlation Analysis

    • Interactive correlation heatmaps with automatic clustering
    • Two-feature scatter plots with customizable faceting
    • Color and shape customization by treatment or timepoint
    • PDF export capabilities for all plots
  3. Data Summary

    • Animal count tables by dose and timepoint
    • Data quality metrics and preprocessing summaries
    • Export capabilities for results and tables
  4. Principal Component Analysis (PCA)

    • Flexible feature selection (variance explained or number of components)
    • Scree plots with variance explained for each timepoint
    • 2D PCA scatter plots with dose-level coloring
    • Component loading analysis with top feature identification
    • PDF export for all PCA visualizations
  5. Linear Discriminant Analysis (LDA)

    • Automatic use of the PCs from PCA that account for 95% of the variance
    • 1D jittered plots when two dose groups are compared, 2D scatter plots for three or more
    • Univariate density and box plots by treatment
    • Cross-validation with configurable folds (3-10)
    • Comprehensive accuracy metrics with chance comparisons
    • Confidence intervals and model reliability assessment
    • PDF export for all LDA visualizations
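
One way to compare a cross-validated accuracy with chance and attach a confidence interval is an exact binomial test. The sketch below uses made-up counts and `stats::binom.test()`; the README does not say which method the app uses, so treat this as one possible approach.

```r
# Made-up cross-validation result: 41 of 60 animals classified correctly.
n_correct <- 41
n_total   <- 60
n_classes <- 2
chance    <- 1 / n_classes  # assumes equally sized dose groups

# One-sided exact test: is accuracy greater than chance?
above_chance <- stats::binom.test(n_correct, n_total, p = chance,
                                  alternative = "greater")

# Two-sided 95% Clopper-Pearson interval for the accuracy itself.
accuracy <- n_correct / n_total
ci <- stats::binom.test(n_correct, n_total)$conf.int
```

Treating per-animal predictions as independent Bernoulli trials is itself an approximation when folds share training data, so the interval is best read as a rough guide.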

🔧 Configuration

Global Settings (global.R)

  • NTG dose level definitions (0, 2.5, 5, 10 mg/kg)
  • Correlation cutoff thresholds (0.90)
  • Package management and conflict resolution
  • Robust data preparation functions
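
Settings like these are often kept as constants with small helpers built on them. The sketch below is illustrative only: `NTG_DOSE_LEVELS`, `COR_CUTOFF`, `as_dose_factor`, and `high_cor_pairs` are hypothetical names, and using the cutoff to list highly correlated feature pairs is an assumption about its purpose.

```r
# Hypothetical constants mirroring the documented settings.
NTG_DOSE_LEVELS <- c(0, 2.5, 5, 10)  # mg/kg
COR_CUTOFF      <- 0.90

# Convert a dose column to an ordered factor with a fixed level order.
as_dose_factor <- function(tx) {
  factor(tx, levels = NTG_DOSE_LEVELS, ordered = TRUE)
}

# List feature pairs whose absolute correlation meets the cutoff.
high_cor_pairs <- function(x, cutoff = COR_CUTOFF) {
  r <- stats::cor(x, use = "pairwise.complete.obs")
  # upper.tri() keeps each pair once and skips the diagonal.
  idx <- which(abs(r) >= cutoff & upper.tri(r), arr.ind = TRUE)
  data.frame(
    feature_1 = rownames(r)[idx[, "row"]],
    feature_2 = colnames(r)[idx[, "col"]],
    r         = r[idx],
    stringsAsFactors = FALSE
  )
}
```

For example, `high_cor_pairs(my_features)` returns a data frame with one row per pair at or above the cutoff, or zero rows if there is none.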

Data Requirements

  • MouseID: Mouse identifier column
  • Timepoint: Timepoint column (e.g., Day 1, Day 2, etc.)
  • Tx: Treatment column (NTG dose levels: 0, 2.5, 5, 10)
  • Numeric features: Behavioral measurements
  • Optional metadata: Sex, Pen_ID, Cohort, DOB, Date, etc.
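
A quick pre-upload check against these requirements could look like the following sketch. `check_ntg_data` is a hypothetical helper, and treating every remaining numeric column as a feature is a simplification: numeric metadata such as an ID column would need to be excluded by hand.

```r
required_cols <- c("MouseID", "Timepoint", "Tx")

# Hypothetical helper: stop if a required column is missing, otherwise
# return the names of numeric columns that are not required identifiers.
check_ntg_data <- function(dat, required = required_cols) {
  missing_cols <- setdiff(required, names(dat))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  is_feature <- vapply(dat, is.numeric, logical(1)) & !(names(dat) %in% required)
  names(dat)[is_feature]
}

# Toy example with one metadata column and one behavioral feature.
dat <- data.frame(
  MouseID   = c("M1", "M2"),
  Timepoint = c("Day 1", "Day 1"),
  Tx        = c(0, 10),
  Sex       = c("F", "M"),
  distance  = c(12.5, 9.8)
)
feature_cols <- check_ntg_data(dat)
```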

Supported Data Formats

  • CSV files with flexible column naming (fuzzy matching enabled)
  • Automatic detection of feature vs. metadata columns
  • Support for various timepoint and dose level formats
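
Fuzzy column matching can be approximated with edit distance. The sketch below uses `utils::adist()` after normalizing case and punctuation; `match_column` and its `max_dist` threshold are hypothetical, and the app's actual matching rules are not documented here.

```r
# Lowercase a name and strip everything except letters and digits.
normalize_name <- function(x) gsub("[^a-z0-9]", "", tolower(x))

# Hypothetical helper: return the candidate closest to `target` by edit
# distance, or NA if even the best candidate is too far away.
match_column <- function(target, candidates, max_dist = 1) {
  d <- utils::adist(normalize_name(target), normalize_name(candidates))[1, ]
  best <- which.min(d)
  if (d[best] <= max_dist) candidates[best] else NA_character_
}

# Column names as they might appear in an uploaded file.
cols <- c("mouse_id", "TimePoint", "tx", "Distance_cm")

mouse_col <- match_column("MouseID", cols)
time_col  <- match_column("Timepoint", cols)
tx_col    <- match_column("Tx", cols)
sex_col   <- match_column("Sex", cols)  # no close match in `cols`
```

A small `max_dist` matters for short names: with a threshold of 2, "Sex" would be matched to "tx".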

📚 Documentation

  • Complete Testing Guide: docs/COMPLETE_TESTING_DOCUMENTATION.md - Comprehensive testing framework documentation
  • Software Requirements: docs/Software Requirements Specification.md - Detailed technical specifications
  • To Do List: docs/To Do list.md - Project development roadmap
  • Prompt Documentation: docs/Promopt.md - Development guidelines and specifications

🐛 Troubleshooting

Common Issues

  1. Package Installation: The app automatically installs required packages
  2. Data Format: Ensure CSV files have correct column names (fuzzy matching helps with variations)
  3. File Paths: Use relative paths from project root
  4. Column Selection: Make sure to select appropriate feature column ranges
  5. Memory Issues: For large datasets, consider filtering by timepoint first

Getting Help

  1. Check the comprehensive testing documentation
  2. Run the test scripts to identify issues
  3. Review console output for error messages
  4. Use the built-in fuzzy matching for column name issues
  5. Check the troubleshooting section in the testing documentation

📄 License

See LICENSE file for details.

🀝 Contributing

  1. Follow the existing code structure
  2. Add tests for new features
  3. Update documentation
  4. Use the testing framework to validate changes

Note: This app is designed for behavioral data analysis in nitroglycerin (NTG) research studies. The application supports analysis of different NTG dose levels (0, 2.5, 5, 10 mg/kg) and their effects on mouse behavioral patterns. Ensure you have appropriate data and follow ethical guidelines for animal research.

About

An app to analyze the nitroglycerin data from Nextflow
