A full-stack application that transcribes, diarizes, and summarizes meeting audio using local and cloud-based AI models. Supports Whisper, Pegasus, and Ollama with an optional Colab notebook for heavy tasks.
meeting-summarizer/
├── backend/
│   ├── app.py                  # Flask server
│   ├── transcribe.py           # Whisper transcription
│   ├── summarize.py            # Pegasus summarization
│   ├── summarize_ollama.py     # Ollama integration
│   └── diarize.py              # Placeholder for diarization
├── frontend/
│   ├── src/
│   │   └── App.jsx             # React frontend
│   └── package.json
└── AI_Meeting_Summarizer_Colab.ipynb    # Optional Colab notebook
cd backend
pip install flask flask-cors transformers torch torchaudio openai-whisper
python app.py
Runs on: http://localhost:5000
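Once the server is up you can exercise it from any HTTP client. The snippet below is only a sketch: the `/upload` route and the `transcript`/`summary` response fields are assumptions for illustration, so check `app.py` for the actual routes and payload.

```python
# Minimal client sketch -- the /upload route and JSON fields are assumed,
# not taken from app.py; adjust to match the actual Flask routes.
import requests

with open("meeting.wav", "rb") as f:
    resp = requests.post(
        "http://localhost:5000/upload",                 # hypothetical endpoint
        files={"file": ("meeting.wav", f, "audio/wav")},
    )

resp.raise_for_status()
data = resp.json()
print(data.get("transcript", ""))                       # assumed response field
print(data.get("summary", ""))                          # assumed response field
```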
cd frontend
npm install
npm start
Opens at: http://localhost:3000
- 🎙️ Upload audio (WAV/MP3)
- 🔊 Transcription using Whisper
- 👥 (Planned) Diarization using pyannote
- ✂️ Summarization using (see the code sketches below):
  - Pegasus (Hugging Face)
  - Ollama (Mistral) via local API
- 🧠 Optional Colab for large models
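For reference, here is a minimal sketch of how the transcription and Pegasus summarization steps fit together, roughly the flow of `transcribe.py` and `summarize.py`. The model choices (`base` Whisper and `google/pegasus-xsum`) are assumptions and may differ from what the scripts actually load.

```python
# Sketch of the Whisper -> Pegasus flow; model names ("base",
# "google/pegasus-xsum") are assumptions, not necessarily what the
# backend scripts use.
import whisper
from transformers import pipeline

# 1. Transcribe the uploaded audio with Whisper.
model = whisper.load_model("base")
result = model.transcribe("meeting.wav")
transcript = result["text"]

# 2. Summarize the transcript with a Pegasus checkpoint.
summarizer = pipeline("summarization", model="google/pegasus-xsum")
summary = summarizer(transcript, max_length=128, min_length=32,
                     truncation=True)[0]["summary_text"]

print(summary)
```

Long meetings will exceed Pegasus's input window, so in practice the transcript would be chunked before summarization.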
Use this when your system can't handle Whisper Large or diarization locally.
- Open AI_Meeting_Summarizer_Colab.ipynb in Google Colab
- Upload your audio file
- Get:
  - transcript.txt
  - diarization.txt
  - summary.txt
Ensure Ollama is running locally and has a supported model pulled:
ollama run mistral
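As a rough illustration, the backend can talk to Ollama's local REST API (default port 11434). The prompt wording and helper below are a sketch, not the exact code in `summarize_ollama.py`.

```python
# Sketch of calling the local Ollama API; illustrative only and may not
# match summarize_ollama.py exactly.
import requests

def summarize_with_ollama(transcript: str, model: str = "mistral") -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": model,
            "prompt": f"Summarize this meeting transcript:\n\n{transcript}",
            "stream": False,   # return one JSON object instead of a stream
        },
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

print(summarize_with_ollama("...your transcript here..."))
```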
No token is needed for Pegasus, but diarization in Colab requires a Hugging Face token:
- Create a Hugging Face access token
- Paste it into the Colab variable HUGGINGFACE_TOKEN (used as in the sketch below)
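For context, this is roughly how the notebook would use the token with pyannote. The exact pipeline checkpoint and output format are assumptions and may vary between notebook versions.

```python
# Sketch of pyannote diarization as the Colab notebook might run it; the
# checkpoint name and output format are assumptions.
from pyannote.audio import Pipeline

HUGGINGFACE_TOKEN = "hf_..."   # paste your token here

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization",
    use_auth_token=HUGGINGFACE_TOKEN,
)

diarization = pipeline("meeting.wav")

# Write speaker turns in a simple "start end speaker" format.
with open("diarization.txt", "w") as f:
    for turn, _, speaker in diarization.itertracks(yield_label=True):
        f.write(f"{turn.start:.1f} {turn.end:.1f} {speaker}\n")
```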
- Whisper-based transcription
- Pegasus and Ollama summarization
- Diarization in Flask (via pyannote)
- Colab GPU support for large models
- Auto-upload via file.io (optional)
Built with:
- OpenAI Whisper
- Hugging Face Transformers
- Ollama
- Pyannote Audio
- React + Flask