A collection of exploratory-data-analysis, data-engineering and machine-learning notebooks that walk from raw SpaceX REST-API calls all the way to an interactive Plotly-Dash dashboard. The goal is to predict whether the Falcon 9 first stage lands successfully and to visualise key factors such as payload, orbit and launch-site location.

| File | Purpose |
|---|---|
| `edadataviz.ipynb` | End-to-end EDA and feature engineering: statistical plots (seaborn/matplotlib), one-hot encoding, scaling, and CSV export (`dataset_part_3.csv`). |
| `jupyter-labs-eda-sql-coursera_sqllite.ipynb` | Loads `Spacex.csv` into SQLite, then answers 10 analytical SQL questions (landing outcomes, payload sums, etc.) using `ipython-sql`. |
| `jupyter-labs-spacex-data-collection-api.ipynb` | Queries the SpaceX REST API, dereferences nested IDs (rockets, payloads, cores, launchpads), cleans nulls and writes `dataset_part_1.csv`. |
| `jupyter-labs-webscraping.ipynb` | Scrapes the Falcon 9/Heavy launch table from a pinned 9-Jun-2021 Wikipedia snapshot with BeautifulSoup; saves fallback `spacex_web_scraped.csv`. |
| `lab_jupyter_launch_site_location.ipynb` | Interactive Folium map: marks each launch site, clusters individual launches by success (green) / failure (red), and computes distances to coast, rail, highways, etc. |
| `labs-jupyter-spacex-Data wrangling.ipynb` | Adds a binary `Class` column (1 = successful landing) and exploratory counts by orbit, outcome and launch site; outputs `dataset_part_2.csv`. |
| `ML0101EN_SkillUp_FinalAssignment.ipynb` | Weather-dataset classification (unrelated to SpaceX but required by the course): compares Linear Regression, KNN, Decision Tree, Logistic Regression, SVM. |
| `SpaceX_Machine Learning Prediction_Part_5.ipynb` | Hyper-parameter tuning (GridSearchCV) for Logistic Regression, SVM, Decision Tree and KNN on the engineered SpaceX dataset; confusion matrices and accuracy table. |
| `spacex_dash_app.py` | Stand-alone Plotly-Dash web app: dropdown for launch site, payload slider, success pie chart and scatter correlation; run with `python spacex_dash_app.py`. |
| `README.md` | You are reading it. A roadmap for reproducing every notebook and running the dashboard. |
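
To give a feel for what the SQL notebook does, here is a minimal sketch that loads a few rows into an in-memory SQLite database and sums payload mass per launch site, using only the standard library. The table name `launches`, the column names and the sample rows are placeholders invented for illustration; the notebook itself works on `Spacex.csv` through `ipython-sql`, and its real schema may differ.

```python
import sqlite3

# Hypothetical placeholder rows: (launch_site, payload_mass_kg, landing_outcome).
# The real notebook loads Spacex.csv instead.
SAMPLE_ROWS = [
    ("SITE-A", 500.0, "Success"),
    ("SITE-A", 3000.0, "Failure"),
    ("SITE-B", 4500.0, "Success"),
]


def payload_sum_by_site(rows):
    """Return {launch_site: total_payload_kg}, computed by SQLite."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE launches ("
            "launch_site TEXT, payload_mass_kg REAL, landing_outcome TEXT)"
        )
        conn.executemany("INSERT INTO launches VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT launch_site, SUM(payload_mass_kg) "
            "FROM launches GROUP BY launch_site ORDER BY launch_site"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


totals = payload_sum_by_site(SAMPLE_ROWS)
print(totals)
```

Swapping `":memory:"` for a file path would persist the database between runs, which is closer to how a notebook session would reuse it.
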

1. Clone the repository and activate a virtual environment: `git clone https://github.com/your-user/ibm-data-science.git`, then `cd ibm-data-science`, then `python3 -m venv venv && source venv/bin/activate`.
2. Install the dependencies: `pip install -r requirements.txt`.
3. Launch Jupyter: `jupyter lab`.
4. Run the Dash app: `python spacex_dash_app.py`, then open http://127.0.0.1:8050 in your browser.

The notebooks run in this order, each stage feeding the next:

1. API Acquisition → `jupyter-labs-spacex-data-collection-api.ipynb` → `dataset_part_1.csv`
2. Data Wrangling & Label Engineering → `labs-jupyter-spacex-Data wrangling.ipynb` → `dataset_part_2.csv`
3. Feature Engineering & EDA → `edadataviz.ipynb` → `dataset_part_3.csv`
4. ML Modelling → `SpaceX_Machine Learning Prediction_Part_5.ipynb`
5. Dashboard → `spacex_dash_app.py`
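
The label-engineering stage can be sketched in a few lines. This example assumes landing outcomes are stored as strings of the form `"<True|False|None> <landing type>"` (for example `"True ASDS"`); that format, the function name `landing_class` and the sample values are assumptions for illustration, not taken from the notebook.

```python
def landing_class(outcome):
    """Return 1 if the outcome string records a successful landing, else 0."""
    # Assumption: the first token is "True" only when the booster landed.
    return 1 if outcome.strip().startswith("True") else 0


# Hypothetical sample outcomes; the real values come from dataset_part_1.csv.
outcomes = ["True ASDS", "False Ocean", "None None", "True RTLS"]
classes = [landing_class(o) for o in outcomes]
success_rate = sum(classes) / len(classes)

print(classes)
print(f"success rate: {success_rate:.2f}")
```

Treating anything other than an explicit `True` as a failure keeps the label binary, at the cost of lumping "no landing attempted" together with failed attempts.
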

Dependencies:

- pandas ≥1.3
- numpy ≥1.21
- matplotlib ≥3.5, seaborn
- scikit-learn ≥0.24
- requests, beautifulsoup4
- folium
- dash ≥2.0, plotly
- ipython-sql, sqlalchemy
- sqlite3 (part of the Python standard library; no separate install needed)

Install the third-party packages automatically through the provided `requirements.txt`.
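
If a notebook fails on import, a quick check of installed versions against the minimums listed above can help. The following is a rough sketch using `importlib.metadata` from the standard library (Python 3.8+); the helper names `parse_version` and `check_minimums` are made up for this example, and the version parsing is deliberately simplistic (it stops at the first non-numeric component).

```python
from importlib import metadata

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "pandas": (1, 3),
    "numpy": (1, 21),
    "matplotlib": (3, 5),
    "scikit-learn": (0, 24),
    "dash": (2, 0),
}


def parse_version(text):
    """Turn '1.21.0' into (1, 21, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_minimums(minimums):
    """Map each package name to 'ok', 'too old' or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed >= minimum else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_minimums(MINIMUMS).items():
        print(f"{name}: {status}")
```
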
MIT © 2023 arenkis