
NFL QB Touchdown Predictor

Database-Driven Machine Learning System

Project Overview

A database-driven machine learning project that predicts whether a quarterback (QB) will throw at least one touchdown (TD) pass in a given NFL game, using player statistics, game context, and historical performance data.

Project Objective: Predict whether an NFL quarterback will throw at least one touchdown pass in a game from past performance and game details. The binary classification model is trained on historical QB game logs, player career stats, and basic bio data stored in a SQLite database, and the project demonstrates feature engineering, model evaluation, and explainability techniques.

Key Features

🗄️ Database-Driven

All data is stored in SQLite under a relational schema, which keeps management and validation simple

⚡ Real-time Predictions

Makes predictions with confidence scores from the current player data in the database

✅ Data Validation

Comprehensive data quality checks and validation framework

🚀 Automated Workflow

One-command setup and deployment with modular architecture

📊 Historical Tracking

View prediction history and track accuracy, both stored in the database

🌐 Modern Web App

Multi-page Streamlit interface with real-time updates

System Architecture

Built with a modular, database-first approach using SQLite for data persistence and Streamlit for the user interface.

Project Structure

📁 data/
├── 📁 raw/ # Original CSV files
└── 📁 processed/ # Cleaned & engineered datasets
📁 src/
├── 🗄️ database.py # Database management
├── 📥 data_loader.py # Load CSV data into database
├── ✅ data_validator.py # Data quality validation
├── 🔄 preprocess.py # Database-driven preprocessing
├── 🎯 train_model.py # Model training
└── 📊 explain_shap.py # Model explainability
📁 app/
└── 🌐 app.py # Streamlit web application

Database Schema

The project uses a relational SQLite database with comprehensive data management:
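As a minimal sketch of what a relational layout for this kind of data could look like, the example below creates two linked tables with Python's built-in sqlite3 module and derives the binary target (at least one TD pass) with a join. The table names (players, game_logs), the column names, and the sample rows are assumptions made for illustration, not the project's actual schema or data.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the project's actual definitions.
SCHEMA = """
CREATE TABLE players (
    player_id   INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    birth_year  INTEGER
);
CREATE TABLE game_logs (
    game_log_id   INTEGER PRIMARY KEY,
    player_id     INTEGER NOT NULL REFERENCES players(player_id),
    game_date     TEXT NOT NULL,
    opponent      TEXT,
    pass_attempts INTEGER,
    pass_tds      INTEGER NOT NULL DEFAULT 0
);
"""


def build_demo_db():
    """Create an in-memory database with the sketch schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO players VALUES (1, 'Example QB', 1995)")
    conn.executemany(
        "INSERT INTO game_logs "
        "(player_id, game_date, opponent, pass_attempts, pass_tds) "
        "VALUES (?, ?, ?, ?, ?)",
        [(1, "2023-09-10", "AAA", 30, 2), (1, "2023-09-17", "BBB", 25, 0)],
    )
    conn.commit()
    return conn


def labeled_games(conn):
    """Join logs to players and derive the target: 1 if the QB threw a TD."""
    return conn.execute(
        """
        SELECT p.name,
               g.game_date,
               CASE WHEN g.pass_tds >= 1 THEN 1 ELSE 0 END AS threw_td
        FROM game_logs AS g
        JOIN players AS p ON p.player_id = g.player_id
        ORDER BY g.game_date
        """
    ).fetchall()


if __name__ == "__main__":
    for row in labeled_games(build_demo_db()):
        print(row)
```

Keeping the label as a query over raw touchdown counts, rather than a stored column, means the target definition lives in one place and cannot drift from the underlying game logs.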

Core Tables

Key Relationships

Model Performance

Accuracy: 88%
F1 Score: 85%
ROC-AUC: 91%
Features: 15+
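For readers less familiar with these metrics, the sketch below shows how accuracy, F1, and ROC-AUC are defined for a binary classifier, using only plain Python. The labels and scores are made-up toy values, not the project's data, and the project itself presumably computes these with a library such as Scikit-learn; the functions here are written out only to make the definitions explicit.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, y_score):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n^2) and is meant
    for illustration, not for large datasets.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data: 1 means the QB threw at least one TD in that game.
    y_true = [1, 1, 1, 0, 0, 1, 0, 1]
    y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
    y_pred = [1 if s >= 0.5 else 0 for s in y_score]
    print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), roc_auc(y_true, y_score))
```

Accuracy and F1 depend on the decision threshold (0.5 in this sketch), while ROC-AUC is computed from the raw scores and is threshold-independent, which is why the three numbers can differ noticeably on the same model.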

Data Sources

Web Application Features

The Streamlit app provides several pages:

🎯 Make Prediction

🗄️ Player Database

📊 Prediction History
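One way prediction history and accuracy tracking could be kept in SQLite is sketched below: each prediction is logged with its probability, the real outcome is filled in after the game, and accuracy is computed over the rows that have an outcome. The predictions table, its columns, and the helper function names are hypothetical and chosen for this example only.

```python
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) a database holding a hypothetical predictions table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS predictions (
            prediction_id  INTEGER PRIMARY KEY,
            player_name    TEXT NOT NULL,
            game_date      TEXT NOT NULL,
            td_probability REAL NOT NULL
                           CHECK (td_probability BETWEEN 0 AND 1),
            actual_td      INTEGER  -- stays NULL until the game is played
        )
        """
    )
    return conn


def log_prediction(conn, player_name, game_date, td_probability):
    """Store one prediction and return its row id."""
    cur = conn.execute(
        "INSERT INTO predictions (player_name, game_date, td_probability) "
        "VALUES (?, ?, ?)",
        (player_name, game_date, td_probability),
    )
    conn.commit()
    return cur.lastrowid


def record_outcome(conn, prediction_id, actual_td):
    """Fill in what actually happened (True/False) for a logged prediction."""
    conn.execute(
        "UPDATE predictions SET actual_td = ? WHERE prediction_id = ?",
        (int(actual_td), prediction_id),
    )
    conn.commit()


def running_accuracy(conn, threshold=0.5):
    """Accuracy over predictions with a known outcome, or None if there are none."""
    row = conn.execute(
        "SELECT AVG((td_probability >= ?) = actual_td) "
        "FROM predictions WHERE actual_td IS NOT NULL",
        (threshold,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = open_history()
    first = log_prediction(conn, "Example QB", "2023-09-10", 0.8)
    record_outcome(conn, first, True)
    print(running_accuracy(conn))
```

Leaving actual_td as NULL until the result is known lets pending predictions sit in the same table without distorting the accuracy figure.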

Technical Implementation

Python, SQLite, XGBoost, Streamlit, Pandas, NumPy, Scikit-learn, SHAP

Data Validation Framework
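A minimal sketch of the kind of record-level check such a framework might run on a QB game log before it is loaded. The field names (player_id, game_date, pass_attempts, completions, pass_tds) and the rules themselves are assumptions made for illustration, not the project's actual checks.

```python
from datetime import date

# Hypothetical required fields for one game-log record.
REQUIRED = ("player_id", "game_date", "pass_attempts", "completions", "pass_tds")
COUNT_FIELDS = ("pass_attempts", "completions", "pass_tds")


def validate_game_log(row):
    """Return a list of problems found in one record (empty list if clean)."""
    problems = [f"missing {field}" for field in REQUIRED if row.get(field) is None]
    if problems:
        # Later checks need every field, so stop at the first layer of errors.
        return problems

    try:
        date.fromisoformat(row["game_date"])
    except ValueError:
        problems.append("game_date is not a valid ISO date")

    for field in COUNT_FIELDS:
        if row[field] < 0:
            problems.append(f"{field} is negative")

    # Cross-field consistency: a QB cannot complete more passes than he
    # attempts, and every passing TD is also a completion.
    if row["completions"] > row["pass_attempts"]:
        problems.append("completions exceed pass_attempts")
    if row["pass_tds"] > row["completions"]:
        problems.append("pass_tds exceed completions")
    return problems


if __name__ == "__main__":
    record = {
        "player_id": 1,
        "game_date": "2023-09-10",
        "pass_attempts": 20,
        "completions": 30,
        "pass_tds": 2,
    }
    print(validate_game_log(record))
```

Returning a list of problems instead of raising on the first one makes it possible to report every issue in a batch of records in a single pass.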

Advanced Features

Automated Workflow

One-command setup with modular execution:

Database Management

What I'd Improve

Business Impact

This project demonstrates my ability to: