Heart Disease Modeling (ML3)
Overview
ML3 extends the earlier heart disease modeling work with KNN-based imputation, SVM classifiers (LinearSVC and SVC) tuned by grid search over C, gamma, and kernel, and LDA tuned by grid search over solver and shrinkage. It reuses the stratified splits from the prior preprocessing, reports accuracy, precision, recall, and F1, and works around an sklearn predict quirk by passing NumPy arrays instead of DataFrames. The KNN-imputed variant is tuned with RandomizedSearchCV to limit compute while still exploring the hyperparameter space.
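As a rough illustration of the imputation and splitting steps described above, the sketch below uses scikit-learn's `KNNImputer` with a stratified `train_test_split`. The synthetic data, the 10% missing rate, and `n_neighbors=5` are assumptions made for the example, not values taken from the project.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import KNNImputer
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the heart disease table (assumption: real data not shown).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Blank out roughly 10% of the values to mimic missing measurements.
rng = np.random.default_rng(0)
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.10] = np.nan

# Stratify on the label so both splits keep a similar class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X_missing, y, test_size=0.2, stratify=y, random_state=0
)

# Fit the imputer on the training split only, then apply it to both splits.
imputer = KNNImputer(n_neighbors=5)
X_train_imp = imputer.fit_transform(X_train)
X_test_imp = imputer.transform(X_test)
```

Fitting the imputer on the training split alone keeps test-set values from influencing the filled-in training data.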
Key Features
KNN-imputed preprocessing for heart disease data
SVM (LinearSVC/SVC) tuned over C/gamma/kernel
LDA tuned over solver and shrinkage
GridSearchCV and RandomizedSearchCV for efficient hyperparameter search
Stratified splits preserved from preprocessing
Workaround for sklearn predict input requirements (DataFrames converted to NumPy arrays)
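A grid search over C, gamma, and kernel for SVC might look like the sketch below. The grid values, the F1 scoring choice, the scaling step, and the synthetic data are all assumptions for illustration; the project's actual grid is not listed here.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Scale features before the SVM; make_pipeline names the steps
# "standardscaler" and "svc".
pipe = make_pipeline(StandardScaler(), SVC())

# gamma only matters for the RBF kernel, so it gets its own sub-grid.
param_grid = [
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10]},
    {"svc__kernel": ["rbf"], "svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.1]},
]

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, scoring="f1", cv=cv)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Splitting the grid by kernel avoids fitting redundant linear-kernel candidates that differ only in an unused gamma.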
Technical Highlights
Compared SVM and LDA with tuned hyperparameters on heart disease data
Used KNN-imputed preprocessing and randomized search to manage compute
Reported strong LDA performance (accuracy ~0.87, F1 ~0.86) alongside tuned SVC results
Handled sklearn predict quirk by converting DataFrames to numpy arrays
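The LDA tuning over solver and shrinkage can be sketched as follows. In scikit-learn, shrinkage is only supported by the `lsqr` and `eigen` solvers, so the `svd` solver gets its own entry in the grid. The shrinkage values and synthetic data are assumptions, and this sketch is not expected to reproduce the scores quoted above.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data with no redundant features, so the within-class
# covariance stays well-conditioned for the "eigen" solver (assumption).
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=4, n_redundant=0, random_state=0
)

# "svd" does not accept shrinkage, so it is listed separately.
param_grid = [
    {"solver": ["svd"]},
    {"solver": ["lsqr", "eigen"], "shrinkage": [None, "auto", 0.1, 0.5]},
]

search = GridSearchCV(LinearDiscriminantAnalysis(), param_grid, scoring="accuracy", cv=5)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```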
Challenges and Solutions
Search Space Size
Kept exhaustive grid search for the SVM and LDA grids and used randomized search for the KNN-imputed variant to limit compute
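One way to cap the cost of the search is to sample a fixed number of candidates with `RandomizedSearchCV`. The sketch below assumes the imputer sits inside the pipeline and is tuned together with the classifier; the step names, distributions, and `n_iter=10` are illustrative assumptions rather than the project's settings.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.impute import KNNImputer
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data with about 10% missing values (assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.10] = np.nan

pipe = Pipeline([
    ("impute", KNNImputer()),
    ("scale", StandardScaler()),
    ("svc", SVC()),
])

# Continuous ranges are sampled instead of enumerated, so the budget is
# fixed by n_iter rather than by the size of the grid.
param_distributions = {
    "impute__n_neighbors": [3, 5, 7],
    "svc__kernel": ["linear", "rbf"],
    "svc__C": loguniform(1e-2, 1e2),
    "svc__gamma": loguniform(1e-3, 1e0),
}

search = RandomizedSearchCV(
    pipe, param_distributions, n_iter=10, cv=5, scoring="f1", random_state=0
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

With `n_iter=10` and 5-fold cross-validation, the search fits 50 models regardless of how wide the sampled ranges are.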
API Quirks
Passed NumPy arrays rather than DataFrames to predict to stay compatible with certain sklearn versions
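The exact quirk and the affected sklearn versions are not specified here, so the following is only a minimal sketch of the general workaround: convert the DataFrame with `to_numpy()` and use the same input type for both `fit` and `predict`. The column names and data are hypothetical.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic stand-in data wrapped in a DataFrame (assumption).
X, y = make_classification(
    n_samples=100, n_features=5, n_informative=3, n_redundant=0, random_state=0
)
columns = [f"f{i}" for i in range(X.shape[1])]  # hypothetical column names
X_df = pd.DataFrame(X, columns=columns)

# Convert once and pass plain arrays to both fit and predict, so the
# estimator never sees a mix of DataFrame and array inputs.
X_arr = X_df.to_numpy()
model = LinearDiscriminantAnalysis().fit(X_arr, y)
preds = model.predict(X_arr)
```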
Model Coverage
Benchmarked multiple classifiers (SVM variants, LDA) to find best-performing configs
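A benchmark loop over the three classifier families, reporting the four metrics named in the overview, could be sketched as below. The hyperparameters shown are placeholders rather than the tuned configurations, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC

# Synthetic stand-in data and a stratified hold-out split (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Placeholder configurations, not the tuned ones from the project.
models = {
    "LinearSVC": make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=10000)),
    "SVC (rbf)": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale")),
    "LDA": LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto"),
}

results = {}
for name, model in models.items():
    y_pred = model.fit(X_train, y_train).predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```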
Technologies
ML
Data
Viz
Environment
Project Information
- Status: Completed
- Year: 2024
- Architecture: ML experimentation with SVM/LDA and tuned preprocessing
- Category: Data Science