🎤 Vocal Percussion Classifier
Role
Research @ Music Informatics Lab
Duration
Jan 2025 - May 2025
Technologies
Python, scikit-learn, librosa
This project is a comparative study of machine learning approaches for classifying vocal percussion sounds, the drum-like sounds produced in beatboxing and related vocal performance.
Project Goals
The primary objective was to develop and evaluate machine learning models that classify vocal percussion sounds. We aimed for a robust system that distinguishes between sound classes (such as kick drums, snares, hi-hats, and cymbals) with high accuracy.
Methodology & Results
We implemented and compared several machine learning approaches:
- Traditional ML Models: SVM and Random Forest classifiers using handcrafted audio features
- Deep Learning: CNN and LSTM architectures for end-to-end learning
- Hybrid Approach: Combining traditional features with neural networks
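To make the traditional ML comparison concrete, here is a minimal sketch of how an SVM and a Random Forest could be scored side by side with cross-validation in scikit-learn. The helper name `compare_classifiers`, the hyperparameters, and the synthetic feature matrix are all assumptions for illustration; they are not the project's actual data or settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_classifiers(X, y, folds=5, seed=0):
    """Return mean cross-validated accuracy for an SVM and a Random Forest.

    Hypothetical helper; hyperparameters are illustrative defaults.
    """
    models = {
        # SVMs are sensitive to feature scale, so standardize first.
        "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    return {
        name: float(cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean())
        for name, model in models.items()
    }


# Synthetic stand-in data: 4 classes (e.g. kick, snare, hi-hat, cymbal),
# 40 clips per class, 28 handcrafted features per clip.
# This is NOT the project's dataset.
rng = np.random.default_rng(0)
n_classes, per_class, n_features = 4, 40, 28
centers = rng.normal(scale=3.0, size=(n_classes, n_features))
X = np.vstack([c + rng.normal(size=(per_class, n_features)) for c in centers])
y = np.repeat(np.arange(n_classes), per_class)

scores = compare_classifiers(X, y)
for name, acc in scores.items():
    print(f"{name}: mean accuracy {acc:.3f}")
```

Scoring both models on the same folds keeps the comparison fair, and wrapping the scaler in a pipeline ensures it is fit only on each training fold.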
Key findings showed that:
- The CNN model achieved the highest accuracy (92%) for isolated sounds
- LSTM performed better for continuous sequences
- Traditional ML models were more computationally efficient than the deep learning models
Technical Implementation
- Feature Extraction: Used librosa for MFCCs, spectral features, and temporal characteristics
- Data Processing: Implemented custom augmentation techniques for limited training data
- Model Training: Utilized scikit-learn for traditional ML and TensorFlow for deep learning models