LiftOff’s Journey with Edge Impulse: From Speech Recognition to Detecting Glass Rattling.
Introduction
Exploring Edge Impulse to train and deploy AI models on small devices like the Raspberry Pi sparked our journey into embedded machine learning.
Three projects began as curiosities and grew into instructive builds: OKPi (speech recognition), Running Water Detection, and Glass Rattling Detection.
We navigated data collection, feature extraction, real-time inferencing, and the challenges of embedding ML models.
This blog shares our experiences, challenges, and solutions.
Project 1: OKPi — Our First Step into Embedded AI
Our first project was a speech recognition model called “OKPi”. The idea was simple: train an Edge Impulse model to recognize the phrase “OKPi” as a wake word. This was our first exposure to the Edge Impulse environment, and it helped us understand:
● Feature extraction from audio samples
● Choosing the right DSP algorithm (Mel-Frequency Cepstral Coefficients — MFCC)
● Training and optimizing an AI model for accuracy
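To make the MFCC step a little more concrete: as commonly defined, MFCCs are obtained by taking the log energies of a mel filterbank and applying a discrete cosine transform (DCT-II), then keeping the first few coefficients. The sketch below shows only that final step in plain Python. The function names and the default of 13 coefficients are our own illustrative choices, not Edge Impulse’s implementation, which also handles framing, windowing, and normalization for you.

```python
import math

def dct_ii(values):
    """Unnormalized DCT-II of a list of numbers."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n)
    ]

def mfcc_from_log_mel(log_mel_energies, num_coeffs=13):
    """Keep the first few DCT coefficients of the log mel energies.

    num_coeffs=13 is a common default, assumed here for illustration.
    """
    return dct_ii(log_mel_energies)[:num_coeffs]
```

The low-order coefficients summarize the overall spectral shape, which is part of why this representation is widely used for speech.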
Challenges Faced
1. Model Performance & Accuracy: The model performed well with our voices but struggled with different accents and pitch variations. To address this, we expanded our dataset with diverse speaker samples.
2. Deployment on Raspberry Pi & Noise Issues: Background noise interfered with recognition. The default audio drivers (ALSA) lacked noise suppression, making it difficult for the model to detect the wake word unless it was spoken close to the microphone.
Solution: We installed PulseAudio (later switched to PipeWire), which provided noise cancellation. Additionally, we recompiled PortAudio with PipeWire support to improve microphone interaction.
Final Outcome
With a refined dataset and noise suppression, the model accurately recognized “OKPi” even in noisy environments. This success inspired us to tackle more real-world audio classification challenges.
Project 2: Running Water Detection
Our next challenge was to build a model that could detect the sound of running water. The idea was to identify whether a faucet was turned on, which could be useful for applications like leak detection or home automation.
Here’s the demonstration video link
Data Collection & Model Training
We collected samples of running water from different taps, flow rates, and distances. The dataset consisted of:
● Positive Samples: Running water from various sources.
● Negative Samples: Ambient noise, speech, and other household sounds.
Using Edge Impulse’s sound classification tutorial, we streamlined training and experimented with Mel-filterbank Energy (MFE) feature extraction, which effectively captured water sound patterns.
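For readers curious what MFE features look like under the hood, here is a simplified sketch in plain Python: compute a power spectrum for one audio frame, weight it with triangular filters spaced evenly on the mel scale, and take the log of each filter’s energy. Everything here (function names, 10 filters, the naive DFT) is an assumption made for illustration. Edge Impulse’s MFE block has its own parameters and an optimized implementation.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (slow, but fine for a demo frame)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        spec.append(abs(acc) ** 2 / n)
    return spec

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, num_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):        # rising edge
            filt[k] = (k - left) / (centre - left)
        for k in range(centre, right + 1):   # peak and falling edge
            filt[k] = (right - k) / (right - centre) if right > centre else 1.0
        bank.append(filt)
    return bank

def mfe(frame, sample_rate, num_filters=10):
    """Log mel-filterbank energies for a single frame."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # The small constant avoids log(0) for filters that see no energy.
    return [math.log(sum(w * p for w, p in zip(filt, spec)) + 1e-10) for filt in bank]
```

Calling `mfe(frame, 8000)` on a 256-sample frame returns 10 values. Broadband sounds such as running water spread energy across many of these bands, while a pure tone lights up only one or two.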
Challenges Faced
1. False Positives from Similar Sounds
The model initially misclassified other sounds (e.g., a pressure cooker’s whistle) as running water. We improved accuracy by adding more negative samples of common household noises.
2. Real-Time Inference Optimization
Running the model on a Raspberry Pi required optimizing inference speed. Edge Impulse’s quantized models helped reduce processing time while maintaining accuracy.
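Edge Impulse takes care of quantization when you export the model, so we never had to write this ourselves. Still, it helped us to picture what is going on. The sketch below illustrates the general idea of int8 affine quantization: floating-point values are mapped to 8-bit integers through a scale and a zero point, trading a small rounding error for smaller, faster arithmetic. The function names are hypothetical and this is not Edge Impulse’s actual code.

```python
def quantize_params(values, qmin=-128, qmax=127):
    """Pick a scale and zero point so the value range (including 0.0) fits in int8."""
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 if all values are 0
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point

def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to integers, clamping to the int8 range."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]

def dequantize(quantized, scale, zero_point):
    """Map integers back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]
```

For example, quantizing a list of weights and then dequantizing it gives back values that differ from the originals by at most about one quantization step (`scale`).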
Final Outcome
After fine-tuning the dataset and model, we achieved high detection accuracy with minimal false positives. The model could reliably classify running water sounds, making it practical for real-world applications.
Project 3: Glass Rattling Detection
Our most complex project aimed to detect glass rattling caused by low-frequency sound vibrations. This could help monitor disturbances from loud music, construction noise, or heavy vehicles.
Here’s the demonstration video link
Data Collection & Model Design
Unlike previous projects, this required precise control over frequency and volume:
● Positive Samples: Synthetic sine waves (60–90 Hz) played at varying volumes:
○ Low: Just above background noise
○ Medium: Audible vibration
○ High: Severe rattling
● Negative Samples: Sine waves outside the target frequency range.
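A synthetic dataset like the one described above can be generated with nothing but the Python standard library. The following is a minimal sketch, not our exact tooling: the 16 kHz sample rate, the amplitude values for the three volume levels, the out-of-range frequencies (40 Hz and 120 Hz), and the file naming scheme are all assumptions chosen for illustration.

```python
import math
import os
import struct
import wave

SAMPLE_RATE = 16000  # Hz; assumed, not taken from the original setup

# Hypothetical amplitudes (fraction of full scale) for the three volume levels.
VOLUMES = {"low": 0.05, "medium": 0.3, "high": 0.9}

def sine_samples(freq_hz, amplitude, duration_s=1.0, sample_rate=SAMPLE_RATE):
    """Return 16-bit PCM samples of a sine tone; amplitude is in [0.0, 1.0]."""
    n = int(duration_s * sample_rate)
    peak = int(amplitude * 32767)
    return [int(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate)) for i in range(n)]

def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono 16-bit samples to a WAV file."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wf.setframerate(sample_rate)
        wf.writeframes(struct.pack("<%dh" % len(samples), *samples))

def label_for(freq_hz):
    """Tones in the 60-90 Hz target band are positives; everything else is negative."""
    return "rattle" if 60 <= freq_hz <= 90 else "no_rattle"

def build_dataset(out_dir):
    """Write one file per (frequency, volume) pair."""
    os.makedirs(out_dir, exist_ok=True)
    for freq in (40, 60, 75, 90, 120):  # 40 and 120 Hz are assumed negatives
        for name, amp in VOLUMES.items():
            filename = f"{label_for(freq)}.{freq}hz.{name}.wav"
            write_wav(os.path.join(out_dir, filename), sine_samples(freq, amp))
```

Calling `build_dataset("samples")` writes 15 one-second clips that can be played through a speaker near the glass and recorded, or uploaded as-is.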
Challenges Faced
1. Misclassification Due to Synthetic Data
Initially, using only pure sine waves led to overfitting, causing the model to misclassify similar frequencies as rattling.
Solution: We recorded real-world samples where actual glass rattling occurred, capturing the effects of room acoustics and resonance. This significantly improved model performance.
2. Handling Noise Interference on Raspberry Pi
Similar to OKPi, real-time deployment required noise cancellation adjustments. We rebuilt PortAudio with PipeWire support to enhance audio clarity.
Final Outcome
After refining the dataset, the model could accurately detect when glass was rattling due to bass frequencies while ignoring background noise.
Lessons Learned & Key Takeaways
1. Data Quality is Everything
The effectiveness of an ML model depends entirely on the quality of the dataset. Collecting real-world samples significantly improved performance across all projects.
2. Noise Handling is Crucial for Audio-Based Models
From wake-word detection to environmental sound analysis, background noise and microphone sensitivity were major challenges. PipeWire and PulseAudio played crucial roles in resolving these issues on Raspberry Pi.
3. Feature Selection is Key
Choosing the right Digital Signal Processing (DSP) algorithm — whether MFCC for speech recognition or MFE for frequency-based classification — was critical to improving model accuracy.
4. Optimizing for Embedded Systems
Running real-time ML inference on Raspberry Pi required optimizing the model size and processing speed. Quantized models helped maintain performance while reducing latency.
Conclusion
What started as a simple exploration of Edge Impulse turned into an in-depth journey into embedded machine learning and real-time signal processing. Each project built upon the lessons learned from the previous one, enhancing our understanding of data collection, model training, and deployment on Raspberry Pi.
From wake-word detection to real-time environmental sound analysis, these projects not only expanded our technical skills but also demonstrated the practical potential of AI in everyday applications.