Bayesian Optimization for signal processing pipeline in C++
An AutoML tool for signal processing
Goal of this post
This post shows that Bayesian optimization can tune not only the hyperparameters of an ML model but also the parameters of a conventional signal processing pipeline, in this case a peak detector for time-series data.
Methods
The optimization library used in this work is Limbo [1]. It is implemented in C++ and can serve as a lightweight framework for robots or IoT edge devices, where computational efficiency is a major concern.
For the peak detector, I used the peak detection algorithm from this Stack Overflow post. The detector has three parameters: lag, threshold, and influence. Tuning them with Bayesian optimization requires a training set.
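To make the three parameters concrete, here is a minimal sketch of a smoothed z-score detector, which is what the parameter names lag, threshold, and influence suggest. It is an assumption that the linked algorithm works this way. In this sketch, lag is the length of the moving window, threshold is the number of standard deviations a sample must deviate from the moving mean to be flagged, and influence is the weight a flagged sample gets in the moving statistics. The names `window_stats`, `zscore_signals`, and `count_peaks` are hypothetical and are not taken from the original code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Stats {
    double mean;
    double stddev;
};

// Mean and population standard deviation of v[first, last).
Stats window_stats(const std::vector<double>& v, std::size_t first, std::size_t last) {
    const double n = static_cast<double>(last - first);
    double sum = 0.0;
    for (std::size_t i = first; i < last; ++i) sum += v[i];
    const double mean = sum / n;
    double sq = 0.0;
    for (std::size_t i = first; i < last; ++i) sq += (v[i] - mean) * (v[i] - mean);
    return {mean, std::sqrt(sq / n)};
}

// Returns +1 (above the moving mean), -1 (below), or 0 (no signal) per sample.
std::vector<int> zscore_signals(const std::vector<double>& y, std::size_t lag,
                                double threshold, double influence) {
    assert(lag > 0);
    std::vector<int> signals(y.size(), 0);
    if (y.size() <= lag) return signals;  // not enough data to fill one window

    std::vector<double> filtered(y);      // entries from index lag on are overwritten below
    Stats s = window_stats(filtered, 0, lag);
    for (std::size_t i = lag; i < y.size(); ++i) {
        if (std::abs(y[i] - s.mean) > threshold * s.stddev) {
            signals[i] = (y[i] > s.mean) ? 1 : -1;
            // A flagged sample only partially enters the moving statistics.
            filtered[i] = influence * y[i] + (1.0 - influence) * filtered[i - 1];
        } else {
            filtered[i] = y[i];
        }
        s = window_stats(filtered, i + 1 - lag, i + 1);
    }
    return signals;
}

// Counts rising edges into the +1 state, so a wide peak is counted once.
int count_peaks(const std::vector<int>& signals) {
    int count = 0;
    for (std::size_t i = 1; i < signals.size(); ++i) {
        if (signals[i] == 1 && signals[i - 1] != 1) ++count;
    }
    return count;
}
```

Calling `count_peaks(zscore_signals(y, lag, threshold, influence))` gives a single number per parameter set, which is the kind of quantity an optimizer can compare against the known number of peaks in the training set.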
The test procedure is as follows:
1. Generate periodic time series for the training set and test set 1.
2. Tune the parameters of the peak detector on the training set.
3. Test the found parameters on test set 1.
4. Generate a new kind of time series, test set 2, and apply the previously tuned peak detector directly to check whether it still works.
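Steps 1 and 2 above can be sketched as a data generator plus an objective function for the optimizer. This is a sketch under stated assumptions, not the code from the repository. The noisy sine wave, the parameter ranges (lag 2 to 50, threshold 1 to 10, influence 0 to 1), and the names `DetectorParams`, `make_periodic_series`, `from_unit_cube`, `PeakCounter`, and `objective` are all hypothetical. The sketch also assumes that the optimizer proposes points in the unit cube and maximizes the returned value, which is my understanding of Limbo's convention, so the objective is the negative absolute error in the peak count.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// Hypothetical container for the three detector parameters.
struct DetectorParams {
    std::size_t lag;
    double threshold;
    double influence;
};

// Step 1: a noisy sine wave as a stand-in for the periodic training data.
std::vector<double> make_periodic_series(std::size_t periods, std::size_t samples_per_period,
                                         double amplitude, double noise_std, unsigned seed) {
    assert(samples_per_period > 0);
    const double two_pi = 2.0 * std::acos(-1.0);
    std::mt19937 rng(seed);
    // The distribution requires a positive stddev, so fall back to 1.0 when noise is off.
    std::normal_distribution<double> noise(0.0, noise_std > 0.0 ? noise_std : 1.0);
    std::vector<double> y(periods * samples_per_period);
    for (std::size_t i = 0; i < y.size(); ++i) {
        const double phase = two_pi * static_cast<double>(i % samples_per_period) /
                             static_cast<double>(samples_per_period);
        y[i] = amplitude * std::sin(phase);
        if (noise_std > 0.0) y[i] += noise(rng);
    }
    return y;
}

// Maps a point in [0,1]^3 to detector parameters. The ranges are assumptions.
DetectorParams from_unit_cube(const std::vector<double>& x) {
    assert(x.size() == 3);
    auto clamp01 = [](double v) { return std::min(1.0, std::max(0.0, v)); };
    DetectorParams p;
    p.lag = 2 + static_cast<std::size_t>(std::lround(clamp01(x[0]) * 48.0));  // 2..50
    p.threshold = 1.0 + clamp01(x[1]) * 9.0;                                  // 1..10
    p.influence = clamp01(x[2]);                                              // 0..1
    return p;
}

// Any detector that returns a peak count for a series and a parameter set.
using PeakCounter = std::function<int(const std::vector<double>&, const DetectorParams&)>;

// Step 2: value to maximize. It is 0 when the detected count matches the
// expected count and becomes more negative as the count error grows.
double objective(const std::vector<double>& x, const std::vector<double>& train,
                 int expected_peaks, const PeakCounter& counter) {
    const DetectorParams p = from_unit_cube(x);
    return -std::abs(static_cast<double>(counter(train, p) - expected_peaks));
}
```

Passing the detector in as a `PeakCounter` keeps the objective independent of the detector implementation, so the same function can be reused when the pipeline is retrained on new data.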
Results & Discussion
1. Because the training set and test set 1 share the same time-series properties, the peak detector with the parameters tuned on the training set finds the correct number of peaks in test set 1.
s1 = ../found_params.txt
MoveCounter created!
s2 = ../../../../test_set1.csv
Finish reading data!
# of move detect = 20
2. Because the properties of test set 2 differ from those of the training set, the previously tuned peak detector no longer finds the correct number of peaks. Retraining to find a new parameter set is therefore necessary.
s1 = ../found_params.txt
MoveCounter created!
s2 = ../../../../test_set2.csv
Finish reading data!
# of move detect = 51
MoveCounter detroyed!
Conclusion:
When a signal processing pipeline is deployed to an edge device, one of the challenges is handling data drift caused by inevitable changes in environmental conditions. With a lightweight optimization framework in C++, on-device retraining is a promising way to update the deployed pipeline without relying on a sophisticated server to push updates.
For more implementation details, please refer to my GitHub.
Reference:
[1] Cully, A., Chatzilygeroudis, K., Allocati, F., and Mouret, J.-B. (2018). Limbo: A Flexible High-performance Library for Gaussian Processes modeling and Data-Efficient Optimization. The Journal of Open Source Software. Library's GitHub.