Bayesian Optimization for signal processing pipeline in C++

An AutoML tool for signal processing

Huang Kevin
3 min read · May 7, 2021

Goal of this post

This post shows that Bayesian optimization can tune not only the hyperparameters of an ML model but also the parameters of a conventional signal processing pipeline, in this case peak detection on time-series data.

Methods

The library used in this work is limbo [1]. It is implemented in C++ and can serve as a lightweight framework for robots or IoT edge devices, where computational efficiency is a major concern.

For the peak detector, I used the peak detection algorithm from this Stack Overflow post. The detector has three parameters (lag, threshold, and influence), which are tuned on a training set via Bayesian optimization.
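The parameter names lag, threshold, and influence match the smoothed z-score thresholding algorithm that is widely shared on Stack Overflow. Assuming that is the algorithm referenced here, a minimal C++ sketch could look like the following. The function name `smoothed_zscore` and its helpers are illustrative and are not taken from my repository.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean of v[begin, end).
static double window_mean(const std::vector<double>& v, std::size_t begin, std::size_t end) {
    double sum = 0.0;
    for (std::size_t i = begin; i < end; ++i) sum += v[i];
    return sum / static_cast<double>(end - begin);
}

// Population standard deviation of v[begin, end).
static double window_std(const std::vector<double>& v, std::size_t begin, std::size_t end) {
    const double m = window_mean(v, begin, end);
    double sq = 0.0;
    for (std::size_t i = begin; i < end; ++i) sq += (v[i] - m) * (v[i] - m);
    return std::sqrt(sq / static_cast<double>(end - begin));
}

// Returns one signal per sample: +1 (above the band), -1 (below the band), 0 (inside).
//   lag       : length of the moving window used for the mean and standard deviation
//   threshold : number of standard deviations a sample must deviate to be flagged
//   influence : weight (0..1) that a flagged sample gets in the moving statistics
std::vector<int> smoothed_zscore(const std::vector<double>& y, std::size_t lag,
                                 double threshold, double influence) {
    assert(lag >= 1);
    std::vector<int> signals(y.size(), 0);
    if (y.size() <= lag) return signals;  // not enough data to build the first window

    std::vector<double> filtered(y);  // flagged samples are damped in this copy
    double avg = window_mean(filtered, 0, lag);
    double sd = window_std(filtered, 0, lag);

    for (std::size_t i = lag; i < y.size(); ++i) {
        if (std::abs(y[i] - avg) > threshold * sd) {
            signals[i] = (y[i] > avg) ? 1 : -1;
            // Limit how much an outlier shifts the moving statistics.
            filtered[i] = influence * y[i] + (1.0 - influence) * filtered[i - 1];
        } else {
            filtered[i] = y[i];
        }
        // Statistics over the most recent `lag` filtered samples, ending at i.
        avg = window_mean(filtered, i + 1 - lag, i + 1);
        sd = window_std(filtered, i + 1 - lag, i + 1);
    }
    return signals;
}
```

A call such as `smoothed_zscore(samples, 4, 3.0, 0.0)` returns a vector of the same length as `samples`. With influence set to 0, flagged samples do not move the running mean at all, whereas values near 1 let the band follow the peaks.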

The test procedure is as follows:

1. Generate periodic time series for the training set and test set1

training set for parameter tuning: contains 10 peaks
test set1: contains 20 peaks

2. Tune the parameters of the peak detector on the training set

3. Test the found parameters on test set1

4. Generate a new kind of time series, test set2, and apply the previously tuned peak detector directly to check whether it still works

test set2: contains 10 peaks
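The scoring part of this procedure can be sketched as below. Everything in it is an assumption made for illustration: the half-wave sine generator `make_periodic_series`, the rising-edge counter `count_rising_edges`, and the score (negative absolute error between detected and expected peak counts) are hypothetical and may differ from what the actual implementation uses. The detector is passed in as a callable, so any function that maps samples to a 0/1 signal can be plugged in.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical generator: n_peaks periods of a half-wave rectified sine,
// i.e. one positive bump per period on a flat zero baseline.
std::vector<double> make_periodic_series(std::size_t n_peaks, std::size_t samples_per_period) {
    const double pi = std::acos(-1.0);
    std::vector<double> y(n_peaks * samples_per_period);
    for (std::size_t i = 0; i < y.size(); ++i) {
        const double s = std::sin(2.0 * pi * static_cast<double>(i) /
                                  static_cast<double>(samples_per_period));
        y[i] = (s > 0.0) ? s : 0.0;
    }
    return y;
}

// Counts runs of +1 in a detector output, so one wide peak is counted once.
int count_rising_edges(const std::vector<int>& signals) {
    int count = 0;
    int prev = 0;
    for (int s : signals) {
        if (s == 1 && prev != 1) ++count;
        prev = s;
    }
    return count;
}

// Assumed objective: 0 when the detected count matches, more negative otherwise.
// A maximizing optimizer would evaluate this once per candidate parameter set.
template <typename Detector>
int score_detector(Detector detect, const std::vector<double>& series, int expected_peaks) {
    const int detected = count_rising_edges(detect(series));
    return -std::abs(detected - expected_peaks);
}
```

In a tuning loop, the optimizer would propose candidate (lag, threshold, influence) values, wrap the peak detector with them as the `detect` callable, and receive the score on the training set in return. A score of 0 means the detected count matches the known number of peaks.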

Results & Discussion

1. Because the training set and test set1 share the same time-series properties, the detector with the parameters tuned on the training set finds the correct number of peaks (20) in test set1.

s1 = ../found_params.txt
MoveCounter created!
s2 = ../../../../test_set1.csv
Finish reading data!
# of move detect = 20

2. Because the properties of test set2 differ, the previously tuned peak detector no longer finds the correct number of peaks: it reports 51 instead of 10. Retraining to find another parameter set is therefore necessary.

s1 = ../found_params.txt
MoveCounter created!
s2 = ../../../../test_set2.csv
Finish reading data!
# of move detect = 51
MoveCounter detroyed!

Conclusion:

When a signal processing pipeline is deployed to an edge device, one challenge is handling data drift caused by inevitable changes in environmental conditions. With a lightweight optimization framework in C++, on-device retraining is a promising way to update the deployed pipeline without relying on a sophisticated server to push pipeline updates.

For more implementation details, please refer to my GitHub.

Reference:

  1. Cully, A., Chatzilygeroudis, K., Allocati, F., and Mouret, J.-B. (2018). Limbo: A Flexible High-performance Library for Gaussian Processes modeling and Data-Efficient Optimization. The Journal of Open Source Software. Library's GitHub.
