Feature Profile

Rajiv Sambasivan edited this page Apr 25, 2024 · 2 revisions

This page illustrates the notion of a feature profile. The dataset used for the illustration comes from a sensor array (https://ieee-dataport.org/open-access/dataset-binary-classification-digital-sensor-signals). Cheap sensors and controllers have made IoT data pervasive, and some of you are probably Arduino hobbyists. Briefly, the data represents voltage readings from a capacitive sensor immersed in either oil or water. The goal is to build a classifier that identifies the immersion medium from the sensor voltages.

Applying the process described in Fukunaga’s book reveals that the data has good separation, so a hyperplane-type classifier (perceptron/SVM) should do the trick. However, in keeping with the process, we also evaluate a non-parametric method, a decision tree. It turns out that a simple feature engineering trick gives good results on this dataset. Using the statistical properties of the signal is a standard featurization idea for signal data. We use a very rudimentary set of properties: the minimum, maximum, mean, and standard deviation of the signal. With these features, we get good discrimination and can identify oil and water voltage signals very effectively. The minimum voltage recorded by the sensor array and the standard deviation of the signal are the most discriminative properties.

Inspection of the errors revealed that there might be label noise in this dataset. A small set of records that appear to be oil immersion are labeled as water immersion, and vice versa. This could be the result of a circuit glitch for some readings. Such readings can be identified with an outlier analysis of the oil and water immersion data, run separately for each class. The denoised data can then be used to develop the classifier.

For the feature profile implementation, please see:

this notebook
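
As a rough illustration of the featurization idea described above, here is a minimal sketch that reduces one voltage signal to its minimum, maximum, mean, and standard deviation. It is not the notebook's code: the function name `summarize_signal`, the sample readings, and the choice of population (rather than sample) standard deviation are all assumptions made for the example.

```python
from statistics import fmean, pstdev


def summarize_signal(signal):
    """Reduce one voltage signal (a sequence of floats) to four summary features."""
    return {
        "min": min(signal),
        "max": max(signal),
        "mean": fmean(signal),
        # Population standard deviation; the original work may use the
        # sample version instead (assumption).
        "std": pstdev(signal),
    }


# Hypothetical readings, made up for illustration; not taken from the dataset.
signals = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 2.0, 2.0, 2.0],
]

# One feature dictionary per signal.
features = [summarize_signal(s) for s in signals]
```

Each row of `features` could then be passed, together with its oil/water label, to whichever classifier is being evaluated, such as a decision tree.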

For the workflow report implementation, please see: this notebook
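
The page mentions an outlier analysis of the oil and water data for spotting suspect labels but does not say which method was used. The sketch below swaps in one common choice, the modified z-score based on the median and the median absolute deviation (MAD), applied to a single feature within one class. The function name `flag_outliers`, the threshold of 3.5, and the sample values are assumptions for illustration only.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (median/MAD based) exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No usable spread estimate; flag nothing rather than divide by zero.
        return [False] * len(values)
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Hypothetical minimum-voltage feature for records labeled "oil"
# (made-up numbers; the last record looks like it belongs to the other class).
oil_min_voltage = [0.50, 0.52, 0.49, 0.51, 0.50, 1.90]

flags = flag_outliers(oil_min_voltage)
suspect_indices = [i for i, is_outlier in enumerate(flags) if is_outlier]
```

Running the same check separately on the records labeled water, and dropping or reviewing the flagged records in both classes, would give one possible version of the denoised data.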