Automated Characterization of Consumer-Grade Sensor Accuracy from Supporting Data in Heterogeneous Air Quality Monitoring Networks

Academic Research Topics in Environmental Measurement and Monitoring
Oral Presentation

Prepared by D. Ramsay, J. Paradiso
MIT Media Lab, 75 Amherst St, Cambridge, MA, 02139, United States


Contact Information: dramsay@media.mit.edu; 703-347-1376


ABSTRACT

Affordable, consumer-grade air quality monitoring devices are becoming more common as mainstream awareness of and concern over pollution rise. Though cost-effective sensor technology is improving, collocation studies have demonstrated that consumer devices are poorly characterized and frequently inaccurate. The aim of this work is to test whether logistic regression can model the conditions under which affordable sensors are reliable. Accuracy estimates for each measurement are derived from additional onboard sensors combined with publicly available weather and geographic data.
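As an illustration of the general idea, the following minimal sketch fits a logistic regression that predicts whether a low-cost reading falls within a tolerance of a collocated reference reading. It is not the authors' implementation: the features (scaled humidity and temperature), the tolerance, the synthetic noise model, and all function names are assumptions chosen only to make the example self-contained.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    # w[0] is the bias; w[1:] are the feature weights.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def make_synthetic(n=400, tolerance=5.0, seed=0):
    """Hypothetical data: sensor noise is ASSUMED to grow with humidity."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        humidity = rng.uniform(0.0, 1.0)     # scaled to 0..1
        temperature = rng.uniform(0.0, 1.0)  # scaled to 0..1
        reference = rng.uniform(5.0, 40.0)   # reference-grade reading
        cheap = reference + rng.gauss(0.0, 1.0 + 12.0 * humidity)
        X.append([humidity, temperature])
        # Label 1 when the low-cost reading is within tolerance of the reference.
        y.append(1 if abs(cheap - reference) <= tolerance else 0)
    return X, y


X, y = make_synthetic()
w = fit_logistic(X, y)
p_dry = predict_proba(w, [0.1, 0.5])    # estimated reliability in dry conditions
p_humid = predict_proba(w, [0.9, 0.5])  # estimated reliability in humid conditions
```

In this toy setup the fitted model assigns a higher reliability probability to dry conditions than to humid ones, which simply reflects the noise model assumed when generating the data.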

Models were trained on data from a two-month collocation study comparing six affordable sensors with four regulatory-grade sensors in Roxbury, MA. AUC-ROC scores of 0.79-0.81 for several sensors demonstrate that this technique can predict sensor accuracy within a useful margin. The results further suggest that scores of 0.90-0.98 are possible for longer studies. Model hyper-parameters can be judiciously manipulated to quantify cross-seasonal prediction strength and sensor precision. Feature reduction may provide insight into sensor failure modes and device design.
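For readers unfamiliar with the metric, AUC-ROC can be read as the probability that a randomly chosen accurate measurement receives a higher predicted reliability than a randomly chosen inaccurate one. The sketch below computes it directly from that definition; the labels and scores are made-up values for illustration, not data from the study.

```python
def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = measurement was within tolerance, 0 = it was not.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]  # predicted reliability
auc = auc_roc(labels, scores)       # 5 of 6 pairs ranked correctly
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch but would normally be replaced by a rank-based computation on large datasets.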

The techniques in this paper make it possible to extract the subset of reliable measurements from sensors that are otherwise dominated by systematic inaccuracy. As the ability to quantify measurement-to-measurement uncertainty improves, researchers may be able to refine their models by probabilistically integrating data from affordable, consumer-grade sensor networks.
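One simple way such per-measurement reliability estimates might be used downstream is sketched below: either keep only readings above a probability cutoff, or weight every reading by its estimated reliability. The function names, the 0.8 cutoff, and the sample values are hypothetical, and the weighting scheme is only one of many possible forms of probabilistic integration.

```python
def reliable_subset(readings, probs, threshold=0.8):
    """Keep readings whose estimated reliability meets the cutoff."""
    return [r for r, p in zip(readings, probs) if p >= threshold]


def reliability_weighted_mean(readings, probs):
    """Average readings, weighting each by its estimated reliability."""
    total = sum(probs)
    if total == 0:
        raise ValueError("all reliability weights are zero")
    return sum(r * p for r, p in zip(readings, probs)) / total


# Hypothetical readings and model-estimated reliabilities.
readings = [12.0, 30.0, 14.0]
probs = [0.9, 0.1, 0.8]

kept = reliable_subset(readings, probs)
blended = reliability_weighted_mean(readings, probs)
```

Hard thresholding discards the low-confidence reading entirely, while the weighted mean retains it with reduced influence; which is preferable depends on how well calibrated the reliability estimates are.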