A Scalable Approach for Location-Specific Detection of Santa Ana Conditions
Nguyen, M., Uys, D., Crawl, D., Cowart, C., and Altintas, I., "A Scalable Approach for Location-Specific Detection of Santa Ana Conditions," in Proceedings of the 2016 IEEE International Conference on Big Data.
Santa Ana conditions are hot, dry, windy weather conditions that can greatly increase the danger of wildfires in southern California. We present a machine learning approach to detect Santa Ana conditions from weather station sensor measurements. Cluster analysis is performed on historical weather data to build models that identify Santa Ana patterns. A separate model is built from each weather station's data to capture the patterns specific to the microclimate of that region. Real-time sensor data from a weather station can then be processed to determine whether the region surrounding that station is experiencing Santa Ana conditions. The results can serve as a warning system that focuses firefighting efforts on regions with increased wildfire risk. By combining the Kepler workflow system with distributed computing on Spark, data from several weather stations can be processed in parallel with a scalable clustering algorithm, allowing our approach to scale to large datasets from multiple weather stations.
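
The per-station idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, the input features, or the rule for deciding which cluster corresponds to Santa Ana conditions, so the sketch assumes k-means with k = 2, three features (temperature, relative humidity, wind speed), and a simple "hottest, driest, windiest centroid" heuristic. The station names and the historical data are synthetic and hypothetical, and plain Python stands in for Kepler and Spark.

```python
import math
import random
import statistics

# Assumed feature order for every reading: (temp_c, rel_humidity_pct, wind_mps).

def standardize(row, means, stds):
    """Scale a reading to z-scores using one station's own statistics."""
    return tuple((x - m) / s for x, m, s in zip(row, means, stds))

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iters=25):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [
            tuple(statistics.fmean(col) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def fit_station_model(history, k=2):
    """Build one model from one station's historical readings."""
    cols = list(zip(*history))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    scaled = [standardize(r, means, stds) for r in history]
    centroids = kmeans(scaled, k)
    # Assumption: the Santa Ana cluster is the hottest, driest, windiest one.
    santa_ana = max(
        range(k),
        key=lambda i: centroids[i][0] - centroids[i][1] + centroids[i][2],
    )
    return {"means": means, "stds": stds, "centroids": centroids, "santa_ana": santa_ana}

def is_santa_ana(model, reading):
    """Classify a new reading with the model of the station that produced it."""
    z = standardize(reading, model["means"], model["stds"])
    return nearest(z, model["centroids"]) == model["santa_ana"]

def synthetic_history(seed, n_normal=60, n_santa_ana=40):
    """Made-up readings for illustration only; not real station data."""
    rng = random.Random(seed)
    normal = [(rng.gauss(18, 2), rng.gauss(60, 8), rng.gauss(3, 1)) for _ in range(n_normal)]
    events = [(rng.gauss(32, 2), rng.gauss(10, 3), rng.gauss(12, 2)) for _ in range(n_santa_ana)]
    return normal + events

# Each station is fitted independently, so this loop is the part that a
# distributed engine could run in parallel across stations.
models = {
    station: fit_station_model(synthetic_history(seed))
    for seed, station in enumerate(["station_a", "station_b"])
}

print(is_santa_ana(models["station_a"], (33.0, 8.0, 13.0)))  # hot, dry, windy reading
print(is_santa_ana(models["station_a"], (17.0, 65.0, 2.0)))  # mild, humid, calm reading
```

Because each model is standardized and fitted on a single station's history, the same raw reading can be classified differently at two stations, which is the location-specific behavior the abstract describes.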