A collection of open-source tools for the analysis of air pollution data.

An enormous amount of air pollution data is collected worldwide, and the amount continues to increase. In Europe, for example, there are thousands of measurement sites. The bulk of this information is analysed only in basic ways, often just to check compliance with air quality standards and guidelines.

This situation represents a considerable missed opportunity, because more insightful analysis of air pollution data can yield much richer information about the nature of air pollution, for example through source identification and attribution. Better information on the sources and nature of air pollution will in turn help in developing better policies for controlling it.

The analysis of air pollution data can, however, be difficult. Often there is a lack of knowledge about which techniques are available (there are many), and the tools required to carry out analyses can be spread across many different types of software or be expensive. In general, a consistent set of tools for air quality data analysis does not exist.

The openair project aims to address these issues by:

  • Providing a free, open-source set of tools available to everyone
  • Making available a range of existing techniques and developing new ones for the analysis of air pollution data
  • Using the statistical/data analysis software R as a platform – a powerful, open-source programming language ideal for insightful data analysis
  • Making it easy to carry out sophisticated analyses quickly, in an interactive and reproducible way
  • Encouraging the air quality community to use and help further develop these tools

The project is available as a package for the R project for statistical computing. It is led by the Environmental Research Group at King’s College London and supported by the University of Leeds.

Wind and Pollution Roses

Wind roses and pollution roses are the ‘bread and butter’ of air pollution analysis, but it is surprisingly difficult to find software that produces these plots properly. openair comes with an example dataset, ‘mydata’, which provides several years of air pollution data from a site in London.
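A minimal sketch of how these plots might be produced, assuming the openair package is installed and that the bundled ‘mydata’ dataset contains the wind speed (ws), wind direction (wd) and nox columns used here. The choice of nox as the pollutant is purely illustrative.

```r
# Load openair; the example dataset 'mydata' becomes available with it
library(openair)

# Wind rose: how often the wind blows from each direction,
# split by wind speed interval (uses the 'ws' and 'wd' columns)
wind_rose <- windRose(mydata)

# Pollution rose: the same layout, but split by intervals of
# pollutant concentration instead of wind speed
# (nox is chosen here only as an example pollutant)
pollution_rose <- pollutionRose(mydata, pollutant = "nox")
```

Both calls draw a plot and also return an object that can be kept for later inspection.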

Bivariate Polar Plots

The bivariate polar plot is a useful diagnostic tool for quickly gaining an idea of potential sources. It shows how pollutant concentrations vary with both wind speed and wind direction; wind speed is one of the most useful variables for separating source types.
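As a rough illustration, a bivariate polar plot could be drawn from the bundled ‘mydata’ dataset as follows. This is a sketch that assumes openair is installed; the pollutant (nox) is an arbitrary choice made for the example.

```r
library(openair)

# Bivariate polar plot: smoothed pollutant concentration shown in
# polar coordinates, with wind direction as the angle and wind speed
# as the distance from the centre
# (nox is an illustrative choice; any numeric pollutant column works)
polar_nox <- polarPlot(mydata, pollutant = "nox")
```

High concentrations that appear only at high wind speeds from one direction tend to point to a different kind of source than high concentrations close to the centre of the plot.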

Trend Analysis

Trend analysis is a key component of many analyses of air pollution data, and openair offers several ways of undertaking it. There are two main functions, smoothTrend and MannKendall, which serve different purposes; the latter is useful for quantifying trends. Both functions are highly flexible, and the manual contains many examples of their use.
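The sketch below shows one way the two kinds of trend analysis might be run on the bundled ‘mydata’ dataset. It rests on an assumption about the installed version: recent openair releases provide the quantitative trend estimate through a function named TheilSen, and MannKendall may not be available there, so the function name used for the second call should be checked against the manual for your version. The pollutant (o3) and the deseason setting are illustrative.

```r
library(openair)

# Smooth trend: monthly mean concentrations with a fitted smooth
# line, useful for seeing the overall shape of a trend
smooth_o3 <- smoothTrend(mydata, pollutant = "o3")

# Quantified trend: a slope estimate with uncertainty intervals.
# ASSUMPTION: 'TheilSen' is the name used in recent releases; older
# releases may offer this as 'MannKendall' instead.
# deseason = TRUE removes the seasonal cycle before the slope is
# estimated
slope_o3 <- TheilSen(mydata, pollutant = "o3", deseason = TRUE)
```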

Model Evaluation

Frequently, the evaluation of models is limited to a few numeric statistics, such as mean bias or the correlation coefficient. However, almost all the functions in openair that are used for analysing air pollution measurement data can also be applied directly to model output data. Applying these functions and comparing modelled and measured values can greatly improve the evaluation of models. Many of the functions can help reveal why a model performs as it does, rather than only providing a measure of agreement.
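To make the idea concrete, the sketch below builds a hypothetical set of model predictions and evaluates it against measurements. Nothing here is real model output: the ‘mod’ column is invented for the example by scaling the measured nox values from ‘mydata’ and adding random noise, and the column names ‘obs’ and ‘mod’ are arbitrary.

```r
library(openair)

set.seed(1)

# Measured values: nox from the bundled dataset, renamed to 'obs'
eval_data <- mydata[, c("date", "nox")]
names(eval_data)[2] <- "obs"

# HYPOTHETICAL model output: measurements scaled by 10% plus noise.
# In practice this column would come from an actual model.
eval_data$mod <- eval_data$obs * 1.1 + rnorm(nrow(eval_data), sd = 20)

# Numeric summary statistics (mean bias, correlation and others)
stats <- modStats(eval_data, mod = "mod", obs = "obs")
print(stats)

# The same function used for measurements, applied to both columns:
# compares how modelled and measured values vary by hour of day,
# day of week and month
tv <- timeVariation(eval_data, pollutant = c("obs", "mod"))
```

The summary statistics say how well the two series agree overall, while the temporal comparison can indicate when the disagreement occurs, for example at particular hours of the day.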

Importing Data

openair has two functions for importing data from UK air pollution monitoring sites: importAURN and importKCL.
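A short sketch of an import, assuming an internet connection and that the remote data service is reachable. Only importAURN is shown; the site code "my1" and the year are illustrative choices, and the available codes and years should be taken from the package documentation.

```r
library(openair)

# Download one year of hourly data for a single AURN site.
# 'site' takes the site code and 'year' the year(s) required;
# "my1" and 2009 are used here purely as an example.
aurn_data <- importAURN(site = "my1", year = 2009)

# Inspect the first few rows of what was returned
head(aurn_data)
```

The imported data can then be passed to the plotting and analysis functions in the same way as the bundled example dataset.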

While openair was developed initially for the air quality community, it is useful for a wide range of users:

  • Researchers across the atmospheric sciences, where the techniques are widely applicable
  • Consultancies and industry for whom air pollution is important
  • Academia, where many of the techniques are useful for research purposes
  • Regulators concerned with controlling air pollution
  • Modellers, since openair contains many tools for model evaluation

Indeed, there is now a large and growing number of international users from a wide range of backgrounds, including industry, consultancies and academia. An increasing number of reports and journal articles use openair for analysis and interpretation.