In 2019, we embarked on a bold project: releasing a complete structured dataset from a government institution into Wikimedia projects, in order to maximize its dissemination and reuse. Environment and Climate Change Canada (Government of Canada) agreed to fund our project for two years, until March 31, 2021. We are now sharing the result of several months of work.
100 Years of Weather Data?
Yes, you read that right. The oldest weather records in the country date back to 1840, over 180 years ago. In 2021, Environment and Climate Change Canada’s (ECCC) Meteorological Service celebrates its 150th anniversary, making it one of the oldest institutions in Canada. A true meteorological “heritage”! The completeness of the records from the 8,756 weather stations scattered from coast to coast makes this scientific information interesting and valuable in the context of open data and public awareness of climate change. The data is already available on the ECCC website, but we wanted to highlight this treasure trove of raw data.
The first phase (2019-2020) consisted of a massive import of weather data from ECCC into Commons. We are talking about dozens of recorded parameters (temperature, wind speed, precipitation, etc.). In total, ECCC makes four weather datasets available: the almanac (i.e. records), monthly data, daily data and hourly data. This is a huge amount of data, and an import on this scale is unprecedented in the Commons project, which usually hosts images, videos and audio recordings. In order not to alienate the community, it was decided in this first phase to import only the almanac and monthly data. We are talking about more than 26 million data points in total! As far as we know, this massive import of government data is a world first.
The second phase (2020-2021) consists of finding ideas to enhance this data and disseminate it through Wikimedia projects such as Wikipedia, which remains by far the best known and most consulted project of the Wikimedia family. Wikimedia Canada has therefore collaborated with two institutions and scientific networks that care about our mission: Acfas and IVADO. We organized a first presentation on January 27, 2021. It was the first time we talked about the project outside our small team, and it was a real success. At our invitation, more than 60 people from different organizations (private, governmental, paragovernmental, non-profit and others) joined us. A few weeks later, we gathered 12 people from different backgrounds for a brainstorming session to benefit from their collective intelligence.
We want to address three questions:
- How can we use this data and disseminate it through Wikimedia projects to make it meaningful and accessible to as many people as possible?
- How can we influence the rest of the world to import similar datasets and make them available in Wikimedia projects?
- How can these data help us address some of the questions about climate change?
The data import required the creation of four tools (an upload tool, a merge tool, a discrimination tool and finally, a conversion tool). It was necessary to switch from a data model in XML format to the more restrictive JSON format, which is used globally by Wikimedia projects.
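To give an idea of what such a conversion involves, here is a minimal sketch in Python. It is not our actual conversion tool: the XML layout, the element names and the sample values are all made up for illustration, and the output only imitates the general shape of the tabular JSON pages on Commons (a schema listing the fields, followed by rows of data). Fields such as the license and sources are left out.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML layout with made-up values; the real ECCC schema
# is not described in this post.
SAMPLE_XML = """
<station id="EXAMPLE">
  <month year="1900" number="1">
    <mean_temp>-10.5</mean_temp>
    <total_precip>42.0</total_precip>
  </month>
  <month year="1900" number="2">
    <mean_temp>-8.2</mean_temp>
    <total_precip></total_precip>
  </month>
</station>
"""

# Column names (assumed for this sketch), in the order the rows are written.
FIELDS = ["year", "month", "mean_temp", "total_precip"]


def to_number(text):
    """Return a float, or None when the value is missing or empty."""
    if text is None or not text.strip():
        return None
    return float(text)


def xml_to_tabular(xml_text):
    """Flatten the nested XML into a schema plus a list of rows."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for month in root.iter("month"):
        rows.append([
            int(month.get("year")),
            int(month.get("number")),
            to_number(month.findtext("mean_temp")),
            to_number(month.findtext("total_precip")),
        ])
    return {
        "schema": {"fields": [{"name": name, "type": "number"} for name in FIELDS]},
        "data": rows,
    }


table = xml_to_tabular(SAMPLE_XML)
print(json.dumps(table, indent=2))
```

The point of the sketch is the restriction mentioned above: a nested XML document has to be flattened into fixed columns, and missing measurements have to be represented explicitly (here as null values) so that every row has the same length.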
The brainstorming session gave us four main directions for disseminating the data: 1. data visualizations, 2. descriptive use of weather data to “augment” information about events, places or people, 3. prescriptive use of weather data to help decision-making in different sectors and other countries, 4. creation of Meta-Wiki pages (pages used to coordinate projects) to invite Wikimedians around the world to replicate the project.
Final report and other useful links
We presented the results in a scientific presentation at the 55th Congress of the Canadian Meteorological and Oceanographic Society on June 1, 2021. Reference: 100 years of Weather Observations from the Meteorological Service of Canada in Wikimedia Commons (Wikipedia), Ha-Loan Phan (Wikimedia Canada), Pierre Choffet (Wikimedia Canada), Miguel Tremblay (Environment and Climate Change Canada).
See the Meta page of the project.
During the summer, we will introduce you to our two interns, Ali Sabzi (Polytechnique Montréal) and Laurence Taschereau (Université du Québec à Montréal), who will continue our work with ECCC data as part of IVADO’s “Data Storytelling” internship, supervised by Wikimedia Canada and Acfas.