The Master Decoder is a tool for decoding and normalising climate data from USHCN and Canadian sources. Presently, it can read in USHCN Daily, USHCNv2 Monthly, and Canadian Daily Climate Data (CDCD). It takes the various units and data types used by these repositories, automatically converts them to the English unit system, and hands them to users' programs in an easy-to-understand format.
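As a sketch of the kind of normalisation involved, the functions below convert raw archive values to English units. The field scalings (tenths of a degree Celsius, tenths of a millimetre) and the missing-value sentinel are illustrative assumptions about the archives, not the Master Decoder's actual conversion tables.

```c
#include <math.h>

/* Illustrative only: scalings and the sentinel below are assumptions,
   not the Master Decoder's actual tables. */
#define MISSING -9999  /* common missing-value sentinel in climate archives */

/* Convert a raw temperature stored in tenths of a degree Celsius
   to degrees Fahrenheit; returns NAN for the missing-value sentinel. */
double tenths_c_to_f(int raw)
{
    if (raw == MISSING)
        return NAN;
    return (raw / 10.0) * 9.0 / 5.0 + 32.0;
}

/* Convert precipitation stored in tenths of a millimetre to inches. */
double tenths_mm_to_in(int raw)
{
    if (raw == MISSING)
        return NAN;
    return (raw / 10.0) / 25.4;
}
```

Propagating a sentinel as NAN rather than a magic number lets downstream arithmetic flag, rather than silently absorb, missing observations.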
This section outlines the steps used in processing the climate data and briefly describes the code and processes involved. You may expect all code mentioned below to be made available as the relevant publications come out.
(1) Daily data were collected by the United States Historical Climatology Network (USHCN) program and processed into smoothed, monthly data. This step was therefore performed for us. However, direct use of daily data would require us to perform such a step ourselves, so it is relevant here.
(2) The monthly data were extracted from the USHCN files in which they were packaged.
(3) A subset of stations, regions, and/or months was selected and 30-year averages, stabilities, growth rates, and other relevant properties were calculated.
(4) Mathematical climatic surfaces were fit to these results.
(5) A set of geographically relevant points on the intersections of the climate surfaces was chosen, mapped to the surfaces, then tracked over time.
(6) Properties of the movement of the points (velocity of a point's entire track, velocity of a portion of its track, indicators of goodness-of-fit) were calculated.
(7) The surfaces and/or tracked points were overlaid on a map of the geopolitical terrain they traversed.
Each of the above steps had distinct requirements, so a separate tool was developed for each, in the programming language best suited to the task. Details on the programs, languages, and alternatives for each step are documented here, both as a record and as an aid for those who would adapt these methods to other regions of the world.
(1) Generation of monthly data. The USHCN daily data, which we initially hoped would provide valuable insights into discrete, extreme weather events, proved difficult to work with for reasons stated earlier in this report. Programs and code for this step were developed by other parties in conjunction with the USHCN. Information on this process is available on the USHCNv2 Monthly Data website.
(2) Unpacking of monthly data. Throughout the project, the code for this process was written in C or C++. C compiles to machine code that, when appropriately written, is extremely fast and efficient, and it allows fine-grained management of computer memory. Both of these properties were important, given that the USHCN daily data occupy 1.6 GB and that intermediate processing steps require an additional 2 to 3 GB of main memory (RAM). While we ultimately did not employ the daily data, we do not rule out their use in future projects, whether our own or other researchers'. C is also a widely used and well-understood language. Therefore, when we began work with the monthly data, that code was again developed in C as a module within the existing code base. Code to unpack Canada's daily climate data was developed as well. This code base is accessible from other programs via simple function calls and represents a unified module for accessing the climate data of the majority of North America's land mass. The source code is available on the website.
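Unpacking this kind of archive amounts to parsing fixed-width text records. The sketch below shows the general technique in C; the field layout (an 11-character station ID, a 4-character year, twelve 6-character monthly values) is a made-up illustration, and the real USHCNv2 record format, documented with the data, differs in its details.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative fixed-width layout; NOT the actual USHCNv2 format. */
#define NMONTHS 12

typedef struct {
    char station[12];      /* 11-char station ID plus terminator */
    int  year;
    int  value[NMONTHS];   /* raw monthly values as stored in the file */
} MonthlyRecord;

/* Copy a fixed-width field into a buffer and parse it as an integer.
   strtol skips leading blanks, which fixed-width files use as padding. */
static int field_to_int(const char *line, int start, int width)
{
    char buf[16];
    memcpy(buf, line + start, (size_t)width);
    buf[width] = '\0';
    return (int)strtol(buf, NULL, 10);
}

/* Parse one record line; returns 0 on success, -1 if it is too short. */
int parse_monthly_line(const char *line, MonthlyRecord *rec)
{
    if (strlen(line) < 11 + 4 + 6 * NMONTHS)
        return -1;
    memcpy(rec->station, line, 11);
    rec->station[11] = '\0';
    rec->year = field_to_int(line, 11, 4);
    for (int m = 0; m < NMONTHS; m++)
        rec->value[m] = field_to_int(line, 15 + 6 * m, 6);
    return 0;
}
```

Because records are parsed one line at a time into a fixed-size struct, memory use stays flat regardless of file size, which matters at the multi-gigabyte scale described above.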
(3) Calculation of averages. The code developed for this step is small and highly specific to the processing needs of this project, and is therefore less likely to be reused. Nonetheless, it is also available on the website.
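The core of such an averaging step is a mean over a window (e.g. 30 years) that tolerates missing observations. The sketch below assumes a sentinel value and a minimum-coverage rule; both are illustrative, not the project's actual criteria.

```c
#include <math.h>
#include <stddef.h>

/* Assumed missing-value sentinel; illustrative only. */
#define MISSING -9999.0

/* Mean of n values, ignoring MISSING entries.  Returns NAN when fewer
   than min_valid usable values remain (e.g. one might require 25 of
   30 years before accepting a climatological average). */
double masked_mean(const double *v, size_t n, size_t min_valid)
{
    double sum = 0.0;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] != MISSING) {
            sum += v[i];
            count++;
        }
    }
    return (count >= min_valid && count > 0) ? sum / count : NAN;
}
```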
(4), (5) Fitting of surfaces/tracking of points. Commonalities in these two steps allowed the same language to be used for both, and many functions could be shared. The prototype code was developed using William Waite's Stage2 general-purpose macro processor to produce Mathematica analysis scripts, the results of which were again passed through Stage2 to produce output suitable for mapping. This required some manual intervention and human judgment. The production code for surface fitting and intersections was written in Matlab, which offered convenient mapping capabilities and was easy to program. The Stage2 pre- and post-processing steps were folded into step (3) (calculation of averages) and into the Matlab scripts, and the remaining manual steps were automated.
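To convey the idea behind the surface fitting, here is a deliberately simplified stand-in: a least-squares fit of a planar surface z = a + b*x + c*y to scattered station values, solved via the 3x3 normal equations. The project's actual Matlab fits are more elaborate; this only illustrates the class of computation.

```c
#include <math.h>
#include <string.h>
#include <stddef.h>

/* Determinant of a 3x3 matrix, expanded along the first row. */
static double det3(double m[3][3])
{
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

/* Least-squares fit of z = a + b*x + c*y to n points, via Cramer's
   rule on the normal equations.  Returns 0 on success, -1 if the
   system is (near-)singular.  Simplified stand-in, not the project's
   actual surface-fitting code. */
int fit_plane(const double *x, const double *y, const double *z,
              size_t n, double *a, double *b, double *c)
{
    double sx = 0, sy = 0, sz = 0, sxx = 0, syy = 0, sxy = 0,
           sxz = 0, syz = 0;
    for (size_t i = 0; i < n; i++) {
        sx += x[i];  sy += y[i];  sz += z[i];
        sxx += x[i] * x[i];  syy += y[i] * y[i];  sxy += x[i] * y[i];
        sxz += x[i] * z[i];  syz += y[i] * z[i];
    }
    double A[3][3] = { { (double)n, sx,  sy  },
                       { sx,        sxx, sxy },
                       { sy,        sxy, syy } };
    double r[3] = { sz, sxz, syz };
    double d = det3(A);
    if (fabs(d) < 1e-12)
        return -1;
    double out[3];
    for (int k = 0; k < 3; k++) {
        double Ak[3][3];
        memcpy(Ak, A, sizeof A);
        for (int i = 0; i < 3; i++)
            Ak[i][k] = r[i];           /* replace column k with RHS */
        out[k] = det3(Ak) / d;
    }
    *a = out[0]; *b = out[1]; *c = out[2];
    return 0;
}
```

Once a surface of this kind is fit for each epoch, intersections and tracked points reduce to evaluating and comparing the fitted coefficients over time, which is why steps (4) and (5) could share so much code.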