Master Decoder

The Master Decoder is a tool for decoding and normalising climate data from USHCN and Canadian sources. At present, it can read USHCN Daily, USHCNv2 Monthly, and Canadian Daily Climate Data (CDCD). It takes the different units and data types used by these repositories, automatically converts them to the English system, and hands them to users' programs in an easy-to-understand format.
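
To illustrate the kind of normalisation described above, here is a minimal JavaScript sketch that converts a tagged reading to English units. The record shape, the unit tags ("degC", "mm", and so on), and the function name toEnglish are assumptions made for this example; they are not the Master Decoder's actual interface, which is written in C.

```javascript
// Convert one reading to English units (degrees Fahrenheit or inches).
// The { value, unit } record shape and the unit tags are hypothetical.
function toEnglish(reading) {
  const { value, unit } = reading;
  const missing = value === null; // missing observations stay missing
  switch (unit) {
    case "degC":
      return { value: missing ? null : (value * 9) / 5 + 32, unit: "degF" };
    case "degF":
      return { value, unit: "degF" };
    case "mm":
      return { value: missing ? null : value / 25.4, unit: "in" };
    case "cm":
      return { value: missing ? null : value / 2.54, unit: "in" };
    case "in":
      return { value, unit: "in" };
    default:
      throw new Error(`Unknown unit: ${unit}`);
  }
}
```

A caller would pass each decoded observation through toEnglish before analysis, so that downstream code only ever sees one unit per quantity.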


Program Outline

This section outlines the steps used in processing the climate data and briefly describes the code and processes involved. You may expect all code mentioned below to be made available as the relevant publications come out.

(1) Daily data were collected by the United States Historical Climatology Network (USHCN) program and processed into smoothed monthly data. This step was therefore performed for us. Direct use of the daily data would, however, require us to perform it ourselves, so it is included here.

(2) The monthly data were extracted from the USHCN files in which they were packaged.

(3) A subset of stations, regions, and/or months was selected and 30-year averages, stabilities, growth rates, and other relevant properties were calculated.

(4) Mathematical climatic surfaces were fit to these results.

(5) A set of geographically relevant points on the intersection of the climate surfaces was chosen, mapped to the surfaces, and then tracked over time.

(6) Properties of the movement of the points (velocity of a point's entire track, velocity of a portion of its track, indicators of goodness-of-fit) were calculated.

(7) The surfaces and/or tracked points were overlaid on a map of the geopolitical terrain they traversed.

Each of the above steps was distinct in its requirements, and different tools were therefore developed for each, using the programming language most suitable for that step. Details on the programs, languages, and alternatives for each step are documented here, as a record and as an aid for those who would adapt these methods to other regions of the world.

(1) Generation of monthly data. USHCN daily data, which we initially hoped would provide valuable insights into discrete, extreme weather events, proved difficult to work with for reasons stated earlier in this report. Programs and code for this step were developed by other parties in conjunction with the USHCN. Information on this process is available on the USHCNv2 Monthly Data website.

(2) Unpacking of monthly data. Throughout the project, the code for this process was written in the programming language C or C++. C compiles to machine code which, when appropriately written, is extremely fast and efficient. It also allows excellent management of computer memory. Both of these properties were important, given that the USHCN daily data consumes 1.6 GB and that intermediate processing steps require an additional 2-3 GB of main memory (RAM). While we ultimately did not employ the daily data, we do not rule out its use in future projects that we or other researchers conduct. C is a widely used and well-understood language. Therefore, when we began work with the monthly data, its code was again developed in C as a module within the existing code base. Code to unpack Canada's daily climate data was also developed. This code base is accessible from other programs via simple function calls and represents a unified module for accessing the climate data of the majority of North America's land mass. The source code is available on the website.

(3) Calculation of averages. The code developed for this step is small and very specific to the processing needs of this project, and therefore has a lower probability of reuse. Nonetheless, it is also available on the website.
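
As a rough sketch of the averaging in step 3, the JavaScript below computes a mean for each calendar month over a 30-year window. The record shape ({ year, month, value }), the function name windowMeans, and the choice to skip missing values rather than fill them are assumptions for illustration; the project's own code for this step may differ, and the stabilities and growth rates mentioned above are not covered.

```javascript
// Mean per calendar month over a window of `years` years starting at startYear.
// records: [{ year, month (1-12), value (number or null) }]  (hypothetical shape)
// Returns an array of 12 means; a month with no usable data yields null.
function windowMeans(records, startYear, years = 30) {
  const sums = new Array(12).fill(0);
  const counts = new Array(12).fill(0);
  for (const { year, month, value } of records) {
    const inWindow = year >= startYear && year < startYear + years;
    if (!inWindow || value === null) continue; // skip out-of-window and missing values
    sums[month - 1] += value;
    counts[month - 1] += 1;
  }
  return sums.map((sum, i) => (counts[i] > 0 ? sum / counts[i] : null));
}
```

Calling windowMeans(records, 1971) would give the twelve monthly means for 1971 through 2000.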

(4), (5) Fitting of Surfaces/Tracking of Points. Commonalities between these two steps allowed the same language to be used for both, and many functions could be shared. The prototype code was developed using William Waite's Stage2 general-purpose macro processor to produce Mathematica analysis scripts, the results of which were again passed through Stage2 to produce output suitable for mapping. This required some manual intervention and human judgment. The actual code for surface fitting and intersections was written in Matlab, which had easy mapping capabilities and was easy to program. The Stage2 pre- and post-processing steps were folded into Step 3 (calculation of averages) and into the Matlab scripts, and the manual steps were automated.
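
This report does not state the mathematical form of the climate surfaces, and the project's fitting code was written in Matlab. Purely to illustrate what fitting a surface to station values involves, the JavaScript sketch below assumes the simplest possible case: a plane, value = a + b*lon + c*lat, fit by ordinary least squares. The function names and the station record shape are hypothetical.

```javascript
// Determinant of a 3x3 matrix given as an array of rows.
function det3(m) {
  return (
    m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1]) -
    m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0]) +
    m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
  );
}

// Least-squares fit of value = a + b*lon + c*lat.
// stations: [{ lon, lat, value }]  (hypothetical shape)
function fitPlane(stations) {
  // Accumulate the sums that make up the normal equations.
  let n = 0, sx = 0, sy = 0, sxx = 0, sxy = 0, syy = 0, sv = 0, sxv = 0, syv = 0;
  for (const { lon, lat, value } of stations) {
    n += 1;
    sx += lon; sy += lat;
    sxx += lon * lon; sxy += lon * lat; syy += lat * lat;
    sv += value; sxv += lon * value; syv += lat * value;
  }
  const M = [
    [n, sx, sy],
    [sx, sxx, sxy],
    [sy, sxy, syy],
  ];
  const rhs = [sv, sxv, syv];
  const d = det3(M);
  if (Math.abs(d) < 1e-12) {
    throw new Error("Need at least three non-collinear stations");
  }
  // Cramer's rule: replace column k of M with rhs and take the determinant ratio.
  const withColumn = (k) => M.map((row, i) => row.map((v, j) => (j === k ? rhs[i] : v)));
  return {
    a: det3(withColumn(0)) / d,
    b: det3(withColumn(1)) / d,
    c: det3(withColumn(2)) / d,
  };
}
```

Under this assumption, two fitted planes for different climate variables intersect along a line, and points on that line could be recomputed for each period and followed over time. Higher-order surfaces would need a more general solver.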

(6) Calculation of track properties. The track was represented by a series of GPS locations, one per year. Great circle distances and the angle between each sequential pair of points in the track were calculated in Javascript (see below). From these, the overall velocity and North/East velocities of any subset of the track were calculated.
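
The calculations in step 6 can be sketched in JavaScript as follows. This is an illustration under stated assumptions, not the project's code: it uses the haversine formula for great-circle distance, the initial bearing as the angle between sequential points, a mean Earth radius of 6371 km, and it defines velocity as distance travelled per year. The function names and the { lat, lon } point shape are hypothetical.

```javascript
const EARTH_RADIUS_KM = 6371; // assumed mean Earth radius
const toRad = (deg) => (deg * Math.PI) / 180;
const toDeg = (rad) => (rad * 180) / Math.PI;

// Haversine great-circle distance in km between two { lat, lon } points (degrees).
function greatCircleKm(a, b) {
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1, Math.sqrt(h)));
}

// Initial bearing from a to b, in degrees clockwise from north, in [0, 360).
function initialBearingDeg(a, b) {
  const phi1 = toRad(a.lat);
  const phi2 = toRad(b.lat);
  const dLon = toRad(b.lon - a.lon);
  const y = Math.sin(dLon) * Math.cos(phi2);
  const x =
    Math.cos(phi1) * Math.sin(phi2) -
    Math.sin(phi1) * Math.cos(phi2) * Math.cos(dLon);
  return (toDeg(Math.atan2(y, x)) + 360) % 360;
}

// track: array of { lat, lon }, one point per year.
// Returns speed and north/east components in km per year.
function trackVelocity(track) {
  if (track.length < 2) throw new Error("Track needs at least two points");
  let total = 0, north = 0, east = 0;
  for (let i = 1; i < track.length; i++) {
    const d = greatCircleKm(track[i - 1], track[i]);
    const theta = toRad(initialBearingDeg(track[i - 1], track[i]));
    total += d;
    north += d * Math.cos(theta); // northward part of this yearly step
    east += d * Math.sin(theta); // eastward part of this yearly step
  }
  const years = track.length - 1;
  return { speed: total / years, north: north / years, east: east / years };
}
```

The velocity of a portion of the track follows by passing a slice, for example trackVelocity(track.slice(10, 21)). Splitting each step into north and east parts by its initial bearing is an approximation that is reasonable when yearly steps are short.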

(7) Map overlay. Finally, we instituted procedures to make the map overlay step as intuitive, useful, accessible, and powerful as possible, and to make it available over the web. For those wishing to adapt our work to other applications, the technical details are as follows. A client-side AJAX/Javascript/OpenLayers web application was developed to run with a Unix/Apache/PHP/BASH server stack. On the client side, the web interface uses the open-source OpenLayers framework to display geopolitical maps of the areas of interest. We supplied the USHCN station information and the locations of all the stations. Specific instances of these may be selected and climate surfaces fit (see above) to the selected stations, using BASH scripts to run steps 2-4. The researcher may then select any point on the map, thereby initiating another AJAX request to launch a BASH script which runs step 5. The server returns the GPS points of the track. Step 6 is then performed on the client side using Javascript and the OpenLayers framework. Finally, the tracks were displayed using the OpenLayers framework and made interactive through Javascript controls on the page. The resulting graphical interface allowed researchers full access to the analysis products of this project while being simple enough for anyone to use.