Package Overview

This tutorial provides a basic understanding of how the ILAMB Python package works and is organized. It is aimed at a developer who wants to implement their own benchmark in the system and needs to understand how to go about doing so. We start with a few simple examples that demonstrate functionality, and layer in complexity in subsequent tutorials.

The ILAMB Python package consists of four main objects: Variable, ModelResult, Confrontation, and Scoreboard. We will discuss the first three in this tutorial.

The Variable Object

The Variable object is the basic building block of the ILAMB package. It keeps track of dimensions as the netCDF variables do, but also provides data-aware analysis routines which operate on the data in an intelligent manner. For example, consider the following variable we can create from the data used in a previous tutorial:

from ILAMB.Variable import Variable
import os
v = Variable(filename = os.environ["ILAMB_ROOT"] + "/MODELS/CLM40cn/rsus/rsus_Amon_CLM40cn_historical_r1i1p1_185001-201012.nc",
             variable_name = "rsus")

The first two lines import the functionality we need. The first imports the Variable object from the ILAMB package, and the second imports a standard Python package which allows us to interact with the operating system. We need this package to gain access to the ILAMB_ROOT environment variable explained in the First Steps tutorial. Then we create a Variable object by specifying the filename as well as the name of the variable which we want to extract from inside it. We can then print this variable:

print(v)

which will display the following information to the screen:

Variable: rsus
--------------
              unit: W m-2
        isTemporal: True (1932)
         isSpatial: True (192,288)
        nDatasites: N/A
         dataShape: (1932, 192, 288)
           dataMax: 4.028994e+02
           dataMin: 0.000000e+00
          dataMean: 6.153053e+01

The Variable object understands the dimensionality of the data as well as its unit and then provides analysis routines which operate intelligently depending on the type of data present. So for example, we can find the mean value over the time period of the data by:

print(v.integrateInTime(mean=True))

which will display:

Variable: rsus_integrated_over_time_and_divided_by_time_period
--------------------------------------------------------------
              unit: W m-2
        isTemporal: False
         isSpatial: True (192,288)
        nDatasites: N/A
         dataShape: (192, 288)
           dataMax: 1.386898e+02
           dataMin: 9.787394e+00
          dataMean: 6.148656e+01

The returned value is another Variable object, which has lost its temporal dimension because it was integrated out. It represents the time average at each grid cell of the original data. The Variable object has a lot of functionality and will be expanded to meet the needs of developers. For a more complete explanation of the interface, consult the documentation. The point of this tutorial, however, is that we use the Variable object to perform analysis operations in a uniform and flexible manner. Its full functionality will be covered in more detail in a future tutorial.
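
To make "integrated over time and divided by the time period" concrete, here is a minimal sketch in plain Python for a single grid cell. It is not the ILAMB implementation: the function name time_mean, the time_bounds argument, and the sample numbers are all assumptions made for illustration.

```python
def time_mean(values, time_bounds):
    """Integrate values over time, then divide by the total time period.

    values      -- one number per time step (a single grid cell here)
    time_bounds -- one (start, end) pair per time step, in days
    """
    if len(values) != len(time_bounds):
        raise ValueError("values and time_bounds must have the same length")
    # Length of each time step; months differ in length, so weights differ
    dt = [end - start for start, end in time_bounds]
    # Time integral: each value weighted by the length of its time step
    integral = sum(v * w for v, w in zip(values, dt))
    return integral / sum(dt)


# Hypothetical data: three monthly means (W m-2) for January to March
monthly = [50.0, 60.0, 70.0]
bounds = [(0, 31), (31, 59), (59, 90)]
print(time_mean(monthly, bounds))  # 60.0
```

Weighting by the length of each time step is what distinguishes this from a plain arithmetic mean whenever the steps are unequal. Here the two happen to agree because January and March are equally long.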

The ModelResult Object

The ModelResult object is meant to make getting a model’s variables easy. We anticipate that researchers will have placed all of a model run’s results in a single directory bearing the model name and perhaps the version or forcing. To create this object, we simply point to the top-level directory where the results are contained:

from ILAMB.ModelResult import ModelResult
m = ModelResult(os.environ["ILAMB_ROOT"] + "/MODELS/CLM40cn",
               modelname = "CLM40cn")

When we instantiate the model result, internally we search for all variables found in all netCDF files contained underneath this top-level directory. This makes extracting variables simple. Once the model result has been defined, we can extract the same variable as above in a much simpler manner:

v = m.extractTimeSeries("rsus")
print(v)

This yields the following screen output:

Variable: rsus
--------------
              unit: W m-2
        isTemporal: True (1932)
         isSpatial: True (192,288)
        nDatasites: N/A
         dataShape: (1932, 192, 288)
           dataMax: 4.028994e+02
           dataMin: 0.000000e+00
          dataMean: 6.153053e+01
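
The directory search described above can be pictured with a small sketch. It is only an illustration and not how ModelResult is written: it assumes the variable name can be guessed from the text before the first underscore in each filename, whereas the package looks at the variables stored inside the netCDF files. The function find_variables and the fake directory contents (including the areacella filename) are hypothetical.

```python
import os
import tempfile


def find_variables(root):
    """Map each variable name to the netCDF files that appear to hold it."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".nc"):
                continue  # skip anything that is not a netCDF file
            # ASSUMPTION: the variable name is the text before the first "_"
            variable = name.split("_")[0]
            found.setdefault(variable, []).append(os.path.join(dirpath, name))
    return found


# Build a tiny fake model directory so the sketch runs without real data
root = tempfile.mkdtemp()
for rel in ("rsus/rsus_Amon_CLM40cn_historical_r1i1p1_185001-201012.nc",
            "areacella_fx_CLM40cn.nc",
            "README.txt"):
    path = os.path.join(root, rel)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "w").close()  # empty placeholder file

variables = find_variables(root)
print(sorted(variables))
```

Building the lookup once, when the object is created, is what lets a later request for a variable name be answered without the caller knowing any file paths.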

In addition to making the acquisition of model data simpler, if land fractions and areas are relevant (that is, the variable is spatial), we will apply them to the variable automatically. The user is only responsible for having the appropriate data files (areacella and sftlf) in the model’s directory. Extracting the variables from the ModelResult object ensures that we handle model data consistently. The ModelResult interface is much smaller than that of the Variable object, and will be expanded in the future.
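
As a rough illustration of why cell areas and land fractions matter, the sketch below computes a spatial mean in which each cell is weighted by its land area. This is one common way such weights are used, not a description of the package internals. The function name land_weighted_mean and the sample values are assumptions, and land fractions are taken to lie between 0 and 1 here.

```python
def land_weighted_mean(values, cell_areas, land_fractions):
    """Spatial mean where each cell is weighted by its land area.

    values         -- one value per grid cell
    cell_areas     -- area of each cell (the role played by areacella)
    land_fractions -- fraction of each cell that is land, assumed in [0, 1]
                      (the role played by sftlf)
    """
    # Land area of each cell: total area scaled by its land fraction
    weights = [a * f for a, f in zip(cell_areas, land_fractions)]
    total = sum(weights)
    if total == 0:
        raise ValueError("no land area in the selected cells")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical cells: all land, half land, and all ocean
values = [10.0, 20.0, 30.0]
areas = [1.0, 1.0, 2.0]
fractions = [1.0, 0.5, 0.0]
print(land_weighted_mean(values, areas, fractions))
```

The ocean cell contributes nothing even though it is the largest, and the half-land cell counts half as much as the fully land cell.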

The Confrontation Object

The Confrontation object manages the benchmark dataset, the extraction of the data from the model, the analysis performed, and the plotting and generation of results. As a developer, you will be writing your own Confrontation objects, so it is important to understand what they are and how they work. First, we will initialize one to help illustrate its functionality:

from ILAMB.Confrontation import Confrontation
c = Confrontation(source   = os.environ["ILAMB_ROOT"] + "/DATA/rsus/CERES/rsus_0.5x0.5.nc",
                  name     = "CERES",
                  variable = "rsus")

As before, we specify the source data relative to the ILAMB_ROOT variable. We also have given the confrontation a name and a variable to expect. There are two main functions to highlight at this point. The first has to do with preparing data for comparison:

obs,mod = c.stageData(m)

The stageData functionality returns both the observational and model datasets as Variable objects, in a form in which they are comparable. For example, if we print mod here, which is analogous to v above, we see:

Variable: rsus
--------------
              unit: W/m2
        isTemporal: True (131)
         isSpatial: True (192,288)
        nDatasites: N/A
         dataShape: (131, 192, 288)
           dataMax: 4.028824e+02
           dataMin: 0.000000e+00
          dataMean: 6.035579e+01

However, the temporal dimension has been greatly reduced (from 1932 entries down to 131). This is because the observational dataset is contemporary and the model starts back in 1850. In addition to clipping the data, we also convert units if appropriate.
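
The clipping step can be sketched in a few lines of plain Python. This is a simplified picture under stated assumptions, not the stageData code: the function clip_to_overlap is hypothetical, and the yearly time axis and date ranges are made up so the example stays small.

```python
def clip_to_overlap(times, values, t0, tf):
    """Keep only the samples whose time falls inside [t0, tf]."""
    kept = [(t, v) for t, v in zip(times, values) if t0 <= t <= tf]
    return [t for t, _ in kept], [v for _, v in kept]


# Hypothetical yearly model series starting in 1850, and observations
# that only cover 2000 through 2010
model_years = list(range(1850, 2011))
model_values = [float(year - 1850) for year in model_years]
obs_years = list(range(2000, 2011))

years, clipped = clip_to_overlap(model_years, model_values,
                                 obs_years[0], obs_years[-1])
print(len(model_years), "->", len(years))
```

Only the part of the model record that overlaps the observations is kept, which is why the staged model variable has far fewer time entries than the original.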

The second main function of the Confrontation is to perform the desired analysis. This happens in the confront functionality:

c.confront(m)

Here m is the ModelResult being passed in. This routine calls stageData internally, and then performs the desired analysis. The function does not return anything, but generates analysis files which contain the results. In this case, you will find two netCDF4 files in your directory: CERES_Benchmark.nc and CERES_CLM40cn.nc. You can use ncdump or ncview (from NetCDF Tools) to examine the contents of these files.

The Confrontation also handles the plotting and generation of HTML output pages, but this is a more advanced aspect of the object, detailed in its interface documentation.

Summary

While there is much more to learn about the ILAMB Python package, these are the basic objects and concepts you will need to grasp to implement new benchmarks and analysis. The basic idea is that we have encapsulated the notion of benchmark datasets and their accompanying analysis into a Confrontation class, which operates on a ModelResult whose data are represented as Variable objects. What we have done here manually is part of what happens inside the ilamb-run script, which we executed in previous tutorials.