
Lesson 3

cdms2, masks, cdscan, cdutil, genutil
cdms2: selecting data
import cdms2
DATAPATH = '/CAS_OBS/mo/sst/HadISST/'
f = cdms2.open(DATAPATH + 'sst_HadISST_Climatology_1961-1990.nc')
x = f('sst') # retrieves the whole dataset - a "slab"

# Selecting a specific area
x = f('sst', latitude=(0., 35.), longitude=(20., 100.))
print x.shape

# Selecting specific times
y = f('sst', time=('0-6-1', '0-6-31'))
# You can also select the specific time from a "slab"
z = x(time=('0-6-1', '0-6-31'))

# You can "squeeze" out "singleton" dimensions
y = f('sst', time=('0-6-1', '0-6-31'), squeeze=1)

# You can also change the order of returned dimensions
z = x(time=('0-6-1', '0-6-31'), order='xyt')
Querying the slab
# You can query the order of returned dimensions!
print 'Order of dimensions for x is:', x.getOrder()

# You can change the order of returned dimensions!


x2 = f('sst', latitude=(0., 35.), longitude=(20., 100.), order='xty')

# If I just want to ensure that time is my first axis
x3 = f('sst', order='t...')

# You can check the axis and its properties


lat_axis = x2.getLatitude()
print lat_axis.info()
lat_bounds = lat_axis.getBounds()

# Axes can be retrieved by index position


my_ax = x2.getAxis(1)
print my_ax.info()
Bounds!

[Figure: a grid cell centred on (lat, lon), bounded in latitude by lat_bounds_1 and lat_bounds_2 and in longitude by lon_bounds_1 and lon_bounds_2]
Specifying precise regions to extract
# If I need a specific point - say lat, lon = 32.1, 100.3
x3 = f('sst', latitude=(32.1, 32.1, 'cob'),
longitude=(100.3, 100.3, 'cob'))
print x3.shape

# In the third element of each tuple (here 'cob'), the first 2 characters
# mark each end of the interval as 'c'losed or 'o'pen
# The 3rd character is 'b' or 'e' or …..
cdutil ‐ overview

• Set of climate data specific utilities.
• The cdutil package is a collection of sub-packages for working with climate data.
• Sub‐components are:
– region
– times: tools to deal with the time dimension.
– vertical
– averager
cdutil: region

• The cdutil.region module lets you extract a region “exactly”: it resets the latitude and longitude bounds to match the requested area, so that the averager function computes an “exact” average over that area.
• Predefined regions are:
– AntarcticZone, AAZ (South of latitude 66.6S)
– ArcticZone, AZ (North of latitude 66.6N)
– NorthernHemisphere, NH (useful for datasets with latitude cells crossing the equator)
– SouthernHemisphere, SH
– Tropics (latitudes band: 23.4S, 23.4N)
cdutil: region
• cdutil.region
– cdms2 selector to extract an “exact” region (i.e. it resets the bounds so that averaging accounts only for the area actually selected, not the full cell).
S = f('var', cdutil.region.NorthernHemisphere)

• Creating your own regions
NINO34 = cdms2.selectors.Selector(
    cdutil.region.domain(latitude=(-5., 5.), longitude=(190., 240.)))
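To see why resetting the bounds matters, here is a minimal pure-Python sketch of the idea. It does not use CDAT and is not how cdutil implements it; the function `clip_bounds` and the sample grid are hypothetical, chosen only for illustration. Cells that straddle the edge of the region are cut so that only the part inside the region counts.

```python
# Pure-Python sketch (no CDAT needed): clip cell bounds to a region.
# clip_bounds and the sample grid are hypothetical, for illustration only.

def clip_bounds(bounds, lo, hi):
    """Return the part of each [start, end] cell that lies inside [lo, hi].

    Cells entirely outside the region are dropped.
    """
    clipped = []
    for start, end in bounds:
        new_start = max(start, lo)
        new_end = min(end, hi)
        if new_start < new_end:
            clipped.append([new_start, new_end])
    return clipped


# Latitude cells 5 degrees wide, centred on -10, -5, 0, 5 and 10
lat_bounds = [[-12.5, -7.5], [-7.5, -2.5], [-2.5, 2.5], [2.5, 7.5], [7.5, 12.5]]

# Region 5S to 5N: the cells centred on -5 and 5 are cut in half
region_bounds = clip_bounds(lat_bounds, -5.0, 5.0)
widths = [end - start for start, end in region_bounds]
print(region_bounds)
print(widths)
```

With the clipped bounds, the two edge cells contribute half the width of the central cell, which is what an “exact” average over 5S to 5N needs.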
Interpolation (re‐gridding)
# Suppose we have a slab
ds1 = f('sst', latitude=(0., 35.), longitude=(20., 100.),
        time=('0-6-1', '0-6-31'), order='xty')
print 'ds1.shape = ', ds1.shape, 'Axis order=', ds1.getOrder()
# Let us now extract another dataset.
f2 = cdms2.open('/CAS_OBS/climatology/NCEP_NCAR_Climatology_ltm/slp.day.ltm.nc')
ds2 = f2('slp', latitude=(0., 35.), longitude=(20., 100.),
         time=('1-6-1', '1-6-31'))
# I want to transform ds2 into the grid in ds1
ds3 = ds2.regrid(ds1.getGrid())
print 'ds3.shape = ', ds3.shape, 'Axis order=', ds3.getOrder()
# Alternate regridder (the module that ships with cdms2 is named regrid2)
from regrid2 import Regridder
ingrid = ds2.getGrid()
outgrid = ds1.getGrid()
regridFunc = Regridder(ingrid, outgrid)
new_ds2 = regridFunc(ds2)
Vertical regridding 
• You can regrid data on pressure-level coordinates along the vertical axis using the pressureRegrid() method.

• You need to define, or use an existing, vertical axis.

• Then call the pressureRegrid method on the variable you wish to regrid, passing it the new levels as the argument.

• If var is the variable to regrid and newlevs is the vertical axis to regrid to:
>>> var_on_new_levels = var.pressureRegrid(newlevs)
cdms2: Using masks
sst_mask = cdms2.MV2.getmask(ds1(order='tyx'))
print 'sst_mask.shape after reorder = ', sst_mask.shape
print sst_mask.__class__
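# The mask must match the shape of ds3, so we resize it in place
sst_mask.resize(ds3.shape)
# Build a new masked variable from the ds3 data and the SST mask
ds4 = cdms2.createVariable(ds3[:], mask=sst_mask,
                           id='masked_psl', fill_value=1.e+20)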
cdscan
• A utility that helps you manage large sets of files.
• When you have many .nc (or .ctl) files, you can use this utility to generate a single “xml” file that describes them all, so they can be opened as one dataset.
• Try the following:
cdscan -x "some_filename.xml" DATA_PATH1/*.nc
– You can also change the time axis while you are at it:
cdscan -x "some_filename.xml" -i 1 -r "months since 1-1-1" DATA_PATH2/*.nc
cdutil: times
• cdutil.times – tools for time axes, geared toward climate data
– The climatology, departure and anomaly tools work on BOUNDS, NOT on time values. They are designed for month-based seasons, but you can create an engine for other kinds of data (daily, yearly, etc.).
ac=cdutil.times.ANNUALCYCLE.climatology(s)
– To set bounds you can use:
cdutil.setTimeBoundsMonthly(Obj)
cdutil.setTimeBoundsYearly(Obj)
cdutil.setTimeBoundsDaily(Obj, frequency=1)
(Obj can be a slab or a time axis)
– Create your own seasons:
MONSOON = cdutil.times.Seasons('JJAS')
• cdutil imports everything from the times module, so you can call these functions directly, e.g.:
cdutil.setTimeBoundsMonthly(slab/axis)
The importance of bounds

• CDAT used to set bounds automatically, placing them halfway between neighbouring points. E.g.:
longitude = [0, 90, 180, 270]
∴ bounds = [[-45, 45], [45, 135], [135, 225], [225, 315]]
• This seems reasonable, but imagine a monthly mean time series where the times are recorded on the 1st day of each month:
timeax = ["1999-1-1", "1999-2-1", …, "2100-12-1"]
• CDAT then assumes that each value represents the period from about the 15th of the previous month to the 15th of the current month.
• Since the cdutil tools use bounds, they will misinterpret the data. You need to set the bounds sensibly:
>>> cdutil.setTimeBoundsMonthly(timeax)
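The problem can be reproduced without CDAT. The sketch below is a pure-Python illustration, not CDAT's own code: `midpoint_bounds` is a hypothetical helper that guesses bounds halfway between neighbouring points, and the time axis is assumed to be in days since 1999-1-1.

```python
# Pure-Python sketch: midpoint bounds versus calendar-month bounds.
# midpoint_bounds is a hypothetical helper, not a CDAT function.

def midpoint_bounds(values):
    """Guess cell bounds by placing edges halfway between neighbouring points.

    The two outer edges are extrapolated from the spacing of the nearest pair.
    """
    edges = [values[0] - (values[1] - values[0]) / 2.0]
    for left, right in zip(values[:-1], values[1:]):
        edges.append((left + right) / 2.0)
    edges.append(values[-1] + (values[-1] - values[-2]) / 2.0)
    return [[edges[i], edges[i + 1]] for i in range(len(values))]


# Longitudes: midpoint bounds are what we want
lon_bounds = midpoint_bounds([0, 90, 180, 270])

# 1 Jan, 1 Feb, 1 Mar and 1 Apr 1999, in days since 1999-1-1 (assumed units)
month_starts = [0, 31, 59, 90]

# Midpoint guess: January's mean is treated as mid-December to mid-January
guessed = midpoint_bounds(month_starts)

# Bounds that match calendar months (the last point has no end date here)
monthly = [[month_starts[i], month_starts[i + 1]]
           for i in range(len(month_starts) - 1)]

print(lon_bounds)
print(guessed)
print(monthly)
```

The guessed bounds for the first time step start before 1 January, while the calendar-month bounds cover day 0 to day 31. Any tool that assigns data to months from its bounds will treat the two cases differently.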
Pre‐defined time‐related means
• DJF, MAM, JJA, SON (seasons) 
>>> djf_mean=cdutil.DJF(my_var)

• SEASONALCYCLE (means for the 4 predefined seasons 
[DJF, MAM, JJA, SON ]) – array of above.
>>> seas_mns=cdutil.SEASONALCYCLE(my_var)

• YEAR (annual means)

• ANNUALCYCLE (monthly means for each month of the 
year)
– EXERCISE: Try calculating the climatological annual cycle for 
the NCEP Reanalysis data you have read in.
Climatologies and departures
Season extractors have 2 functions available:
• climatology: which computes the average of all 
seasons passed. ANNUALCYCLE.climatology(), will return 
the 12 month annual cycle for the slab:
>>> ann=cdutil.ANNUALCYCLE.climatology(v)

• departures: which given an optional climatology will 
compute seasonal departures from it.
>>> d=cdutil.ANNUALCYCLE.departures(v, cli60_99)

# Note that the second argument is optional. It can be a pre-computed climatology: here cli60_99 is a 1960-1999 climatology, while the variable v is defined from 1900 to 2000. If it is not given, the overall climatology of v is used.
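As a rough illustration of what climatology and departures compute, here is a pure-Python sketch on a plain list of monthly values. The function names are hypothetical and the series is made up; it assumes the data start in January and cover whole years, and it ignores bounds, masks and metadata, all of which cdutil handles.

```python
# Pure-Python sketch of an annual-cycle climatology and its departures.
# Function names and data are hypothetical, for illustration only.

def annual_cycle_climatology(monthly_values):
    """Average all Januaries, all Februaries, ... into 12 values.

    Assumes the series starts in January and holds whole years.
    """
    n_years = len(monthly_values) // 12
    return [
        sum(monthly_values[year * 12 + month] for year in range(n_years))
        / float(n_years)
        for month in range(12)
    ]


def annual_cycle_departures(monthly_values, climatology=None):
    """Subtract the matching climatological month from every value.

    If no climatology is given, the one computed from the series is used.
    """
    if climatology is None:
        climatology = annual_cycle_climatology(monthly_values)
    return [value - climatology[i % 12]
            for i, value in enumerate(monthly_values)]


# Two made-up years: the second is uniformly 2 units warmer than the first
year1 = [float(month) for month in range(12)]
year2 = [value + 2.0 for value in year1]
series = year1 + year2

clim = annual_cycle_climatology(series)
anom = annual_cycle_departures(series)                  # against the overall climatology
anom_vs_year1 = annual_cycle_departures(series, year1)  # against a pre-computed one
```

Here `anom` is negative throughout the first year and positive throughout the second, while `anom_vs_year1` is zero in the first year, which mirrors the difference between omitting and passing the second argument.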
Simple user‐defined averaging
• You can create your own simple averages using 
arrays, slabs or variables in the usual way:
– Averaging over 4 time steps:
>>> t.shape
(4, 181, 360)
>>> av=(t[0]+t[1]+t[2]+t[3])/4

• Drawbacks:
– Doesn’t retain your metadata.
– Cannot average simply across axes within a variable.
MV2 Averaging
• The MV2 module has an averaging function:
MV2.average(x, axis=0, weights=None, returned=0)
– computes the average value of the non-masked elements of x along the selected axis. If weights is given, it must match the size and shape of x, and the value returned is:
sum(x * weights) / sum(weights)
– elements corresponding to those masked in x or weights are ignored. If returned is true, a 2-tuple consisting of the average and the sum of the weights is returned.
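The behaviour described above can be sketched in pure Python for a 1-D case. This is an illustration of a masked weighted average, not the MV2 implementation: `masked_average` is a hypothetical function and `None` stands in for a masked element.

```python
# Pure-Python sketch of a masked, weighted average (1-D only).
# masked_average is hypothetical; None plays the role of a masked element.

def masked_average(values, weights=None, returned=False):
    """Weighted mean of the non-masked values.

    An element is skipped if either the value or its weight is None.
    With returned=True, also return the sum of the weights actually used.
    """
    if weights is None:
        weights = [1.0] * len(values)
    total = 0.0
    weight_sum = 0.0
    for value, weight in zip(values, weights):
        if value is None or weight is None:
            continue
        total += value * weight
        weight_sum += weight
    average = total / weight_sum if weight_sum else None
    if returned:
        return average, weight_sum
    return average


values = [1.0, None, 3.0, 5.0]   # second element is "masked"
weights = [2.0, 1.0, 1.0, 1.0]

plain = masked_average(values)
weighted, used_weight = masked_average(values, weights, returned=True)
```

Note that the weight of the masked element does not enter the sum of the weights, so the masked point affects neither the numerator nor the denominator.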
MV2 Averaging: example

• To calculate a set of zonal means:
import cdms2, MV2
f = cdms2.open('/CAS_OBS/mo/sst/HadISST/sst_HadISST_Climatology_1961-1990.nc')
data = f('sst')
print data.shape
zm=MV2.average(data, axis=2)
print zm.shape
print zm.info()
The cdutil “averager” function

• The “cdutil.averager()” function is the key to 
spatial and temporal averaging in CDAT.
• Masks are dealt with implicitly.
• Powerful area averaging function. 
• Provides control over the order of operations 
(i.e. which dimensions are averaged over first).
• Allows weightings for the different axes:
– pass your own array of weights for each dimension,  
use the default (grid) weights or specify equal 
weighting. 
Usage of cdutil.averager
result = averager( V, axis=axisoptions,
weights=weightoptions,
action=actionoptions,
returned=returnedoptions,
combinewts=combinewtsoptions)

axisoptions has to be a string. You can pass axis='tyx', or '123', or 'x (plev)'.

weightoptions is one of 'generate' | 'weighted' | 'equal' | 'unweighted' | array | Masked Variable

actionoptions is 'average' | 'sum' [Default = 'average'].
You can return either the weighted average or the weighted sum of the data.
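To show why the choice of weights matters, the following pure-Python sketch compares an equal-weight average with an area-weighted one over latitude bands. It assumes weights proportional to the difference of the sines of the band edges, which is the area of a latitude band on a sphere; whether the averager's generated weights match this exactly is an assumption here. The function names and the data are hypothetical.

```python
# Pure-Python sketch: equal versus area-weighted averaging over latitude.
# latitude_weights and weighted_mean are hypothetical helpers.
import math


def latitude_weights(lat_bounds):
    """Normalised weights proportional to the area of each latitude band."""
    raw = [math.sin(math.radians(north)) - math.sin(math.radians(south))
           for south, north in lat_bounds]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_mean(values, weights):
    """Plain weighted mean: sum(v * w) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Three wide latitude bands and a made-up zonal-mean value for each
lat_bounds = [[-90, -30], [-30, 30], [30, 90]]
zonal_means = [0.0, 30.0, 0.0]

equal = weighted_mean(zonal_means, [1.0, 1.0, 1.0])
area_weights = latitude_weights(lat_bounds)
weighted = weighted_mean(zonal_means, area_weights)

print(equal)
print(weighted)
```

The band from 30S to 30N covers half of the sphere, so it gets about half of the total weight and the area-weighted mean comes out higher than the equal-weight mean for this data.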
Example: Region Averaging and Climatology
import cdms2, cdutil
# Define your custom region.
NINO3 = cdms2.selectors.Selector(cdutil.region.domain(
    latitude=(-5., 5., 'ccb'), longitude=(210., 270., 'ccb')))
# INDIR is the directory that holds the SST file
fsst = cdms2.open(INDIR + 'sst_HadISST_1870-1_2011-1.nc')
nino3_data = fsst('sst', NINO3)
print nino3_data.shape, nino3_data.getOrder()
# Compute the Spatial average
nino3_average = cdutil.averager(nino3_data, axis='xy')
# Anomaly from climatology computed over 1961-1990
nino3_slice = nino3_average(time=('1961-1-1', '1990-12-31'))
nino3_clim = cdutil.ANNUALCYCLE.climatology(nino3_slice)
print nino3_clim.shape
# Now departures
nino3_anomaly = cdutil.ANNUALCYCLE.departures(nino3_average, nino3_clim)
print nino3_anomaly.shape
EXERCISE

• Extract the SST data and compute global 
anomalies from the 1961‐1990 climatology for 
the whole length of dataset. 
• Average the anomaly data over x and y axes 
using “equal” weights for both axes and 
compare against area “weighted” average.
genutil : general utilities
• genutil.statistics: set of basic statistical functions
• correlation, covariance, geometricmean, laggedcorrelation, laggedcovariance, linearregression, meanabsdiff, median, array_indexing, percentiles, arrayindexing, rank, autocorrelation, rms, autocovariance, std, variance
Statistics Example
• c1 = genutil.statistics.correlation(a, b, axis='t')
• c1.shape
• c2 = genutil.statistics.correlation(a, b, axis='xy')
• c2.shape
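For reference, the quantity being computed along the chosen axis is a correlation coefficient. The sketch below is a pure-Python version of the standard Pearson correlation for two 1-D series; it is an illustration only, and genutil may differ in details such as weighting, masked values and the handling of several axes. The function name and the data are hypothetical.

```python
# Pure-Python sketch of the Pearson correlation between two 1-D series.
# This is an illustration, not the genutil implementation.
import math


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / float(n)
    mean_b = sum(b) / float(n)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]   # rises with a
c = [4.0, 3.0, 2.0, 1.0]   # falls as a rises

r_ab = correlation(a, b)
r_ac = correlation(a, c)
```

Correlating along 't' applies this kind of calculation to the time series at each grid point, which is why the result keeps the spatial dimensions.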
Support for other grid types

RectGrid ‐ Associated latitude and longitude are 
1‐D axes, with strictly monotonic values.

CurveGrid ‐ Latitude and longitude are 2‐D 
coordinate axes (Axis2D).

GenericGrid ‐ Latitude and longitude are 1‐D 
auxiliary coordinate axes (AuxAxis1D)
Curvilinear and Generic Grids 
Acknowledgements
• Dean Williams, Charles Doutriaux (PCMDI, 
LLNL)
• Dr. Johnny Lin 
