20CRv3 early access details

Created by laura.slivinski on - Updated on 02/23/2019 12:49

How to access preliminary NOAA-CIRES-DOE 20CRv3 data at the National Energy Research Scientific Computing Center (NERSC).

 

First, you must obtain an account at http://www.nersc.gov/users/accounts/user-accounts/get-a-nersc-account/ .
In the Description field of the account form, enter "I will be working with Gilbert Compo on analyses of the new NOAA-CIRES-DOE 20th Century Reanalysis version 3 using repo m958. I will be studying <your scientific interest goes here>." 

 

To see details on the model, assimilation, observations, and other implementation algorithms, click here.

 

Ensemble mean and spread files in GRIB format, along with text observation files, are currently available on disk at NERSC:

/project/projectdirs/incite11/ensda_v451/ensda_[syear]/YYYYMMDDHH and

/project/projectdirs/20C_Reanalysis/ensda_v452/ensda_[syear]/YYYYMMDDHH, where:

  • 451 = 20CRv3si and 452 = 20CRv3mo

  • syear = “stream year” is any year from 1804 - 2009 ending in a 4 or a 9

  • Yearmonthdayhour directories are available every six hours (00,06,12,18)
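The layout above can be sketched in Python as follows. This is only an illustration: the function name `six_hourly_dirs` and its arguments are hypothetical, the stream year is passed in explicitly, and the two base paths are copied from the text.

```python
from datetime import date

# Base directories copied from the text, keyed by version number.
BASE_DIRS = {
    451: "/project/projectdirs/incite11/ensda_v451",
    452: "/project/projectdirs/20C_Reanalysis/ensda_v452",
}

def six_hourly_dirs(version, syear, day):
    """Return the four YYYYMMDDHH directories (00, 06, 12, 18) for one day."""
    if syear % 10 not in (4, 9):
        raise ValueError("stream year must end in 4 or 9")
    root = f"{BASE_DIRS[version]}/ensda_{syear}"
    return [f"{root}/{day:%Y%m%d}{hour:02d}" for hour in (0, 6, 12, 18)]

# Hypothetical usage: the directories for 1 January 1916 in stream 1914.
for path in six_hourly_dirs(451, 1914, date(1916, 1, 1)):
    print(path)
```

The function only builds path strings; whether a given directory exists on disk still has to be checked at NERSC.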

 

Within a directory, there are several types of files. Consider the example 1916010100:

> l /project/projectdirs/incite11/ensda_v451/ensda_1914/1916010100

total 234240

drwxrwxr-x     2 cmccoll m958  131072 May 24 14:50 ./

drwxrwxr-x+ 8895 cmccoll m958   524288 Jul 24 17:08 ../

-rwxrwxr-x     1 cmccoll m958 15102002 Jan 11  2018 pgrbensmeananl_1916010100.grb2*

-rwxrwxr-x     1 cmccoll m958 16957694 Jan 11  2018 pgrbensmeananl_1916010103.grb2*

-rwxrwxr-x     1 cmccoll m958 16966630 Jan 11  2018 pgrbensmeanfg_1916010100_fhr06.grb2*

-rwxrwxr-x     1 cmccoll m958 57797749 Jan  3 2018 pgrbenssprdanl_1916010100*

-rwxrwxr-x     1 cmccoll m958 65366796 Jan  3 2018 pgrbenssprdanl_1916010103*

-rwxrwxr-x     1 cmccoll m958 65280022 Jan  3 2018 pgrbenssprdfg_1916010100_fhr06*

-rwxrwxr-x     1 cmccoll m958   90315 Jan 3 2018 psobfile*

-rwxrwxr-x     1 cmccoll m958   88308 Jan 3 2018 psobs.txt*

-rwxrwxr-x     1 cmccoll m958  149856 Jan 3 2018 psobs_posterior.txt*

-rwxrwxr-x     1 cmccoll m958   88977 Jan 3 2018 psobs_prior.txt*
 

“pgrb” refers to the file type: grib (for spread files) or grib2 (for mean and every-member files).

“anl” refers to “analysis”, and “fg” refers to “first guess”. Not all variables are available in all types of files. Accumulated and averaged variables are only available in pgrb files 3 hours after the central analysis time (in this example 00Z) and in the “fg” file valid 6 hours after the central analysis time. Accumulations and averages are needed from both to make a 3-hourly time series. See below for more details.

The psob* files are text files with observation statistics; psobs_posterior.txt is the final file with all statistics after completing the assimilation of observations at that time step. See link for specifics regarding each field in this file.

 

Examples for YYYYMMDDHH = 2011010100:

pgrbensmeananl_YYYYMMDDHH.grb2 (pgrbanl, for short)

pgrbensmeananl_YYYYMMDD{HH+3}.grb2  (pgrbanl+3 for short)

pgrbensmeanfg_YYYYMMDDHH_fhr06.grb2  (pgrbfg for short)

Note that grib1 files (i.e., ensemble spread files) will have the same variables, but may have different names depending on your reader.
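As a minimal sketch of the naming pattern, the Python function below builds the three file names for one cycle. The function name `ensemble_file_names` is hypothetical, and the rule that mean files end in .grb2 while spread files have no extension is inferred from the directory listing earlier on this page.

```python
from datetime import datetime, timedelta

def ensemble_file_names(stamp, stat="mean"):
    """Names of the analysis, analysis+3h and first-guess files for one cycle.

    stamp is a YYYYMMDDHH string; stat is "mean" or "sprd".
    """
    center = datetime.strptime(stamp, "%Y%m%d%H")
    plus3 = center + timedelta(hours=3)
    # Assumption from the listing: mean files carry .grb2, spread files no extension.
    suffix = ".grb2" if stat == "mean" else ""
    return {
        "pgrbanl": f"pgrbens{stat}anl_{center:%Y%m%d%H}{suffix}",
        "pgrbanl+3": f"pgrbens{stat}anl_{plus3:%Y%m%d%H}{suffix}",
        "pgrbfg": f"pgrbens{stat}fg_{center:%Y%m%d%H}_fhr06{suffix}",
    }

# Hypothetical usage for the cycle shown in the listing.
for short_name, file_name in ensemble_file_names("1916010100").items():
    print(short_name, file_name)
```

Using `datetime` for the +3 hour step keeps the name correct when the step crosses a day, month, or year boundary.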

In particular, note that precipitation rate is only in the pgrbfg* and pgrbanl+3 files. For YYYYMMDDHH = 2011010100 as above, note that the pgrbfg* file actually contains the 3-hour average precipitation rate from 2010123121 to 2011010100, and the pgrbanl+3 file contains the 3-hour average precipitation rate from 2011010100 to 2011010103.

This holds true for other average and accumulation variables.
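To build a 3-hourly series of an averaged variable, one therefore alternates between the two file types. The Python sketch below picks the ensemble-mean file for a window ending at a given time. It is an interpretation of the precipitation example above, not a documented rule, and the function name `file_for_window_end` is hypothetical.

```python
from datetime import datetime, timedelta

def file_for_window_end(end_stamp):
    """Return (directory stamp, file name) for the 3-hour average ending at end_stamp."""
    end = datetime.strptime(end_stamp, "%Y%m%d%H")
    if end.hour % 6 == 0:
        # Window ends on an analysis time: that cycle's first-guess file.
        return end_stamp, f"pgrbensmeanfg_{end_stamp}_fhr06.grb2"
    if end.hour % 6 == 3:
        # Window ends 3 h after an analysis time: that cycle's pgrbanl+3 file.
        cycle = end - timedelta(hours=3)
        return f"{cycle:%Y%m%d%H}", f"pgrbensmeananl_{end_stamp}.grb2"
    raise ValueError("window must end at 00, 03, 06, ..., 21Z")

# Hypothetical usage: the two windows described in the text.
print(file_for_window_end("2011010100"))
print(file_for_window_end("2011010103"))
```

Both windows in this usage example resolve to files in the 2011010100 directory, matching the description above.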

Ensemble Mean and spread files in netCDF are currently available on disk at NERSC (to use nc tools run "module load nco")

Yearly files of 3 hourly or monthly mean fields for selected variables are currently being generated in 

/global/cscratch1/sd/cmccoll/ensmean_ncfiles_v451  
/global/cscratch1/sd/cmccoll/enssprd_ncfiles_v451
/global/cscratch1/sd/cmccoll/ensmean_ncfiles_v452  
/global/cscratch1/sd/cmccoll/enssprd_ncfiles_v452
 

As an example, files for Pressure at Mean Sea Level (PRMSL) are available from 1836 to 1980 in the v451 directories and from 1981 to 2015 in the v452 directories:

ls /global/cscratch1/sd/cmccoll/ens*ncfiles*_v45?/PRMSL*

/global/cscratch1/sd/cmccoll/ensmean_ncfiles_v451/PRMSL.1836.mnmean_v451.nc
/global/cscratch1/sd/cmccoll/ensmean_ncfiles_v451/PRMSL.1836_v451.nc
...

/global/cscratch1/sd/cmccoll/ensmean_ncfiles_v452/PRMSL.2015.mnmean_v452.nc
/global/cscratch1/sd/cmccoll/ensmean_ncfiles_v452/PRMSL.2015_v452.nc
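The file names in this listing follow a regular pattern, which the Python sketch below reproduces. The function name `yearly_nc_path` is hypothetical. Two assumptions go beyond the listing: that the 1980/1981 split quoted for PRMSL applies to other variables, and that the enssprd directories use the same file names as the ensmean directories.

```python
SCRATCH = "/global/cscratch1/sd/cmccoll"

def yearly_nc_path(var, year, stat="ensmean", monthly=False):
    """Path of a yearly netCDF file; stat is "ensmean" or "enssprd"."""
    # Assumption: years up to 1980 live under v451, later years under v452.
    version = "v451" if year <= 1980 else "v452"
    # Monthly-mean files carry an extra ".mnmean" tag before the version.
    tag = ".mnmean" if monthly else ""
    return f"{SCRATCH}/{stat}_ncfiles_{version}/{var}.{year}{tag}_{version}.nc"

# Hypothetical usage: the first and last PRMSL files from the listing.
print(yearly_nc_path("PRMSL", 1836, monthly=True))
print(yearly_nc_path("PRMSL", 2015))
```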
 

Ensemble Mean and spread files in netCDF available at NERSC’s high performance storage system (HPSS)

To see what is available in a given HPSS directory, log in to Cori (or Edison) and run “hsi ls /home/projects/incite11/[subdirectory]/”

Possible subdirectories:

    ensda_v451_ensmean_netCDF

    ensda_v452_ensmean_netCDF

    ensda_v451_enssprd_netCDF

    ensda_v452_enssprd_netCDF

Filenames:

    VAR_Y1-Y10_ensmean_v451.tar

    VAR_Y1-Y10_enssprd_v451.tar

where VAR is the variable and Y1-Y10 is the range of years in the tarball:

    For v451, Y1-Y10 can be 1836-1845, 1846-1855, 1856-1865, ..., 1976-1980

    For v452, Y1-Y10 can be 1981-1990, 1991-2000, 2001-2010, 2011-2015

Example:

/home/projects/incite11/ensda_v451_ensmean_netCDF/WEASD_1976-1985_ensmean_v451.tar

 

Individual ensemble members, as well as mean and spread files, are available on NERSC’s high performance storage system (HPSS)

 

Example workflow from Philip Brohan that accesses a few selected variables and converts every member to netCDF

Example workflow of Chesley McColl that accesses 85 selected variables and converts every member to netCDF

Example workflow of Chesley McColl that accesses 85 selected variables and converts ensemble mean and spread to netCDF

 

To see what is available in a given HPSS directory, log in to Cori (or Edison) and run “hsi ls /home/projects/incite11/[subdirectory]/”

To see all files within a tarball, use “htar -tvf [hpss directory]/[tarball].tar”.

See below for relevant paths and tarball names.
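If these listings are scripted, the two commands can be assembled as argument lists in Python, as in the sketch below. The helper names `hsi_ls_command` and `htar_list_command` are hypothetical; only the `hsi ls` and `htar -tvf` invocations themselves come from the text.

```python
import shlex

HPSS_ROOT = "/home/projects/incite11"

def hsi_ls_command(subdirectory):
    """Argument list for listing one HPSS subdirectory."""
    return ["hsi", "ls", f"{HPSS_ROOT}/{subdirectory}/"]

def htar_list_command(subdirectory, tarball):
    """Argument list for listing the contents of one tarball."""
    return ["htar", "-tvf", f"{HPSS_ROOT}/{subdirectory}/{tarball}"]

# Hypothetical usage: print the command lines instead of running them.
print(shlex.join(hsi_ls_command("ensda_v451_ensmean_netCDF")))
print(shlex.join(htar_list_command("ensda_v451_ensmean_netCDF",
                                   "WEASD_1976-1985_ensmean_v451.tar")))
```

On a NERSC login node, each list could be passed to `subprocess.run(cmd, check=True)`; the sketch only prints the commands so that it can be tried anywhere.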

 

Details of access for every-member netCDF files on the HPSS (currently in progress):

***Note: These directories are incomplete.  This post-processing is in progress, so some years may be complete, but many are not.

For 1836 - 1980:

/home/projects/incite11/20CR_v3_451_ncfiles/[variable]/

For 1981 onward:

/home/projects/incite11/20CR_v3_452_ncfiles/[variable]/

Each of these directories contains every-member 3-hourly netCDF files for a single year in the form [variable]_[YYYY]_v3.tar, as well as every-member monthly mean netCDF files for a single year in the form [variable]_[YYYY]_mnmean_v3.tar.

For example, /home/projects/incite11/20CR_v3_451_ncfiles/PRMSL/PRMSL_1901_v3.tar contains PRMSL.1901_mem001.nc through PRMSL.1901_mem080.nc. Each of these netCDF files contains the 3-hourly Pressure Reduced to Mean Sea Level for all of 1901 for the given member.
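The naming scheme in this example can be sketched in Python as follows. The function names are hypothetical, and the three-digit member numbering (mem001 to mem080) is taken from the PRMSL example. The names of the files inside the monthly-mean tarballs are not given here, so `member_files` only covers the 3-hourly tarball.

```python
HPSS_ROOT = "/home/projects/incite11"

def member_tarball(var, year, monthly=False):
    """HPSS path of the every-member netCDF tarball for one variable and year."""
    # Years through 1980 are under the 451 directory, later years under 452.
    version = "451" if year <= 1980 else "452"
    tag = "_mnmean" if monthly else ""
    return f"{HPSS_ROOT}/20CR_v3_{version}_ncfiles/{var}/{var}_{year}{tag}_v3.tar"

def member_files(var, year, n_members=80):
    """Names of the per-member 3-hourly files expected inside the tarball."""
    return [f"{var}.{year}_mem{m:03d}.nc" for m in range(1, n_members + 1)]

# Hypothetical usage for the PRMSL example.
print(member_tarball("PRMSL", 1901))
print(member_files("PRMSL", 1901)[0], "...", member_files("PRMSL", 1901)[-1])
```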

 

Details of access for every-member grib files on the HPSS: 
 

For 1836 - 1980:

/home/projects/incite11/ensda_v451_archive_grb2_monthly/ensda_451_[syear]/[YYYY]/  

For 1981 onward (as of 11 Oct 2018, 2015 is finished):

/home/projects/incite11/ensda_v452_archive_grb2_monthly/ensda_452_[syear]/[YYYY]/

Recall “syear” will end in a 4 or a 9, and production years within that directory will start 1 January two years after syear.  So, /home/projects/incite11/ensda_v451_archive_grb2_monthly/ensda_451_1859/

contains years 1861 - 1865, and /home/projects/incite11/ensda_v451_archive_grb2_monthly/ensda_451_1864/

contains years 1866 - 1870.
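The stream-year rule can be written as a short calculation. The Python sketch below assumes that every stream follows the five-year pattern of the two examples (stream s holds production years s+2 through s+6); the function names are hypothetical.

```python
def stream_year(year):
    """Stream year (ending in 4 or 9) whose production years include `year`."""
    # Stream s covers years s+2 .. s+6, so step back 2 years plus the offset
    # of `year` within its five-year block.
    return year - 2 - (year - 1) % 5

def monthly_archive_dir(year):
    """HPSS directory holding the monthly grib tarballs for one year."""
    version = "451" if year <= 1980 else "452"
    syear = stream_year(year)
    return (f"/home/projects/incite11/ensda_v{version}_archive_grb2_monthly/"
            f"ensda_{version}_{syear}/{year}/")

# Hypothetical usage: years 1861-1865 map to 1859, years 1866-1870 to 1864.
for year in (1861, 1865, 1866, 1870):
    print(year, stream_year(year))
print(monthly_archive_dir(1861))
```

The same calculation gives stream 1914 for 1916, which matches the on-disk example earlier on this page.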

 

To access, log in to a NERSC data transfer node (dtn01.nersc.gov or dtn02.nersc.gov), cd to the directory where you want the data (probably in $SCRATCH), and run “htar -xvf /home/projects/incite11/ensda_v451_archive_grb2_monthly/ensda_451_[syear]/[YYYY]/[tarball].tar”.

To see all tarballs within a directory, run “hsi ls /home/projects/incite11/ensda_v451_archive_grb2_monthly/ensda_451_[syear]/[YYYY]/”

 

Within each [YYYY] directory, each individual ensemble member (01 - 80) for each month is tar’d up, as well as the ensemble statistics. Note that each YYYYMM tarball still includes 3-hourly (pgrbanl, sflx)  or 6-hourly (pgrbfg, psobs) files within it, NOT monthly means.

YYYYMM_pgrbanl_mem0**.tar

YYYYMM_pgrbfg_mem0**.tar

YYYYMM_pgrbensmean.tar

YYYYMM_pgrbenssprd.tar

YYYYMM_sflxgrbensmean.tar  (includes sflxgrbensmeanfg_YYYYMMDDHH_fhr03.grb and sflxgrbensmeanfg_YYYYMMDDHH_fhr06.grb; see links below.)

YYYYMM_sflxgrbenssprd.tar

YYYYMM_psobs.tar (includes observation diagnostic files psobfile, psobs.txt, psobs_prior.txt, and psobs_posterior.txt; see above link for descriptions.)
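For scripting, the tarball names for one month can be generated as in the Python sketch below. It assumes that "mem0**" stands for three-digit member numbers 001 to 080; the function name `monthly_tarballs` is hypothetical.

```python
def monthly_tarballs(year, month, n_members=80):
    """Names of all tarballs expected in a [YYYY] directory for one month."""
    ym = f"{year:04d}{month:02d}"
    names = []
    # One tarball per member for the analysis and first-guess files.
    for kind in ("pgrbanl", "pgrbfg"):
        names += [f"{ym}_{kind}_mem{m:03d}.tar" for m in range(1, n_members + 1)]
    # One tarball each for the ensemble statistics and observation diagnostics.
    for kind in ("pgrbensmean", "pgrbenssprd",
                 "sflxgrbensmean", "sflxgrbenssprd", "psobs"):
        names.append(f"{ym}_{kind}.tar")
    return names

# Hypothetical usage: January 1861.
names = monthly_tarballs(1861, 1)
print(len(names), names[0], names[-1])
```

Comparing this list against the output of `hsi ls` for the directory is one way to spot missing tarballs.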

 

Examples for YYYYMMDDHH = 2016010106:

sflxgrbensmeanfg_2016010106_fhr03.grb (includes accum. variables from 0-3Z)

sflxgrbensmeanfg_2016010106_fhr06.grb (includes accum. variables from 3-6Z)
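The two examples suggest that the time stamp in the file name marks the end of the six-hour first-guess period. Under that assumption, which is inferred from these two examples only, the accumulation window of each file can be computed as in this Python sketch (the function name `sflx_window` is hypothetical):

```python
from datetime import datetime, timedelta

def sflx_window(stamp, fhr):
    """(start, end) of the 3-hour window in sflxgrbensmeanfg_<stamp>_fhr0<fhr>.grb."""
    if fhr not in (3, 6):
        raise ValueError("fhr must be 3 or 6")
    valid = datetime.strptime(stamp, "%Y%m%d%H")
    # Assumption: the forecast starts 6 h before the stamp, so forecast hour
    # `fhr` ends (6 - fhr) hours before the stamp.
    end = valid - timedelta(hours=6 - fhr)
    return end - timedelta(hours=3), end

# Hypothetical usage for the example stamp.
for fhr in (3, 6):
    start, end = sflx_window("2016010106", fhr)
    print(f"fhr{fhr:02d}: {start:%Y%m%d%H} to {end:%Y%m%d%H}")
```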

NOTE: There is an unresolved issue where some sflxgrbensmean files include two more variables than other files (pressure at convective cloud top and pressure at convective cloud bottom). We are unsure of the extent of this problem.

 

If everymember sflx files are necessary, then one needs to access the HPSS directories /home/projects/incite11/ensda_v451_archive_orig and

/home/projects/incite11/ensda_v452_archive_orig

 

Within /home/projects/incite11/ensda_v451_archive_orig/ensda_451_[syear] are the six-hourly tarballs with everything.  For example:

/home/projects/incite11/ensda_v451_archive_orig/ensda_451_1904/1906020106.tar includes:

sanl_1906020106_fhr0[3,6]_mem0[01..80] + ensmean (spectral model file converted to grib)

pgrbfg* everymember and ensemble statistics (all in grib1)

pgrbanl for 1906020106 and 1906020109

psob* files

sflxgrb_1906020106_fhr0[0,3,6,9]_mem0[01..80]

sflxgrbensmeanfg_1906020106_fhr03

sflxgrbensmeanfg_1906020106_fhr06

sflxgrbenssprdfg_1906020106_fhr03

sflxgrbenssprdfg_1906020106_fhr06
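The bracket notation in this list is shorthand for many individual files. As a rough sketch, the Python function below expands one such pattern. It assumes two-digit forecast hours and three-digit member numbers (mem001 to mem080), as the shorthand suggests; the function name `expand_members` is hypothetical.

```python
from itertools import product

def expand_members(prefix, stamp, fhrs, n_members=80):
    """Expand e.g. sflxgrb_<stamp>_fhr0[0,3,6,9]_mem0[01..80] into file names."""
    return [
        f"{prefix}_{stamp}_fhr{fhr:02d}_mem{member:03d}"
        for fhr, member in product(fhrs, range(1, n_members + 1))
    ]

# Hypothetical usage: every-member sflx and sanl files in 1906020106.tar.
sflx = expand_members("sflxgrb", "1906020106", (0, 3, 6, 9))
sanl = expand_members("sanl", "1906020106", (3, 6))
print(len(sflx), sflx[0], sflx[-1])
print(len(sanl), sanl[0], sanl[-1])
```

The expanded names can be checked against the output of `htar -tvf` for the tarball before extracting anything.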

 

Note: the 00Z tarballs on days 1, 10, and 20 of each month (and on some days 5, 15, and 25) will have extra files (needed to back up and restart the model).

For example, /home/projects/incite11/ensda_v451_archive_orig/ensda_451_1904/1906020100.tar includes:

bfg_1906020100_fhr0[0,3,6,9]_mem0[01..80] + ensmean (spectral model file)

sfg_1906020100_fhr0[0,3,6,9]_mem0[01..80] + ensmean (spectral model file)

sanl_1906020100_fhr0[3,6]_mem0[01..80] + ensmean (spectral model file converted to grib)

sfcanl_1906020100_fhr0[3,6]_mem0[01..80]+ ensmean  (spectral model file)

pgrbfg* everymember and ensemble statistics (all in grib1)

pgrbanl for 1906020100 and 1906020103

psob* files

sflxgrb_1906020100_fhr0[0,3,6,9]_mem0[01..80]

sflxgrbensmeanfg_1906020100_fhr03

sflxgrbensmeanfg_1906020100_fhr06

sflxgrbenssprdfg_1906020100_fhr03

sflxgrbenssprdfg_1906020100_fhr06

 
