The GSI assimilation system

The Gridpoint Statistical Interpolation (GSI) system is a state-of-the-art data assimilation system initially developed by the Environmental Modeling Center at NCEP. It was designed as a traditional 3DVAR system applied in the gridpoint space of models to facilitate the implementation of inhomogeneous anisotropic covariances (Wu, Purser, and Parrish 2002; Purser et al. 2003b, 2003a). It is designed to run on various computational platforms, create analyses for different numerical forecast models, and remain flexible enough to handle future scientific developments, such as the use of new observation types, improved data selection, and new state variables (Kleist et al. 2009).

In 2006 the 3DVAR system replaced the NCEP regional grid-point operational analysis system used by the North American Mesoscale Prediction System (NAM), and in 2007 it replaced the Spectral Statistical Interpolation (SSI) global analysis system used to generate Global Forecast System (GFS) initial conditions (Kleist et al. 2009). In recent years, GSI has evolved to include various data assimilation techniques for multiple operational applications, including 2DVAR [e.g., the Real-Time Mesoscale Analysis (RTMA) system; Pondeca et al. (2011)], the hybrid EnVar technique (e.g., the data assimilation systems for the GFS, the Rapid Refresh system (RAP), the NAM, and the HWRF), and 4DVAR [e.g., the data assimilation system for NASA’s Goddard Earth Observing System, version 5 (GEOS-5); Zhu and Gelaro (2008)]. GSI also includes a hybrid 4D-EnVar approach that is currently used for the GFS.

In addition to the development of hybrid techniques, GSI allows the use of ensemble assimilation methods. To achieve this, it uses the same observation operator as the variational methods to compare the preliminary field, or background, with the observations. In this way the thorough quality control developed for variational methods is also applied in ensemble assimilation methods. The EnKF code was developed by the Earth System Research Lab (ESRL) of the National Oceanic and Atmospheric Administration (NOAA) in collaboration with the scientific community. It contains two different algorithms for calculating the analysis increment: the serial Ensemble Square Root Filter (EnSRF; Whitaker and Hamill 2002) and the Local Ensemble Transform Kalman Filter (LETKF; Hunt, Kostelich, and Szunyogh 2007), contributed by Yoichiro Ota of the Japan Meteorological Agency (JMA).
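
As an illustration of how a serial filter turns an innovation into an analysis increment, here is a minimal sketch of the Whitaker and Hamill (2002) square root update for a single scalar observation. It is not the GSI code: the function name `ensrf_single_ob`, the linear observation operator `h`, and the use of plain Python lists are assumptions made to keep the example self-contained, and localization is left out.

```python
from math import sqrt


def ensrf_single_ob(ens, h, y, r):
    """Update an ensemble with one scalar observation (serial EnSRF sketch).

    ens : list of members, each a list of state values (background ensemble)
    h   : list of weights of a linear observation operator (an assumption)
    y   : observed value
    r   : observation error variance
    """
    n, m = len(ens), len(h)
    mean = [sum(member[j] for member in ens) / n for j in range(m)]
    pert = [[member[j] - mean[j] for j in range(m)] for member in ens]

    # Background mean and perturbations in observation space.
    hx_mean = sum(h[j] * mean[j] for j in range(m))
    hx_pert = [sum(h[j] * p[j] for j in range(m)) for p in pert]

    # Sample estimates of H Pb H^T (scalar) and Pb H^T (one value per state element).
    hph = sum(v * v for v in hx_pert) / (n - 1)
    pht = [sum(pert[k][j] * hx_pert[k] for k in range(n)) / (n - 1) for j in range(m)]

    gain = [c / (hph + r) for c in pht]          # Kalman gain for the mean
    alpha = 1.0 / (1.0 + sqrt(r / (hph + r)))    # reduced-gain factor for perturbations
    innovation = y - hx_mean

    mean_a = [mean[j] + gain[j] * innovation for j in range(m)]
    return [
        [mean_a[j] + pert[k][j] - alpha * gain[j] * hx_pert[k] for j in range(m)]
        for k in range(n)
    ]
```

In a serial filter the function would be called once per observation, with the ensemble returned by one call used as the input of the next. This relies on the observation errors being uncorrelated.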

To reduce the impact of spurious covariances on the analysis increment, ensemble systems apply localization to the observation error covariance matrix \(R\) in both the horizontal and vertical directions. GSI uses a fifth-order polynomial to reduce the impact of each observation gradually with distance until it reaches zero at a limiting distance. The vertical localization scale is defined in terms of the logarithm of pressure, and the horizontal scale is usually defined in kilometers. These parameters are important for obtaining a good analysis and depend on factors such as the ensemble size and the model resolution.
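
A fifth-order taper widely used for this purpose is the piecewise polynomial commonly known as the Gaspari-Cohn function. The sketch below assumes that this is the polynomial in question and that `cutoff` is the distance at which the weight reaches zero. The function name and arguments are illustrative and are not taken from the GSI code.

```python
def gaspari_cohn(distance, cutoff):
    """Fifth-order piecewise polynomial taper (Gaspari-Cohn form).

    Returns 1 at zero distance and exactly 0 at and beyond `cutoff`.
    `distance` and `cutoff` must share the same units (km, or log-pressure).
    """
    r = 2.0 * abs(distance) / cutoff  # the polynomial is defined on 0 <= r <= 2
    if r >= 2.0:
        return 0.0
    if r <= 1.0:
        return (((-0.25 * r + 0.5) * r + 0.625) * r - 5.0 / 3.0) * r * r + 1.0
    return (((((r / 12.0 - 0.5) * r + 0.625) * r + 5.0 / 3.0) * r - 5.0) * r
            + 4.0 - 2.0 / (3.0 * r))


# Weights given to observations at increasing distance with a 400 km cutoff.
weights = [gaspari_cohn(d, 400.0) for d in (0.0, 100.0, 200.0, 400.0)]
```

The weight decreases smoothly from 1 at the observation location to 0 at the cutoff, so an observation has no influence on grid points farther away than `cutoff`.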

GSI uses the Community Radiative Transfer Model (CRTM, Liu et al. 2008) as the observation operator for radiance observations: it calculates the brightness temperature simulated by the model so that it can be compared with satellite observations. GSI also implements a bias correction algorithm for satellite radiance observations. The preliminary field estimated with CRTM is compared with the radiance observations to obtain the innovation. This innovation is then used to estimate a bias, which is in turn applied to obtain an updated innovation. This process can be repeated several times until the innovation and the bias correction coefficients converge.
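
The iteration described above can be sketched with a toy example. It assumes the simplest possible bias model, a single constant offset for one channel, and a relaxation factor `relax` that damps each update. Both are illustrative choices, as are the function and variable names. GSI's actual scheme uses several predictors per channel and is not reproduced here.

```python
def estimate_constant_bias(obs, simulated, relax=0.5, tol=1e-6, max_iter=50):
    """Iteratively estimate a constant bias from observed-minus-simulated values.

    obs       : observed brightness temperatures
    simulated : brightness temperatures simulated from the background
    relax     : fraction of the mean innovation added to the bias per iteration
    """
    beta = 0.0  # bias correction coefficient (constant offset)
    for _ in range(max_iter):
        # Innovation after applying the current bias estimate.
        innovations = [o - s - beta for o, s in zip(obs, simulated)]
        update = relax * sum(innovations) / len(innovations)
        beta += update
        if abs(update) < tol:  # coefficient has converged
            break
    # Final bias-corrected innovations.
    innovations = [o - s - beta for o, s in zip(obs, simulated)]
    return beta, innovations


beta, corrected = estimate_constant_bias(
    obs=[251.0, 262.0, 273.0], simulated=[250.0, 260.0, 270.0]
)
```

With each pass the mean of the corrected innovations shrinks toward zero while `beta` approaches the mean of the raw innovations, which is the convergence behaviour the text refers to.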

Available observations for assimilation

Here is the list of observations that GSI can assimilate. In bold are the observations that I have experience with and/or that I’ve adapted the code for.

Conventional observations:

  • Radiosondes
  • Pilot balloon (PIBAL) winds
  • Synthetic tropical cyclone winds
  • Wind profilers: USA, Japan Meteorological Agency (JMA)
  • Conventional aircraft reports
  • Aircraft to Satellite Data Relay (ASDAR) aircraft reports
  • Meteorological Data Collection and Reporting System (MDCRS) aircraft reports
  • Dropsondes
  • Moderate Resolution Imaging Spectroradiometer (MODIS) IR and water vapor winds
  • Geostationary Meteorological Satellite (GMS), JMA, and Meteosat cloud drift IR and visible winds
  • European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) and GOES water vapor cloud top winds
  • GOES hourly IR and cloud top wind
  • Surface land observations
  • Surface ship and buoy observations
  • Special Sensor Microwave Imager (SSMI) wind speeds
  • Quick Scatterometer (QuikSCAT), the Advanced Scatterometer (ASCAT) and Oceansat-2 Scatterometer (OSCAT) wind speed and direction
  • RapidScat observations
  • SSM/I and Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI) precipitation estimates
  • Velocity-Azimuth Display (VAD) Next Generation Weather Radar (NEXRAD) winds
  • Global Positioning System (GPS) precipitable water estimates
  • Sea surface temperatures (SSTs)
  • Doppler wind Lidar
  • Aviation routine weather report (METAR) cloud coverage
  • Flight level and Stepped Frequency Microwave Radiometer (SFMR) High Density Observation (HDOB) from reconnaissance aircraft
  • Tall tower wind

Satellite radiance/brightness temperature observations

  • SBUV: NOAA-17, NOAA-18, NOAA-19
  • High Resolution Infrared Radiation Sounder (HIRS): Meteorological Operational-A (MetOp-A), MetOp-B, NOAA-17, NOAA-19
  • GOES imager: GOES-11, GOES-12
  • Atmospheric IR Sounder (AIRS): Aqua
  • AMSU-A: MetOp-A, MetOp-B, NOAA-15, NOAA-18, NOAA-19, Aqua
  • AMSU-B: MetOp-B, NOAA-17
  • Microwave Humidity Sounder (MHS): MetOp-A, MetOp-B, NOAA-18, NOAA-19
  • SSMI: DMSP F14, F15, F19
  • SSMI/S: DMSP F16
  • Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E): Aqua
  • GOES Sounder (SNDR): GOES-11, GOES-12, GOES-13
  • Infrared Atmospheric Sounding Interferometer (IASI): MetOp-A, MetOp-B
  • Global Ozone Monitoring Experiment (GOME): MetOp-A, MetOp-B
  • Ozone Monitoring Instrument (OMI): Aura
  • Spinning Enhanced Visible and Infrared Imager (SEVIRI): Meteosat-8, Meteosat-9, Meteosat-10
  • Advanced Technology Microwave Sounder (ATMS): Suomi NPP
  • Cross-track Infrared Sounder (CrIS): Suomi NPP
  • GCOM-W1 AMSR2
  • GPM GMI
  • Megha-Tropiques SAPHIR
  • Himawari AHI
  • GOES ABI

Other observations

  • GPS Radio occultation (RO) refractivity and bending angle profiles
  • Solar Backscatter Ultraviolet (SBUV) ozone profiles, Microwave Limb Sounder (MLS) (including NRT) ozone, and Ozone Monitoring Instrument (OMI) total ozone
  • Doppler radar radial velocities
  • Radar reflectivity mosaic
  • Tail Doppler Radar (TDR) radial velocity and super-observation
  • Tropical Cyclone Vitals Database (TCVital)
  • Particulate matter (PM) of 10-um diameter, 2.5-um diameter or less
  • MODIS AOD (when using GSI-chem package)
  • Significant wave height observations from JASON-2, JASON-3, SARAL/ALTIKA and CRYOSAT-2

Running GSI

Every assimilation cycle starts with two inputs to the GSI system: the background, a forecast generated with a numerical model (WRF-ARW in this guide) that was initialized from the previous analysis, and the observations (in bufr format). GSI also needs “fixed” files with information about the observations. These “fix” files define which observations are going to be assimilated, their errors, and the quality control options.

Diagram that shows a cycle. First, the background (forecasts generated using a numerical model, WRF-ARW, initialized from the previous analysis) and the observations (from bufr files) enter the GSI system. Inside the system, the Observation Operator takes care of the quality control and bias correction, then calculates the innovation and generates the diag files. Then the EnKF part calculates the update, applying the innovation to the background to generate the analysis. The analysis is used to create a new background to start the cycle again.

Diagram of an assimilation cycle

GSI can also be used with the following background files:

  • WRF-NMM input fields in binary format
  • WRF-NMM input fields in NetCDF format
  • WRF-ARW input fields in binary format
  • WRF-ARW input fields in NetCDF format
  • GFS input fields in binary format or through NEMS I/O
  • NEMS-NMMB input fields
  • RTMA input files (2-dimensional binary format)
  • WRF-Chem GOCART input fields with NetCDF format
  • CMAQ binary file

The official tutorials on the DTCenter webpage are a good starting point for learning how to use these options.

GSI can also be run without observations to test the code, using a single synthetic observation defined in the SINGLEOB_TEST section of the gsi namelist. This is another good thing to try at the beginning.

The fixed files are located in the fix/ folder and include statistics files, configuration files, bias correction files, and CRTM coefficient files (see footnote 1). The information in the configuration files is saved in the output files after running GSI.

GSI Name | Content | File names
anavinfo | Information file to set control and analysis variables | anavinfo_arw_netcdf
berror_stats | Background error covariance (for variational methods) | nam_nmmstat_na.gcv, nam_glb_berror.f77.gcv
errtable | Observation error table | prepobs_errtable.global
convinfo | Conventional observation information file | global_convinfo.txt
satinfo | Satellite channel information file | global_satinfo.txt
pcpinfo | Precipitation rate observation information file | global_pcpinfo.txt
ozinfo | Ozone observation information file | global_ozinfo.txt
satbias_angle | Satellite scan angle dependent bias correction file | global_satangbias.txt
satbias_in | Satellite mass bias correction coefficient file | sample.satbias
satbias_in | Combined satellite angle dependent and mass bias correction coefficient file | gdas1.t00z.abias.new
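
A run script typically has to make each fixed file available under its GSI name. The sketch below assumes that GSI looks for these files by their GSI name in the run directory, as the first column of the table suggests. The function name, the directory arguments, and the choice of one file per GSI name are illustrative assumptions.

```python
from pathlib import Path

# GSI name -> file in fix/, taken from the table above. berror_stats and
# satbias_in have more than one candidate file; choose the one for your setup.
FIX_FILES = {
    "anavinfo": "anavinfo_arw_netcdf",
    "berror_stats": "nam_nmmstat_na.gcv",
    "errtable": "prepobs_errtable.global",
    "convinfo": "global_convinfo.txt",
    "satinfo": "global_satinfo.txt",
    "pcpinfo": "global_pcpinfo.txt",
    "ozinfo": "global_ozinfo.txt",
    "satbias_angle": "global_satangbias.txt",
}


def link_fix_files(fix_dir, work_dir, mapping=FIX_FILES):
    """Symlink each fixed file into work_dir under its GSI name.

    Returns the list of file names that were not found in fix_dir.
    """
    fix_dir, work_dir = Path(fix_dir), Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    missing = []
    for gsi_name, file_name in mapping.items():
        source = fix_dir / file_name
        if not source.exists():
            missing.append(file_name)
            continue
        target = work_dir / gsi_name
        if target.is_symlink() or target.exists():
            target.unlink()  # replace a link left over from a previous cycle
        target.symlink_to(source.resolve())
    return missing
```

Checking the returned list before launching GSI gives an early warning about files that still need to be downloaded.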

About the GSI code

GSI is written in Fortran and the code is split into more than five hundred files. Although GSI has two good user guides, not everything is documented and sometimes you will need to read the code.

To find my way around the code I found the grep -r "key word" command very useful. Each file, and each subroutine inside it, has a header describing what it does, the changes made to it, and its input and output arguments. It’s worth mentioning a few key files:

  • gsimain.f90 and gsimod.f90 are the main files that control the system. gsimain.f90 has a list of error codes and their possible causes.
  • *info.f90 files, such as convinfo.f90 and radinfo.f90, contain the routines that read the configuration files. Note that the satinfo file is read by the radinfo.f90 subroutine.
  • read_*.f90 are a family of routines that read every type of bufr file that GSI is prepared to work with. For example, there is read_prepbufr.f90, and read_goesimg.f90, which reads GOES observations including ABI.
  • Another important family of files is setup*.f90. These files process each variable to be assimilated. For example, setupt.f90 will:
    • read the temperature obs assigned to a given mpi task (geographic region),
    • simulate obs from the guess,
    • apply some quality control to the obs,
    • load the weight and innovation arrays used in the minimization,
    • collect statistics for runtime diagnostic output,
    • write additional diagnostic information to the output file.
  • qcmod.f90 includes important routines for the quality control of radiance observations.
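
With this many files, it can help to list the members of each family mentioned above before reading any of them. The following is a small sketch. The family labels, the function name, and the assumption that all sources end in `.f90` under one source directory are illustrative.

```python
from fnmatch import fnmatch
from pathlib import Path

# File-name patterns for the families described in the list above.
FAMILIES = {
    "configuration readers": "*info.f90",
    "observation readers": "read_*.f90",
    "setup routines": "setup*.f90",
}


def group_source_files(src_dir):
    """Return {family label: sorted file names} for the .f90 files under src_dir."""
    groups = {label: [] for label in FAMILIES}
    for path in sorted(Path(src_dir).rglob("*.f90")):
        for label, pattern in FAMILIES.items():
            if fnmatch(path.name, pattern):
                groups[label].append(path.name)
    return groups
```

Calling `group_source_files` on the directory that holds the GSI sources returns, for instance, every `setup*.f90` file in one list, which makes it easier to find the routine that handles a given variable.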

The main files associated with the EnKF code are:

  • enkf.f90 and enkf_main.f90
  • read*.f90 are the family of files that read the diag files generated by the observation operator.
  • radbias.f90 manages the bias correction for radiance observations.

References

Hunt, Brian R., Eric J. Kostelich, and Istvan Szunyogh. 2007. “Efficient Data Assimilation for Spatiotemporal Chaos: A Local Ensemble Transform Kalman Filter.” Physica D: Nonlinear Phenomena 230 (1-2): 112–26. https://doi.org/10.1016/j.physd.2006.11.008.
Kleist, Daryl T., David F. Parrish, John C. Derber, Russ Treadon, Wan-Shu Wu, and Stephen Lord. 2009. “Introduction of the GSI into the NCEP Global Data Assimilation System.” Weather and Forecasting 24 (6): 1691–1705. https://doi.org/10.1175/2009WAF2222201.1.
Liu, Quanhua, Fuzhong Weng, Yong Han, and Paul van Delst. 2008. “Community Radiative Transfer Model for Scattering Transfer and Applications.” In IGARSS 2008 - 2008 IEEE International Geoscience and Remote Sensing Symposium, 4:IV - 1193-IV - 1196. https://doi.org/10.1109/IGARSS.2008.4779942.
Pondeca, Manuel S. F. V. De, Geoffrey S. Manikin, Geoff DiMego, Stanley G. Benjamin, David F. Parrish, R. James Purser, Wan-Shu Wu, et al. 2011. “The Real-Time Mesoscale Analysis at NOAA’s National Centers for Environmental Prediction: Current Status and Development.” Weather and Forecasting 26 (5): 593–612. https://doi.org/10.1175/WAF-D-10-05037.1.
Purser, R. James, Wan-Shu Wu, David F. Parrish, and Nigel M. Roberts. 2003a. “Numerical Aspects of the Application of Recursive Filters to Variational Statistical Analysis. Part I: Spatially Homogeneous and Isotropic Gaussian Covariances.” Monthly Weather Review 131 (8): 1524–35. https://doi.org/10.1175//1520-0493(2003)131<1524:NAOTAO>2.0.CO;2.
———. 2003b. “Numerical Aspects of the Application of Recursive Filters to Variational Statistical Analysis. Part II: Spatially Inhomogeneous and Anisotropic General Covariances.” Monthly Weather Review 131 (8): 1536–48. https://doi.org/10.1175//2543.1.
Whitaker, Jeffrey S., and Thomas M. Hamill. 2002. “Ensemble Data Assimilation Without Perturbed Observations.” Monthly Weather Review 130 (7): 1913–24. https://doi.org/10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2.
Wu, Wan-Shu, R. James Purser, and David F. Parrish. 2002. “Three-Dimensional Variational Analysis with Spatially Inhomogeneous Covariances.” Monthly Weather Review 130 (12): 2905–16. https://doi.org/10.1175/1520-0493(2002)130<2905:TDVAWS>2.0.CO;2.
Zhu, Yanqiu, and Ronald Gelaro. 2008. “Observation Sensitivity Calculations Using the Adjoint of the Gridpoint Statistical Interpolation (GSI) Analysis System.” Monthly Weather Review 136 (1): 335–51. https://doi.org/10.1175/MWR3525.1.

Footnotes

  1. These files need to be downloaded separately as they are too big to be part of the GSI repository. The coefficient files may also be updated with better approximations over time.