Frequently Asked Questions - Data FAQ
IMERG provides a data field that estimates the probability that the retrieved precipitation amount is “liquid”, which is defined to include “mixed” (liquid and solid) precipitation. In retrospect the field name should have been “ice”, but “liquid” had already been set. The rationale is that mixed precipitation is very rare and transient, so it should be lumped with “liquid” or “ice”. Furthermore, the primary effects of “ice” are 1) to prevent the falling precipitation from immediately entering the hydrological system (until it melts), and 2) to create (potentially) dangerous travel conditions. “Mixed” typically ends up not creating either of these effects, so lumping it with “liquid” seems appropriate.
Even given this basic definition, there are numerous forms of precipitation, and it might not be obvious how they end up being classified in IMERG. The key fact is that the phase is computed diagnostically at present, based on work by Guosheng Liu (Florida State University) and students. The Liu scheme uses data from a numerical model or model analysis to compute a “specification”, without reference to the satellite data, including whether or not IMERG estimates that precipitation is occurring, or even possible to estimate. Thus, probabilityLiquidPrecipitation (pLP) is a globally complete field whenever the relevant model data exist. An additional factor is that there is a conceptual difference between how the half-hourly phase is computed and how phase is defined in this probability framework for the monthly data. We will handle the half-hourly first, for which the Liu specification equation is directly relevant.
Liu determined that the primary factor for phase is the surface wet bulb temperature (Tw), a combination of temperature and humidity, with small contributions from the low-altitude Tw lapse rate and the surface pressure, and with systematic differences between ocean and land areas. In practice, the fitted probability as a function of Tw is converted to separate look-up tables for ocean and land.
Typical results for different forms of precipitation are:
- Rain: Ordinary falling liquid typically happens for Tw>0°C, so pLP is high.
- Freezing Rain: Liquid that freezes upon contact with the Earth's surface typically falls in Tw<0°C, so pLP is low.
- Snow, ice pellets, snow pellets: These frozen hydrometeors occur around or below Tw=0°C, so pLP varies from around 50% to very low.
- Sleet: Frozen droplets (U.S. definition) typically fall in Tw<0°C, so pLP is usually below 50%.
- Mixed snow and rain; falling slush: The mixed category is likely to occur around the pLP=50% mark. If one uses 50% as a liquid/solid threshold, that implies that mixed cases will end up in both categories, depending on the details.
- Hail: Hail typically occurs when the surface air temperature is well above freezing (i.e., on summer afternoons). Thus, pLP is very high. But, hail is even rarer than mixed and unlikely to be correctly specified in this scheme, and anyway, in such conditions it rapidly melts and so is properly “mixed”.
- Dew and frost: These phenomena are not forms of precipitation. They are liquid or solid water that condenses directly at the Earth's surface. For this reason, any amount of surface accumulation due to dew or frost is not included in the IMERG precip estimate.
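The 50% threshold mentioned in the list above can be sketched in a few lines of Python. This is only an illustration: the function name, the choice to count exactly 50% as liquid, and the "missing" label are assumptions made here, not part of IMERG.

```python
def classify_phase(plp_percent, threshold=50.0):
    """Label a value 'liquid' (which includes mixed) or 'ice' from pLP in percent.

    Assumption: a value exactly at the threshold is counted as liquid.
    """
    if plp_percent is None:
        return "missing"  # no model data, so no phase specification
    return "liquid" if plp_percent >= threshold else "ice"


if __name__ == "__main__":
    for plp in (95.0, 50.0, 12.0, None):
        print(plp, classify_phase(plp))
```

Because pLP is computed whether or not precipitation is estimated, a label like this is only meaningful where the precipitation field itself is nonzero.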
As the time interval for the data values lengthens, it becomes increasingly likely that both liquid and solid might have fallen, at which point the meaning of pLP should change to “what fraction of the estimated precipitation amount fell as liquid or mixed?” This is the definition of pLP for both the monthly IMERG Final Run pLP and the set of GIS IMERG files (TIFF+WorldFile) providing estimated accumulations longer than three hours.
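One plausible reading of "what fraction of the estimated precipitation amount fell as liquid or mixed" is an accumulation-weighted average of the short-interval pLP values. The sketch below shows that arithmetic under that assumption. It is not the IMERG production code, and the function name is hypothetical.

```python
def liquid_fraction(precip_mm, plp_percent):
    """Accumulation-weighted percentage of precipitation that fell as liquid or mixed.

    precip_mm   : accumulations for each short interval (mm)
    plp_percent : pLP for the same intervals (percent)
    Returns None when no precipitation fell, since the fraction is undefined.
    """
    total = sum(precip_mm)
    if total <= 0:
        return None
    liquid = sum(p * q / 100.0 for p, q in zip(precip_mm, plp_percent))
    return 100.0 * liquid / total


if __name__ == "__main__":
    # 1 mm fell as rain (pLP 100%), 3 mm fell as snow (pLP 0%)
    print(liquid_fraction([1.0, 3.0], [100.0, 0.0]))
```

Weighting by accumulation means that intervals with no precipitation do not influence the result, whatever their pLP.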
Variable and Description
30-min averaged data
Merged microwave-only precipitation estimate [Final]
Random error for gauge-calibrated multi-satellite precipitation [Final, Early, Late]
This is an estimate of the non-systematic component of the error. The exact variable name depends on the product, but all begin with "Random error..."
Microwave satellite observation time [Final]
Observation time of the microwave precipitation estimates given as minutes from the beginning of the current half-hour.
Microwave satellite source identifier [Final]
This is an integer between 0 and 24 that corresponds to the instrument from which the microwave precipitation estimate was taken.
Weighting of IR-only precipitation relative to the morphed merged microwave-only precipitation estimate [Final]
This is the weighting of the infrared data in the final merged estimate, given in percent. Zero means either no IR weighting or no precipitation.
IR-only precipitation estimate [Final]
This is the microwave-calibrated infrared precipitation estimate.
Multi-satellite precipitation estimate with climatological gauge calibration [Final, Early, Late]
This is the precipitation estimate that has been calibrated with gauge data. This variable is recommended for most users. Note: Climatological gauge calibration is used in Early and Late.
Multi-satellite precipitation estimate [Final]
This is the precipitation estimate that has not been calibrated with gauge data.
Accumulation-weighted probability of liquid precipitation phase [Final]
This is the probability of liquid precipitation. The probabilities are calculated globally regardless of whether precipitation is actually present.
1 month averaged data
Weighting of observed gauge precipitation relative to the multi-satellite precipitation estimate
This is the percent weighting of the surface gauge data.
Merged satellite-gauge precipitation estimate
This is the precipitation estimate that has been calibrated with gauge data. This variable is recommended for most users.
Accumulation-weighted probability of liquid precipitation phase
This is the probability of liquid precipitation. The probabilities are calculated globally regardless of whether precipitation is actually present.
Random error for merged satellite-gauge precipitation
This is an estimate of the non-systematic component of the error.
The main difference between the IMERG Early and Late Run is that Early only has forward propagation (which basically amounts to extrapolation), while the Late has both forward and backward propagation (allowing interpolation). As well, the additional 10 hours of latency allows lagging data transmissions to make it into the Late run, even if they were not available for the Early (see below).
There are two possible factors which contribute to differences in the IMERG Late Run and Final Run datasets:
- The Late Run uses a climatological adjustment that incorporates gauge data. In Version 4 and later (scheduled to be available in November - December 2016), this will be a climatological adjustment to the Final run, which includes gauge data at the monthly scale. For Version 3 (which is the currently available data) the TRMM V7 climatological adjustment of the TMPA-RT to the production TMPA is used (which includes gauge at the monthly scale) because this at-launch algorithm didn't yet have any Late and Final data from which to build the climatological adjustment. The Final run uses a month-to-month adjustment to the monthly Final Run product, which combines the multi-satellite data for the month with GPCC gauge. Its influence in each half hour is a ratio multiplier that's fixed for the month, but spatially varying.
- The Late Run is computed about 15 hours after observation time, so sometimes a microwave overpass is not delivered in time for the Late Run, but subsequently comes in and can be used in the Final. This would affect both the half hour in which the overpass occurs, and (potentially) morphed values in nearby half hours.
The difference over the oceans has to be the first, while the difference over many land areas could be either. The satellite sensor difference could be examined by comparing the satellite sensor data field in the Late and Final Run datasets for each half hour. Since the gauge adjustment is a constant multiplier, a time series should show a constant ratio between the Late and the Final Runs for the entire month (except for cases where the satellite sensor is changing, just as for the ocean).
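The constant-ratio check described above can be sketched as follows for a single grid box. The function name, the 5% tolerance, and the minimum-rate cutoff are illustrative assumptions. Reading the Late and Final values from the files is left out.

```python
from statistics import median


def ratio_outliers(late, final, rel_tol=0.05, min_rate=0.1):
    """Find half hours where Final/Late departs from the month's typical ratio.

    late, final : half-hourly rates (mm/hr) for one grid box over one month
    rel_tol     : allowed relative departure from the median ratio (assumed)
    min_rate    : skip very light rates, where ratios are unstable (assumed)
    Returns (typical_ratio, indices_of_flagged_half_hours).
    """
    ratios = {}
    for i, (l, f) in enumerate(zip(late, final)):
        if l >= min_rate and f >= min_rate:
            ratios[i] = f / l
    if not ratios:
        return None, []
    typical = median(ratios.values())
    flagged = [i for i, r in ratios.items() if abs(r - typical) > rel_tol * typical]
    return typical, flagged


if __name__ == "__main__":
    late = [1.0, 2.0, 4.0, 1.0]
    final = [1.2, 2.4, 4.8, 3.0]
    print(ratio_outliers(late, final))
```

Half hours that are flagged are candidates for a sensor difference, which can then be confirmed by comparing the satellite sensor data field in the two Runs.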
We always advise people to use the Final Run for research. Realistically, though, with such a short record, the extra months of Late Run might outweigh the risk of using less-accurate data. The vast majority of grid boxes have fairly similar Late and Final values, so it makes sense to stick to metrics that are more resistant to occasional data disturbances than others. Extreme values are more sensitive to these details; medians, means, and root-mean-square difference are less sensitive.
Compared to previous versions, Version 05B IMERG introduces additional coverage at the high latitudes for the complete precipitation fields in all Runs -- Early, Late, and Final. IMERG continues to use a merged geosynchronous infrared brightness temperature analysis to both support computing the motion vectors in morphing and provide IR-based precipitation estimates. The requisite analysis (provided by NOAA/NWS/Climate Prediction Center) covers the latitude band 60°N-S, so a "full" IMERG analysis is possible there. At higher latitudes (in both hemispheres) both morphing and IR-based estimates are not included, so the coverage in the complete precipitation fields is "partial" -- limited to times when overpasses occur for microwave sensors and with no snow/ice on the surface. Some of the other data fields, specifically the merged microwave estimates and the precipitation phase, were already provided for the entire globe.
Q1: How closely should the monthly satellite-gauge combined precipitation datasets follow the gauge analysis?
A1: The combined precipitation research team at Goddard has major responsibility for the Global Precipitation Climatology Project monthly Satellite-Gauge combined product, the Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) 3B43 monthly product, and the Integrated Multi-satellitE Retrievals for Global Precipitation Measurement (GPM) mission (IMERG) Final Run monthly product. In each case the multi-satellite data are averaged to the monthly scale and combined with the Global Precipitation Climatology Centre's (GPCC) monthly surface precipitation gauge analysis (see https://www.dwd.de/EN/ourservices/gpcc/gpcc.html). In each case the multi-satellite data are adjusted to the large-area mean of the gauge analysis, where available (mostly over land), and then combined with the gauge analysis using a simple inverse estimated-random-error variance weighting. In all three data sets the gauge analysis has an important or dominant role in determining the final combined value for grid boxes in areas with "good" gauge coverage. Regions with poor gauge coverage, such as central Africa, have a higher weight on the satellite input. The oceans are mostly devoid of gauges and therefore mostly lack such gauge input.
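The inverse error-variance weighting mentioned in A1 can be illustrated with a short sketch. It shows only the generic weighting formula for one grid box and assumes the satellite value has already been adjusted to the large-area gauge mean. The function name and the numbers are hypothetical, and the production algorithms contain details not shown here.

```python
def combine_inverse_variance(sat, sat_err, gauge, gauge_err):
    """Combine satellite and gauge estimates with inverse error-variance weights.

    sat, gauge         : monthly precipitation estimates for one grid box
    sat_err, gauge_err : estimated random errors (same units as the estimates)
    Returns (combined_value, gauge_weight_percent).
    If no gauge analysis is available (gauge is None), the satellite value is used.
    """
    if gauge is None:
        return sat, 0.0
    w_sat = 1.0 / sat_err ** 2
    w_gauge = 1.0 / gauge_err ** 2
    combined = (w_sat * sat + w_gauge * gauge) / (w_sat + w_gauge)
    gauge_weight_percent = 100.0 * w_gauge / (w_sat + w_gauge)
    return combined, gauge_weight_percent


if __name__ == "__main__":
    # Well-gauged box: the gauge error is half the satellite error
    print(combine_inverse_variance(100.0, 2.0, 80.0, 1.0))
    # Ocean box with no gauge input
    print(combine_inverse_variance(100.0, 2.0, None, None))
```

With a small gauge error the combined value sits close to the gauge analysis, which is the behavior described for areas with "good" gauge coverage.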
Q2: How closely related are the short-interval multi-satellite precipitation datasets to the monthly satellite-gauge combined precipitation datasets?
A2: The short-interval GPCP is the One-Degree Daily (1DD), the short-interval TMPA is 3B42 (which is 3-hourly), and the short-interval IMERG is the half-hourly. In each case the short-interval data are adjusted with a simple, spatially varying ratio to force the multi-satellite estimates to approximately average up to the corresponding monthly product, although with controls on the ratios to prevent unphysical results. Thus, monthly-average values for the short-interval data should be close to the mean values for the monthly datasets, which the developers consider more reliable than the short-interval datasets. In fact, compared to datasets that lack the adjustment to the monthly satellite-gauge estimates, the 1DD, 3B42, and IMERG Final half-hourly datasets tend to score better at timescales longer than a few days. This is presumably because the random error begins to cancel out as more samples are averaged together, while the bias error remains.
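The ratio adjustment in A2 can be sketched for a single grid box as follows. The ratio limits used here are placeholders chosen for illustration, not the limits used in GPCP, TMPA, or IMERG, and the function name is hypothetical.

```python
def scale_to_monthly(short_rates, monthly_target, min_ratio=0.2, max_ratio=3.0):
    """Scale short-interval rates so their mean approaches a monthly target.

    short_rates    : multi-satellite rates for one grid box over the month
    monthly_target : monthly satellite-gauge value in the same units
    min_ratio, max_ratio : assumed controls that keep the ratio physical
    Returns (scaled_rates, ratio_used).
    """
    mean_rate = sum(short_rates) / len(short_rates)
    if mean_rate <= 0:
        # Nothing to scale: a ratio cannot create precipitation from zeros
        return list(short_rates), 1.0
    ratio = monthly_target / mean_rate
    ratio = max(min_ratio, min(max_ratio, ratio))
    return [r * ratio for r in short_rates], ratio


if __name__ == "__main__":
    print(scale_to_monthly([0.0, 1.0, 2.0], 2.0))   # ratio within limits
    print(scale_to_monthly([1.0, 1.0], 10.0))       # ratio capped at max_ratio
```

When the ratio is capped, the scaled data only approximately average up to the monthly product, which is why the text says "approximately".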
Bolvin, D.T., R.F. Adler, G.J. Huffman, E.J. Nelkin, J.P. Poutiainen, 2009: Comparison of GPCP Monthly and Daily Precipitation Estimates with High-Latitude Gauge Observations. J. Appl. Meteor. Climatol., 48(9), 1843–1857.
GPM project data sets, including the Core Observatory and constellation partner sensor data sets, and national data sets, including multi-satellite data sets, have been released to the public and are available for download now (see the table of GPM data products). These initial releases are being computed for the GPM era (February 2014 to present) using pre-launch calibrations.
Subsequently, a general reprocessing will upgrade the algorithms to fully GPM-based calibrations. This is scheduled to occur in September 2015 for Core Observatory and partner data sets, and in January 2016 for the U.S. multi-satellite algorithm (Integrated Multi-Satellite Retrievals for GPM; IMERG). After about a year of additional development work, the data sets will be retrospectively processed back to the start of TRMM (January 1998).
The resolution of Level 0, 1, and 2 data is determined by the footprint size and observation interval of the sensors involved. Level 3 products are given a grid spacing that is driven by the typical footprint size of the input data sets. See the table of GPM & TRMM Data Downloads for details on the resolution of each specific product.
There are several sources for downloading and viewing data which allow you to subset the data by longitude and latitude. These include the Simple Subset Wizard, Giovanni, and STORM. In the new Giovanni 4 you can also now obtain data for a specific country, U.S. state, or watershed by using the "Show Shapes" option in the "Select Region" pane.
The transition from the Tropical Rainfall Measuring Mission (TRMM) data products to the Global Precipitation Measurement (GPM) mission products has begun. The TMPA products will be replaced by the Integrated Multi-satellitE Retrievals for GPM (IMERG) products. It is tentatively planned to continue computing the TMPA products throughout the transition, into Spring 2017. Click here for more details on this transition.
GPM data products can be divided into two groups (real-time and production) depending on how soon they are created after the satellite collects the observations. For applications such as weather, flood, and crop forecasting that need precipitation estimates as soon as possible, real-time data products are most appropriate. GPM real-time products are generally available within a few hours of observation. For all other applications, production data products are generally the best data sets to use because additional or improved inputs are used to increase accuracy. These other inputs are only made available several days, or in some cases, several months, after the satellite observations are taken, and the production data sets are computed after all data have arrived, making possible a more careful analysis.
The TRMM FTP has a Climatology directory which contains files in the TRMM Composite Climatology developed by Wang, Adler, Huffman, and Bolvin. A journal article on this topic is available here: http://journals.ametsoc.org/doi/abs/10.1175/JCLI-D-13-00331.1. Pre-generated world maps of TRMM climatology data are also available here.
The data set source should be acknowledged when the data are used. One standard format for a formal reference is:
Dataset authors/producers, data release date: Dataset title, version. Data archive/distributor, access date in standard AMS format, data locator/identifier (doi or URL).
G. Huffman, D. Bolvin, D. Braithwaite, K. Hsu, R. Joyce, P. Xie, 2014: Integrated Multi-satellitE Retrievals for GPM (IMERG), version 4.4. NASA's Precipitation Processing Center, accessed 31 March, 2015, ftp://arthurhou.pps.eosdis.nasa.gov/gpmdata/
For more details on citation format, please refer to the American Meteorological Society Data Archiving and Citation guidelines: http://www2.ametsoc.org/ams/index.cfm/publications/authors/journal-and-bams-authors/journal-and-bams-authors-guide/data-archiving-and-citation/
In the case of data sets that have not been given DOIs, the most persistent "landing page" should be named as the "data locator", for example,
As an “Acknowledgment”, one possible wording is:
"The <dataset name> data were provided by the NASA/Goddard Space Flight Center's <team's organization> and PPS, which develop and compute the <dataset name> as a contribution to <project (TRMM or GPM)>, and archived at the NASA GES DISC."
For any given TMPA data set, each data value provides a precipitation rate based on one (or perhaps two) satellite snapshots during the TMPA’s 3-hour analysis period. IMERG values are based on a single snapshot during its half-hour analysis period, or a morphed interpolation if no microwave values are available. The values are expressed in the intensive units mm/hr; it is usually best to assume that this rate applies for the entire 3- or half-hour period. If you wish to regrid to a finer time and/or space grid, note that many interpolation schemes have the property of suppressing maxima in precipitation and expanding rain events into neighboring zero-amount periods.
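As a small worked example of the assumption that each rate applies to its whole analysis period, the sketch below converts a series of rates in mm/hr to an accumulation in mm. The function name and the handling of missing values as None are assumptions made for illustration.

```python
def accumulation_mm(rates_mm_per_hr, interval_hours):
    """Total accumulation (mm), assuming each rate holds for its whole interval.

    rates_mm_per_hr : sequence of rates; None marks a missing value (skipped)
    interval_hours  : 0.5 for IMERG half-hourly data, 3.0 for TMPA 3-hourly data
    """
    return sum(r * interval_hours for r in rates_mm_per_hr if r is not None)


if __name__ == "__main__":
    # Three IMERG half hours at 2, 0, and 4 mm/hr
    print(accumulation_mm([2.0, 0.0, 4.0], 0.5))
    # Two TMPA 3-hour periods, one of them missing
    print(accumulation_mm([1.0, None], 3.0))
```

Skipping missing values gives a low-biased total, so the number of missing intervals is worth tracking alongside the sum.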
The GPM satellite constellation observes precipitation as it is falling, and maintains a database of precipitation records dating back to 1998. GPM is primarily focused on obtaining the highest quality precipitation measurements and studying fundamental atmospheric processes, and thus we do not focus on forecasting or predicting the weather. However, the near-real-time data collected by GPM is ingested into computer models by operational agencies such as the NWS and the ECMWF, who use it to improve their weather forecasts. Please visit the NWS and ECMWF websites for further information:
Although the TRMM satellite is no longer in service, the 3B42 series of algorithms will continue to be run using other satellites in the constellation to produce data products that are consistent with the long-term records. The current plan is to continue production into mid-2018 to give users time to transition to the newer IMERG multi-satellite data products. For more details about the status of 3B42 (and 3B42RT) and the transition to IMERG, please refer to this document: https://pmm.nasa.gov/sites/default/files/document_files/TMPA-to-IMERG_transition_170810.pdf
First locate the data product that meets your needs, then look to the “Format” column to find the appropriate link to download data in your desired format.
In general, GPM data products are named using the following format:
[algorithm level].[satellite].[instrument].[algorithm name].[year / month / date].[data start time hr/min/sec UTC].[data end time UTC].[sequence indicator showing orbit # (L2) or day/month (L3)].[algorithm version].[data format]
This is a Level 2A product, using the GPM satellite's GMI sensor, using the "GPROF 2008" algorithm, showing data from November 1st 2013 starting at 23:51:52 UTC and ending at 01:24:00 UTC, orbit number 352, using version 03C of the algorithm, in HDF5 format.
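A file name of this kind can be split into its fields with a few lines of Python. The sketch below rests on two assumptions that should be checked against the naming convention document: that the date, start time, and end time sit in one dot-separated field joined by hyphens and prefixed with "S" and "E", and that there are eight dot-separated fields in total. The sample name is constructed here from the description above purely for illustration and is not taken from an actual file listing.

```python
from datetime import datetime

# Assumed field order for a dot-separated GPM product file name
FIELDS = ("level", "satellite", "instrument", "algorithm",
          "time", "sequence", "version", "format")


def parse_gpm_name(name):
    """Split a GPM-style file name into a dictionary of named fields."""
    parts = name.split(".")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} dot-separated fields, got {len(parts)}")
    info = dict(zip(FIELDS, parts))
    # Assumed layout of the time field: YYYYMMDD-Shhmmss-Ehhmmss
    date, start, end = info.pop("time").split("-")
    info["start"] = datetime.strptime(date + start, "%Y%m%dS%H%M%S")
    # The end time may fall on the next day, so it is kept as an hhmmss string
    info["end_hhmmss"] = end[1:]
    return info


if __name__ == "__main__":
    # Hypothetical name built from the description in the text
    sample = "2A.GPM.GMI.GPROF2008.20131101-S235152-E012400.000352.V03C.HDF5"
    for key, value in parse_gpm_name(sample).items():
        print(key, value)
```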
For a more detailed explanation of GPM file naming conventions, please refer to the following document: File Naming Convention for Precipitation Products For the Global Precipitation Measurement (GPM) Mission
The GPROF retrieval uses all the GMI channels, but these channels are recorded by multiple feed horns on the instrument, which produce data with slightly different geolocations that are systematically offset from each other. Thus, only the region with overlapping data can support GPROF retrievals. The Core Observatory data are downlinked via the NASA Tracking and Data Relay Satellite System (TDRSS) communications satellite system to the NASA White Sands Test Facility in New Mexico, and networked to PPS at NASA/Goddard as 5-minute packets. So, for GPROF to create retrievals across the entire granule, the previous and following granules are required to give all the channels over a packet's entire area of coverage. And, to satisfy the need for "real time" production, the retrieval is run no more than 11 minutes after the last observation time. (The maximum delay was recently adjusted to accommodate changes in GPROF run times.)
Episodically, the Core Observatory is out of sight of the TDRSS satellites. Depending on orbital details, the gap can be 20 minutes, or as long as an hour. When this happens, the last packet before a gap will lack timely access to the following packet and the last several scans will not have all the necessary channel data. The result is several scans of "missing" retrievals at the end of the granule. There are, of course, other ways in which scans of "missing" retrievals can occur, but the issue described is the most common.
First, ensure you have registered your email address with the PPS using this webpage: http://registration.pps.eosdis.nasa.gov/registration/
Once registered, your email address will serve as both your username AND password for logging into the FTP site. Email addresses are converted to lower case when registering, so please enter your username and password in lowercase as well.
If you need access to the near-realtime (NRT) GPM files on ftp://jsimpson.pps.eosdis.nasa.gov, please be sure to check the box labelled "Near-Realtime Products". Otherwise you will be unable to log in to the NRT FTP server.
If you have already registered but would like to change your account details (such as adding access to NRT products) please visit this page and click "Verify Email or Update Info": http://registration.pps.eosdis.nasa.gov/registration/
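For users who prefer a script to a browser, the login rule above (the lower-cased email address serves as both username and password) can be sketched with Python's standard ftplib module. The helper names are hypothetical, the default host is the one that appears in the citation example on this page, and whether the server accepts a given connection depends on your registration, so treat this as an outline and not a tested client.

```python
from ftplib import FTP


def pps_credentials(email):
    """Return (username, password) for PPS: both are the lower-cased email."""
    user = email.strip().lower()
    return user, user


def list_pps_directory(email, host="arthurhou.pps.eosdis.nasa.gov", path="/"):
    """Log in and list one directory. Requires network access and registration."""
    user, password = pps_credentials(email)
    with FTP(host, timeout=30) as ftp:
        ftp.login(user=user, passwd=password)
        ftp.cwd(path)
        return ftp.nlst()


if __name__ == "__main__":
    # Only the credential helper is exercised here, so no connection is made
    print(pps_credentials("Jane.Doe@Example.org"))
```

For near-realtime files, pass host="jsimpson.pps.eosdis.nasa.gov" after enabling "Near-Realtime Products" on the registration page.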
The PPS FTP does not work with the Safari web browser due to the way it handles FTP authentication. It is recommended that you use another web browser such as Chrome or Firefox, or use the command line or a dedicated FTP client application. (Click here for a list of possible FTP clients. NASA does not endorse any of these applications; they are listed merely as suggestions.)
Please visit the "PPS Satellite-Ground Coincidence Finder" website from NASA's Precipitation Processing System: https://storm.pps.eosdis.nasa.gov/storm/data/Service.jsp?serviceName=OverflightFinder#events
This tool allows you to select a location, date range, and satellite to determine when the satellite has passed over that location within those dates.
No, GPM data is provided on a "best effort" basis and should not be considered operational. By design, most GPM products, and specifically IMERG, are "best effort". We are pretty proud that our best effort has been quite effective. But, the systems are taken down for routine preventative maintenance on selected Tuesdays, and if a server crashes, a network goes down, the government shuts down, or partner satellites suddenly fall silent, we don't have hot spares, personnel with pagers, 24/7 operations, etc. that guarantee continuous operation.
The entire partner constellation processing strategy is "best effort", so no specific delivery requirements are included in the letters of agreement. The commitment is to maintain communications about the status of the sensors, data quality, and data transmissions, and recognize that GPM is an interested party.