Issues in Science and Technology Librarianship | Fall 2011
DOI:10.5062/F4CC0XMJ
In the absence of formal data set citation standards in the literature, there is no quantitative information on the connection between data distributed from NASA's Earth Observing System (EOS) data centers and subsequent research published using EOS data. This paper provides an analysis of a 10-year citation history of research using EOS instrument data in the peer-reviewed literature, which illustrates that the high volume of published EOS-related papers is indicative of the use of data from the NASA DAACs and comprises a significant contribution to the body of scientific knowledge about the Earth's climate.
In 1999, NASA launched the first of three spacecraft as part of the Earth Observing System (EOS). Terra, the flagship of the EOS program, was launched on December 18, 1999, and carried five instruments to monitor the Earth's surface, oceans, and atmosphere. Aqua, complementing the Terra spacecraft with six instruments (including two identical to instruments onboard Terra), began its mission on May 4, 2002. Aura, the last of the EOS spacecraft, was launched on July 15, 2004, with a mission focused on Earth's atmospheric chemistry. The following is a list of EOS instruments and spacecraft.
Instrument | Name | Spacecraft |
MODIS | Moderate Resolution Imaging Spectroradiometer | Terra, Aqua |
CERES | Clouds and Earth's Radiant Energy System | Terra, Aqua |
ASTER | Advanced Spaceborne Thermal Emission and Reflection Radiometer | Terra |
MISR | Multi-angle Imaging SpectroRadiometer | Terra |
MOPITT | Measurements of Pollution in The Troposphere | Terra |
AIRS | Atmospheric Infrared Sounder | Aqua |
AMSR | Advanced Microwave Scanning Radiometer-EOS | Aqua |
AMSU | Advanced Microwave Sounding Unit | Aqua |
HSB | Humidity Sounder for Brazil | Aqua |
HIRDLS | High Resolution Dynamics Limb Sounder | Aura |
MLS | Microwave Limb Sounder | Aura |
OMI | Ozone Monitoring Instrument | Aura |
TES | Tropospheric Emission Spectrometer | Aura |
Data from EOS instruments are first acquired by ground stations and then processed by Mission Control, after which the data are delivered to NASA's Distributed Active Archive Centers (DAACs), the Instrument Teams, and the Science Investigator-led Processing Systems (SIPSs). The NASA DAACs and SIPSs make data products available to the user community and to other data centers for the creation of value-added products and data sets. The table below lists the primary data facilities for the distribution of EOS Terra, Aqua, and Aura data.
Figure 1. Processing and distribution of EOS data (after Behnke, et al. 2005).
In most cases, data from EOS instruments used in scientific investigations and subsequent publications originated at one or more of the NASA DAACs or with the Instrument teams. The volume of peer-reviewed published papers that are based on EOS instrument data is indicative of the use of data from the NASA DAACs and comprises a significant contribution to the body of scientific knowledge about the Earth's climate. The NASA DAACs consist of the following:
Data Center | Description |
GES DISC | GSFC Earth Sciences Data and Information Services Center, NASA Goddard Space Flight Center, Greenbelt, MD |
LaRC ASDC | NASA Langley Research Center's Atmospheric Science Data Center, Langley, VA |
LP DAAC | Land Processes DAAC, USGS EROS Data Center, Sioux Falls, SD |
PODAAC | Physical Oceanography DAAC, NASA Jet Propulsion Laboratory, Pasadena, CA |
NSIDC DAAC | National Snow and Ice Data Center, NOAA/CIRES, Boulder, CO |
According to the Thomson Reuters Web of Science (WoS) citation database (Web of Science 2011), 5,633 peer-reviewed articles using EOS data have been published since the launch of Terra in December 1999. Figure 2 shows the number of articles using EOS instrument data published by year; 43% of the papers were published in 2008-2009.
Figure 2. Peer-reviewed articles published between 2000-2009 that cite using EOS data.
Records were extracted from the WoS using specific instrument keywords and, where necessary, qualifying keywords. For example, MODIS is an acronym unique to the instrument on Terra and Aqua; ASTER and MLS, however, are acronyms or keywords also used in other sciences. A search of WoS on ASTER alone retrieves hundreds of hits relating to the "aster" species of flowers. Qualifying instrument names with keywords results in a search of higher precision, although it cannot be ruled out that a few valid articles were missed. For example, qualifying the acronym MLS with the keyword "microwave" generated a more precise set of results that mostly captured articles related to the EOS MLS instrument, rather than an "MLS" acronym found in the biomedical literature.
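The effect of a qualifying keyword can be illustrated with a minimal Python sketch. It is not the WoS search interface and not the procedure actually used for this paper; it simply filters text locally. Only the MLS/"microwave" pairing comes from the text above. The `QUALIFIERS` table, the `matches` function, and the two sample titles are hypothetical and invented for illustration.

```python
import re

# Hypothetical qualifier table. Only the MLS/"microwave" pairing is taken
# from the text; the data structure itself is an illustrative assumption.
QUALIFIERS = {
    "MODIS": [],            # unique acronym, no qualifier needed
    "MLS": ["microwave"],   # screens out other uses of "MLS"
}

def matches(text, acronym, qualifiers=QUALIFIERS):
    """Return True if text mentions the acronym and every qualifying keyword."""
    # Whole-word, case-insensitive match on the acronym itself.
    if not re.search(rf"\b{re.escape(acronym)}\b", text, re.IGNORECASE):
        return False
    lowered = text.lower()
    return all(q.lower() in lowered for q in qualifiers.get(acronym, []))

# Invented sample titles, not real records.
records = [
    "Ozone profiles from the Aura MLS microwave limb sounder",
    "MLS scores in a clinical cohort study",
]
hits = [r for r in records if matches(r, "MLS")]
print(hits)
```

Only the first sample title passes the filter, because the second contains the acronym but not the qualifier. The trade-off is the one noted above: a valid article that never uses the word "microwave" in the searched text would also be dropped.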
Using the WoS database, a search was performed on each EOS instrument. Records were downloaded into spreadsheets using the WoS extraction function; each record consists of the Author(s), Title, Journal Title, Year of Publication, and Abstract. Next, each downloaded record was tagged with the instrument name, and any discrepancies, such as duplicate entries, were resolved. The records were then sorted by year and by instrument.
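The tagging, de-duplication, and sorting steps were carried out in spreadsheets, but the same logic can be sketched in Python. The sketch below is an assumption-laden illustration rather than the workflow actually used: the record fields, the `normalise` and `tag_and_merge` helpers, the rule for recognising a duplicate (same instrument, same normalised title, same year), and the placeholder titles are all hypothetical.

```python
def normalise(title):
    """Collapse case and whitespace so trivially different titles compare equal."""
    return " ".join(title.lower().split())

def tag_and_merge(downloads):
    """Tag each record with its instrument, drop duplicates, sort the result.

    `downloads` maps an instrument name to the list of records returned by
    that instrument's search (a hypothetical structure).
    """
    seen = set()
    merged = []
    for instrument, records in downloads.items():
        for rec in records:
            # Assumed duplicate rule: same instrument, title, and year.
            key = (instrument, normalise(rec["title"]), rec["year"])
            if key in seen:
                continue
            seen.add(key)
            merged.append({**rec, "instrument": instrument})
    # Sort by year, then by instrument.
    return sorted(merged, key=lambda r: (r["year"], r["instrument"]))

# Placeholder records, not real citations.
downloads = {
    "MODIS": [
        {"title": "Placeholder title A", "year": 2005},
        {"title": "placeholder  title a", "year": 2005},  # duplicate export
        {"title": "Placeholder title B", "year": 2003},
    ],
    "ASTER": [{"title": "Placeholder title B", "year": 2003}],
}
for rec in tag_and_merge(downloads):
    print(rec["year"], rec["instrument"], rec["title"])
```

The instrument is deliberately part of the duplicate key, so a paper that uses two instruments is kept once under each. Counting distinct papers across all instruments would need a second pass keyed on title and year alone.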
Each citation record was manually checked for accuracy, and journals that may not normally report Earth science results were examined more closely to check for imprecise search results. For example, the use of MODIS data in scientific studies tended to be far more interdisciplinary than the use of other EOS instrument data; therefore, journal titles that seemingly would have little to do with Earth observations might actually qualify in the search.
For example, the following article and journal title might appear to be an error in the search results for papers using MODIS data:
Journal: JOURNAL OF MEDICAL ENTOMOLOGY
Title: Infestation of rural houses by Triatoma infestans (Hemiptera: Reduviidae) in southern area of Gran Chaco in Argentina.
However, in the article, MODIS imagery was used to examine areas before and after insecticide application to determine its effectiveness.
In another example, ASTER vegetation indices were used to determine the effectiveness of mosquito-borne disease control in an article from a journal that ordinarily would not report results from remote sensing data:
Journal: VECTOR-BORNE AND ZOONOTIC DISEASES
Title: Remotely-sensed vegetation indices identify mosquito clusters of West Nile virus vectors in an urban landscape in the northeastern United States
Occasional inconsistencies were found in the WoS results. For example, one paper had a title that was clearly about data from the MODIS instrument; however, the article was associated with the Journal of Obstetrics & Gynecology. Further analysis revealed that the authors were assigned to the paper inaccurately by WoS (or the publisher provided incorrect data to WoS).
EOS instrument data were cited in 5,633 articles, identified as published in 432 separate journals. Approximately 70% of all of the articles were published in the 20 journals shown in Table 2 and Figure 3.
Journal | #Articles 2000-2009 | Journal Impact Factor* (5 year aggregate) |
Journal of Geophysical Research – Atmospheres | 849 | 3.475 |
Remote Sensing of Environment | 732 | 4.757 |
IEEE Transactions on Geoscience and Remote Sensing | 446 | 2.705 |
International Journal of Remote Sensing | 441 | 1.621 |
Geophysical Research Letters | 431 | 3.341 |
Atmospheric Chemistry and Physics | 206 | 5.416 |
Journal of Climate | 85 | 4.746 |
Journal of Atmospheric and Oceanic Technology | 82 | 1.810 |
Atmospheric Environment | 75 | 3.584 |
Journal of Applied Meteorology and Climatology | 70 | 2.420 |
Journal of Atmospheric Sciences | 60 | 3.219 |
Sensors | 58 | 1.403 |
Journal of Applied Remote Sensing | 57 | N/A |
IEEE Geoscience and Remote Sensing Letters | 56 | N/A |
Quarterly Journal of the Royal Meteorological Society | 53 | 3.271 |
Journal of Applied Meteorology | 46 | 2.420 |
Canadian Journal of Remote Sensing | 42 | 1.010 |
Applied Optics | 40 | 1.522 |
Monthly Weather Review | 39 | 2.856 |
Journal of Geophysical Research – Oceans | 37 | 3.475 |
For each journal, the Thomson Reuters Journal Impact Factor (JIF) is provided from the Journal Citation Reports. The Impact Factor "is a measure of the frequency with which the 'average article' in a journal has been cited in a particular year or period" (Thomson Reuters 1994). In general, the higher the number, the larger the impact of the journal on the scientific literature.
The average median Impact Factor for all journals in the subject areas Meteorology and Atmospheric Science, Oceanography, and Remote Sensing is 1.275. Articles on EOS data are, for the most part, being published in journals whose Impact Factors are greater than that average and which therefore have a higher impact on the scientific literature.
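The two figures quoted for Table 2 can be checked with a short Python sketch. The article counts and Impact Factors are copied from Table 2, the total of 5,633 articles and the 1.275 benchmark come from the text, and the variable names are illustrative only. Journals listed as N/A are excluded from the Impact Factor comparison.

```python
# (journal, articles 2000-2009, Journal Impact Factor or None), from Table 2.
TABLE_2 = [
    ("Journal of Geophysical Research - Atmospheres", 849, 3.475),
    ("Remote Sensing of Environment", 732, 4.757),
    ("IEEE Transactions on Geoscience and Remote Sensing", 446, 2.705),
    ("International Journal of Remote Sensing", 441, 1.621),
    ("Geophysical Research Letters", 431, 3.341),
    ("Atmospheric Chemistry and Physics", 206, 5.416),
    ("Journal of Climate", 85, 4.746),
    ("Journal of Atmospheric and Oceanic Technology", 82, 1.810),
    ("Atmospheric Environment", 75, 3.584),
    ("Journal of Applied Meteorology and Climatology", 70, 2.420),
    ("Journal of Atmospheric Sciences", 60, 3.219),
    ("Sensors", 58, 1.403),
    ("Journal of Applied Remote Sensing", 57, None),
    ("IEEE Geoscience and Remote Sensing Letters", 56, None),
    ("Quarterly Journal of the Royal Meteorological Society", 53, 3.271),
    ("Journal of Applied Meteorology", 46, 2.420),
    ("Canadian Journal of Remote Sensing", 42, 1.010),
    ("Applied Optics", 40, 1.522),
    ("Monthly Weather Review", 39, 2.856),
    ("Journal of Geophysical Research - Oceans", 37, 3.475),
]
TOTAL_ARTICLES = 5633   # all EOS-related articles, from the text
BENCHMARK_JIF = 1.275   # subject-area benchmark, from the text

top20_articles = sum(n for _, n, _ in TABLE_2)
share = top20_articles / TOTAL_ARTICLES

# Compare only the journals that have a listed Impact Factor.
with_jif = [(name, jif) for name, _, jif in TABLE_2 if jif is not None]
above = [name for name, jif in with_jif if jif > BENCHMARK_JIF]

print(f"{top20_articles} of {TOTAL_ARTICLES} articles ({share:.1%})")
print(f"{len(above)} of {len(with_jif)} journals exceed {BENCHMARK_JIF}")
```

The 20 journals account for 3,905 of the 5,633 articles, or about 69.3%, consistent with the "approximately 70%" stated above. Of the 18 journals with a listed Impact Factor, 17 exceed 1.275; the Canadian Journal of Remote Sensing (1.010) is the exception.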
Figure 3. Total number of articles that cite using EOS data per journal for the period 2000-2009.
A wide range of scientific journals other than those shown above also reported on research using EOS instrument data including:
EOS data (especially MODIS and ASTER) have a wide range of applications, including archaeology, paleontology, infectious disease mapping, and medicine. Some discipline journals that do not traditionally publish research using remote sensing data have published results using EOS instrument data, such as:
Because EOS data are openly available to the international community, many articles citing EOS instrument data are published in foreign and non-English publications. Some examples include:
Most EOS-related papers cite using MODIS data, often in tandem with other EOS instruments (11% of the papers cite using two or more EOS instrument data sets). Figure 4 shows the total number of peer-reviewed papers published using data from each EOS instrument.
Figure 4. The data clearly show that the MODIS instrument appears in the most peer-reviewed papers published over the 10-year span, comprising nearly 55% of all EOS-instrument papers. It was difficult to make a distinction between Terra/MODIS or Terra/CERES and Aqua/MODIS or Aqua/CERES in the citations; therefore, in this paper no attempt was made to assign MODIS or CERES to a specific spacecraft.
The following figures illustrate the number of articles that cite EOS instrument data by spacecraft. Because the number of articles citing the use of MODIS data far outnumbers the articles using other EOS instrument data, the graph of MODIS articles (from Terra and Aqua) is illustrated separately from the other graphs.
Figure 5. The number of articles published per year that cite using MODIS data. The majority of EOS-related articles cite using MODIS data.
Figure 6. The number of articles using EOS Terra data other than MODIS amounts to fewer than 100 articles in any given year, with the exception of ASTER data, which peaked in 2008 and 2009.
Figure 7. The number of articles that cite using data from Aqua is fewer than 80 in any given year. Aqua was launched in 2002.
Figure 8. The number of articles that cite using data from Aura is fewer than 80 in any given year. Only OMI shows an increasing trend in the number of articles published since launch. Aura was launched in 2004.
Table 3 shows the citation data, as extracted from the Web of Science, for each EOS instrument over the period 2000-2009. Not surprisingly, the number of citations of articles that use MODIS data is much greater than the number for articles using other EOS instrument data. In comparison to articles from a range of similar disciplines from WoS (Remote Sensing, Meteorology & Atmospheric Sciences, Oceanography, and Environmental Sciences), EOS-related articles provide a significant fraction of the cited literature from those fields. Over the 2000-2009 period, Web of Science reports 112,096 citations, of which EOS-related articles contribute 64.6%. This represents a significant impact of the use of EOS data on the scientific literature in those disciplines.
Table 4 lists the highest-impact article, as measured by citations over the 2000-2009 period, for each EOS instrument. In the case of AIRS/AMSU/HSB, the primary article by Aumann, et al. is the most highly cited article for all three instruments. Further analysis is needed to ascertain the impact of highly cited articles.
Table 3: Number of citations per instrument 2000-2009
Instrument | Number Citations | Avg. Citations/Article | Avg. Citations/Year |
MODIS | 41977 | 9.25 | 3498.08 |
ASTER | 3663 | 9.11 | 302.25 |
MISR | 3154 | 11.95 | 262.83 |
MOPITT | 2716 | 15.7 | 226.33 |
CERES | 2054 | 14.67 | 171.17 |
AIRS | 3529 | 8.87 | 294.08 |
AMSR | 2897 | 5.63 | 241.42 |
AMSU | 4371 | 10.38 | 364.25 |
HSB | 1082 | 25.76 | 120.22 |
HIRDLS | 320 | 5.82 | 26.67 |
MLS | 3381 | 13.21 | 281.75 |
OMI | 2035 | 10.88 | 185.00 |
TES | 1241 | 13.20 | 103.42 |
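The 64.6% figure quoted above can be reproduced from Table 3 with a few lines of Python. The citation counts are copied from the table and the total of 112,096 comes from the text; the variable names are illustrative. Summing the per-instrument counts is an assumption about how the figure was derived, although the sum does match the stated percentage.

```python
# Number of citations per instrument, 2000-2009, from Table 3.
CITATIONS = {
    "MODIS": 41977, "ASTER": 3663, "MISR": 3154, "MOPITT": 2716,
    "CERES": 2054, "AIRS": 3529, "AMSR": 2897, "AMSU": 4371,
    "HSB": 1082, "HIRDLS": 320, "MLS": 3381, "OMI": 2035, "TES": 1241,
}
FIELD_TOTAL = 112096  # citations reported by WoS for the related disciplines

eos_total = sum(CITATIONS.values())
eos_share = eos_total / FIELD_TOTAL
modis_share = CITATIONS["MODIS"] / eos_total

print(f"EOS citations: {eos_total} ({eos_share:.1%} of {FIELD_TOTAL})")
print(f"MODIS share of EOS citations: {modis_share:.1%}")
```

The per-instrument counts sum to 72,420, which is 64.6% of 112,096. An article listed under more than one instrument would be counted once per instrument in this sum, so the total should be read with that caveat.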
Table 4: Most Highly Cited articles per Instrument 2000-2009
Instrument | Year | Avg. Citations/Year | First Author/Title | Journal |
MODIS | 2002 | 55.30 | Huete: Overview of radiometric and biophysical…of MODIS vegetation indices | RSE |
ASTER | 2002 | 23.40 | Justice: The MODIS Fire Products | RSE |
MISR | 2005 | 15.00 | Kahn: MISR Global Aerosol Optical Depth based on…Aeronet Observations | JGR-Atmos |
MOPITT | 2003 | 17.33 | Jacob: TRACE-P aircraft mission | JGR-Atmos |
CERES | 2004 | 29.50 | Zhang: Calculation of Radiative fluxes from surface to top of atmosphere based on ISCCP…. | JGR-Atmos |
AIRS | 2003 | 32.22 | Aumann: AIRS/AMSU/HSB on the Aqua Mission… | IEEE Trans. Geo. Remote Sens. |
AMSR | 2003 | 25.78 | Njoku: Soil Moisture retrieval from AMSR-E | IEEE Trans. Geo. Remote Sens. |
AMSU | 2003 | 32.22 | Aumann: AIRS/AMSU/HSB on the Aqua Mission… | IEEE Trans. Geo. Remote Sens. |
HSB | 2003 | 32.22 | Aumann: AIRS/AMSU/HSB on the Aqua Mission… | IEEE Trans. Geo. Remote Sens. |
HIRDLS | 2006 | 15.50 | Schoeberl: Overview of the Aura Mission | IEEE Trans. Geo. Remote Sens. |
MLS | 2006 | 29.00 | Waters: The EOS MLS on the Aura Satellite | IEEE Trans. Geo. Remote Sens. |
OMI | 2006 | 28.33 | Levelt: The Ozone Monitoring Instrument | IEEE Trans. Geo. Remote Sens. |
TES | 2001 | 11.55 | Beer: TES for the EOS Aura Satellite | Applied Optics |
EOS metrics data are available on the volume of data distributed to users. It is assumed that these data users are performing scientific investigations that could eventually lead to results published in the scientific literature. A comparison was made between the number of papers published by instrument and the amount of data distributed by the DAACs from each instrument during the 10-year period (Chang 2010). As one might expect, there is a correlation between the amount of data users download and the number of papers published (see Figures 9 and 10). MODIS data account for 72% of the total instrument data distributed to users, and more than 55% of the papers published on EOS data utilize results from MODIS. Other instruments are proportionally aligned. HIRDLS has the least data distributed and, correspondingly, the fewest papers published.
There are some interesting anomalies: the amount of ASTER data distributed to users is about half the amount of MISR data distributed (319 TB vs. 634 TB), yet nearly twice as many papers have been published utilizing ASTER data as MISR data (635 vs. 327). A possible reason could be that more diverse scientific investigations are made using ASTER than MISR, resulting in many more papers on a wider variety of topics. The same holds for MODIS: papers using MODIS data have been published in a wide variety of disciplines, some of which are outside the mainstream Earth sciences.
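The ASTER/MISR anomaly can be made concrete by expressing it as papers published per terabyte distributed. The Python sketch below uses only the four numbers quoted above; the "papers per TB" measure and the variable names are illustrative assumptions and are not metrics reported by the DAACs.

```python
# Data distributed (TB) and papers published, 2000-2009, from the text.
DISTRIBUTED_TB = {"ASTER": 319, "MISR": 634}
PAPERS = {"ASTER": 635, "MISR": 327}

# Illustrative measure: papers published per terabyte distributed.
papers_per_tb = {inst: PAPERS[inst] / DISTRIBUTED_TB[inst] for inst in PAPERS}

data_ratio = DISTRIBUTED_TB["ASTER"] / DISTRIBUTED_TB["MISR"]
paper_ratio = PAPERS["ASTER"] / PAPERS["MISR"]

for inst, value in papers_per_tb.items():
    print(f"{inst}: {value:.2f} papers per TB distributed")
print(f"ASTER/MISR data ratio:  {data_ratio:.2f}")
print(f"ASTER/MISR paper ratio: {paper_ratio:.2f}")
```

ASTER works out to roughly two papers per terabyte distributed and MISR to roughly half a paper per terabyte, so on this measure ASTER data yielded close to four times as many papers per unit of data distributed.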
Figure 9. EOS data distribution to users (in Terabytes [TB]) for EOS instruments from 2000-2009. (Data provided by Chang, H-D.)
Figure 10. Total number of peer-reviewed articles by instrument 2000-2009.
Papers published citing the use of EOS instrument data have greatly contributed to the body of knowledge about Earth system processes. Although authors will cite that a specific EOS instrument was used, the citation of specific datasets in the peer-reviewed literature is lacking, and the prospects for consistent citation in the future are not bright, though there is considerable movement towards establishing data citation standards (Green 2009). Many publishers do not allow the explicit citation of datasets in a format similar to citing journal articles, and many authors do not explicitly cite the source of the data in their papers. The scholarly citation of datasets in scientific publishing has been widely discussed in Green (2009) and in Altman and King (2007). The German Research Foundation (DFG) and the German National Library of Science and Technology (TIB) have established a DOI registration system for scientific data sets (from the PANGAEA information system) (Grobe, et al. 2006). Other recent initiatives include DataONE and DataCite. For years, NASA's Global Change Master Directory has been managing metadata for a wide variety of remote sensing and in-situ Earth science data sets and permits the assignment of unique identifiers. Both the American Geophysical Union (American Geophysical Union 2009) and the American Meteorological Society (American Meteorological Society 2010) have convened committees to explore the scholarly citation of scientific data sets.
However, even if broad consensus among publishers is ever achieved on the citation of data sets in the future, there could never be any retrospective analysis of data citation in the literature, simply because it does not now exist. The use of EOS data that were ultimately obtained from the NASA DAACs, either directly or indirectly through secondary data distributors, can be extrapolated from the published literature. In this analysis, an attempt was made to provide some proxy indication of data use, at least in the case of NASA's EOS data that are distributed from the DAACs and investigator-led teams.
Since the peer-reviewed literature, in general, does not address specific data sets that were used in the publication of a paper, only the qualitative roles that data from EOS instruments played in the scientific investigation leading to published results are addressed. According to the Web of Science database, 5,633 papers citing the use of EOS instrument data were published in over 400 different journals over the period 2000-2009. Over 55% of those papers cited using MODIS data. The total number of papers published using EOS data closely tracks the amount of data distributed to users through the NASA EOS DAACs.
The impact of the use of EOS data on the scientific literature is significant in that EOS-cited papers tended to be published in journals with high Journal Impact Factors and that EOS-related articles contributed 64.6% of the total number of citations from related disciplines. Also, from a total sample of approximately 15,000 Earth science papers published during this period, approximately 37% of the papers use EOS data. Even in the absence of formal data citation standards, EOS data, primarily delivered by the DAACs, represent a significant contribution to the scientific literature.
This work was performed under NASA Contract NNG07AZ07C, NASA Goddard Space Flight Center Library (Code 272), under the direction of Robin M. Dixon. The author gratefully acknowledges the assistance of Lola M. Olsen of NASA's Global Change Master Directory (GCMD) in reading the manuscript and providing valuable comments. The author also acknowledges Dr. Hyo-Duk Chang of Adnet Systems and the ESDIS Project Office for providing EOS distribution data and Nicole S. Hoes of LAC Group for editing the manuscript.
American Geophysical Union (AGU). 2009. Peer-Reviewed Data Publication and Other Strategies to Sustain Verifiable Science. Interagency Data Stewardship/2009AGUTownHall. [Internet]. [Cited November 7, 2011]. Available from: http://wiki.esipfed.org/index.php/Interagency_Data_Stewardship/2009AGUTownHall
American Meteorological Society (AMS). 2009. AMS Ad Hoc Committee on Data Stewardship Prospectus. [Internet]. [Cited November 7, 2011]. Available from: http://www.unidata.ucar.edu/staff/mohan/Data Stewardship Prospectus.pdf
Altman, M. and King, G. 2007. A proposed standard for the scholarly citation of quantitative data. D-Lib Magazine 13(3/4). [Internet]. [Cited November 7, 2011]. Available from: http://www.dlib.org/dlib/march07/altman/03altman.html
Behnke, Jeanne, Watts, T.H., Kobler, B., Lowe, D., Fox, S., and Meyer, R. 2006. EOSDIS Petabyte Archives: Tenth Anniversary, Proceedings of the 22nd IEEE/13th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST 2005).
Chang, Hyo-Duk. 2010. Personal communication (ADNET Systems, Inc.).
Green, Toby. 2009. We Need Publishing Standards for Datasets and Data Tables, OECD Publishing White Paper, OECD Publishing. [Internet]. [Cited November 7, 2011]. Available from: http://dx.doi.org/10.1787/603233448430
Grobe, Hannes, Diepenbroek, M., Dittert, N., Reinke, M. and Seiger, R. 2006. Archiving and distributing earth-science data with the PANGAEA Information System, In Futterer, D.K., Damaske, D., Kleinschmidt, G., Miller, H., and Tessensohn, F. (eds.). Antarctica: Contributions to Global Earth Sciences, Berlin: Springer, 403-406. [Internet]. [Cited November 7, 2011]. Available from: http://dx.doi.org/10.1007/3-540-32934-X
Parsons, Mark A., Duerr, R., and Minster, J-B. 2010. Data citation and peer review. Eos Transactions of the AGU 91(34).
Thomson Reuters. 1994. The Thomson Reuters Impact Factor. [Internet]. [Cited March 2011]. Available from: http://thomsonreuters.com/products_services/science/free/essays/impact_factor/
Web of Science. 2011. [Internet]. [Cited March 2011]. Available from: http://isiknowledge.com/wos