URLs in this document have been updated. Links enclosed in {curly brackets} have been changed. If a replacement link was located, the new URL was added and the link is active; if a new site could not be identified, the broken link was removed.
Electronic Resources Reviews
DOE Data Explorer
Meredith Ayers
Science Librarian
Founders Memorial Library
Northern Illinois University
DeKalb, Illinois
mayers@niu.edu
Copyright 2008, Meredith Ayers. Used with permission.
The following is a brief overview of the mechanics of the DOE Data Explorer. The reviewer hopes to elaborate further on the types of data available in another article.
Overview
The Department of Energy (DOE) Data Explorer is a relatively new and currently unsophisticated research tool that helps researchers, students, and the public find stored and maintained data sets. The site claims to cite over 200 data sets and continues to grow. The DOE does not claim responsibility for the accuracy and availability of the stored data. The purpose of the engine is to make both archived and active data easier to find. The Data Explorer is operated and maintained by the DOE's Office of Scientific and Technical Information (OSTI), which is responsible for providing all the bibliographic information in the database, based on the information found at the web sites hosting the data.
The Data Explorer indexes collections of scientific research data, figures and plots, numeric files, scientific images, interactive maps, multimedia, and computer simulations. The data collections themselves reside on various servers in numerous locations, including national laboratories, data centers, colleges and universities, corporations, and international organizations. Access to the data collections is free; however, some may require password registration. Users should note that they may need specific software in order to access some data collections.
Criteria
Data indexed by the Data Explorer must be raw data. This means it must contain mostly numbers, figures, plots, images, etc., and for the most part be non-text information. While some sites that provide access to the data may also link to research papers in which the data are referred to or used, the Data Explorer does not search, index, or provide access to research papers. The data must be maintained for analysis, reference purposes, or reuse. Collections may be small but must consist of more than a few items that fall under the collection's "title." The final criterion is that the collection or maintenance of the data must be funded completely or in part by the DOE.
Searching
On the home page (Figure 1), the left-hand column contains the Browse box, the Basic Search box, and a link to the advanced search page. The basic search allows a searcher to type a word or phrase with which to perform a search. The search is not case sensitive, and quotation marks cannot be used for exact phrase searching.
Figure 1: DOE Data Explorer Homepage
The submit function initiates a string search for the query, checking every field in the full record for a match.
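The behavior described above (a case-insensitive string match tried against every field of a record) can be illustrated with a minimal Python sketch. The record layout, the sample records, and the `basic_search` function are hypothetical and chosen only for illustration; they are not taken from the Data Explorer itself, whose internal implementation is not described on the site. The sketch also assumes the whole query is treated as a single string.

```python
# Hypothetical sample records; field names and values are illustrative only.
RECORDS = [
    {"title": "Carbon Cycle Data", "sponsor": "DOE",
     "description": "Flux measurements"},
    {"title": "Nuclear Structure Files", "sponsor": "DOE",
     "description": "Evaluated nuclear data"},
]


def basic_search(records, query):
    """Return records in which any field contains the query, ignoring case."""
    needle = query.strip().lower()
    if not needle:
        return []
    return [
        rec for rec in records
        # Check every field of the record, mirroring a "full record" search.
        if any(needle in str(value).lower() for value in rec.values())
    ]


if __name__ == "__main__":
    for rec in basic_search(RECORDS, "CARBON cycle"):
        print(rec["title"])
```

Because the match is a plain substring test, a query finds a record only if the typed characters occur together in some field, which is one plausible reason a simple engine of this kind returns fewer hits than a searcher might expect.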
Figure 2: Advanced Search Page
The Advanced Search Screen (Figure 2) provides more options. One can search in a specified field:
- full record
- title
- content type
- creator
- organizations
- sponsor
- subject category/keyword
Data Explorer also provides two sorting options: Title and Content Type. Sorting by title provides an alphabetized list of titles matching the search criteria. Sorting by content type groups the titles based on how their content is classified. For example, all of the citations that are collections of Computer Models/Simulations are grouped together; all of the citations that are Figures and Plots are grouped together, etc. Content types include:
- Computer Models/Simulation
- Figures/Plots
- Interactive Data Maps
- Multimedia
- Numeric Files/Datasets
- Scientific Images
- Specialized Mix
Definitions of these can be found at {http://www.osti.gov/dataexplorer/contenttypes.html}.
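The two sorting options can be pictured with a short Python sketch. The result titles below are invented sample data, and the function names are hypothetical; only the content type labels come from the list above. This is an illustration of the two orderings, not the site's own code.

```python
from itertools import groupby

# Invented sample results; only the content type labels come from the review.
RESULTS = [
    {"title": "Wind Resource Maps", "content_type": "Interactive Data Maps"},
    {"title": "Carbon Flux Tables", "content_type": "Numeric Files/Datasets"},
    {"title": "Biomass Maps", "content_type": "Interactive Data Maps"},
]


def sort_by_title(results):
    """Alphabetize results by title, ignoring case."""
    return sorted(results, key=lambda r: r["title"].lower())


def group_by_content_type(results):
    """Map each content type to the alphabetized titles classified under it."""
    # groupby only merges adjacent items, so sort by content type first.
    ordered = sorted(
        results, key=lambda r: (r["content_type"], r["title"].lower())
    )
    return {
        ctype: [r["title"] for r in group]
        for ctype, group in groupby(ordered, key=lambda r: r["content_type"])
    }


if __name__ == "__main__":
    print([r["title"] for r in sort_by_title(RESULTS)])
    print(group_by_content_type(RESULTS))
```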
Results
As shown in Figure 2, a search for "carbon cycle data" was performed from the advanced search screen. The results of the search are shown in Figure 3.
Figure 3: Results of search for "carbon cycle data."
Clicking on the link for the bibliographic record yields the record shown in Figures 4 and 5.
Figure 4: Bibliographic record, part 1
Figure 5: Bibliographic record, part 2
The following fields will always appear in the bibliographic record:
- collection title
- collection sponsor
- subject categories
- keywords
- description
- DOE Data Explorer number
These fields will appear only if the information is available:
- creator/PI
- other sponsors (non-DOE)
- DOE Data Center (only if it is the location of the collection)
- DOE Scientific User Facility (only if it is the location of the collection)
- other related organizations.
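The split between fields that always appear and fields that appear only when available can be modeled as required and optional attributes. The Python sketch below is hypothetical: the class name, attribute names, and sample values are assumptions made for illustration and do not reflect the database's actual schema.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional

# Attribute names (hypothetical) for the fields the review says always appear.
ALWAYS_SHOWN = (
    "collection_title", "collection_sponsor", "subject_categories",
    "keywords", "description", "data_explorer_number",
)


@dataclass
class BibliographicRecord:
    # Fields that always appear in a record.
    collection_title: str
    collection_sponsor: str
    subject_categories: List[str]
    keywords: List[str]
    description: str
    data_explorer_number: str
    # Fields that appear only if the information is available.
    creator_pi: Optional[str] = None
    other_sponsors: List[str] = field(default_factory=list)
    doe_data_center: Optional[str] = None
    doe_scientific_user_facility: Optional[str] = None
    other_related_organizations: List[str] = field(default_factory=list)

    def displayed_fields(self):
        """List the always-shown fields plus any optional field with a value."""
        shown = []
        for f in fields(self):
            value = getattr(self, f.name)
            if f.name in ALWAYS_SHOWN or value not in (None, []):
                shown.append(f.name)
        return shown


if __name__ == "__main__":
    # Placeholder values only; not a real record.
    record = BibliographicRecord(
        "Example Collection", "DOE", ["example category"], ["example"],
        "Example description", "placeholder-id", doe_data_center="CDIAC",
    )
    print(record.displayed_fields())
```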
Clicking on the term "Data Collection" takes the user from the DOE Data Explorer to the site hosting the data, as shown in Figure 6.
Figure 6: Site hosting the data collection.
Once at the host site, the information the user seeks may be anywhere from one click to several clicks away, or it may be necessary to perform another search in the data center interface. In this instance, if the user scrolls down the page, the left-hand column under the subtitle "Analog Channels" contains a link to "Covariances of H2O, CO2," which provides the data shown in Figure 7.
Figure 7: Chart showing the covariances of H2O, CO2
Browsing
Browsing the database is another useful option. Browsing categories include:
- title
- content type
- DOE Data Center
- DOE Scientific User Facility
- sponsor
- subject category
Note that these fields do not appear in every record, but exist in a record only if the information is available. Browsing is a simple process. For example, selecting DOE Data Center will list all nine Data Centers. After selecting one of the data centers, the titles of the data collections indexed by the DOE Data Explorer are listed below the data center's name as shown in Figure 8. In Figure 8 the data collections from the Alternative Fuels and Advanced Vehicles Data Center (AFDC) are found. Clicking on another data center name will cause those results to disappear while revealing the new results under the data center just selected. This is the format for all the browsing categories.
Figure 8: Browsing the DOE Data Centers.
Scope
The DOE Data Explorer indexes data collections from the following DOE Data Centers:
- Alternative Fuels and Advanced Vehicles Data Center (AFDC)
- Atmospheric Radiation Measurement (ARM)
- Carbon Dioxide Information Analysis Center (CDIAC)
- Comprehensive Epidemiologic Data Resource (CEDR)
- Controlled Fusion Atomic Data Center (CFADC)
- DOE Joint Genome Institute (JGI)
- National Nuclear Data Center (NNDC)
- Renewable Resource Data Center (RREDC)
- U.S. Transuranium and Uranium Registries (USTUR)
Data collections also come from related centers that are funded by other agencies but located at DOE facilities that are partly funded by the DOE, or from DOE groups that gather, analyze, and disseminate data. These groups provide a large variety of very specific data, ranging from genetic and environmental data to nuclear-related data.
Miscellaneous
The Data Explorer highlights a different "Featured Data Collection" every month. Clicking on the "What's New" menu tab tells the user which data collections have recently been added to or removed from the list, along with other updates. OSTI also invites comments and suggestions from users and asks to be notified if a data collection that a user thinks should be included is missing.
Reminder
Even though this is a potentially useful tool for locating experimental or raw data sets, it is still new and unsophisticated in its searching methods. There are no Boolean operators, and the only way to limit a search is by choosing the field to be searched. Keep in mind that it searches data collections. This means a collection may contain relevant data under a title that may not at first appear to be what the user is looking for. It is in the best interest of the searcher to explore the data hosting site in search of the data. The search shown here was prefabricated, and not all searches are guaranteed to go so smoothly.