Filters: Tags: Data and Information Assets (X) > partyWithName: Community for Data Integration - CDI (X)

71 results

The USGS National Land Cover Trends Project has the largest repository of field photos at the USGS (over 33,000 photos). Prior to CDI funding, Land Cover Trends had limited funding to make the national collection of photos available online for researchers, land managers, and citizens. The goal of this CDI project was to add geotags and keywords to the digital copies of each field photo and make the collection searchable and downloadable via the Internet. By funding the effort to integrate Land Cover Trends field photography and online mapping technology, CDI has helped provide access to geographic data needed to conduct science and support policy decisions. Sharing georeferenced photography distributed across the...
Legacy data (n) - Information stored in an old or obsolete format or computer system that is, therefore, difficult to access or process. (Business Dictionary, 2016) For over 135 years, the U.S. Geological Survey has collected diverse information about the natural world and how it interacts with society. Much of this legacy information is one-of-a-kind and in danger of being lost forever through decay of materials, obsolete technology, or staff changes. Several laws and orders require federal agencies to preserve and provide the public access to federally collected scientific information. The information is to be archived in a manner that allows others to examine the materials for new information or interpretations....
The goal of this project is to improve the USGS National Earthquake Information Center’s (NEIC) earthquake detection capabilities through direct integration of crowd-sourced earthquake detections with traditional, instrument-based seismic processing. During the past 6 years, the NEIC has run a crowd-sourced system, called Tweet Earthquake Dispatch (TED), which rapidly detects earthquakes worldwide using data solely mined from Twitter messages, known as “tweets.” The extensive spatial coverage and near instantaneous distribution of the tweets enable rapid detection of earthquakes often before seismic data are available in sparsely instrumented areas around the world. Although impressive for its speed, the tweet-based...
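The tweet-based detection idea can be illustrated with a toy rate-spike check: flag a candidate event when the current minute's count of earthquake-keyword tweets jumps well above the recent baseline. This is a minimal sketch of the general technique only; the window size, threshold, and logic are illustrative assumptions, not TED's actual algorithm.

```python
from collections import deque

def spike_detector(baseline_window=10, threshold_ratio=5.0):
    """Return a closure that flags a detection when the current
    minute's keyword-tweet count greatly exceeds the trailing
    baseline. Parameters are illustrative, not TED's."""
    history = deque(maxlen=baseline_window)

    def check(count_this_minute):
        baseline = (sum(history) / len(history)) if history else 0.0
        history.append(count_this_minute)
        # Flag a candidate event when the count jumps well above baseline.
        return baseline > 0 and count_this_minute >= threshold_ratio * baseline

    return check

check = spike_detector()
counts = [3, 2, 4, 3, 2, 40]  # sudden surge of "earthquake" tweets
flags = [check(c) for c in counts]  # only the surge minute is flagged
```

A real system would also cluster flagged minutes by tweet geolocation before declaring a detection.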
The purpose of this project was to test and develop first-generation biological data integration and retrieval capabilities for the Water Quality Portal (National Water Quality Monitoring Council, [n.d.]) using the Water Quality Exchange (WQX) data exchange standard (Environmental Information eXchange Network, 2016). The Water Quality Portal (Portal) is a significant national water data distribution node that is aligned with the vision of the Open Water Data Initiative (Advisory Committee on Water Information, [n.d.]). The Portal is sponsored by the USGS, the EPA, and the National Water Quality Monitoring Council. The WQX data exchange standard is a mature standard widely adopted within the water quality monitoring...
People near an earthquake publish anecdotal information about the shaking within seconds of its occurrence via social network technologies such as Twitter. In contrast, depending on the size and location of the earthquake, scientific alerts can take from two to twenty minutes to publish. The goal of this project is to assess earthquake damage and effects as impacts unfold by leveraging expeditious, free, and ubiquitous social-media data to enhance earthquake response. Principal Investigators: Michelle Guy, Paul S. Earle. Cooperators/Partners: Scott R. Horvath, Douglas Bausch, Gregory M. Smoczyk. The project leverages an existing system that performs...
As one of the cornerstones of the U.S. Geological Survey's (USGS) National Geospatial Program, The National Map is a collaborative effort among the USGS and other Federal, State, and local partners to improve and deliver topographic information for the Nation. It has many uses ranging from recreation to scientific analysis to emergency response. The National Map is easily accessible for display on the Web, as products and services, and as downloadable data. (Description from The National Map website, http://nationalmap.gov/about.html) In fiscal year 2010, the Community for Data Integration (CDI) funded the development of web services for the National Hydrography Dataset (NHD), the National Elevation Dataset (NED)...
USGS research in the Western Geographic Science Center has produced several geospatial datasets estimating the time required to evacuate on foot from a Cascadia subduction zone earthquake-generated tsunami in the U.S. Pacific Northwest. These data, created as a result of research performed under the Risk and Vulnerability to Natural Hazards project, are useful for emergency managers and community planners but are not in the best format to serve their needs. This project explored options for formatting and publishing the data for consumption by external partner agencies and the general public. The project team chose ScienceBase as the publishing platform, both for its ability to convert spatial data into web services...
The USGS 3D Elevation Program (3DEP) is managing the acquisition of lidar data across the Nation for high-resolution mapping of the land surface, useful for multiple applications. Lidar data are initially collected as 3-dimensional “point clouds” that map the interaction of the airborne laser with earth surface features, including vegetation, buildings, and ground features. Generally, the product of interest has been high-resolution digital elevation models, generated by filtering the point cloud for laser returns that come from the ground surface and removing returns from vegetation, buildings, powerlines, and other above-ground features. However, there is a wealth of information in the full point cloud on vegetation...
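The ground-filtering step described above can be sketched using the ASPRS LAS classification codes, where class 2 denotes ground returns. The point tuples below are illustrative; real workflows read these fields from LAS/LAZ files with a library such as PDAL or laspy.

```python
# ASPRS LAS point classification: 2 = ground (5 = high vegetation,
# 6 = building). Bare-earth DEMs are built from class-2 returns only.
GROUND = 2

def ground_points(points):
    """Keep only returns classified as ground; each point is an
    illustrative (x, y, z, classification) tuple."""
    return [(x, y, z) for (x, y, z, cls) in points if cls == GROUND]

cloud = [
    (0.0, 0.0, 101.2, 2),   # ground
    (0.0, 0.1, 115.8, 5),   # high vegetation
    (0.1, 0.0, 101.1, 2),   # ground
    (0.1, 0.1, 120.3, 6),   # building
]
bare_earth = ground_points(cloud)  # only the two ground returns remain
```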
We aim to migrate our research workflow from a closed system to an open framework, increasing flexibility and transparency in our science and accessibility of our data. Our hyperspectral data of agricultural crops are crucial for training/validating machine learning algorithms to study food security, land use, etc. Generating such data is resource-intensive and requires expertise, proprietary software, and specific hardware. We will use CHS resources on their Pangeo JupyterHub to recast our data and workflows to a cloud-agnostic open-source framework. Lessons learned will be shared at workshops, in reports, and on our website so others can increase the openness and accessibility of their data and workflows....
The USGS maintains an extensive monitoring network throughout the United States in order to protect the public and help manage natural resources. This network generates millions of data points each year, all of which must be evaluated and reviewed manually for quality assurance and control. Sensor malfunctions and issues can result in data losses and unexpected costs, and are typically noticed only after they occur, during manual data checks. By connecting internal USGS databases to “always-on” artificial-intelligence applications, we can constantly scan data streams for issues and predict problems before they occur. By connecting these algorithms to other cloud-hosted services, the system can automatically notify...
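A simple form of the always-on scan described above is a rolling z-score check: flag a reading that deviates strongly from its trailing window. This is a stand-in for the kind of streaming anomaly check the abstract describes, not the USGS system itself; the window size and threshold are assumptions.

```python
import statistics

def zscore_flags(values, window=5, z_thresh=3.0):
    """Flag readings that deviate from the trailing window by more
    than z_thresh standard deviations. Illustrative parameters."""
    flags = []
    for i, v in enumerate(values):
        past = values[max(0, i - window):i]
        if len(past) >= 3:
            mu = statistics.mean(past)
            sd = statistics.pstdev(past)
            # A reading far outside the recent spread is anomalous.
            flags.append(sd > 0 and abs(v - mu) > z_thresh * sd)
        else:
            flags.append(False)  # not enough history yet
    return flags

readings = [10.1, 10.0, 10.2, 10.1, 25.0, 10.1]  # one sensor glitch
flags = zscore_flags(readings)  # the 25.0 reading is flagged
```

In a streaming deployment, the flag would feed a notification service rather than a returned list.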
Understanding and anticipating change in dynamic Earth systems is vital for societal adaptation and welfare. USGS possesses the multidisciplinary capabilities to anticipate Earth systems change, yet our work is often bound within a single discipline and/or Mission Area. The proposed work breaks new ground in moving USGS towards an interdisciplinary predictive modeling framework. We are initially leveraging three research elements that cross the Land Resources and Water Mission Areas in an attempt to “close the loop” in modeling interactions among water, land use, and climate. Using the Delaware River Basin as a proof-of-concept, we are modeling 1) historical and future landscapes (~1850 to 2100), 2) evapotranspiration...
A BioBlitz is a field survey method for finding and documenting as many species as possible in a specific area over a short period. The National Park Service and National Geographic Society hosted the largest BioBlitz survey ever in 2016; people in more than 120 national parks used the iNaturalist app on mobile devices to document organisms they observed. Resulting records have Global Positioning System (GPS) coordinates, include biological accuracy assessments, and provide an unprecedented snapshot of biodiversity nationwide. Additional processing and analysis would make these data available to inform conservation and management decisions. This project developed a process to integrate iNaturalist data with existing...
Lower technical and financial barriers have led to a proliferation of lidar point-cloud datasets acquired to support diverse USGS projects. The objective of this effort was to implement an open-source, cloud-based solution through USGS Cloud Hosting Solutions (CHS) that would address the needs of the growing USGS lidar community. We proposed to allow users to upload point-cloud datasets to CHS-administered Amazon Web Services storage where open-source packages Entwine and Potree would provide visualization and manipulation via a local web browser. This functionality for individual datasets would mirror services currently available for USGS 3DEP data. After the software packages could not satisfy internal technical...
Identifying the leading edge of a biological invasion can be difficult. Many management and research entities have biological samples or surveys that may unknowingly contain data on nonindigenous species. The new Nonindigenous Aquatic Species (NAS) Database automated online tool “SEINeD” (Screen and Evaluate Invasive and Non-native Data) will allow a user to search for these nonindigenous occurrences at the push of a button. This new tool will enable stakeholders to upload a biological dataset of fishes, invertebrates, amphibians, reptiles, or aquatic plants collected anywhere in a U.S. State or Territory and screen the data for non-native aquatic species occurrences. In addition to checking for the nativity of species...
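At its core, a screening step like SEINeD's can be sketched as a lookup of (location, species) pairs against a reference of known non-native occurrences. The reference table below is a made-up miniature for illustration; SEINeD's actual logic and reference data live in the NAS Database.

```python
# Hypothetical reference of species known to be non-native in a state.
NONINDIGENOUS = {
    ("FL", "Pterois volitans"),   # red lionfish
    ("FL", "Monopterus albus"),   # Asian swamp eel
}

def screen(state, observed_species):
    """Return the observed species flagged as non-native in `state`,
    sorted for stable output."""
    return sorted(s for s in observed_species if (state, s) in NONINDIGENOUS)

# Largemouth bass is native to Florida; the lionfish is flagged.
hits = screen("FL", ["Micropterus salmoides", "Pterois volitans"])
```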
Geochronological data provide essential information for understanding the timing of geologic processes and events, as well as for quantifying rates and timescales key to geologic mapping and to mineral resource, energy resource, and hazard assessments. The USGS’s National Geochronological Database (NGDB) contains over 30,000 radiometric ages, but no formal update has occurred in over 20 years. This project is developing a database with a web-based user interface and sustainable workflow to host all USGS-generated geochronological data. This new geochronological database consists of (1) data from the existing NGDB; (2) published literature data generated by the USGS; and (3) more recent data extracted from ScienceBase...
Natural resources managers are regularly required to make decisions regarding upcoming restoration treatments, often based on little more than business-as-usual practices. To assist in the decision-making process, we created a tool that predicts site-specific soil moisture and climate for the upcoming year and provides guidance on whether common restoration activities (e.g., seeding, planting) will be successful under these conditions. This tool is hosted within the Land Treatment Exploration Tool (LTET), an application already used by land managers that delivers a report of site condition and treatment history. Incorporated within the short-term drought forecaster (STDF) is a rigorous statistical process,...
This project leveraged existing efforts toward the use of social media systems for delivery of information into a web-based visualization framework. Rather than support the development of an expensive in-house system, this project uses the cloud-based social media platform Twitter as a robust observation platform. Development efforts were directed at utilizing the substantial Twitter API feature set to query the media stream for species observation submissions. Citizen science participants were encouraged to use the Twitter direct message system to submit species observations using a pre-defined schema. Observations were extracted from the Twitter stream and processed using geospatial,...
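Extracting observations that follow a pre-defined message schema amounts to pattern matching on incoming text. The schema below is invented for illustration (the abstract does not show the project's actual schema): `obs: <species>; loc: <lat>,<lon>`.

```python
import re

# Hypothetical message schema: "obs: <species name>; loc: <lat>,<lon>"
PATTERN = re.compile(
    r"obs:\s*(?P<species>[^;]+);\s*loc:\s*(?P<lat>-?\d+\.?\d*),\s*(?P<lon>-?\d+\.?\d*)"
)

def parse_observation(message):
    """Extract a (species, lat, lon) record from a direct message,
    or return None if the message does not match the schema."""
    m = PATTERN.search(message)
    if not m:
        return None
    return (m.group("species").strip(), float(m.group("lat")), float(m.group("lon")))

rec = parse_observation("obs: Danaus plexippus; loc: 38.89,-77.03")
```

Records parsed this way could then be handed to the geospatial processing steps the abstract mentions.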
Over the past 40 years, the National Wildlife Health Center (NWHC) has collected wildlife health information from around the U.S. and beyond, amassing the world’s largest repository of wildlife-disease surveillance data. This project identified, characterized, and documented NWHC’s locally stored wildlife health datasets, a critical first step to migrating them to new laboratory- and public-facing data systems, such as the Wildlife Health Information Sharing Partnership event reporting system. To accomplish this, we developed a systematic, standardized approach for collaborating with laboratory scientists to locate, define, and classify their long-term datasets so that they can be cleansed, archived, and mapped to new...
As one of the largest and oldest science organizations in the world, USGS has produced more than a century of earth science data, much of which is currently unavailable to the greater scientific community due to inaccessible or obsolescent media, formats, and technology. Tapping this vast wealth of “dark data” requires 1) a complete inventory of legacy data and 2) methods and tools to effectively evaluate, prioritize, and preserve the data with the greatest potential impact to society. Recognizing these truths and the potential value of legacy data, USGS has been investigating legacy data management and preservation since 2006, including the 2016 “DaR” project, which developed legacy data inventory and evaluation...
In the mid-1800s, tile drains were installed in poorly drained soils of topographic lows as a water-management practice to protect cropland during wet conditions; consequently, estimations of tile-drain location have been based on soil series. Most tile drains are in the Midwest; however, each state has farms with tile drains, and tile-drain density has increased in the last decade. Where tile drains quickly remove water from fields, groundwater and stream water interaction can change, affecting water availability and flooding. Nutrients and sediment can quickly travel to streams through tile, contributing to harmful algal blooms and hypoxia in large water bodies. Tile drains are below the soil surface, about 1 m deep, but their location...