We recently reported that Carol Song of Purdue University and four co-PIs, including Jack Smith of Marshall University, were awarded a five-year, $4.5 million grant from the National Science Foundation. Both Song and Smith participated in SGCI’s Science Gateways Bootcamps while working on separate gateway projects.
The grant will allow the team to build a “plug and play” platform, called GeoEDF, that will give researchers the ability to easily access and process geospatial data.
The article below, which was published by Science Node on January 14, 2019, provides more detail about the important work being done by the GeoEDF team.
From agriculture to clean water, geographic data can solve a lot of problems, if only we could manage it better.
What if a farmer could scan a soybean leaf and discover that her crop isn’t flourishing because the soil needs more potassium?
What about a scientist in Appalachia accessing sensor data from a mountain stream that indicates it’s contaminated with coal ash—and alerting local householders before they get sick?
That, and other feats, is what a new NSF-funded project hopes to make possible. Led by computer scientist Carol Song of Purdue University, the GeoEDF platform brings together many different types of geospatial data. This one-stop shop will help scientists expand their research and public officials fine-tune policies.
“We want to help with decision making,” says Song. “To let people compare the data and be able to say, ‘If we do this, this might happen. If we do that, that might happen.’”
Geospatial data is any information that has a geographic aspect. This includes obvious examples like maps that show roads, rivers, and boundaries, but also measurements from sensors (such as seismometers or water quality meters) that include a geographic location.
Government agencies like NOAA, NASA, and the USGS already collect valuable information such as satellite remote sensing, land elevation, census, agricultural, economic, and other data. Citizens contribute data too, through citizen science platforms or geotagged tweets.
Such a wealth of information could help scientists solve a lot of problems. But because there are so many different types of data, it can be hard to collect, compare, and share it all.
“We're trying to help solve new problems,” says Song. “Right now, people working with these very diverse data sets have an army of students that collect relevant data from different repositories and then massage it into something actually usable. It's a long process—sometimes six months just wrangling the data. We want to speed that up.”
Reaching that speed requires computers that can handle the demand for all that data.
GeoEDF is an expansion of the existing MyGeoHub science gateway, which draws on local computing resources at Purdue but is also connected to supercomputers like Comet and Stampede through XSEDE (Extreme Science and Engineering Discovery Environment).
As the project scales up with the new GeoEDF platform, the developers will need to rely even more on XSEDE. “We are using Jetstream to do our development and test our hypotheses,” Song says. “To find out what we need and how we need it to work.”
One GeoEDF project will aggregate data from multispectral handheld scanners that diagnose plant health and provide county- and state-wide estimates of expected crop performance. Courtesy Purdue University.
Song’s team also depends on the open-source HUBzero platform, which supplies the gateway’s infrastructure—from hosting the website to providing user-oriented elements such as discussion forums and collaborative features for sharing tools and datasets.
But ultimately, science is what this project’s all about. Even in her early work as a graduate student at the National Center for Supercomputing Applications (NCSA), Song was drawn to working with applications that could bring computation to domain scientists.
“Although I’m not an expert in any of the domains I’m working with, I’ve learned enough to appreciate their problems and to see how what we do can directly impact them,” says Song. “That’s where I get my sense of accomplishment and satisfaction.”
Song provided a sneak peek at what some of those science partners are up to:
Data alone can’t solve our problems. But we live in a world in which data is streaming from every corner: from every back alley, mountain stream, and soybean field. It’s up to us to take advantage of the available data and transform it into safer homes, drier basements, stronger crops, and clean, drinkable water.