Director’s Blog 5/11/19

By Ian Billick, PhD

If you have any big picture ideas related to science and big data, related to RMBL or not, that you want to share, shoot them to me. I am spending the week of May 20th in DC participating in an NSF Ideas Lab centered around informatics. Several years ago the National Science Foundation released “10 Big Ideas”, one of which is “Harnessing the Data Revolution (HDR)”. The goal is to enable society to harness the explosion in data to enhance decision‐making and prediction across a range of topics, from human disease to environmental disasters. NSF is spending approximately $30 million to build advanced cyber infrastructure to accelerate data intensive research, improve mechanisms to train people to work effectively within emerging cyber infrastructure, and support research involving cyber infrastructure.

The concept behind an “Ideas Lab” is an interesting one, and a bit of an unusual approach for NSF. Traditionally they “invite proposals” and decide who to fund. With an Ideas Lab they “invite people” (and eventually proposals). They are running four Ideas Labs around data, each of which involves approximately 30 participants and 5 mentors from a wide range of disciplines, including physics, chemistry, and molecular biology. Over the course of the week there are multiple rounds of brainstorming, with the goal of creating smaller teams that generate pre‐proposals. They hope that this approach is more likely to lead to bigger ideas that cross disciplines. Think “Shark Tank” for scientists.

With your input, RMBL has a nice list of data initiatives, including archiving and documenting valuable datasets, promoting collaboration through easier integration of different data sources, improving discovery of data resources (including the weather station data), generating “community” datasets (e.g., spatially explicit predictions of snowmelt data), and building a centralized research administration portal. A centralized portal could reduce the time you spend on RMBL administrative tasks, freeing up more time for fun things! Done well, such a portal could also increase our capture rate of data and metadata, and facilitate integration of different datasets. I feel pretty good that we have a productive 5 years ahead of us, assuming we can raise the funds to support the initiatives.

But can we think any bigger? Here are a few vague ideas/observations I’m kicking around to see if they jell into something more concrete, and possibly to get your creative juices flowing! In no particular order:

1. As the diversity of sciences at RMBL increases, I’m fascinated by differences in approaches to fundamental concepts, including the relative importance of observational and experimental data, hypothesis testing versus model building versus parameter estimation, and replication and inference.
2. The application of machine learning to field research is taking off quickly.
3. Access to massive amounts of highly precise, spatially explicit data seems like it could really change ecology and evolutionary biology.
4. The limits of data mining in the absence of mechanistic/theoretical models seem to be a recurring theme popping up across disciplines.
5. Can predictive techniques successfully developed in the context of complex sociopolitical situations be applied to environmental processes?
6. Can the advances in decision‐making theory be applied to, and improve, environmental decision‐making?
7. Technology is quickly changing the volume and quality of data field scientists have access to, for better or worse.

If you are curious, ask me this summer about the Ideas Lab!

In the meantime, if you have some big ideas around the “data revolution”, or even just your own set of vague observations, that you want to share, you know how to find me!