Philanthropic initiative Schmidt Futures is sponsoring a National Academies study that will explore the potential for artificial intelligence to control complex scientific workflows. Stu Feldman, the initiative’s chief scientist, says he hopes federal agencies will pay attention.
(Image credit – National Astronomical Observatory of Japan)
A new National Academies study committee is gearing up to explore how artificial intelligence (AI) might move beyond emerging applications in analyzing scientific data to tackle higher-level decisions involved in the design and direction of complex experiments. The committee is chaired by Daniel Atkins, a professor at the University of Michigan who was the first director of the National Science Foundation’s cyberinfrastructure office.
Titled “Realizing Opportunities for Advanced and Automated Workflows in Scientific Research,” the study is sponsored by Schmidt Futures, a philanthropic initiative founded by former Google CEO Eric Schmidt and his wife Wendy Schmidt. It is the latest in a set of recent, privately backed National Academies projects to explore fundamental questions of scientific methodology and practice. Last year, the Academies completed a study on open science sponsored by the Laura and John Arnold Foundation. And earlier this year, the Academies released an NSF-sponsored study on reproducibility and replicability that received additional funding from the Sloan Foundation and was chaired by the president of the Gordon and Betty Moore Foundation.
Schmidt Futures Chief Scientist Stu Feldman said at the new study’s kickoff meeting last month that the idea for it derived from the initiative’s “AI Accelerator” program, which aims to develop a “new way of conducting science.” Feldman told the study committee he regards its task as dovetailing with the drive toward open science and bolstering reproducibility and replicability. He stressed, though, that it is up to the committee to interpret its charge and conduct its work as it sees fit.
Schmidt Futures sees AI as transforming experimental process
Speaking to FYI about Schmidt Futures’ interest in AI-driven science, Feldman noted that he takes the term to connote a range of computational methods that includes machine learning, pattern matching, simulation, and Bayesian optimization. He said the initiative has sought to back research projects that use these tools in groundbreaking ways.
“I’ve noticed that very first-class scientists have been utilizing this not just as a tool to get a particular step done — image analysis, today if you don’t use machine learning you’re doing it wrong in practice — but also using it to derive and guide the overall experiment plan,” he said.
According to Feldman, AI techniques make it possible to use more sophisticated mathematics than standard statistical analysis, enabling thousands of experiments or observations to be rapidly performed and analyzed. He remarked, “In some of the examples I’ve seen, realistically a university-sized research lab might be able to out-produce a classical industrial lab operating in the old, more manual approach, and might be able to get radically better and more interesting answers.”
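The kind of closed-loop experiment steering Feldman describes can be made concrete with a toy sketch. The code below is purely illustrative and is not drawn from any of the funded projects: it uses a simple upper-confidence-bound rule (one basic flavor of the Bayesian-style optimization Feldman mentions) to decide which experimental condition to run next based on the data gathered so far, so the campaign concentrates its budget on the most promising condition.

```python
import math
import random

def run_campaign(measure, conditions, budget, c=1.0, seed=0):
    """Closed-loop experiment selection with an upper-confidence-bound rule:
    each round, run the condition whose plausible yield is highest given
    the measurements gathered so far."""
    rng = random.Random(seed)
    counts = {k: 0 for k in conditions}   # experiments run per condition
    means = {k: 0.0 for k in conditions}  # running mean yield per condition
    for t in range(1, budget + 1):
        untried = [k for k in conditions if counts[k] == 0]
        if untried:
            # Measure every condition at least once before exploiting.
            k = untried[0]
        else:
            # Estimated yield plus an exploration bonus that shrinks
            # as a condition accumulates measurements.
            k = max(conditions,
                    key=lambda k: means[k] + c * math.sqrt(math.log(t) / counts[k]))
        y = measure(k, rng)
        counts[k] += 1
        means[k] += (y - means[k]) / counts[k]  # incremental mean update
    return counts, means

# Hypothetical noisy "experiment": condition "B" has the highest true yield.
def measure(condition, rng):
    base = {"A": 0.2, "B": 0.9, "C": 0.5}[condition]
    return base + rng.gauss(0, 0.1)

counts, means = run_campaign(measure, ["A", "B", "C"], budget=60)
# Most of the budget ends up on the best condition, "B".
```

Real experiment-steering systems use far richer surrogate models, but the structure is the same: measurements feed a model, and the model chooses the next measurement.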
Schmidt Futures is currently funding six projects in this mold that span the sciences. Feldman pointed as an example to a project that is using AI to develop and test ways of incorporating the outputs of high-resolution cloud models into global climate models, which cannot simulate detailed cloud behavior. Tapio Schneider, the lead investigator on the project and a professor at Caltech, is a member of the National Academies study committee.
As another example, Feldman pointed to a group at Princeton University and Johns Hopkins University working on the Prime Focus Spectrograph, an instrument under development for use at the Subaru Telescope in Hawaii. The spectrograph’s key design feature is its use of 2,400 fibers that can be individually steered to gather photons from different targets for variable lengths of time. This flexibility raises the problem of how to decide when to cease a fiber’s observations of one target and switch to another. Feldman remarked,
The question is, how do I maximize what we’re learning by my choice of exposures? And that is an intriguing philosophic question. It’s a much more intriguing mathematical question: how do you formulate anything that looks like that? And then how do you handle the squabbles of all of the observers who each want their piece of the work? Because of course everybody has their own short-term goals. It’s a fascinating overlay on an absolutely magnificent instrument.
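The exposure-allocation problem Feldman poses can be sketched in miniature. The following is a hypothetical toy, not the actual Prime Focus Spectrograph scheduler: it models each target’s science return as growing with the square root of accumulated exposure time (so returns diminish the longer a fiber stays put) and, at each timestep, greedily points each fiber at the target with the largest marginal gain. A real scheduler would also have to arbitrate the competing observer priorities Feldman mentions; here each target’s priority is collapsed into a single weight.

```python
import heapq
import math

def allocate_exposures(targets, fibers, timesteps):
    """Toy greedy scheduler for a multi-fiber spectrograph.

    `targets` maps target name -> priority weight. A target's return is
    modeled as priority * sqrt(exposure time), so the marginal value of
    one more timestep shrinks as exposure accumulates, which is what
    eventually makes it worth switching a fiber to another target.
    """
    exposure = {name: 0 for name in targets}

    def marginal_gain(name):
        p, t = targets[name], exposure[name]
        return p * (math.sqrt(t + 1) - math.sqrt(t))

    schedule = []
    for _ in range(timesteps):
        # Point each fiber at a distinct target with the best marginal gain.
        chosen = heapq.nlargest(fibers, targets, key=marginal_gain)
        for name in chosen:
            exposure[name] += 1
        schedule.append(chosen)
    return exposure, schedule

# Hypothetical priorities: 2 fibers shared among 3 targets over 10 steps.
targets = {"quasar": 3.0, "galaxy": 2.0, "star": 1.0}
exposure, schedule = allocate_exposures(targets, fibers=2, timesteps=10)
```

Even this crude model reproduces the qualitative behavior Feldman describes: high-priority targets dominate early, but as their returns flatten, the scheduler diverts fibers to lower-priority targets.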
Feldman hopes workflow control gains more attention
Feldman said he has also been struck by the scientific potential of workflow engines, which are software packages that assist in the management of business and administrative processes. He noted such engines are often used in clinical trials, which demand thorough documentation of procedures and data.
Observing that workflow engines have not yet been widely adopted across science, Feldman said, “Most scientists are not so rigid, and they’re much more opportunistic in their studies, very reasonably since there’s a huge discovery component.” He asserted, though, that increasing adoption of electronic laboratory notebooks, a growing emphasis on open science, and the potential for leaps in productivity all hint that automated workflows have strong prospects.
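Feldman’s point about workflow engines explicitly defining and recording procedures can be illustrated with a minimal sketch. This is an illustration of the general idea, not any particular engine’s API: steps declare the steps they depend on, the engine resolves the order automatically, and every execution is timestamped in a provenance log, which is what makes a run auditable and replayable.

```python
from datetime import datetime, timezone

class Workflow:
    """Minimal workflow-engine sketch: steps declare dependencies, and
    every run is recorded so the procedure can be audited and replayed."""

    def __init__(self):
        self.steps = {}  # step name -> (function, list of dependency names)
        self.log = []    # provenance record: (timestamp, step name, result)

    def step(self, name, depends_on=()):
        def register(fn):
            self.steps[name] = (fn, list(depends_on))
            return fn
        return register

    def run(self, name, results=None):
        """Run `name` after recursively running its dependencies."""
        results = {} if results is None else results
        if name in results:
            return results[name]
        fn, deps = self.steps[name]
        inputs = [self.run(d, results) for d in deps]
        results[name] = fn(*inputs)
        self.log.append(
            (datetime.now(timezone.utc).isoformat(), name, results[name]))
        return results[name]

# Hypothetical two-step pipeline: acquire raw data, then calibrate it.
wf = Workflow()

@wf.step("acquire")
def acquire():
    return [1, 2, 3]                 # stand-in for raw instrument readings

@wf.step("calibrate", depends_on=("acquire",))
def calibrate(raw):
    return [2 * x for x in raw]      # stand-in for a calibration step

result = wf.run("calibrate")         # runs "acquire" first, then "calibrate"
```

Production engines add scheduling, retries, and distributed execution, but the core appeal for science is visible even here: the procedure and its execution history are explicit data rather than something living in a researcher’s head.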
Concerning why Schmidt Futures has turned to the National Academies, Feldman said he sees value in the views of independent experts, and that if they agree automated workflows are a “big deal,” then they can recommend steps to push them forward in the U.S. and beyond. He noted he is a member of the National Academies Board on Research Data and Information and that the subject seemed “utterly consistent” with the board’s focus on open data and open science practices. He said the premium that workflows place on explicitly defining and recording procedures makes them well suited to the requirements of open science and facilitates experimental replication, though he noted they can be used in proprietary research as well.
Feldman also said he regards it as important to center the study in Washington, D.C., in order to better engage the interest of federal agencies that fund science. He argued that efforts such as NSF’s “Harnessing the Data Revolution” initiative remain focused on the analysis of large volumes of data, which he suggested is also true of the large data-processing initiatives associated with projects such as the Large Synoptic Survey Telescope currently under construction in Chile.
The ultimate objective, Feldman stressed, is to have AI use experimental data to make decisions about how to steer the workflow of the experiment. He said,
I’m hoping there will be a lot of evidence that this is a coming approach that radically improves scientific efficiency, that radically improves the scale and depth of experiments you can do, while simultaneously permitting the recording of far more metadata that will permit replication. … And I hope that some of the agencies will conclude this would be a direction that they should be actively supporting.