On April 24, the House Science, Space and Technology Committee’s Subcommittees on Research and Technology held a joint hearing to examine big data and the impact it has on multiple scientific fields. Members of the subcommittees heard about advancements in information technology, data analytics, and research and development in public and private sectors. Challenges surrounding data management, privacy, and workforce development were also addressed.
Members on both sides of the aisle were interested in learning more about big data sets that are being generated across a wide variety of industries and sectors. Industry spending on analyzing and managing big data has increased significantly, and there is a projected need for managers, analysts, and other professionals who have skills in processing large data sets. Members also focused their attention on the role that colleges and universities play in educating and training students in big data-related disciplines.
In March 2012, the Obama Administration unveiled its Big Data Research and Development Initiative aimed at improving the tools and techniques used to access, organize and store large data sets. Six agencies are participating in this initiative: the National Science Foundation (NSF), National Institutes of Health (NIH), Department of Defense (DOD), Defense Advanced Research Projects Agency (DARPA), Department of Energy (DOE) and United States Geological Survey (USGS).
Research Subcommittee Chairman Larry Bucshon (R-IN) said that “big data offers a range of opportunities for private industry to reduce costs and increase profitability.” He also highlighted that, using advanced computing power, “universities and federal laboratories can drive new research initiatives that will significantly increase our scientific knowledge base.” Bucshon was interested to hear how industry, academia and government are addressing the shortfall in workforce needs. In addition, he wanted to learn how the agencies involved in the Administration’s Big Data Initiative are coordinating their programs.
Technology Subcommittee Chairman Thomas Massie (R-KY) described some of the ongoing issues surrounding big data, noting that “individual researchers have always been faced with difficult decisions about their data: what to keep, what to toss, what to verify with additional experiments. As our computing power has increased, so has the luxury of storing more data.” He highlighted how “the ability to analyze large amounts of data across multiple networked platforms is also transforming the private sector.”
Research Subcommittee Ranking Member Dan Lipinski (D-IL) focused on opportunities for big data based on the change in the volume and variety of data available as well as how fast the data is being processed. He noted that there are challenges associated with new tools and software packages used to manage and organize big data as he pointed to the need for a workforce with significant analytical skills. “Challenges necessitate involvement from government, academia, and the private sector,” stated Lipinski as he highlighted the government’s Networking and Information Technology Research and Development program as well as the NSF-NIH joint big data grant program.
Technology Subcommittee Ranking Member Frederica Wilson (D-FL) was interested in learning how the federal government can contribute to the creation of new tools and software, and to building the workforce needed to address big data issues. She highlighted the role that data from photos played in the identification of the Boston bombing suspects as she expressed her interest in learning more about big data issues.
Three witnesses testified. David McQueeney, Vice President of Technical Strategy and Worldwide Operations at IBM Research, offered suggestions to the subcommittees regarding the transition from computing based on processor technology to computing based on cognitive systems. He stated that national labs should target modeling, simulations, and analytics research to improve exascale computing systems, and he expressed support for the reauthorization of the High-End Computing Revitalization Act. He also advocated for language to be included in the reauthorization of the Perkins Act that would allow for the restructuring of education programs to align with big data-related workforce needs.
Michael Rappa, Director of the Institute for Advanced Analytics and Professor at North Carolina State University, highlighted the role of his university in training and motivating students to learn skills associated with managing and using big data sets. He noted the national challenges of educating a data-savvy workforce and highlighted the partnerships and programs at his university that provide students with incentives, resources and opportunities for collaboration with industry.
Farnam Jahanian, Assistant Director for the Computer and Information Science and Engineering Directorate at the National Science Foundation, stated that “it is important to emphasize not only the enormous volume of data but also the velocity, heterogeneity, and complexity of data that now confronts us.” He identified four areas for investment in big data challenges: investment in foundational research to advance big data techniques and technology; support for interdisciplinary research communities; investment in education and workforce development; and the development of cyberinfrastructure to capture, manage, analyze and share digital data.
Questions from members demonstrated a common interest in learning about the role of the private sector and of government, managing how data is provided to the public, and increasing student interest in big data analysis. US competitiveness as it relates to the management and use of big data was also a topic of discussion.