Data Science for All

Prabhakar leads Purdue's Integrative Data Science Initiative

BY LESA PETERSEN

Professor of Computer Science Sunil Prabhakar has been appointed to lead and amplify the impact of data science teaching and research initiatives at Purdue through his new role as the inaugural director of the Integrative Data Science Initiative (IDSI). The IDSI represents Purdue’s push to put data science at the center of education and discovery.

On the education side, the IDSI will guide curricula for majors related to data science — and set data science literacy goals for all undergraduate Purdue students, regardless of major. As part of these goals, the existing Statistics Living-Learning Community, directed by Professor of Statistics Mark Ward, will evolve to become the Data Mine — creating opportunities for students to gain data science expertise.

On the research side, Purdue’s Discovery Park is providing internal funding for data science research projects that make advances in pressing and socially relevant issues such as health care, defense, ethics and public policy. Prabhakar will identify collaborations across the campus, help secure and allocate resources, and provide a data science research hub for the University.

Prabhakar, long-standing head of the Department of Computer Science, will also work to build private and public partnerships across the globe. He’s excited to lead an initiative that will have a profound impact on the economy and on the lives of many Purdue students. “Purdue has all the right people and experience to impact this emerging field,” he says.



DRIVING DATA SCIENCE

College of Science researchers are the driving force behind advances in data science.


He coined the phrase

Bill Cleveland, the Shanti S. Gupta Distinguished Professor of Statistics, is credited with being the first researcher to define “data science” as it is used today, in a talk at the 1999 meeting of the International Statistical Institute and in a 2001 paper in the ISI Review, “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics.” In 2009, Cleveland and his colleagues were among the first to figure out how to analyze big data. They launched the Divide & Recombine (D&R) approach to big data and began writing the DeltaRho software for D&R.

Machine learning gets real

Associate Professor Jennifer Neville, the Miller Family Chair of Computer Science and Statistics, develops robust and efficient machine-learning algorithms for use in applications that involve network data — to solve real-world problems such as fraud detection and network security.

Hope for being safer on the road

Through a data science research project within the Statistics Living-Learning Community (soon to become the Data Mine), Hope Cullers, a double major in mathematics and applied statistics, contributed to building a road safety classification scale under the supervision of Associate Professor Michael Baldwin in the Department of Earth, Atmospheric, and Planetary Sciences. The team is using machine learning to predict which weather conditions have the greatest impact on road conditions.

New paradigms for network analysis

Patrick J. Wolfe, the Frederick L. Hovde Dean of Science and Miller Family Professor of Statistics and Computer Science, focuses on the study of networks as statistical data objects — to understand rapidly growing and increasingly complex large networks that are pervasive and fundamental in science and society. Wolfe explores new paradigms for network analysis and new statistical theories that aim to keep pace with big network data.

A seismic shift in data-driven sensing

Assistant Professor of Computer Science Bruno Ribeiro is working to revolutionize industrial and environmental chemical sensing by incorporating fundamental physics principles into data-driven deep learning models. By combining real-world data with physics knowledge, Ribeiro aims to make chemical sensing much faster. These new techniques are already finding applications in environmental monitoring and precision agriculture projects of the Wabash Heartland Innovation Network (WHIN), using biodegradable, inexpensive, low-power chemical sensors from Purdue’s SMART Film consortium.