REU Site: Collaborative Research: BigDataX: From theory to practice in Big Data computing at eXtreme scales

NSF Award CCF-1757964; $333,106 (Collaborative Total: $368K); February 2018 through January 2021. This project is in collaboration with Ioan Raicu at IIT as well as Kyle Chard and Aaron Elmore at the University of Chicago.
This project extends the BigDataX program, a Research Experiences for Undergraduates (REU) site at Illinois Institute of Technology (IIT) and the University of Chicago (UChicago). The BigDataX site focuses on undergraduate research in both the theory and practice of big data computing at extreme scales. BigDataX includes a diverse group of 10 undergraduate students, who will conduct research over a 10-week period on big data and how it will impact the design, analysis, and implementation of run-time systems and storage systems that support big data applications. This research will make extreme-scale computing more tractable, touching every branch of computing across high-end computing and datacenters. These advancements will impact scientific discovery and economic development at the national level, and they will strengthen a wide range of research activities by enabling efficient access, processing, storage, and sharing of valuable scientific data from many disciplines. The proposed work will place students in the middle of a technological revolution that will transform the computing domain.
The primary objective of this proposal is to promote a data-centric view of scientific and technical computing, at the intersection of distributed systems theory and practice. This proposal includes 6 mentors with a variety of complementary expertise, from theory to systems, distributed systems, storage management, and data science. The mentors have extensive experience in mentoring undergraduate students: collectively, they have worked with over one hundred students in research activities and have published dozens of peer-reviewed workshop, conference, and journal papers at top venues together with undergraduate students. Students in the BigDataX program will be exposed to big data applications, large data sets, and various distributed systems such as the Mystic reconfigurable testbed at IIT, the Chameleon cloud and Theta supercomputer at Argonne National Laboratory, the XSEDE national cyberinfrastructure, and the Amazon Web Services cloud.