Scientific applications running on High End Computing (HEC) platforms can generate large volumes of output data. As these platforms grow to Peta-Scale and beyond, fast write and read access to massive data sets is becoming increasingly important, both to speed up the simulations and to accelerate exploration of the data. Efficient access to, understanding of, and management of the voluminous and complex data generated by scientific simulations present daunting challenges to both computational and computer scientists.

In this project, we aim to enhance the performance and flexibility of the current parallel I/O software stack and to create an end-to-end solution for scientific data management. Our research is based on the I/O and data management requirements of production Peta-Scale scientific simulation codes such as GTC, S3D, and CHIMERA, and it addresses the challenges of storing, accessing, moving, and managing data at Peta-Scale and beyond.