Data analysis and visualization are widely recognized as critical components of science, enabling the understanding of complex simulations and experiments, the debugging of simulation codes, and the presentation of scientific insights. Through strategic investments and the work of laboratory, industry, and university participants, many of whom are members of this proposal team, the Department of Energy (DOE) has been a leader in basic research, software development, and product deployment in this area over the past two decades. With this proposal, we will leverage and advance these investments by delivering exascale-ready in situ data analysis and visualization capabilities to the Exascale Computing Project (ECP).
The ALPINE proposal has four major development areas:
- Deliver exascale visualization and analysis algorithms that will be critical for ECP Applications as the dominant analysis paradigm shifts from post hoc (post-processing) to in situ (processing data in a code as it is generated).
- Deliver an exascale-capable infrastructure for the development of in situ algorithms and deployment into existing applications, libraries, and tools.
- Engage with ECP Applications to integrate our algorithms and infrastructure into their software.
- Engage with ECP Software Technologies to integrate their exascale software into our infrastructure.
We believe these focus areas are critically important to the success of the ECP data analysis and visualization effort. Each proposal team member is therefore assigned to two of the four areas: one capability-building area (algorithms or infrastructure) and one integration area (applications or software technologies).
Many high-performance simulation codes currently rely on post hoc processing: they write data to disk and visualize and analyze it afterwards. Given exascale I/O constraints, in situ processing will be necessary. In situ data analysis and visualization selects, analyzes, reduces, and generates extracts from scientific simulation results while the simulation runs, overcoming the bandwidth and storage bottlenecks associated with writing full simulation results to disk. Because in situ processing is still in its “early days” in terms of production usage within the DOE, new types of in situ algorithms are needed. For example, we will develop algorithms that automate which data analysis and visualization routines are applied, and how they are applied, to focus on the most important aspects of the simulation. We will also develop algorithms that transform simulation data in a way that massively reduces its size while preserving the integrity of the underlying information. Our team will deploy these algorithms to ECP application scientists via our infrastructure.
Our capability will leverage existing, successful software, ParaView and VisIt, including their recently developed in situ libraries, Catalyst and Libsim, integrating and augmenting them to address the challenges of exascale. Both software projects represent long-term DOE investments, and they are the two dominant software packages for large-scale visualization and analysis within the DOE Office of Science (SC) and the DOE National Nuclear Security Administration (NNSA). These two products will provide significant coverage of ECP Applications, and we can leverage their existing engagements to provide exascale-level analytics. This strategy also allows us to enlist talented staff across both development teams, including the PIs for ParaView and VisIt, who will also serve as PIs for ALPINE. Overall, we believe that ALPINE is a key element of ECP success, helping to ensure that exascale visualization and analysis needs are met.