In situ infrastructure, data analysis, visualization, and compression for exascale science
ALPINE/ZFP is a project within the Exascale Computing Project (ECP). Data analysis and visualization are widely recognized as critical components of science, enabling the understanding of complex simulations and experiments, the debugging of simulation codes, and the presentation of scientific insights. Through strategic investments and the work of laboratory, industry, and university participants, the Department of Energy (DOE) has been a leader in basic research, software development, and product deployment in this area over the past two decades. ALPINE/ZFP leverages and advances these investments by delivering exascale-ready in situ data analysis, visualization, and compression capabilities to the Exascale Computing Project.
The ALPINE/ZFP project addresses key development needs for Exascale science applications:
- Deliver exascale visualization and analysis algorithms that will be critical for ECP Applications as the dominant analysis paradigm shifts from post hoc (post-processing) to in situ (processing data in a code as it is generated).
- Deliver exascale-capable infrastructure for the development of in situ algorithms and deployment into existing applications, libraries, and tools.
- Deliver lossy compression through zfp — an open source library for compressed floating-point arrays that supports very high throughput read and write random access.
- Engage with ECP Science Applications to integrate our algorithms and infrastructure into their software.
- Engage with ECP Software Technologies to integrate their exascale software into our infrastructure.
The ALPINE team is addressing two problems related to Exascale processing: (1) delivering infrastructure and (2) delivering performant in situ algorithms. The challenge is that our existing infrastructure tools need to be made Exascale-ready in order to achieve performance within simulation codes’ time budgets, support many-core architectures, scale to massive concurrency, and leverage deep memory hierarchies. The challenge for in situ algorithms is to apply in situ processing effectively without a human in the loop. This means that we must have adaptive approaches to automate saving the correct visualizations and data extracts.
Our capability will leverage existing, successful software, ParaView and VisIt, including their in situ libraries Catalyst and LibSim, by integrating and augmenting them to address the challenges of exascale. Both software projects represent long-term DOE investments and are widely used for large-scale visualization and analysis within the DOE Office of Science (SC) and the DOE National Nuclear Security Administration (NNSA). ALPINE is also developing an additional in situ framework, Ascent. Ascent is a lightweight infrastructure solution, meaning that it is focused on a streamlined API, a minimal memory footprint, and a small binary size. Our solution strategy is two-fold, in response to our two major challenges: infrastructure and algorithms.
For infrastructure, we have developed a layer on top of the VTK-m library for ALPINE algorithms. This layer is where all ALPINE algorithms will be implemented, and it is deployed in ParaView, VisIt, and Ascent. Thus all development effort by ALPINE will be available in all of our tools. Further, by leveraging VTK-m, we will be addressing issues with many-core architectures.
Currently, many high-performance simulation codes rely on post hoc processing. Given exascale I/O and storage constraints, in situ processing will be necessary. In situ data analysis and visualization selects, analyzes, reduces, and generates extracts from scientific simulation results during the simulation runs to overcome the bandwidth and storage bottlenecks associated with writing full simulation results to disk. Current ALPINE algorithm development includes: Lagrangian flow analysis; topological analysis; adaptive data-driven sampling approaches; rotationally invariant pattern detection; task-based feature detection; and statistical feature detection and characterization.
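To make the data-driven sampling idea concrete, the following is a minimal sketch, not one of ALPINE's actual algorithms: it keeps every sample whose value deviates strongly from the global mean (a crude stand-in for "interesting" features) and retains only a small uniform fraction of the remaining samples. The function name, threshold, and base rate are all illustrative assumptions.

```python
import random

def importance_sample(values, threshold, base_rate=0.05, seed=0):
    """Toy data-driven sampling: keep every sample whose deviation from
    the global mean exceeds `threshold`; keep a small uniform fraction
    (`base_rate`) of the rest so the background is still represented.
    Illustrative only -- not an ALPINE algorithm."""
    rng = random.Random(seed)
    mean = sum(values) / len(values)
    kept = []
    for i, v in enumerate(values):
        if abs(v - mean) > threshold or rng.random() < base_rate:
            kept.append((i, v))  # store index so spatial position survives
    return kept

# A field that is mostly flat with one sharp feature.
field = [0.0] * 90 + [5.0] * 10
subset = importance_sample(field, threshold=1.0)
# Every feature point survives; the flat background is heavily thinned.
feature_points = [v for _, v in subset if v == 5.0]
```

A real in situ sampler would use local criteria (gradients, histograms, topology) rather than a global mean, but the structure is the same: a cheap per-sample importance test that decides what is worth saving without a human in the loop.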
The zfp project is addressing one of the primary challenges for exascale computing: overcoming the performance cost of data movement. Far more data is being generated than can reasonably be stored to disk and later analyzed without some form of data reduction. Moreover, with deepening memory hierarchies and dwindling per-core memory bandwidth due to increasing parallelism, even on-node data motion between RAM and registers is a significant performance bottleneck and a primary source of power consumption.
zfp is a floating-point array primitive that mitigates this problem using very high-speed, lossy (but optionally error-bounded) compression to significantly reduce data volumes. zfp reduces I/O time and off-line storage requirements by 1–2 orders of magnitude, depending on accuracy requirements as dictated by user-set error tolerances. Unique among data compressors, zfp also supports constant-time read/write random access to individual array elements from compressed storage. zfp's compressed arrays can often replace conventional arrays in existing applications with minimal code changes. This allows the user to store tables of floating-point data in compressed form that otherwise would not fit in memory, given either a desired memory footprint or a prescribed level of accuracy. When used in numerical computations, zfp arrays provide a fine-grained knob on precision while achieving accuracy comparable to IEEE floating point at half the storage, reducing both memory usage and bandwidth.
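The two properties described above, a user-set error bound and random access to individual elements of compressed storage, can be illustrated with a deliberately simple sketch. This is plain uniform quantization, not zfp's actual transform-based codec, and all names here are hypothetical; it only demonstrates the contract: every reconstructed value stays within the tolerance, and any single element can be decoded without touching the rest.

```python
def compress(block, tolerance):
    """Quantize floats with step 2*tolerance, so each value is
    reconstructed to within `tolerance`. A toy illustration of
    error-bounded lossy compression -- NOT zfp's codec."""
    step = 2.0 * tolerance
    return [round(v / step) for v in block]

def decompress(quantized, tolerance):
    """Invert the quantization; works on any subset of elements."""
    step = 2.0 * tolerance
    return [q * step for q in quantized]

data = [0.1 * k for k in range(16)]
q = compress(data, tolerance=0.05)
out = decompress(q, tolerance=0.05)
max_err = max(abs(a - b) for a, b in zip(data, out))

# Random access: element 7 is decoded alone, without decompressing
# the rest of the array -- the property zfp provides in constant time.
elem7 = decompress([q[7]], tolerance=0.05)[0]
```

zfp itself achieves far better compression by transforming 4^d blocks of values and encoding the transform coefficients, but the user-facing guarantee, error bounded by a tolerance with per-element access, is the same shape as this sketch.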
This project is extending zfp to make it more readily usable in an exascale computing setting by parallelizing it on both the CPU and GPU while ensuring thread safety; by providing bindings for several programming languages (C, C++, Fortran, Python); by adding new functionality, e.g., for unstructured data and spatially adaptive compressed arrays; by hardening the software and adopting best practices for software development; and by integrating zfp with a variety of ECP applications, I/O libraries, and visualization and data analysis tools.
For more information and code releases, please see: