Worldwide data volumes are exploding, and islands of storage located far from compute resources will not scale. In scientific research domains such as physics, space sciences, meteorology, genetics and biology, experiments and simulations generate increasingly large data sets. As these grow into the range of exabytes (billions of gigabytes), novel storage, processing, and analytics solutions must be devised so that research can continue to yield insights and innovation.
The SAGE project proposes an advanced object-based storage solution, termed Percipient Storage, with a highly flexible new API that enables applications to sustain exascale I/O loads by exploiting deep I/O hierarchies. The solution will be able to run computations on data in any tier, with a homogeneous view of data throughout the stack. The SAGE architecture reflects the need to reduce data movement in order to improve energy efficiency, as well as the technology trend towards new non-volatile memory technologies. DFKI will work in cooperation with European partners at Allinea, Bull, CCFE, CEA, Diamond Light Source, FSZ Jülich, KTH, and STFC in this Horizon 2020-funded project led by Seagate, to meet the requirements of exascale scientific computing.
In the course of the project, DFKI will integrate the data analytics platform Apache Flink with the storage platform's native object interface and with its support for data-local computations. Apache Flink will thereby benefit from the full performance and feature set of the storage platform, lifting its analytics processing capabilities to the exascale level.
The project has received funding from the European Union’s Horizon 2020 Research & Innovation Programme under grant agreement 671500.
Partners
Seagate, Allinea, Bull, CCFE, CEA, Diamond Light Source, FSZ Jülich, KTH, STFC