Exascale and Extreme Data Science at NERSC

Abstract

NERSC’s primary mission is to accelerate scientific discovery at the DOE Office of Science through high performance computing and data analysis. NERSC supports the largest and most diverse research community of any computing facility within the DOE complex, providing large-scale, state-of-the-art computing for DOE’s unclassified research programs in alternative energy sources, climate change, environmental science, materials research, astrophysics and other science areas related to DOE’s science mission.

 

NERSC’s next supercomputer, Cori, is being deployed in 2016 in Berkeley Laboratory’s new Computational Research and Theory (CRT) Facility. Cori will include over 9300 manycore Intel Knights Landing processors, which introduce several technological advances, including higher intra-node parallelism; high-bandwidth, on-package memory; and longer hardware vector lengths. These features are expected to yield significant performance improvements for applications running on Cori. To take advantage of them, however, application developers will need to modify their codes, because many of today’s applications are not optimized for the manycore architecture and on-package memory.

 

To help users transition to the new architecture, in 2014 NERSC established the NERSC Exascale Scientific Applications Program (NESAP). Through NESAP, several code projects are collaborating with NERSC, Cray and Intel, and receive access to early hardware, special training and “deep dive” sessions with Intel and Cray staff. Eight of the chosen projects will also work with a postdoctoral researcher to investigate computational science issues associated with manycore systems. The NESAP projects span a range of scientific fields (including astrophysics, genomics, materials science, climate and weather modeling, plasma fusion physics and accelerator science) and represent a significant portion of NERSC’s current and projected computational workload.

 

Cori will include many enhancements to enable a rapidly growing extreme data science workload at NERSC. Cori will have a 1600 Intel® Haswell processor partition with larger memory nodes to enable extreme data analysis. A fast internet connection will let users stream data from experimental and observational facilities directly into the system. A “Burst Buffer”, a 1.5-petabyte layer of NVRAM, will help accelerate I/O. Cori will also include a number of software enhancements to enable complex workflows.

 

For the longer term, we are investigating whether a single system can meet the simulation and data analysis requirements of our users. For example, we are adding a genome assembly miniapp (Meraculous) to our benchmark suite, and we are considering adding one for genome alignment (BLAST). We are also investigating how data-intensive workflows (e.g., cosmology and genomics) differ from our simulation workloads.

 

 

Sudip Dosanjh

Dr. Sudip Dosanjh is Director of the National Energy Research Scientific Computing (NERSC) Center at Lawrence Berkeley National Laboratory. NERSC’s mission is to accelerate scientific discovery at the U.S. Department of Energy’s Office of Science through high performance computing and extreme data analysis. NERSC deploys leading-edge computational and data resources for over 4,500 users from a broad range of disciplines. NERSC will be partnering with computer companies to develop and deploy pre-exascale and exascale systems during the next decade.

 

Previously, Dr. Dosanjh headed extreme-scale computing at Sandia National Laboratories. He was co-director of the Los Alamos/Sandia Alliance for Computing at the Extreme-Scale from 2008 to 2012. He also served on the U.S. Department of Energy’s Exascale Initiative Steering Committee for several years.

 

Dr. Dosanjh had a key role in establishing co-design as a methodology for reaching exascale computing. He has numerous publications on exascale computing, co-design, computer architectures, massively parallel computing and computational science.

Date/Time: 
07/07/2016 - 10:00
Presenter: 
Sudip Dosanjh
Location: 
233-305E