Jupyter Community Workshop: Jupyter for Scientific User Facilities and High-Performance Computing

Rollin Thomas
Published in Jupyter Blog
Jan 29, 2019


We are excited to share more news about the Jupyter Community Workshop for Scientific User Facilities and High-Performance Computing! This is part of a series of Jupyter Community Workshops funded by Bloomberg to “bring together small groups of Jupyter community members and core contributors for high-impact strategic work and community engagement on focused topics.”

This workshop will be held in Berkeley, California, from Tuesday, June 11 to Thursday, June 13, 2019. The workshop is jointly hosted by the National Energy Research Scientific Computing Center (NERSC, part of Lawrence Berkeley National Laboratory) and the Berkeley Institute for Data Science (BIDS, at the University of California, Berkeley). The workshop executive committee consists of Rollin Thomas (NERSC), Dan Allan (Brookhaven National Laboratory), and Chris Holdgraf (BIDS, UC Berkeley).

We, the organizers, invite your expression of interest in the effort through this Google form. Let us know whether you might be interested in attending, would like to be kept in the loop, or simply want to express your support. Finding out who is doing what with Jupyter in this space is the first step in building our community.

Why? Advances in technology at experimental and observational science facilities (EOS facilities: telescopes, particle accelerators, light sources, genome sequencers and so on), in robust high-bandwidth global networks, and in high-performance computing (HPC) have resulted in an exponential growth of data for scientists to collect, manage, and understand. Interpreting these data streams requires computational and storage resources greatly exceeding those available on laptops, workstations, or university department clusters. Funding agencies increasingly look to HPC centers to address the growing and changing data needs of their scientists. These institutions are uniquely equipped to provide the resources needed for extreme scale science. At the same time, scientists seek new ways to seamlessly and transparently integrate HPC into their EOS workflows. That’s where we see Jupyter fitting in.

We know that scientists love Jupyter because it combines visualization, data analytics, text, and code into a document they can share, modify, and even publish. What about using Jupyter to control experiments in real time, or steer complex simulations on a supercomputer, or even connect experiments to HPC for real-time feedback and decision making? How can users reach outside the notebook to corral external data and computational resources in a seamless, Jupyter-friendly manner?
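To make that last question concrete, here is a minimal sketch of what "reaching outside the notebook" often looks like today: submitting a batch job to the HPC system from a notebook cell and polling until it finishes. This assumes a Slurm-managed cluster with the standard sbatch and squeue command-line tools on the path; the batch script name is a hypothetical placeholder. It works, but it is exactly the kind of manual glue that better Jupyter-native tooling could replace.

```python
# A sketch of manual notebook-to-HPC glue on a Slurm cluster.
# Assumes sbatch/squeue are available; "analyze_run.sh" is a placeholder.
import subprocess
import time

def submit_job(script="analyze_run.sh"):
    """Submit a Slurm batch script and return its job ID."""
    out = subprocess.run(
        ["sbatch", "--parsable", script],
        capture_output=True, text=True, check=True,
    )
    # --parsable prints "jobid" or "jobid;cluster"
    return out.stdout.strip().split(";")[0]

def wait_for(job_id, poll_seconds=30):
    """Poll the queue until the job is no longer pending or running."""
    while True:
        out = subprocess.run(
            ["squeue", "--noheader", "--job", job_id],
            capture_output=True, text=True,
        )
        if not out.stdout.strip():  # job has left the queue
            return
        time.sleep(poll_seconds)

job_id = submit_job()
wait_for(job_id)
# ...then load the job's output files and visualize them in later cells...
```

Every step here (submission, polling, staging results back) happens outside Jupyter's awareness, which is precisely the gap between notebooks and HPC resources that this workshop aims to close.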

These were the questions on our minds when we proposed a three-day workshop for Jupyter developers, HPC engineers, and staff from EOS facilities. We are looking to foster a new collaborative community that can make Jupyter the pre-eminent interface for managing EOS workflows and data analytics at HPC centers. EOS scientists need Jupyter to work well at their facilities and HPC centers, and this workshop will help us address the technical, sociological, and policy challenges involved.

The workshop itself will include presentations, posters, and a couple of half-day hackathon/breakout sessions for collaboration. We will identify best practices, share lessons learned, clarify gaps and challenges in supporting deployments, and work on new tools to make Jupyter easier to use for big science.

During the workshop, participants will be invited to collaborate on a survey white paper that documents the current state of the art in Jupyter deployments at various facilities and HPC centers. The document will include deployment descriptions, maintenance and user support strategies, security discussions, use cases, and lessons learned. A forward-looking summary at the end of the white paper will tie together common threads across facilities and highlight areas for future research, development, and implementation. We aim to have the paper completed and published to arXiv within three months of the end of the workshop. Gathering these ideas in a single document should help developers, maintainers, and researchers make the case to management and policymakers for driving the effort forward.

So let us know if you're interested in the effort by filling out the Google form, even if you think you can't make it. Part of what we're doing is finding out who is doing what with Jupyter, and where, in HPC and EOS. That's the real first step in building our community!
