IDIES Data Center

High-value Datasets

The data center is the core of the IDIES research infrastructure. Its state-of-the-art hardware is optimized for AI/ML workloads and is designed to give researchers a seamless experience when working with large scientific datasets.

Our resources pair this hardware with the technical expertise of our support staff and research software engineers, and with decades of experience gained through pioneering work on high-profile data projects such as the Sloan Digital Sky Survey (SDSS) and the Johns Hopkins Turbulence Database (JHTDB); case studies on both projects appear below. As a result, our resources are uniquely equipped for the deposition, curation, and dissemination of datasets of all sizes.

Case Study: The Sloan Digital Sky Survey

Case Study: SciServer and Johns Hopkins Turbulence Database (JHTDB)

SciServer is an NSF-funded project (NSF ACI-1261715 and CSSI-3211791) that supports collaborative, data-intensive research, and the datasets it supports are hosted in the IDIES Data Center. The data center offers extensive resources for data storage, covering both databases and file systems, and for computational processing.

Across the storage systems that support file storage, databases, logs, user storage, and working compute space, approximately 6 PB is available for SciServer operations and for the multiple science domains that it supports. This is in addition to several PB of storage for the group’s traditional astronomy datasets.

SciServer has 15 compute servers, each with 48 to 64 compute cores and 256 GB to 1 TB of RAM, for a total of just under 900 processing cores and just under 8 TB of RAM. SciServer supports a large number of data-intensive projects from many disciplines, with several petabytes of data in active use.

Other University Resources

Data-driven HPC—Advanced Research Computing at Hopkins (ARCH)

The mission of the Advanced Research Computing center at Hopkins (ARCH, https://www.arch.jhu.edu/about-arch/), previously the Maryland Advanced Research Computing Center (MARCC), is to enable research, creative undertakings, and learning that rely on the use and development of advanced computing.

ARCH is a shared computing facility at Johns Hopkins University. It administers state-of-the-art high-performance computing resources, manages highly reliable data storage, and provides collaborative scientific support to empower computational research, scholarship, and innovation.

ARCH provides us with potential access to 23,000 cores and over 1.4 PFlops. The system uses an FDR-14 InfiniBand interconnect and includes Dell PowerEdge GPU nodes along with dual Intel Xeon servers. As with SciServer, everyone on our team has worked on MARCC (now ARCH), so it gives us a common framework for collaboration.

NSF Grant Information

Renovations funded by an NSF ARRA grant (OCI-0963185), “Advanced CyberInfrastructure for High Performance, Data Intensive Computing,” created a flexible, stable environment for a high density of computing equipment and petabyte-scale storage to support data-intensive research.

An NSF STCI grant (OCI-1137045), “Collaborative Research: 100G Connectivity for Data-Intensive Computing at JHU,” provides a 100G connection through the Mid-Atlantic Crossroads (MAX) to Pittsburgh and then to StarLight in Chicago.

IDIES operates with support from:

JHU DSAI
NSF
NASA
NIH
Alfred P. Sloan Foundation
Gordon and Betty Moore Foundation
John Templeton Foundation
W. M. Keck Foundation
Intel
Microsoft
Nokia
NVIDIA