AFLUX@JHU: Materials Search-API for the JHU aflow.org Data Repositories

William Shiber

Mentored by Corey Oses

Recent applications of machine learning to materials data repositories have accelerated computational materials design. Machine learning algorithms have enhanced the prediction of materials properties such as thermodynamic stability, potential energy surfaces, and electronic structures [1–3]. The Automatic Flow (aflow++) framework for materials discovery is open-source software for high-throughput materials simulations that has generated the largest database of computationally investigated inorganic materials [4, 5]. It has successfully predicted novel, experimentally verified magnets [6], super-alloys [7], high-entropy carbides [8, 9], and phase change memory compositions [3]. The aflow++ framework [10] has created over 3.5 million entries spanning 1,100 crystallographic prototypes, all housed within the aflow.org online data repositories [11].

This summer we created AFLUX@JHU: a materials search-API for the aflow.org data repositories generated and housed at Johns Hopkins University. The original AFLUX implementation — written in PHP — was not built within the public distribution of the aflow++ framework. AFLUX@JHU is the first publicly available AFLUX implementation, written in C++ directly within aflow++. AFLUX@JHU connects the repositories generated and stored at JHU with those housed at other universities, creating a federated environment of materials data — enhancing research efficiency while maintaining data provenance.

The ability to filter materials on specific tensor components is extremely valuable. By adding new syntax to the AFLUX language and developing new SQL queries to search the database, we enabled tensor-like filtering. Additionally, many materials parameters are specific to certain atoms or species within the crystal. Because of the dynamic nature of these data vectors, filtering them within a relational database using SQL is impractical. Instead, we created a second layer of filtering in C++ within aflow++ so that users can filter on these properties. Both new types of filtering use the ‘@’ operator within the AFLUX language to specify the desired slice of these matrices.

https://s4e.ai/search/API/?ael_stiffness_tensor(@(3,3)(0.2))

Tensor-like filtering example on the component in position 3,3 of the ael_stiffness_tensor

https://s4e.ai/search/API/?bader_net_charges(@(Fe)(0))

Filtering on the Bader charge of the Fe atoms in the structure (note that this is “if any”)
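The two kinds of filtering can be illustrated with a minimal in-memory sketch. This is not the AFLUX@JHU implementation (which uses SQL and C++ within aflow++); the entry layout, the function names, the 1-based indexing, and the comparison predicates are all assumptions made for illustration, and the sample values are made up.

```python
from typing import Any, Callable, Dict, List

Entry = Dict[str, Any]


def diagonal_tensor(value: float, size: int = 6) -> List[List[float]]:
    """Build a toy size x size tensor with `value` on the diagonal."""
    return [[value if i == j else 0.0 for j in range(size)] for i in range(size)]


# Made-up entries for illustration only; these are not AFLOW data.
ENTRIES: List[Entry] = [
    {"id": "entry-1", "ael_stiffness_tensor": diagonal_tensor(250.0),
     "species": ["Fe", "Fe", "O"], "bader_net_charges": [1.2, -0.1, -1.1]},
    {"id": "entry-2", "ael_stiffness_tensor": diagonal_tensor(90.0),
     "species": ["Fe", "O"], "bader_net_charges": [-0.3, 0.3]},
    {"id": "entry-3", "ael_stiffness_tensor": diagonal_tensor(300.0),
     "species": ["Ni", "O"], "bader_net_charges": [1.0, -1.0]},
]


def filter_tensor_component(entries: List[Entry], key: str, row: int, col: int,
                            predicate: Callable[[float], bool]) -> List[Entry]:
    """Keep entries whose (row, col) tensor component passes `predicate`.

    Indices are assumed to be 1-based, as in the @(3,3) example.
    """
    return [e for e in entries if predicate(e[key][row - 1][col - 1])]


def filter_species_any(entries: List[Entry], key: str, species: str,
                       predicate: Callable[[float], bool]) -> List[Entry]:
    """Keep entries where ANY atom of `species` passes `predicate`."""
    kept = []
    for e in entries:
        per_atom = zip(e["species"], e[key])
        if any(predicate(v) for s, v in per_atom if s == species):
            kept.append(e)
    return kept


stiff = filter_tensor_component(ENTRIES, "ael_stiffness_tensor", 3, 3,
                                lambda c: c > 100.0)
fe_positive = filter_species_any(ENTRIES, "bader_net_charges", "Fe",
                                 lambda q: q > 0.0)
print([e["id"] for e in stiff])
print([e["id"] for e in fe_positive])
```

In this sketch an entry with no Fe atoms never matches the species filter, and an entry with several Fe atoms matches as soon as one of them passes, which is one reading of the “if any” behavior noted above.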

Further work is needed to fully connect our data with that at other universities and complete the “plug-and-play” ability of AFLOW. This mainly involves changing how AFLOW chooses where to pull data from: a DB file on the local machine, or online from the JHU, Duke, or other databases.
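One possible shape for that choice is sketched below, purely as an assumption about how such a fallback could work: prefer a local DB file when it exists, otherwise fall back to an online repository. The function name, the placeholder file name, and the ordering rule are hypothetical and do not describe existing AFLOW behavior.

```python
from pathlib import Path
from typing import Optional, Sequence


def choose_source(local_db: Optional[Path], remote_urls: Sequence[str]) -> str:
    """Return the local DB path if the file exists, else the first remote URL."""
    if local_db is not None and local_db.is_file():
        return str(local_db)
    if remote_urls:
        return remote_urls[0]
    raise LookupError("no local DB file and no remote repository configured")


# The URL is the endpoint used in the query examples; the file name is a placeholder.
print(choose_source(Path("missing_aflow.db"), ["https://s4e.ai/search/API/"]))
```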

Coming into this project I had very little coding experience, and none with C++, SQL, or anything close to the size of the AFLOW framework. Throughout the summer I honed my skills in database design and implementation, C++ development, and querying databases with SQL. Beyond these hard skills, I learned what it is like to work within a research group. I strengthened my communication and teamwork skills and became better at managing large, long-term projects.

William Shiber

William Shiber is a second-year student pursuing a BS/MS in Applied Mathematics and Statistics with a focus on Statistics and Statistical Learning.


“This project aims to design and implement AFLUX@JHU: a materials search-API for the aflow.org data repositories that are generated and housed at Johns Hopkins. The aflow.org framework is high-throughput materials simulation software used to discover novel materials and predict material properties. Our goal is to implement an API to allow machine learning access to this data, which consists of over 3.5 million stored entries. AFLUX@JHU will include tensor-like filterability and atom- and species-specific properties filterability to optimize workflows. AFLUX@JHU will provide access to a growing compendium of highly-filterable materials data, not only easing the user experience by placing the computational burden server-side but also making such searches possible for queries wrangling millions of entries.”
