The IDIES Seed Funding Program Awards are competitive awards of $25,000. The Seed Funding initiative provided funding to the following data-intensive computing projects because they (a) involve areas relevant to IDIES and JHU institutional research priorities, (b) are multidisciplinary, and (c) build ideas and teams with good prospects for successful proposals to attract external research support by leveraging IDIES intellectual and physical infrastructure.
PI: Thomas Lippincott (Computer Science)
Co-Is: Sharon Achinstein (English), Jacob Lauinger (Near Eastern Studies)
Research in the humanities often involves richly-structured datasets that are fundamentally multimodal, combining, for example, temporal and geographic information with text and images. These properties present challenges for human intelligence’s limited attention and memory, and for computational intelligence’s limited capacity for focused reasoning. This project considers empirical questions from two domains that exemplify these challenges: changes to political and moral thought across time and geography during the Colonial era, and scribal variance in cuneiform inscriptions from the Ancient Near East. By jointly representing images, transcriptions, translations, and metadata, we will determine natural clusters that emerge from neural embeddings of existing data sets, and their alignment with themes from traditional scholarship. This project ranges over the life cycle of traditional and computational research, including data curation, annotation, machine learning, and interpretation, with particular attention towards improving the traditional scholar’s ability to annotate primary sources and interact with the machine learning output.
PI: Nadia Zakamska (Physics and Astronomy)
Co-I: Tamás Budavári (Applied Mathematics and Statistics)
One of the most enduring mysteries of modern astrophysics is that of the origin of type Ia supernovae, the cosmological standard candles used in measuring the geometry of the universe. The most likely scenario is that type Ia supernovae arise as a result of a merger of two white dwarfs — compact remnants of evolution of stars like our Sun — but no candidate progenitors have yet been discovered. In this program, we will develop the necessary machine-learning tools to discover white dwarf binaries in emerging spectroscopic, photometric and astrometric datasets. This project has potential for a breakthrough in the long-standing search for type Ia progenitors.
PI: Natalia Trayanova (Biomedical Engineering and Medicine)
Co-I: Allison Hayes (Cardiology)
It is now recognized that patients who have recovered from COVID-19, especially those with severe COVID requiring intensive care, frequently develop long-term debilitating symptoms and require hospital readmission. Although acute cardiac complications due to COVID-19 have now been described, the long-term cardiovascular (CV) complications of COVID remain unclear. Neither the frequency and nature of these CV complications nor the predictors of such adverse events in the long term after hospitalization are known. We are now in a unique position to address this pressing clinical need. The goal of this project is to develop a real-time machine learning (ML) solution to predict long-term (1 year) adverse CV events in patients who were discharged after hospital admission for COVID-19. The warning system will be able to identify at-risk patients in real time and alert caregivers and patients, reducing mortality, ensuring the delivery of goal-oriented therapy, and providing tangible clinical decision support.
PI: Jonathan Ling (Pathology)
Co-I: Benjamin Langmead (Computer Science)
Transactive response DNA-binding protein 43 kDa (TDP-43) is an RNA-binding protein known to form pathological inclusions in a variety of age-related neurodegenerative disorders. This proposal aims to mine the vast public RNA sequencing archives to uncover new mechanisms of TDP-43 dysregulation. Using an interdisciplinary approach, these findings will be validated with in silico and in vitro model systems. Insights gained from this study may reveal novel therapeutic targets and prophylactic measures to reduce the aggregation of TDP-43 and other misfolded proteins during aging.
PI: Natalia Trayanova (Biomedical Engineering, WSE)
Co-I: David Spragg (Cardiology, SOM), Nikhil Paliwal (Alliance for Cardiovascular Diagnostic and Treatment Innovation)
To prevent recurrent ablation procedures in atrial fibrillation (AF) patients, we propose a data-driven technology that will enable a priori prediction of the success of pulmonary vein isolation (PVI). We will use existing AF patient clinical data and artificial intelligence to train predictive models for the success of PVI using catheter ablation. The overall goal of this technology is to provide clinical guidance as to which AF patients would benefit from PVI, thus maximizing the benefit of PVI while minimizing the financial costs and procedural risks of unnecessary ablation procedures.
PI: Thomas Haine (Earth & Planetary Sciences, KSAS)
Co-I: Charles Meneveau (Mechanical Engineering, WSE)
Postdoc: Miguel Jimenez-Urias (Earth & Planetary Sciences, KSAS)
The overall project goal is to apply a novel numerical procedure to Direct Numerical Simulations of canonical Rotating Stratified Flows relevant to dynamical oceanography in order to reveal differential operators associated with turbulent closures. This will provide a stepping stone for the development of non-local, scale-dependent turbulence closures in ocean modeling. It will also provide a framework for the creation of a SciServer Database of Canonical Geophysical Flows relevant to dynamical oceanography, in a spirit similar to the Johns Hopkins Turbulence Database.
PI: Marc Stein (School of Education, BERC)
Co-I: Julia Burdick-Will (Sociology, KSAS), Gerard Lemson (IDIES)
The overarching goal for this project is to set up the pipeline to develop a “real-time” database of Baltimore transit and crime data on the SciServer platform that can be used to estimate daily routes to school using public transit, as well as daily variation in commuting difficulty (travel time, transfers, delays due to unreliable service) and violence exposure on those routes.
PI: Rene Vidal (Biomedical Engineering, WSE)
Co-I: Benjamin Haeffele (MINDS), Matthew Ippolito (Medicine, SOM)
The current proposal will build on computer vision techniques recently developed by Dr. Haeffele in the Vidal Laboratory of the Johns Hopkins Whiting School of Engineering, to detect and classify blood cells in low-resolution lens-free images with a reduced volume of annotated data. This project will extend such computer vision methodology for data mining of malaria microscopy data in patient samples from antimalarial drug trials conducted by the Johns Hopkins Malaria Research Institute at the Johns Hopkins Bloomberg School of Public Health. Linking computer vision-based machine learning algorithms to malaria pharmacology promises to unlock novel insights into the effect of drugs on malaria parasites while establishing a new evaluative tool for the assessment and understanding of malaria and its treatment.
Brian Camley (Physics & Astronomy, Krieger School of Arts & Sciences)
Andrew Ewald (Cell Biology, School of Medicine)
In developing organisms, groups of cells work together to sense chemical signals, sharing information to make measurements more precisely than any single cell can alone. We will characterize how groups of mammary cells process information by studying organoids made of a mixture of active cells (which always believe they see a signal) and normal cells. Over time, these organoids develop branches, as during normal mammary development. Our plan will be to use the location of the active cells to predict the location of the branches, inferring which cells are most important from experimental data. Understanding how the pattern of activity is translated into branching will allow us to better understand how chemical signals are integrated across a group of cells.
Chris Cannon (English & Classics, Krieger School of Arts & Sciences)
Sayeed Choudhury (Sheridan Libraries)
Mark Patton (Sheridan Libraries)
The history of English meter before 1500 has been difficult to write because we cannot tell from the way poetry was written down how it sounded. Geoffrey Chaucer is the central figure in this story, the inventor of iambic pentameter, the staple of English verse until the 20th century, even though the norms of Middle English grammar suggest that his verse was still sometimes irregular. This project will use a database of all the words of Chaucer (and of his contemporary John Gower), each tagged for its grammatical function, and will now tag each word for its metrical function, compared throughout with the metrical function of Gower’s words as a control, to ask what happens to Middle English grammar if Chaucer’s verse was always regular.
Sarah Wheelan (Oncology, School of Medicine)
Jai Won Kim (IDIES, Krieger School of Arts & Sciences)
Jonathon Pevsner (Neurology, Kennedy Krieger Institute)
Luigi Marchionni (Neurology, School of Medicine)
Frederick Tan (Bioinformatics, Carnegie Institution)
We will create a robust platform for teaching students how to execute and interpret nontrivial genomics workflows. We plan to combine our longstanding experience in teaching R and Unix with the flexible and powerful SciServer platform, developed within IDIES. We will adapt existing content to SciServer and will create new content that leads students through reproducible analysis of truly large-scale datasets that are realistic examples of what they will encounter in their own work. Explanatory video tutorials will be created as well, enabling independent study.
Scot Miller (Environmental Health & Engineering, Whiting School of Engineering)
Darryn Waugh (Earth & Planetary Sciences, Krieger School of Arts & Sciences)
Methane is the second-most important greenhouse gas and plays a critical role in global climate. Methane mysteriously began to rise in 2007 and has been increasing ever since, implying that methane emissions are also increasing. Scientists do not understand where, how, when, or why emissions changed.
A new satellite promises to fundamentally change methane monitoring. The Sentinel-5 Precursor satellite, which carries the TROPOMI instrument, launched in late 2017 and observes methane with far better global coverage than previous satellites. We plan to create a TROPOMI-based tropospheric methane product and use this product to estimate global methane emissions. This research will elucidate the distribution of global methane, and we can then begin to hypothesize whether human or natural source types are driving emissions.
Rajat Mittal (Mechanical Engineering, Whiting School of Engineering)
Andreas Andreou (Pediatrics, School of Medicine)
W. Reid Thompson (Pediatrics, School of Medicine)
Jung Hee Seo (Mechanical Engineering, Whiting School of Engineering)
Wearable sensors are now able to automatically record and analyze our movements, pulse rates, O2 saturation, sleep and respiration rates. Heart sounds encode vital information about our cardiovascular system, but automated acquisition of these acoustic signals remains a challenge. Recently, our team has developed and tested a novel wearable phonocardiographic (PCG) system, the “StethoVest.” However, the effects of body habitus on PCG measurements and the meaningful analysis of the complex signals remain open issues and are the focus of this project. A multidisciplinary team of mechanical and electrical engineers will combine forces with a cardiologist and employ a suite of tools, ranging from patient measurements to computational models, to explore these fundamental questions.
Roman Galperin (Carey Business School)
Marshall Shuler (Neuroscience, School of Medicine)
How do people learn to search for information in unfamiliar domains? What is the role of peers and social context? We aim to improve our understanding of these questions by studying human search behavior in examining innovations. We will apply the insights developed in neuroscience and social sciences to develop a model of social learning of search, using data on hundreds of millions of searches conducted by patent examiners while evaluating inventions. We propose that the examiners’ task of finding specific, relevant knowledge in unfamiliar fields under time constraints represents a general problem of efficient search in knowledge space. We expect that examiners learn to search more efficiently over time and rely on peers for that learning. Our study will contribute to current theories of learning and search for knowledge, produce specific suggestions for improving the patent examination process, and create a dataset for the larger research community.
Janet Markle (Molecular Microbiology and Immunology, Bloomberg School of Public Health)
Anthony Guerrerio (Pediatrics, School of Medicine)
This project aims to uncover genetic and immunological drivers of disease pathogenesis in children with very early onset inflammatory bowel disease (VEOIBD). The project combines data-intensive genome-wide sequencing capabilities and cellular immunology expertise with unique patient access. VEOIBD is a rare and devastating disease which may result from single-gene inborn errors of immunity; however, most children with this disease currently lack a genetic diagnosis. We propose the in-depth analysis of whole exome sequencing data to identify novel candidate mutations, followed by functional testing of these candidates at the molecular and cellular levels. Through this effort we hope to provide a more complete understanding of VEOIBD pathogenesis on a patient-by-patient level, which will permit tailored therapies in the future.
The JHU Institute for Data Intensive Engineering and Science (IDIES) is pleased to announce its first Seed Funding Program awardees.
The IDIES Seed Funding Program RFP was issued for competitive awards of $25,000. The goal of the Seed Funding initiative is to provide funding for data-intensive computing projects that (a) will involve areas relevant to IDIES and JHU institutional research priorities; (b) are multidisciplinary; and (c) build ideas and teams with good prospects for successful proposals to attract external research support by leveraging IDIES intellectual and physical infrastructure.
The following projects have been selected for the first 2014 round of awards:
by Yanif Ahmad (Dept. of Computer Science), Raimond Winslow (Dept. of Biomedical Engineering), and Yair Amir (Dept. of Computer Science)
by Sarah Wheelan (Dept. of Oncology) and Srinivasan Yegnasubramanian (Dept. of Oncology)
by Tamer A. Zaki (Dept. of Mechanical Engineering) and Gregory Eyink (Applied Math and Statistics)
by Ben Langmead, PhD (Dept. of Computer Science) and Jeffrey Leek, PhD (Dept. of Biostatistics)
by Nitin Daphalapurkar (Dept. of Mechanical Engineering) and Lori Graham-Brady (Dept. of Civil Engineering)
The funds available to support this seed funding offering were made possible in part by the Gordon and Betty Moore Foundation.