PI: Thomas Lippincott (Computer Science)
Co-Is: Sharon Achinstein (English), Jacob Lauinger (Near Eastern Studies)
Research in the humanities often involves richly structured datasets that are fundamentally multimodal, combining, for example, temporal and geographic information with text and images. These properties challenge both human intelligence, with its limited attention and memory, and computational intelligence, with its limited capacity for focused reasoning. This project considers empirical questions from two domains that exemplify these challenges: changes to political and moral thought across time and geography during the Colonial era, and scribal variance in cuneiform inscriptions from the Ancient Near East. By jointly representing images, transcriptions, translations, and metadata, we will identify the natural clusters that emerge from neural embeddings of existing datasets and assess how well they align with themes from traditional scholarship. This project spans the life cycle of traditional and computational research, including data curation, annotation, machine learning, and interpretation, with particular attention to improving the traditional scholar’s ability to annotate primary sources and interact with the machine learning output.