Methodology
My main goal with Daggerobelus is to create visual representations of data found in early modern archives. These visual representations, I argue, can help us understand large trends across archives while preserving the specificity of particular microhistories. Because the archives with which I plan to work are extensive (the Folger Shakespeare Library’s recipe book collection is one example), data science workflows can help ensure a clear flow from the original archive to machine-readable formats to the final product: visual outputs in the form of figures.
These workflows depend heavily on Anthropic’s Claude. Used beyond a simple chat interface, Claude can act as a capable research assistant that performs a range of tasks, including data extraction and advanced coding. While fields such as software engineering are already actively integrating Claude into their methodologies, archival research has, to my knowledge, yet to use agentic AI like Claude to analyze and visualize large data sets, likely due to monetary barriers, the newness of the technology, and general field-wide skepticism about AI. Importantly, although Claude is built on a generative large language model, I use it here agentically rather than to generate prose: its role is to execute workflows like data extraction, website authoring, and statistical analysis of ingested documents. The main hypothesis of Daggerobelus is that agentic AI, and the tasks associated with it, has a place in archival research.
From a data science perspective, I use techniques like:
- NLP (Natural Language Processing) to convert primary documents from historical archives into machine-readable formats like JSON
- Data analysis in Python, using pandas and related libraries, for network analysis, statistical modeling, and similar methods
- Charting with D3 and other web-based visualization tools to create readable figures that organize dense historical data
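To make these steps concrete, here is a minimal sketch of the middle of that pipeline, not the project's actual code. It assumes extraction has already produced JSON records, and it uses only the Python standard library so that it runs anywhere (a pandas version would follow the same logic). The records, field names (`id`, `title`, `ingredients`), and the helper `cooccurrence_edges` are all hypothetical and invented for illustration; they are not drawn from any real archive.

```python
import json
from collections import Counter
from itertools import combinations

# Hypothetical records, invented for illustration only (not real archive data).
RECORDS_JSON = """
[
  {"id": "r1", "title": "A cordial water", "ingredients": ["rosemary", "sugar", "wine"]},
  {"id": "r2", "title": "A syrup", "ingredients": ["sugar", "rosemary"]},
  {"id": "r3", "title": "A plaster", "ingredients": ["wax", "rosemary"]}
]
"""


def cooccurrence_edges(records):
    """Count how often each unordered ingredient pair shares a record."""
    edges = Counter()
    for record in records:
        # Deduplicate and sort so ("sugar", "rosemary") and
        # ("rosemary", "sugar") are counted as the same edge.
        unique = sorted(set(record["ingredients"]))
        edges.update(combinations(unique, 2))
    return edges


records = json.loads(RECORDS_JSON)
edges = cooccurrence_edges(records)

# Serialize as link objects with source/target keys, the shape that
# D3's force-directed layouts conventionally expect for links.
links = [
    {"source": a, "target": b, "weight": n}
    for (a, b), n in sorted(edges.items())
]
print(json.dumps(links, indent=2))
```

The weighted edge list is the handoff point between analysis and charting: the same JSON can feed a network figure, while each edge remains traceable to the individual records that produced it.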
A large part of this methodology is ensuring that the workflow can be extended to new archives by keeping a clear flow from (1) the original archive to (2) machine-readable formats to (3) visual outputs in the form of figures. Since each archive presents its own idiosyncrasies, the method adapts to a new historical corpus through the schema I identify as salient for that archive.
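One way to picture a per-archive schema is as a small declaration that the rest of the workflow reads, so that moving to a new corpus means writing a new schema rather than new code. The sketch below is an assumption about how that could look, not a description of the project's implementation; `ArchiveSchema`, `normalize`, and every field name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveSchema:
    """Hypothetical declaration of the fields that matter for one archive."""
    name: str
    required_fields: tuple


def normalize(raw, schema):
    """Keep only the schema's fields and report any that are missing.

    Missing fields are listed rather than filled in, so that gaps in the
    source stay visible instead of being silently guessed.
    """
    record = {f: raw[f] for f in schema.required_fields if f in raw}
    missing = [f for f in schema.required_fields if f not in raw]
    return record, missing


# Two illustrative schemas: one for recipe books, one for a different corpus.
recipe_schema = ArchiveSchema("recipe_books", ("title", "ingredients", "hand"))
letter_schema = ArchiveSchema("letters", ("sender", "recipient", "date"))

# An invented extracted record, for illustration only.
raw = {"title": "A cordial water", "ingredients": ["rosemary", "sugar"], "folio": "12r"}

record, missing = normalize(raw, recipe_schema)
print(record)
print(missing)
```

Here the same `normalize` function serves both schemas; only the declaration changes from archive to archive.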