January 14, 2026
Using AI to Find Insights in Historic Manuscripts
Sonia Yaco
Emerging Technologies Librarian for Rutgers University Libraries
Minutes of the 14th Meeting of the 84th Year
George Bustin, Old Guard president, called the meeting to order and presided. Frances Slade led the invocation. President Bustin reminded the audience of the security measures in place at the Jewish Center of Princeton, which include wearing name tags and a prohibition on photography and filming. He also stated that the new Winter 2026 Program had been finalized and was available on the Old Guard website.
Attendance at The Jewish Center was 134. Three guests were present: Michael Leopold (guest of Eliot Freeman), and Maureen Strazdon and Sarah Mahan (guests of Ferris Olin). Julie Elward-Berry read the minutes of the December 10 meeting.
Ferris Olin introduced the speaker, Sonia Yaco, Emerging Technologies Librarian for Rutgers University Libraries. She has led special collections and university archives departments at three universities and is presently Associate Director of Special Collections and University Archives at Rutgers University Libraries. Prior to entering academia, Yaco headed a computer consulting firm serving libraries, educational institutions and Fortune 500 companies. Her research focuses on identifying innovative methods to make archives easier to access using emerging technology.
Librarian Yaco described the current challenge facing library archivists, namely immense growth in available collections and their increasingly rich multi-format content, particularly photos and other images. At the same time, there is an explosion of competent and nimble AI tools, which both patrons and institutions expect librarians to apply. Yaco headed a joint study with a team from Rutgers University and Durham University (UK), using as a test case the substantial collection of papers of William Elliot Griffis (and his sister Margaret Clark Griffis) from the second half of the nineteenth century. The collection comprised correspondence, handwritten diaries, travel journals, photographs and other published materials. The topics involve Japan, Korea, missionaries, education in Japan, and US relations, all during the Japanese Meiji era. Ms. Griffis was one of the first women to educate women in Japan.
The first goal was to get more out of the textual data, starting with improving the readability of the texts using multiple software tools ranging from older-technology translators and optical character recognition to agentic AI. Work was performed from 2022 to 2025, and both the cost and the learning curve of each tool were evaluated. eScriptorium was notably successful in 2022, but by 2025 ChatGPT and Gemini Pro were equally competitive. More advanced challenges were sentiment analysis, named entity recognition, and organization identification. Finally, the team tested linking photos with text, asking the tool to provide a complex description of a photo and to match image descriptions to diary text, which would facilitate creating a caption for an available image. Such tasks and tests required multiple iterations to deliver the best results.
At all stages, the study showed that human intervention was critical, both as a validity check and to provide ethical oversight. Large resource investment was a given, in terms of software costs, staff time, institutional infrastructure support, and in-house expertise. From the librarian’s perspective, these tools can make archive material more accessible to the library client, adding insights through data analysis and improving the organization and searchability of multimedia. During the Q&A, Yaco reminded the audience that most people under forty can no longer even read cursive writing in archive materials.
Respectfully submitted,
Julianne Elward-Berry