11 Comments
Florindo Palladino

Congratulations! I’ve forked the repository—hope to give you feedback soon.

Silvia Escanilla Huerta

Is this designed only for sources written in English? I have thousands of photographs taken at the archives that I could use to test this tool, but they're all manuscripts written in Spanish...

Mark Humphries

In our tests, it works just as well on French documents as on English documents. Users have reported that it works well on German and Italian too, but I am not sure we have heard results for Spanish documents. The program uses Generative AI to read the documents, so success will depend entirely on the abilities of the model you use. My intuition is that Gemini-2.5-Pro would work well for Spanish documents, but I would be interested to hear any results.

Gunnar W. Knutsen

How well it performs on Spanish depends on the handwriting and subject material. For Inquisition records the LLMs perform rather badly unless you use a fine-tuned version for error checking.

Thiago

It seems revolutionary! I look forward to the .exe file that will make things easier for us non-coding/lazy historians. Incidentally, I tried Gemini 2.5 Pro and it seemed really impressive at first, but then I realized it was just hallucinating in an extremely convincing way, totally making things up, even when I called it out on that. Much worse than ChatGPT. I wonder why, and if (or why) it's so much better when you use the API.

Mark Humphries

Thanks for the comment. May I ask what you were asking it to do?

Thiago

Transcribe a page of a volume from the Royal African Company, because the post said Gemini was fairly good at transcriptions. The page is fairly legible, albeit black and white. It was disconcerting: the output looked totally legit in terms of eighteenth-century writing, but when I checked, it had no relation to the image. Very weird!

Mark Humphries

So this was in the Gemini app, right? And I assume you uploaded a jpg rather than providing a URL? I just tried it in the Gemini app on a few different documents and it worked as expected. I can't share screenshots in the comments, but this is what I did. First I went to Gemini:

https://gemini.google.com/app

Then I used the + button to upload a jpg. The JPG does not need to be perfect resolution, but it needs to be readable by a human. I then wrote "Transcribe this 18th century document." And the transcription was fine.

Not sure if that matches what you did, but I tried it with a few images.

Thiago

Yes, that's exactly what I did. It did really well on a French 1708 doc with perfect handwriting and a good image. Then it did poorly on a Brazilian notarial record from the 1700s (handwriting and image were worse). Then it hallucinated like it was 2022 with the RAC record. Then my preview ended (I only subscribe to ChatGPT Plus, not to other LLMs).

Mark Humphries

Interesting. I don't subscribe to Gemini either, but I haven't run into any issues on the free tier.

Not sure what to say other than that I haven't seen that happen in Transcription Pearl or Archive Studio with the Gemini API. I've found that performance with the API is way more consistent.

Thiago

Looking forward to experimenting with it!
