7 Comments
Mark Humphries

That's great! I'm glad it was helpful!

Wolf Larson

I'm a surveyor with 44 years' experience transcribing old handwritten deeds, and I just tried Transkribus. It's childish and completely worthless. By the time you create corrective applications, you have invested a lot of time for very poor results. If you're experienced, you can just read the documents and dictate them.

Mark Humphries

I agree. But Transkribus is a different thing, and not what I am actually talking about here. In fact, the point is that it’s not really usable. AI on handwritten documents is totally different. It is about 98% effective on English documents out of the box, compared with Transkribus, which is 60-75% depending on the hand. I entirely understand why reading documents is important and couldn’t agree more. But there are lots of situations where it is not possible or feasible, for example on huge datasets. It’s also a convenience issue. On some projects I have hundreds of thousands of documents, and it is handy to have full-text search.

Running Elk

Good to see scholars push back against the AI doomsayers.

Vivienne Cuff

You should do a reasonably sized project using high-quality images: for example, a year’s worth of letters sent (in the 19th century they were copied into letterpress books), or some kind of register or casebook. This would be a good comparison. Letterpress copy books present real challenges because of the issues with the medium. Letters written in pencil are problematic. The more hands you have in a set of images, the more difficulties arise.

Another issue is the overfitting and underfitting of models. AI tools will always need a human to work with them and to review the transcripts they produce.

Walt Rice

I've had some success having ChatGPT-4o refine a transcription from another source, first without any context and then with the added step of doing its own transcription from the image. By itself, ChatGPT produces a so-so transcription; the ones I've done with Microsoft Azure Cognitive Vision are riddled with errors. But together, we've got a winner. Thanks for the hint about using one tool to refine the output of the other.

Karen A. Chase

Brilliant. You saved me SO much time in transcribing! What was taking me 30 minutes per page of a journal took me 3 minutes for the first page, and I got faster at it. I did not need to venture into the GPT-4 function/coding. Yes, I did have to add some corrections and account for missing or torn pages, but the job took so little time that I could go for a bike ride. So when people say they fear AI, I will say: yes, but look at all the time it grants us.
