By Chloë Farr and Kate Fryer
The Mountain Legacy Project (MLP) is a group of passionate researchers who work with historic and modern mountain photographs to document landscape change over time. Repeat photography is central to this work. By returning to the same locations where historic survey photographs were taken, MLP researchers can create visual comparisons that support environmental and cultural research.
The metadata created during these visits records important information on stations and locations, forming a link between fieldwork and the digital records in the MLP digital archive, Explorer. Manually uploading the field notes can take many hours, as field data needs to be translated into the desired end structure one form at a time.
In a cross-departmental collaboration between Kula and the Mountain Legacy Project, we are exploring whether AI-assisted Optical Character Recognition (OCR) can help extract information from these field notes in a more structured and reliable way. The goal is not to replace paper field notes, but to explore whether automation could support upload workflows, reduce manual transcription, and improve consistency.
While it might seem clever to record notes directly in digital form in the field using iPads, paper field notes remain valuable. They are portable, function well in adverse conditions, and, unlike digital devices, don’t rely on batteries. The challenge comes after fieldwork, when the information recorded on paper needs to be uploaded to our site through a series of web forms.
This project serves as a pilot for helping researchers across campus convert notes, whether from the field or a lab, into machine-readable text that enables computational data analysis in software like RStudio, Excel, and SPSS.
The OCR
This process followed the same approach presented in a previous Kula blog post. We processed each image scan with Chandra OCR 2, a 5-billion-parameter, locally run, open-source vision language model fine-tuned for OCR. The outputs were transcribed versions of the scans in HTML, JSON, and Markdown file types. Markdown and HTML formats allow for easy visualization, as in the images below. All three file types allow for simple parsing to CSV (a tabular data format compatible with Excel), which can then be uploaded to the MLP Explorer site as a single file, reducing the need for manual transcription.
So far, we have succeeded in parsing Chandra’s outputs to CSV. The next step is to arrange the data in the CSV to match the structure of the existing data as it’s held in the website’s database.
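To give a sense of what parsing to CSV can look like, here is a minimal sketch in Python. It is not our actual pipeline: it assumes that a field form has been transcribed as a pipe-delimited Markdown table, and the function names, field labels, and sample values are all hypothetical.

```python
import csv
import io


def markdown_table_to_rows(markdown_text):
    """Collect the cells of pipe-delimited Markdown table lines.

    Assumption: the OCR output represents the form as a Markdown table.
    Lines that are not table rows are ignored.
    """
    rows = []
    for line in markdown_text.splitlines():
        line = line.strip()
        if len(line) < 2 or not (line.startswith("|") and line.endswith("|")):
            continue  # not a table row
        # Drop the outer pipes, then split the remainder into cells.
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        # Skip the header separator row, e.g. |---|:---:|
        if all(cell and "-" in cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text; the csv module quotes cells containing commas."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical transcription of a field note (not real MLP data).
sample = """
| Field | Value |
|---|---|
| Station | Example Ridge |
| Elevation (m) | 2,150 |
"""

print(rows_to_csv(markdown_table_to_rows(sample)))
```

Using the standard `csv` module rather than joining cells with commas by hand means that values containing commas or quotation marks are escaped correctly, which matters for free-text field notes.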


The previous blog post mentioned above used a different VLM-OCR model, olmOCR-2. An added affordance of Chandra is that it crops images out of the scan into new files and positions them contextually within the rest of the content. We see this below, where it cropped a rough sketch of a rock into a new image, linked here in the Markdown file.

The image above is a preview of the formatted Markdown. Below are the raw file contents, which show that the model also provides alt-text for the cropped image:

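Because the cropped sketch is linked with standard Markdown image syntax, its file path and alt-text can be collected alongside the transcribed text. The following is a rough sketch of that idea, separate from the raw output described above: the regular expression covers only the basic `![alt](path)` form, and the file name and alt-text in the sample are made up for illustration.

```python
import re

# Matches basic Markdown images of the form ![alt text](path).
# Assumption: no optional title and no whitespace inside the path.
IMAGE_PATTERN = re.compile(r"!\[(?P<alt>[^\]]*)\]\((?P<path>[^)\s]+)\)")


def extract_images(markdown_text):
    """Return a list of (path, alt_text) pairs for each image link found."""
    return [
        (match.group("path"), match.group("alt"))
        for match in IMAGE_PATTERN.finditer(markdown_text)
    ]


# Hypothetical Markdown snippet (not actual model output).
sample = (
    "Cairn located on the ridge.\n\n"
    "![Rough pencil sketch of a rock](sketch_1.png)\n\n"
    "Camera set up beside the cairn.\n"
)

for path, alt_text in extract_images(sample):
    print(path, "|", alt_text)
```

Pairs like these could be written to their own CSV columns so that each sketch stays attached to its description when the notes move between formats.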
In repeat photography field notes, sketches and illustrations can be especially valuable because they may help locate a station, position the camera, or understand field markers that are difficult to describe in words alone. Having additional descriptions makes the image accessible and transportable across digital data formats. This collaboration has revealed an attainable path to speeding up the transcription of field notes while retaining the trusty pencil and paper duo.

