

“And if the computer can't make a definitive decision, if it could give us back probabilities or the top four ranks, then an expert has a place to start. “If the computer could just translate or identify the highly repetitive parts and leave it to an expert to fill in the difficult place names or verbs or things that need some interpretation, that gets a lot of the work done,” said Paulus, the Tablet Collection Curator at the OI. And a system that can’t quite make up its mind may still be useful. Many of the tablets describe basic commercial transactions, similar to “a box of Walmart receipts,” Paulus said. A lot of digital heavy liftingīut even 80% accuracy can immediately provide help for transcription efforts. Ongoing research will try to nudge that number higher while examining what accounts for the remaining 20%. When tested on tablets not included in the training set, the model could successfully decipher cuneiform signs with about 80% accuracy. With resources from the UChicago Research Computing Center, Krishnan used this annotated dataset to train a machine learning model, similar to those used in other computer vision projects.
#Just translate how to
Using this collection, researchers created a dictionary of the Elamite language inscribed on the tablets, and students learning how to decipher cuneiform built a database of more than 100,000 “hotspots,” or identified individual signs. That training set is thanks to more than 80 years of close study by OI and UChicago researchers and a recent push to digitize high-resolution images of the tablet collection-currently over 60 terabytes and still growing-before their return to Iran. “It’s a good machine learning problem, because the accuracy is objective here, we have a labeled training set and we understand the script pretty well and that helps us. Computer vision over the last five years has improved so significantly ten years ago, this would have been hand wavy, we wouldn’t have gotten this far,” Krishnan said. “From the computer vision perspective, it's really interesting because these are the same challenges that we face. The overlap was immediately apparent to both sides. Krishnan applies deep learning and AI techniques to data analysis, including video and other complex data types. Schloen and Prosser oversee OCHRE, a database management platform supported by the OI to capture and organize data from archaeological excavations and other forms of research. Sanjay Krishnan of the Department of Computer Science at a Neubauer Collegium event on digital humanities. The collaboration began when Paulus, Sandra Schloen and Miller Prosser of the OI met Asst. “If we could come up with a tool that is flexible and extensible, that can spread to different scripts and time periods, that would really be field-changing,” said Susanne Paulus, associate professor of Assyriology. With a training set of more than 6,000 annotated images from the Persepolis Fortification Archive, the Center for Data and Computing-funded project will build a model that can “read” as-yet-unanalyzed tablets in the collection, and potentially a tool that archaeologists can adapt to other studies of ancient writing. That’s the motivation behind DeepScribe, a collaboration between researchers from the OI and UChicago’s Department of Computer Science. But a technological breakthrough at the University of Chicago may finally make automated transcription of these tablets-which reveal rich information about Achaemenid history, society and language-possible, freeing up archaeologists for higher-level analysis.

Since the 1990s, scientists have recruited computers to help-with limited success, due to the three-dimensional nature of the tablets and the complexity of the cuneiform characters.

#Just translate manual
For decades, researchers painstakingly studied and translated these ancient documents by hand, but this manual deciphering process is very difficult, slow and prone to errors. Twenty-five centuries ago, the “paperwork” of Persia’s Achaemenid Empire was recorded on clay tablets-tens of thousands of which were discovered in 1933 in modern-day Iran by archaeologists from the University of Chicago’s Oriental Institute.
