By Martine van den Berg
Picture an archaeologist digging in dusty soil, exhausted by the heat and battling many challenges and disappointments, who finally finds something unexpected and extraordinary. We cherish this romantic idea of archaeology, fed by books and movies. The reality could not be further removed from this image. Today’s biggest challenge is not to find something new, but to work your way through the ever-increasing body of already excavated and digitized objects. In other words: it is no longer a problem of scarcity but of abundance. How do you find the needle in the haystack of existing digital collections?
This realization made Heleen Wilbrink, Egyptologist and founder of Aincient, wonder whether computers could achieve what humans could not. Would artificial intelligence be able to solve this long-standing problem? She turned to Google for help. Together with Google’s partner SynerScope, she devised a tool that aims to recognize, sort and analyze images of archaeological artifacts in minutes instead of days.
The first test case: Dutch National Museum of Antiquities
To put the tool to the test, Wilbrink used the open data set of the Dutch National Museum of Antiquities (Rijksmuseum van Oudheden, or RMO for short). The results of the prototype were amazing. With very little manual work, and by making full use of Google’s Cloud Vision API, the collection was categorized using image recognition, giving researchers the possibility to rediscover objects in the existing body of material.
In early spring, this breakthrough technology was presented at a well-attended press conference at the RMO. “Improving the ‘searchability’ of our on-line databases, to literally increase the value of the available collection data, was a challenge,” said Wim Weijland, director of the RMO. “The available data can be used as a source of inspiration for new exposition topics, but also serves the public and researchers. In the future, we will be able to link well-organized databases to those of other national and international museums, which will increase the knowledge level. This way, the current and future technology will give us a better look into the past.”
Databases as barriers
But how exactly is Aincient’s solution different from other search tools? To fully appreciate this innovation, we must go back in time. Since the early nineteenth century, archaeologists have been excavating a massive number of artifacts. Many of these finds were studied, categorized and stored in museums and universities worldwide. Some made it to an exhibition or permanent display, while many other objects have been hidden from public view in warehouses, storerooms, dusty cellars and private collections.
The geographical spread and inaccessibility of some of these collections posed a huge challenge for those researching the past. Often it was easier to initiate a new excavation than to rediscover what others had uncovered in earlier generations. Here digitization has made a difference. With hard work and dedication, many objects have been photographed, described and tagged. Today more and more collections can be explored by means of a digital search.
However, major barriers remain, as it is still extremely difficult to search and compare single items from different collections. This is due not only to variation in data structures and a lack of standardization, but also to something at the core of the archaeological discipline: categorization. Much of the discoverability depends on the quality of tagging. There are multiple metatags to use (think of material, morphology, function and period), and the definitions of the assemblages depend largely on the choices made when the objects were categorized in the early stages of the excavation.
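To make the standardization problem concrete, here is a minimal Python sketch of how records from two collections might be mapped onto one shared set of metatags before they can be compared. It is an illustration only, not a description of how Aincient or SynerScope process data: the field names, the alias tables and the example records are all hypothetical.

```python
# Hypothetical alias tables: each collection uses its own field names
# and spellings, which are mapped onto one shared vocabulary.
FIELD_ALIASES = {
    "materiaal": "material",
    "material": "material",
    "medium": "material",
    "periode": "period",
    "period": "period",
    "date_range": "period",
}

TERM_ALIASES = {
    "fayence": "faience",
    "egyptian faience": "faience",
}


def normalize_record(record):
    """Return a copy of `record` using the shared field names and terms.

    Fields that have no known alias are dropped rather than guessed at.
    """
    normalized = {}
    for field, value in record.items():
        key = FIELD_ALIASES.get(field.strip().lower())
        if key is None:
            continue  # unknown field: leave it out of the shared view
        term = value.strip().lower()
        normalized[key] = TERM_ALIASES.get(term, term)
    return normalized


# Two made-up records describing the same kind of object differently.
record_a = {"Materiaal": "Fayence", "inventory_no": "X1"}
record_b = {"medium": " Egyptian Faience ", "date_range": "New Kingdom"}

print(normalize_record(record_a))
print(normalize_record(record_b))
```

Even this toy version shows the limitation described above: someone has to maintain the alias tables by hand, and objects that were tagged differently at the excavation stage stay hard to find.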
Scarabs and libation flasks – or perfume bottles?
Let’s look at two examples to illustrate this point. When a researcher wants to describe the development, function and geographical spread of a certain object, for example ‘second-millennium green scarabs,’ the study requires a complete overview of all scarabs. That means searching through many hundreds of collections and covering thousands of assemblages, geolocations and historical periods. This is an impossible task, so it is common practice to select and study a subset. As a result, much of the scholarly argument revolves around the question of whether this subset is representative of the total body of excavated finds. With this tool, covering all collections and studying complete data sets comes within reach.
In the opposite case, a researcher might want to study an object of unknown function (for example, a small glass bottle) and try to understand it within its excavated context. Is it a perfume bottle for household use? Was it a container for herbs or a poison? Is it a libation flask for cultic use? The excavated context is key to understanding its function, but it is not always conclusive. Especially when the provenance of an object is unclear, the researcher will need to find a similarly shaped artifact. But locating an ‘unknown object’ takes an enormous amount of time and depends on a certain element of luck, as one cannot digitally search for an object without first defining it in text. With artificial intelligence, it becomes possible to upload an image that is automatically compared and matched with similar objects.
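The idea of searching by image rather than by text can be sketched in a few lines of Python. The article does not say how Aincient matches images, so this is a generic illustration under stated assumptions: each image is assumed to have already been turned into a numeric feature vector by some image-recognition model, and similarity is measured with the cosine of the angle between vectors. The object names and vectors below are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as matching nothing
    return dot / (norm_a * norm_b)


def most_similar(query, catalogue, top_n=3):
    """Return the ids of the `top_n` catalogue objects closest to `query`."""
    scored = [
        (cosine_similarity(query, vector), object_id)
        for object_id, vector in catalogue.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(object_id, score) for score, object_id in scored[:top_n]]


# Invented feature vectors standing in for the output of an image model.
catalogue = {
    "glass-bottle-A": [0.9, 0.1, 0.3],
    "scarab-B": [0.1, 0.9, 0.2],
    "glass-flask-C": [0.8, 0.2, 0.4],
}

# Feature vector of the uploaded image of the unknown object.
query = [0.45, 0.05, 0.15]

results = most_similar(query, catalogue, top_n=2)
for object_id, score in results:
    print(object_id, round(score, 3))
```

The researcher never has to name the object: the two glass vessels rank above the scarab purely because their vectors point in a similar direction to the query.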
Rising above inefficiencies
By making maximum use of AI, Aincient will bring enormous benefits for archaeological research and reduce the inefficiency of human categorization. With one click, a researcher will have an overview of all artifacts with a similar shape and form. The more databases Aincient is able to link up, the more comprehensive the results will be. Because the search process will be completed in minutes instead of months, much of the manual work will be eliminated. As a result, researchers will be able to focus much more on in-depth analysis. Thanks to its ability to find correlations between objects and collections that were previously hidden under different metatags and keywords, Aincient will also increase the likelihood of new discoveries in material that was excavated and digitized years ago.
Where inaccessible databases have become a major barrier to research, the press conference made clear how the technologies incorporated in Aincient will be able to unlock the past. “At SynerScope, we offer quick solutions to develop difficult-to-link data and databases, making them comprehensible and usable,” said CEO Jan-Kees Buenen. “The large on-line RMO collection was an ideal candidate to show the benefits of using our technology.” André Hoekzema, Head of Google Cloud Benelux, added: “Google Cloud Vision API uses powerful machine learning models that are applicable in many ways. Google and SynerScope reinforce each other in this area. This RMO pilot also offers other museums and scientists the potential to accelerate their research.”