Categories
News

Aincient and Partners: Exploring Archives and Historical Locations using AI and Crowdsourcing 2.0

Artificial Intelligence (AI), crowdsourcing 2.0 and Handwritten Text Recognition (HTR) enable a smart, word-by-word search through 200,000 scans of historical texts. The goal of our project is to create a prototype of a smart archival search for our archival partners: the Amsterdam City Archives (Stadsarchief Amsterdam), the Dutch National Archives (Nationaal Archief) and the North Holland Archives (Noord-Hollands Archief).

Searching for locations in historical texts is difficult for several reasons, yet the need for it is significant. Together with the archives and a crowd of volunteers, Aincient and partners created AI datasets for finding historical locations in transcriptions. We are developing a prototype search environment where scans, transcriptions, historical maps and (historical) images can be explored. Aincient has formed a consortium of heritage and AI specialists, consisting of Aincient, Picturae, Sioux Technologies and Islands of Meaning.

Artificial Intelligence

The AI application addresses the searchability problem faced by the archives and their users, especially when it comes to locations. Every day, thousands of users search the archives online, partly as an important source for scientific research. The solution can also be applied in many other public-sector heritage institutions, in the Netherlands and abroad.

In recent years there has been enormous progress in the AI field of Natural Language Processing (NLP). Within NLP, we focus on Named Entity Recognition (NER) for the automatic recognition of locations in historical texts. We use BERT, an open-source deep learning language model from Google. In the previous SBIR phase, the initial results achieved with BERT were good.
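To give an idea of what NER output for locations looks like, here is a minimal sketch. It is not the project's actual pipeline: it assumes that a tagger (for example a fine-tuned BERT model) has already labelled each token with BIO tags, and only shows how those tags can be grouped into location names. The tag names `B-LOC` and `I-LOC`, the function `extract_locations` and the sample sentence are all illustrative assumptions.

```python
from typing import List


def extract_locations(tokens: List[str], tags: List[str]) -> List[str]:
    """Group consecutive B-LOC/I-LOC tokens into location strings.

    Assumes one BIO tag per token; the tag names are hypothetical.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    locations: List[str] = []
    current: List[str] = []  # tokens of the location being built

    for token, tag in zip(tokens, tags):
        if tag == "B-LOC":
            # A new location starts; close any location still open.
            if current:
                locations.append(" ".join(current))
            current = [token]
        elif tag == "I-LOC" and current:
            # Continuation of the location that is currently open.
            current.append(token)
        else:
            # Any other tag (or a stray I-LOC) ends the open location.
            if current:
                locations.append(" ".join(current))
            current = []

    if current:
        locations.append(" ".join(current))
    return locations


# Made-up example sentence with made-up tags, for illustration only.
tokens = ["Jan", "woont", "op", "de", "Nieuwe", "Herengracht", "te", "Amsterdam"]
tags = ["O", "O", "O", "O", "B-LOC", "I-LOC", "O", "B-LOC"]
print(extract_locations(tokens, tags))  # ['Nieuwe Herengracht', 'Amsterdam']
```

In practice, the hard part is producing reliable tags for historical spelling and handwriting, which is why training datasets matter so much.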

What challenge does it solve?

This project addresses two challenges. The first is that locations are hard to find in the extensive online archives, even though there is a great need for them, if only for researching the history of your own house or city. This applies to searching through text as well as through maps. The second is the scarcity of datasets for training AI in the heritage sector, including datasets for locations. Together with the archives and a crowd of volunteers, we created AI datasets for locations, names and dates. For this purpose we launched the project ‘Tag the text’ on the crowdsourcing platform VeleHanden. Over 10,000 historical texts were tagged and checked within two months.

Small Business Innovation Research (SBIR)

Aincient and partners participate in the SBIR ‘Artificial intelligence for public services’. Phase 1 has been successfully completed and we are currently working on phase 2, the development of a prototype. We also participate in the SBIR for the National Archives, of which phase 2 has been completed and the implementation phase will be finished soon. In addition, Aincient is a member of the working group Culture of the Netherlands AI Coalition.

SBIR makes use of an exception in the procurement legislation for applied research and development. A societal question posed by the government is central. SBIR works as a tiered innovation competition that leads towards one or more innovative solutions. The authors of the best proposals carry out a feasibility study, and the best feasibility studies lead to a commission to develop an innovation. Ideally, this innovation is tested in practice by the potential government customer.

In collaboration with:

Aincient is carrying out the SBIR AI project in collaboration with the Amsterdam City Archives (Stadsarchief Amsterdam), the Dutch National Archives (Nationaal Archief) and the North Holland Archives (Noord-Hollands Archief), Picturae, Sioux Technologies, and Islands of Meaning.

This article is an adapted version of the article published online by the NLAIC, the Netherlands AI Coalition.

Excavations 2.0: how Aincient uses Artificial Intelligence to unlock the past

By Martine van den Berg

An archaeologist digging in dusty soil, exhausted by heat and battling many challenges and disappointments, finally finding something unexpected and extraordinary. We cherish this romantic idea of archaeology, fed by books and movies. The reality could not be further removed from this image. Today’s biggest challenge is not to find something new, but to work your way through the ever-increasing body of already excavated and digitized objects. In other words: it is no longer a problem of scarcity but of abundance. How do you find the needle in the haystack of existing digital collections?

This realization made Heleen Wilbrink, Egyptologist and founder of Aincient, wonder whether computers could achieve what humans could not. Would artificial intelligence be able to solve this long-standing problem? She turned to Google for help. Together with their partner Synerscope she devised a tool which aims to recognize, sort and analyze the images of archaeological artifacts in minutes instead of days.

The first test case: Dutch National Museum of Antiquities

To put the tool to the test, Wilbrink used the open dataset of the Dutch National Museum of Antiquities (known in Dutch as RMO for short). The results of the prototype were amazing. With very little work, and by making full use of Google’s Cloud Vision API, the collection was categorized using image recognition, giving researchers the possibility to rediscover objects in the existing body of material.

In early spring this breakthrough technology was presented at a well-attended press conference at the RMO. “Improving the ‘searchability’ of our on-line databases, to literally increase the value of the available collection data, was a challenge,” said Wim Weijland, director of the RMO. “The available data can be used as a source of inspiration for new exposition topics, but also serves the public and researchers. In the future, we will be able to link well-organized databases to those of other national and international museums, which will increase the knowledge level. This way, the current and future technology will give us a better look into the past.”

Databases as Barriers

But how exactly is Aincient’s solution different from other search tools? To fully appreciate this innovation, we must go back in time. Since the early nineteenth century, archaeologists have been excavating a massive number of artifacts. Many of these finds were studied, categorized and stored in museums and universities worldwide. Some made it to an exhibition or permanent display, while many other objects have been hidden from public view in warehouses, storerooms, dusty cellars and private collections.

The geographical spread and inaccessibility of some of these collections posed a huge challenge for those researching the past. Often it was easier to initiate a new excavation than to rediscover what others had uncovered in the generations before. Here digitization has made a difference. With hard work and dedication, many objects have been photographed, described and tagged. Today more and more collections can be explored by means of a digital search.

However, major barriers remain, as it is still extremely difficult to search and compare single items from different collections. This is due not only to variation in data structures and a lack of standardization, but also to something at the core of the archaeological discipline: categorization. Much of the discoverability depends on the quality of tagging. There are multiple metatags to use (think of material, morphology, function and period), and the definitions of the assemblages depend largely on the choices made when the objects were categorized in the early stages of the excavation.

Scarabs and libation flasks – or perfume bottles?

Let’s look at two examples to illustrate this point. When a researcher wants to describe the development, function and geographical spread of a certain object, for example ‘second millennium green scarabs,’ the study requires a complete overview of all scarabs. That means searching through many hundreds of collections, covering thousands of assemblages, geolocations and historical periods. This is an impossible task, so it is common practice to select and study a subset. As a result, much of the scholarly argument revolves around the question of whether this subset is representative of the total body of excavated finds. With this tool, covering all collections and studying complete datasets comes within reach.

In exactly the opposite example, a researcher might want to study a certain object of unknown function (for example a small glass bottle) and try to understand it within its excavated context. Is it a perfume bottle for household use? Was it a container for herbs or a poison? Is it a libation flask for cultic use? The excavated context is key to understanding its function, but it is not always conclusive. Especially when the provenance of an object is unclear, the researcher will need to find a similarly shaped artifact. But locating an ‘unknown object’ takes an enormous amount of time and depends on a certain element of luck, as one cannot digitally search for an object without first defining it in text. Artificial intelligence will make it possible to upload an image that is automatically compared and matched with similar objects.
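The idea of matching an uploaded image against similar objects can be sketched in a few lines. This is a simplified illustration, not a description of how Aincient or the Cloud Vision API works internally: it assumes every image has already been reduced to a numeric feature vector by some image model, and it ranks catalogue objects by cosine similarity to the query. The object names, the vectors and the function names are made up for the example.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector carries no shape information
    return dot / (norm_a * norm_b)


def rank_similar(
    query: List[float],
    collection: Dict[str, List[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k catalogue objects most similar to the query vector."""
    scored = [
        (name, cosine_similarity(query, vector))
        for name, vector in collection.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical feature vectors; a real system would compute these from images.
collection = {
    "glass_flask_A": [0.9, 0.1, 0.3],
    "green_scarab_B": [0.1, 0.8, 0.2],
    "glass_bottle_C": [0.8, 0.2, 0.4],
}
query = [0.85, 0.15, 0.35]  # vector of the uploaded 'unknown object'

for name, score in rank_similar(query, collection, top_k=2):
    print(f"{name}: {score:.3f}")
```

The point of the sketch is that no textual description of the object is needed: the comparison runs on the image features alone.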

Rising above inefficiencies

By making maximum use of AI, Aincient will bring enormous benefits to archaeological research and reduce the inefficiency of human categorization. With one click a researcher will have an overview of all artifacts with a similar shape and form. The more databases Aincient is able to link up, the more comprehensive the results will be. Because the search process will be completed in minutes instead of months, it will eliminate much of the manual work. As a result, researchers will be able to focus much more on in-depth analysis. Thanks to its ability to find correlations between objects and collections that were previously hidden under different metatags and keywords, Aincient will also increase the likelihood of new discoveries in material that was excavated and digitized years ago.

Where inaccessible databases have become a major barrier to research, the press conference made clear how the technologies incorporated in Aincient will be able to unlock the past. “At SynerScope, we offer quick solutions to develop difficult-to-link data and databases, making them comprehensible and usable”, according to CEO Jan-Kees Buenen. “The large on-line RMO collection was an ideal candidate to show the benefits of using our technology.” André Hoekzema, Head of Google Cloud Benelux, added: “Google Cloud Vision API uses powerful machine learning models that are applicable in many ways. Google and Synerscope reinforce each other in this area. This RMO pilot also offers other museums and scientists the potential to accelerate their research.”

We are looking forward to a future where many collections will be accessible through Aincient. The next big step will be the public launch of the tool later this year. Make sure to subscribe to the newsletter for the latest developments. Happy excavations!

Sneak preview of amazing research tool

We initiated a project with Google, Synerscope and the Dutch National Museum of Antiquities (RMO in Dutch). Our goal is to drastically speed up research on ancient collections, resulting in more discoveries. We started with the online collection of the RMO and combined it with the powerful software of Synerscope and Google Vision API. Our first results will be presented during our press conference on the 20th of March. We will keep you posted!

Digital future of Egyptology

If anything is possible, what would be the digital future of Egyptology? That was the question during our one-day brainstorm at Berkeley. We created a vision and roadmap with an international team of experts in Egyptology, Assyriology, archaeology and Unicode. Interested? Read the Results Berkeley Digital Humanities brainstorm.

Newer Realities

Virtual, augmented and mixed realities are a perfect way to simulate travel through space and time, and they offer great opportunities for exploring ancient civilizations. But what do these new realities have to offer and what distinguishes one from the other? Get acquainted with each of them and discover the cool stuff that you can try at home right away. Let’s have a look.

Virtual Reality (VR) is an immersive experience generated by a computer. VR headgear can transport you to another world, where for example you can walk with dinosaurs. All you need for a basic set-up is a mobile phone with a gyroscope and a Google Cardboard. A fancier option is the Samsung Gear VR. High-end versions are the Oculus Rift and HTC Vive. If you want to try the basic version at home, have a look at our favorites and how to get started: VR tips and tricks. The Google Cardboard app and Google Street View app are a great place to start your virtual journey.

Augmented Reality (AR) is less immersive than VR. It enhances your current perception of reality by adding extra information, for example text notifications or simulated screens. Pokémon Go made AR very popular for a broad public, but these days most attention goes to the more advanced possibilities of Mixed Reality (MR).

MR combines the best of both realities. You see the real world with virtual objects that seem real and that become larger when you get closer. A well-known example of MR is the Microsoft HoloLens. It is a very interesting experience, although we do not know of any applications focused on ancient cultures yet. Let us know if you do!