
Surveying the world: automatic indexing of the German-speaking web

| Conference | Machine Learning

October 10 and 11, 2019: Lecture by Joachim Feist, "Die Vermessung der Welt" ("Measuring the World"), at the symposium "Netzwerk maschinelle Verfahren in der Erschließung" (network for machine-based indexing methods).

The German National Library (Deutsche Nationalbibliothek, DNB) in Frankfurt held a symposium on October 10 and 11 on the topic "Netzwerk maschinelle Verfahren in der Erschließung". In his lecture, mindUp managing director Joachim Feist discussed the benefits of context-based analysis methods and explained the direct connection between online marketing applications and the tasks of the DNB.

The DNB already categorizes and indexes new publications using machine learning methods. The core questions of the symposium were which techniques are available for the semantic indexing of large amounts of text and which tasks they can solve. The individual contributions discussed the lessons learned from practical experience. The lecture topics ranged from knowledge graphs and the indexing of broadcast content to applications in the patent field.

The title of Feist's lecture comes from the book "Die Vermessung der Welt" by Daniel Kehlmann. The cover picture of the book shows how Alexander von Humboldt categorized and indexed the mountains in South America. Just as von Humboldt stood before the mountains of the Andes, the DNB today stands before the constantly growing mountain of new publications in the book and journal sector. mindUp, in turn, stands before a mountain of millions of emails and billions of websites, which are already being structured with self-learning procedures.

mindUp uses its in-house software contentDetection in internet marketing. There are many parallels to the keyword indexing done at the DNB: websites are categorized by content and assigned keywords, which then determine which content-based advertising or suitable products are displayed. In addition, the mindUp system automatically learns new words that need not appear in any dictionary, as Feist illustrated with the example of "Babyhopser". mindUp largely avoids black-box procedures, because the explicit expert knowledge of marketing managers is also anchored in the system.
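To make the idea of transparent, keyword-based categorization more concrete, here is a minimal sketch in Python. It is not mindUp's contentDetection software, whose internals are not described here. The category names, keyword lists, and function names are hypothetical, and flagging unknown words for review is only a crude stand-in for automatic vocabulary learning.

```python
import re
from collections import Counter

# Hypothetical, expert-maintained keyword lists (illustrative only,
# not taken from mindUp's system).
CATEGORY_KEYWORDS = {
    "baby": {"babyhopser", "kinderwagen", "schnuller"},
    "garden": {"rasenmäher", "gartenschlauch", "hochbeet"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def categorize(text, category_keywords=CATEGORY_KEYWORDS):
    """Return (category, matched_keywords) for the best-scoring category.

    The score is the number of keyword occurrences in the text.
    Returns (None, []) if no keyword matches at all.
    """
    counts = Counter(tokenize(text))
    best_category, best_hits, best_score = None, [], 0
    for category, keywords in category_keywords.items():
        hits = sorted(k for k in keywords if counts[k] > 0)
        score = sum(counts[k] for k in hits)
        if score > best_score:
            best_category, best_hits, best_score = category, hits, score
    return best_category, best_hits


def unknown_words(text, known_words):
    """Return tokens missing from the known vocabulary, as candidates
    that an expert could review and add to a keyword list."""
    return sorted(set(tokenize(text)) - set(known_words))


if __name__ == "__main__":
    page = "Der Babyhopser und der Kinderwagen sind neu im Shop"
    print(categorize(page))
    print(unknown_words("Babyhopser kaufen", {"kaufen"}))
```

Because each decision can be traced back to the matched keywords, a sketch like this stays inspectable, which is the property the paragraph above contrasts with black-box procedures.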

Besides using the system for customers such as eBay and in online marketing, mindUp runs a crawler that visits the entire German-language internet. The goal is less a full-text index, as with a search engine, than the thematic and geographic classification of all German-language websites. This supports a wide variety of processes: finding new companies, detecting relocations, detecting fraudulent fake shops, and so on.

mindUp's technology has already supported the work of the DNB. In addition to archiving new publications in the book and media sector, the DNB is also tasked with selectively preserving web pages, e.g. in the area of current events or relevant sectors. mindUp's "mapping" of the German-speaking internet landscape made it possible to select relevant websites in a targeted way and to enrich them with further information.