Diffbot, the artificial intelligence that is reading the entire Internet to build the largest knowledge base in the world

GPT-2 and GPT-3, two artificial intelligences developed by OpenAI, have been surprising the world since last year with their ability to answer and complete texts the way a human being would.

A paradigmatic example of their abilities can be seen in a tweet in which GPT-3 fills in historical and demographic data about the United States in an Excel document: that Alaska became a state in 1906 and that Michigan has a population of 10.3 million people.

Both figures look perfectly plausible, but... they are totally false. The problem with AIs like GPT-3, known as 'language models', is that they are good imitators (that is, capable of reproducing human writing patterns), but they are not trained to write data that matches reality because, quite simply, they do not understand what they read.

We need them to understand what they read

And that drastically reduces the usefulness of artificial intelligences, so there are already attempts to solve this problem. The startup Diffbot, for example, has developed an AI dedicated to the task of learning (or, at least, of extracting the data it is able to recognize) through the revolutionary method of reading. Reading a lot.

Reading, in fact, the entire public web, in multiple languages: its way of understanding human language is to try to fit everything it reads into a subject + verb + predicate structure, which allows it to establish relationships between concepts.
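To make the subject + verb + predicate idea more concrete, here is a toy Python sketch. It is not Diffbot's actual method, which is not described here; the `Triple` type, the `fit_triple` helper, the short verb list and the sample sentences are all hypothetical and only illustrate how a sentence might be squeezed into a three-part fact.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """A three-part fact: subject + verb + predicate."""
    subject: str
    verb: str
    predicate: str


# Hypothetical verb phrases; a real system would recognize far more.
VERBS = ("works at", "is a", "is an")


def fit_triple(sentence: str) -> Optional[Triple]:
    """Try to fit a simple sentence into a Triple, or return None."""
    text = sentence.strip().rstrip(".")
    for verb in VERBS:
        # Split around the first occurrence of the verb phrase.
        left, sep, right = text.partition(f" {verb} ")
        if sep and left and right:
            return Triple(left.strip(), verb, right.strip())
    return None


# Made-up sentences, purely for illustration.
for s in ("Alice works at Acme Corp.", "Acme Corp is a company."):
    print(fit_triple(s))
```

Sentences that do not match any known verb phrase simply return `None`, which mirrors the idea of extracting only the data the system is able to recognize.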

Starting from these simple facts, the job of Diffbot's AI is to create what is called a knowledge graph: a network of relationships equipped with a 'reasoning' system that allows it to reach new conclusions from the extracted data. Diffbot scans the web and updates its knowledge graph every 4-5 days, adding up to 150 million entries each time.
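A knowledge graph with a small 'reasoning' step can be sketched in a few lines of Python. This is only an assumption-laden illustration, not Diffbot's implementation: the `KnowledgeGraph` class, the single inference rule in `infer_employer_types`, and the sample facts are all invented for the example.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy graph: subject -> verb -> set of objects."""

    def __init__(self):
        self._edges = defaultdict(lambda: defaultdict(set))

    def add(self, subject, verb, obj):
        """Store one subject + verb + object fact."""
        self._edges[subject][verb].add(obj)

    def objects(self, subject, verb):
        """Return every object linked to `subject` by `verb`."""
        return set(self._edges.get(subject, {}).get(verb, set()))

    def infer_employer_types(self):
        """Hypothetical rule: X works at Y, and Y is a Z,
        therefore X works at a Z."""
        derived = set()
        for person, verbs in self._edges.items():
            for employer in verbs.get("works at", ()):
                for kind in self.objects(employer, "is a"):
                    derived.add((person, "works at a", kind))
        return derived


# Made-up facts, purely for illustration.
kg = KnowledgeGraph()
kg.add("Alice", "works at", "Acme Corp")
kg.add("Acme Corp", "is a", "company")
print(kg.infer_employer_types())
```

Neither input fact says that Alice works at a company; that conclusion only appears once the two relationships are combined, which is the kind of step a graph makes possible and a flat list of sentences does not.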

In addition, it applies machine learning algorithms to its older knowledge, which allow it to identify obsolete information and replace it with more recent data.

It is so thorough that it does not settle for reading HTML text: it also applies computer vision algorithms to extract information from images and videos. And it browses the way we do, too: going through websites from top to bottom, switching between tabs and clicking on pop-up windows.

And in the near future, its creators plan to equip it with a language model (similar to GPT-3) so that, now that the AI is able to understand what it reads, it can generate texts from that knowledge and become a "universal system for answering factual questions".

Much more than a simple 'curiosity'

But what use can this have, beyond mere scientific interest? Well, Diffbot already has 400 clients who pay to extract information from its knowledge graph, large companies that use it for quite diverse tasks.

Via | Technology Review

Image | Pixabay