(Paris) Many media outlets and websites have decided to block access by GPTBot, a data-harvesting robot launched in early August by OpenAI to feed its artificial intelligence models, which they accuse of “looting” their content.

The New York Times, CNN, the Australian broadcaster ABC, the Reuters and Bloomberg news agencies: all have shut their digital doors to GPTBot, a robot launched without fanfare on August 8 by OpenAI, the creator of ChatGPT.

Its mission? To suck up all the data from websites willing to open the door to it, in order to feed generative artificial intelligence (AI) models.

But the young Californian company, which has publicly indicated how to prevent its robot from accessing a site’s data, is facing a growing digital outcry.
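
The opt-out OpenAI describes relies on a site's robots.txt file, in which a publisher names the crawler's user agent and the paths it may not visit. The sketch below, in Python, shows how such a rule can be checked with the standard library. It assumes a two-line rule that refuses GPTBot across a whole site; the example.com address and the “OtherBot” crawler name are hypothetical placeholders, not taken from any of the sites mentioned here.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: refuse GPTBot everywhere and say
# nothing about any other crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/news/article"  # hypothetical URL
    print("GPTBot allowed:", is_allowed("GPTBot", page))
    print("OtherBot allowed:", is_allowed("OtherBot", page))
```

A rule of this kind is only a request: it works to the extent that the crawler chooses to read robots.txt and honor it.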

According to an estimate from Originality.ai, a plagiarism detection tool, nearly 10% of the world’s 1,000 largest sites had denied access to GPTBot two weeks after its launch.

Among them are Amazon.com, Wikihow.com, Quora.com and the Shutterstock image bank. The list is expected to grow quickly, according to Originality.ai, which estimates that the proportion of websites blocking GPTBot will increase by 5% per week.

In France, GPTBot has become a “robot non grata” on the sites of France Médias Monde (France 24 and RFI), Mediapart, Radio France and TF1.

“In the 24 hours following the announcement, we immediately looked at what we could do,” Laurent Frisch, director of digital and innovation strategy for the Radio France group, told AFP.

Because “there is one thing that is unacceptable: the looting of content without authorization,” Sibyle Veil, the president of Radio France, said Monday at a press conference.

“There is no reason for them to come and learn from our content without compensation,” or “without us knowing the ins and outs” or how the content would be used, Laurent Frisch continues.

Because generative AI operates on a probabilistic model, “our data can be associated with other data, more or less accurate, or even false,” adds Vincent Fleury, director of digital environments at France Médias Monde.

This is why “the platforms must cite all their media sources, or risk a lack of neutrality and possible manipulation,” argues Bertrand Gié, director of the news division of Le Figaro and president of Geste, an association of online service publishers.

“The idea is not to be played for fools. Being looted by these companies, which then make profits on the basis of our output: enough is enough,” sums up Vincent Fleury.

Hence the need to open discussions with OpenAI and other players in generative AI, according to most of the media outlets interviewed.

“The media must be compensated fairly. Our aim is therefore to obtain licensing and remuneration agreements,” argues Bertrand Gié.

In the United States, the Associated Press (AP) news agency led the way by concluding an agreement with OpenAI in July that authorizes the company to use AP's archives dating back to 1985 in exchange for access to OpenAI's technology and AI expertise.

OpenAI also committed $5 million to the American Journalism Project, an organization that supports many local media outlets, and up to $5 million in credits for its programming interface (API) to help journalists integrate AI tools into their work.

But beyond the high visibility OpenAI enjoys thanks to ChatGPT, “hundreds of start-ups are being created in different media-related fields,” notes Mediapart, calling for “an open debate on regulation” and on the impact of “all forms of AI.”

Proof that the matter is pressing: ten international media groups, including AFP, The Associated Press and the Gannett / USA Today group, urged political leaders and industry executives in August to regulate the use of AI in news.