The chatbot famous for writing essays, computer code and fairy tales does more than just write. ChatGPT, OpenAI's artificial intelligence tool, can also analyze images: it can describe their content, answer questions about them and even recognize the faces in them.
One can imagine someday uploading a photo of a broken-down car engine or a mysterious rash and having ChatGPT suggest a fix.
But OpenAI, creator of ChatGPT, does not want it to become a facial recognition tool.
Jonathan Mosen, CEO of an employment agency in Wellington, New Zealand, is among a select group with access to an advanced version of ChatGPT that can analyze images. Recently, Mr. Mosen, who is blind, used the visual analysis feature to determine which dispensers in a hotel room's bathroom held shampoo, conditioner and body wash. ChatGPT performed far better than the image analysis software he had used before.
"It told me the capacity of each bottle in milliliters. It described the ceramic tile in the shower," says Mr. Mosen, 54. "It described everything the way a blind person needs to hear it. With a single photo, I had exactly what I needed."
For the first time, Mr. Mosen can "interrogate images," he says. For example, the text accompanying an image he found on social media described it as a "blonde woman looking happy." When he asked ChatGPT to analyze the image, the bot described "a woman in a dark blue shirt taking a selfie in a full-length mirror." He could then ask follow-up questions: What kind of shoes was she wearing? What else was visible in the mirror?
Mr. Mosen has touted the technology's capabilities and demonstrated it on a podcast he hosts about living with blindness.
In March, when OpenAI announced GPT-4, the latest version of the software that powers ChatGPT, it described the model as "multimodal": it can respond to both text and image prompts. Most users have been able to converse with ChatGPT only in words, but Mr. Mosen gained early access to the visual analysis through Be My Eyes, a small firm that connects blind users with sighted volunteers and provides customer service for businesses. Be My Eyes worked with OpenAI this year to test the bot's "sight" before the feature's general release.
The app recently stopped giving Mr. Mosen information about people's faces, which are now obscured for privacy reasons. He is disappointed, feeling he should have the same access to that information as a sighted person.
Why the change? OpenAI worries it has created a tool whose power it is not ready to release as is in a consumer feature.
OpenAI's technology can identify public figures, such as people with a Wikipedia page, says Sandhini Agarwal, a researcher at OpenAI. But it is not as comprehensive as tools designed to find faces on the internet, such as Clearview AI and PimEyes. The tool recognizes OpenAI's CEO, Sam Altman, in photos, but not other people who work at the company.
Such a feature would go beyond what is generally considered acceptable practice by US tech companies. It could also cause legal trouble in jurisdictions such as Illinois and Europe, where companies must obtain individuals' consent before using their biometric information, including a faceprint.
OpenAI also does not want ChatGPT to make assessments of the gender or emotional state of people in photographs. The company is working out how to address these and other safety concerns before offering image analysis to the general public, Ms. Agarwal says.
Microsoft, which has invested $10 billion in OpenAI, also has access to the visual analysis tool, and a small number of users of Microsoft's AI-powered Bing chatbot can try it. When they upload images, a message informs them that faces are blurred in Bing to protect privacy.
Sayash Kapoor, a PhD student in computer science at Princeton University, used Bing to decode a CAPTCHA, a visual security check meant to be passable only by a human. The bot cracked the code, recognizing the letters hidden in the calligraphic jumble of the background, even as it noted that "CAPTCHAs are made to prevent bots like me from accessing certain websites or services."
“AI is shattering all the devices that are supposed to separate humans from machines,” said Ethan Mollick, an innovation and entrepreneurship researcher at the Wharton School at the University of Pennsylvania.