ChatGPT has new voice and image recognition superpowers
These new features are exclusive to Plus and Enterprise subscribers. ChatGPT can now:
- Recognize images and have a chinwag about them.
- Convert speech-to-text and text-to-speech, making it more of a chatterbox than ever.
To show off the image recognition feature, OpenAI released a promo video where a user asks ChatGPT for a hand in lowering a bike seat. The chatbot, getting the picture (pun intended), dishes out some basic advice.
When the user asks for more specific tips, highlighting the bike seat catch, ChatGPT identifies the bolt type and tells the user to grab an Allen wrench.
The system can even look over photos of a user manual and a toolbox to check whether the user has the right tool. This is a fresh twist in the chatbot world, which has been mostly about voice synthesis and speech recognition so far.
OpenAI’s New Voice Services: Say Hello to “Juniper,” “Sky,” and “Breeze”
In another video, a mom asks ChatGPT to spin a bedtime story about a forest hedgehog. The voices, named “Juniper,” “Sky,” and “Breeze,” each based on a licensed voice actor, sound as natural as can be.
OpenAI’s voice synthesis tech is on par with that of companies like ElevenLabs, which has taken some heat for its tech being used for deepfakes and harassment.
Currently, OpenAI’s voice services are only available in ChatGPT voice chat, but Spotify will also be licensing OpenAI’s voice technology.
Spotify recently revealed new podcast voice translation features, letting popular podcasters’ voices be cloned in Spanish, French, and German.
However, there’s a catch.
Potential Speed Bumps Ahead
It’s worth noting that the system’s speed and capacity might not live up to the hype from the promo videos.
For example, the voice recognition feature reportedly takes a few seconds to respond, and the image system won’t try to identify people in photos, according to pre-release reviews.
OpenAI is gradually rolling out these new features, which leaves room for ongoing tweaks and risk reduction, something that is super important with voice and image recognition.
But the introduction of vision-based models presents a new challenge: the risk that the chatbot misinterprets users’ prompts.
OpenAI has done some red teaming to minimize these risks, but it’s only a matter of time before users push the chatbot’s ethical boundaries.