

ChatGPT has new voice and image recognition superpowers

These new features are exclusive to Plus or Enterprise service subscribers.


OpenAI, the genius crew behind the famous AI chatbot ChatGPT, has dropped some killer new features that are set to take the bot’s human-like interaction to the next level. The AI can now:

  • Recognize images and have a chinwag about them.
  • Convert speech to text and text to speech, making it more of a chatterbox than ever.

To show off the image recognition feature, OpenAI released a promo video where a user asks ChatGPT for a hand in lowering a bike seat. The chatbot, getting the picture (pun intended), dishes out some basic advice.

When the user asks for more specific tips, highlighting the bike seat catch, ChatGPT identifies the bolt type and tells the user to grab an Allen wrench.

The system can even look over a user manual and toolbox image to check if the user has the right tool. This is a fresh twist in the chatbot world, which has been mostly about voice synthesis and speech recognition so far.

OpenAI’s New Voice Services: Say Hello to “Juniper,” “Sky,” and “Breeze”

In another video, a mom asks ChatGPT to spin a bedtime story about a forest hedgehog. The voices, named “Juniper,” “Sky,” and “Breeze,” each based on a licensed voice actor, sound as natural as can be.

OpenAI’s voice synthesis tech is on par with that of companies like ElevenLabs, which has taken some heat after its tech was used for deepfakes and harassment.

Currently, OpenAI’s voice services are only available in ChatGPT voice chat, but Spotify will also be licensing OpenAI’s voice technology.

Spotify recently revealed new podcast voice translation features, letting popular podcasters’ voices be cloned in Spanish, French, and German.

However, there’s a catch.

These new features are exclusive to Plus or Enterprise service subscribers. They’re set to roll out on iOS and Android in the next fortnight, with web users getting image capabilities shortly after.


Potential Speed Bumps Ahead

It’s worth noting that the system’s speed and capacity might not live up to the hype from the promo videos.

For example, the voice recognition feature reportedly takes a few seconds to respond, and the image system won’t try to identify people in photos, according to pre-release reviews.

OpenAI is gradually rolling out these new features, allowing for ongoing tweaks and risk reduction, which is super important with voice and image recognition.

But the introduction of vision-based models presents a new challenge: the risk of the system misinterpreting users’ prompts.

OpenAI has done some red teaming to minimize these risks, but it’s only a matter of time before users push the chatbot’s ethical boundaries.

Kevin is KnowTechie's founder and executive editor. With over 15 years of blogging experience in the tech industry, Kevin has transformed what was once a passion project into a full-blown tech news publication. Shoot him an email or find him on Mastodon or Post.
