Apple may have been late to the AI party compared to other tech giants, but it clearly has something big planned for 2024. The WWDC 2024 keynote is expected to devote a lot of time to the current hot topic, “AI”. The tech giant is reportedly building an AI model that promises to be better and faster than ChatGPT. More importantly, Siri could improve a lot with the new AI model, called “Reference Resolution As Language Modeling” or “ReALM”.
Apple researchers released a preprint paper on the ReALM large language model and claimed that it can “substantially outperform” OpenAI’s GPT-4 on particular benchmarks. ReALM is reportedly able to understand and handle different kinds of context. In theory, this would allow users to point to something on the screen or running in the background and query the language model about it.
This ability to understand exactly what is being referred to would be very important for chatbots. According to Apple, letting users refer to something on a screen as “that” or “it” and having a chatbot understand the reference correctly would be crucial to creating a truly hands-free screen experience.
The company is keen for its AI tech to work on-device, which not only gives Apple better control over the content and responses but also keeps the user’s data secure and private. The new Apple paper claims that ReALM is able to perform better than GPT-4 with fewer parameters, which makes it possible to run the model on-device.
Apple’s AI model reportedly converts what is on the screen into text, which allows ReALM to process that information faster and more efficiently. This AI tech could not only make iOS 18 an AI-rich platform for iPhone users, but could also finally show them the potential of Siri in a 2.0 avatar that draws on the prowess of the company’s ReALM model.
For instance, if you ask Siri to call a number shown on a website open on your iPhone, the AI model will help Siri convert the page into text and place the call immediately, without you having to say anything else to the assistant.
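To make the phone-number example concrete, here is a minimal Python sketch of the general idea: flatten what is on the screen into plain text, then look for the thing the user referred to. Everything in it is an assumption made for illustration, not Apple's implementation: the `ScreenElement` type, the top-to-bottom reading order, the regular expression, and the sample data are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScreenElement:
    """A hypothetical piece of on-screen text with its top-left position."""
    text: str
    x: int
    y: int


def screen_to_text(elements: List[ScreenElement]) -> str:
    """Flatten screen elements into text, top to bottom, then left to right."""
    ordered = sorted(elements, key=lambda e: (e.y, e.x))
    return "\n".join(e.text for e in ordered)


# Deliberately simple pattern: a digit, at least five digits/separators, a digit.
PHONE_PATTERN = re.compile(r"\+?\d[\d\s().-]{5,}\d")


def find_phone_number(screen_text: str) -> Optional[str]:
    """Return the first phone-like string in the text, or None if there is none."""
    match = PHONE_PATTERN.search(screen_text)
    return match.group(0) if match else None


# Made-up sample screen, listed out of reading order on purpose.
elements = [
    ScreenElement("Call us: 555-010-0199", x=10, y=200),
    ScreenElement("Contact", x=10, y=40),
    ScreenElement("Open 9 to 5", x=10, y=120),
]
screen_text = screen_to_text(elements)
number = find_phone_number(screen_text)
print(number)
```

A regular expression stands in here for the part a language model would handle; the point is only that once the screen is plain text, a request like “call that number” becomes a text problem.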
On-device AI capability could help Apple entice more people to use its devices. However, the company could still partner with Google or OpenAI to give iPhone users advanced AI tools such as image generation and AI video creation, which need more computing power and are not suited to on-device processing.
In the paper, the researchers wrote that they want to use ReALM to understand and identify three kinds of entities: onscreen, conversational and background.
Onscreen entities are things that are displayed on the user’s screen.
Conversational entities are those that are relevant to the conversation. For example, if you say “What workouts am I supposed to do today?” to a chatbot, it should be able to work out from previous conversations that you are on a 3-day workout schedule and what the schedule for the day is.
Background entities are those that do not fall into the previous two categories but are still relevant. For example, there could be a podcast playing in the background or a notification that just rang. Apple wants ReALM to be able to understand when a user refers to these as well.
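As a rough illustration of how these three kinds of entities could be handed to a language model as text, here is a short Python sketch. It is a sketch under stated assumptions, not the format used in Apple's paper: the `EntityKind` and `Entity` types, the `build_prompt` helper, the prompt wording and the sample entities are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class EntityKind(Enum):
    """The three entity kinds named in the article."""
    ONSCREEN = "onscreen"
    CONVERSATIONAL = "conversational"
    BACKGROUND = "background"


@dataclass
class Entity:
    kind: EntityKind
    text: str


def build_prompt(entities: List[Entity], query: str) -> str:
    """Render candidate entities as numbered lines, followed by the user request."""
    lines = ["Candidate entities:"]
    for index, entity in enumerate(entities, start=1):
        lines.append(f"{index}. [{entity.kind.value}] {entity.text}")
    lines.append(f"User request: {query}")
    lines.append("Which entity numbers does the request refer to?")
    return "\n".join(lines)


# Made-up sample entities, one of each kind.
entities = [
    Entity(EntityKind.ONSCREEN, "Phone number: 555-0100"),
    Entity(EntityKind.CONVERSATIONAL, "3-day workout schedule"),
    Entity(EntityKind.BACKGROUND, "Podcast playing"),
]
prompt = build_prompt(entities, "Call that number")
print(prompt)
```

Numbering the candidates means the model only has to answer with a number, which is one simple way to turn “what does ‘that’ refer to?” into a language-modelling task.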
“We demonstrate large improvements over an existing system with similar functionality across different types of references, with our smallest model obtaining absolute gains of over 5 per cent for on-screen references. We also benchmark against GPT-3.5 and GPT-4, with our smallest model achieving performance comparable to that of GPT-4, and our larger models substantially outperforming it,” wrote the researchers in the paper.
Note that for GPT-3.5, which only accepts text, the researchers’ input was the prompt alone. For GPT-4, they also provided a screenshot for the task, which helped improve performance substantially.
“Note that our ChatGPT prompt and prompt+image formulation are, to the best of our knowledge, in and of themselves novel. While we believe it might be possible to further improve results, for example, by sampling semantically similar utterances up until we hit the prompt length, this more complex approach deserves further, dedicated exploration, and we leave this to future work,” added the researchers in the paper.
So while ReALM works better than GPT-4 in this particular benchmark, it would be far from accurate to say that the former is a better model than the latter. ReALM simply beat GPT-4 in a benchmark that ReALM was specifically designed to be good at. It is also not immediately clear when or how Apple plans to integrate ReALM into its products.
Even so, Apple entering the world of AI models is super exciting for us, and we can’t wait to see the lineup at WWDC 2024 in June this year!