A ChatGPT that can use devices


OpenAI is reportedly developing an autonomous artificial intelligence (AI) assistant system capable of assuming control of a user’s device to perform tasks. 

The potential new product was first reported by The Information, citing a source familiar with the matter.

Details are scarce, and OpenAI didn’t immediately respond to Cointelegraph’s requests for comment and clarification. Still, action agents would be a logical next step beyond generative AI systems such as ChatGPT.

Action agents

Generative AI systems such as ChatGPT and Google’s Gemini are designed to generate humanlike media such as text, images, audio and video.

Typically, to get one of these models to perform a real-world action, such as operating a robot, developers must cobble it together with external applications that translate the AI’s output into executable commands.
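As a rough illustration of that kind of glue code, the Python sketch below parses a model’s text output as a JSON action and hands it to a matching handler. Everything in it is an assumption made for illustration: the handler names (move_arm, grip), the JSON shape and the execute_model_output function are hypothetical and do not describe how OpenAI or any particular robot platform works.

```python
import json
from typing import Callable, Dict, List


# Hypothetical handlers. A real integration would call a robot or device API
# here; this sketch only records what it would have done.
def move_arm(log: List[str], x: float, y: float) -> None:
    log.append(f"move_arm to ({x}, {y})")


def grip(log: List[str], force: float) -> None:
    log.append(f"grip with force {force}")


# Allow-list of actions the adapter is willing to execute.
HANDLERS: Dict[str, Callable[..., None]] = {"move_arm": move_arm, "grip": grip}


def execute_model_output(raw_output: str, log: List[str]) -> bool:
    """Parse model text as a JSON action and run the matching handler.

    Returns False, without acting, if the text is not valid JSON, names an
    action outside the allow-list, or carries arguments the handler rejects.
    """
    try:
        action = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    if not isinstance(action, dict):
        return False

    name = action.get("name")
    args = action.get("args", {})
    if not isinstance(name, str) or not isinstance(args, dict):
        return False

    handler = HANDLERS.get(name)
    if handler is None:
        return False

    try:
        handler(log, **args)
    except TypeError:
        # Missing or unexpected arguments for this handler.
        return False
    return True
```

The allow-list is the key design choice in this sketch: the model only ever produces text, and the surrounding application decides which of its requests become real actions.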

The technology underpinning most smart assistants and similar systems isn’t quite as robust as what’s working under the hood of ChatGPT, Gemini or even Amazon’s own generative and foundational AI products.

It stands to reason that a virtual assistant built on large language model technology (such as that used to create ChatGPT) would have greater potential for autonomous action than the comparatively simple systems powering the previous generation of smart assistants.

Death of the user interface

Until it’s known exactly what OpenAI intends to do with its reported autonomous action agents, all that one can do is speculate as to their potential capabilities.

The Information’s report indicated that the new AI system would be capable of operating users’ devices to perform requested tasks. It cited an example where a user asks the AI to copy data from one platform to another.

In principle, any physical input a human can perform (swiping, tapping, clicking, double-clicking, typing and even solving CAPTCHA puzzles to prove one is not a robot) could be performed by an AI system with sufficient device privileges.

Autonomy and security

While this technology might sound like something straight out of a Marvel film — Iron Man’s JARVIS, for example — the reality is that the road to autonomous assistance systems is littered with privacy and security challenges.

Current state-of-the-art generative AI systems aren’t self-contained: they require connectivity to massive cloud compute centers. While it is possible to run some AI functions entirely on laptops and smartphones, it’s unlikely that an AI action agent, as imagined, would be able to run on an onboard AI chip alone.

This dependence on remote servers could pose a massive privacy threat. Coupled with the obvious security threat of giving a corporate AI system unfettered access to private information and the average smartphone’s ability to exchange data at internet scale, the realization of an autonomous action agent could represent a critical new cyberthreat with global implications.