Perplexity AI Unveils Autonomous Hybrid Inference System at Computex 2026

At Computex 2026, Perplexity AI introduced a hybrid inference system that autonomously decides which tasks run on a user's device and which are offloaded to cloud-based language models. The company claims this is the first mechanism of its kind that can make such decisions in real time, even changing where individual stages of a task are executed as needed.

During a joint presentation with Intel's CEO Lip-Bu Tan, Perplexity's CEO Aravind Srinivas demonstrated the new technology using a Personal Computer agent that analyzed sensitive materials related to a business deal. Local models running on the Intel Core Ultra Series 3 processor determined which data should remain on the user's device while identifying what could be safely sent to the cloud for more complex processing.

The standout feature of this system is not just its ability to run models locally—similar solutions already exist—but its capability to automatically select where tasks should be executed. Users no longer need to pre-determine whether certain data stays on their device or is shared with external services. Confidential information such as financial documents or medical records can be processed locally, while resource-intensive tasks are directed to cloud models.

Perplexity announced that this feature will be available to users in the coming weeks. This development is part of a broader strategy the company began earlier in 2026. In February, it launched a Computer agent capable of coordinating the work of 19 different models, including Claude, Gemini, GPT, and Grok, with all processing occurring in the cloud. In March, Personal Computer for macOS combined local and server-based data processing, giving the agent access to the computer's file system and enabling complex workflows within a secure environment.

The latest version goes a step further: the system now determines not only which model is best suited for a specific sub-task but also where that sub-task should physically be executed. According to the company, users will be asked for permission before potentially sensitive information is sent to the cloud.

This announcement is particularly noteworthy in light of the main theme of Computex 2026, which revolves around the industry's shift towards AI operating directly on user devices. Just prior to the presentation, Nvidia's CEO Jensen Huang unveiled the RTX Spark superchip designed for running large language models and autonomous AI agents on personal computers, while Intel showcased its Xeon 6+ server processors and positioned the Core Ultra Series 3 as the foundation for hybrid AI scenarios.

If Perplexity's technology functions as demonstrated, it could drive an increased demand for more powerful processors in personal computers. The greater the capabilities of local hardware, the more computations can be performed without relying on cloud infrastructure, reducing latency and data processing costs.

Perplexity believes that this approach could affect more than the chip market. As user devices become more powerful, some tasks will shift from data centers directly to individual computers. This is particularly significant for sensitive information, which in that case may never need to leave the user's device.

The company connects this innovation to its broader vision, asserting that the orchestration layer (the software that distributes tasks among various models and tools) plays a key role in advancing artificial intelligence. This idea now extends to computational infrastructure, with the system selecting not just the model but also the execution location.

However, the technology remains complex from an engineering standpoint. For the orchestrator to function correctly, it must evaluate the difficulty of each sub-task, assess data sensitivity, consider local hardware capabilities, and ensure the integrity of the process while continuously shifting computations between the user’s device and the cloud.
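Perplexity has not published how its orchestrator makes these decisions. Purely as an illustration of the factors listed above, the following minimal Python sketch routes a sub-task using three inputs: an estimated difficulty, a sensitivity flag, and a rough measure of what the local hardware can handle. The names `SubTask`, `route`, `local_capacity`, and `ask_user`, as well as the 0-to-1 difficulty scale, are assumptions made for this example and do not come from the company.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SubTask:
    name: str
    difficulty: float  # hypothetical scale: 0.0 (trivial) to 1.0 (hardest)
    sensitive: bool    # True if the data is confidential


def route(
    task: SubTask,
    local_capacity: float,
    ask_user: Callable[[SubTask], bool],
) -> str:
    """Return "local" or "cloud" for a single sub-task (illustrative only)."""
    # Anything the local hardware can handle stays on the device.
    if task.difficulty <= local_capacity:
        return "local"
    # Sensitive data leaves the device only with explicit user permission.
    if task.sensitive and not ask_user(task):
        return "local"
    # Everything else goes to the more capable cloud models.
    return "cloud"


def deny(task: SubTask) -> bool:
    """Stand-in for a permission prompt that the user declines."""
    return False


print(route(SubTask("summarize contract", 0.3, True), 0.5, deny))
print(route(SubTask("market analysis", 0.9, False), 0.5, deny))
print(route(SubTask("review medical records", 0.9, True), 0.5, deny))
```

In this sketch a sensitive task that exceeds local capacity still runs locally when the user declines, which trades answer quality for privacy. A real system would also need to estimate difficulty and sensitivity automatically, which is the hard part the article describes.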

This unveiling comes at a time of rapid growth for Perplexity. According to the company, its annual recurring revenue surpassed $450 million in March, with a market valuation reaching $20 billion. Simultaneously, the company is facing an increasing number of lawsuits from several media entities alleging copyright and trademark infringement.

Perplexity is also making strides in the corporate market. In the spring, it launched the Computer for Enterprise platform aimed at competing with traditional corporate solutions, integrating with platforms like Snowflake, Datadog, Salesforce, SharePoint, and HubSpot, and offering tools for legal document analysis, financial auditing, negotiation preparation, and customer service processing.

The new technology may find the most demand in the corporate sector. For banks, healthcare organizations, law firms, and defense contractors, the ability to process sensitive data locally while still accessing robust cloud models for less sensitive tasks could become a key requirement for implementing AI systems.

The technology has so far only been demonstrated at Computex, so its real-world capabilities remain untested. Nevertheless, Perplexity has effectively proposed a new model for organizing computation, in which the AI system independently selects not only the tool for a task but also the machine on which the task runs. If it works as demonstrated, the approach could push competitors to adopt similar designs.
