AI-Driven Browser Wars: Redefining Web Entry from Search to Smart Agents

2025-07-12 22:49:01

AI Will Reshape Browsers: New Trends in the Third Browser War

The third browser war is quietly unfolding. From Netscape and IE in the 1990s to Firefox and Chrome, browser competition has always reflected struggles over platform control and shifts in technological paradigm. Chrome achieved dominance through its rapid update cadence and ecosystem integration, while Google closed the loop on information entry by dominating both search and the browser.

But today, this pattern is beginning to shake. The rise of large language models (LLMs) has led more and more users to complete tasks on the search results page with "zero clicks," reducing traditional click-throughs to web pages. At the same time, rumors that Apple may replace the default search engine in Safari further threaten Alphabet's profit base, and the market is beginning to show unease about the "search orthodoxy."

Browsers themselves are also being recast. They are no longer just tools for displaying web pages but containers for capabilities such as data input, user behavior, and private identity. AI agents are powerful, but to complete complex page interactions, call local identity data, and control web elements, they still have to rely on the trust boundaries and functional sandboxes of browsers. Browsers are shifting from interfaces for humans to platforms that agents call like a system.

Is there still a need for browsers? We believe that what could truly break the current browser market is not another "better Chrome" but a new interaction structure: not the display of information, but the invocation of tasks. The future browser needs to be designed for AI Agents, capable not only of reading but also of writing and executing. Projects like Browser Use are attempting to semanticize page structures, turning visual interfaces into structured text that LLMs can call. This maps pages to instructions and significantly reduces interaction costs.

Mainstream projects have begun to test the waters: Perplexity is building a native browser called Comet, using AI to replace traditional search results; Brave combines privacy protection with local inference, enhancing its search and blocking features with LLMs; and crypto-native projects like Donut are targeting new entry points for AI and on-chain asset interaction. These projects share a common feature: they aim to rebuild the input side of the browser rather than beautify its output layer.

For entrepreneurs, opportunities lie in the triangular relationship between input, structure, and agents. Because the browser is the interface through which future Agents will call the world, whoever can provide structured, callable, and trustworthy "capability blocks" will become part of the new generation of platforms. From SEO to AEO (Agent Engine Optimization), from page traffic to task-chain calls, product forms and design thinking are being restructured. The third browser war is being fought over "input" rather than "display"; what determines victory is no longer who captures the user's attention, but who wins the Agent's trust and becomes the entry point for its calls.

A Brief History of Browser Development

In the early 1990s, Netscape Navigator opened the door to the digital world for millions of users. Microsoft realized the importance of browsers and forcibly bundled Internet Explorer into the Windows system, undermining Netscape's market dominance.

In a difficult position, Netscape's engineers chose to make the browser's source code public. That code became the foundation of the Mozilla project, which ultimately produced Firefox. Firefox achieved multiple breakthroughs in user experience, plugin ecosystem, and security, marking a victory for the open-source spirit.

Meanwhile, the Opera browser was launched in 1994, and in 2003 it introduced its self-developed Presto engine, supporting cutting-edge technologies such as CSS and responsive layouts. In the same year, Apple launched the Safari browser. In 2007, IE7 was released with Windows Vista, but the market response was mediocre. Firefox's market share steadily increased to about 20%, while IE's dominance gradually weakened.

Chrome was launched in 2008 and quickly rose to prominence with its frequent updates and unified experience across all platforms. In November 2011, Chrome surpassed Firefox for the first time; six months later, it overtook IE, completing its transformation from challenger to dominator.

Entering the 2020s, Chrome's global market share stabilized at around 65%. The Google search engine and the Chrome browser form a dual monopoly structure, with the former controlling about 90% of global search entry points, and the latter holding the majority of users' "first window" to access the web.

With the rise of large language models (LLMs), traditional search has come under pressure. In 2024, Google's search market share dropped from 93% to 89%. Rumors that Apple may launch its own AI search engine could shake Alphabet's profit pillar even further.

From Navigator to Chrome, the browser wars have always been a battle over technology, platforms, content, and control. Those who control the entry point define the future.

In the eyes of VCs, the new demands placed on search engines in the era of LLMs and AI are gradually setting off the third browser war.

The Obsolete Architecture of Modern Browsers

The traditional pipeline behind the browser's search box includes:

Client front-end entry: Handles TLS decryption, QoS sampling, and geo-routing.
Query understanding: Performs spelling correction, synonym expansion, and intent analysis.
Candidate recall: Uses inverted indexes and vector indexes to select preliminary candidate pages.
Multi-level sorting: Narrows the candidate pages down to about 1,000 using lightweight features.
Deep learning main ranking: Uses technologies such as RankBrain and Neural Matching to understand query semantics.
Deep reordering: Uses the BERT model for a more refined ordering of documents.

This is a typical workflow of the Google search engine. However, in the current era of AI and big data, users have developed new demands for browser interaction.
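
As a rough illustration of the recall, multi-level sorting, and deep reordering stages listed above, the following Python sketch runs a toy query through an inverted index, a cheap term-count score, and a stand-in "deep" scorer. The corpus, the function names, and both scoring functions are hypothetical and chosen only to show the funnel shape; production systems such as RankBrain or BERT-based rerankers are learned models and far more involved.

```python
from collections import defaultdict

# Toy corpus; document ids and text are made up for illustration.
DOCS = {
    1: "chrome browser release notes",
    2: "firefox browser privacy settings",
    3: "how to bake bread at home",
    4: "browser market share chrome firefox safari",
}

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index

def recall(index, query):
    """Candidate recall: every document sharing at least one query term."""
    candidates = set()
    for term in query.split():
        candidates |= index.get(term, set())
    return candidates

def light_score(query, text):
    """Cheap feature: number of distinct query terms present in the document."""
    return len(set(query.split()) & set(text.split()))

def deep_score(query, text):
    """Stand-in for a learned reranker: Jaccard overlap of query and document terms."""
    q, d = set(query.split()), set(text.split())
    return len(q & d) / len(q | d)

def search(query, docs, keep=3):
    index = build_inverted_index(docs)
    candidates = recall(index, query)
    # Multi-level sorting: keep only the top `keep` candidates by the cheap score.
    shortlist = sorted(candidates, key=lambda d: (-light_score(query, docs[d]), d))[:keep]
    # Deep reordering: rerank the shortlist with the more expensive scorer.
    return sorted(shortlist, key=lambda d: (-deep_score(query, docs[d]), d))

results = search("chrome browser", DOCS)
print(results)
```

The point of the funnel is cost: the expensive scorer only ever sees the shortlist, never the whole corpus.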

AI Will Reshape Browsers

The browser serves as a universal entry point, not only for reading data but also for user interaction with that data. The browser itself is a place where user fingerprints are stored. More complex user behaviors and automated actions must be carried out through the browser.

The browser is also where personalized content is stored:

Most large models are hosted in the cloud, so they cannot easily access sensitive data on the local machine.
Sending data to third-party models requires obtaining the user's authorization again.
Actions such as auto-filling verification codes or calling the camera must be completed inside the browser sandbox.
The data context depends heavily on the browser, including tabs, cookies, and so on.

Profound Changes in Interaction Forms

User search behavior is evolving. Research from 2024 suggests that out of every 1,000 Google queries in the United States, about 63% end as "zero-click" searches. Users have grown accustomed to getting information directly from the search results page.

AI browsers still need to find the right interaction forms, especially for reading data. The "hallucination problem" of current large models has not been eliminated, so many users find it hard to fully trust automatically generated content summaries.

What can truly trigger a large-scale transformation in browsers is the data interaction layer. Users are increasingly inclined to use natural language to describe complex tasks, and these Agentic Tasks are being taken over by AI Agents.

The future browser must be designed for full automation, considering:

How to balance human reading experience with AI agent interpretability
How to serve users and agent models on the same page

Browser Use

Browser Use has built a genuine semantic layer, creating a semantic recognition architecture for the next generation of browsers. It reinterprets the traditional "DOM = node tree for humans" as "semantic DOM = instruction tree for LLMs," allowing agents to click, fill in, and upload accurately without having to look at screen coordinates.

This route replaces visual OCR and coordinate-based Selenium scripts with "structured text → function call," which executes faster, uses fewer tokens, and makes fewer errors. TechCrunch calls it "the glue layer that truly allows AI to understand web pages."

Main features of Browser Use:

Abstracts interactive elements into JSON fragments, accompanied by metadata such as roles and visibility.
Converts the entire page into a flattened "semantic node list" that the LLM can read in one pass.
Receives the high-level instructions output by the LLM and replays them in a real browser.

If this set of conventions were adopted as a W3C standard, it could go a long way toward solving the browser's input problem.
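
To make the three features above concrete, here is a minimal Python sketch of the idea. It is not Browser Use's actual API or data model: the `Element` class, `to_semantic_nodes`, `replay`, and the action format are all hypothetical, and a hand-built element list stands in for a real DOM so the example runs without a browser.

```python
import json
from dataclasses import dataclass

@dataclass
class Element:
    """Simplified stand-in for a DOM element (hypothetical, not Browser Use's model)."""
    tag: str
    role: str
    label: str
    visible: bool = True
    interactive: bool = False
    value: str = ""

def to_semantic_nodes(elements):
    """Flatten a page into JSON-ready nodes plus the elements they refer to.

    Only visible, interactive elements are kept, so the LLM never sees
    headings, hidden fields, or layout noise.
    """
    nodes, targets = [], []
    for el in elements:
        if el.visible and el.interactive:
            nodes.append({"index": len(nodes), "tag": el.tag,
                          "role": el.role, "label": el.label})
            targets.append(el)
    return nodes, targets

def replay(action, targets):
    """Apply one high-level instruction to the element it addresses by index."""
    el = targets[action["index"]]
    if action["action"] == "fill":
        el.value = action["text"]
        return f"filled '{el.label}'"
    if action["action"] == "click":
        return f"clicked '{el.label}'"
    raise ValueError(f"unsupported action: {action['action']}")

page = [
    Element("h1", "heading", "Sign in"),
    Element("input", "textbox", "Email", interactive=True),
    Element("input", "textbox", "Promo code", visible=False, interactive=True),
    Element("button", "button", "Continue", interactive=True),
]

nodes, targets = to_semantic_nodes(page)
prompt_context = json.dumps(nodes)  # the text an LLM would read in one pass

# Pretend the LLM answered with these two instructions.
llm_output = [
    {"action": "fill", "index": 0, "text": "user@example.com"},
    {"action": "click", "index": 1},
]
log = [replay(action, targets) for action in llm_output]
print(prompt_context)
print(log)
```

Because the agent addresses elements by index in the node list rather than by pixel position, the same instructions keep working when the visual layout changes.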

Arc

The Browser Company (Arc's parent company) is now focused on the AI-driven browser Dia. However, its predictions are flawed because they do not clearly distinguish the two dimensions of "interaction": input and output.

On the input side, AI can indeed make command-style interaction more efficient. On the output side, however, that judgment is clearly lopsided: it overlooks the browser's core role in presenting information and personalizing the experience. As a platform that holds private data and can render almost any product interface, the browser is hard to substitute on the input side, and the complexity of its output side makes it hard to disrupt.

To truly shake Chrome, the browser's rendering mode has to be reshaped from the ground up to suit interaction dominated by AI Agents, especially in the design of the input-side architecture. Browser Use focuses on restructuring the browser's underlying mechanisms, pushing toward "atomization" or "modularization"; the programmability and composability that result carry highly disruptive potential.

Perplexity

Perplexity is an AI search engine known for its recommendation system, with a latest valuation of $14 billion. Its main feature is real-time summarization of pages, which gives it an advantage in surfacing up-to-the-minute information. Perplexity will launch its native browser, Comet, with its answer engine deeply embedded.

However, Perplexity still needs to address its high search costs and the low profit margins on marginal users. Google is also actively reinventing itself around AI, launching a new tab experience with AI Mode.

It is difficult to truly threaten Google merely by imitating surface-level functions. What is likely to establish a new order is the reconstruction of the browser architecture from the ground up, deeply embedding LLM into the browser kernel, and achieving a fundamental transformation in interaction methods.

Brave

Brave is one of the earliest and most successful browsers in the crypto industry and is built on Chromium. It attracts users with privacy protection and a model of earning tokens by browsing. However, demand for privacy is still concentrated among specific user groups, which makes it hard to disrupt the existing giants.

Brave has 82.7 million monthly active users and 35.6 million daily active users, for a market share of approximately 1%-1.5%. Its average monthly search query volume is about 1.34 billion, roughly 0.3% of Google's.

Brave plans to upgrade to a privacy-first AI browser. However, because it collects little user data, its large models are less customized, which hinders rapid and precise product iteration. In the upcoming era of the Agentic Browser, Brave may keep a stable share among privacy-focused users, but it is unlikely to become a major player.

Donut

Crypto startup Donut has raised $7 million in pre-seed funding. Its vision is an integrated capability spanning "exploration, decision-making, and crypto-native execution."

The core of this direction is a crypto-native path for automated execution. In the future, Agents are expected to replace search engines as the main traffic entry point, with entrepreneurs competing for the access and conversion traffic that Agent execution brings. The industry has started calling this trend "AEO" (Answer/Agent Engine Optimization) or "ATF" (Agentic Task Fulfilment).

Advice for Entrepreneurs

The browser itself remains the largest "gateway" in the part of the internet that has yet to be transformed. There are approximately 2.1 billion desktop users and over 4.3 billion mobile users globally; the browser is their common carrier for data input, interactive behavior, and personalized fingerprint storage.

For entrepreneurs, the real disruptive potential does not lie in optimizing the "output" layer. The true breakthrough lies on the "input side": how to get AI Agents to proactively call the entrepreneur's product to complete specific tasks. This will decide whether future products can be integrated into the Agent ecosystem and earn a share of its traffic and value.

In the search era, the contest was for "clicks"; in the agent era, it is for "invocations."

Entrepreneurs should reimagine products as API components, allowing agents to not only "understand" them but also "invoke" them. Product design needs to consider three dimensions:

Standardization of interface structure: Is the product "callable"?

Can key operations be described through a semantic DOM structure or JSON mapping?
Is a state machine provided to enable the Agent to reliably reproduce user behavior flows?
Does user interaction support scripted restoration?
Is there a stable WebHook or API Endpoint available?
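
One way to read the checklist above is sketched below in Python: a hypothetical checkout flow exposed as a callable capability, with an explicit state machine, a JSON description of the operations valid in the current state, and scripted restoration by replaying recorded operations. The class name, states, and operations are all illustrative assumptions, not an existing standard or library.

```python
import json

# Allowed transitions for a hypothetical checkout flow:
# state -> {operation: next_state}.
TRANSITIONS = {
    "empty": {"add_item": "cart"},
    "cart": {"add_item": "cart", "checkout": "awaiting_payment"},
    "awaiting_payment": {"pay": "paid", "cancel": "cart"},
    "paid": {},
}

class CheckoutCapability:
    def __init__(self):
        self.state = "empty"
        self.history = []  # recorded operations, usable as a replay script

    def describe(self):
        """JSON mapping an agent can read to learn which operations are valid now."""
        return json.dumps({
            "capability": "checkout",
            "state": self.state,
            "operations": sorted(TRANSITIONS[self.state]),
        })

    def invoke(self, operation):
        """Run one operation, rejecting anything the current state does not allow."""
        allowed = TRANSITIONS[self.state]
        if operation not in allowed:
            raise ValueError(f"'{operation}' is not valid in state '{self.state}'")
        self.state = allowed[operation]
        self.history.append(operation)
        return self.state

    @classmethod
    def restore(cls, script):
        """Scripted restoration: rebuild a session by replaying recorded operations."""
        session = cls()
        for operation in script:
            session.invoke(operation)
        return session

session = CheckoutCapability()
for operation in ["add_item", "checkout", "pay"]:
    session.invoke(operation)

replayed = CheckoutCapability.restore(session.history)
print(session.describe())
```

Making invalid transitions fail loudly is the design choice that matters here: an agent that guesses wrong gets a clear error and a fresh `describe()` to read, rather than a page in an undefined state.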

Identity and access: Can the Agent "cross the trust barrier"?

Can the product serve as a trusted intermediary through which AI agents complete transactions, invoke payments, or manage assets?
For crypto entrepreneurs, consider building the "MCP (Multi Capability Platform) of the blockchain world."
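
The trusted-intermediary question above can be pictured as a scoped delegation: the user grants a specific agent a set of allowed actions and a spending cap, and the product checks every request against that grant. The Python sketch below is only an assumption about what such a check might look like; the `Delegation` class, the scope names, and the integer amounts are hypothetical, and real payment or on-chain authorization would also need signatures, expiry, and auditing.

```python
from dataclasses import dataclass

@dataclass
class Delegation:
    """A user's grant to one agent: allowed actions plus a spending cap (hypothetical)."""
    agent_id: str
    scopes: frozenset
    spend_limit: int  # smallest currency unit, to avoid float rounding
    spent: int = 0

    def authorize(self, agent_id, action, amount=0):
        """Return True and record the spend only if every check passes."""
        if agent_id != self.agent_id:
            return False  # a different agent cannot reuse this grant
        if action not in self.scopes:
            return False  # action was never delegated
        if self.spent + amount > self.spend_limit:
            return False  # would exceed the user's cap
        self.spent += amount
        return True

grant = Delegation(agent_id="agent-1", scopes=frozenset({"pay"}), spend_limit=50)

first = grant.authorize("agent-1", "pay", 30)              # within scope and cap
second = grant.authorize("agent-1", "pay", 25)             # 30 + 25 exceeds the cap
third = grant.authorize("agent-1", "transfer_asset", 1)    # scope not granted
fourth = grant.authorize("agent-2", "pay", 1)              # wrong agent
print(first, second, third, fourth, grant.spent)
```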

Rethinking the traffic mechanism: The future is not SEO, it is AEO/ATF

Products should have a clear task granularity: not a "page" but a "callable capability unit."
Start working on Agent optimization (AEO) or task-scheduling adaptation (ATF).
Adapt to different LLM frameworks.
