Brave Search introduces the Summarizer, an AI tool for synthesized, relevant results
Today we’re thrilled to announce the latest AI-powered feature in Brave Search: the Summarizer.
The Summarizer provides concise answers at the top of Brave Search results pages, in response to the user’s input, based solely on Web search results. Purely generative AI models are prone to unsubstantiated assertions; by contrast, we trained our large language models (LLMs) to process multiple sources of information present on the Web. This produces a more concise, accurate answer, expressed in coherent language.
In addition, the provenance of original sources of data is cited at all times via links. This maintains the rightful attribution of information, and helps users assess the trustworthiness of the sources, both of which are needed to mitigate the authority biases of large language models.
The Brave Summarizer is available today for all users of Brave Search, on desktop and mobile. Users who prefer not to use the Summarizer can turn it off in settings.
Using Web results enables the Summarizer to provide real-time information that is up to date with today’s events. Given the current advancements in AI, it’s crucial to remind users that one should not believe everything an AI system produces, in much the same way one should not believe everything that is published on the Web. At the risk of stating the obvious, we should not suspend critical thinking for anything we consume, no matter how impressive the results of AI models can be.
Besides the summary itself, our AI models are also able to replace the already query-dependent snippets (result descriptions) with a summarized version of those snippets, highlighting the answer when possible. This can be viewed as a summary of a single source (such as a press article), as opposed to the main summary where multiple sources are considered and aggregated to create a more comprehensive answer. The summary at the top of the results page and these special descriptions co-occur, so users will see the overarching summary as well as snippets with highlighted answers.
Note the highlighted answers in the result snippets
“With 22 million queries per day, Brave Search is the fastest growing search engine since Bing. We provide independent search results from our own index of the Web, and today we’re further improving the relevance of those results with our AI-powered Summarizer,” said Josep M. Pujol, Chief of Search at Brave. “Unlike AI chat tools which can provide fabricated responses, the Summarizer generates a plain-written summary at the top of the search results page, aggregating the latest sources on the Web and providing source attribution for transparency and accountability. This open system is available to all Brave Search users today to help them better navigate search results.”
Unlike many others that have released similar features recently, we do not rely on third parties, nor do we limit access due to scalability concerns. The Brave Summarizer relies on our owned and operated models that are highly tuned to be as efficient as possible at inference time. Today, Brave Search processes daily peaks of 600 queries per second, which are then evaluated against our AI model. Although a summary is generated for about 17% of queries, we expect this number to grow in the near future as we scale our system. The Brave AI model is probably the largest such system in production to date, for three reasons: it receives more traffic than others in terms of queries per second, we run the Summarizer against all queries, and Bing and Google have yet to open up their systems. 1
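To put those figures in perspective, here is a rough back-of-the-envelope calculation. It uses only the numbers quoted in this post (600 queries per second at peak, 22 million queries per day, a summary for about 17% of queries) and assumes, which the post does not state, that the 17% rate holds uniformly at peak traffic. The function names are illustrative only.

```python
# Back-of-the-envelope load estimate from the figures quoted in this post.
PEAK_QPS = 600                  # daily peak, queries per second
SUMMARY_RATE = 0.17             # share of queries that receive a summary
QUERIES_PER_DAY = 22_000_000    # daily query volume
SECONDS_PER_DAY = 24 * 60 * 60


def summaries_per_second_at_peak(peak_qps=PEAK_QPS, rate=SUMMARY_RATE):
    """Summaries generated per second at peak.

    Assumption: the 17% rate applies uniformly at peak traffic.
    """
    return peak_qps * rate


def average_qps(queries_per_day=QUERIES_PER_DAY):
    """Average queries per second implied by the daily volume."""
    return queries_per_day / SECONDS_PER_DAY


def summaries_per_day(queries_per_day=QUERIES_PER_DAY, rate=SUMMARY_RATE):
    """Summaries generated per day at the quoted rate."""
    return queries_per_day * rate


if __name__ == "__main__":
    print(f"Summaries/s at peak: {summaries_per_second_at_peak():.0f}")
    print(f"Average queries/s:   {average_qps():.0f}")
    print(f"Summaries/day:       {summaries_per_day():,.0f}")
```

Under these assumptions, the model is evaluated against every query but produces on the order of a hundred summaries per second at peak, and a few million per day.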
Besides scalability, a tremendous effort has been put into ensuring the quality of the generated summaries. However, as the model is still in its early phase of development, there is the possibility of producing “hallucinations,” which mix unrelated snippets into a single result. There is also the possibility of some false or offensive text, but our aim is to continue working on improving the models as our users’ feedback starts pouring in.
The Summarizer was fully developed by the Brave Search team and as such is based on the same principles of independence and privacy that we apply across all products. The Summarizer is not powered by ChatGPT or its backend systems; it is instead composed of three different LLMs 2 trained on different tasks:
The first one is QA (question answering): this model is used to try to extract a concrete answer, if any, from text snippets. Brave has been using LLMs for a while to improve search relevance, and this is an extension of what Brave Search already had in place to power its knowledge graph and featured snippets features. The difference lies in the number and length of text snippets analyzed.
After the QA extraction phase, result candidates are further classified with an ensemble of zero-shot classifiers on a wide variety of criteria (hate speech, vulgar writing, spam, etc.).
The final set of candidate text is ultimately processed by the summarizer/paraphrasing model, which tries to rewrite the input so that repetition is removed and the language is kept uniform to improve readability.
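The three stages above can be pictured as a simple pipeline: extract, filter, then rewrite. The sketch below only illustrates that control flow and is not Brave's implementation. Each stage is a trivial stand-in (keyword overlap instead of a QA model, a blocklist instead of zero-shot classifiers, de-duplication instead of a paraphrasing model), and all names (`Snippet`, `Candidate`, `BLOCKED_TERMS`, `run_pipeline`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Snippet:
    url: str
    text: str


@dataclass
class Candidate:
    url: str
    answer: str


def _words(text: str) -> set:
    """Lowercased words with basic punctuation stripped."""
    return {w.lower().strip("?.,") for w in text.split()}


def extract_answer(query: str, snippet: Snippet) -> Optional[Candidate]:
    """Stage 1 stand-in for the QA model: first sentence sharing a query term."""
    terms = {w for w in _words(query) if len(w) > 3}
    for sentence in snippet.text.split(". "):
        if terms & _words(sentence):
            return Candidate(snippet.url, sentence.strip().rstrip("."))
    return None  # no concrete answer found in this snippet


# Hypothetical blocklist standing in for the ensemble of zero-shot classifiers.
BLOCKED_TERMS = ("buy now", "free money")


def passes_filters(candidate: Candidate) -> bool:
    """Stage 2 stand-in: reject candidates that contain a blocked term."""
    text = candidate.answer.lower()
    return not any(term in text for term in BLOCKED_TERMS)


def summarize(candidates: List[Candidate]) -> str:
    """Stage 3 stand-in for the paraphrasing model: drop repeated answers."""
    seen = set()
    parts = []
    for candidate in candidates:
        key = candidate.answer.lower()
        if key not in seen:
            seen.add(key)
            parts.append(candidate.answer)
    return ". ".join(parts) + "." if parts else ""


def run_pipeline(query: str, snippets: List[Snippet]) -> Tuple[str, List[str]]:
    """Run the three stages; return the summary and the cited source URLs."""
    extracted = []
    for snippet in snippets:
        candidate = extract_answer(query, snippet)
        if candidate is not None:
            extracted.append(candidate)
    kept = [c for c in extracted if passes_filters(c)]
    return summarize(kept), [c.url for c in kept]
```

Returning the source URLs alongside the summary mirrors the attribution described earlier: every candidate that survives filtering keeps a link back to the page it came from, even when its text is merged with a duplicate.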
We plan to share more technical details with special emphasis on the scalability aspect and lessons learned after the first weeks of our large scale release.
Note that the Summarizer is currently disabled in Brave Search Goggles (an innovative Brave Search feature that enables users to create filters to alter the ordering of search results) 3, while we refine our models to guarantee the quality of the input source for user-generated Goggles. We’ll be sharing more details about the Summarizer for Goggles in the near future.
Conclusions
The Brave Search Summarizer comes on top of multiple developments we integrated to improve search relevance in the past, and more recent developments that were triggered after the release of ChatGPT last December, notably after the announcement that Microsoft would deeply integrate OpenAI’s models into its search engine Bing. Although Microsoft's integration has not been publicly released, early feedback on the model ranges from impressive 4 to scary 5.
Although the industry is generating much hype around AI, at Brave we are not yet convinced that LLMs can replace search as we know it. However, if used properly, these new models can help the user navigate results, which is the approach we follow with the Summarizer. Chat-like interfaces and oracle-based search remain unproven and, as of today, we remain skeptical that they’ll be useful for all search tasks.
That said, we are strong believers in any new technology that helps users satisfy their needs and puts them in control of their online experience. We will continue exploring how to apply LLMs not only to the search field, but also to the Brave browser, where we expect the assistant-like capabilities of LLMs to be truly fruitful and revolutionary.
Brave has a history of taking up big challenges, offering credible alternatives to Big Tech. The Brave browser has over 57 million monthly active users, and Brave Search serves more than 22 million queries per day out of its independent index, which makes it the biggest search engine after Bing, Google, and other engines relying on Bing or Google APIs. Though it’s still early, we expect our developments with AI to follow a similar path and see proven user adoption.
Brave Search is available on all Brave browsers (desktop, Android, and iOS), and is also available from any other browser at search.brave.com.
2. The base models are either BART or DeBERTa (both open source and hosted on Hugging Face), heavily retrained on our own data from search results.