Blog: Are Search Engines Doomed to Be Replaced by ChatGPT?

The latest chat bot, ChatGPT from OpenAI, is hailed as a know-it-all candidate that could soon change the way we search.


NATURAL LANGUAGE AND LARGE LANGUAGE MODELS (LLMS)

 

Language is a multifaceted phenomenon that can be described on neurological, biological, psychological, social, anthropological and even historical levels. It lies at the heart of what it means to be human, arguably more so than our emotions, art or music (those being culture-specific rather than universal, unlike speech), or our non-linguistic cognitive faculties (which we share, to varying degrees, with other animals). Even people who cannot speak are submerged in a world which humans have created over hundreds of millennia through the use of words. Somehow, it feels as if trying to reduce this to a single “paradigm” or “theoretical framework” is to miss the point, like talking about a painting solely in terms of the chemical composition of the paint.

 

One easily gets the impression that the entire domains of natural language processing (NLP) and natural language understanding (NLU) have been reduced to transformer-based chat bot programs. Without a doubt, the latest LLMs—Large Language Models—are fun to play with, especially when going beyond simple conversations and into the territory of more creative activities. This category of models is able to “create” simple art, prose, music and even animations/videos. The act of giving the model the right prompt is already an art form in and of itself.  

 

We can also use this type of technology to write code. The quality of the output in this case is easier to assess, since machine-written code can be validated by running it and seeing whether it produces the desired results.  
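This validation loop can be sketched concretely. Assume the model produced the hypothetical function `slugify` below (the function and the test cases are illustrative, not from any real session); we check its correctness simply by running it against concrete expectations:

```python
import re

# Hypothetical model-generated function: turn a title into a URL slug.
def slugify(title: str) -> str:
    slug = title.strip().lower()
    slug = re.sub(r"[^a-z0-9]+", "-", slug)  # collapse non-alphanumerics to hyphens
    return slug.strip("-")

# Validation: run the machine-written code and compare against expected results.
tests = {
    "Hello, World!": "hello-world",
    "  ChatGPT & Search  ": "chatgpt-search",
}
for title, expected in tests.items():
    assert slugify(title) == expected, (title, slugify(title))
print("all tests passed")
```

If an assertion fails, we have found one of the bugs mentioned above and can fix it manually, which is typically faster than writing the whole function from scratch.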

 

Having a productive coding session with OpenAI’s ChatGPT makes a lot of sense on the surface. As long as we can describe what we want with reasonable precision, the bot will try to compose complete functions and programs; even if these contain bugs or are written in an idiosyncratic style, it is in many cases faster to start from a 90% working program and fix it manually than to write it from scratch. In the future, automatic coding will likely also take run-time error messages and compiler messages into account, so that a correct version of the final program can be generated in one round, or just a few.

 

 

ChatGPT clearly solves many mundane, day-to-day coding tasks. It works exceedingly well where a very specific solution to an isolated problem is desired, such as creating a function for calculating some value. Asking for boilerplate code for talking to APIs also works extremely well, since there are so many examples from which it can “learn” how to do this. This way, ChatGPT-powered coding will likely save us a lot of time.
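The kind of API boilerplate meant here looks roughly like the sketch below. The endpoint, parameter names and header are hypothetical stand-ins; the point is that the pattern occurs in so many training examples that an LLM reproduces it reliably:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint -- the exact URL and parameters are assumptions
# made for illustration, not a real service.
BASE_URL = "https://api.example.com/v1/search"

def build_search_request(query: str, page: int = 1) -> Request:
    """Build a GET request for a typical paginated JSON search API."""
    params = urlencode({"q": query, "page": page})
    return Request(
        f"{BASE_URL}?{params}",
        headers={"Accept": "application/json"},
    )

req = build_search_request("large language models")
print(req.full_url)
# https://api.example.com/v1/search?q=large+language+models&page=1
```

Nothing here is hard, but typing such glue code by hand is tedious, which is exactly why delegating it pays off.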

 

Once we look at more complex coding tasks, we can see that there still is plenty of room for improvement. Furthermore, we also have to remember the old software engineering joke: “so many programmers, and so few people to tell them what to do.” So, one could question whether ChatGPT is a truly profound, world-changing development; all it might lead to is a replacement of some low-level coders with AI. At the very least, it seems like creative non-coders might be able to utilize the technology to achieve things that previously lay beyond their grasp.

 

SEARCH AND LLMS

 

Search engines today rely to a great extent on traditional NLP, including parsing, chunking and other techniques. The largest search engines may even make use of LLMs to support search in various ways (e.g., query expansion, query understanding). But at present, LLMs are still confined to a supporting role within the domain of large-scale search.

 

This is because search is a slightly different problem from language generation. Search is the process of locating information that already exists. Generative LLMs, by contrast, build new things from old pieces of data. Even if we had an AI capable of generating a response to every possible query (which is easy to set up today), its output would still not mirror the actual, underlying reality, because reality is in many cases characterized by a lack of data and by confusing noise. Granting an AI model enough introspection to know when it should not generate a certain output is a complex and unresolved problem. If search is taken as a truth-finding process, we can clearly see why trust in an LLM might be misplaced. For one, by “transforming” the underlying data in various ways, LLMs introduce extra layers of interpretation between data and output that might not be what the user wants. LLMs are also notorious for fabricating content (often called “hallucination”); worse yet, they may present the results of their hallucinating in an authoritative writing style that implies subject expertise where none is warranted.

 

In subjects such as health, this is dangerous and irresponsible: LLMs generate “answers,” they do not find reliable ones. NLP can be useful for interpreting human statements and feelings, as one step in a more responsive—and responsible—process of seeking information. Other AI and search approaches then also need to be involved, alongside human interpretation. Context outside the scope of words matters profoundly for analysis and interpretation.

 

GOING ALL-IN WITH AN LLM AS A SEARCH ENGINE

 

An LLM-powered search engine would be an interesting exercise. Unfortunately, several problems need to be resolved before this is realistic. In addition to the issues raised above, there is the sheer cost of running such a service. One can imagine that an LLM engine would have to rely heavily on advertising for revenue, and as the last 20 years or so have shown, that road leads to search monopolies and stagnation. (Read more on the topic in Next Search Possible.)
Many areas of the internet feature extremely few data points. If you try to build a catalog of all retail products on sale in online marketplaces, you will soon come to the realization that only a few percent (out of billions) come with any significant amount of useful data attached to them. For the rest, you will have to fill in the gaps by means of guesswork. An LLM could make this arduous task somewhat simpler, but can we trust it to be accurate? And how does it compare with a human annotator, or with a simpler, rules-based algorithm for filling in missing data fields? When a buyer comes prepared to part with serious money to get hold of a rare item, it is questionable to what degree they would trust an LLM over their own or another human’s judgment.
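The rules-based baseline mentioned above can be sketched in a few lines. The field names and defaults are hypothetical; the point is that this is the kind of simple, auditable gap-filler an LLM-based approach would have to be evaluated against:

```python
# Minimal rules-based gap-filling for sparse product records.
# Field names and defaults are illustrative assumptions, not a real schema.
DEFAULTS = {"currency": "USD", "condition": "unknown", "in_stock": False}

def fill_missing(product: dict) -> dict:
    """Return a copy of the record with missing fields filled by fixed rules."""
    filled = dict(product)
    for field, default in DEFAULTS.items():
        filled.setdefault(field, default)
    # Simple inference rule: derive a title from the model number if absent.
    if "title" not in filled and "model" in filled:
        filled["title"] = f"Item {filled['model']}"
    return filled

sparse = {"model": "X-100", "price": 49.0}
print(fill_missing(sparse))
```

Unlike an LLM, every output of such rules can be traced back to an explicit, inspectable decision, which makes accuracy comparisons straightforward.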

 


Information reality is also about controlled sparseness, especially on the internet. Much information is fully locked down—by design—behind paywalls because the data is a financial asset to someone. Limiting information access to earn money by selling it is an ancient business model. Thus, an endless supply of cat images floats around out there for an AI to learn how to synthesize felines from; but in contrast, try to construct a bot that will accurately answer knowledge-based questions and provide reliable services. If such a thing existed, many service businesses could likely collapse; hence, this information is tightly controlled and usually not available, or at least not for free. In this case, a traditional (non-LLM) search engine helps us find what is findable, ignores the rest since it cannot be reached, and does so without making things up or requiring huge volumes of data.


As described above, the areas where traditional search engines excel over theoretical search LLMs are rooted in trustworthiness and authority. LLMs do not, as a rule, care about truth—they generate their answers based on statistical relations mined from the vast amounts of data fed to them in the training stage. For example, imagine you want to try going on a diet. How do you choose the most suitable one? The noise level in this space is enormous, as anyone who has tried searching for anything diet-related knows. It’s a random game. We have biased input coming from all directions. The underlying science is weak. Commercial interests distort the available information. So, there are contradictions, ambiguity and bias everywhere. Is this really the kind of uninterpreted data we want to synthesize an answer from? Equating search with a generative statistical technology that understands absolutely nothing could potentially prove to be a big step backwards.

 

PROS AND CONS OF LLMS AS SEARCH ENGINES

 

Problems with LLMs as Search Engines 

 

  • LLMs truly extend the concept of “one model to rule them all”. We have seen this pattern over and over in the world of search, and hopefully it will be replaced in the future by a more collaborative and open approach to search.
  • LLMs currently require massive training and are very far from actually managing near-real-time data acquisition, which is what we expect from search engines. 
  • LLMs synthesize (generate) data. This synthesis, sometimes jokingly referred to as “hallucination,” may not always reflect reality. When searching, we look for reality, preferably from the source and not an interpretation/transformation. 
  • LLMs are hard to validate. It may even be difficult to produce the same results based on a given input, something we require from most scientific research work. This is the reproducibility problem. 
  • Blocking LLMs from producing results where they should not provide one seems to be an unresolved problem. High-frequency queries are currently managed by means of manual curation, but not the queries far out on the long tail. And the tail is very, very long in search.
  • LLMs can explain their outputs, but only in the same way they generate answers: from the same shaky statistical ground, far from the genuine understanding that normally underlies an explanation. Nor is there any simple way to understand the underlying parameter space.
  • They are very energy-consuming. Training the largest LLMs takes massive amounts of energy. A future where everyone needs to train their own models is not sustainable, so a general discussion about access will arise. Yet “one model to rule them all” is far from sufficient for humanity.

 

Benefits with LLMs as Search Engines 

 

  • Transformers and similar deep learning technology offer many advantages in the context of search. Their current state suits supporting roles in search, rather than actually replacing search engines. A few examples:
    • Query auto-completion 
    • Query expansion 
    • Query understanding (for example to extract query intent) 
    • Transformation of input queries to increase recall & precision 
    • Summarization of text to extract useful facts and explanations 
    • Language translation 
  • Guided search based on bot interaction, where a dialog with a user may lead to better-formulated queries or insights. Using LLMs as an interface to search engines seems a promising approach, though at their current level they are of course tripped up by the problems highlighted above.
  • Exploration and discovery functions, making use of creative synthesis. 
  • “Entertainment search” is a possible area for LLMs today, which is why we see so many fun chat bots emerging.
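One of the supporting roles listed above, query expansion, can be sketched very simply. The synonym table here is a hand-written toy stand-in; in a real engine it might be derived from embeddings or an LLM, but the search engine itself still performs the retrieval:

```python
# Toy query expansion: a supporting role for language tech in search.
# The synonym table is an illustrative assumption, not a real resource.
SYNONYMS = {
    "laptop": ["notebook", "ultrabook"],
    "cheap": ["affordable", "budget"],
}

def expand_query(query: str) -> list:
    """Expand each query term with its known synonyms to increase recall."""
    expanded = []
    for term in query.lower().split():
        expanded.append(term)
        expanded.extend(SYNONYMS.get(term, []))
    return expanded

print(expand_query("cheap laptop"))
# ['cheap', 'affordable', 'budget', 'laptop', 'notebook', 'ultrabook']
```

The expanded term list is then handed to an ordinary index lookup, which is precisely the division of labor argued for in this post: generation assists, retrieval decides.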
