OpenAI-Google battle is less about search. It’s more about AGI

Source: Live Mint

While this is the obvious part, beneath the surface, the bigger fight is also about controlling all streams of user data, including those from search engines and social media, which can help big tech companies such as Google, OpenAI, Microsoft, Meta, Nvidia and Elon Musk’s xAI build the world’s most powerful artificial intelligence (AI) model.

ChatGPT garnered more than 100 million users within two months of its launch in late November 2022, prompting many to dub it a search-engine killer. The reason was that ChatGPT can write poems, articles, tweets, books and even code much as humans do, and it is interactive, whereas search engines passively provide links to articles. Microsoft, which has a stake in OpenAI, even integrated ChatGPT with its own search engine, Bing. At that time, though, ChatGPT was still being tested and lacked knowledge of current events, having been trained on data only till the end of 2021.

From September 2023, ChatGPT began accessing the internet, thus providing up-to-date information. But it started facing allegations of “verbatim”, “paraphrase”, and “idea” plagiarism and copyright violations from publishers around the world. Late last year, for instance, The New York Times initiated legal proceedings against Microsoft and OpenAI, alleging unauthorized “copying and using millions of its articles”. OpenAI did give publishers the option to block bots from crawling their content, but separating AI bots from the crawlers of search engines such as Google or Microsoft’s Bing, which facilitate page indexing and visibility in search results, is easier said than done.

OpenAI’s SearchGPT prototype, which is currently available for testing, will not only access the web but also provide “clear links to relevant sources”, the company said in a blog post on 26 July. This implies that more than targeting Google’s search engine, OpenAI appears to be trying to pacify and rebuild rapport with publishers it has antagonised. And this time around, OpenAI is “…also launching a way for publishers to manage how they appear in SearchGPT, so publishers have more choices”.

It clarifies that SearchGPT is about search and “separate from training OpenAI’s generative AI foundation models”. It adds that the search results will show sites even if they opt out of generative AI training. OpenAI explains that a webmaster can allow its “OAI-SearchBot to appear in search results while disallowing GPTbot to indicate that crawled content should not be used for training OpenAI’s generative AI foundation models”.
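The opt-out described above is expressed through a site's robots.txt file. The following is a minimal sketch, not OpenAI's own tooling: the robots.txt contents, the example.com URL and the helper name crawler_access are illustrative assumptions, while OAI-SearchBot and GPTBot are the crawler names quoted above. It uses Python's standard-library robots.txt parser to check which crawlers such a file would admit.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a publisher that wants to appear in
# SearchGPT results but does not want its pages used for model training.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /
"""


def crawler_access(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return, for each user agent, whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Googlebot is included only to show that crawlers with no matching
# rule are allowed by default.
access = crawler_access(
    ROBOTS_TXT,
    "https://example.com/news/story.html",
    ["OAI-SearchBot", "GPTBot", "Googlebot"],
)
print(access)
```

Under these assumed rules the search crawler is admitted and the training crawler is refused. Whether a given crawler actually honours the file is up to its operator, which is part of why publishers remain wary.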

Equations are changing, but slowly

To be sure, ChatGPT’s success is already making a dent in the worldwide lead of Google, which makes most of its revenue from advertising. For instance, Google recorded its smallest desktop search market share in more than a decade. Microsoft’s Bing, which supported ChatGPT and integrated it into its service, surpassed 10% of the market on desktop devices, according to Statista.

Google, whose advertising search revenue was $279.3 billion in 2023, is taking a hit, with many users already preferring Generative AI (GenAI) for searching online information first. “Many companies heard the call and saw $13 billion invested in generative AI (GenAI) for broad usage, namely search engines and large language models (LLMs), in 2023,” according to Statista.

Yet Google, according to Statista, continues to control more than 90% of the search-engine market worldwide across all devices, handling over 60% of all search queries in the US alone and generating over $206.5 billion in ad revenues from its search engine and YouTube. In India, too, the search-engine giant has a market share of over 92%. In countries like Germany and France, though, online users are increasingly choosing “privacy- or sustainability-focused alternatives such as DuckDuckGo or Ecosia”, according to Statista. China, for its part, has Baidu, while South Korea favours Naver; even Russia’s Yandex now has the third-largest market share among search engines worldwide.

ChatGPT certainly did not topple Google, agrees Dan Faggella, founder of market research firm Emerj Artificial Intelligence Research. “But it (OpenAI) definitely was seemingly their strongest real competitor,” he adds. “I’m much more nervous for Perplexity in, say, the next three months than I am about Google,” says Faggella, citing the lack of a “differentiator”.

“I think it’s a cool app. But I wonder if there’s enough of a context wrap for things like enterprise search. Google used to do enterprise search but no longer sees sense in it,” he adds. Perplexity, which has raised $100 million from the likes of Amazon founder Jeff Bezos and Nvidia, was valued at $520 million in its last funding round.

In a February interview with Mint, Perplexity co-founder and chief executive Aravind Srinivas argued that while Google will continue to have a “90-94% market share”, it will lose “a lot of the high-value traffic, from people who live in high-GDP countries and earning a lot of money, and those who value their time and are willing to pay for a service that helps them with the task”. He argued that over time, “the high-value traffic will slowly go elsewhere”, while low-value “navigational traffic” will remain on Google, making Google “a legacy platform that supports a lot of navigation services”.

“The bigger consideration is that the means and interfaces through which search occurs are evolving. These may become new interfaces other than the Chrome tab, where Google can very much get pushed aside, and I think the VR (virtual reality) ecosystem will be part of that as well. I don’t see Google dying tomorrow. But I think they should be shaking in their boots a little bit around what the future of search will be,” says Faggella.

Race to dominate the AI space

Faggella believes that “search is a subset of a much broader substrate monopoly game. It’s all about owning the streams of attention and activity, from personal and business users for things like their workflows, personal lives and conversations to help them (big tech companies) build the most powerful AI”. This, he explains, is why every big company wants you to use its chat assistant: so that it can continue to dominate economically.

Faggella believes that all the moves indicate that the big tech companies, including Google, Meta, and OpenAI, “are ardently moving towards artificial general intelligence (AGI)”. “Apple’s a little quieter about it. I don’t know where Tim Cook stands. They’re always a little bit more standoffish. But suffice it to say, they’re probably in that same running as well, although seemingly not as overt about it,” he adds.

OpenAI, for instance, has multimodal GenAI models, including GPT-4o and GPT-4 Turbo, while Google’s Gemini 1.5 Flash is available for free in more than 40 languages. Meta recently released Llama 3.1 with 405 billion parameters, which is the largest open model to date, and Mistral Large 2 is a 123 billion-parameter multilingual LLM. Big tech companies are also marching ahead on the path to achieve AGI, which envisages AI systems that are smarter than humans.

OpenAI argues that because “…the upside of AGI is so great, we do not believe it is possible or desirable for society to stop its development forever; instead, society and the developers of AGI have to figure out how to get it right…We don’t expect the future to be an unqualified utopia, but we want to maximize the good and minimize the bad and for AGI to be an amplifier of humanity”.

And OpenAI does not mind spending a lot of money to pursue this goal. The ChatGPT maker could lose as much as $5 billion this year, according to an analysis by The Information. However, in a conversation this May with Stanford adjunct lecturer Ravi Belani, OpenAI chief executive Sam Altman said, “Whether we burn $500 million a year, or $5 billion or $50 billion a year, I don’t care. I genuinely don’t (care) as long as we can, I think, stay on a trajectory where eventually we create way more value for society than that, and as long as we can figure out a way to pay the bills. Like, we’re making AGI. It’s going to be expensive. It’s totally worth it.”

In July, Google DeepMind proposed six levels of AGI “based on depth (performance) and breadth (generality) of capabilities”. While the ‘0’ level is no AGI, the other five levels of AGI performance are: emerging, competent, expert, virtuoso and superhuman. Meta, too, says its long-term vision is to build AGI that is “open and built responsibly so that it can be widely available for everyone to benefit from”. Meanwhile, it plans to grow its AI infrastructure by the end of this year with two clusters of 24,000 graphics processing units (GPUs) each, built on its in-house designed Grand Teton open GPU hardware platform.

Elon Musk’s xAI company, too, has unveiled the Memphis Supercluster, underscoring the partnership between xAI, X and Nvidia, while firming up his plans to build a massive supercomputer and “create the world’s most powerful AI”. Musk aims to have this supercomputer—which will integrate 100,000 ‘Hopper’ H100 Nvidia graphics processing units (and not Nvidia’s H200 chips or its upcoming Blackwell-based B100 and B200 GPUs)—up and running by the fall of 2025.

What can spoil the party

No AI model to date can be said to have powers of reasoning and feelings as humans do. Even Google DeepMind underscores that other than the ‘Emerging’ level, the other four AGI levels are yet to be achieved. LLMs, too, remain highly advanced next-word prediction machines and still hallucinate a lot, prompting sceptics like Gary Marcus, professor emeritus of psychology and neural science at New York University, to predict that the GenAI “…bubble will begin to burst within the next 12 months”, leading to an “AI winter of sorts”.

“My strong intuition, having studied neural networks for over 30 years [they were part of his dissertation] and LLMs since 2019, is that LLMs are simply never going to work reliably, at least not in the general form that so many people last year seemed to be hoping. Perhaps the deepest problem is that LLMs literally can’t sanity-check their own work,” says Marcus.

I elaborated on these points in my 19 July newsletter, “Misplaced enthusiasm over AI Appreciation Day. When will AI, GenAI provide RoI?”, where Daron Acemoglu, institute professor at the Massachusetts Institute of Technology (MIT), argues that while GenAI “is a true human invention” and should be “celebrated”, “too much optimism and hype may lead to the premature use of technologies that are not yet ready for prime time”. His interview was published in a recent Goldman Sachs report, “Gen AI: too much spend, too little benefit?”.

There’s also the fear that developers of big AI models will eventually exhaust finite data sources like Common Crawl, Wikipedia and even YouTube for training. Adding to the squeeze, a report in The New York Times said many of the “most important web sources used for training AI models have restricted the use of their data”, citing a study published by the Data Provenance Initiative, an MIT-led research group.

“Indeed, there is only so much Wikipedia to vacuum up. It takes billions of dollars to train this thing, and you’re going to suck that up pretty quickly. You’re also going to start sucking up all the videos pretty quickly, despite how quickly we can pump them in,” Faggella agrees.

He believes that the future of AI development will involve integrating sensory data from real-world interactions, such as through cameras, audio, infrared, and tactile inputs, along with robotics. This transition will enable AI models to gain a deeper understanding of the physical world, enhancing their capabilities beyond what is possible with current data.

Faggella points out that the competition for real-world data and the strategic deployment of AI in robotics and life sciences will shape the future economy, with major corporations investing heavily in AI infrastructure and data acquisition, even as data privacy and security remain critical issues. He concludes, “The inevitable transition is to be touching the world.”


