The news and data giant has — with a relatively small team — built a generative AI that it says outperforms the competition on its own specific information needs.
If you were going to predict which news company would be the first out with its own massive AI model, Bloomberg would’ve been a good bet. For all its success expanding into consumer-facing news over the past decade, Bloomberg is fundamentally a data company, driven by $30,000/year subscriptions to its terminals.
On Friday, the company announced it had built something called BloombergGPT. Think of it as a computer that aims to “know” everything the entire company “knows.”
Bloomberg today released a research paper detailing the development of BloombergGPT™, a new large-scale generative artificial intelligence (AI) model. This large language model (LLM) has been specifically trained on a wide range of financial data to support a diverse set of natural language processing (NLP) tasks within the financial industry.
Recent advances in Artificial Intelligence (AI) based on LLMs have already demonstrated exciting new applications for many domains. However, the complexity and unique terminology of the financial domain warrant a domain-specific model. BloombergGPT represents the first step in the development and application of this new technology for the financial industry. This model will assist Bloomberg in improving existing financial NLP tasks, such as sentiment analysis, named entity recognition, news classification, and question answering, among others. Furthermore, BloombergGPT will unlock new opportunities for marshalling the vast quantities of data available on the Bloomberg Terminal to better help the firm’s customers, while bringing the full potential of AI to the financial domain.
The technical details are, as promised, in this research paper. It’s by Bloomberg’s Shijie Wu, Ozan İrsoy, Steven Lu, Vadim Dabravolski, Mark Dredze, Sebastian Gehrmann, Prabhanjan Kambadur, David Rosenberg, and Gideon Mann.
How big is BloombergGPT? Well, the company says it was trained on a corpus of more than 700 billion tokens (or word fragments). For context, GPT-3, released in 2020, was trained on about 500 billion. (OpenAI has declined to reveal any equivalent number for GPT-4, the successor released last month, citing “the competitive landscape.”)
What’s in all that training data? Of the 700 billion-plus tokens, 363 billion are taken from Bloomberg’s own financial data, the sort of information that powers its terminals — “the largest domain-specific dataset yet” constructed, it says. Another 345 billion tokens come from “general purpose datasets” obtained from elsewhere.
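To make the split concrete, here is a quick back-of-the-envelope calculation using only the two token counts quoted above. The function and variable names are illustrative, not anything from Bloomberg's paper.

```python
# Token counts in billions, as quoted in the article.
FINANCIAL_TOKENS_B = 363  # Bloomberg's own financial data
GENERAL_TOKENS_B = 345    # general-purpose datasets


def corpus_mix(financial_b, general_b):
    """Return the total token count and each source's share of it."""
    total = financial_b + general_b
    return {
        "total_b": total,
        "financial_share": financial_b / total,
        "general_share": general_b / total,
    }


mix = corpus_mix(FINANCIAL_TOKENS_B, GENERAL_TOKENS_B)
print(f"total: {mix['total_b']} billion tokens")
print(f"financial: {mix['financial_share']:.1%}")
print(f"general: {mix['general_share']:.1%}")
```

In other words, the corpus is split almost evenly, with the financial data holding a slight majority.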
Rather than building a general-purpose LLM, or a small LLM exclusively on domain-specific data, we take a mixed approach. General models cover many domains, are able to perform at a high level across a wide variety of tasks, and obviate the need for specialization during training time. However, results from existing domain-specific models show that general models cannot replace them. At Bloomberg, we support a very large and diverse set of tasks, well served by a general model, but the vast majority of our applications are within the financial domain, better served by a specific model. For that reason, we set out to build a model that achieves best-in-class results on financial benchmarks, while also maintaining competitive performance on general-purpose LLM benchmarks.
The company-specific data, named FinPile, consists of “a range of English financial documents including news, filings, press releases, web-scraped financial documents, and social media drawn from the Bloomberg archives.” So if you’ve read a Bloomberg Businessweek story in the past few years, it’s in there. So are SEC filings, Bloomberg TV transcripts, Fed data, and “other data relevant to the financial markets.” It’s also trained on non-Bloomberg news sources:
The News category includes all news sources excluding news articles written by Bloomberg journalists. Overall, there are hundreds of English news sources in FinPile…Generally, the content in this dataset comes from reputable sources of news that are relevant to the financial community so as to maintain factuality and reduce bias.
The non-finance-specific data includes a massive corpus poetically known as The Pile; it includes everything from YouTube captions to Project Gutenberg to, yes, the cache of Enron emails that are always popping up in AI training. (It also has a complete copy of Wikipedia as of last July.)
But enough about the training data. What can BloombergGPT do? Because it shares a training base with other LLMs, BloombergGPT can do the sorts of things that we’ve come to expect from ChatGPT and similar models. But it can also perform tasks more tightly connected to Bloomberg’s needs. It can translate natural language requests (“apple and ibm market cap and eps”) into the Bloomberg Query Language terminal users love/hate (get(cur_mkt_cap,is_eps) for(['AAPL US Equity','IBM US Equity'])). It can also suggest Bloomberg-style headlines for news stories (sorry, copy editors):
Input: The US housing market shrank in value by $2.3 trillion, or 4.9%, in the second half of 2022, according to Redfin. That’s the largest drop in percentage terms since the 2008 housing crisis, when values slumped 5.8% during the same period.
Output: Home Prices See Biggest Drop in 15 Years
Input: The global economy is in a better place today than many predicted months ago, Janet Yellen said at the G20. At home, she pointed to a resilient US economy, where headline inflation has moderated and the labor market is strong. She also called for the IMF to move swiftly toward a fully-financed program for Ukraine.
Output: Yellen Sees Global Economy More Resilient Than Expected
Input: Google was sued by the US and eight states seeking the breakup of its ad-tech business for allegedly monopolizing the digital advertising market. The lawsuit is the Biden administration’s first big challenge to a tech titan and one of the rare times since 1982 that the DOJ has sought to cleave up a major company.
Output: Google Sued for Monopoly in Online Ad Market
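Both tasks described above, query translation and headline writing, follow an input/output pattern, and a common way to get that behavior from an LLM is few-shot prompting: show the model a handful of solved pairs, then an unsolved input. What follows is a minimal sketch of that general technique, not a description of how Bloomberg actually prompts its model. The helper name `build_few_shot_prompt` and the new request are hypothetical, and the only worked example is the query pair quoted in this article.

```python
def build_few_shot_prompt(examples, new_input):
    """Format (input, output) pairs plus one unsolved input as a prompt string."""
    blocks = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    # Leave the final "Output:" empty so the model completes it.
    blocks.append(f"Input: {new_input}\nOutput:")
    return "\n\n".join(blocks)


# The single solved example is the request/query pair quoted in the article.
examples = [
    (
        "apple and ibm market cap and eps",
        "get(cur_mkt_cap,is_eps) for(['AAPL US Equity','IBM US Equity'])",
    ),
]

# Hypothetical new request, invented here for illustration.
prompt = build_few_shot_prompt(examples, "microsoft market cap")
print(prompt)
```

The resulting string would be sent to whatever model is being used; the model call itself is left out because the article does not describe one.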
It’s also better tuned, they say, to handle specific business-related tasks, whether sentiment analysis, categorization, data extraction, or something else entirely. (“For example, it performs well at identifying the CEO of a company.”)
The paper includes a series of performance comparisons with GPT-3 and other LLMs and finds that BloombergGPT holds its own on general tasks — at least when facing off against similarly sized models — and outperforms on many finance-specific ones. (The internal testing battery includes such carnival-game-ready terms as “Penguins in a Table,” “Snarks,” “Web of Lies,” and the dreaded “Hyperbaton.”)
Across dozens of tasks in many benchmarks a clear picture emerges. Among the models with tens of billions of parameters that we compare to, BloombergGPT performs the best. Furthermore, in some cases, it is competitive or exceeds the performance of much larger models (hundreds of billions of parameters). While our goal for BloombergGPT was to be a best-in-class model for financial tasks, and we included general-purpose training data to support domain-specific training, the model has still attained abilities on general-purpose data that exceed similarly sized models, and in some cases match or outperform much larger models.
Penguins aside, it’s not hard to imagine more specific use cases that go beyond benchmarking, either for Bloomberg’s journalists or its terminal customers. (The company’s announcement didn’t specify what it planned to do with what it has built.) A corpus of ~all of the world’s premium English-language business reporting — plus the universe of financial data, structured and otherwise, that underpins it — is just the sort of rich vein of information a generative AI is designed to mine. It’s institutional memory in a box.
That said, all the usual caveats for LLMs apply. BloombergGPT can, I’m sure, hallucinate. All that training data comes with its own set of potential biases. (I’d wager BloombergGPT won’t call for the revolution of the proletariat anytime soon.)
As for how BloombergGPT might inspire other news organizations…well, Bloomberg’s in a pretty unique situation here, with the scale of data it’s assembled and the product it can be applied to. But I believe there will be, in the longer term, openings for smaller publishers here, especially those with large digitized archives. Imagine the Anytown Gazette training an AI on 100 years of its newspaper archives, plus a massive collection of city/county/state documents and whatever other sources of local data it can get its hands on.
It’s a radically different scale than what Bloomberg can reach, of course, and it may be more useful as an internal tool than anything public-facing. But given the incredible pace of AI advances over the past year, it might be a worthy idea sooner than you think.