(Virtual) Boots on the Ground: The State of LLMs in the Workforce
Tim Sharp from AI consultancy Gen8 tells us what’s really going on with LLMs in the workplace.
With over 300 million weekly users of ChatGPT alone, popular AI tools are increasingly intertwined with our workplaces, teams and sensitive information. Leaders are split: welcoming, banning or simply ignoring the issue. Yet whichever camp they fall into, ‘shadow’ (or ‘unsanctioned’) AI use creates a major new blindspot. Inaction is not an option. So what’s really happening on the ground? Here are some themes from GEN8’s work training over 1,000 professionals in the what, why and how of large language models.
1. The Big Picture: Model Parity Gives Us Choice
Generally speaking, there is now little to separate the major foundation models in terms of day-to-day capabilities. Performance is no longer a moat. GPT, Claude, Gemini et al. are all competent with the sorts of tasks undertaken by those hundreds of millions of knowledge workers, from summarising a report to crunching market research.
The new frontier of ‘reasoning’ is more nuanced. As I write, DeepSeek’s R1 is causing a major splash (notable, even in a churning ocean of hype) for matching OpenAI’s performance with a fraction of the resources. Not only is the model a high performer, it’s also ‘open’ (if not truly open-source), a domain where the likes of Meta had already been levelling the playing field by making billions of dollars in AI investment available for free. Don’t mention the geopolitics.
The implication for everyday users is that, for now, there is no ‘best’ foundation model. ChatGPT has lost its first-mover advantage, at least for everyday tasks. But no, that hasn’t gone to DeepSeek. For many, Anthropic has the best general-purpose model in Claude Sonnet. We often think of Claude as a sort of virtual chief of staff, able to help with everything from pricing strategy to data analysis to entering new markets.
If you’re interested in really zooming in on the differences between the major foundation models, including multimodal capabilities (like interpreting or generating images), Ethan Mollick has a newly updated overview.
Battlestar Distribution
Performance aside, a key variable in the AI wars is that many organisations are yet to provide, or even acknowledge, such tools in the workplace. And so, among the few tech companies able to train high-end models, distribution becomes the battleground. For Microsoft, that’s Office. Google: Search, Gmail, Workspace. Meta: three billion users. You get the idea. With performance increasingly commoditised, the AI we use at work could very well become, over time, the one offered (or bundled) by an existing vendor.
If this all rings a bit of a bell, we’ve been here before. Netscape vs. Internet Explorer, perhaps, or (speaking of Microsoft) more recently, Slack vs. Teams. In 2016, Slack infamously took out a full-page ad in the New York Times (“welcome to the revolution”), only to see Teams crush its first-mover advantage courtesy of Office 365. Today, Slack is a minnow, largely for the fact that Teams is bundled with Office. In the AI race to date, Copilot and Gemini are not well-loved products. But there’s a lot to be said for vertical integration.
Yet despite their distribution challenges, the newer players are remarkably bullish, which brings us to OpenAI. Reputation goes a long way, however, and the ChatGPT creator’s very public brand of chaos leads us to question whether initiatives like the $500bn ‘Stargate’ are more style than substance. Will this prove to be a mirage, or a sign of supreme confidence in making today’s AI look very quaint, very quickly?
Regardless, in the here and now, real-world users are spoilt for choice when it comes to picking an LLM, even if - at work - that might really only net out to two or three of the usual suspects. Not to mention the vastly different calculus for those building Generative AI products or systems. And spare a thought for traditional machine learning. One needn’t take a Ferrari to the shops.
2. Shadow AI: They Work Among Us
Importantly, this idea of longer-term consolidation also masks the very messy muddle that many workplaces currently find themselves in, thanks to all those early adopters.
Many companies remain unprepared for Generative AI, even though it’s most certainly here. In a recent survey of 5,000 knowledge workers, Section (where I teach ‘AI for Marketers’) found that “43% of employees in companies that have explicitly banned AI still use it [anyway]”. At GEN8, we’ve been seeing for well over a year now that at least 50% of client teams are using LLMs in some way, authorised or not, at least once a week. And rising.
Adept users see real gains, like the ASX-listed technology firm saving 300 hours per week in a crack team of 30, or the BCG consultants completing everyday tasks (from customer segmentation to writing a press release) 40% faster, 25% more effectively. On the flipside, these users know what a hallucination is, and how to spot one. Search engines, LLMs are not.
Time saved is a common theme. But what to do with it? Our recommendation is to invest in strategy and people. In the public sector, the Australian government offers a meaningful approach. During a national trial, 40% of Copilot users found more time for higher-value (and decidedly more human) activities like mentoring or strategic planning.
For those sitting this one out: don’t. Waiting for Copilot or Gemini to formally land in your inbox is not a strategy. The chairs are being rearranged; wait too long, and you may find yourself without one. Major disruption looms, whether from a competitor with more time on its hands, or from someone completely rethinking how things are done.
3. Man + Machine: A Growing Skills Divide
Some of these trends seem destined to resolve, or at least collide with, each other. Corporate adoption increases. Shadow AI use decreases. Agents pick up the tedious bits, and so on. Yet lost in all this is how to actually get your business there: identifying the benefits of working with LLMs.
Upskilling plays a crucial role. We find, for example, that confidence in using large language models can double after training, and that companies with even small teams can go on to save hundreds of hours per week. The ones rolling up their sleeves and figuring it out are simply better positioned to capitalise.
For the enterprise, the gains are both bigger… and more existential. Take Moderna, where “a team of a few thousand can perform like a team of 100,000”. Or Klarna, where duelling press appearances tout both a shrinking workforce and enhanced productivity from Generative AI. Our POV? Don’t follow Klarna’s lead. The tools are showing most value as an aid, not a replacement. Generative AI remains an emerging, experimental technology. Reduction-in-force courtesy of LLMs is, in our opinion, a misguided strategy. A very human hallucination.
Looking Ahead: A Foregone Revolution?
No. To really parse the implications of ever more capable foundation models, we also need to take the FOMO out of it, abandoning cliches like ‘AI won’t replace you, a person using it will’. Often (very often) the right answer is not to use AI, or certainly not ‘just because’. Questionable implementations abound. Like the Malaysian radio station deploying an AI DJ. Or the New Zealand supermarket offering up poisonous recipes. And then there’s intellectual property. If OpenAI is the Titanic, The New York Times is a pretty big iceberg.
The future comes with a question mark. Answering it for the ‘age of AI’ requires a clear understanding of what the technology is and - crucially - isn’t. If you’ve done the work, and the answer for your teams is positive: great, but don’t forget to upgrade your governance. Create the conditions for shared innovation, whether that leads to increased productivity, more time for strategy or new ways of serving customers.
If LLMs carry too much risk and aren’t right for your business: also great. Deciding when not to adopt Generative AI is as important as deciding when to lean in. But update your governance all the same, telling your people why AI is - or isn’t - right for you.
Two years into the GPT revolution, it remains a noisy, chaotic environment. Our advice is to pursue extreme clarity, shining a light on the good, the bad and the ugly. With AI increasingly intertwined with our teams, meetings and sensitive information, inaction is not an option. Large language models cast a long shadow. Don’t let them fly under the radar.