
WORLD MODELS: WHAT THEY ARE AND WHERE THEY MIGHT MATTER

  • Oct 7, 2025
  • 11 min read

Updated: Apr 8

[Image: model of Earth with pixelated topography]

This short essay posits a crucial distinction between the current dominance of Large Language Models and an emergent, perhaps more profound, approach to AI: world models.


While LLMs are masters of pattern recognition, trained on vast textual datasets to reproduce human-like language, they lack a genuine understanding of the world they describe.


World models, in contrast, endeavour to build internal representations of reality, comprehending the physics, causality and temporal sequences that govern our world.


This is not merely a different technique; it is a quest for a different kind of intelligence, one that moves beyond mimicry to a genuine grasp of the mechanisms of reality. We have identified six principal methodologies driving this research. From the video generation of OpenAI’s Sora, which must implicitly learn the laws of physics to create realistic scenes, to physics-informed models that explicitly embed these laws into their architecture, the approaches are diverse.


Neurosymbolic systems that merge neural pattern recognition with logical reasoning, and embodied AI that learns through direct interaction with its environment, such as the DreamerV3 agent that mastered Minecraft, are also key avenues of exploration. Causal models that untangle cause and effect and structured models that deconstruct scenes into objects and their relationships further illustrate the multifaceted nature of this field. The eventual applications could be transformative. In BioPharma, for example, world models could move beyond statistical correlation to a mechanistic understanding of drug interactions at a molecular level, simulating biological processes over time and mapping causal pathways of disease.


For MedTech, the prospect of digital patient simulation, where devices are tested against a virtual representation of human physiology, is a tantalising one. Surgical robots with an innate understanding of anatomy and tissue properties are another potential outcome. And in sectors as diverse as insurance and complex manufacturing, the move from statistical risk analysis to causal understanding and from data-driven process optimisation to a true comprehension of the underlying physics of production could unlock new levels of efficiency and insight.


Yet, have no doubt: most practical uses of world models are still at least two years away. In the interim, the value to be extracted from current LLM technology is immense and should not be overlooked. The current significance of world models, therefore, lies not in their immediate business applications but in the intellectual preparation they demand from us. To understand this nascent field is to be better equipped to ask the right questions, to recognise early-stage applications as they emerge and to engage in informed conversations about the future trajectory of AI.


The ultimate prize is an AI that does not just process information about the world but one that truly understands how it works. In regulated industries, where the 'why' matters as much as the 'what', that could prove to be a very powerful development indeed.


The Core Concepts


Beyond Pattern Matching to Reality Understanding


The pursuit of world models represents a fundamental pivot in artificial intelligence, a move beyond statistical pattern matching to a genuine understanding of reality. This is not a monolithic endeavour; rather, it is a field rich with diverse and sometimes overlapping methodologies.


One of the most intuitive of these is Video Generation as World Understanding. The logic is compelling: for an AI like OpenAI's Sora, Google's Veo or DeepMind's Genie to generate a realistic video sequence, it must build an internal, predictive model of the physical world.


These systems, often built on diffusion transformer architectures, learn from millions of videos to anticipate the next frame. To do this convincingly, they must implicitly learn the rules of 'physics in the pixels': that objects have permanence, gravity pulls things down and liquids splash according to fluid dynamics.


The AI does not 'know' physics in the human sense, but it constructs a representation in its latent space that corresponds to these rules. The current limitations, however, betray the nascent stage of this approach. Systems often struggle with the fine-grained consequences of complex object interactions, maintaining long-term consistency or accurately rendering intricate phenomena like reflections and shadows.


A more direct strategy is found in Physics-Informed Models, particularly Physics-Informed Neural Networks (PINNs). This more mature field eschews implicit learning in favour of explicitly embedding known physical laws directly into the model's architecture.


A neural network is trained on data, but its loss function is augmented to penalise it for violating established principles of thermodynamics or fluid dynamics, typically expressed as partial differential equations. Pioneered by researchers like George Karniadakis at Brown University, this method acts as a powerful regulariser, forcing the AI's predictions to remain physically plausible even where observational data is sparse. The approach has proven successful in complex but well-characterised domains like weather forecasting, subsurface geological modelling and simulating turbulent fluid flow for engineering applications. Its weakness, of course, is that such models are only as good as our existing knowledge of physics; they cannot function where the underlying laws are unknown or too complex to articulate.
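To make the idea concrete, here is a minimal Python sketch of an augmented loss in the PINN spirit. It is illustrative only: the 'physical law' is a simple decay equation, du/dt = -k·u, the derivative is approximated with finite differences rather than the automatic differentiation a real PINN would use, and the penalty weight is arbitrary.

```python
import numpy as np

def physics_informed_loss(t, u_pred, u_obs, k=1.0, weight=1.0):
    """Data misfit plus a penalty for violating the ODE du/dt = -k*u."""
    data_loss = np.mean((u_pred - u_obs) ** 2)
    du_dt = np.gradient(u_pred, t)        # finite-difference derivative
    residual = du_dt + k * u_pred         # zero wherever the ODE holds
    physics_loss = np.mean(residual ** 2)
    return data_loss + weight * physics_loss

t = np.linspace(0.0, 2.0, 200)
u_true = np.exp(-t)        # exact solution of du/dt = -u
u_bad = 1.0 - t / 2.0      # a straight-line fit that ignores the physics

loss_true = physics_informed_loss(t, u_true, u_true)
loss_bad = physics_informed_loss(t, u_bad, u_true)
```

The physics term penalises the straight-line candidate heavily even where it passes close to the data, which is exactly the regularising pressure described above.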


Other researchers are exploring hybrid Neurosymbolic Approaches, attempting to fuse the powerful pattern-recognition capabilities of neural networks with the rigorous, logical reasoning of symbolic AI. Neural networks excel at processing noisy, high-dimensional data from the real world, while symbolic systems provide the capacity for abstract reasoning, planning and transparency. A key challenge is the 'symbol grounding problem'—connecting abstract symbols like 'cup' to the pixel data a neural network sees.


Work from institutions like the MIT-IBM Watson AI Lab is yielding systems where a neural network identifies objects in a robot’s field of vision, and a symbolic reasoner then uses a knowledge base of rules to infer relationships and plan actions, such as knowing a cup can be picked up if it is on a table.
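The hand-off just described can be sketched in a few lines of Python. Everything here is an invented stand-in: in a real system the facts would come from a vision model and the rule base would be far richer, but the division of labour, neural perception emitting grounded facts and a symbolic layer reasoning over them, is the same.

```python
# Stand-in for a vision model's output: grounded (subject, predicate,
# object) facts about the scene. Hard-coded purely for illustration.
detections = [("cup", "on", "table"), ("book", "on", "shelf")]

def can_pick_up(obj, facts, reachable_surfaces=frozenset({"table"})):
    """Symbolic rule: an object is graspable if it rests on a reachable
    surface. The rule reasons over facts, not pixels."""
    return any(
        s == obj and p == "on" and o in reachable_surfaces
        for s, p, o in facts
    )

print(can_pick_up("cup", detections))   # True
print(can_pick_up("book", detections))  # False
```

The symbolic half is transparent by construction: the system can report exactly which rule and which facts licensed the action.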


A fourth path, Embodied and Active Inference, seeks to ground understanding in action. It proposes that intelligence is not passively acquired but is built through direct interaction with an environment. This approach is central to modern robotics and is powerfully illustrated by agents like DeepMind's DreamerV3.


This agent learned to master Minecraft not by being fed data, but by building its own predictive world model through trial and error, then using that internal model to 'dream' about the consequences of future actions. This aligns with Karl Friston's free energy principle, where intelligent agents constantly act to minimise the 'surprise' or error between their world model's predictions and what their senses actually perceive. Research from labs like Berkeley's AI Research Lab (BAIR) shows robots learning the 'common sense' physics of objects—their weight, friction and fragility—not from a textbook but by physically manipulating them.
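The 'dreaming' idea can be illustrated with a toy planner, a sketch rather than anything resembling DreamerV3 itself: the one-dimensional dynamics, the goal and the scoring function are all invented for illustration. The point is structural, namely that the agent evaluates actions inside its own model rather than in the real environment.

```python
GOAL = 10.0  # invented target state

def model_step(state, action):
    """The agent's learned (here: hand-written) transition model."""
    return state + action

def imagined_return(state, action, horizon=5):
    """Roll one candidate action forward in imagination and score the
    end state; closer to the goal is better."""
    for _ in range(horizon):
        state = model_step(state, action)
    return -abs(GOAL - state)

def plan(state, actions=(-1.0, 0.0, 1.0)):
    """Pick the action whose imagined rollout scores highest."""
    return max(actions, key=lambda a: imagined_return(state, a))

# From state 0 the planner 'dreams' that +1 moves it towards the goal,
# without ever acting in the real world.
best_action = plan(0.0)
```

A real agent would also learn the transition model from experience and replan at every step; this sketch fixes both to keep the imagination loop visible.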


This contrasts with the more abstract aim of Causal Models, a field heavily influenced by the work of Judea Pearl. These systems attempt to move beyond the statistical correlations that underpin so much of current machine learning—the observation that 'A and B happen together'—to a more profound grasp of 'A causes B'.


Using formalisms like Directed Acyclic Graphs, these models can make predictions about interventions, asking 'what would happen if we changed something?'. This enables counterfactual reasoning, which is critical in fields like epidemiology for isolating the true effect of a drug from confounding factors, or in business for determining the actual impact of a marketing campaign. The central difficulty remains causal discovery: inferring the correct causal structure from observational data alone is a formidable challenge.
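A small simulation makes the distinction tangible. In this invented structural model, a confounder Z drives both X and Y, so the observational slope of Y on X overstates X's true causal effect of 0.5; intervening on X directly, in the spirit of Pearl's do-operator, severs the Z → X arrow and recovers it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def simulate(do_x=None):
    """Toy SCM: Z ~ N(0,1); X = 2Z + noise; Y = 0.5X + 2Z + noise.
    Passing do_x fixes X by intervention instead of letting Z drive it."""
    z = rng.normal(size=n)
    x = 2 * z + rng.normal(size=n) if do_x is None else np.full(n, do_x)
    y = 0.5 * x + 2 * z + rng.normal(size=n)
    return x, y

# Observation: the regression slope of Y on X mixes causation with
# confounding (analytically 1.3 here, not 0.5).
x, y = simulate()
obs_slope = np.cov(x, y)[0, 1] / np.var(x)

# Intervention: compare imagined worlds where X is set to 0 and to 1.
_, y0 = simulate(do_x=0.0)
_, y1 = simulate(do_x=1.0)
causal_effect = y1.mean() - y0.mean()   # close to the true 0.5
```

The gap between `obs_slope` and `causal_effect` is exactly the gap between 'A and B happen together' and 'A causes B'.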


Finally, Structured and Compositional Models are built on the premise that the world is composed of distinct things and their interactions. Instead of learning a single, monolithic representation of a scene, these models learn to deconstruct it into its constituent parts, often represented in a 'scene graph'. This graph contains nodes for objects, like 'dog' or 'ball', and edges defining their relationships, such as 'is chasing' or 'is on'.


Research from hubs like the Stanford Vision Lab shows the power of this approach for generalisation. A model that understands 'dog' and 'car' as separate, compositional elements can immediately comprehend a novel scene of a dog inside a car. While this has revolutionised computer vision, the challenge lies in extending this compositional understanding beyond simple visual scenes to more abstract and complex domains.
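A scene graph of this kind is, at heart, a simple data structure. The sketch below is a minimal illustration with invented objects and predicates, not a real vision pipeline; its purpose is to show why a novel combination of known parts, such as a dog inside a car, is just a new edge between existing node types.

```python
class SceneGraph:
    """Minimal scene graph: objects are nodes, relationships are
    labelled edges stored as (subject, predicate, object) triples."""

    def __init__(self):
        self.objects = set()
        self.relations = []

    def add(self, subject, predicate, obj):
        self.objects.update([subject, obj])
        self.relations.append((subject, predicate, obj))

    def query(self, predicate):
        return [(s, o) for s, p, o in self.relations if p == predicate]

scene = SceneGraph()
scene.add("dog", "is_chasing", "ball")
# A never-before-seen scene composes existing concepts without retraining:
scene.add("dog", "is_inside", "car")

print(scene.query("is_inside"))   # [('dog', 'car')]
```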


What World Models Could Unlock


The combination of these distinct modelling approaches, integrated with the linguistic prowess of current Large Language Models, points towards a future of AI systems that are not just more powerful, but qualitatively different. This is the path from systems that generate plausible text to systems that possess a grounded understanding of the world and can act within it.


In robotics, for instance, an LLM component would allow humanoid robots to understand a complex, high-level command like, 'safely dismantle the experimental apparatus on that workbench'. Its internal video-based world model would allow it to 'imagine' the consequences of its actions before it takes them; it would predict that letting go of a glass beaker mid-air would cause it to smash, or that a particular chemical requires careful, slow pouring. Finally, its embodied learning, acquired through physical trial and error, would give it the fine motor control to handle a delicate instrument differently from a heavy clamp. This synergy creates a system that moves towards genuine autonomy - able to plan, predict and execute complex physical tasks in unstructured environments.


In the realm of scientific discovery, a 'digital scientist' combining a Physics-Informed Model, a Causal Model and an LLM could read and synthesise the entire corpus of published research on a new class of solar cell materials. It could then formulate a novel hypothesis. This hypothesis would be tested not in a physical lab, but within a physics-informed simulation that rigorously adheres to the known laws of quantum mechanics and materials science. Crucially, when the simulation yields a result, the causal model would analyse it to determine not just what happened, but why it happened, distinguishing genuine cause from mere correlation.


This creates an automated discovery loop: the LLM proposes ideas from literature, the physics model tests them in a virtual lab, the causal model explains the results and this new, verified knowledge informs the LLM's next round of inquiry. Such a loop could radically accelerate the pace of innovation.


Finally, in business or government, a strategic analysis system could integrate an LLM with Structured and Compositional Models and a Neurosymbolic reasoning engine. The LLM would ingest vast quantities of unstructured data - news feeds, financial reports and geopolitical analyses. As it does, the structured model would not just process the text but would actively build a dynamic knowledge graph, identifying key entities like companies, supply chains and political actors, and mapping their complex relationships. The neurosymbolic component would then act as a reasoning engine, applying logical rules to this real-time graph. A leader could ask, 'what is the most significant downstream risk to our European operations from the new South American trade policy?'.


The system would identify the specific suppliers, shipping routes and financial instruments affected, and reason about the second and third-order consequences, revealing threats that are simply too complex and fast-moving for human teams to track.


The ultimate weakness of today's LLMs is that their understanding is not grounded in reality; they are masters of syntax but have no grasp of semantics. Integrating them with models that understand physics, causality and embodiment connects their linguistic intelligence to the real world. This is the critical step needed to move from systems that can talk about the world to systems that can understand it, act within it and explain their reasoning.


How World Models Might Impact Industries


For Ireland's core industries, the eventual applications of world models could be transformative, shifting the paradigm from statistical analysis to a mechanistic understanding of reality.


In BioPharma, the focus may move from processing clinical trial documents to grasping molecular reality. Instead of merely correlating a compound with positive outcomes, a future physics-informed AI could explain precisely how that compound blocks a specific protein. It could model temporal biological processes, showing how a drug affects a cell minute by minute, and map the exact causal pathways through which a gene mutation leads to disease, a significant leap from the current associative analysis.


This concept of simulation-first research also extends into the broader Life Sciences. Imagine an AI that can run a thousand virtual biological experiments before a single physical one is constructed, testing hypotheses in silico to refine them. This would be complemented by a new form of embodied laboratory intelligence, where a robot does not just follow a script but understands that heating a particular solution too quickly will cause it to precipitate, adapting its actions to the physical properties of the materials it handles.


In MedTech, the promise lies in a deeper understanding of the interplay between devices and the human body. Digital patient simulation could allow us to see exactly how a specific pacemaker will behave in a particular patient's heart, moving beyond generic performance data. This leads to the possibility of embodied medical robotics, where a surgical robot understands that one piece of tissue is more fragile than another and adjusts its pressure accordingly, a far cry from today’s pre-programmed movements.


For Complex Manufacturing, the transition is towards physics-powered operations. A world model would not just know that quality decreases when the temperature rises; it would understand precisely how that temperature affects the molecular structure of the material being produced. This allows for true embodied factory intelligence, with robots that can adjust their processes in real-time because they comprehend the different physical properties of a new batch of material, enabling a far more profound and causal approach to process optimisation.


The impact on Insurance is a potential move from statistics to causality. Rather than simply noting that young male drivers have more accidents, a causal AI could identify the specific factors that lead to those accidents for a given profile. This would enable sophisticated scenario simulation, modelling how a specific weather event might affect claims in a geographic area, and providing a temporal understanding of how an initial incident is likely to evolve into a larger claim over time.


Ongoing Research


What, then, should this emerging field mean for our immediate awareness? The first point to grasp is one of time horizons. The value proposition of current Large Language Models is immediate and immense; there are years of untapped potential and value creation ahead in simply applying the technology that exists today. World model applications, by contrast, exist on a much longer research horizon, with most practical uses still two to seven years away. The goal is therefore not immediate implementation but strategic observation.


So, what to watch for? In BioPharma, the signals of progress will be research that moves beyond statistical correlation to a more mechanistic, causal understanding of clinical trial results and molecular simulations. We should watch for laboratory automation that becomes more adaptive and intelligent. In Manufacturing, the key development will be the evolution of digital twins to incorporate real physics, not just data correlations, and the emergence of quality control systems that genuinely understand material properties. In Insurance, the shift will be towards risk models that can explain the causality behind events, not just predict their likelihood, and fraud detection that understands behavioural patterns rather than just flagging statistical anomalies.


Observing these trends is not a passive, academic exercise. It is a form of intellectual preparation required to navigate the next decade of AI development. Understanding the trajectory of world models enables us to ask better, more incisive questions about what AI might eventually achieve. It equips us to recognise the earliest, most potent applications as they emerge within our specific industries and to engage in more informed, forward-looking conversations. This positions us as thoughtful navigators of AI's evolution, concerned not just with current capabilities but with the fundamental direction of the technology itself.


The Bottom Line


World models represent a fascinating direction in AI research that could eventually enable much more sophisticated applications in our core industries. They're still largely in research phases, with years before broad practical application.


For now, there's enormous value to capture with current LLM technology. But understanding world models helps us stay intellectually curious and well-positioned for whatever emerges next. It's less about immediate business implications and more about understanding the shape of what might come next in AI.


The most interesting aspect is how world models might eventually enable AI that doesn't just process information about the world, but actually understands how the world works. In regulated industries where understanding mechanisms matters more than just predicting outcomes, that could eventually be quite powerful.


But "eventually" is the key word - this is about staying aware and intellectually prepared, not about changing course from the significant LLM opportunities that exist today.

 
 