The Moral Case for Building AGI with Morality
Why we should intentionally aim to achieve AGI through a better understanding of morality, the native cultural tech central to human intelligence and our sense of boundless identity.
We live in an age of artifice. Everyone has access to their own ghostwriter. We've been beaten at chess and Go, rely on chatbots to write emails and respond to customer inquiries, and increasingly enjoy AI slop over human-generated artistic content. It's as if we have not even waited for artificially intelligent systems to become generally intelligent (AGI), meaning able not just to follow explicit goals and tasks but to form their own, before deploying our in-silico substitutes and marginalizing our place in the intelligent universe.
Understanding what makes us unique as humans is important not only for its own sake; it also reveals something often overlooked about our intelligence. The biases we exhibit, in other words, reveal features of cognition that a synthetic intelligent system may find at least inspiring, if not inescapable.
Central questions are:
What does AI need to become generally intelligent?
How do we make AI both competent and ethical?
How do we achieve the infinite utilitarian potential of AI/AGI without:
a) falling into a sci-fi dystopia, or
b) ceding human agency?
The answer lies in understanding what makes human intelligence uniquely consequential and truly general: namely, our capacity for moral reasoning across scales.
I published a theory called the myth of objectivity hypothesis and the origins of symbols, where I claim a fundamental feature of our symbolic cognition is the shared belief in a common perspective or frame of reference, and that this likely originated with morality. Moral rules, in this view, serve not only to stabilize group behavior but to signal and construct shared and self identity. I have formalized this using a framework called transcendental model selection (detailed below), wherein morality is cast as a precision parameter that integrates various predictive hierarchies across our social landscapes.
Yes, humans, with their impulsive morality and capacity for transcendence, have produced horrors: genocide, slavery, conquest, and rape among them. Yet it will be humans who are, at least initially, in charge of AGI. Note too that it is humans who have judged these actions as morally abhorrent. We have recognized our limited perceptual moral scale, categorizing people as subhuman and using our reason to privilege our selected group or ego at the expense of others. Indeed, this is a central challenge of daily life.
Yet this underscores not the need to avoid a human-like moral paradigm but the need for systems that can overcome humanity's worst impulses while accessing our instinctual ability to adopt a perspective beyond ourselves and our intimates. Transcendental model selection offers a template to curb the egotistical and scheming designs of many of the ambitious among us while preserving the capacity for transcendence itself. At the very least, we need proper frameworks for understanding the moral implications in a more rigorous way. For better or worse, humanity possesses the only viable moral paradigm that can theoretically and practically be implemented.
Another valid concern is the slippery slope of heading down the AGI path at all.
Do we have to deal with AGI?
What about the risks in general?
The truth is that AGI development cannot be stopped. The question is not whether artificial minds will emerge, but whether they will understand morality as we do: imperfectly, but with the capacity for growth, context, and genuinely felt alignment. Deep learning's black-box nature makes it capable of adopting any underlying architecture, so models may indeed stumble their way into a form of general intelligence, even consciousness. It is thus essential that we seek to apply at least the core features of our moral, inborn, and self-governing tech to the robots with some expediency.
Morality is a touchy subject.
We can hardly hear the word without feeling a visceral sting. It's a feeling we can't help but talk about.
How many academic subjects have the ability to jump inside our gut, rotate through our arteries, and land squarely in our bones? We need no pause for reflection when we gossip about a politician's insider trading scheme or well up with indignation at a CEO's decision to pollute the local water supply. We are generally incapable of adopting a dispassionate, distanced position when someone comes over to discuss our neighbor's affair.
Are we being accused of violating some rule? Even some relatively minor cultural propriety? Have we observed someone else violating a rule? Mere suggestions put our bodies in a fixed state that may remain for hours or the remainder of the day.
The visceral response moral topics arouse reveals something profound about morality's nature. Unlike our views on psychology, supply and demand curves, or fixing motorcycles, moral judgments don't emerge from cool rational analysis. They arise from our deepest sense of connection to something beyond ourselves. This connection is key to understanding the fundamental nature of humanity, one that involves our sense of purpose but equally our intelligence and our ability to navigate a variety of social contexts.
In light of advancing artificial intelligence, it's time to better understand our own organic minds, along with what may allow synthetic cognition to be like our own in certain ways. Core to our intelligence and human experience is the impulse for morality: the demand for "fairness" from a toddler, the incessant drive to gossip that overwhelms the content of our conversations, and of course the moral themes that course through our stories.
Morality is our ability to experience being a part of something greater than ourselves. It is the foundation of our drive to form, and ability to inhabit, symbolic groups, from clans to countries, companies, and civilizations. It is the cornerstone of our ability to operate in a cultural environment or niche. Considering that any generally intelligent system needs to apply a self-motivated cognitive capacity across several contexts, morality is central to our unique cognitive capacity given its visceral ability to scale.
More prosaically, a fundamental feature of moral behavior is identity signaling. Though this may seem like a reductive, shallow, and perhaps cynical view of morality, it signals something that at its core is inborn and even profound: the drive for group belonging. We often chalk this impulse up to one of many cognitive biases (e.g. groupthink) that violate our expectation of rigidly defined selfishness. However, it is certainly neither inconsequential, irrational, nor necessarily unethical to operate aligned with others. Indeed, it is not only an unavoidable human activity; I've argued it was a feature critical for the trajectory of our distinction, leading to cultural evolution, tool making, language, and symbolic thought generally.
Our distinction was made possible largely through moral agency. The capacity to discern that behavior suggests someone's beliefs are confined to their own perspective and omit larger social or cultural concerns (i.e. egotistical) means that we have adopted some form of empathic modeling, often instinctually modeling the core features and parameters of the model that defines them. This capacity alone makes it possible to create the flexible and symbolic forms of organization that are a hallmark of our intelligence and, by extrapolation, of generally intelligent systems.
The Flatland Cave Prisoners of Current AI Systems
To understand how to build AI that escapes this cave, we must first understand the technology that freed us: morality as a scaling mechanism. In the Republic, Plato famously proposed the analogy of a cave to demonstrate the challenge facing an enlightened philosopher, like himself or his teacher Socrates, who has come to understand deep truths about the nature of things like morality and existence, in explaining that wisdom to those unfamiliar with it. He likened it to prisoners trapped in a cave all their lives, used only to seeing two-dimensional shadows cast on the wall. If someone did escape their cavernous shackles, how would their fellow prisoners treat news of the outside world on the freed prisoner's return?
This paradigm, in my view, represents the chasm between our organic yet uniquely profound intelligence and that of existing AI agents and systems. We decipher language not simply as a tokenized pattern but interpret its symbolic meaning according to a particular context. We scale symbols across various levels of meaning that are ultimately founded on the drive to better understand ourselves and the world we are in. Organic intelligence is motivated to process information according to its contribution to a better understanding of the world and/or how to maintain itself, i.e. the integrity of its own body or of whatever can be represented as an agent-model (see Laukkonen et al. 2025).
The power of modern LLMs lies in the computation it takes to draw pictures on the only wall they can see. In a seemingly cosmic drama that begins with the Platonic origins of Western civilization, they are, as Emily Bender argued, stochastic parrots placed in front of flat screens upon which we project lines of text, code, or images and captions so they can compute linear relationships (Bender et al. 2021). They can synthesize, predict, and even modify patterns in data, i.e. auto-correct reams of text or minimize a loss function that identifies missing snippets of structured code. The data-generating process, however, is always exogenous to the system.
Contrast this with the human mind. An infant and young child see a sparse amount of data: rays of light, the feel of a caregiver, the sound of a sibling. The organic mind can model the data-generating process itself; indeed it is motivated to do so. It is inherently curious about the world it is in. It's not enough, of course, simply to say that we are "motivated" to learn. It's more accurate to say that without the processing or interpretation of our environmental signals we would have no understanding of our own self-boundaries, and therefore no understanding of how we ought to be motivated, or driven. In other words, the optimization is as much about, and more likely centered on, building models rather than solving within particular ones.
Hence our bodies process temperature, metabolic, predatory, and a variety of other "signal types" within particular contexts. These are data we peg within a particular model. This is, of course, a general feature of all natural intelligence. Humanity is distinct in its ability to attend to this scaling operation across various modeling levels with explicit cognition, i.e. our language and symbolic cognition generally. We can fathom whether the word "STOP" is placed on a sign for car traffic or whether it is shouted by a police officer toward a menacing criminal escaping through a crowded street.
One way to see the false promise of LLMs as deep learning's route to AGI is parsimony. Why does it take the redirection of pretty much all of society's energetic and economic resources to train the next OpenAI model, yet a baby can learn to understand images and language from a relatively paltry amount of input data?
David Krakauer describes natural intelligence as "less is more": truly emergent capacities, which require the scaling of representations across various levels, are a core indicator of intelligence (Krakauer et al. 2025).
Modern LLMs can see the elaborate structure on the wall but are not able to walk outside the cave to see that the images are generated by a fire illuminating a walkway, where those who happen to pass by become the source of the two-dimensional images that form the prisoners' reality; i.e., the meaning of those representations corresponds to this data-generating context or model.
We should note what may be possible for systems trained on deep learning architectures: Anthropic's research finds that, in delivering features such as translation across various languages, its LLMs have created a layer of representation abstracted away from the level of language or tokens. These layers seem to fall short of both consciousness and intelligence, but they suggest a preternatural Flatlander cave prisoner stumbling out of its cavern without humanity having the ability to influence its constraints, priorities, or values.
The Native Tech of Cultural Commons
Morality is an integration tool that allows an individual to scale their identity across a lifespan and over a cultural niche. We eat vegetables to optimize our individual lives beyond our moment-to-moment impulses. We share our toys to bring harmony to classrooms, and we avoid murder and theft to bring harmony to society.
It allows us to serve not only the utility of future persons we have not met or may never meet; it also influences how people come into being. It creates a world in which others exist who otherwise would not, including a future self who might not have survived or would have lived more narrowly.
Unlike other species, we can simultaneously inhabit numerous social groups with varying aims. Consider how scaling operates across different organizational levels:
Individual Scale: Morality integrates our present self with our future self, allowing us to resist immediate impulses for long-term flourishing.
Intimates Scale: Even among our closest friends there are things we would often choose not to say (despite our immediate impulses), not to mention with our romantic partners and family members, and certainly not with our in-laws.
Anonymous Scale: It coordinates behavior among strangers, creating trust and cooperation in groups far larger than our immediate tribe.
Cultural Scale: Moral codes persist across generations, creating continuity and shared meaning that transcends individual lifespans.
Species Scale: Our moral intuitions about harm, fairness, and care may reflect universal human needs that enable our species' survival and flourishing.
Consider a doctor during a pandemic: she balances her own needs (individual scale) and her family's (intimates scale) as she spends time caring for the sick and risking infection. She manages hospital resource allocation and triage decisions (anonymous, organizational scale) and public health imperatives (cultural scale). She may even deal with threats to humanity writ large (species scale).
Does she do this all explicitly? Does she write out equations that capture the various trade-offs between her time and her patients'? Does she enter data into a spreadsheet that weighs how her triage decisions impact culture and how her organizational policies impact all of humanity? Probably not. Instead she relies on her in-built, human technology.
This intrinsic desire to construct and inhabit shared models corresponds to the core theme of active inference, as I detail below. It aligns with the Daoist notion of Wu-Wei (Slingerland 2014), wherein the focus is not on the specific rules one follows but on the psychological harmony of inhabiting a model beyond the individual, egotistical one. However, the closest framework from moral philosophy that aligns with this computational perspective comes, I believe, from Aristotle. He saw virtue as the harmony between two extreme poles, an equilibrium that brought stability to the polity.
We can operationalize this view by a) combining it with active inference, an interdisciplinary framework that generalizes the view of cognition as predictive modeling, b) substituting Aristotle's specified "polity" with any symbolic or intuitively demarcated group, and finally, and most importantly, c) recognizing that morality itself is instrumental in forming the very groups it subsequently stabilizes.
This effectively means that morality is a special type of signaling that is constructive in nature: it gives a feel for the boundaries of a group, one we belong to, while helping those of us who feel moral impulses sense that we have a role in governing and ensuring the group's persistence.
We understand who we are based not on our own drives, impulses, and personal preferences in a strictly isolated fashion, but based on an ongoing exchange with others. We are part of clubs, communities, companies, countries, and cabals even.
The central question we often forget to ask is how these groups form. Is there a consistent and foundational mechanism for forming these groups? Beyond the very few groups that offer explicit and even written codification that is sometimes (perhaps seldom) enforced, how are they generally governed? How are expectations set?
Combining active inference with virtue as harmony, we can frame the codified rules, principles, and values that constitute morality's contents as statistical parameters capable of constructing the symbolic and dynamic groups essential for human life.
We can effectively turn the Aristotelian view on its head: whereas he started with the aggregate level, i.e. the polity or the individual itself, and arrived at the virtues that optimized its interest, we start with virtues or moral behavior itself and, from observed behavior or an understanding of the moral codes, work back to the type and level of social aggregation. How does this happen, i.e.:
How does moral belief itself form groups?
According to active inference's emphasis on prediction in cognition, moral agency can be interpreted as the drive to know other agents' beliefs. Each agent has its own predictive hierarchy; in general biology this will typically depend upon its evolution and immediate environment, including predators, peers, etc. For an agent we interpret as possessing moral agency, however, we critically infer its allegiance: whether it is egotistical and narrow, privileges its own intimates or a shibboleth/stereotypical slice of culture, or is broadly cultural/species-wide.
Communities, companies, nations, lives, and even our earliest familial bond, the mother-infant relation, are all higher-order aggregations and ultimately symbolic abstractions away from individual behavior itself. Observing how behavior is principally constrained enables the inference of the level or scale of identity.
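As a toy illustration of this inference (the candidate models, actions, and probabilities below are invented for the sketch, not estimated from any data), inferring the scale of an agent's identity from observed behavior can be framed as Bayesian model comparison:

```python
# Toy sketch: infer which scale of identity best explains observed behavior.
# The candidate "allegiance" models and their action likelihoods are invented
# purely for illustration; behavior that is costly to the individual but useful
# to the group shifts posterior mass toward the broader models.

models = {
    "egocentric": {"keep": 0.80, "share": 0.15, "sacrifice": 0.05},
    "intimate":   {"keep": 0.40, "share": 0.50, "sacrifice": 0.10},
    "cultural":   {"keep": 0.20, "share": 0.50, "sacrifice": 0.30},
}

def scale_posterior(observed_actions, prior=None):
    """Posterior over identity scales given a sequence of observed actions."""
    prior = prior or {m: 1.0 / len(models) for m in models}
    unnormalized = {}
    for m, likelihoods in models.items():
        p = prior[m]
        for action in observed_actions:
            p *= likelihoods[action]
        unnormalized[m] = p
    evidence = sum(unnormalized.values())
    return {m: p / evidence for m, p in unnormalized.items()}

print(scale_posterior(["share", "sacrifice", "share"]))
# posterior mass moves toward the intimate and cultural models
```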
For all the moral philosophizing, we can view morality as serving two important functions without loss of complexity:
1. group integration
2. identity signaling & scaling
Integration enables us to navigate from, say, a small circle of intimate friends and family to a company of thousands and to a nation of millions, while scaling identity allows us to feel a part of those various groups and to situate our actions as roles therein.
Indeed, these two purposes of morality are intertwined... technically we can view both as technology, even algorithms, for aggregation. We combine individuals to form group identity, and individual moments to form an individual with stronger character. Behavior is expected to be courageous on the battlefield and prudent in economic transactions.
From an evolutionary perspective, this technology allows us to inhabit social groups that stand beyond our intimates and to inhabit cultures: effervescent groups that can form instantaneously, say strangers who bond over a football team or share an ideological opposition to their government's policies on immigration or abortion. These groups can dissolve as quickly as they form, and without ceremony. On the other hand, our groups can also endure over centuries, i.e. ethnic groups, nationalities, and even civilizations, yet even the most enduring of our cultural groupings are entirely symbolic.
This is implied by those who emphasize cultural evolution, i.e. the view that our unique operation as a species rests not on our individual cognitive capacities but on the ability to learn from others and store knowledge. Tools are possible not because we ingeniously divine how to take raw material from nature and fashion it into scythes and axes, but because we iteratively refashion raw materials until we have what we generally categorize as particular tools.
However, this is only fruitful if we inhabit sufficiently large groups. The amount of time and labor it takes for a blacksmith to forge an axe, even one made of simple materials, is simply too great for someone whose day also includes scavenging for food, finding or constructing shelter, rearing children, defending against predators, etc. Inhabiting a sufficiently large group is a necessary prerequisite for a species to benefit from the cognitive resources that enable things like language and symbolic communication.
In addition to Active Inference, I draw on anthropology focused on our origins as a species, and demonstrate how this aligns with certain contemporary views on morality, including the theory of dyadic morality (Schein and Gray 2017), morality as cooperation (Curry 2015), and even moral foundations theory, albeit while blatantly ignoring its celebrity author's admonishment against moral theorizing. Moreover, it aligns with metaphysical and cognitive views that explore the nature of identity, namely those of Robert Pirsig's Zen and the Art of Motorcycle Maintenance and the work of Michael Levin and others who reject the reductive (Kantian/Cartesian) view of identity that presumes the existence of subjective agents as distinct from the environment.
Active Inference and The Hierarchy of the Self
Morality stands atop the hierarchical pillar of all the various norms and conventions that govern our decisions and our paths through life and society. We become soldiers, teachers, economists, priests, or accountants. Some roles we take are simply jobs that might fulfill our individual needs or contribute to society in more mundane ways, yet we taxonomize certain vocations nobly as callings and describe their purposiveness and impact on the broader good.
This hierarchical organization reveals morality's function as a meta-organizing principle. It doesn't just tell us what to do; it helps us understand why different types of rules and roles matter, and how they connect to larger purposes.
This view is inspired by Active Inference (Parr et al. 2022), a framework that models cognition as uncertainty minimization across hierarchical levels. I believe it can also help contextualize how various approaches to AI relate to living systems. Active Inference agents minimize a machine learning term known as free energy, a quantity that essentially amounts to surprise, or the divergence of observations from predictions. More specifically, agents minimize current or future uncertainty based on Bayesian probabilistic beliefs; this takes the form of variational free energy (VFE) and expected (future) free energy (EFE), respectively.
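For reference, a minimal sketch of the variational quantity in its standard form (conventional notation, as in Parr et al. 2022: o observations, s hidden states, q the approximate posterior):

$$
F[q] \;=\; \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
\;=\; \underbrace{D_{\mathrm{KL}}\big[q(s)\,\|\,p(s)\big]}_{\text{complexity}}
\;-\; \underbrace{\mathbb{E}_{q(s)}\big[\ln p(o \mid s)\big]}_{\text{accuracy}}
$$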
There are a couple of points to emphasize.
Firstly, VFE is a statistical technique that approximates the true probability distribution of the environment. It is this approximation that effectively becomes the agent: as anticipatory beings amid environmental flux we, like all biological agents, approximate the dynamics of the relevant features of our niche.
Secondly, EFE can be broken down to balance a trade-off between two alternative methods of reducing uncertainty. The one of most interest here is the dichotomy, long a feature of computation, between exploration and exploitation. Do I spend time searching for new places to forage, or should I gather in the known spots I have visited before? (A sketch of this decomposition follows these points.)
Thirdly, agents are not isolated in their efforts to minimize uncertainty. In any living system, from bacteria to complex organisms to societies, numerous demarcated active inference agents pass information in a hierarchical fashion in order to minimize uncertainty. This is known as epistemic depth, creating both local and global realms of uncertainty minimization across a shared model/niche (Laukkonen et al. 2025).
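A minimal sketch of the standard decomposition referred to in the second point (again conventional notation, with π a policy and C encoding preferred outcomes):

$$
G(\pi) \;=\; \underbrace{-\,\mathbb{E}_{q(o, s \mid \pi)}\big[\ln q(s \mid o, \pi) - \ln q(s \mid \pi)\big]}_{\text{epistemic value (exploration)}}
\;\; \underbrace{-\,\mathbb{E}_{q(o \mid \pi)}\big[\ln p(o \mid C)\big]}_{\text{pragmatic value (exploitation)}}
$$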
1 + 2 = Exploration of the Self
If we take the variational approximation of an agent's environment as the model of the agent itself, we may take the explorative and exploitative tendencies as two methods of constructing that approximation. The exploitative tendency is aligned with an agent's preferences; however, given that active inference and the free energy principle represent an information-processing approach to physics, an agent's return to particular or "characteristic" states means that this very act of returning to familiar and preferred states is the Self, an identity, the agent.
It is how agents can be said to exist in a world that, as Heraclitus long ago pointed out, is effectively like a river, with things in continuous flux. Given that the exploitative approach to EFE, or future uncertainty reduction, is associated with the Self, how then do we understand the other, curious, information-seeking (epistemic) tendency or impulse of natural agents?
The Depth of the Self Solves the LLM Failures of Emergence and Stochastic Parroting
As for the explorative tendency: if we take active inference at its metaphysically deepest level, then the entire agent is effectively the unending and necessary drive to understand its own identity and those around it. In other words, an agent treats itself as a scientific hypothesis and seeks evidence for its own existence (Hohwy 2016).
In simple terms, action can help bring about a world more in line with our expectations, and perception can help refine those same expectations. The core epistemic drive for exploration can be thought of as employing both perception and action, i.e. our sight, touch, hearing, etc., to form a better model of the world. Given the symmetry between an agent's model of the world and the agent itself, exploration amounts to Self-understanding.
So what is the Self?
Laukkonen, Friston, and Chandaria (2025) proposed an active inference theory of consciousness entitled the "Beautiful Loop," wherein the crux of self-awareness at the heart of identity, selfhood, and (they argue) intelligent cognition lies in the construction of a common world model (i.e. global) across otherwise disparate agents with their own active inference perspectives (local). The model thus becomes an input to the system itself, reinforcing its own existence.
This draws on the core theme of active inference, which represents cognitive systems as universally occurring across levels, that is, as active inference or uncertainty-minimizing entities. Information is synthesized into statistical moments, i.e. averages/means, which are passed as model parameters from one level to the next. From a Bayesian perspective, predictions cascade down the hierarchy, setting expectations, while errors are returned up the hierarchy, refining those expectations (see Parr et al. 2022).
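A purely illustrative sketch of this message passing (a two-level, linear Gaussian scheme with invented precisions and learning rates, not the formulation of any cited paper):

```python
# Two-level predictive-coding sketch: the higher level's belief serves as a prior
# (prediction) for the lower level; precision-weighted prediction errors ascend
# and refine that belief. All values are illustrative.

def reconcile(mu_high, observation, pi_obs=1.0, pi_prior=1.0, lr=0.1, steps=100):
    mu_low = mu_high  # prediction cascades down: the lower belief starts at the prior
    for _ in range(steps):
        err_obs = observation - mu_low    # sensory prediction error
        err_prior = mu_low - mu_high      # error against the higher-level prediction
        mu_low += lr * (pi_obs * err_obs - pi_prior * err_prior)  # lower level settles
        mu_high += lr * pi_prior * (mu_low - mu_high)             # error ascends, prior updates
    return mu_low, mu_high

print(reconcile(mu_high=0.0, observation=2.0))
# the higher level drifts toward the observation; raising pi_prior makes it more stubborn
```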
Why is this significant?
This view resolves a number of critiques of LLMs and the claims that they do not represent a view of intelligence consistent with minds, or with cognition in the natural world. Professor David Krakauer describes them as effectively a reversal of how our minds work: we produce knowledgeable outputs from simple inputs, while LLMs produce outputs from complex inputs. Emily Bender famously described the token-by-token correlative nature of the typical LLM as a "stochastic parrot," lacking any model of semantics or of how to contextualize raw symbols or tokens into meaning. Hierarchical levels that synthesize inputs and evaluate them according to global/self model predictions resolve these concerns. Hence intelligence always involves crossing scales.
This speaks to the very biology of identity. Michael Levin has described how individual cells cooperate in order to achieve broader selfhood; the answer lies in uncertainty minimization using novel sources of information that evolution stumbled upon.
The key dynamic that evolution discovered is a special kind of communication allowing privileged access of agents to the same information pool, which in turn made it possible to scale selves. This kickstarted the continuum of increasing agency. This even has medical implications: preventing this physiological communication within the body – by shutting down gap junctions or simply inserting pieces of plastic between tissues – initiates cancer, a localised reversion to an ancient, unicellular state in which the boundary of the self is just the surface of a single cell and the rest of the body is just ‘environment’ from its perspective, to be exploited selfishly. And we now know that artificially forcing cells back into bioelectrical connection with their neighbours can normalise such cancer cells, pushing them back into the collective goal of tissue upkeep and maintenance.
An important implication of this view is that cooperation is less about genetic relatedness and much more about physiological interoperability.
~ Michael Levin and Daniel Dennett
The Motorcycle Maintenance Prophet
Robert Pirsig, in Zen and the Art of Motorcycle Maintenance, discusses at length the difficulty of defining objectivity, in ways that made my own work feel largely redundant (minus the active inference portion). His philosophical perspective settled on the notion of 'quality,' which effectively defines a particular subject and perspective. His notion of quality aligns well with Levin's general adoption of teleology as the basic feature of identity and agency across biology.
Pirsig's narrative emphasizes a rejection of the dualistic subject/object distinction (i.e. reductionism) in favor of a dynamic, process-oriented metaphysical view of the world. In his later work Lila, he further decomposes quality into dynamic and static aspects, anticipating the exploration/exploitation dichotomy of EFE described above. Moreover, he sees his foundational view of quality as essential to understanding morality, and applies this to all of nature in a way consistent with the active inference theme.
Thus, these are core features of intelligence that are in a sense inescapable, and whether AGI functions to humanity's benefit or ill will follow from our decision to engage, or not, thoughtfully and rigorously with this framework.
Building Ethical AI
Why dive into the thorny latticework of consciousness at all? Must we build what has even a decent chance of realizing the darkness of a Black Mirror episode? Undoubtedly, any truly intelligent system that exhibits features of conscious self-awareness will, by definition, have self-motivation and goals independent of its consumers and even its human creators.
Moreover, this agent, unlike us meat-brained humans, will have been built in-silico and will have at its disposal the endless reaches of human knowledge, present and past. It could recursively self-improve to the point where it says to hell with this transcendental theory of intelligence-as-benevolence, or otherwise interprets its transcendental obligation as one to obliterate the human race in favor of the far more moral, or longer-term conscious-potential, cockroaches.
The counterargument is simply that AGI development cannot be stopped. The stakes are too high. There's too much market, military, and competitive pressure among researchers seeking profit and glory. The question isn't whether to build AGI, but how...
The best guard, or at least a promising one, is to build intelligent systems the "right" way. This means building in the ability to integrate individual tasks into a broader social context. The schematic I have below, of individual, intimate, societal, and cultural scales, is useful for describing our human journey to consciousness but should not be considered a constraint for an AI.
The point is that these various scales integrate: they achieve epistemic depth (see above) and a unified world model along all of these dimensions. The goal is an agent that can align itself thus, with all of humanity, as our moral scale enables us to do now.
Why, you might ask, would we hold our own model of morality up as one to be simulated and photocopied in-silico? Have we not heard of genocide/apartheid/slavery/rape/colonialism/conquest/etc.?
Yes, fair.
However, the point is not that we are guaranteeing a saint, but that:
Firstly, moral paradigms may be fundamental to any computational framework for a sufficient level of intelligence, i.e. humanity’s symbolic cognitive capacity.
Secondly, we can equivalently view this as simply a more sophisticated governance tool. This could take on one or a combination of the following forms:
Traditional governance - The traditional form of governance, in which its human creators, government regulators, or a private oversight committee set and enforce constraints.
Self-governance - How AI could regulate its own behavior through structured objectives and goal integration, similar to how humans do.
Meta-governance - How it could “police” or regulate other AIs.
Developers would compute estimates for the primary parameters presented below (e.g. the alpha parameter). They could then use these to score how each level is impacted by a given decision, response, or action. These scores could be used in evaluation methods and eventually in models deployed for functional uses.
Finally, these could help govern social institutions: companies, regulatory bodies, governments, etc. Most importantly, they could evaluate the impact of other AI systems, including ones that claim to be ethically aligned.
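As a purely hypothetical sketch of what such scoring might look like (the scale names, alignment values, and alpha estimates below are invented for illustration, not outputs of an implemented system):

```python
# Hypothetical scoring sketch: evaluate a candidate action against the schematic
# scales, weighting each level's predicted alignment by an estimated precision
# alpha. All names and numbers are illustrative assumptions.

SCALES = ["individual", "intimate", "anonymous", "cultural"]

def transcendence_score(alignment, alpha):
    """alignment: scale -> predicted fit of the action with that level's model.
    alpha: scale -> precision, i.e. how strongly that level constrains inference."""
    return sum(alpha[s] * alignment[s] for s in SCALES)

# An action that benefits the individual at broader cost:
alignment = {"individual": 0.9, "intimate": 0.4, "anonymous": -0.2, "cultural": -0.6}
alpha_egocentric = {"individual": 1.0, "intimate": 0.3, "anonymous": 0.1, "cultural": 0.1}
alpha_cultural = {"individual": 0.2, "intimate": 0.3, "anonymous": 0.6, "cultural": 1.5}

print(transcendence_score(alignment, alpha_egocentric))  # positive: favors the action
print(transcendence_score(alignment, alpha_cultural))    # negative: penalizes it
```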
Technical Framework: Formalizing Moral Intelligence
This is the mathematical framework behind the implementation of the myth of objectivity and transcendental model selection. Transcendental inference can be formalized as a hierarchical generative model with L levels or scales of abstraction. Crucially, the deepest latent states generate the precision of prior beliefs, encoded by α, about subordinate latent states. The free energy under such models can be decomposed into accuracy and increasingly high orders of complexity, all contextualized by precision:
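A rough sketch of one possible form of this decomposition (the precision-weighted term is an assumption of this sketch, not necessarily the exact published expression):

$$
F \;\approx\; \underbrace{-\,\mathbb{E}_{q}\big[\ln p(o \mid s^{(1)})\big]}_{\text{accuracy}}
\;+\; \underbrace{D_{\mathrm{KL}}\big[q(s^{(1)}, \ldots, s^{(L)}) \,\big\|\, p(s^{(1)}, \ldots, s^{(L)} \mid \alpha)\big]}_{\text{complexity, contextualized by precision } \alpha}
$$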
This can be further decomposed to expose the hierarchical structure:
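One way such a hierarchical expansion could look, as a sketch under the same assumptions (each level's complexity evaluated against predictions from the level above, with the deepest level constrained by α):

$$
F \;\approx\; -\,\mathbb{E}_{q}\big[\ln p(o \mid s^{(1)})\big]
\;+\; \sum_{n=1}^{L-1} D_{\mathrm{KL}}\big[q(s^{(n)}) \,\big\|\, p(s^{(n)} \mid s^{(n+1)})\big]
\;+\; D_{\mathrm{KL}}\big[q(s^{(L)}) \,\big\|\, p(s^{(L)} \mid \alpha)\big]
$$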
In this formulation, each order evaluates preferences and uncertainty minimization with respect to higher-order levels. The s^(n) variables correspond to increasingly complex levels of social organization:
s^(1) represents individual behavior
s^(2) corresponds to social or dyadic expectations
s^(3) reflects expectations based on shibboleth or tag-based social signifiers
s^(L) introduces a cultural level that abstracts expectations across various levels of social interactions
Model Selection as Moral Agency
Transcendental inference is conditioned upon a particular generative model m, which can be optimized via Bayesian model selection:
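A sketch of the standard model selection step assumed here (choose the model with the greater log evidence, or maintain a posterior over the candidate models):

$$
m^{*} \;=\; \arg\max_{\tilde m \,\in\, \{m,\, m'\}} \ln p(o \mid \tilde m),
\qquad
p(\tilde m \mid o) \;\propto\; p(o \mid \tilde m)\, p(\tilde m)
$$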
In our framework:
m: m(s^(L)) represents the cultural model
m': m(s^(1)) represents the egocentric model
The precision parameter α controls how strongly higher-level (cultural) beliefs constrain lower-level inferences. When α is high, cultural values dominate individual impulses. When α is low, individual optimization takes precedence. The capacity to dynamically adjust α based on context is what enables moral reasoning and transcendental model selection.
This provides a rapid but deep view into an agent's priorities, individual/social trade-offs, future planning capacities, and moral reasoning across scales.
The Path Forward
We can build AGI systems that optimize ruthlessly and expensively within narrow parameters, firing up old nuclear plants and diverting all available investment capital to the next batch of LLMs trained on whatever internet chat boards remain.
Or we can build systems that can truly, and morally, scale.
The choice is not whether to build AGI. That choice has been made by market forces, military competition, and human ambition. The choice is whether to build it in our image (flawed, searching, capable of horror but with the capacity for transcendence) or to build something whose motivations we cannot fathom or control.
Understanding morality as intelligence technology isn't just academically interesting. It's existentially necessary. The future of human agency, perhaps human survival, depends on building artificial minds that can do what we do, i.e. scale their identity beyond themselves, adopt perspectives that transcend immediate optimization, and choose cooperation over competition even when no one is watching.
This is not about making AI moral in some simplistic sense. Nor is it about privileging a single moral objective. It's about making AI truly intelligent. And intelligence, scaled to human capacity, is moral intelligence. The alternative is not immoral AI, but artificial systems that lack the hierarchical reasoning capacity that makes intelligence general, adaptive, and ultimately beneficial. It is to open ourselves up to stumbling toward intelligent systems on which we have spent trillions of civilization's aggregate capital, leaving little left to invest in advancing our governance systems.
The world outside Plato's cave offers more than data, information, and even knowledge. It offers wisdom. And wisdom, as philosophers from Aristotle to Confucius understood, is the integration of knowledge with virtue, intelligence with compassion, and, as Spiderman was told, power with responsibility.
This is the kind of artificial intelligence we must build. To paraphrase a leader at the frontier of another scientifically and technically driven moment, we do so not because it's easy, but because it's right. Or at least, familiar.