Journey Mapping 101

Mike's Notes

I'm wondering whether Journey Maps could be added to Pipi.

  • To better understand the needs of the different users of Pipi visually
  • As a tool for customers.
I first used Journey Mapping a few months ago while testing Krobar.ai, an excellent simulation platform from Kromatic.

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library > Subscriptions > NNg Newsletter
  • Home > Handbook > 

Last Updated

18/01/2026

Journey Mapping 101

By: Sarah Gibbons
NN Group: 09/12/2018

Vice President of Nielsen Norman Group. She works at the intersection of research, strategy, and design.

Summary

A journey map is a visualization of the process that a person goes through in order to accomplish a goal.

Journey maps are a common UX tool. They come in all shapes, sizes, and formats. Depending on the context, they can be used in a variety of ways. This article covers the basics: what a journey map is (and is not), related terminology, common variations, and how we can use journey maps.

In This Article

  • Definition of a Journey Map
  • Key Components of a Journey Map
  • Journey-Map Variations
  • Why Use Journey Maps
  • Conclusion

Definition of a Journey Map

Definition: A journey map is a visualization of the process that a person goes through in order to accomplish a goal.

In its most basic form, journey mapping starts by compiling a series of user actions into a timeline. Next, the timeline is fleshed out with user thoughts and emotions in order to create a narrative. This narrative is condensed and polished, ultimately leading to a visualization.

Most journey maps follow a similar format: at the top, a specific user, a specific scenario, and corresponding expectations or goals; in the middle, high-level phases composed of user actions, thoughts, and emotions; at the bottom, the takeaways: opportunities, insights, and internal ownership.

The terms ‘user journey map’ and ‘customer journey map’ can be used interchangeably. Both reference a visualization of a person using your product or service.

While the argument can be made that the term ‘customer’ does a disservice to the method (because, especially for certain business-to-business products, not all end users are technically customers, i.e., product buyers), alignment on what you call the map is far less important than alignment on the content within the map.

Key Components of a Journey Map

Journey maps come in all shapes and sizes. Regardless of how they look, journey maps have the following 5 key elements in common:

  1. Actor
  2. Scenario + Expectations
  3. Journey Phases
  4. Actions, Mindsets, and Emotions
  5. Opportunities

Actor

The actor is the persona or user who experiences the journey. The actor is who the journey map is about — a point of view. Actors usually align with personas and their actions in the map are rooted in data.

Provide one point of view per map in order to build a strong, clear narrative. For example, a university might choose either a student or a faculty member as actor — each would result in different journeys. (To capture both viewpoints, the university will need to build two separate maps, one for each of the two user types.)

Scenario + Expectations

The scenario describes the situation that the journey map addresses and is associated with an actor’s goal or need and specific expectations. For example, one scenario could be switching mobile plans to save money, with expectations that include easily finding all the information needed to make a decision.

Scenarios can be real (for existing products and services) or anticipated (for products still in the design stage).

Journey maps are best for scenarios that involve a sequence of events (such as shopping or taking a trip), describe a process (thus involve a set of transitions over time), or might involve multiple channels.

Journey Phases

Journey phases are the different high-level stages in the journey. They provide organization for the rest of the information in the journey map (actions, thoughts, and emotions). The stages will vary from scenario to scenario; each organization will usually have data to help it determine what these phases are for a given scenario.

Here are some examples:

  • For an ecommerce scenario (like buying Bluetooth speakers), the stages can be discover, try, buy, use, seek support.
  • For big (or luxury) purchases (like buying a car), the stages can be engagement, education, research, evaluation, justification.
  • For a business-to-business scenario (like rolling out an internal tool), the stages could be purchase, adoption, retention, expansion, advocacy.

Actions, Mindsets, and Emotions

These are behaviors, thoughts, and feelings the actor has throughout the journey and that are mapped within each of the journey phases.

Actions are the actual behaviors and steps taken by users. This component is not meant to be a granular step-by-step log of every discrete interaction. Rather, it is a narrative of the steps the actor takes during that phase.

Mindsets correspond to users’ thoughts, questions, motivations, and information needs at different stages in the journey. Ideally, these are customer verbatims from research.

Emotions are plotted as a single line across the journey phases, literally signaling the emotional “ups” and “downs” of the experience. Think of this line as a contextual layer of emotion that tells us where the user is delighted versus frustrated.

Opportunities

Opportunities (along with additional context such as ownership and metrics) are insights gained from mapping; they speak to how the user experience can be optimized. Insights and opportunities help the team draw knowledge from the map:

  • What needs to be done with this knowledge?
  • Who owns what change?
  • Where are the biggest opportunities?
  • How are we going to measure improvements we implement?

An example of a simplistic, high-level customer-journey map depicting how the persona “Jumping Jamie” switches her mobile plan. While all comprehensive journey maps should include key components, what the map chooses to prioritize can (and should) depend on the goal of the journey-mapping initiative. (For your convenience, we provide a journey-map template that you can use.)

Journey-Map Variations

There are several concepts closely related and thus easily confused with journey maps.

It is important to note that this section is only meant to help your personal understanding and clarification of these terms. It is not advised to debate or attempt to shift a whole organization’s language to abide by the definitions stated here. Instead, use these definitions to guide you towards aspects of another method that your team has not previously considered.

Journey Map vs. Experience Map

Think of an experience map as a parent to a journey map. A journey map has a specific actor (a singular customer or user of a product) and specific scenario (of a product or service), while an experience map is broader on both accounts — a generic human undergoing a general human experience.

The experience map is agnostic of a specific business or product. It’s used for understanding a general human behavior; in contrast, a customer journey map is specific and focused on a particular business or product.

For example, imagine the world before the ridesharing market existed (Uber, Lyft, Bird, or Limebike, to name a few). If we were to create an experience map of how a person gets from one place to another, the map would likely include walking, biking, driving, riding with a friend, public transportation, or calling a taxi. Using that experience map we could then isolate pain points: unknown fares, bad weather, unpredictable timing, paying in cash, and so on. Using these pain points, we would then create a future journey map for a specific product: how does a particular type of user call a car using the Lyft app?

Journey Map vs. Service Blueprint

If journey maps are the children to experience maps, then service blueprints are the grandchildren. They visualize the relationships between different service components (such as people or processes) at various touchpoints in a specific customer journey.

Think of service blueprints as a part two to customer journey maps. They are extensions of journey maps, but instead of being focused on the user (and taking the user’s viewpoint), they are focused on the business (and take its perspective).

For the Lyft scenario above, we would take the journey map and expand it with what Lyft does internally to support that customer journey. The blueprint could include matching the user to a driver, contacting the driver, calculating fares, and so on.

Journey Map vs. User Story Map

User stories are used in Agile to plan features or functionalities. Each feature is condensed down to a deliberately brief description from a user’s point of view; the description focuses on what the user wants to do, and how that feature will help. The typical format of a user story is a single sentence: “As a [type of user], I want to [goal], so that [benefit].” For example, “As a checking account holder, I want to deposit checks with my mobile device, so that I don’t have to go to the bank.”

A user story map is a visual version of a user story. For example, take the user story above (“As a checking account holder, I want to deposit checks with my mobile device, so that I don’t have to go to the bank.”) and imagine writing out the different steps that the team plans for the user to take when using that functionality. These steps could be: logging in, beginning the deposit, taking a picture of the check, and entering transaction details. For each step, we can document required features: enabling camera access, scanning the check and autofilling numbers, and authorizing the signature. In a user story map, these features are written on sticky notes, then arranged based on the product release that each functionality will be added to.

While, at a glance, a user story map may look like a journey map, journey maps are meant for discovery and understanding (think big picture), while user story maps are for planning and implementation (think little picture).

Although a journey map and user story map may contain some of the same pieces, they are used at different points of the process. For example, imagine our journey map for Lyft indicated that a pain point appeared when the user was in a large group. To address it, the team may introduce a multicar-call option. We could create a user story map to break this feature (multicar call) into smaller pieces, so a product-development team could plan release cycles and corresponding tasks.

Why Use Journey Maps

The benefits of journey maps (and most other UX mappings) are two-fold. First, the process of creating a map forces conversation and an aligned mental model for the whole team. Fragmented understanding is a widespread problem in organizations because success metrics are siloed; it is no one’s responsibility to look at the entire experience from the user’s standpoint. This shared vision is a critical goal of journey mapping, because, without it, agreement on how to improve customer experience would never take place.

Second, the shared artifact resulting from the mapping can be used to communicate an understanding of your user or service to all involved. Journey maps are effective mechanisms for conveying information in a way that is memorable, concise, and that creates a shared vision. The maps can also become the basis for decision making as the team moves forward.

Conclusion

Journey mapping is a process that provides a holistic view of the customer experience by uncovering moments of both frustration and delight throughout a series of interactions. Done successfully, it reveals opportunities to address customers’ pain points, alleviate fragmentation, and, ultimately, create a better experience for your users.

Additional articles are available, discussing: 

  • When to create customer journey maps
  • The 5-step process
  • Journey mapping in real life

The Feynman Lectures are now online and free

Mike's Notes

Physicist Richard Feynman shared a Nobel Prize for quantum electrodynamics and was a wonderful teacher of physics. This is a resource of his lectures for my future study of particle physics.

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library >
  • Home > Handbook > 

Last Updated

17/01/2026

The Feynman Lectures are now online and free

By: Nurin Ludin
From Quarks to Quasars: 13/03/2025

Science often loses its allure in a maze of jargon, with wonder quickly giving way to dry facts and equations.

The problem is particularly noticeable in classrooms, where rote memorization trumps ingenuity and understanding. However, when Richard Feynman taught science, this was never a problem. 

His eccentricity and flair for showmanship made science accessible and exciting, leaving his listeners wide-eyed and eager to learn more. 

But perhaps this is to be expected. He was, after all, one of the greatest and most influential theoretical physicists in history. But, of course, he wasn't infallible.

Richard Feynman: Early beginnings

Feynman was born in 1918 in New York City to a Jewish family — though he was a professed atheist from his early teenage years on. Unfortunately, Feynman’s Jewish heritage often made life difficult. At the time, antisemitism was widespread in the United States, and Feynman was rejected when he applied to Columbia University. Today, many argue that this rejection was due to the university's quota on Jewish admissions.  

He was, however, accepted to MIT. Feynman subsequently completed his doctoral work at Princeton University. 

Feynman married his high school sweetheart, Arline Greenbaum, in 1942. The previous year, Greenbaum had been diagnosed with lymphatic tuberculosis. It was a death sentence, with doctors stating that she had only two years to live. But she was the love of Feynman's life, and so they married. Directly after the ceremony, Feynman took Greenbaum to the hospital, where he would visit her every weekend.

Greenbaum died with Feynman by her side in 1945. She was twenty-five. 

The science years

Feynman began his career in science as a junior physicist in the Manhattan Project, working toward producing the world’s first atomic bomb. Specifically, Feynman worked on a theory of how to separate Uranium 235 from Uranium 238. 

Later in the project, he became the leader of the Theoretical Division and developed a formula to calculate the yield of a fission bomb with Hans Bethe. 

After the war ended, he moved on to teaching as a professor of theoretical physics at Cornell University and the California Institute of Technology, where he conducted his most groundbreaking work.

Feynman's primary contributions were to quantum mechanics. He introduced diagrams (now called “Feynman diagrams”) that are graphic analogs of the mathematical expressions needed to describe how particles interact.  

While his fellow laureates, Julian Schwinger and Sin-Itiro Tomonaga, approached quantum electrodynamics mathematically, Feynman drew pictures of every possible interaction between photons and electrons. His iconic doodles helped transform the very foundation of physics, allowing scientists to calculate the probability of each scenario and add them up to get the correct answer.

David Kaiser eloquently articulates the significance of Feynman’s diagrams: "With the diagrams’ aid, entire new calculational vistas opened for physicists. Theorists learned to calculate things many had barely dreamed possible before....It might be said that physics can progress no faster than physicists’ ability to calculate. Thus, in the same way that computer-enabled computation might today be said to be enabling a genomic revolution, Feynman diagrams helped to transform the way physicists saw the world, and their place in it."

In 1965, Feynman was awarded the Nobel Prize in Physics alongside Schwinger and Tomonaga for this work and its contributions to quantum electrodynamics. But, of course, Feynman was much more than just a theoretical physicist.

Feynman’s lectures

During his tenure at Caltech in the early 1960s, Feynman delivered a series of lectures that revolutionized the teaching of introductory physics. These lectures subsequently spawned a book that explained the most basic principles of astonishingly complex and formidable theories, like general relativity and quantum mechanics, in a way that was accessible, accurate, and comprehensive. 

In 2013, Caltech and the Feynman Lectures website collaborated to post these lectures online, and they’re completely free. 

In the first two years alone, the site was accessed more than 8 million times by roughly 1.7 million people — a testament to his teaching abilities and a demonstration of just how timeless his lectures are.

That’s not to say that Feynman did it alone. Fellow physicists Matthew Sands and Robert Leighton took turns editing and compiling the individual lectures, which took between 10 and 20 hours each.

Feynman did face some controversy, and such critical reflections are valuable and needed. But by nurturing a sense of wonder for nature and fostering in so many a desire to understand the mechanisms that govern the physical world, Feynman forever transformed our world and our understanding of it.

Ontology Documentation Tools and Workflows

Mike's Notes

Marcelo Xavier started a fascinating discussion on the Ontolog Forum about creating online documentation for an ontology. My friend Alex Shkotin used Gemini to identify a possible solution. Nico Matentzoglu also made valuable points, including suggestions around using Diataxis. OBOOK (Open Biological and Biomedical Ontologies Organised Knowledge) is a fantastic resource and a great example of how to organise successful training material.

I have copied the whole thread here, along with the resources, for future reference. I can use all of this in future development of the Pipi Ontology Engine (ont).

Dear community,

I would like to kindly request suggestions on the best way to create online documentation for an ontology currently being developed in WebProtégé.

We are looking for tools or approaches that enable us to share the progress and structural details of the ontology in a clear and accessible manner for all stakeholders involved.

Thank you very much in advance for your help and recommendations.

Sincerely,

Marcelo Xavier


"Dear Marcelo,

I am personally for self documented code, then we need one or another rendering engine. Did you ask OBO Foundry and [protege-user] list?

I asked Gemini, have a look https://gemini.google.com/share/6e091b60d275

Best,

Alex"


Dear Marcelo,

Unfortunately I cannot answer your question for WebProtege specifically; Note that independently of your issue (and despite this: https://github.com/protegeproject/webprotege/issues/284), I would urge anyone developing an ontology regardless of where it is curated/edited to use a standard version control system like GitHub or GitLab for community engagement and "open science best practice" (standard workflows, etc).

A lot of OBO ontologies use the Ontology Development Kit (ODK) which comes with some built-in functions to generate an mkdocs (material-themed) documentation scaffolding which can then be extended by the ontology team. This is usually deployed on github.io (which you seem to have some personal experience with as well). Some of our docs pages are very detailed, others vanilla, see for example:

https://obophenotype.github.io/uberon/

https://obophenotype.github.io/human-phenotype-ontology/

https://oborel.github.io/obo-relations/

We mostly curate our documentation manually, but claude code or similar can, with some guidance and careful review, generate quite reasonable pages as well. 

On a more personal note:

- I like the clarity of the diataxis framework (https://diataxis.fr/) for organising docs, which we more or less try to follow in OBO Academy (https://oboacademy.github.io/obook/). 

- I really like it if modelling patterns in the ontology are documented explicitly using something like DOSDP: https://github.com/monarch-initiative/mondo/blob/master/src/patterns/dosdp-patterns/autoimmune.yaml. It is trivially possible to generate documentation pages from these to have something like this: https://mondo.readthedocs.io/en/latest/editors-guide/patterns/ (I find this is, if I may be so bold, the most important part of ontology documentation - even though hardly anyone does it).

Good luck!

Nico

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library > Subscriptions > Ontolog Forum
  • Home > Handbook > 

Last Updated

16/01/2026

Ontology Documentation Tools and Workflows

By: Alex Shkotin
Gemini: 14/01/2026

Alex is an independent researcher in Moscow.

Gemini

Some responses below were generated according to the creator's saved info and preferences. 14 January 2026 at 21:51

Published on 14 January 2026 at 21:53

Prompt

request suggestions on the best way to create online documentation for an ontology currently being developed in WebProtégé.

We are looking for tools or approaches that enable us to share the progress and structural details of the ontology in a clear and accessible manner for all stakeholders involved.

Output

Creating online documentation for an ontology in WebProtégé is a common challenge, as the platform is excellent for editing but doesn't have a built-in "one-click" public documentation site.

To share structural details with stakeholders, the best approach is to export your ontology from WebProtégé and then use a specialized documentation generator. Here are the most effective tools and methods for 2026.

1. Top Documentation Tools

Once you export your ontology (usually in Turtle or RDF/XML format), use one of these tools to generate a human-readable website.

  • WIDOCO (standard-compliant docs): A "Wizard" that generates a complete HTML site with metadata, cross-references, and a visual diagram (via WebVOWL).
  • LODE (quick previews): A web service that turns an OWL file into a single, clean HTML page. Great for non-technical stakeholders to read class definitions.
  • WebVOWL (visual stakeholders): Provides an interactive, graphical view of the ontology. Excellent for showing relationships and hierarchy visually.
  • OnToology (automated workflows): If you use GitHub, this tool automatically runs WIDOCO and WebVOWL every time you push a change to your repository.

2. Recommended Workflow: The "Professional" Route

If you want a live site that stays updated as your team works in WebProtégé, follow these steps:

  1. Export from WebProtégé: Go to Project > Export and select the Turtle (.ttl) or RDF/XML format.
  2. Upload to GitHub: Create a repository for your ontology.
  3. Use WIDOCO: Run the WIDOCO wizard. It will ask you for a title, description, and logo. It then generates a folder full of HTML files.
  4. Host on GitHub Pages: Enable GitHub Pages on your repository to turn that folder into a public URL (e.g., https://your-org.github.io/ontology-doc).

3. Collaborative Alternatives (Internal)

If you aren't ready to publish a full website yet, you can use WebProtégé’s internal features to keep stakeholders in the loop:

  • Sharing & Permissions: You can invite stakeholders to the WebProtégé project with "View Only" or "Comment Only" permissions. This allows them to see the structure without accidentally changing anything.
  • Threaded Comments: Use the "Comments" tab in WebProtégé to discuss specific classes or properties. Stakeholders can leave feedback directly on the entity they are concerned about.
  • Project Feed: This provides a timeline of changes, which is helpful for stakeholders to see the "velocity" of the project and what has been added recently.

4. Enriching Your Documentation

For any documentation tool to work well, stakeholders need clear descriptions. Ensure your team fills out these Annotation Properties in WebProtégé for every class and property:

  • rdfs:label: A human-friendly name (e.g., "FinancialTransaction" vs "financial_transaction").
  • rdfs:comment or skos:definition: A clear, non-circular definition of the concept.
  • dc:creator: To show who is responsible for that part of the model.
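Put together, those annotations might look like the following Turtle fragment (the `ex:` namespace, class IRI, and literal values here are hypothetical, purely for illustration):

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <http://example.org/ontology#> .

ex:FinancialTransaction a owl:Class ;
    rdfs:label "Financial Transaction"@en ;
    skos:definition "An exchange of monetary value between two parties."@en ;
    dc:creator "Modelling Team A" .
```

Documentation generators like WIDOCO and LODE read exactly these properties to produce the labels and definitions stakeholders see.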

Would you like me to help you draft a set of "Documentation Best Practices" for your team to follow within WebProtégé to ensure the final output is high quality?

Net-SNMP for PDU?

Mike's Notes

I'm looking for a solution to a problem. Some working notes.

"The first Data Centre

Once scaling begins, Pipi 9 will then need a data centre to use as a render farm to automatically create customised SaaS enterprise applications based on user requirements. The data centre will be completely isolated from the internet to maximise security. It can be expanded in stages if it is planned appropriately.

Each industry and each enterprise customer will get a dedicated server to store a mirrored copy of their deployment configuration and parameters, including localisation. No user data will be stored." - On A Sandy Beach

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library >
  • Home > Handbook > 

Last Updated

15/01/2026

Net-SNMP for PDU?

By: Mike Peters
On a Sandy Beach: 15/01/2026

Mike is the inventor and architect of Pipi and the founder of Ajabbi.

The future Pipi 9 data centre must be fully automated and physically isolated from the internet.

Power

A 19" rack-mounted intelligent PDU will distribute power to the servers.

According to Server Room Environments

Intelligent Rack PDUs

Intelligent PDUs add a level of sophistication to power distribution within IT server racks and include:

  • PDU and Outlet Metering: metered PDUs provide power-related information locally and remotely, which can include Amps (A), Volts (V), Frequency (Hz), Watts (W), Energy (kWh) and Power (kVA). The power readings can be for individual outlets and the total PDU usage. The information can be used by IT managers for capacity planning, the prevention of circuit overloads, client and cost centre billing (with +/- 1% accuracy) and efficiency calculations, including Power Usage Effectiveness (PUE).
  • Switched and Outlet Switched: a switching PDU will include metering and add the ability to remotely control (ON/OFF) to the PDU and the outlets. Outlet-switched PDUs provide a way for IT and data centre managers to remotely reboot or power down connected loads and allow for cascading power-ups to manage load inrush currents. Switching PDUs adds a layer of security to a server room or data centre power plan in terms of controlling unauthorised access to rack-level loads and their power connections. Costs are also reduced, as an onsite engineer visit is removed in most instances.
  • Remote IP Monitoring: the PDU provides connectivity for remote monitoring and can include HTTP/HTTPS, IPv4 and IPv6, Telnet, SSH, Virtual Serial, SNMP (v1, v2c, v3), JSON-RPC, LDAP, FTP/SFTP and RADIUS for secure login. The monitoring provides a way to view the status of the PDU and its individual socket outlets using a browser, monitoring software or a data centre infrastructure management (DCIM) software suite. The PDU may also offer a RESTful API for bespoke communications applications. Dual Ethernet ports can provide communications redundancy, and the communications module will typically be a ‘hot-swap’ type.

This means that each individual power outlet can be remotely switched on and off via IP. That could be controlled by another server.

A server can power up automatically when the power outlet is turned on. The server can be made to run a program (Pipi 9), do some work, create a backup, and then shut down.

My question is:

"Could the PDU detect that the server has shut down and then turn off the power outlet?"

According to a post on the Schneider Electric forum, this can be done with scripting using Net-SNMP.

Net-SNMP

It runs on Linux and Windows and is open-source.

"Net-SNMP is a suite of software for using and deploying the SNMP protocol (v1, v2c and v3 and the AgentX subagent protocol). It supports IPv4, IPv6, IPX, AAL5, Unix domain sockets and other transports. It contains a generic client library, a suite of command line applications, a highly extensible SNMP agent, perl modules and python modules." - Wikipedia
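As a rough sketch of the scripting idea, a small watchdog on a controller machine could poll the server and, once it stops answering, call Net-SNMP's `snmpset` to switch the outlet off. Note the outlet OID, community string, and on/off integer values below are placeholders, not real vendor values; a real script would use the OIDs from the PDU vendor's MIB.

```python
import subprocess
import time

# PLACEHOLDER OID -- substitute the outlet-control OID from your PDU's MIB.
OUTLET_OID = "1.3.6.1.4.1.99999.1.1.{outlet}"


def ping_ok(host: str) -> bool:
    """Return True if the host answers a single ICMP echo within 2 seconds."""
    result = subprocess.run(["ping", "-c", "1", "-W", "2", host],
                            capture_output=True)
    return result.returncode == 0


def snmp_set_outlet(pdu: str, outlet: int, on: bool,
                    community: str = "private") -> None:
    """Switch a PDU outlet via Net-SNMP's snmpset (assumes 1 = on, 2 = off)."""
    value = "1" if on else "2"
    subprocess.run(["snmpset", "-v2c", "-c", community, pdu,
                    OUTLET_OID.format(outlet=outlet), "i", value], check=True)


def watch_and_power_off(server: str, pdu: str, outlet: int,
                        misses_needed: int = 3, interval: float = 30.0,
                        ping=ping_ok, set_outlet=snmp_set_outlet) -> int:
    """Poll the server; after `misses_needed` consecutive failed pings,
    treat it as shut down and switch its outlet off. Returns the number
    of polls performed. `ping` and `set_outlet` are injectable for testing."""
    misses = 0
    polls = 0
    while misses < misses_needed:
        polls += 1
        misses = 0 if ping(server) else misses + 1
        if misses < misses_needed:
            time.sleep(interval)
    set_outlet(pdu, outlet, on=False)
    return polls
```

A more robust version would distinguish "server deliberately shut down" from "network blip", for example by having Pipi 9 write a shutdown marker before halting, but the polling loop above captures the basic mechanism.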

Pipi 9 in production

Pipi 9 is large but also very power-efficient (it's not an LLM). Each copy of Pipi 9 requires its own server. In this data centre, each enterprise customer has its own backend server running a customised Pipi 9 as a digital twin. I'm experimenting to see whether a small-form-factor refurbished PC could do the job.

There would need to be hundreds of these PCs in racks, autonomously coming online and offline as required to run batch jobs. Massive redundancy is provided by having spare PCs synced, NAS storage, VMs, etc.

This means that single-phase PDUs can be used. Maybe 1 PDU per 10-15 PCs per shelf, with only a few PCs running at any given time. 1 UPS per cabinet at the bottom, supplying multiple shelves.

Mechanical air-lock

Data needs to pass between this isolated data centre and customer deployments hosted in the cloud. My thought is to add a staging area (or several) between the two and pass data back and forth via network switches that are mechanically cycled (analogue). A bit like a double air-lock in space.

This would make it extremely difficult for an external attacker to breach the system.

Robots in control

Pipi 9 is designed to serve as the system administrator for all customer cloud deployments, coming online only as needed, and during quiet times, to minimise disruption:

  • Updates
  • Configuration
  • Security
  • Databases
  • Deploying Docker
  • etc

Pub/Sub Clearly Explained (in Under 6 Minutes)

Mike's Notes

Excellent description.

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library > Subscriptions > Level Up Coding System Design
  • Home > Handbook > 

Last Updated

14/01/2026

Pub/Sub Clearly Explained (in Under 6 Minutes)

By: Nikki Siapno
Level Up Coding System: 20/12/2025

Founder LUC | Eng Manager | ex-Canva | 400k+ audience | Helping you become a great engineer and leader.

(6 Minutes) | How It Works, Best Practices, Why Teams Reach for It, When Not to Use It, and Tradeoffs

Pub/Sub Clearly Explained

Pub/sub is not “just a queue with topics.”

It’s a different way of wiring a system. Instead of calling a specific service and waiting, you publish an event and let whoever cares react.

That shift sounds small, but it changes your scalability, your failure modes, and how teams add features without tripping over each other.

If you’ve ever added “just one more downstream call” and watched latency, coupling, and deploy coordination explode, pub/sub is the pattern that turns that mess into a cleaner fan-out.

What Pub/Sub Actually Is

Publish/subscribe (pub/sub) is a messaging pattern where publishers broadcast messages to a topic, and subscribers receive messages for the topics they registered to, without either side needing to know who the other is.

That last part is the real power: decoupling.

The publisher doesn’t care who listens, and the listener doesn’t care who produced the event.

Each side evolves, scales, deploys, and even fails independently. You stop wiring services together one call at a time and start letting events flow through the system like signals, allowing teams to plug in new subscribers without modifying the service that emits the event.

How It Works

Most pub/sub setups have a broker (also called an event bus). Publishers send messages to the broker on a named topic/channel, and the broker routes a copy to every subscriber of that topic.

Key parts:

  • Publisher → Produces an event and publishes it to a topic.
  • Topic/Channel → The category label that subscribers use to filter what they receive.
  • Message broker → Tracks subscriptions and handles routing/buffering/filtering.
  • Subscriber → Consumes events for topics it subscribed to.

Two details that matter in real systems:

  • Push vs pull → Brokers may push events to subscribers, or let subscribers pull, depending on the implementation.
  • Transient vs durable → Some brokers store-and-forward (so offline subscribers can catch up), others only deliver to active subscribers.
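
The parts above can be sketched as a minimal in-process broker, assuming push-style delivery to active subscribers only (all names here are hypothetical; real brokers add buffering, durability, and network transport):

```python
from collections import defaultdict
from typing import Callable

class Broker:
    """Toy message broker: tracks subscriptions and fans out by topic."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)  # topic -> list of handler callbacks

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        """Register a handler for a topic; the publisher never sees this."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        """Push delivery: route a copy of the event to every subscriber of the topic."""
        for handler in self._subscribers[topic]:
            handler(dict(event))  # each subscriber gets its own copy

broker = Broker()
received = []
broker.subscribe("user.signed_up", lambda e: received.append(("email", e["user"])))
broker.subscribe("user.signed_up", lambda e: received.append(("analytics", e["user"])))
broker.publish("user.signed_up", {"user": "alice"})
# Both subscribers react; the publisher never names either of them.
```

Note the decoupling: adding a third subscriber would not touch the publishing code at all.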

Why Teams Reach for Pub/Sub

Pub/sub earns its popularity the moment a team feels how much friction it removes.

Instead of stitching services together with fragile chains of calls, you let events move through the system and let each component respond on its own terms.

That shift opens up a kind of freedom that’s hard to give up once you’ve experienced it.

It buys you that freedom in three ways:

  • Loose coupling → Publishers and subscribers don’t need to know about each other, so components evolve independently.
  • Async flow → Publishers don’t wait for subscribers to finish, so the system stays responsive and producers keep moving.
  • Fan-out scaling → One event triggers many independent workflows, and subscribers can scale out in parallel.

And you see the impact immediately.

A UserSignedUp event can kick off analytics, send a welcome email, run fraud checks, and sync the CRM, all without the signup service knowing or calling any of those systems.

The Tradeoffs

Pub/sub feels like a breakthrough the first time you use it.

You publish an event, everything reacts in parallel, and the system suddenly looks cleaner and more scalable.

But the moment you move past the happy path, you start to see the tradeoffs that come with that freedom. The architecture gets simpler; the responsibilities shift somewhere else.

Here’s where the cracks start to show:

  • Debugging gets harder → Tracing “who reacted to what” is less obvious, so you need strong observability.
  • Delivery semantics vary → Many systems are at-least-once or at-most-once, so consumers must handle duplicates or missed messages.
  • Ordering is not guaranteed → Global ordering is difficult; you may only get ordering within a partition, or none at all.
  • Operational overhead is real → Running brokers, scaling subscribers, and tuning throughput adds complexity.
  • “Fire-and-forget” can bite you → Publishers typically only know whether the broker accepted the event, not whether any subscriber successfully processed it.
  • Schema still couples you → You remove service coupling, but you keep contract coupling via event shape/meaning.

In the end, pub/sub doesn’t fully erase complexity; it moves it.

Instead of wrestling with chains of downstream calls, you manage event contracts, consumer behavior, and the operational realities of the broker.

The system becomes more flexible, but you’ll need to carry the new responsibilities that come with that flexibility.

When to Use Pub/Sub

As systems grow, there’s a moment when direct calls stop being a convenience and start becoming a liability.

Pub/sub shines in that moment.

Use it when your system benefits from decoupled, asynchronous, one-to-many communication.

Where it works well:

  • One-to-many events → A single event triggers updates across multiple services/UI components.
  • High throughput workflows → Many tasks can run in parallel off the same event.
  • Rapid evolution → You expect to add new subscribers later without changing the service that emits the event.
  • Real-time feeds/notifications → One update fans out to many subscribers without polling.

When Not to Use It

But pub/sub isn’t magic; it works beautifully in the right context and poorly in the wrong one.

Don’t force pub/sub into problems that need the following:

  • Single dedicated recipient → Use a queue or direct call because fan-out isn’t needed.
  • Strict global ordering → Pub/sub ordering is hard; you’ll add complexity fast.
  • Immediate confirmation → If the workflow requires “did it succeed?” you need request/response or an explicit acknowledgment protocol.
  • Small/simple systems → Pub/sub can be overkill when a few components can just talk directly.

Best Practices

Pub/sub works best when you treat events as first-class APIs.

The architecture gives you room to move, but only if the contracts and operations are strong enough to support that freedom.

These practices help keep the system predictable as it grows:

  • Design clear event schemas → Define each field, its meaning, and its expected stability because consumers rely on your contract.
  • Version events thoughtfully → Add fields in a backward-compatible way and deprecate slowly so subscribers have time to adjust.
  • Make consumers idempotent → Handle duplicates safely because delivery semantics vary across brokers.
  • Keep events focused → Emit facts (“UserSignedUp”) rather than commands (“SendWelcomeEmail”) because facts allow many independent reactions.
  • Avoid leaking internal details → Publish stable domain-level events instead of exposing internals that might change as the system evolves.
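
The idempotency practice above can be illustrated with a consumer that tracks processed event IDs (a sketch; in a real system the seen-set would live in durable storage such as a database, not in memory):

```python
class IdempotentConsumer:
    """Skips events it has already processed, so at-least-once delivery is safe."""

    def __init__(self) -> None:
        self._seen = set()  # processed event IDs (durable storage in practice)
        self.handled = []   # side effects performed, for demonstration

    def handle(self, event: dict) -> None:
        event_id = event["id"]
        if event_id in self._seen:
            return  # duplicate delivery from the broker: ignore silently
        self._seen.add(event_id)
        self.handled.append(event["payload"])  # the "real work" happens once

consumer = IdempotentConsumer()
event = {"id": "evt-1", "payload": "UserSignedUp"}
consumer.handle(event)
consumer.handle(event)  # broker redelivers the same event
# The side effect happens exactly once despite duplicate delivery.
```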

Recap

Pub/sub gives you a different way to wire a system: publish once, react in many places.

It removes the tight coupling of direct calls and opens the door to parallel workflows, faster evolution, and cleaner boundaries between teams. But it also shifts complexity into event design, operations, and debugging.

In the end, pub/sub pays off when you embrace its power and its responsibilities.

Building the Reasoning Engine at Axiom

Mike's Notes

This is a very worthy aim. I hope they succeed. I discovered this article from reading This Week In AI Research from Dr Ashish Bamania.

"AxiomProver is an autonomous multi-agent ensemble theorem prover for Lean 4.21, developed by Axiom Math.

It autonomously and fully solved 12 out of 12 problems in Putnam 2025, the world’s hardest college-level math test, using the formal verification language Lean, 8 of which within the exam time.

A repository containing the solutions generated by AxiomProver can be found using this link.

A technical report will follow in the coming days, as per the team." - This Week In AI Research

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library > Subscriptions > This Week In AI Research
  • Home > Handbook > 

Last Updated

13/01/2026

Building the Reasoning Engine at Axiom

By: Axiom
Axiom: 24/05/2025


How hierarchical planning, verification, and self-play converge to mathematical superintelligence.

For centuries, mathematics has been humanity's most powerful sandbox – a place where we construct logical frameworks to understand reality's most complex systems. Why do grid cells fire in a hexagonal pattern? Why do quantum energy levels align with the spacing of primes?

Yet mathematical progress has always been shackled by a cruel bottleneck: the scarcity of extraordinary minds. When Évariste Galois revolutionized algebra in a fevered night before his fatal duel at twenty, he left behind ideas so far ahead of their time that decades passed before his contemporaries could grasp them. Srinivasa Ramanujan, before he succumbed to malnutrition and ill health, channeled thousands of formulas from his dreams into notebooks – results so profound that mathematicians spent nearly a century proving what he had intuited.

Even when breakthroughs do occur, the extreme fragmentation of modern mathematics means experts in different subfields often cannot understand each other's work, so vital connections between domains remain hidden for decades. This combination of scarce genius and siloed knowledge creates an extraordinarily long pipeline from discovery to application: fundamental theorems discovered today might take generations before their full implications reshape technology and society, a delay longer than even Hardy anticipated in A Mathematician's Apology.

We're on a moonshot mission to change that. Axiom is building a reasoning engine capable of mathematical discoveries at scales and speed previously unimaginable – an AI mathematician.

The timing

Three trends are colliding for the first time in history.

First, neural networks have stepped beyond pattern matching into scalable reasoning, improving as compute, model size, and data grow.

Second, mathematical formalization has come of age through languages like Lean: by the Curry–Howard correspondence, proofs become executable programs, and programming languages are no longer just tools for producing outputs but instruments for certifying properties of abstract objects.
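
To make the Curry–Howard point concrete, here is a small Lean 4 example (illustrative, not from the article): the proof of a proposition is literally a program, in this case a function that swaps the two halves of a pair of evidence.

```lean
-- By Curry–Howard, this proof is simultaneously a program of type
-- p ∧ q → q ∧ p: it consumes evidence for p ∧ q and produces
-- evidence for q ∧ p. The Lean compiler certifies it is correct.
theorem and_swap (p q : Prop) : p ∧ q → q ∧ p :=
  fun ⟨hp, hq⟩ => ⟨hq, hp⟩
```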

And lastly, LLMs have crossed a critical threshold in code generation, reliably producing high-quality code across many languages – including formal specification languages – and serving as a strong prior over the otherwise infinite action space of mathematics.

This synergy creates an unprecedented opportunity: reasoning engines that can conjecture and prove infinite theorems with zero human involvement.

The Convergence

Autoformalization is the Natural Language Compiler

Our data factory

Imagine you were a programmer in the 1950s. Your day to day was punching machine code into cards.

In the 1970s, you were wrestling with FORTRAN. By the 1990s, maybe C++. Today? You're basically just talking to the computer in English. Turing's childhood dream is now your reality with coding agents.

Each generation got to think a little less about the machine and a little more about the fun problem they were actually trying to solve.

Modern compilers are one-way DAGs. They take Python and transform it down the stack through multiple representations until it becomes machine code. There's some upward flow – as you type in your IDE, a partial compilation happens via Language Server Protocol, feeding information back up to create those squiggly lines and suggestions. But compilers never go fully back up the abstraction ladder. Disassembly exists, but it doesn't reconstruct your original high-level intent.

Mathematics needs something compilers never achieved: a true bidirectional cycle. For thousands of years, mathematicians have thought in high-level, intuitive leaps, not formal logic. Yet math is undergoing a modernization. With proofs now spanning hundreds of pages, they are often riddled with hidden errors. In fact, when a proof stubbornly resists being formally proven, the informal human source turns out to be wrong – just recently, mistakes were fixed during the formalization effort of Fermat's Last Theorem. The bottom formal layer catches what high-level intuition misses.

Meanwhile, autoformalization – the process of converting natural language proofs to Lean code – is a form of hierarchical planning, bridging between the abstraction layers.

Going down the stack: Autoformalization translates intuitive mathematical reasoning into formal proof – the compiler's traditional direction.

Going up the stack: Autoinformalization translates formal proofs back into human intuition – something compilers never truly do.

When combined, these create an interactive prover in natural language, freeing mathematicians to explore dozens of strategies and rapidly prototype ideas while the machine handles formalization.

But here's the most powerful part: machine discoveries at the formal level feed back up. The machine finds patterns, lemmas, and connections in the formal space that humans never intuited, then surfaces them as new high-level insights. The mathematician's intuition becomes augmented by discoveries happening in a space they couldn't naturally explore.

The compiler revolution took decades; the mathematical compiler revolution is happening now.

Formal Verification Guides Mathematical World Modeling

Our algorithmic core

You are a gold prospector in 1849. Everyone brings you shiny rocks claiming they've struck it rich.

Experienced prospectors examine them: "Looks like gold to me."

But when confronted with exotic ores, even experts disagree. Their pattern matching fails on things they've never seen.

Then someone brings an assayer's scale. The metal either dissolves in acid or it doesn't. Binary truth.

When you write a proof, it's either correct or it's not. Formal verifiers like Lean provide perfect ground truth, while model judges are pattern matchers that fail when pushed to evaluate genuinely novel proofs. From an engineering angle, verification gives us efficiently scalable rewards for learning.

And here's the philosophical perspective of why we need formal verification: Our self-consistent, observer-supporting universe follows rules that can be captured mathematically – from laws of physics to probability theory. Mathematics is the consistent language of our consistent universe and formal languages like Lean let us consistently operate in the mathematical world model.

We are training systems in mathematics as reality's minimal simulation – by learning to navigate the world of mathematics one grounded step at a time, the hope is that the AI has learned some fundamental rules that our reality has to follow. Video generation models learn physics too. Sometimes one ponders … where do abstract reasoning and spatial reasoning join?

Conjecturing-Proving Loop Realizes Self-Improving AI

Our discovery engine

Verification can test whether gold is real, but finding new veins is harder: working in concert with verification, we enter the chapter of scientific discovery. Imagine you're in the middle of an ocean. Sailing towards new lands, of course, you start daydreaming about mathematics:

Your map is the Knowledge Base – showing where you've been. The entire corpus of mathematics indexed into a knowledge graph: definitions, theorems, and proofs. Formalized mathematics as a searchable, Internet-scale dataset.

Your ship is the Conjecturer – navigating uncharted territories. It spots distant landmasses through fog: "something valuable three days west." Built for open-ended exploration beyond known results, it samples out of distribution and generalizes with leaps guided by intrinsic motivations.

But when you spot an unknown island on the horizon, how do you know if it's India or the West Indies? The shape looks right, the distance seems plausible, but educated guess isn't certainty. You ask the experienced captain for wisdom that you trust – that is the Prover. Successful proofs extend the knowledge base. Failed attempts provide signals for improving both the Conjecturer and Prover. While formal verification turns "might be true" into "is true," counterexample construction shows "is false." Both grow the library.

The loop is self-reinforcing. More verified theorems mean a richer knowledge base. A richer knowledge base enables more sophisticated conjectures. More proof attempts (successful and failed) train better models. Better models generate more interesting conjectures and find proofs faster.

Axiom is building the AlphaGo for mathematics, but with infinite branching.

The Path Forward

The implications extend far beyond pure mathematics. Every complex system humans want to understand – from protein folding to quantum field theory and economic models – ultimately reduces to mathematical structures. A reasoning engine that can autonomously explore mathematical space and generate new theories doesn't just solve math problems; it provides a general-purpose tool for understanding reality.

Our founding team brings together precisely the expertise needed for this revolution. We were among the first to apply AI to compilers, bringing deep experience in programming languages and compiler technology. Our work spans from AI for mathematical discovery to pioneering self-improving systems. We're building reasoning engines that can operate in the mathematical world model at superhuman scale to tackle our most complex challenges.

The mathematical renaissance isn't coming. It's here.


AxiomProver at Putnam 2025

Putnam 2025, the world's hardest college-level math test, ended December 6th. By the end of the competition, AxiomProver had solved 8 out of 12 problems. In the following days, it solved the remaining 4. AxiomProver is an autonomous multi-agent ensemble theorem prover for Lean 4.21.0, developed by Axiom Math.

This repository contains the solutions generated by AxiomProver. Asterisk denotes solutions found after the competition.

  1. 2025 A1: [source], [graph]. Prover: 110 minutes, 7M tokens. Proof: 652 lines, 23 theorems, 561 tactics.
  2. 2025 A2: [source], [graph]. Prover: 185 minutes, 6M tokens. Proof: 556 lines, 26 theorems, 581 tactics.
  3. 2025 A3: [source], [graph]. Prover: 165 minutes, 8M tokens. Proof: 1,333 lines, 78 theorems, 1,701 tactics.
  4. 2025 A4: [source], [graph]. Prover: 107 minutes, 8M tokens. Proof: 960 lines, 32 theorems, 1,107 tactics.
  5. 2025 A5*: [source], [graph]. Prover: 518 minutes, 9.1M tokens. Proof: 2,054 lines, 52 theorems, 3,074 tactics.
  6. 2025 A6*: [source], [graph]. Prover: 259 minutes, 16M tokens. Proof: 588 lines, 28 theorems, 670 tactics.
  7. 2025 B1: [source], [graph]. Prover: 270 minutes, 7M tokens. Proof: 1,386 lines, 49 theorems, 1,841 tactics.
  8. 2025 B2: [source], [graph]. Prover: 65 minutes, 2M tokens. Proof: 417 lines, 28 theorems, 325 tactics.
  9. 2025 B3: [source], [graph]. Prover: 43 minutes, 2.9M tokens. Proof: 340 lines, 11 theorems, 422 tactics.
  10. 2025 B4*: [source], [graph]. Prover: 112 minutes, 249K tokens. Proof: 1,061 lines, 23 theorems, 1,433 tactics.
  11. 2025 B5: [source], [graph]. Prover: 354 minutes, 18M tokens. Proof: 1,495 lines, 66 theorems, 1,967 tactics.
  12. 2025 B6*: [source], [graph]. Prover: 494 minutes, 21M tokens. Proof: 1,019 lines, 30 theorems, 1,052 tactics.

The Strange Physics That Gave Birth to AI

Mike's Notes

Part of a series "Science, Promise and Peril in the Age of AI" from Quanta magazine in 2025. The series is excellent, explaining the science behind AI.

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library >
  • Home > Handbook > 

Last Updated

12/01/2026

The Strange Physics That Gave Birth to AI

By: Elise Cutts
Quanta Magazine: 30/04/2025

Elise Cutts is one of those ex-researchers who realized that writing about science is much more fun than doing it herself. Previously a geobiologist, she now writes about physics, geoscience, and space research in Europe and beyond from her home in Graz, Austria.

Modern thinking machines owe their existence to insights from the physics of complex materials.

Spin glasses might turn out to be the most useful useless things ever discovered.

These materials — which are typically made of metal, not glass — exhibit puzzling behaviors that captivated a small community of physicists in the mid-20th century. Spin glasses themselves turned out to have no imaginable material application, but the theories devised to explain their strangeness would ultimately spark today’s revolution in artificial intelligence.

In 1982, a condensed matter physicist named John Hopfield borrowed the physics of spin glasses to construct simple networks that could learn and recall memories. In doing so, he reinvigorated the study of neural networks — tangled nets of digital neurons that had been largely abandoned by artificial intelligence researchers — and brought physics into a new domain: the study of minds, both biological and mechanical.

Hopfield reimagined memory as a classic problem from statistical mechanics, the physics of collectives: Given some ensemble of parts, how will the whole evolve? For any simple physical system, including a spin glass, the answer comes from thermodynamics: “toward lower energy.” Hopfield found a way to exploit that simple property of collectives to store and recall data using networks of digital neurons. In essence, he found a way to place memories at the bottoms of energetic slopes. To recall a memory, a Hopfield network, as such neural nets came to be known, doesn’t have to look anything up. It simply has to roll downhill.

The Hopfield network was a “conceptual breakthrough,” said Marc Mézard, a theoretical physicist at Bocconi University in Milan. By borrowing from the physics of spin glasses, later researchers working on AI could “use all these tools that have been developed for the physics of these old systems.”

In 2024, Hopfield and his fellow AI pioneer Geoffrey Hinton received the Nobel Prize in Physics for their work on the statistical physics of neural networks. The prize came as a surprise to many; there was grumbling that it appeared to be a win for research in AI, not physics. But the physics of spin glasses didn’t stop being physics when it helped model memory and build thinking machines. And today, some researchers believe that the same physics Hopfield used to make machines that could remember could be used to help them imagine, and to design neural networks that we can actually understand.

Emergent Memory


The American physicist John Hopfield, pictured in 1988, developed a model of a neural network that laid the foundation for modern AI. Caltech Archives and Special Collections

Hopfield started his career in the 1960s working out the physics of semiconductors. But by the end of the decade, “I had run out of problems in condensed matter physics to which my particular talents seemed useful,” he wrote in a 2018 essay. So he went looking for something new. After a foray into biochemistry that produced a theory of how organisms “proofread” biochemical reactions, Hopfield settled on neuroscience.

“I was looking for a PROBLEM, not a problem,” he recalled in his essay, emphasizing the need to identify something truly important. “How mind emerges from brain is to me the deepest question posed by our humanity. Definitely a PROBLEM.”

Associative memory, Hopfield realized, was a part of that problem that his tool kit from condensed matter physics could solve.

In a normal computer, data is stored statically and accessed with an address. The address doesn’t have anything to do with the information that’s stored. It’s just an access code. So if you get the address even a little bit wrong, you’ll access the wrong data.

That’s not how humans seem to remember things. We often remember by association. Some cue or scrap of memory brings the full thing flooding back. It’s what happens when you smell lilacs and recall a childhood episode in your grandpa’s garden, or when you hear the first few lines of a song and find yourself belting out every word to a ballad you didn’t know you knew.

Hopfield spent years understanding associative memory and translating it to a neural network. He tinkered with randomly wired neural networks and other potential models of memory. It wasn’t looking good until, eventually, Hopfield identified an unlikely key to the “PROBLEM.”

Geoffrey Hinton (left) and John Hopfield accepted the 2024 Nobel Prize in Physics at a ceremony in Stockholm in December. The prize honored their pioneering work on the earliest neural network models, which were based on the physics of spin glasses. Wikimedia Commons

Spin Glasses

In the 1950s, scientists studying certain dilute alloys such as iron in gold realized that their samples were doing some strange things. Above a certain temperature, these alloys behave similarly to a normal material such as aluminum. They aren’t magnetic on their own, but they do interact weakly with external magnetic fields. For instance, you can use a very strong magnet to move an aluminum can, but aluminum itself can’t work as a magnet. Usually, materials such as aluminum lose their magnetization as soon as the external magnet disappears. But below a certain temperature, spin glasses do something different. Their transient magnetization sticks around, albeit at a lower value. (This isn’t the only weird thing that spin glasses do; their thermal properties are also puzzling.)

Around 1970, condensed matter physicists started to get a theoretical handle on these materials by tweaking physicists’ go-to model of collective magnetic behavior: the Ising model.

An Ising model looks like a simple grid of arrows, each of which can point up or down. Every arrow represents the intrinsic magnetic moment, or “spin,” of an atom. This is a simplification of a real atomic system, but by tweaking the rules by which nearby spins affect one another, the model can generate surprisingly complex behaviors.

In general, nearby arrows that point in the same direction have low energy, while arrows that point in opposite directions have high energy. If the spins are free to flip, the Ising model’s state will thus evolve towards a lower-energy state of alignment, like a ball rolling downhill. Magnetic materials such as iron end up settling into simple states with their spins aligned in either the all-up or all-down state.

In 1975, the physicists David Sherrington and Scott Kirkpatrick devised a model that could capture the more complicated behavior of spin glasses by modifying the rules of how spins interact. They randomly varied the interaction strengths between spin pairs and allowed each spin to interact with every other spin — not just its nearest neighbors. That change led to a rugged “landscape” of possible energy states. There were peaks and valleys corresponding to higher and lower energy configurations; depending on where the spin glass started off in this landscape, it would end up in a unique valley, or low-energy equilibrium state. That’s quite different from ferromagnets such as iron, which “freeze” into one of two orderly states with all spins aligned, and nonmagnets, whose spins fluctuate randomly and don’t settle down at all. In a spin glass, randomness gets frozen.
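
The contrast between a ferromagnet and a spin glass shows up directly in the Ising energy function. A minimal NumPy sketch (illustrative couplings, not a physical simulation): with uniform positive couplings the all-aligned state has the lowest energy, while random-sign couplings produce the rugged landscape described above.

```python
import numpy as np

def energy(spins: np.ndarray, J: np.ndarray) -> float:
    """Ising-type energy E = -1/2 * sum_ij J_ij s_i s_j, with spins in {+1, -1}."""
    return -0.5 * spins @ J @ spins

n = 4

# Ferromagnet: every pairwise coupling positive, no self-coupling.
J_ferro = np.ones((n, n)) - np.eye(n)
aligned = np.ones(n)                        # all spins up
mixed = np.array([1.0, -1.0, 1.0, -1.0])    # alternating spins
# The aligned state sits lower in the energy landscape than the mixed one.

# Spin glass (Sherrington-Kirkpatrick style): symmetric random-sign couplings.
rng = np.random.default_rng(0)
J_sk = rng.standard_normal((n, n))
J_sk = (J_sk + J_sk.T) / 2
np.fill_diagonal(J_sk, 0.0)
# With random-sign couplings no single aligned state is guaranteed optimal;
# competing interactions carve out many distinct local energy minima.
```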

The Ising model is very much a toy model. Using it to try to predict anything about real materials is a bit like using a stick figure to plan a surgery. But remarkably, it often works. The Ising model is now a workhorse of statistical mechanics. Variations on its theme can be heard in just about every corner of the study of complex, collective phenomena — including, because of Hopfield, memory.

Spin Memory

A simple view of interacting neurons has a lot in common with an Ising model of magnetic spins. For one thing, neurons are often modeled as basically binary on-off switches; they either fire or they don’t. Spins, likewise, can point either up or down. In addition, a firing neuron can either encourage or discourage the firing of its neighbor. These variable interaction strengths between neurons recall the changeable interaction strengths between spins in a spin glass. “Mathematically, one can replace what were the spins or atoms,”  said Lenka Zdeborová, a physicist and computer scientist at the Swiss Federal Institute of Technology Lausanne. “Other systems can be described using the same toolbox.”

To make his network, Hopfield started with a web of artificial neurons that can be either “on” (firing) or “off” (resting). Each neuron influences every other neuron’s state, and these interactions can be adjusted. The network’s state at any given time is defined by which neurons are firing and which are at rest. You can code these two states in binary: A firing neuron is labeled with a 1 and a resting neuron with a 0. Write out the state of the entire network at any given moment, and you’ve got a string of bits. The network doesn’t “store” information, exactly. It is information.

Lenka Zdeborová, a physicist and computer scientist at the Swiss Federal Institute of Technology Lausanne, studies how the physics of matter can help model the behavior of machine learning algorithms. Samuel Rubio for Quanta Magazine

To “teach” the network a pattern, Hopfield sculpted its energy landscape by modifying the strengths of interactions between neurons so that the desired pattern fell at a low-energy steady state. In such a state, the network stops evolving and stabilizes in just one pattern. He found a rule for doing this inspired by neuroscience’s classic “neurons that fire together wire together” rule. He would tune up interactions between neurons that both fire (or both rest) in the desired final state and dial down interactions between mismatched pairs. Once a network is taught a pattern this way, it can reach the pattern again simply by navigating downhill through the network’s energy landscape; it will naturally reach the pattern when it settles into an equilibrium state.

“Hopfield made the connection and said, ‘Look, if we can adapt, tune the exchange couplings in a spin glass, maybe we can shape the equilibrium points so that they can become memories,’” Mézard said.

Hopfield networks can remember multiple memories, each in its own little energy valley. Which valley the network falls into depends on where it begins in its energy landscape. In a network that stores a picture of a cat and a picture of a spaceship, for instance, a starting state that’s vaguely cat-shaped will roll down into the cat valley more often than not. Likewise, starting the network in a state that recalls the geometric forms of a spaceship will usually prompt it to evolve toward the spaceship. That’s what makes Hopfield networks a model of associative memory: Given a corrupted or incomplete version of a memory, a Hopfield network dynamically reconstructs the whole thing.
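
The train-and-recall scheme described above can be sketched in a few lines (a simplified illustration of the classic construction; the pattern values and network size here are made up):

```python
import numpy as np

def train(patterns: np.ndarray) -> np.ndarray:
    """Hebbian rule: strengthen couplings between units that agree in a stored pattern."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:           # patterns use +1 (firing) / -1 (resting)
        W += np.outer(p, p)      # units that fire together wire together
    np.fill_diagonal(W, 0.0)     # no self-coupling
    return W / n

def recall(W: np.ndarray, state: np.ndarray, steps: int = 5) -> np.ndarray:
    """Roll downhill: repeatedly set each unit to the sign of its weighted input."""
    s = state.copy()
    for _ in range(steps):
        for i in range(len(s)):
            s[i] = 1.0 if W[i] @ s >= 0 else -1.0
    return s

# Store one 8-unit pattern, then recall it from a corrupted cue.
cat = np.array([1, 1, -1, -1, 1, -1, 1, -1], dtype=float)
W = train(cat[None, :])
cue = cat.copy()
cue[0] = -cue[0]            # flip one unit: a "corrupted memory"
restored = recall(W, cue)   # the network settles back into the stored valley
```

With several stored patterns, each occupies its own energy valley, and the cue's starting position decides which valley the dynamics roll into.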

Old Model, New Ideas

From 1983 to 1985, Hinton and his colleagues built on Hopfield’s work. They found ways to inject randomness into Hopfield networks to create a new type of neural network called a Boltzmann machine. Rather than remember, these networks learn the statistical patterns in training data and spin up new data to match those patterns — an early kind of generative AI. In the 2000s, Hinton was able to use a pared-down version of the Boltzmann machine to finally crack the stubborn problem of training “deep” neural networks consisting of multiple layers of neurons.

By 2012, the success of deep neural networks developed by Hinton and other pioneers was impossible to ignore. “It became clear that this is actually working amazingly well and just transforming the whole tech industry,” Zdeborová said. The generative AI models many of us now interact with every day, including large language models such as ChatGPT and image-generation models such as Midjourney, are all deep neural networks. They can trace their success back to curious physicists in the 1970s who refused to let the “useless” properties of spin glasses go unexplained.

Hopfield networks aren’t just part of AI’s past, however. Thanks to new ideas, these old models could be making a comeback.

In 2016, Hopfield and Dmitry Krotov of IBM Research realized that Hopfield networks weren’t just one model, but a whole family of models with different memory storage capacities. Then, in 2020, another team showed that a key part of the transformer architecture, the blueprint of most modern successful AI models, was a member of that extended Hopfield network family.

Armed with that insight, Krotov and his colleagues recently developed a new deep learning architecture called the energy transformer. Typical AI architectures are usually found by trial and error. But Krotov thinks energy transformers could be designed more intentionally with a specific energy landscape in mind, like a more complex take on a Hopfield network.

Though Hopfield networks were originally designed to remember, researchers are now exploring how they can be used to create. Image generators such as Midjourney are powered by “diffusion models,” which are themselves inspired by the physics of diffusion. To train them, researchers add noise to the training data — say, pictures of cats — and then teach the model to remove the noise. That’s a lot like what a Hopfield network does, except instead of always landing on the same cat picture, a diffusion model removes “non-cat” noise from a noisy, random starting state to produce a new cat.

Dmitry Krotov, a computer scientist at IBM Research, has shown that some of the most advanced AI models in use today follow the same basic principle that Hopfield networks employed from the start. Kim Martineau

It turns out that diffusion models can be understood as a particular kind of modern Hopfield network, according to Krotov and his colleagues, including Benjamin Hoover, Yuchen Liang and Bao Pham. And that approach can be used to predict aspects of these networks’ behavior. Their work suggests that feeding a modern Hopfield network more and more data doesn’t just saturate its memory. Instead, the model’s energy landscape gets so rugged that it is more likely to settle on a made-up memory than a real one. It becomes a diffusion model.

That a simple change in quantity — in this case, the amount of training data — can trigger an unexpected change in quality isn’t anything new for physicists. As the condensed matter physicist Philip Anderson wrote back in 1972, “more is different.” In collective systems, simply scaling up networks of interactions between parts can add up to surprising new behaviors. “The fact that [a neural network] works is an emergent property,” Mézard said.

Emergence in a deep learning architecture — or a brain — is as captivating as it is puzzling; there’s no universal theory of emergence. Perhaps statistical physics, which provided the first tools for understanding collective behavior, will be the key not just to using but also to understanding the inscrutable machine intelligences changing our world.