Predictive Coding

Mikes Notes

Alex Shkotin kindly shared two links to websites with me last night.

The first is https://www.verses.ai/research-development-roadmap, which looks close to CAS.

The second is an interesting discussion on the Ontolog Forum: https://groups.google.com/g/ontolog-forum/c/SWoioGgyx3g/m/Ut2DOEcXAQAJ

From the Verses AI website

"Biological agents are efficient, curious, self-organizing systems that anticipate the effects of their actions on the world while smoothly coping with noise and uncertainty. The human brain, while an essential source of inspiration for AI, is only one manifestation of such capacities, which characterize intelligence in nature at many scales. 

The hypothesis guiding research and development at VERSES AI is that artificial general intelligence (AGI) can be attained by discovering the deeper principles underlying biological intelligence and deploying them as design principles to construct cyber-physical ecosystems of intelligent agents in which humans are integral participants — what we call “shared intelligence”.

We originally laid out this vision for the present and future of AI in our white paper at the end of 2022. ..."

Ontolog Forum

The discussion on the Ontolog Forum was started by John Sowa and concerns Dr. Karl Friston, Chief Scientist at Verses AI.

"After a bit of searching, I found more info about Verses AI and their new chief scientist.  I like the approach they're taking:  putting more emphasis on natural thinking process in neuroscience.  And their new chief scientist has publications that would lead them in that direction.  The ideas look good, and I would recommend them.  But I don't know how far he and his colleagues have gone in implementing them, or how long it will take for anything along those lines to be running in a practical system.

However, it's unlikely that any company would hire somebody as chief scientist without a considerable amount of prior work.  And I doubt that any company would make an announcement in a full-page ad in the New York Times unless they already had some kind of prototype. ..."

Mikes Notes

The approach by Verses is broadly correct.

This is a fascinating article from Verses: https://www.verses.ai/blogs/executive-summary-designing-ecosystems-of-intelligence-from-first-principles

It's the same approach I used to successfully build Pipi 9. The main difference is that Pipi is software running on a server, containing hundreds of interacting agents that give rise to emergent and adaptive properties. Pipi also sets out to provide a Complex Adaptive System (CAS) as a SaaS platform to host SaaS CAS applications.

I did this through curiosity-led experimentation inspired by the computer modelling of biological cells, whereas the approach by Verses is led by research-based theory of the brain. So the research reading below offers a possible insight into why my experiments worked. I need to do some reading to understand the theory better.

From the forum

A recent book (2022) from MIT Press with a foreword by Friston covers the field: "Active Inference: The Free Energy Principle in Mind, Brain, and Behavior." Chapters of that book can be downloaded for free. Appendix C has an annotated example of the MATLAB code.

From Wikipedia

Karl John Friston FRS FMedSci FRSB (born 12 July 1959) is a British neuroscientist and theoretician at University College London. He is an authority on brain imaging and theoretical neuroscience, especially the use of physics-inspired statistical methods to model neuroimaging data and other random dynamical systems. 

Friston is a key architect of the free energy principle and active inference. In imaging neuroscience, he is best known for statistical parametric mapping and dynamic causal modelling.

In October 2022, he joined VERSES Inc, a California-based cognitive computing company focusing on artificial intelligence designed using the principles of active inference, as Chief Scientist.



Friston is one of the most highly cited living scientists and in 2016 was ranked No. 1 by Semantic Scholar in the list of top 10 most influential neuroscientists.

In the discussion, Dan Brickley of W3C, Dublin Core, Schema.org and Google Research shared this GitHub repository, maintained by Beren Millidge, a Postdoctoral Researcher in Machine Learning and Computational Neuroscience at the University of Oxford whose work focuses on unravelling intelligence in both brains and machines.

Predictive Coding Paper Repository

This repository provides a list of interesting or influential papers on Predictive Coding. If you believe I have missed any papers, please contact me at beren@millidge.name or make a pull request with the information about the paper, and I will be happy to include it.

Predictive Coding

Predictive Coding is a neurophysiologically-grounded theory of perception and learning in the brain. The core idea is that the brain always maintains a prediction of the expected state of the world, and that this prediction is then compared against the true sensory data. Where this prediction is wrong, prediction errors are generated and propagated throughout the brain. The brain's 'task' then is simply to minimize prediction errors.

The key distinction of this theory is that it proposes that prediction errors, rather than predictions or direct representations of sense-data, are in some sense the core computational primitive in the brain.
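
To make the core loop concrete, here is a minimal sketch in Python (my own illustration, not code from the repository), using the scalar Gaussian example popularised by Bogacz's tutorial mentioned below: a latent cause v generates a datum u through g(v) = v^2, and the brain's estimate phi is nudged until the precision-weighted prediction errors balance. All parameter values are illustrative assumptions.

# Illustrative generative model: a latent cause v produces data u = g(v) + noise.
g  = lambda v: v ** 2          # prediction of the datum given the latent state
dg = lambda v: 2 * v           # derivative of g, used to pass errors back

v_p, sigma_p = 3.0, 1.0        # prior mean and variance for v (made-up values)
sigma_u = 1.0                  # assumed sensory noise variance
u = 2.0                        # the observed sensory datum

phi = v_p                      # current best guess of the hidden state
for _ in range(500):
    eps_p = (phi - v_p) / sigma_p     # prediction error against the prior
    eps_u = (u - g(phi)) / sigma_u    # prediction error against the senses
    phi += 0.01 * (eps_u * dg(phi) - eps_p)   # gradient descent on the errors

print(phi)  # settles near 1.6, a compromise between the prior and the datum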

Predictive coding originated in studies of ganglion cells in the retina, in light of theories in signal processing about how it is much more efficient to send only 'different' or 'unpredicted' signals than to repeat the whole signal every time -- see delta-encoding.
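
The delta-encoding intuition is easy to see in a toy Python sketch (the signal values here are made up):

# Encode: keep the first sample, then transmit only the change at each step.
signal = [10, 10, 10, 12, 12, 11, 11, 11]
deltas = [signal[0]] + [b - a for a, b in zip(signal, signal[1:])]
# deltas == [10, 0, 0, 2, 0, -1, 0, 0] -- mostly zeros, so cheap to send.

# Decode: a running sum reconstructs the original signal exactly.
decoded, total = [], 0
for d in deltas:
    total += d
    decoded.append(total)
assert decoded == signal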

Predictive coding has several proposed neurobiologically plausible process theories -- see the 'Neurobiological Process Theories' section below -- although the empirical evidence for precise prediction error minimization in the brain is mixed.

Predictive coding has also been extended in several ways. It can be understood as a variational inference algorithm under a Gaussian generative model and variational distribution. It can be set up as an autoencoder (predicting its own input, or the next state), or else in a supervised learning fashion.
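
Concretely, in the simplest scalar case (my notation, matching the sketch above), a Gaussian prior N(v_p, \sigma_p^2) on the latent v and a Gaussian likelihood N(g(v), \sigma_u^2) for the datum u give a free energy that is, up to constants, a sum of squared precision-weighted prediction errors:

F(v) = \frac{(v - v_p)^2}{2\sigma_p^2} + \frac{(u - g(v))^2}{2\sigma_u^2} + \mathrm{const}

so that gradient descent on F is exactly the error-minimising dynamics:

\frac{dF}{dv} = \varepsilon_p - g'(v)\,\varepsilon_u, \quad \text{with } \varepsilon_p = \frac{v - v_p}{\sigma_p^2}, \; \varepsilon_u = \frac{u - g(v)}{\sigma_u^2}.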

Predictive coding can also be extended to a hierarchical model of multiple predictive coding layers -- as in the brain -- as well as to 'generalised coordinates', which explicitly model the higher-order derivatives of a state in order to be able to model dynamical systems explicitly.
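
As a rough illustration of the hierarchical case, here is a sketch (again my own, with arbitrary random weights and made-up layer sizes; it omits learning and generalised coordinates) in which each layer's activity is predicted top-down by the layer above and only prediction errors drive the updates:

import numpy as np

rng = np.random.default_rng(0)
f  = np.tanh
fp = lambda x: 1 - np.tanh(x) ** 2

sizes = [8, 6, 4]               # layer 0 holds the data; layers 1 and 2 are latent
W = [rng.normal(scale=0.3, size=(sizes[i], sizes[i + 1])) for i in range(2)]
x = [rng.normal(size=s) for s in sizes]   # x[0] stays clamped to the 'sensory' data

for _ in range(300):
    # e[i] is layer i's error against the top-down prediction from layer i+1.
    e = [x[i] - W[i] @ f(x[i + 1]) for i in range(2)]
    # Latent layers descend the summed squared error (layer 0 stays clamped).
    x[1] += 0.05 * (-e[1] + fp(x[1]) * (W[0].T @ e[0]))
    x[2] += 0.05 * (fp(x[2]) * (W[1].T @ e[1]))

print([float(np.linalg.norm(err)) for err in e])  # error norms shrink as the hierarchy settles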

More recent work has also focused on the relationship between predictive coding and the backpropagation of error algorithm in machine learning: under certain assumptions, predictive coding can approximate this fundamental algorithm in a biologically plausible fashion, although the exact details and conditions still need to be worked out.
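
The flavour of that correspondence can be checked numerically. The sketch below (my own construction, using the commonly studied supervised setup with made-up dimensions) clamps the output of a two-layer network to a target near its forward-pass prediction, relaxes the hidden layer to equilibrium, and compares the resulting local Hebbian-style weight updates with backprop's gradients; in this small-error regime the two align almost perfectly:

import numpy as np

rng = np.random.default_rng(1)
f  = np.tanh
fp = lambda a: 1 - np.tanh(a) ** 2

W1 = rng.normal(scale=0.5, size=(5, 4))
W2 = rng.normal(scale=0.5, size=(3, 5))
x0 = rng.normal(size=4)

# Forward pass of the equivalent feedforward network.
a1 = W1 @ f(x0)
a2 = W2 @ f(a1)
t  = a2 + 0.01 * rng.normal(size=3)   # target close to the prediction (small-error regime)

# Backprop gradients of the loss L = 0.5 * ||t - a2||^2.
d2  = a2 - t
gW2 = np.outer(d2, f(a1))
d1  = fp(a1) * (W2.T @ d2)
gW1 = np.outer(d1, f(x0))

# Predictive coding: clamp the output layer to t and relax the hidden layer.
x1 = a1.copy()
for _ in range(500):
    e1 = x1 - W1 @ f(x0)
    e2 = t - W2 @ f(x1)
    x1 += 0.05 * (-e1 + fp(x1) * (W2.T @ e2))

e1 = x1 - W1 @ f(x0)
e2 = t - W2 @ f(x1)
pcW2 = np.outer(e2, f(x1))            # local, Hebbian-style weight updates
pcW1 = np.outer(e1, f(x0))

cos = lambda A, B: (A * B).sum() / (np.linalg.norm(A) * np.linalg.norm(B))
print(cos(pcW2, -gW2), cos(pcW1, -gW1))   # both close to 1: the PC updates track backprop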

There has also been much exciting work trying to merge predictive coding with machine learning to produce highly performant predictive-coding-inspired architectures.

Surveys and Tutorials

This is a great review which introduces the basics of predictive coding and its interpretation as variational inference. It also contains sample MATLAB code that implements a simple predictive coding network. I would start here.

This review walks through the mathematical framework and potential neural implementations in predictive coding, and also covers much recent work on the relationship between predictive coding and machine learning.

This is a fantastic review which presents a complete walkthrough of the mathematical basis of the Free Energy Principle and variational inference, and derives predictive coding and (continuous-time and continuous-state) active inference. It also presents the 'full-construct' predictive coding, including hierarchical layers and generalised coordinates, in an accessible fashion. I would recommend reading this after Bogacz's tutorial (although be prepared -- it is a long and serious read).

A short and concise review of predictive coding algorithms up to 2017.

A nice review of simple predictive coding architectures with a focus on their potential implementation in the brain.

Classics

A key influential early paper proposing predictive coding as a general theory of cortical function.

One of the earliest works proposing predictive coding in the retina.

An early but complete description of predictive coding as an application of the FEP and variational inference under Gaussian and Laplace assumptions. Also surprisingly readable. This is core reading on predictive coding and the FEP.

The first paper establishing the links between predictive coding and variational inference.

Makes a conjectured link between precision in predictive coding and attention in the brain.

Presents the 'full-construct' predictive coding model with both hierarchies and generalised coordinates.

Extends predictive coding to generalised coordinates, and derives the necessary inference algorithms for working with them -- i.e. DEM, dynamic expectation maximisation.

Foundational treatment of variational inference for dynamical systems, as represented in generalised coordinates. Also relates variational filtering to other non-variational schemes like particle filtering and Kalman filtering.

Andy's book is great for a high-level overview, strong intuition pumps for understanding the theory, and a fantastic review of potential evidence and neuropsychiatric applications.

Neurobiological Process Theories

A key process theory paper, proposing perhaps the default implementation of predictive coding in cortical layers.

Demonstrates that predictive coding is equivalent to popular biased competition models of neural function.

A process theory of predictive coding including action predictions which implement active inference (continuous version).

A great review delving deep into the evidence for predictive coding being implemented in the brain. Evidence is currently somewhat lacking, although the flexibility of the predictive coding framework allows it to encompass a lot of the findings here.

Neuroscience applications

Relationship to Backpropagation

PC-inspired machine learning

Extensions and Developments

This paper investigates how several biologically implausible aspects of the standard predictive coding algorithm -- namely the requirements for symmetric forward and backward weights, nonlinear derivatives, and one-to-one error unit connections -- can be relaxed without unduly harming the performance of the network.

This paper looks further at how various implausibilities of the predictive coding algorithm can be relaxed, focusing especially on the question of how negative prediction errors could be represented, and introduces a divisive prediction error scheme, in which prediction errors are the activities divided by the predictions.
