How RNNs Work (And Why Everyone Stopped Using Them)

Mike's Notes

A great explanation of how RNNs work.

Resources

References

  • Attention is All You Need by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. 2017. ArXiv 1706.03762.

Repository

  • Home > Ajabbi Research > Library > Subscriptions > Into AI
  • Home > Handbook > 

Last Updated

28/03/2026

How RNNs Work (And Why Everyone Stopped Using Them)

By: Dr. Ashish Bamania and Jose Parreño Garcia
Into AI: 07/02/2026

Dr. Ashish Bamania: I help you to level up in AI and Quantum Computing
Jose Parreño Garcia: I write about Data Science, Machine Learning and leading data teams. I have built teams from scratch and led 50+ data scientists @Skyscanner. Now, I share my experience with you.

A gentle walkthrough of how Recurrent Neural Networks (RNNs) work, and the math that breaks them.

This week’s newsletter is written by Jose Parreño Garcia. He is a senior Data Science manager at Skyscanner.

He regularly shares insights on building effective teams, developing leadership skills, and advancing careers in Data Science and Machine Learning through his newsletter, Senior Data Science Lead.

You can also find him and stay up to date with his content on LinkedIn.

This week, I went all the way back to 2017. That’s when the now-legendary ‘Attention is All You Need’ paper came out — the one that introduced the world to Transformers, and set the foundation for everything from ChatGPT to image generation to code-writing copilots.

And sure, I could jump straight into explaining how Transformers work. But given the impact these models have had — and the fact that you probably see the word “attention” 30 times a week now — I thought it would be worth taking a step back (actually 2 steps back).

Before we can truly understand Transformers, we need to understand where they came from. And that means revisiting the architectures that paved the way: Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks.

In this blog post, I am diving into RNNs.

We will walk through what they are, how they work, and most importantly, why they struggle. By the end, RNNs will feel like a clever little for-loop with memory instead of scary maths magic.

Ready? Let’s jump in!

What we will cover in this article

  1. How is an RNN different from a classical Deep Neural Network? (or other classical sequence models)
  2. Introducing a made-up use case with 3 data points for Stock price prediction
  3. The scary official diagram of an RNN cell. (Don’t worry, we will break it down super easily.)
  4. A walkthrough of RNN calculations (And you will see how the maths is not that scary after all.)
  5. Three problems associated with RNNs

How is an RNN different from a classical Deep Neural Network?

I assume in this post that you have worked with (or are familiar with) the basics of classical Deep Neural Networks (from here on, DNNs).

If my assumption is correct, then the diagram below should feel really familiar. It is a diagram of a DNN with:

  • X: A set of input nodes, representing the variables you want to use for prediction.
  • H1, H2: 2 hidden layers with 4 nodes each. This is where the parameters live that learn to dial specific signals from X up or down. Basically, the knobs the model adjusts as it learns.
  • Y: The prediction node. In this case, it’s only 1 for simplicity.

Diagram of a classical Deep neural network (DNN)

Now, there are 2 main things to highlight in this diagram:

  1. The DNN processes all the data at once.
    You see the X input? From a DNN’s perspective, it’s a torrent of data, all pushed and processed at once. There is nowhere in this diagram that the network can say, “Hey, can you just send me X1 and X2 first, and then I can process X3 and X4?”.
  2. The DNN is feed-forward (or sequential).
    In other words, the data flows from left to right (from X → Y). There is nowhere in this diagram that a node can stop the data flow and ask: “Hey, what data did I have in the previous step?” It is oblivious to that.

DNNs are really powerful, but they are also “memory-less”

DNNs are “memory-less” because of the two points mentioned above. And being “memory-less” means that DNNs really struggle to predict when sequence or order matters.

Take a simple stock price prediction scenario. The only way that DNNs can consider what happened yesterday or the day before is if you tell them what matters. This is usually done by manually creating features like:

  • Yesterday’s price
  • A moving average over the past 7 days
  • A sine-transformed day-of-week feature to capture seasonality
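As a sketch, here is what building those features by hand looks like in plain Python (the price values and the Monday start day are made-up assumptions, not from the article):

```python
import math

prices = [10.0, 10.5, 10.2, 10.8, 11.0, 10.9, 11.2, 11.5]  # toy daily closes (made up)

# Yesterday's price for each day (None when there is no yesterday)
yesterday = [None] + prices[:-1]

# 7-day moving average (None until a full window exists)
ma_7 = [None] * 6 + [sum(prices[i - 6:i + 1]) / 7 for i in range(6, len(prices))]

# Sine-transformed day-of-week (0 = Monday ... 6 = Sunday) to capture weekly seasonality
day_of_week = [i % 7 for i in range(len(prices))]
dow_sin = [math.sin(2 * math.pi * d / 7) for d in day_of_week]
```

Every one of these columns has to be designed by a human, which is exactly the burden the next sections are about removing.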

But wasn’t deep learning supposed to eliminate feature engineering?

Yes (and no).

Neural networks do learn internal representations (i.e., “features”) from raw data. But when it comes to sequences, classical deep neural nets still need all the help they can get (so, kind of back again to square one, where we have to feature engineer stuff…)

This is where Recurrent Neural Networks (RNNs) come in.

Recurrent neural networks (RNNs) are a class of artificial neural networks designed for processing sequential data, such as text, speech, and time series, where the order of elements is important.

Their two main characteristics are:

  1. RNNs process one input at a time.
    Instead of taking all of the input X in one big gulp, RNNs look at one data point at a time — like reading a sentence word by word, or stepping through a stock price day by day. This allows them to focus on how each input evolves over time.
  2. RNNs are recursive.
    Yes, the information flows left to right, but at each step, it can also look at what happened before (kind of a right-to-left motion). It’s like a left-to-right with memory.

Don’t worry if this feels dense right now, we will break it down step by step.

By the end of this post, you will not only understand what “recurrent connections” mean, but you will also see why RNNs became a foundational architecture for handling sequences.
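In code, that “left-to-right with memory” idea is just a loop that carries a hidden state forward. Here is a minimal conceptual sketch (the function names and the toy update rule are mine, not from the post; the real update rule comes later):

```python
def run_rnn(xs, step):
    """Process a sequence one element at a time, carrying memory forward.

    xs:   the input sequence (one data point per timestep)
    step: a function (x_t, h_prev) -> h_t that updates the memory
    """
    h = 0.0  # initial memory: nothing has been seen yet
    for x_t in xs:        # one input at a time, like reading word by word
        h = step(x_t, h)  # new memory depends on today's input AND yesterday's memory
    return h              # final hidden state summarises the whole sequence

# Example with an arbitrary toy update rule (illustrative only):
final_h = run_rnn([1.0, 2.0, 3.0], step=lambda x, h: 0.5 * h + x)
```

That loop is the entire “recurrent” trick; everything that follows just fills in what `step` actually computes.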

Introducing a made-up use case for Stock price prediction

Ok, before we get into the RNN section, let me introduce you to the simplest stock price prediction exercise in the course of human history.

The diagram below shows a toy time series where:

  • For simplicity, all stock prices are set to 0.
  • We have 3 data points.
  • The idea is to use yesterday’s and today’s stock prices to predict tomorrow’s stock price.

I want to introduce this in such a simple way because:

  1. I want to actually show you the maths of the RNN using super simple numbers.
  2. Labelling each step as yesterday, today, and tomorrow will help anchor the RNN diagrams that follow.

Let’s keep this mental model in our back pocket — it’s going to make the scary-looking RNN diagrams feel a lot less scary.

Ok, now we are ready to get scared by a diagram of an RNN cell!

The scary official diagram of an RNN cell

So far, we have discussed how RNNs differ from classical DNNs because they remember their past. But what does that actually look like inside the model?

Well... time to face the infamous RNN diagram.

It might look like a tangle of wires and equations at first, but don’t worry, we will walk through it slowly, tie it back to the stock price example, and by the end, you will see it’s just a simple process of multiplication, addition, and a squiggly activation function or two.

The diagram below is a vanilla RNN cell.

I can sense you sweating… a flow diagram? With parameters? With maths operations? Ok, let’s break it up so that you don’t have to process it all at once.

Here are the elements to focus on:

  1. Note the X, h and Y elements (similar to the classical DNN diagram).
  2. X(t) represents today's data point, h(t-1) is the memory carried over from yesterday, and h(t) is the memory passed forward to the next step.
  3. Y(t+1) represents the prediction that we want.
  4. tanh and softmax are just activation functions. The same kind you have seen in regular neural networks, so nothing special for RNNs. They take raw values and squash them into a friendlier range, like between -1 and 1 (for tanh) or 0 and 1 (for softmax).
  5. Finally, there are a couple of blobs with mult and sum. These are just visual aids so that you can see the operations in action when we pull the numbers in.
  6. There are maths operations outside of the cell. These are basically there just so that we can transform a squashed value coming out of tanh, into a real value that makes sense. For example, transform 0.9 coming out of the tanh function to maybe $5.

✍️ Quick note

Technically, the RNN makes a prediction y(t) after seeing input x(t) and memory h(t-1).

But since in our toy example we are trying to predict the next value, it’s tempting to call it y(t+1) — just know that it’s really y(t) in the math, but the target we are aiming for is the value at t+1.

I purposely named it y(t+1) for pedagogical reasons for this post.

Let’s map this diagram with the theoretical RNN math function

In the image below, I have added the math functions that lead to the two outputs from the RNN cell: y(t+1) and h(t).

So, if we wanted to either predict an output (Y) or carry to the next stage (h), then the neural network should learn:

  • Wx: This is the weight applied to today’s data, X(t). It controls how much the model should care about today’s stock price. Extreme case, if Wx = 0, then this means we don’t care about X(t) because Wx * X(t) would yield 0.
  • Wh: This is the weight applied to the previous hidden state, h(t–1). It tells the model how much to rely on memory. If Wh = 0, the past is forgotten.
    ⚠️ Don’t misinterpret h(t-1) by thinking it is yesterday’s stock price. It is what comes out of the cell (with its multiplications, sums, and activation functions) applied to yesterday’s stock price.

  • b: This is the bias term. Think of it as a small correction applied regardless of the input. It’s important in training, but not very interesting for understanding how RNNs work conceptually (as it affects DNNs the same way). If you want to deep dive, check this link.
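As a sketch, the two functions described above can be written out directly for a single scalar input (the function names are mine; a linear output layer stands in for the diagram's softmax, which applies when the output is a probability distribution rather than a price):

```python
import math

def rnn_cell(x_t, h_prev, Wx, Wh, b):
    # New memory: weigh today's input, weigh yesterday's memory, add bias, squash
    return math.tanh(Wx * x_t + Wh * h_prev + b)

def output(h_t, W_out, b_out):
    # Transform the squashed hidden state back into a real-valued prediction
    return W_out * h_t + b_out

# If Wx = 0, today's input is ignored; if Wh = 0, the past is forgotten:
h = rnn_cell(x_t=1.0, h_prev=0.8, Wx=0.0, Wh=0.5, b=0.0)  # only memory contributes
```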

I believe that only when we plug in numbers to these diagrams will we start really understanding what is happening inside the RNN. Let’s do this next.

A Walkthrough With Real Numbers

Alright, time to take what we have learnt and run it step by step. Instead of just showing the internals of a single RNN cell, we’ll now “unroll” it (you will see what that is in a second).

This will finally answer the big question: how does an RNN actually use the past to predict the future?

How do we represent tomorrow’s prediction diagrammatically?

Pretty simple. We just copy and paste the same RNN cell forward in time, one per data point in our sequence.

In our toy stock price example, we only have two known data points (t–1 and t; the third, at t+1, is what we are predicting), so we unroll the cell twice in order to make a prediction at t+1.

And this is why they are called recurrent, because the same logic is applied over and over, like a for loop.

Cool, now that you are comfortable with what happens inside a single RNN cell, let’s walk through this unrolled diagram in detail to ensure we are all on the same page.

  1. We begin by plugging in yesterday’s stock price. Because there is no data prior to yesterday, we can ignore the previous hidden state input h(t-2).
  2. Using both math functions, we then calculate y(t) and h(t-1). From these two, only h(t-1) is useful for us. This is the value that describes the memory, and that will be passed to the next cell. y(t) is irrelevant, so we ignore it.
  3. Finally, we plug in today’s data and run through the relevant maths operations to calculate y(t+1). You can see from the diagram that the RNN is using today’s data X(t) and yesterday’s data h(t-1) from memory to calculate what could happen tomorrow.
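The three steps above can be sketched as an unrolled loop. Note that the very same Wx, Wh and b are reused in every cell (all weight values here are made up for illustration):

```python
import math

def predict_next(xs, Wx, Wh, b, W_out, b_out):
    """Unroll one RNN cell per known data point and predict the next value."""
    h = 0.0  # step 1: there is no data before the first point, so memory starts empty
    for x_t in xs:  # step 2: pass memory cell-to-cell; intermediate y's are ignored
        h = math.tanh(Wx * x_t + Wh * h + b)
    # step 3: only the final cell's output matters
    return W_out * h + b_out

# Yesterday's and today's prices in, tomorrow's prediction out:
y_next = predict_next([0.0, 0.0], Wx=0.7, Wh=0.3, b=0.0, W_out=1.0, b_out=0.0)
```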

💡 An important highlight: Shared weights and biases

“Wait a sec... you are using the same Wx, Wh, and b in both cells. Shouldn’t they be different?”

Great question! This is the part that makes RNNs elegant, but also tricky (you will see at the end how these shared weights break an RNN’s learning process).

Unlike a feedforward network, where each layer has its own weights, an RNN cell reuses the same weights at every timestep. That’s the “recurrent” part. An RNN not only repeats the cell structure, but it also repeats the exact same function with the same learned parameters.

So yes — Wx, Wh, and b are constant across time. What changes is the input x(t) and the memory from the previous step h(t–1), which is how the model updates its thinking as it moves forward.

Plugging numbers into the diagram

Before doing some basic maths, let’s talk about the numbers being used:

  1. Note how I substituted Wh, Wx, b, W_output and b_output with numbers. I made these numbers up, but they are the ones the neural network would tweak during its learning process.
  2. The input data points are the ones we know from the time series. X(t-1) and X(t) are both 0.

Now we are ready to take pen and paper and perform all the calculations in this diagram.

Plugging in all the numbers, y(t+1) comes out to be 0. Nice, this is what we expected from our mock stock price time series!
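A quick pen-and-paper check in code. Because both known prices are 0, the whole chain collapses whatever the weights are, as long as the bias is 0 (I am assuming the made-up biases in the diagram are 0; tanh(0) = 0 at every step, so y(t+1) = 0):

```python
import math

Wx, Wh, b = 0.5, 0.9, 0.0        # arbitrary made-up weights; biases assumed to be 0
W_out, b_out = 2.0, 0.0

h = math.tanh(Wx * 0.0 + Wh * 0.0 + b)  # yesterday: x(t-1) = 0  ->  h(t-1) = 0
h = math.tanh(Wx * 0.0 + Wh * h + b)    # today:     x(t)   = 0  ->  h(t)   = 0
y_next = W_out * h + b_out              # tomorrow's prediction
```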

What would the diagram look like if we had 50 data points?

Well... you would copy the RNN cell forward 49 times, just like a for loop with 50 iterations (luckily for me, I am not drawing that diagram...)

But structurally, nothing changes. You still:

  • Reuse the same weights
  • Pass memory from each time step to the next
  • And only apply the final output prediction where it matters, which is usually the last cell in the sequence

What we learn from the above examples is that:

  • RNNs reuse the same weights and biases at each step.
  • Hidden states h(t) are like memory, passed forward through time.
  • Outputs y(t) are generated by applying a linear transformation and softmax on the hidden state.
  • Even a dumb toy dataset of all zeros reveals the internal mechanics beautifully.

That wasn’t that hard, right? Well, I have bad news for you…

Three Major Problems With RNNs

These vanilla RNNs are rarely, if ever, used for real-world use cases because they come with three big problems:

  • They are slow to train.
  • They suffer from the problem of vanishing gradients.
  • They suffer from the problem of exploding gradients.

Let’s cover these in detail in their own sections.

Problem 1: Training a vanilla RNN is very slow

Deep Neural Networks (DNNs) process all their inputs at once in a single forward pass. Everything flows from left to right, layer by layer. That means training can occur in parallel across many data points and GPU cores.

This is not the case with RNNs.

Because RNNs depend on previous hidden states, they are inherently sequential. You can’t calculate h(t) until you have calculated h(t–1).

It’s like reading a book: you can’t understand chapter 5 until you have read chapter 4.

RNNs make you walk through time, one step at a time, and this sequential dependency kills parallelism.

That is the first problem: RNNs are powerful, but they pay for it in training speed.

Problem 2: The problem of the Vanishing gradients

To explain this problem, I would have to take some mathematical shortcuts. To really know what is happening under the hood, you need to be familiar with the chain rule used in backpropagation. But showing the full impact of vanishing gradients using the backpropagation formula would be overkill.

Let’s zoom in on just one parameter in the RNN: Wh, the weight that multiplies the previous hidden state.

In our earlier toy example, Wh was used once (just one multiplication). But if you are training an RNN over 50 timesteps (say, 50 days of stock data), Wh shows up 49 times in the full chain of calculations.

When the model tries to update Wh via backpropagation, the gradient ends up multiplying Wh by itself over and over again: for 50 timesteps, roughly a factor of Wh^49.

Uh-oh. What happens when Wh < 1?

Try plugging in Wh = 0.5: 0.5^49 ≈ 1.8 × 10^-15.

That’s basically zero. This means that the update to Wh during training is so tiny, it’s like the model is frozen. It can’t escape its starting point. It just sits there, unable to learn anything useful.

Problem 3: The problem of the Exploding gradients

This is the opposite problem. If vanishing gradients are the slow death of learning, exploding gradients are the chaotic opposite.

What happens when Wh > 1? Say Wh = 1.5: 1.5^49 ≈ 4.3 × 10^8.

That’s nearly half a billion. This means that during backpropagation, the gradient becomes massive. And with a gradient that large, your weight update becomes a wild jump.

The result is that your model overshoots the loss minimum, bounces around the optimisation landscape like a drunk pinball (love that game), and probably never converges.

This is the exploding gradients problem. Same root cause as vanishing gradients — compounding multiplications through time — but now the problem is too much signal, instead of too little.
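Both failure modes are easy to verify numerically: the same factor repeated 49 times either collapses to nothing or blows up, depending on which side of 1 it sits on:

```python
# A gradient factor of Wh repeated across 49 timesteps:
vanishing = 0.5 ** 49   # roughly 1.8e-15: the weight update is effectively zero
exploding = 1.5 ** 49   # roughly 4.3e8: the weight update becomes a wild jump
```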

What’s next: How LSTMs fixed all of this (mostly)

With everything we have covered, you might think, “Hey, that vanilla RNN was easy enough to understand, but with the stated problems, it also looks pretty useless, right?”

I would mostly agree. Training a vanilla RNN is not impossible, but it requires skill, feature engineering, fine-tuning, and time. The vanishing and exploding gradient problem is the one that mostly holds back an RNN.

This is why LSTMs were introduced.

LSTMs are improved versions of RNNs. With built-in mechanisms (called Gates) to decide what to keep, what to forget, and what to pass forward, they were designed specifically to beat vanishing and exploding gradients at their own game.

LSTMs are for the next post, where we will explore how they work and why they became the go-to tool for sequence modelling… at least until Transformers came along.

Now, I want to hear from you!

In this post, we broke down how RNNs work, from the vanilla cell structure to why they struggle to train in the real world.

We kept it simple, even used an all-zero dataset, and uncovered the quiet math that makes RNNs nearly impossible to scale.

But now I’m curious about your experience.

  • Have you ever built or trained an RNN model?
  • Did you run into vanishing gradients or the joys of exploding updates?
  • Maybe you were introduced to LSTMs (or GRUs) straight away and skipped vanilla RNNs altogether?
  • Or maybe you are just now connecting the dots between hidden states, time steps, and why Transformers were such a leap.

Drop your thoughts, experiences, or lingering questions in the comments. I would love to hear how you’ve approached sequence modelling in your ML journey.

Thanks again to Jose Parreño Garcia for writing this week’s newsletter.

Don’t forget to subscribe to his newsletter and connect with him on LinkedIn.

A roadmap for accelerators

Mike's Notes

It's getting busy, so I need a roadmap for accelerators now.

Resources

  • Resource

References

  • Reference

Repository

  • Home > Ajabbi Research > Library >
  • Home > Handbook > 

Last Updated

27/03/2026

A roadmap for accelerators

By: Mike Peters
On a Sandy Beach: 27/03/2026

Mike is the inventor and architect of Pipi and the founder of Ajabbi.

Lots of opportunities are coming in.

This is a roadmap for using coaching, workshops, incubators and accelerators to develop, test and validate the Ajabbi Mission Business Model and the Pipi closed-core and Pipi open-source applications.

It is sorted by deadline. The first row of the table is a key.

Date Roadmap

Each entry below repeats the same key fields:

  • Deadline
  • Start-End
  • Status
  • Title
  • Description
  • What
  • Learned
  • Resources


deadline

2017-2020

Read and tried everything

Steve Blank

"Steve Blank (born 1953) is an American entrepreneur, educator, author and speaker. He created the customer development method that launched the lean startup movement. His work has influenced modern entrepreneurship through the creation of tools and processes for new ventures, which differ from those used in large companies."

What

Learning from the very best.

  • Reading his books and blog
  • Watching videos
  • Using all the free courses and tools

Learned

  • How to use a Business Model Canvas
  • How to use a Mission Model Canvas
  • How to do customer discovery
  • How to run experiments to validate assumptions

Resources

deadline

2020-2023

Attended

KiwiSaaS

Online workshops and random monthly meetings with other founders.

What

Monthly online meeting with a random founder to chat.

  • To come

Learned

  • To keep going
  • It is OK to make mistakes
  • Lots of insights from being open
  • Importance of listening to others

Resources

January 2024

January - October 2024

Completed

Startup Aotearoa

"Startup Aotearoa ignites New Zealand’s entrepreneurial spirit by providing personalised one-to-one coaching to early-stage startup founders. Delivered nationwide through local regional providers,"

What

Mentoring from Mr G led to testing the ICP at Waimumu Southern Field Days 2024 on

  • Developers at Agritech companies
  • Agricultural suppliers

Learned

  • Developers are the ICP
  • Find a teaching customer

Resources

February 2025

February 2025 - March 2025

Completed

NZTE Export Essentials SaaS 4-part workshop. NZTE Position for Growth workshop.

What

Workshops with individual follow-up sessions.

Learned

  • To come.

Resources

April 2024

May - November 2024

Completed

Creative HQ's On the Business workshop series

Workshop series

What

Remote workshops using Milo canvas.

Learned

  • To come.

Resources

25/03/2026

April 2026 - March 2028

Application Withdrawn

Google AI Accelerator

"With this program, you can get access to startup experts, your Google Cloud and Firebase costs covered up to $200,000 USD (up to $350,000 USD for AI startups) over 2 years, technical training, business support, and Google-wide offers."

What

Collaborate with DeepMind to run wild ML integration experiments to go where no developer has gone before.

  • Pipi > IaC > GCP
  • Pipi > VM > BoxLang > Workspaces
  • Pipi > MCP > DeepMind Gemini
  • Pipi > Scientific Workflows > TPU

Learned

  • Invited to apply by a Google chap who was assisting behind the scenes using an unlisted pathway. I then discovered that free credits begin on the day of application approval, so I will reapply when ready to start in July to make the most of the 24-month window of opportunity.
  • Increase Pipi DevOps 1000x by completing work on the data centre, workspace rendering, and IaC to GCP free tier. This will enable fast, multiple automated experiments.

Resources

29/03/2026

May-June

Application underway

Access Activator

"Whaikaha, in collaboration with Creative HQ, is piloting Access Activator to find, fund and grow accessibility solutions from across New Zealand."

What

Test personalised accessible UI based on the GOV.UK Design System using paper mockups for

  • Autistic
  • Colour Blindness
  • Dyslexic
  • Epilepsy
  • Irlen
  • Physical & Motor Disabilities

Involve

  • Local disabled community
  • National support organisations for Autism, Dyslexia, Muscular Dystrophy, etc

Learned

  • To come.

Resources

31/03/2026

April-May

Applied

Creative Tech Activate

"Designed for teams at the idea stage, this 8-week pre-accelerator programme supports founders to validate their concept, build an early prototype and/or actionable technology roadmap, connect with first customers and partners, and learn how to build a scalable business using proven startup tools."

What

Test screen production workspace using HTML Mockups on

  • Local Film Industry Community

Learned

  • To come.

Resources

26/05/2026

July - November 2026

Application underway

Sprout Accelerator

"The Sprout Accelerator takes a cohort of agrifood innovators on a 3-month adventure to discover, articulate and refine the foundations to grow global startups."

What

Test farm management workspace using HTML Mockups

  • Dairy Farmer Catchment Group
  • Agritech wait list from Waimumu

Learned

  • To come.

Resources

June 2026

July 2026 - June 2028

To Apply

Google AI Accelerator

"With this program, you can get access to startup experts, your Google Cloud and Firebase costs covered up to $200,000 USD (up to $350,000 USD for AI startups) over 2 years, technical training, business support, and Google-wide offers."

What

Collaborate with DeepMind to run wild ML integration experiments to go where no developer has gone before.

  • Pipi > IaC > GCP
  • Pipi > VM > BoxLang > Workspaces
  • Pipi > MCP > DeepMind Gemini
  • Pipi > Scientific Workflows > TPU

Learned

  • To come.

Resources

July 2028

Start-End

To apply

Y Combinator

"The overall goal of YC is to help startups really take off. They arrive at YC at all different stages. Some haven't even started working yet, and others have been launched for a year or more. But whatever stage a startup is at when they arrive, our goal is to help them to be in dramatically better shape 3 months later."

What

How to scale from tiny to very large.

Learned

  • To come

Resources

Continental Drift

Mike's Notes

This is an example of a Notebook from Observable embedded on this page using an iFrame.

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library > Subscriptions > Observable Newsletter
  • Home > Handbook > 

Last Updated

26/03/2026

Continental Drift

By: Nicolas Lambert
Observable: 02/2026

Hi. I'm Nicolas Lambert, research engineer in geographic information science at CNRS (France). My job is to draw maps and teach cartography at the university. I am a geographer 🌎, and not really a developer 🖥️.

This notebook compresses 750 million years of continental drift into a mesmerizing animation. It includes current country boundaries, so as the animation loops, you can follow how different regions emerged over time.

Four autism subtypes map onto distinct genes, traits

Mike's Notes

Note

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library > Subscriptions > The Transmitter
  • Home > Handbook > 

Last Updated

26/03/2026

Four autism subtypes map onto distinct genes, traits

By: Giorgia Guglielmi
The Transmitter: 17/07/2025
DOI: https://doi.org/10.53053/BZOX1294

Giorgia Guglielmi is a freelance writer covering the life sciences, biomedicine and science policy. Her stories have appeared in Nature, Science, NOVA Next and other publications.

Giorgia has an M.Sc. in science writing from the Massachusetts Institute of Technology and a Ph.D. in biology from the European Molecular Biology Laboratory.

An analysis of more than 5,000 autistic children and their siblings underscores the idea that autism can be understood as multiple conditions with distinct trajectories.

Autism has long been seen as a single—if highly heterogeneous—condition. But rather than one continuous spectrum, there are distinct autism subtypes, each tied to its own genetic signature, a new study suggests.

The research, published 9 July in the journal Nature Genetics, connects genetic differences to specific patterns in how autism traits appear, supporting the notion that there are “many autisms,” says Michael Lombardo, senior researcher of neurodevelopmental disorders at the Istituto Italiano di Tecnologia in Genoa, Italy, who was not involved in the work.

Identifying the genes linked to autism subtypes is just the beginning, Lombardo says. The real challenge now, he adds, “is unraveling the biology that stems from those genes.”

The researchers, led by Olga Troyanskaya, professor of computer science and integrative genomics at Princeton University and deputy director for genomics at the Flatiron Institute in New York City, used a computational model to analyze data on 5,392 autistic children aged 4-18 years and their non-autistic siblings collected through the SPARK study. (The Flatiron Institute and the SPARK datasets used in the study are funded by the Simons Foundation, The Transmitter’s parent organization.) The data include genetic findings and parent-reported information on developmental milestones, cognitive and behavioral traits, co-occurring conditions and family history.

The machine learning analysis grouped the autistic people into four subtypes based on their genetic signatures and clinical patterns.

The “Social/behavioral” group—accounting for 37 percent of the sample and whose participants show repetitive behaviors, communication challenges and co-occurring conditions, but few developmental delays—has common genetic variants linked to ADHD and depression, and rare variants in genes active primarily after birth in certain brain cells. Another 19 percent belong to the “Mixed ASD with Developmental Delay” group, which hit many milestones later in development than children without autism but typically don’t have co-occurring conditions such as anxiety and depression; these participants carry many rare, damaging variants—both inherited and new—in genes active during early brain development in utero.

[The real challenge now] ‘is unraveling the biology that stems from those genes.’ - Michael Lombardo

Another group, called “Moderate challenges” (34 percent), is distinguished by a developmental pattern similar to the Social/behavioral group, though with less severity; its participants have rare genetic changes in less essential genes, which may explain their mild core autism traits. The 10 percent of children in the “Broadly affected” group, on the other hand, have prominent autism traits from an early age and carry a heavy load of rare mutations in key genes, including targets of a protein involved in fragile X syndrome.

These genetic differences track with the ages at which children reached certain developmental milestones and their age at autism diagnosis. For example, the groups with variants in early-expressed genes—the Broadly affected and the Mixed ASD with Developmental Delay groups—showed delays in early developmental skills and earlier diagnoses, whereas the group with variants in genes expressed after birth—the Social/behavioral group—had later diagnoses and developmental timelines similar to those of non-autistic children.

“There are a lot of kids who seem very neurotypical until a bit later in childhood,” says study investigator Natalie Sauerwald, associate research scientist of computational genomics at the Flatiron Institute. “The fact that we were able to find genetics that aligns with that was really surprising.”

Other research efforts have identified autism subgroups, but none currently offer the definitive take on how to group populations within the autism spectrum—likely because results depend on who’s studied and how, Lombardo says. The new study included children aged 4-18, which increased the sample size but introduced variability, because a preschooler and a teenager are at different developmental stages, he says. His own work, focused on narrower age ranges, has found fewer subtypes, suggesting that different datasets may yield different subtypes.

In addition, rather than linking genes to single traits as in past investigations, Troyanskaya and colleagues looked at a person’s overall combination of traits.

The team validated their new findings using data from the Simons Simplex Collection (SSC), which contains information gathered by clinicians. The autism subtypes identified based on parent-reported data were consistent with those found in the SSC, which suggests that the subtypes reflect real differences, Troyanskaya says.

As more data become available, the identified autism subtypes may be further refined, revealing additional genetic and clinical details within each group, she says.

More data can validate the findings across diverse populations and adult cases, which could eventually lead to more tailored diagnoses and support in the future, Sauerwald says.

Thomas Bourgeron, director of the Human Genetics and Cognitive Functions Unit at the Institut Pasteur in Paris, who was not involved in the work, agrees. “Some individuals need medical support, maybe gene therapy, and others need better inclusion in society, better recognition of neurodiversity and so on,” he says.

However, he adds, a common limitation in autism research is that it focuses too much on clinical data and not enough on real-life experiences, so there’s a need for more practical, long-term information to better understand how autism affects daily life. “We need to have a better idea of the trajectory of these individuals.”

Fig. 1: Overview of study design and description of identified subclasses.

Decomposition of phenotypic heterogeneity in autism reveals underlying genetic programs | doi:10.53053/BZOX1294

      How YouTube and Adhesive Tape Are Disrupting Assistive Technology

      Mike's Notes

      Very cool. Making stuff.

      Resources

      References


      Repository

      • Home > Ajabbi Research > Library > Subscriptions > IEEE Spectrum
      • Home > Handbook > 

      Last Updated

      24/03/2026

      How YouTube and Adhesive Tape Are Disrupting Assistive Technology

      By: Jason Hahr
      IEEE Spectrum: 31/01/2026

      Jason Hahr is a 2025 IEEE Spectrum Taenzer Fellow in disability rights and assistive technology journalism.

      The “MacGyver” approach lets disabled users reconfigure their tech

      One of maker Therese Willkomm's assistive tech hacks involved developing a control panel for GPS map apps—part of her mission to help disabled people find their way through a non-compliant tech world. Therese Willkomm

Assistive technology is expensive, and many people with disabilities live on fixed incomes. Disabled assistive tech users also must contend with equipment that was often designed without any capacity to be repaired or modified. But assistive tech users ultimately need the functionality they need—a wheelchair that doesn’t constantly need charging, perhaps, or a hearing aid that doesn’t amplify all background noise equally. Assistive tech “makers,” who can hack and modify existing assistive tech, have always been in high demand.

Therese Willkomm, emeritus professor of occupational therapy at the University of New Hampshire, has written three books cataloging her more than 2,000 assistive technology hacks. Willkomm says she aims to keep the cost of each hack under five dollars.

      She’s come to be known internationally as the “MacGyver of Assistive Technology” and has presented more than 600 workshops and assistive tech maker days across 42 states and 14 countries.

      IEEE Spectrum sat down with Willkomm ahead of her latest assistive tech Maker Day workshop, on Saturday, 31 January, at the Assistive Technology Industry Association (ATIA) conference in Orlando, Florida. Over the course of the conversation, she discussed the evolution of assistive technology over 40 years, the urgent need for affordable communication devices, and why the DIY movement matters now more than ever.

      IEEE Spectrum: What got you started in assistive technology?

Therese Willkomm: I grew up in Wisconsin, where my father had a machine shop and worked on dairy and hog farms. At age 10, I started building and making things. A cousin was in a farm accident and needed modifications to his tractor, which introduced me to welding. In college, I enrolled in vocational rehabilitation and learned about rehab engineering—the term “assistive technology” wasn’t coined until 1988, with the Technology-Related Assistance Act. In 1979, Gregg Vanderheiden came to the University of Wisconsin-Stout and demonstrated creative things with garage door openers and communication devices. I thought, “Wow, this would be an awesome career path—designing and fabricating devices and worksite adaptations for people with disabilities to go back to work and live independently.” I haven’t looked back.

      You’ve created over 2,000 assistive technology solutions. What’s your most memorable one?

      Willkomm: A device for castrating pigs with one hand. We figured out a way to design a device that fit on the end of the hog crate that was foot-operated to hold the hind legs of the pig back so the procedure could be done with one hand.

      Assistive Technology’s Changing Landscape

      How has assistive technology evolved over the decades?

      Willkomm: In the 1980s, we fabricated devices from wood and early electronics. I became a [Rehabilitation Engineering and Assistive Technology Society of North America, a.k.a. RESNA] member in 1985. The 1988 Technology-Related Assistance Act was transformational—all 50 states finally got funding to support assistive technology and needs in rural areas. Back in the ‘80s, we were soldering and making battery interrupters and momentary switches for toys, radios, and music. Gregg was doing some things with communication. There were Prentke Romich communication devices. Those were some of the first electronic assistive technologies.

      The early 1990s was all about mobile rehab engineering. Senator Bob Dole gave me a $50,000 grant to fund my first mobile unit. That mobile unit had all my welding equipment, all my fabrication equipment, and I could drive farm to farm, set up outside right in front of the tractor, and fabricate whatever needed to be fabricated. Then, around 1997, there were cuts in the school systems. Mobile units became really expensive to operate. We started to look at more efficient ways of providing assistive technology services. With the Tech Act, we had demonstration sites where people would come and try out different devices. But people had to get in a car, drive to a center, get out, find parking, come into the building—a lot of time was being lost.

      In the 2000s, more challenges with decreased funding. I discovered that with a Honda Accord and those crates you get from Staples, you could have your whole mobile unit in the trunk of your car because of advances in materials. We could make battery interrupters and momentary switches without ever having to solder. We can make switches in 28 seconds, battery interrupters in 18 seconds. When COVID happened, we had to pivot—do more virtual, ship stuff out to people. We were able to serve more individuals during COVID than prior to COVID because nobody had to travel.

      How do you keep costs under five dollars?

      Willkomm: I aim for five dollars or less. I get tons of corrugated plastic donated for free, so we spend no money on that. Then there’s Scapa Tape—a very aggressive double-sided foam tape that costs five cents a foot. If you fabricate something and it doesn’t work out, and you have to reposition, you’re out a nickel’s worth of material. Buying Velcro in bulk helps too. Then Instamorph—it is non-toxic, biodegradable. You can reheat it, reform it, in five minutes or less up to six times. I’ve created about 132 different devices just using Instamorph. A lot of things I make out of Instamorph don’t necessarily work. I have a bucket, and I reuse that Instamorph. We can get six, seven devices out of reusable Instamorph. That’s how we keep it under five dollars.

      What key legislation impacts assistive technology?

Willkomm: Definitely the Technology-Related Assistance Act. In the school system, however, it only says “Did you consider assistive technology?” So that legislation really needs to be beefed up. The third piece of legislation I worked on was the AgrAbility legislation to fund assistive technology consultations and technical assistance for farmers and ranchers. The latest Technology-Related Assistance Act was reauthorized in 2022. Not a whole lot of changes—it’s still assistive technology device demonstrations and loans, device reuse, training, technical assistance, information and awareness. The other thing is NIDILRR—the National Institute on Disability, Independent Living, and Rehabilitation Research, funded under [the U.S. Department of Health and Human Services, a.k.a. HHS]. Funding the rehab engineering centers was pretty significant in advancing the field because these were huge, multimillion-dollar centers dedicated to core areas like communication and employment. Now there’s a new one out on artificial intelligence.

      With over 2,000 hacks to improve usability of assistive technologies, veteran DIY maker Therese Willkomm has earned the moniker “the MacGyver of assistive tech.” Therese Willkomm

      A Vision for a Better Assistive Tech Future

      What deserves more focus in your field?

      Willkomm: The supply-and-demand problem. It all comes down to time and money. We have an elderly population that continues to grow, and a disability population that continues to grow—high demand, high need for assistive technology, yet the resources available to meet that need are limited. A few years back, the Christopher & Dana Reeve Foundation had a competition. I submitted a proposal similar to the Blue Apron approach. People don’t have supplies at their house. They can’t buy two inches of tape—they have to buy a whole roll. They can’t buy one foot of corrugated plastic—they’ve got to buy an 18-by-24 sheet or wait till it gets donated.

      With my third book, I created solutions with QR codes showing videos on how to make them. I used Christopher Reeve Foundation funding to purchase supplies. With Blue Apron, somebody wants to make dinner and a box arrives with a chicken breast, potato, vegetables, and recipe. I thought, what if we could apply that to assistive technology? Somebody needs something, there’s a solution out there, but they don’t have the money or the time—how can we quickly put it in a box and send it to them? People who attended my workshops didn’t have to spend money on materials or waste time at the store. They’d watch the video and assemble it.

      But then there were people who said, “I do not have even five minutes in the school day to stop what I’m doing to make something.” So we found volunteers who said, “Hey, I can make slant boards. I can make switches. I can adapt toys.” You have people who want to build stuff and people who need stuff. If you can deal with the time and money issue, anything’s possible to serve more people and provide more devices.

      What’s your biggest vision for the future?

Willkomm: I’m very passionate about communication. December 15 was the passage in 1791 of our First Amendment, freedom of speech. Yet people with communication impairments are denied their basic right of freedom of speech because they don’t have an affordable communication device, or it takes too long to program or learn. I just wish we could get better at designing and fabricating affordable communication devices, so everybody is afforded their First Amendment right. It shouldn’t be something that’s nice to have—it’s something that’s needed. When you lose your leg, you’re fitted with a prosthetic device, and insurance covers that. Insurance should also cover communication devices and all the support services needed. With voice recognition and computer-generated voices, there are tremendous opportunities in assistive technology for communication impairments that need to be addressed.

      What should IEEE Spectrum readers take away from this conversation?

      Willkomm: There’s tremendous need for this skill set—working in conjunction with AI and material sciences and the field of assistive technology and rehab engineering. I’d like people to look at opportunities to volunteer their time and also to pursue careers in the field of specialized rehab engineering.

      How are DIY approaches evolving with new technologies?

      Willkomm: What we’re seeing at maker fairs is more people doing 3D printing, switch-access controls, and these five-minute approaches. There has to be a healthy balance between what we can do with or without electronics. If we need something programmed with electronics, absolutely—but is there a faster way?

      The other thing that’s interesting is skill development. You used to have to go to college for four, six, eight years. With YouTube, you can learn so much on the internet. You can develop skills in things you never thought were possible without a four-year degree. There’s basic electronic stuff you can absolutely learn without taking a course. I think we’re going to have more people out there doing hacks, asking “What if I change it this way?” We don’t need to have a switch.

      We need to look at the person’s body and how that body interacts with the electronic device interface so it requires minimal effort—whether it be eye control or motion control. Having devices that predict what you’re going to want next, that are constantly listening, knowing the way you talk. I love the fact that AI looks at all my emails and creates this whole thing like “Here’s how I’d respond.” I’m like, yeah, that’s exactly it. I just hit select, and I don’t have to type it all out. It speeds up communication. We’re living in exciting times right now.

      This article was supported by the IEEE Foundation and a John C. Taenzer fellowship grant.