When it Comes to AI Policy, Congress Shouldn’t Cut States off at the Knees

Mike's Notes

I agree with the content of this essay by Gary Marcus. I believe:

  • LLMs are an overhyped and speculative craze.
  • They hallucinate.
  • They are trained on the content of the internet, including copyrighted material and porn.
  • Their design is fundamentally flawed.
  • They can be useful for limited tasks, including text translation and rough drafts.
  • Pipi 9 does not use AI based on LLMs; it uses AI that mimics biological processes.

Repository

  • Home > Ajabbi Research > Library >
  • Home > Handbook > 

Last Updated

17/05/2025

When it Comes to AI Policy, Congress Shouldn’t Cut States off at the Knees

By: Gary Marcus
Marcus on AI: 14/05/2025

This essay is coauthored with many representatives from states across the United States, as listed below.

Artificial intelligence holds immense promise—from accelerating disease detection to streamlining services—but it also presents serious risks, including deepfake deception, misinformation, job displacement, exploitation of vulnerable workers and consumers, and threats to critical infrastructure. As AI rapidly transforms our economy, workplaces, and civic life, the American public is calling for meaningful oversight. According to the Artificial Intelligence Policy Institute, 82% of voters support the creation of a federal agency to regulate AI. A Pew Research Center survey found that 52% of Americans are more concerned than excited about AI’s potential, and 67% doubt that government oversight will be sufficient or timely.

Public skepticism crosses party lines and reflects real anxiety: voters worry about data misuse, algorithmic bias, surveillance, impersonation, and even catastrophic risks. Pope Francis has named AI as one of the defining challenges of our time, warning of its ethical consequences and impacts on ordinary people and calling for urgent action.

Yet instead of answering this call with guardrails and public protections, Congress, which has done almost nothing to address these concerns, is considering a major step backwards, a tool designed to prevent States from taking matters into their own hands: a sweeping last-minute preemption provision tucked into a federal budget bill that would ban all state regulation on AI for the next decade.

The provision, which is likely at odds with the 10th Amendment, demands that “no state or political subdivision may enforce any law or regulation regulating artificial intelligence models, artificial intelligence systems, or automated decision systems during the 10-year period beginning on the date of the enactment of this Act.” The measure would prohibit any state from regulating AI for the next ten years in any way—even in the absence of any federal standards.

This would be deeply problematic under any circumstance, but it’s especially dangerous in the context of a rapidly evolving technology already reshaping healthcare, education, civil rights, and employment. If enacted, the statute would preempt states from acting, even if AI systems cause measurable harm, such as through discriminatory lending, unsafe autonomous vehicles, or invasive workplace surveillance. For example, twenty states have passed laws regulating the use of deepfakes in election campaigns, and Colorado passed a law to ensure transparency and accountability when AI is used in crucial decisions affecting consumers and employees. The proposed federal law would automatically block the application of those state laws, without offering any alternative. The proposed provision would also preempt laws holding AI companies liable for any catastrophic damages that they contributed to, as the California Assembly tried to do.

The federal government should not get to control literally every aspect of how states regulate AI, particularly when it has itself fallen down on the job, and the Constitution makes pretty clear that the bill as written is far, far too broad. The 10th Amendment states, quite directly, that “The powers not delegated to the United States by the Constitution, nor prohibited by it to the States, are reserved to the States respectively, or to the people.” A bill that steps so thoroughly on states’ rights is difficult to square with this 234-year-old bedrock principle of the United States. (Defenders of this overbroad bill will claim that AI is part of interstate commerce; years of lawsuits will ensue.)

Of course there are always arguments on the other side. The Big Tech position was laid out well in a long piece published Friday in Lawfare by Kevin Frazier and Adam Thierer that has elements of truth but misses the larger picture. Part of it emphasizes the race with China and the need for speed. Their claim, which exaggerates the costs of regulation and minimizes the costs of having none (not to mention states’ rights), is that AI regulation “could undermine the nation's efforts to stay at the cutting edge of AI innovation at a critical moment when competition with China for global AI supremacy is intensifying” and that “If this growing patchwork of parochial regulatory policies takes root, it could undermine U.S. AI innovation”; they call on Congress “to get serious about preemption”.

What they miss is threefold. First, if current trends continue, the “race” with China will not end in victory for either side. Because both countries are building essentially the same kinds of models, with the same kinds of techniques, using the same kinds of data, the results from the two nations are converging on the same outcomes. So-called leaderboards are no longer dominated by any one country. Any advantage in Generative AI (which still hasn’t remotely made a net profit, and is all still speculative) will be minimal and short-lived. Our big tech giants will match theirs, and vice versa; the only real question is the size of the profits. Any regulation that is proposed will be absorbed as a cost of business (trivial for trillion-dollar companies), and there is no serious argument that the relatively modest costs of regulation (which they don’t even bother to estimate) will have any real-world impact whatsoever on those likely tied outcomes. Silicon Valley loves to invoke China to get better terms, but it probably won’t make any difference. (China actually has far more national regulation around AI than the US does, and that has in no way stopped it from catching up.)

Second, Frazier and Thierer are presenting a false choice. The comparison here is not between coherent federal law and a patchwork of state laws, but between essentially zero enduring federal AI law (only executive orders that seem to come and go with the tides) and the well-intentioned efforts of many state legislators to make up for the fact that Washington has failed. If Washington wants to pass a comprehensive privacy or AI law with teeth, more power to it, but we all know this is unlikely; Frazier and Thierer would leave citizens out to dry, much as light-touch advocates have left us all out to dry when it comes to social media.

Third, Frazier and Thierer skirt the issue of states’ rights altogether, never considering how AI fits relative to other sensitive issues such as abortion or gun control. In insisting that “might makes right” here for AI, they risk setting a dangerous precedent in which whichever party holds federal power makes all the rules, all the time, overriding the powers of the States that the 10th Amendment exists to protect, and eroding one of our last remaining checks and balances.

And as Senator Markey put it, “[a] 10-year moratorium on state AI regulation won’t lead to an AI Golden Age. It will lead to a Dark Age for the environment, our children, and marginalized communities.”

Consumer Reports’ policy analyst for AI issues, Grace Gedye, also weighed in: “Congress has long abdicated its responsibility to pass laws to address emerging consumer protection harms; under this bill, it would also prohibit the states from taking actions to protect their residents.”

Well aware of the challenges AI poses, state leaders have already been acting. An open letter from the International Association of Privacy Professionals, signed by 62 legislators from 32 states, underscores the importance of state-level AI legislation—especially in the absence of comprehensive federal rules. Since 2022, dozens of states have introduced or passed AI laws. In 2024 alone, 31 states, Puerto Rico, and the Virgin Islands enacted AI-related legislation or resolutions, and at least 27 states passed deepfake laws. These include advisory councils, impact assessments, grant programs, and comprehensive legislation like Colorado’s, which would have mandated transparency and anti-discrimination protections in high-risk AI systems. The proposed moratorium would also undo literally every bit of state privacy legislation, despite the fact that no federal privacy bill has passed after many years of discussion.

It's specifically because of this state momentum that Big Tech is trying to shut the states down. According to a recent report in Politico, “As California and other states move to regulate AI, companies like OpenAI, Meta, Google and IBM are all urging Washington to pass national AI rules that would rein in state laws they don’t like. So is Andreessen Horowitz, a Silicon Valley-based venture capital firm closely tied to President Donald Trump.” All of this is happening largely behind closed doors. Why? With no regulatory pressure, tech companies would have little incentive to prioritize safety, transparency, or ethical design; any costs to society would be borne by society.

But the reality is that self-regulation has repeatedly failed the public, and the absence of oversight would only invite more industry lobbying to maintain weak accountability.

At a time when voters are demanding protection—and global leaders are sounding the alarm—Congress should not tie the hands of the only actors currently positioned to lead. A decade of deregulation isn’t a path forward. It’s an abdication of responsibility.

If you are among the 82% of Americans who think AI needs oversight, call or write your Congress members now, or the door on AI regulation will slam shut, at least for the next decade, if not forever, and we will be entirely at Silicon Valley’s mercy.

Signatories

  • Senator Katie Fry Hester, Maryland
  • Gary Marcus, Professor Emeritus, NYU
  • Delegate Michelle Maldonado, Virginia
  • Senator James Maroney, Connecticut
  • Senator Robert Rodriguez, Colorado
  • Representative Kristin Bahner, Minnesota
  • Representative Steve Elkins, Minnesota
  • Senator Kristen Gonzalez, New York
  • Representative Monique Priestley, Vermont

Why Developers Still Use ColdFusion in 2025

Mike's Notes

Pipi has been built in ColdFusion (CFML) code since Pipi 3 in 2002. It has proven to be a good choice because of the ease of rapid prototyping and the few lines of code required to do anything compared to other languages.

CFML acts as a wrapper around Java, so it runs on the Java Virtual Machine (JVM).

Pipi 10 will fully migrate to the open-source BoxLang platform provided by Ortus, and will still use CFML.

Repository

  • Home > Ajabbi Research > Library >
  • Home > Handbook > 

Last Updated

15/05/2025

Why Developers Still Use ColdFusion in 2025

By: Nick Flewitt
FusionReactor: 13/05/2025

In a tech landscape dominated by JavaScript frameworks, Python, and cloud-native solutions, ColdFusion continues to maintain a dedicated user base. Despite being introduced back in 1995, this platform remains relevant for specific use cases and organizations. Let’s explore why some developers and companies still choose ColdFusion in today’s rapidly evolving development environment.

Legacy Systems and Institutional Knowledge

Many organizations built critical business applications on ColdFusion during its peak popularity in the late 1990s and early 2000s. These systems have been refined over decades and represent significant investments in both time and resources. For these organizations, the cost-benefit analysis often favors maintaining and gradually modernizing existing ColdFusion applications rather than complete rewrites.

“Rewriting working systems from scratch is one of the most expensive and risky decisions an organization can make,” explains a common sentiment among ColdFusion developers.

Rapid Development Speed

ColdFusion was one of the earliest platforms designed specifically for rapid application development (RAD), and this remains one of its strongest advantages. Compared to many other languages, the CFML (ColdFusion Markup Language) syntax allows developers to accomplish complex tasks with minimal code.

Tasks that might require dozens of lines in other languages often need just a few in ColdFusion. For example, querying a database, processing the results, and outputting formatted data can be accomplished in remarkably few lines of CFML.
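
To make that concrete, here is a minimal sketch; the datasource name ("blogDSN"), table, and columns are hypothetical:

    <!--- Query a database and output formatted results in a few lines.
          "blogDSN" and the "posts" table are illustrative examples. --->
    <cfquery name="recentPosts" datasource="blogDSN">
        SELECT title, author, published_on
        FROM posts
        ORDER BY published_on DESC
    </cfquery>
    <cfoutput query="recentPosts" maxrows="10">
        <p>#title# by #author# (#dateFormat(published_on, "yyyy-mm-dd")#)</p>
    </cfoutput>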

Strong in the Enterprise

ColdFusion has found particular longevity in enterprise environments, especially in industries like healthcare, finance, government, and education. These sectors value stability, security, and vendor support – all areas where Adobe’s commercial backing of ColdFusion provides reassurance.

Modern Evolution

Contrary to popular belief, ColdFusion hasn’t remained static. Recent versions have introduced substantial modernizations:

  • Support for modern JavaScript frameworks
  • REST API capabilities
  • Performance improvements
  • Docker containerization
  • Enhanced security features

Additionally, Lucee (an open-source CFML engine) provides a free alternative that has helped rejuvenate interest in the language.

Developer Productivity and Salary Advantages

The ColdFusion job market presents an interesting dynamic – while demand for new CF developers has declined, experienced ColdFusion developers often command premium salaries due to the combination of their scarcity and the business-critical systems they maintain.

This niche expertise can be particularly lucrative for freelancers and consultants who specialize in ColdFusion, especially when helping organizations modernize legacy applications.

Integration Capabilities

ColdFusion excels at integrating disparate systems, a crucial capability in enterprise environments with complex technology ecosystems. Its Java foundation allows for integration with virtually any system, while built-in features simplify connecting to databases, APIs, and legacy systems.
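
As a small illustration of that Java foundation, CFML can instantiate JVM classes directly. A minimal sketch using two standard java.* classes:

    <cfscript>
    // Create Java objects straight from CFML; no glue code required.
    uuid = createObject("java", "java.util.UUID").randomUUID().toString();
    fmt  = createObject("java", "java.text.NumberFormat").getCurrencyInstance();
    writeOutput(uuid & " | " & fmt.format(1234.5));
    </cfscript>

The same mechanism reaches any library on the classpath, which is what makes the integration story so broad.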

When Choosing ColdFusion Makes Sense

New adoption of ColdFusion typically occurs in specific scenarios:

  • Organizations with existing ColdFusion applications expanding functionality
  • Teams with significant ColdFusion expertise tackling new projects
  • Rapid development of internal business applications where time-to-market outweighs other concerns
  • Specialized industries where established ColdFusion solutions exist

Looking Forward

While ColdFusion isn’t likely to regain its former prominence, it demonstrates an important lesson about technology adoption: tools that effectively solve real business problems can remain viable long after technology trends have passed.

Though smaller than in its heyday, the ColdFusion community remains active and passionate. User groups, conferences like CF Summit, and online forums continue to support developers working with the platform.

Conclusion

ColdFusion’s continued use in 2025 isn’t merely about resistance to change or technical debt. It represents a pragmatic choice for specific use cases and organizations that balances development speed, maintenance costs, and organizational expertise. While not the right choice for every project, its longevity demonstrates that technology adoption isn’t always about following the latest trends but finding the right tool for specific business contexts.

As one veteran ColdFusion developer puts it: “I’ve been hearing that ColdFusion is dying for twenty years now, but somehow I keep getting paid very well to work with it.”

How to Do Mobile Testing Right

Mike's Notes

This is an excellent article by Luca Rossi from the Substack Refactoring on testing applications for mobile devices.

Repository

  • Home > Ajabbi Research > Library > Subscriptions > Refactoring
  • Home > Handbook > 

Last Updated

15/05/2025

How to Do Mobile Testing Right

By: Luca Rossi
Refactoring: 14/05/2025

A thorough guide that includes practical playbooks for companies at every stage.

Mobile apps are the primary interface through which hundreds of millions of people interact with services daily.

Smartphones have been a thing for 15+ years, during which they have arguably changed… not that much, especially in recent years. For this reason, it may seem reasonable to expect a flawless experience at every tap.

Instead, the reality is different and painful — especially for mobile engineering teams: an ever-expanding universe of devices, hardware, OS versions, screen resolutions, and capabilities that software must navigate correctly.

Welcome to mobile fragmentation hell.

In my previous life as a startup CTO, I ran a travel web app that had native counterparts on iOS and Android, and I swear that mobile testing and QA was one of the things that kept me up at night. It is incredibly hard to do it right, yet supremely important, having a direct impact on user satisfaction, churn, and ultimately, the bottom line.

So today we are publishing a full guide on fragmentation testing that provides a comprehensive but pragmatic approach to the problem, covering principles, strategies, and tools.

We are doing so by drawing from my own experience, from that of people in the Refactoring community, and by bringing in the invaluable experience of Nakul Aggarwal, CTO and co-founder of BrowserStack.

BrowserStack is a cornerstone of how thousands of teams — including my former own — engage with real-device testing at scale, and Nakul is one of the world’s most knowledgeable people in this space.

So, as you will see, success in mobile testing is about making smart choices, and focusing your efforts where they yield the greatest return. We'll explore how to define "good enough" in a world of infinite variables, and how to build a testing approach that supports, rather than hinders, your engineering velocity.

Here is the agenda:

  • What is mobile fragmentation? — defining the beast and its many heads.
  • Cost of fragmentation — the real-world business consequences.
  • Fragmentation testing playbook — how to build your full testing process, from strategy down to tactics.
  • Testing strategy vs product maturity — how your playbook should evolve over time.
  • Navigating the trade-offs — balancing cost, speed, coverage, and developer experience.
  • The Future is Still Fragmented — trends, the role of AI, and some closing notes.

Let's dive in!

What is Mobile Fragmentation?

We've thrown the term "mobile fragmentation" around, but what do we mean by that? Spoiler: it’s not something that happens to your phone screen after a drop.

At its core, mobile fragmentation is the sheer diversity of hardware and software across the millions of devices your application might encounter in the wild.

Such diversity is also multi-dimensional:

  • Device manufacturers (the who) — from the big players like Apple and Samsung, to the long tail of Xiaomi, Oppo, OnePlus, and the countless regional champions. Each comes with its own hardware quirks, custom Android skins, and unique interpretations of how Android should behave.
  • Operating systems & versions (the what & when) — you need to account for multiple major versions active concurrently. Update rollouts lead to notorious lags, and some devices never get updated beyond a certain point. This is true for both iOS and Android, with Android being typically much worse.
  • Screen sizes & resolutions (the where) — today’s smartphones range from compact screens to phablets, foldables, and tablets: the range is vast. Beyond physical size, you may need to account for pixel density, aspect ratios, and newer features like dynamic refresh rates or screen cutouts (notches, punch-holes), all of which can wreak havoc on your UI if not handled gracefully.
  • Hardware differences (the how) — beneath the glass, there's even more: processors, memory constraints, GPUs, and sensors, which may or may not make a difference in how your app behaves.

Trying to account for every permutation is impossible. Understanding these dimensions, however, is the first step to building a smarter strategy.

The Cost of Fragmentation

One of the things I always try to do at Refactoring is to think from first principles, so the first question here is: what if you just ignore this? Seriously, let’s not take anything for granted.

How bad is this, for real?

Unfortunately, when fragmentation is managed poorly, it bites hard — on your users, your team, and the business:

  • Poor UX — the most immediate impact. Users encounter crashes, freezes, baffling UI glitches, or sluggish performance that makes the app feel broken. Frustrated users are 1) unlikely to give you a second chance, and 2) they often head straight to…
  • Bad reviews — users are quick to voice their displeasure, and negative App Store reviews are incredibly damaging—it doesn’t matter if they are about a small set of devices. A flood of "unusable on Android 12" reviews will torpedo your app's rating, affecting everyone.
  • Churn — if an existing user has a persistent issue on their device after an update, or a new user has a terrible first experience, they're likely to abandon your app. Acquiring users is expensive: losing them due to preventable issues is a painful, self-inflicted wound.
  • Support costs — your support team gets swamped with tickets and complaints related to device-specific bugs. Diagnosing these can be a nightmare, requiring detailed information about device models, OS versions, and steps to reproduce that users often struggle to provide.
  • Slower dev velocity (ironically) — if you are trying to move faster by avoiding thorough testing, think again. Fragmentation bugs in production can lead to constant firefighting and a reactive development cycle. This drains morale and pulls your team away from feature development.

Investing in a good testing strategy isn't just about "quality assurance" in an abstract sense: it's about protecting your revenue, your reputation, and your team's ability to move fast.

So how do you do that? Enter the playbook.

Fragmentation Testing Playbook

We have established the why — now for the how. This section should work as your tactical playbook: the core strategies and tools you'll use to construct a robust, pragmatic mobile fragmentation testing process.

This playbook focuses on four key pillars:

  • Device matrix — your clear, data-driven plan of which devices and OS versions matter the most to your users.
  • Testing mix — a balanced portfolio of emulators, real devices, and cloud solutions to maximize coverage and efficiency.
  • Foundational quality — strong architecture, base testing, and monitoring to significantly reduce the number of bugs that reach device-specific testing.
  • Automation strategy — manual testing doesn't scale, and smart automation is crucial for maintaining velocity and reducing toil for your team.

Let's break these down.

1) Build your device matrix

Your Device Matrix is the single most important artifact guiding your testing. It’s a curated inventory of devices and OS versions, tiered by importance, against which you validate your app. Here is how you build one:

1.1) Know your actual users (be data driven)

First of all, be data driven. Prioritize based on which devices, OS versions, and even geographical regions are most prevalent among your user base.

Action — Dive into your analytics. Understand your specific device landscape. What are the top 10, 20, 50 devices your active users are on? What OS versions dominate? This data is the bedrock of your device matrix.

1.2) Prioritize (risk assessment)

Not all devices or features are created equal in terms of risk.

Risk, in this context, is a function of likelihood (how many users on this device/OS?) and impact (how critical is this feature? What happens if it breaks?).

Action — Focus your most intensive testing on high-traffic user flows running on the most popular devices/OS versions within your user base. A bug in your checkout process on your top 5 Android devices is infinitely more critical than a minor UI glitch on an obscure device with little market share among your users.

1.3) Define tiers (risk acceptance)

Since you can't test everything equally, you need to explicitly define levels of risk you're willing to accept for different device segments. This formalizes your prioritization.

Action — Create device/OS tiers (a small code sketch of tier assignment follows this list). For example:

    • Tier 1 (Critical) — your most popular devices/OS versions (e.g. top 80% of your user base). Bugs here are unacceptable. These devices get the full suite of manual and automated tests for all critical and important features.
    • Tier 2 (Important) — the next significant chunk of devices. Minor cosmetic issues might be tolerable, but core functionality must work. These might get critical path automation and focused manual testing.
    • Tier 3 (Supported/Best Effort) — older or less common devices. You aim for basic functionality, but known issues might be documented and not block a release if non-critical. Testing might be limited to smoke tests or exploratory testing if time permits.
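
To make the tiers concrete, here is a minimal sketch of tier assignment, written in CFML to match the rest of this blog; the share thresholds are illustrative assumptions, not a standard:

    <cfscript>
    // Hypothetical tier assignment driven by analytics data.
    // "share" is the fraction of active users on a device/OS combination;
    // the 5% and 1% cutoffs are illustrative and should come from your data.
    function assignTier(required numeric share) {
        if (share >= 0.05) return "Tier 1"; // full manual + automated suite
        if (share >= 0.01) return "Tier 2"; // critical-path coverage
        return "Tier 3";                    // smoke tests, best effort
    }
    writeOutput(assignTier(0.12)); // Tier 1
    </cfscript>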

1.4) Keep it alive

Finally, create a process in which you review and update the matrix on a periodic basis (e.g. quarterly), as your user base and the market will inevitably evolve. Your matrix is only useful as long as it is up to date.

2) Create your testing mix

No single testing method conquers fragmentation, but a balanced portfolio might do. Here are the most common approaches:

  • Emulators & Simulators — emulators are the first line of defense for developers. They are fast, free, and scalable for basic layout and functional bug checks during development. However, they can't perfectly replicate real hardware performance, sensor behavior, or OEM-specific OS mods.
  • Real devices (In-house lab) — they provide the highest fidelity for performance, hardware interactions, and manufacturer quirks… but they can be expensive and logistically challenging to maintain.
  • Cloud device farms — the scalable solution for broad real-device testing (manual and automated) without owning hardware. Platforms like BrowserStack give you on-demand access to thousands of physical devices/OS versions globally, and allow precise matrix mirroring and massive test parallelization.

3) Establish foundational quality

Strong underlying code quality and good production monitoring significantly ease the load on device-specific testing. Your goal should be to minimize the number of issues that reach device-level testing by intercepting them earlier:

  • Strong typing & static analysis — strongly typed languages like TypeScript (React Native), Kotlin, and Swift help you catch a lot of errors before runtime. Employ linters for further analysis.
  • Robust unit & integration tests — ensure core logic, utilities, and API integrations are thoroughly covered. Unit and integration tests are fast and cheap to run, especially compared to E2E tests (see the small example after this list).
  • Architect for testability — design choices matter. Keep the mobile app light by pushing as much business logic as possible into the backend layer, where it is easier to test. If you are using a universal framework like React Native or Flutter, refrain as much as possible from writing platform-specific code.
  • Intensive logging & production monitoring — your safety net. Implement good monitoring with tools like Firebase Crashlytics or Sentry, to catch issues that slip through as early as possible.
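
As one small example of cheap, fast coverage, here is a minimal TestBox-style spec, in CFML to match the rest of this blog (TestBox is a common CFML testing framework; the discount function is a made-up stand-in for real business logic, defined inline only to keep the sketch self-contained):

    // A minimal TestBox-style spec: fast unit coverage of core logic.
    component extends="testbox.system.BaseSpec" {
        // Hypothetical function under test.
        function applyDiscount(required numeric price, required numeric rate) {
            return price * (1 - rate);
        }
        function run() {
            describe("discount logic", function() {
                it("applies a 10% discount", function() {
                    expect( applyDiscount(100, 0.10) ).toBe(90);
                });
            });
        }
    }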

4) Automation strategy

Manual testing across a large matrix is unsustainable, but implementing automation across the board can be equally hard. Make smart automation choices, streamlining key areas while maintaining velocity:

  • Focus your automation — don't try to automate everything. Prioritize critical user flows ("must not break" scenarios) on your Tier 1 devices. Use well-established frameworks (Appium, Espresso, XCUITest).
  • Parallelize with cloud platforms — running suites sequentially is a bottleneck. Cloud platforms enable massive parallel execution across hundreds of configurations, providing fast feedback in CI/CD (see the sketch after this list).
  • Incorporate visual regression testing — for UI-heavy apps, these tools automatically detect visual changes across devices, catching layout bugs functional tests miss.
  • Reduce toil & boost DevEx — automation's goal is to free your team from repetitive manual checks, leading to faster, more reliable feedback and higher developer confidence.
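
Here is a rough sketch of the parallel-dispatch idea, again in CFML for consistency with this blog; the device list and the runSuiteOn stub are assumptions standing in for a real device-cloud API:

    <cfscript>
    // Fan test runs out across device configurations in parallel.
    // Modern engines (Lucee 5+, Adobe CF 2021+) accept a parallel flag
    // and a max-thread count on arrayEach.
    function runSuiteOn(required string device) {
        // Stand-in: in practice this would call your device cloud's API.
        writeOutput("running suite on " & device & chr(10));
    }
    devices = [
        "Pixel 8 / Android 14",
        "Galaxy S23 / Android 13",
        "iPhone 15 / iOS 17"
    ];
    arrayEach(devices, function(device) {
        runSuiteOn(device);
    }, true, 3); // parallel = true, at most 3 concurrent runs
    </cfscript>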

Testing Strategy vs Product Maturity

The principles we covered should broadly work for everyone, but the truth is that your specific approach to fragmentation should also change over time, evolving alongside your product and team journey.

Obviously, applying an enterprise level of testing rigor to a pre-PMF product is a waste of resources, just as neglecting deeper testing once you have scale is a recipe for disaster.

So let's try mapping your fragmentation strategy to the typical QA / product journey stages we have discussed in the past.

Your fragmentation strategy should evolve alongside your product

There is a lot of nuance and “your mileage may vary” here, but let’s sketch a basic cheatsheet:

1) Zero-to-One

  • Focus — Speed, iteration, and validating core hypotheses.
  • Fragmentation Approach — Minimal and highly pragmatic.
    • Device "Matrix" — Likely just the founders' phones, maybe a couple of common emulators/simulators for basic layout checks. A formal matrix is overkill.
    • Testing — Primarily manual, "happy path" testing on these few devices. Does the core loop work? Can users sign up and perform the one key action?
    • Automation — Probably none, or at most, some very basic UI smoke tests if the team has a strong existing preference.
    • Risk Tolerance — Very high. Bugs are expected. The bigger risk is building the wrong product, not having a perfectly polished app on every obscure device.
  • Takeaway — Don't let fragmentation concerns prematurely slow you down. Focus on finding PMF.

2) Finding PMF / Early Growth

  • Focus — Stabilizing core features, growing the user base, and starting to understand user segments.
  • Fragmentation Approach — Begin to formalize, driven by initial user data.
    • Device Matrix — Start tracking user analytics (even basic ones from e.g. Firebase or your app store consoles). Identify your top 5-10 devices/OS versions. This forms your rudimentary, evolving matrix.
    • Testing — Still heavily manual, but more structured. Test critical user flows on your identified key devices. Introduce more thorough exploratory testing.
    • Automation — Consider introducing UI automation for 1-2 absolute critical paths (e.g., signup, core purchase flow) if you have the expertise. Keep it lean.
    • Tools — This is where you might start dipping your toes into cloud device services for occasional checks on devices you don't own, especially if user feedback points to issues on specific models.
    • Risk Tolerance — Medium. Core functionality on popular devices needs to be solid. You can still live with some rough edges on less common configurations.
  • Takeaway — Use early data to guide a Minimum Viable Testing process for fragmentation.

3) Scaling / Established Product

  • Focus — Reliability, performance at scale, expanding feature sets, and protecting brand reputation.
  • Fragmentation Approach — Strategic, data-driven, and increasingly automated.
    • Device Matrix — A well-defined, multi-tiered matrix (as discussed in "First Principles") is essential, constantly updated with fresh user analytics and market data.
    • Testing — A more sophisticated mix:
      • Manual — Focused exploratory testing, usability checks, and testing new features on key devices.
      • Automated — Significant investment in UI automation for regression testing across Tier 1 and critical Tier 2 devices, running in CI/CD.
    • Tools — Heavy reliance on cloud device farms (like BrowserStack) for comprehensive automated and manual testing coverage across the matrix. You might also maintain a small, curated in-house lab for frequently used dev/QA devices. This balance can also shift over time.
    • Performance Monitoring — Actively monitor performance and stability across different device segments in production.
    • Risk Tolerance — Low to Very Low for Tier 1 devices and critical functionality. Higher for Tier 3.
  • Takeaway — Your fragmentation strategy is now a core part of your quality engineering process, deeply integrated and data-informed.

The Future is Still Fragmented

So, will fragmentation ever end?

Probably not, at least not anytime soon. While there are glimmers of hope (Google's Project Mainline, for instance), the fundamental drivers of diversity remain.

Hardware innovation in smartphones might be questionable today, but new form factors emerge all the time (foldables are already here, wearables are well established, and AR/VR may be on the horizon), and OS customizations persist.

What about AI?

There's certainly potential for AI to assist with this. AI may write test cases, better simulate E2E flows, and even predict high-risk device/OS combinations based on code changes.

However, the core challenge of executing tests across diverse hardware remains, and AI is not a silver bullet.

The reality is that the right mindset and an intentional strategy (data-driven device matrix, smart testing mix, strong foundational quality, targeted automation) remain your most crucial assets for navigating the mobile landscape.

The landscape will shift, but the principles of smart, risk-based testing will endure.

Bottom line

And that’s it for today! Remember that navigating the fragmented world of mobile devices is a marathon, not a sprint. Here are some takeaways from today’s guide:

  • Let data drive your device matrix — your actual user analytics are the most reliable guide for deciding which devices and OS versions deserve your primary testing focus. Don't guess; know.
  • Embrace tiered, risk-based testing — not all devices or bugs are created equal. Prioritize ruthlessly, focusing maximum effort on high-impact areas and accepting calculated risks elsewhere.
  • Blend your testing mix wisely — combine emulators (for speed), a curated in-house lab (for frequent access/fidelity), and cloud device farms like BrowserStack (for breadth, scale, and specialized needs).
  • Build on foundational quality — strong typing, linting, robust unit/integration tests, and good architectural choices significantly reduce the burden on expensive end-to-end device testing.
  • Automate strategically, not exhaustively — focus UI automation on stable, critical user flows on your most important devices to reduce toil and get fast feedback, leveraging parallel execution in the cloud.
  • Evolve your strategy with maturity — the right level of testing rigor changes as your product grows from pre-PMF to scale. Continuously adapt your approach.

Final note: I want to thank Nakul for joining in on this. I am a fan of what BrowserStack is building for mobile testing and AI-powered workflows.

A world model at the core of Pipi 9

Mike's Notes

Here are some quick notes for Alex explaining how the first 19" server rack, which is currently being built and commissioned, hosts internal models. As the number of racks grows, more models can be hosted.

This is highly experimental. I shall report on results.

Repository

  • Home > Ajabbi Research > Library >
  • Home > Handbook > 
  • Home > pipiWiki > Industries

Last Updated

16/05/2025

A world model at the core of Pipi 9

By: Mike Peters
On a Sandy Beach: 13/05/2025

Mike is the inventor and architect of Pipi and the founder of Ajabbi.

The problem

The world is not a simple place. Any large, complex system becomes dynamic, has emergent properties, and has unintended consequences. Such systems are hell to build and update.

Large IT projects:

  • 15% succeed
  • 25% make no difference
  • 65% fail

The annual global cost of new IT system failure is USD $3.5 trillion (roughly the size of France's annual GDP) — IT Revolution.

Solution

Just copy how nature does it. Every living biological cell is incredibly complex and self-managing in response to its environment.

Pipi 9

Pipi maintains many models of the world at all levels of reality.

Autonomous agents

Each Pipi instance contains and is driven by autonomous agents.

Swarm intelligence

Each model is curated by an instance of Pipi, which is adaptive in response. Each instance is uniquely internally named in ASCII after one of the old deities from human history, e.g., "Loki," "Thor," or "Niltsi." This will be easier than remembering internal IP addresses. If the list of available names runs out, then the names of gods from works of fiction could be added, e.g., The Lord of the Rings, Game of Thrones, etc. If even more names are needed, there are always ice cream flavours, rose varieties, racehorse names, etc.
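
A tiny sketch of how such a name pool might be drawn down (illustrative only; the extra names and the exhaustion behaviour are assumptions, not Pipi's actual code):

    <cfscript>
    // Hypothetical pool of deity names for naming Pipi instances.
    namePool = ["Loki", "Thor", "Niltsi", "Anansi", "Rangi"];
    assigned = {};
    function nextInstanceName() {
        for (candidate in namePool) {
            if (!structKeyExists(assigned, candidate)) {
                assigned[candidate] = true;
                return candidate;
            }
        }
        throw(message="Name pool exhausted; extend it with names from fiction, ice cream flavours, etc.");
    }
    writeOutput(nextInstanceName()); // Loki
    </cfscript>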

Pipi 9 instances function as swarms, similar to a flock of birds. There can be many swarms, and the swarms can also interact.

World model

The world model contains the laws of physics, properties, and other objective constraints of the real world.

Other models

Models can automatically import human-derived ontologies and standards. Only humans apply the rules and parameters. The models are stateful and keep their histories. The connections between models are fluid and dynamic. Some multiple inheritance is also present, which is too difficult to show in this simplified outline.

Self documentation

All assumptions and configurations will eventually be openly viewable on pipiWiki, GitHub, via API, etc. This is a work in progress as I sort out the self-documentation generated by Pipi 9. The problem is mainly how to automatically present a lot of versioned information in multiple languages on web pages, with the required navigation.

Rendering

Once the first server rack (completely isolated from the internet) is operational, documentation rendered in static HTML can be scheduled, copied to a memory stick (analogue security barrier), and then uploaded by FTP to the pipiWiki website. If I remember to do it, it will be monthly at first. Later, I will automate every process and build a team to help with the work.

How it might look

Below, a crude, simplified outline shows examples of the relationships between internal models in Pipi 9; a small code sketch follows the outline. There is a "kind of" hierarchy of models, with updates from above and bubble-up learning from below, all automated.

  • World Model
    • Aviation Model
    • Education Model
    • Health Model
      • UK Health Model
        • UK NHS Model
          • Birmingham Children's Hospital Model
          • Birmingham Women's Hospital Model
        • Circle Health Group Model
          • Albyn Hospital Model
    • Rail Model
    • Sewerage Model
    • Stormwater Model
    • etc
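
A speculative CFML sketch of that outline, structs only for brevity (Pipi's real models are far richer and include the multiple inheritance noted above); parent models would push constraints down while children bubble learning up:

    <cfscript>
    // Illustrative only: a "kind of" hierarchy of models.
    // Field names ("constraints", "learned") are hypothetical.
    function makeModel(required string name) {
        return { "name": name, "children": [], "constraints": {}, "learned": [] };
    }
    function addChild(required struct parent, required string name) {
        var child = makeModel(name);
        arrayAppend(parent.children, child);
        return child;
    }
    world    = makeModel("World Model");
    health   = addChild(world, "Health Model");
    ukHealth = addChild(health, "UK Health Model");
    nhs      = addChild(ukHealth, "UK NHS Model");
    addChild(nhs, "Birmingham Children's Hospital Model");
    </cfscript>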

Summary

The whole setup is designed to be fully automated, autonomous, fluid, dynamic, and ever-evolving, but also lawful, unlike LLMs. This is subject to experimentation and change, and there is much more detail to come, but you get the rough idea.

Accountability Sinks

Mike's Notes

An excellent article by Martin Sustrik on why accountability is lacking, with lots of examples. A long read, with lots of quotes to follow up on.

Repository

  • Home > Ajabbi Research > Library >
  • Home > Handbook > Teams > Culture

Last Updated

13/05/2025

Accountability Sinks

By: Martin Sustrik
250bpm: 22/04/2025

Accountability engineering in institution design

Back in the 1990s, ground squirrels were briefly fashionable pets, but their popularity came to an abrupt end after an incident at Schiphol Airport on the outskirts of Amsterdam. In April 1999, a cargo of 440 of the rodents arrived on a KLM flight from Beijing, without the necessary import papers. Because of this, they could not be forwarded on to the customer in Athens. But nobody was able to correct the error and send them back either. What could be done with them? It’s hard to think there wasn’t a better solution than the one that was carried out; faced with the paperwork issue, airport staff threw all 440 squirrels into an industrial shredder.

[...]

It turned out that the order to destroy the squirrels had come from the Dutch government’s Department of Agriculture, Environment Management and Fishing. However, KLM’s management, with the benefit of hindsight, said that ‘this order, in this form and without feasible alternatives,* was unethical’. The employees had acted ‘formally correctly’ by obeying the order, but KLM acknowledged that they had made an ‘assessment mistake’ in doing so. The company’s board expressed ‘sincere regret’ for the way things had turned out, and there’s no reason to doubt their sincerity.

[...]

In so far as it is possible to reconstruct the reasoning, it was presumed that the destruction of living creatures would be rare, more used as a threat to encourage people to take care over their paperwork rather than something that would happen to hundreds of significantly larger mammals than the newborn chicks for which the shredder had been designed.

The characterisation of the employees’ decision as an ‘assessment mistake’ is revealing; in retrospect, the only safeguard in this system was the nebulous expectation that the people tasked with disposing of the animals might decide to disobey direct instructions if the consequences of following them looked sufficiently grotesque. It’s doubtful whether it had ever been communicated to them that they were meant to be second-guessing their instructions on ethical grounds; most of the time, people who work in sheds aren’t given the authority to overrule the government. In any case, it is neither psychologically plausible nor managerially realistic to expect someone to follow orders 99 per cent of the time and then suddenly act independently on the hundredth instance.

— Dan Davies: The Unaccountability Machine

***

Someone – an airline gate attendant, for example – tells you some bad news; perhaps you’ve been bumped from the flight in favour of someone with more frequent flyer points. You start to complain and point out how much you paid for your ticket, but you’re brought up short by the undeniable fact that the gate attendant can’t do anything about it. You ask to speak to someone who can do something about it, but you’re told that’s not company policy.

The unsettling thing about this conversation is that you progressively realise that the human being you are speaking to is only allowed to follow a set of processes and rules that pass on decisions made at a higher level of the corporate hierarchy. It’s often a frustrating experience; you want to get angry, but you can’t really blame the person you’re talking to. Somehow, the airline has constructed a state of affairs where it can speak to you with the anonymous voice of an amorphous corporation, but you have to talk back to it as if it were a person like yourself.

Bad people react to this by getting angry at the gate attendant; good people walk away stewing with thwarted rage.

— ibid.

***

A credit company used to issue plastic cards to its clients, allowing them to make purchases. Each card had the client’s name printed on it.

Eventually, employees noticed a problem: The card design only allowed for 24 characters, but some applicants had names longer than that. They raised the issue with the business team.

The answer they got was that since only a tiny percentage of people have names that long, rather than redesigning the card, those applications would simply be rejected.

You may be in perfectly good standing, but you'll never get the card. And you are not even told why. There's nobody accountable and there's nobody to complain to. A technical dysfunction got papered over with process.

***

Holocaust researchers keep stressing one point: the large-scale genocide was possible only by turning popular hatred, which would otherwise have discharged in a few pogroms, into a formalized administrative process.

For example, separating the Jews from the rest of the population and concentrating them at one place was a crucial step on the way to the extermination.

In Bulgaria, Jews weren't gathered in ghettos or local "labor camps", but rather sent out to rural areas to help at farms. Once they were dispersed throughout the country, there was no way to proceed with the subsequent steps, such as loading them on trains and sending them to the concentration camps.

Concentrating the Jews was thus crucial to the success of the genocide. Yet the bureaucrats working on the task didn't feel like they were personally killing anybody. They were just doing their everyday, boring, administrative work.

The point is made more salient when reading about the Nuremberg trials. Apparently, nobody was responsible for anything. Everybody was just following orders.

To be fair, the accused often acted on their own rather than following orders. And it turns out that German soldiers faced surprisingly mild consequences for disobeying unlawful orders. So it's not like they would have been severely hurt if they had just walked away or even tried to mitigate the damage.

Yet the vague feeling of arbitrariness about the Nuremberg trials persists. Why blame these guys and not the others? There were literally hundreds of thousands of people involved in implementing the final solution. The feeling gets even worse when contemplating the German denazification trials of the 1960s:

Today they bring even 96-year-olds or even centenarians to justice just because they worked somewhere in an office or sat in a watchtower. But if it had been done like that in 1965, more than 300,000 German men and women would have had to be imprisoned for life for aiding and abetting murder. [...] It had to be kept under control and suppressed, because otherwise it would have been impossible to start anew. [...] There was a secretary at the Wannsee Conference and no one considered putting her in jail. And yet all these terrible orders were not typed by those who issued them. They were typed by women.

— Interview with historian Götz Aly (in German)

***

At first glance, this might seem like a problem unique to large organizations. But it doesn't take a massive bureaucracy to end up in the same kind of dysfunction. You can get there with just two people in an informal setting.

Say a couple perpetually quarrels about who's going to do the dishes. To prevent further squabbles, they decide to split the chores on a weekly, alternating basis.

Everything works well, until one of the spouses falls ill. The dishes pile up in the kitchen sink, but the other spouse does not feel responsible for the mess. It’s not their turn. And yes, there's nobody to blame.

***

This is what Dan Davies, in his book The Unaccountability Machine: Why Big Systems Make Terrible Decisions — and How the World Lost Its Mind, calls "accountability sinks".

Something that used to be a matter of human judgement gets replaced by a formal process. Suddenly, nobody makes any deliberate decisions. Instead, a formal process is followed. In fact, the process may be executed on a computer, with no human involvement at all. There's nobody to blame for anything. There’s nobody to complain to when the process goes awry. Everybody’s ass is safely covered.

In any organization, incentives to replace human judgement with process are strong. Decisions may be controversial. More often than not, someone's interests are at play, and any decision is going to cause at least some bitterness, resentment, and pushback. Introducing a non-personal process allows the decision to be made in an automated manner, without anyone being responsible, without anyone having to feel guilty for hurting other people.

The most important contribution of Davies' book is simply giving this phenomenon, which we all know and love to hate, a name.

And a name, at that, which highlights the crux of the matter, the crux that all too often goes unsaid: formal processes may be great for improving efficiency. The decision is made once and then applied over and over again, capturing economies of scale.

Formal processes may improve safety, as when a pilot goes over a checklist before takeoff. Are flight controls free and correct? Are all the doors and windows locked? And so on, and so forth.

Processes also serve as repositories of institutional memory. They carry lessons from countless past cases, often unknown to the person following the process. Problems once encountered, solutions once devised, all of that is encoded into a set of steps.

But in the end, deep in the heart of any bureaucracy, the process is about responsibility and the ways to avoid it. It's not an efficiency measure; it's an accountability management technique.

***

Once you grasp the idea, it’s hard to unsee it. There's a strange mix of enlightenment and dread. Of course it's about accountability sinks!

All of the popular discontent in the West today is fueled by exactly this: A growing rage at being trapped in systems that treat people not as humans, but as cogs in a machine. Processes, not people, make the decisions. And if the process fails you, there's no one to turn to, no one to explain and no one to take responsibility.

This is why even the well-off feel anxious and restless. We may have democracy by name, but if the systems we interact with, be it the state or private companies, surrender accountability to desiccated, inhuman processes and give us no recourse, then democracy is just a hollow concept with no inner meaning.

You can't steer your own life anymore. The pursuit of happiness is dead. Even your past achievements can be taken away from you by some faceless process. And when that happens, there’s no recourse. The future, in this light, begins to feel less hopeful and more ominous.

It’s eerie how much of today’s political unrest begins to make sense through this lens.

The backlash against experts? Understandable. After all, they’re seen as the architects of these inhuman systems. The skepticism toward judges? It fits. They often seem more devoted to procedure than to justice. Even the growing tolerance for corruption starts to look different. Yes, it’s bad, but at least it’s human. A decision made by someone with a name and a face. Someone you might persuade, or pressure, or hold to account. A real person, not an algorithm.

And every time you’re stuck on the phone, trapped in an automated loop, listening to The Entertainer for the hundredth time or navigating endless menus while desperately trying to reach a human being who could actually help, the sense of doom grows a bit stronger.

***

But let's not get carried away.

First, as already mentioned, formal processes are, more often than not, good and useful. They increase efficiency and safety. They act as a store of tacit organizational knowledge. Getting rid of them would make society collapse.

Second, limiting accountability is often exactly the thing you want.

Take the institution of academic tenure. By making a scientist essentially unfireable, it grants them the freedom to pursue any line of research, no matter how risky or unconventional. They don’t need to justify their work to college administrators, deliver tangible results on a schedule, or apologize for failure.

The same pattern emerges when looking at successful research institutions such as Xerox PARC, Bell Labs, or DARPA. Time and again, you find a crucial figure in the background: A manager who deliberately shielded researchers from demands for immediate utility, from bureaucratic oversight, and from the constant need to justify their work to higher-ups.

Yet again, the venture capital model of funding new companies relies on relaxed accountability for startup founders. The founders are expected to try to succeed, but nobody holds them accountable if they do not. The risk is already priced in by the VC. Similar to the role of the Bell Labs manager, the VC firm acts as an accountability sink between the owners of the capital and the startup founders.

***

On October 1st, 2017, a hospital emergency department in Las Vegas faced a crisis: a mass shooting at a concert sent hundreds of people with gunshot wounds flooding into the ER at once. The staff handled the emergency remarkably well, violating all the established rules and processes along the way:

At that point, one of the nurses came running out into the ambulance bay and just yelled, “Menes! You need to get inside! They’re getting behind!” I turned to Deb Bowerman, the RN who had been with me triaging and said, “You saw what I’ve been doing. Put these people in the right places.” She said, “I got it.” And so I turned triage over to a nurse. The textbook says that triage should be run by the most experienced doctor, but at that point what else could we do?

Up until then, the nurses would go over to the Pyxis, put their finger on the scanner, and we would wait. Right then, I realized a flow issue. I needed these medications now. I turned to our ED pharmacist and asked for every vial of etomidate and succinylcholine in the hospital. I told one of the trauma nurses that we need every unit of O negative up here now. The blood bank gave us every unit they had. In order to increase the flow through the resuscitation process, nurses had Etomidate, Succinylcholine, and units of O negative in their pockets or nearby.

Around that time the respiratory therapist, said, “Menes, we don’t have any more ventilators.” I said, “It’s fine,” and requested some Y tubing. Dr. Greg Neyman, a resident a year ahead of me in residency, had done a study on the use of ventilators in a mass casualty situation. What he came up with was that if you have two people who are roughly the same size and tidal volume, you can just double the tidal volume and stick them on Y tubing on one ventilator.

— How One Las Vegas ED Saved Hundreds of Lives After the Worst Mass Shooting in U.S. History

As one of the commenters noted: "Amazing! The guy broke every possible rule. If he wasn't a fucking hero, he would be fired on the spot."

***

Once, I worked as an SRE for Gmail. SREs are the people responsible for keeping the site up and running. If there's a problem, you get alerted, and it's up to you to fix it, whatever it takes.

What surprised me the most when I joined the team was the lack of enforced processes. The SREs were accountable to the higher-ups for the service being up, but other than that, they were not expected to follow any prescribed process while dealing with outages.

Yes, we used a lot of processes. But these were processes we chose for ourselves, more like guidelines or recommendations than hard-and-fast rules. And in the aftermath of an outage, adjusting the process became as much a part of the response as fixing the software itself.

There is even an explicit rule limiting the accountability of individual SREs. The postmortem, i.e. the report about an outage, should be written in a specific way:

Blameless postmortems are a tenet of SRE culture. For a postmortem to be truly blameless, it must focus on identifying the contributing causes of the incident without indicting any individual or team for bad or inappropriate behavior. A blamelessly written postmortem assumes that everyone involved in an incident had good intentions and did the right thing with the information they had. If a culture of finger pointing and shaming individuals or teams for doing the "wrong" thing prevails, people will not bring issues to light for fear of punishment.

— Site Reliability Engineering, How Google runs production systems

***

On February 1, 1991, a jetliner crashed into a smaller commuter propeller plane on the runway at Los Angeles International Airport. The impact killed twelve people instantly. The jetliner pushed the smaller plane ahead of it at a speed of one hundred and fifty kilometers per hour; the fuel tank exploded and the wreckage caught fire. Passengers tried to get out of the burning plane, but not everyone succeeded. The death toll eventually rose to thirty-five.

Here's a great article from Asterisk about the event:

At the LAX control tower, local controller Robin Lee Wascher was taken off duty — as is standard practice after a crash. After hearing about the propeller, she knew she must have cleared USAir flight 1493 to land on an occupied runway. As tower supervisors searched for any sign of a missing commuter flight, Wascher left the room. Replaying the events in her mind, she realized that the missing plane was SkyWest flight 5569, a 19-seat Fairchild Metroliner twin turboprop bound for Palmdale. Several minutes before clearing the USAir jet to land, she had told flight 5569 to “taxi into position and hold” on runway 24L. But she could not recall having cleared it for takeoff. The plane was probably still sitting “in position” on the runway waiting for her instructions when the USAir 737 plowed into it from behind. It was a devastating realization, but an important one, so in an act of great bravery, she returned to the tower, pointed to flight 5569, and told her supervisor, “This is what I believe USAir hit.”

[...]

The fact that Wascher made a mistake was self-evident, as was the fact that that mistake led, more or less directly, to the deaths of 35 people. The media and the public began to question the fate of Ms. Wascher. Should she be punished? Should she lose her job? Did she commit an offense?

[...]

Cutting straight to the chase, Wascher was not punished in any way. At first, after she was escorted, inconsolable, from the tower premises, her colleagues took her to a hotel and stood guard outside her room to keep the media at bay. Months later, Wascher testified at the NTSB hearings, providing a faithful and earnest recounting of the events as she recalled them. She was even given the opportunity to return to the control tower, but she declined. No one was ever charged with a crime.

[...]

If you listen to the tower tapes, you can easily identify the moment Wascher cleared two planes to use the same runway. But if you remove her from the equation, you haven’t made anything safer. That’s because there was nothing special about Wascher — she was simply an average controller with an average record, who came into work that day thinking she would safely control planes for a few hours and then go home. That’s why in interviews with national media her colleagues hammered home a fundamental truth: that what happened to her could have happened to any of them. And if that was the case, then the true cause of the disaster lay somewhere higher, with the way air traffic control was handled at LAX on a systemic level.

When you read the investigation report, your perspective changes. Suddenly, there is no evil controller Wascher who needs to be publicly punished. Instead, there is a team of controllers who, despite all the broken radars, poor visibility, and distracting duties, heroically ensure that planes do not collide day in, day out.

And if the result of Wascher giving an honest report about the incident is that a second ground radar is purchased, that the interfering lights are relocated, or that various less important auxiliary tasks are taken away from the controller in charge, the effect on air traffic safety is much bigger than anything that could be achieved by firing Wascher. Quite the opposite: punishing her would have had a chilling effect on other experienced controllers. At least some of them would become unwilling to take responsibility for things beyond their control and would eventually leave, only to be replaced by less experienced ones.

***

Some of my former colleagues at Google were part of the effort to save Obamacare after the website people were supposed to use to enroll in the program turned out not to be working. Here's Wikipedia:

The HealthCare.gov website was launched on the scheduled date of October 1, 2013. Although the government shutdown began on the same day, HealthCare.gov was one of the federal government websites that remained open through the events. Although it appeared to be up and running normally, visitors quickly encountered numerous types of technical problems, and, by some estimates, only 1% of interested people were able to enroll to the site in the first week of its operations.

Suddenly, a website was at the center of attention of both the media and the administration, even the president himself. As Jennifer Pahlka writes in her excellent book Recoding America: Why Government Is Failing in the Digital Age and How We Can Do Better:

If the site failed, Obama’s signature policy would likely go down with it. With this threat looming, suddenly the most important policy of the administration would live or die by its implementation. It was the first time the highest priority of the White House was the performance of a website.

Having support from the highest places, the people in what was to become the US Digital Service outlined a plan:

He had a straightforward two-step plan. Step one was to recruit a small team of technologists with relevant experiences and skills [to fix the website]. Through a trusted network, he reached out to a set of remarkable individuals who signed up to jump into the fire with him. Step two was to win the trust of CMS—an agency [responsible for implementation of the website] that, like all other agencies, really, was highly skeptical of people from outside and resistant to their interference. The situation was tense, but Todd made it clear that he and everyone who came with him were not there to assign blame. They were there to help. The result was the opposite of the usual response to failure. Instead of the hardening that tends to come with increased oversight—of the kind Weaver would later experience working on satellite software, further limiting what he could do—the CMS team suddenly found themselves with something they hadn’t realized they needed: a group of smart nerds they could trust. Like most other agencies, they knew how to acquire technology and technology services; after all, they’d issued those sixty separate contracts for healthcare.gov alone. But now they had people on their team who could look at the code, not the contract terms. That, it turned out, made all the difference.

— ibid.

Again, sidestepping accountability had a beneficial effect. It cut through seemingly unsolvable internal hurdles and got a stuck system moving again.

***

Dominic Cummings, chief advisor to Boris Johnson during the COVID crisis, recounts:

At the peak of COVID craziness in March 2020, on the very day the PM tested positive for COVID, a bunch of people come into Number 10, sit around the table, and we have a meeting about supplies of PPE to the NHS.

They say, “None of this PPE that we’ve ordered is going to be here until the summer.”

“But the peak demand is over the next three to four weeks.”

“Sorry, Dominic, but it’s not going to be here.”

“Why not?”

“Well, because that’s how long it takes to ship from China.”

“Why are you shipping from China?”

“Well, because that’s what we always do. We ship it from China.”

But A, we need it now and B, all of the airlines are grounded. No one’s flying anything.

“So call up the airlines, tell them that we’re taking their planes, we’re flying all the planes to China, we’re picking up all our shit, we’re bringing it back here. Do that now. Do that today. Send the planes today.”

We did that. But only the Prime Minister could actually cut through all the bureaucracy and say, Ignore these EU rules on Blah. Ignore treasury guidance on Blah. Ignore this. Ignore that. “I am personally saying do this and I will accept full legal responsibility for everything.”

By taking the responsibility on himself, Johnson loosened the accountability of the civil servants and allowed them to actually solve the problem instead of staying stuck in the rigid formal process.

***

Finally, consider the mother of all accountability sinks: the free market.

We want government to enforce the rule of law, to enforce contracts, and generally to make sure that the market operates freely. But we also explicitly don't want government interfering in the day-to-day workings of the market beyond ensuring compliance with the law and collecting taxes.

Much has been written about how markets act as information-processing machines, how they gather dispersed data from across society and use it to optimize the allocation of scarce resources.

Much less is said about how the lack of accountability gives entrepreneurs the ability to take huge risks. If your company fails, the blame is yours and yours alone; no one will come after you. There's no need to play it safe.

While ignoring the law of supply and demand may have been the primary cause of the failure of communist economies, the fact that the management of every company was accountable to higher-ups, and ultimately to the communist party, must have meant that managers avoided risk at any cost. That aversion led to terrible performance in adopting new technologies, even ones discovered by their own scientists, and new business practices, and contributed to the overall economic stagnation.

***

By now, you probably have a few ideas of your own about how accountability sinks can be creatively incorporated into institutional design.

But before I wrap up, let me make a few basic observations:

  1. Formal processes are mostly beneficial and they’re not going anywhere. Any complex modern society would collapse without them.
  2. Not every formal process is an accountability sink. A process you design and impose upon yourself doesn't absolve you of responsibility when things go wrong. You remain accountable. On the other hand, a process imposed upon you from above often incentivizes blind adherence, even when it hurts the stated goals.
  3. Not every accountability sink leads to rigidity and cover-ups. A process can be designed to shield those affected from accountability while not imposing any pre-canned solutions upon them (e.g., blameless postmortems).