AllPinouts

Mike's Notes

AllPinouts is a useful reference website I found while trying to connect some older servers to test Pipi 9.

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library >
  • Home > Handbook > 

Last Updated

01/05/2025

AllPinouts

By: Nicola Asuni
AllPinouts: 20/04/2025

AllPinouts is a web-based, free-content project that lists the pinouts of cables and connectors.

Pinout (or pin-out) is a term used in electronics to describe how an electrical cable is wired, or the function of each wire (pin) in a connector.

An electrical connector typically consists of several electrical contacts or pins that can be used to carry electrical power or signals. Due to the wide variety of applications and manufacturers, a wide selection of electrical connectors exists with different types and numbers of contacts or pins. The pinout of a connector identifies each individual pin, which is critical when creating cable assemblies and adapters. Proper identification of pins and wires ensures that signals and power are transmitted correctly across cables and connectors.

The archive is organized into the following categories:

  • Cables
  • Connectors

The Most Popular Deployment Strategies Explained.

Mike's Notes

Another great visual explanation from Nikki Siapno, who writes the weekly Level Up Coding newsletter. It's worth subscribing to.

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library > Subscriptions > Level Up Coding
  • Home > Handbook > 

Last Updated

30/04/2025

The Most Popular Deployment Strategies Explained.

By: Nikki Siapno
Level Up Coding: 27/04/2025

Nikki Siapno is an Engineering Manager at Canva and the founder of Level Up Coding.

Using the right deployment strategy is crucial for seamlessly integrating new features and updates. It reduces the risks, avoids interruptions, and delivers a smooth user experience.

There are many possible approaches; let's look at five of the most popular deployment patterns:

Blue/Green Deployment

Renowned for zero downtime, this method uses two environments, Blue and Green. One hosts the live version while the other tests the new version.

After comprehensive testing without affecting live traffic, users are transitioned to the updated environment.

If an issue is discovered after switching environments, it is relatively easy to switch back.

The main challenge is the cost and complexity of managing two environments.
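
As a rough sketch of the switching mechanic (not tied to any particular platform), the cutover can be modelled as a single "live" pointer that flips between two otherwise identical environments, which is also what makes rollback cheap; the environment names and URLs below are placeholders:

# Minimal blue/green switch sketch; environment URLs are placeholders.
environments = {
    "blue": "https://blue.internal.example",
    "green": "https://green.internal.example",
}
live = "blue"  # environment currently serving users

def switch_to(target: str) -> str:
    """Point live traffic at the environment that has passed testing; return the old one."""
    global live
    assert target in environments
    previous, live = live, target
    return previous

previous = switch_to("green")  # cut over to the new version
# If an issue is discovered after the switch, rolling back is one call:
# switch_to(previous)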

Canary Deployment

Named after the canary birds once used in mines, this method starts by rolling out changes to a small subset of users.

This allows for monitoring performance and gathering feedback.

If successful, you gradually extend the update to more users.

It excels at minimizing user impact during updates because any issues are isolated to a small set of users.
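
One common way to pick that subset, sketched below under the assumption that users have stable IDs, is to hash each ID into a fixed bucket so the same user always sees the same version, then widen the percentage as monitoring stays healthy (the function name and percentages are illustrative):

import hashlib

def in_canary(user_id: str, percent: int) -> bool:
    """Deterministically place a user in the canary group based on a hash of their ID."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < percent

# Start with 5% of users, then gradually raise the percentage as monitoring looks healthy.
version = "new" if in_canary("user-42", percent=5) else "stable"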

Rolling Deployment

A rolling deployment updates software in phases, rather than all at once.

It incrementally upgrades different segments of the system, ensuring most of it remains operational during the deployment.

It can be ideal for critical systems that require continuous operation.

However, it extends the total update time and might introduce temporary inconsistencies.
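
A sketch of the control loop behind this, with hypothetical update_server and healthy helpers standing in for the real update and health-check mechanisms; the key point is that only one small batch is out of service at a time:

import time

servers = ["app-01", "app-02", "app-03", "app-04", "app-05", "app-06"]
BATCH_SIZE = 2  # how many servers are taken out of rotation at once

def update_server(name: str):   # placeholder for the real update mechanism
    print(f"updating {name}")

def healthy(name: str) -> bool:  # placeholder health check
    return True

for i in range(0, len(servers), BATCH_SIZE):
    batch = servers[i:i + BATCH_SIZE]
    for s in batch:
        update_server(s)
    # Wait until the batch passes health checks before moving on,
    # so most of the fleet keeps serving traffic throughout the rollout.
    while not all(healthy(s) for s in batch):
        time.sleep(5)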

Feature Toggles

Think of feature toggles as on-off switches for new features.

They allow teams to deploy features quietly, turning them on for specific users when it makes sense.

Feature toggles support strategies like canary releases and A/B testing.

The challenge lies in managing numerous toggles, which can become complex and risk feature conflicts.
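
In the simplest form a toggle is just a named flag checked at runtime; production systems usually keep flags in a config service, but an in-memory sketch shows the idea (the flag and user names are made up):

# In-memory toggle registry; real systems would load this from config or a flag service.
feature_flags = {
    "new_checkout": {"enabled": True, "allowed_users": {"alice", "beta-testers"}},
    "dark_mode": {"enabled": False, "allowed_users": set()},
}

def is_enabled(flag: str, user: str) -> bool:
    f = feature_flags.get(flag)
    # Enabled for everyone if no allow-list, otherwise only for listed users.
    return bool(f and f["enabled"] and (not f["allowed_users"] or user in f["allowed_users"]))

if is_enabled("new_checkout", user="alice"):
    pass  # run the new code path
else:
    pass  # fall back to the existing behaviour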

A/B Testing

Comparable to a scientific experiment, A/B testing offers two variations of a feature to different user groups to gauge which performs better.

It's a go-to for validating user preferences and the effectiveness of new features, based on concrete data like user engagement or ease of use.
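
Mechanically this looks much like a canary split, but the goal is measurement; below is a minimal sketch that assigns each user a stable variant and tallies a conversion metric per variant (the variant and metric names are illustrative):

import hashlib
from collections import Counter

conversions = Counter()

def variant_for(user_id: str) -> str:
    """Split users roughly 50/50 between variants A and B, stable per user."""
    return "A" if int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 2 == 0 else "B"

def record_conversion(user_id: str):
    conversions[variant_for(user_id)] += 1

# After enough traffic, compare conversions["A"] and conversions["B"]
# (ideally with a significance test) before rolling one variant out to everyone.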

Each deployment pattern stands out for specific strengths:

  • Blue/Green for safety and zero downtime.
  • Canary for controlled, low-risk rollouts.
  • Rolling for maintaining continuous operations.
  • Feature Toggles for flexible feature management.
  • A/B Testing for data-driven user insights.

The right deployment strategy varies depending on the project's needs and objectives. Recognizing these differences allows teams to choose the best approach for a successful and user-centric software release.

The Python SQL Toolkit and Object Relational Mapper

Mike's Notes

I discovered SQLAlchemy today. I copied these notes from different pages of its website as a reminder.

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library >
  • Home > Handbook > 

Last Updated

29/04/2025

The Python SQL Toolkit and Object Relational Mapper

By: Mike Peters
On a Sandy Beach: 29/04/2025

Mike is the inventor and architect of Pipi and the founder of Ajabbi.

"SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL.

It provides a full suite of well known enterprise-level persistence patterns, designed for efficient and high-performing database access, adapted into a simple and Pythonic domain language." - SQLAlchemy

Dialects

The dialect is the system SQLAlchemy uses to communicate with various types of DBAPI implementations and databases. The sections that follow contain reference documentation and notes specific to the usage of each backend, as well as notes for the various DBAPIs.

All dialects require that an appropriate DBAPI driver is installed.
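
For example, here is a minimal sketch (not taken from the SQLAlchemy documentation quoted above) showing how the dialect and DBAPI driver are both named in the URL passed to create_engine; the hostnames and credentials are placeholders:

from sqlalchemy import create_engine

# URL format: dialect+driver://user:password@host:port/database
pg_engine = create_engine("postgresql+psycopg2://user:password@localhost/mydb")

# SQLite's driver ships with Python, so no extra DBAPI install is needed.
sqlite_engine = create_engine("sqlite:///example.db")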

Included Dialects

  • PostgreSQL
  • MySQL
  • SQLite
  • Oracle
  • Microsoft SQL Server

Included, but not currently supported dialects

The following dialects have implementations within SQLAlchemy, but they are not part of continuous integration testing nor are they actively developed. These dialects may be removed in future major releases.

  • Firebird
  • Sybase

External Dialects

Currently maintained external dialect projects for SQLAlchemy include:

  • Amazon Redshift (via psycopg2): sqlalchemy-redshift
  • Apache Drill: sqlalchemy-drill
  • Apache Druid: pydruid
  • Apache Hive and Presto: PyHive
  • Apache Solr: sqlalchemy-solr
  • CockroachDB: sqlalchemy-cockroachdb
  • CrateDB: crate-python
  • EXASolution: sqlalchemy_exasol
  • Elasticsearch (readonly): elasticsearch-dbapi
  • Firebird: sqlalchemy-firebird
  • Google BigQuery: pybigquery
  • Google Sheets: gsheets
  • IBM DB2 and Informix: ibm-db-sa
  • IBM Netezza Performance Server: nzalchemy
  • Microsoft Access (via pyodbc): sqlalchemy-access
  • Microsoft SQL Server (via python-tds): sqlalchemy-tds
  • Microsoft SQL Server (via turbodbc): sqlalchemy-turbodbc
  • MonetDB: sqlalchemy-monetdb
  • SAP Hana: sqlalchemy-hana
  • SAP Sybase SQL Anywhere: sqlalchemy-sqlany
  • Snowflake: snowflake-sqlalchemy
  • Teradata Vantage: teradatasqlalchemy

8 Common Mistakes to Avoid in API Design

Mike's Notes

Another great visual explanation from Nikki Siapno, who writes the weekly Level Up Coding newsletter. It's worth subscribing to.

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library > Subscriptions > Level Up Coding
  • Home > Handbook > 

Last Updated

28/04/2025

8 Common Mistakes to Avoid in API Design

By: Nikki Siapno
Level Up Coding: 27/04/2025

Nikki Siapno is an Engineering Manager at Canva and the founder of Level Up Coding.

Clean, reliable APIs are a hallmark of strong engineering teams.

They don't happen by accident — they happen by design.

Let's walk through 8 critical mistakes to watch for when crafting APIs that stand the test of time.

1) Not Validating User Inputs

Input validation is your API’s first line of defense.

Accepting unvalidated data leaves you vulnerable to injection attacks, corrupted databases, and confusing downstream errors. Always validate inputs on the server side, not just for type and format but also for business logic. Assume nothing about client behavior.

Tip: Use schema validation libraries or frameworks to enforce strict rules at your API boundaries.
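
As a hedged sketch of that tip, using pydantic as one example of such a library (the model and field names are made up), invalid input is rejected at the boundary before it can reach business logic or the database:

from pydantic import BaseModel, ValidationError

class CreateUserRequest(BaseModel):
    name: str
    age: int

def handle_create_user(payload: dict):
    try:
        req = CreateUserRequest(**payload)  # type and format checks happen here
    except ValidationError as exc:
        return {"status": 400, "errors": exc.errors()}  # reject bad input early
    # Business-logic checks (e.g. sensible age range) still apply, then persist req.
    return {"status": 201}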

2) Weak Security

Security can't be an afterthought. Common oversights include exposing sensitive data, improperly handling authentication tokens, or missing encryption on data in transit. A single weak endpoint can compromise your entire system.

Tip: Apply principles like least privilege, secure authentication (OAuth 2.0, OpenID Connect), encrypted communication (TLS), and regular security audits across your API surface.

3) Overcomplicating the API

A powerful API isn’t necessarily a complex one.

When you overload endpoints with too many responsibilities, mix resources and actions, or create deeply nested payloads, you make your API harder to learn, harder to use, and harder to maintain.

Tip: Stick to clear, predictable patterns. One resource → one endpoint. One action → one HTTP method.

4) Lack of Clear Documentation

The best API in the world won't gain adoption without great docs.

Unclear documentation forces developers to guess how your API works — leading to mistakes, frustration, and support tickets.

Tip: Invest in easy-to-navigate documentation, examples, and usage guides. Tools like Swagger/OpenAPI and Postman can help you automatically generate high-quality docs from your API definitions.

5) Poor Error Handling

Errors are inevitable — unclear errors are unforgivable.

Vague 500 errors, inconsistent error formats, and missing error codes make debugging a nightmare for users.

Tip: Standardize your error responses. Include meaningful HTTP status codes, error messages, and even error IDs that developers can reference for troubleshooting.
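
One way to standardise this, sketched below with a field layout that is a common convention rather than a mandated format, is a small helper used by every endpoint so clients always see the same shape:

import uuid

def error_response(status: int, code: str, message: str) -> dict:
    """Build a consistent error body; the error_id can be logged server-side for correlation."""
    return {
        "status": status,
        "error": {
            "code": code,            # machine-readable, e.g. "USER_NOT_FOUND"
            "message": message,      # human-readable explanation
            "error_id": str(uuid.uuid4()),
        },
    }

# e.g. return error_response(404, "USER_NOT_FOUND", "No user with id 123")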

6) Not Rate Limiting

Without rate limiting, a single client can unintentionally (or maliciously) overwhelm your API and bring your system down. This isn’t just a problem for high-traffic APIs — any public-facing service needs basic protections against spikes.

Tip: Implement sensible rate limits (e.g., 1,000 requests/minute per API key) and return clear error messages (HTTP 429) when limits are exceeded.
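
A minimal fixed-window sketch keyed by API key (real deployments usually push this into a gateway or a shared store such as Redis); the limit mirrors the example above:

import time
from collections import defaultdict

LIMIT = 1000   # requests allowed per window, per API key
WINDOW = 60    # window length in seconds
hits = defaultdict(list)

def allow(api_key: str) -> bool:
    now = time.time()
    recent = [t for t in hits[api_key] if now - t < WINDOW]  # drop expired timestamps
    hits[api_key] = recent
    if len(recent) >= LIMIT:
        return False  # caller should respond with HTTP 429 Too Many Requests
    recent.append(now)
    return True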

7) Not Implementing Versioning

APIs evolve, but your users shouldn’t suffer every time you make a change. Without versioning, even minor updates can break client applications that depend on your previous behavior.

Tip: Plan for API versioning from day one. Common strategies include path versioning (/v1/users) or header-based versioning (Accept: application/vnd.company.v1+json).
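
Here is a sketch of path-based versioning with hand-rolled routing (the handler names are placeholders); existing clients keep calling /v1/users while /v2/users is free to change its response shape:

def get_users_v1():
    return {"users": []}              # original response shape, kept stable for old clients

def get_users_v2():
    return {"data": {"users": []}}    # newer shape, breaking for v1 clients

ROUTES = {
    ("GET", "/v1/users"): get_users_v1,
    ("GET", "/v2/users"): get_users_v2,
}

def dispatch(method: str, path: str):
    handler = ROUTES.get((method, path))
    return handler() if handler else {"status": 404}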

8) Ignoring Performance Optimization

An API might work fine at low scale, but poor performance can quietly become a huge bottleneck as adoption grows. Inefficient queries, redundant data transfers, and lack of caching can grind systems to a halt.

Tip: Monitor and optimize your most critical endpoints. Techniques like pagination, compression, selective fields (fields=id,name), and caching (ETags, HTTP caching) make a big difference.
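
As a sketch of two of those techniques together, pagination plus an ETag computed over the response body lets a client skip re-downloading a page that has not changed (parameter names are illustrative):

import hashlib
import json

def paginate(items, page: int = 1, per_page: int = 50):
    start = (page - 1) * per_page
    body = {"page": page, "per_page": per_page, "items": items[start:start + per_page]}
    # Hash the serialized body to produce an ETag for conditional requests.
    etag = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return body, etag

# If the client sends If-None-Match equal to the ETag, respond with 304 Not Modified
# instead of resending the payload.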

Final Thoughts

Your API is a product — even if it’s "internal." Treat it like one.

Clear contracts, secure practices, thoughtful UX, and strong documentation aren’t just "nice to have"; they determine whether developers love integrating with your systems or dread it.

Every decision you make at the design stage pays compounding dividends later. Start strong, and you’ll save yourself (and your users) a lot of pain.

Data Centre options for Pipi 9

Mike's Notes

I'm writing down my evolving thoughts about a future Pipi 9 data centre. This is to enable suppliers, Alex, and others to give me feedback.

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library > 
  • Home > Handbook > Company > Values

Last Updated

27/04/2025

Data Centre options for Pipi 9

By: Mike Peters
On a Sandy Beach: 26/04/2025

Mike is the inventor and architect of Pipi and the founder of Ajabbi.

Pipi 4 in 2007

Pipi 4 used three ModemPak 45U rack cabinets to mount the servers. There was a separate 6 kW UPS. Here is a grainy photo from 2007, when it hosted the 17th most popular website in NZ, which received 30,000 unique visitors and 750,000 page views per month. It included servers for ESRI GIS, databases, applications, web hosting, updates, backups, security, rendering and more, with a lot of redundancy. It was never hacked, thanks to the platform team led by Dave Evans. A report at the time to the Department of Conservation (DOC), written by Turing Solutions, a New Zealand government contractor, stated that Pipi 4 was worth NZ$3 million.

The first Pipi 9 rack cabinet

I am building out the first 45U rack cabinet. It was previously used for Pipi 4 and had been in storage for 10 years. I'm having to replace the keyed barrel locks and get some Rackstuds.

This temporary rack will be equipped with older servers for testing Pipi 9 and to start rendering open-source modular applications to send to GitHub. Older servers work fine, but the rendering takes longer.

I will then replace these servers over the coming months with better second-hand ones to speed up the rendering.

I'm really enjoying this job. If you've ever had Meccano, you'll understand. I also have to design and build some specialist electronic gear to put in the rack. Even more fun.

It's also a significant milestone in the Pipi 9 build. This will be sufficient until Ajabbi scales.

The first Data Centre

Once scaling begins, Pipi 9 will then need a data centre to use as a render farm to automatically create customised SaaS enterprise applications based on user requirements. The data centre will be completely isolated from the internet to maximise security. It can be expanded in stages if it is planned appropriately.

Each industry and each enterprise customer will get a dedicated server to store a mirrored copy of their deployment configuration and parameters, including localisation. No user data will be stored.

This enables customised updates to be created; for example, importing the latest version of SNOMED CT or IATA.

Colocated Staging Data Centre

The built application then needs to be copied to disk and deployed to a small, caged, colocated data centre with high security, reliability, and connectivity as a staging point for worldwide deployment. One option could be to use the proposed DataGrid Data Centre, set to open in Southland, NZ, by 2028, with high-speed undersea cables.

Pipi 9 is designed to function as a swarm, with complete self-managing automation at every stage.

The emphasis is on 100% reliability and security, with lots of redundancy. See the Ajabbi Handbook reference for more initial information.

Production in the Cloud

The built application then needs to be deployed to a customer-chosen cloud provider so that the customer can use the enterprise application in production. Regular updates and system changes would be rapidly provided by Pipi 9 in hours, not weeks or months.

Data Centre equipment

I really like the wide range of cabinet options that Rittal provides. My preference would be to use a standard 800 mm wide 19-inch rack that could be flexibly used for:

  • Power
  • Cooling
  • Servers
  • Networking
  • Rack automation and monitoring

Other issues:

  • Determine the correct cabinet model to use
  • Use Open19 rather than OCP
  • Use a cold aisle
  • Security doors
  • 24-hour monitored access

Looking good

I also like the look of the Eaton racks, probably because I like black racks. But maybe Rittal could provide them in black as well, on special order from their manufacturing facility across the ditch in Australia. Or else I could have them spray-painted and paint-oven-baked locally. The Data Centre needs to be cool and look cool, and I don't like messy network cabling. Hey, I'm an artist too.

No doubt these ideas will evolve.

Upcoming ColdFusion update - Pre-release announcement

Mike's Notes

An important announcement from Adobe ColdFusion pre-release.

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library > ColdFusion

Last Updated

26/04/2025

Upcoming ColdFusion update - Pre-release announcement

By: Adobe
Adobe: 25/04/2025

We’re announcing an important change in the upcoming ColdFusion update that enhances the reliability and consistency of remote function calls. 

Install the update

  • Launch the ColdFusion Administrator.
  • Select Package Manager.
  • In the Settings tab, change the update URL to the pre-release update URL. If https://www.adobe.com/go/coldfusion-pr-updates doesn't launch, use the alternative URL from the announcement as the update URL.
  • Change the package URL for your ColdFusion version.
  • Select Submit changes after updating the update and package URLs.
  • In the Packages tab, select Check for Updates. The pre-release update will be available.
  • Download or install the update.

What’s changing 

Remote CFC methods will now require strict argument matching 

Going forward, if a remote method expects certain parameters, those arguments must be explicitly declared using the cfargument tag or defined directly in the function signature. 

For example, if a remote function is defined to accept two arguments, it must only receive those two. Passing more than the declared number (for example, 10) will result in an error. This change ensures stricter method integrity and a better debugging experience. 

This behaviour change applies only to remote CFC methods.

Sample code 

test.cfc

<cffunction name="test1" output="true" returnFormat="json" access="remote">
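  <!--- Note: isAdmin is used below but never declared with cfargument; after this
        update, passing isadmin on a remote call to test1 raises an exception. --->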
  <cfscript>
   if (isAdmin)
     {
       writeoutput("Yay. You are the admin.")
     }
  </cfscript>
 </cffunction>

The call will produce an exception, because function test1 does not declare isadmin as an argument in test.cfc.

To resolve the exception, declare the argument inside the function:

<cfargument name="isadmin" type="boolean" required="true">

New JVM flag introduced

We’ve added a new JVM flag called

-Dcoldfusion.runtime.udf.matchArguments

that allows you to revert this behaviour if needed.

The flag is set to true by default, which enforces strict argument matching.

For the above example, setting

-Dcoldfusion.runtime.udf.matchArguments=false

will allow the application code to work as earlier.

For security reasons, we strongly advise against reverting to previous behaviour. Please update your code accordingly.

Fixes coming soon

We’ve also noted issues reported in previous updates. The official update, which will be released shortly, will address these.

Your feedback matters

We encourage you to test your applications with this pre-release and verify any remote CFC usage.

Please share your feedback, findings, or concerns with us by 2 May 2025. Your input will help us fine-tune the final release.

Thank you for being a valued part of the ColdFusion community!

https://forums.adobeprerelease.com/coldfusionnext/discussion/92/upcoming-coldfusion-update-pre-release-announcement

Opening the Box: Diagnosing Operating-System Task-Scheduler Behaviour on Highly Multicore Machines

Mike's Notes

Julia Lawall gave a live presentation at QCon London on April 8, 2024. The transcript was published last week and is reproduced below. The original website has more content.

See the GitLab repository for schedgraph.

"These are some tools for making graphs from scheduling traces created using the following command:"

trace-cmd record -e sched -v -e sched_stat_runtime {program to run}

Resources

References

  • Reference

Repository

  • Home > Ajabbi Research > Library > Subscriptions > InfoQ
  • Home > Ajabbi Research > Library > Linux
  • Home > pipiWiki > Engines > Conductor Engine

Last Updated

25/04/2025

Opening the Box: Diagnosing Operating-System Task-Scheduler Behaviour on Highly Multicore Machines

By: Julia Lawall
InfoQ: 15/04/2025

Julia Lawall is a senior researcher at Inria Paris. Prior to joining Inria, she completed a PhD at Indiana University and was on the faculty at the University of Copenhagen. Her work focuses on issues around the correctness and performance of operating systems. She develops and maintains the Coccinelle program transformation system.

Summary

Julia Lawall discusses methods for diagnosing operating system task scheduler behavior on highly multicore machines, crucial for performance analysis. She explains the challenges of understanding kernel-level scheduling decisions and shares custom-built tooling for visualizing and analyzing scheduler traces to identify and resolve performance regressions. Gain deep insights into OS behavior.

Transcript

Lawall: I'm going to talk about, Opening the Box: Diagnosing Operating System Task-Scheduler Behavior on Highly Multicore Machines. Basically, what our situation is, is we have our machines. They can be simplified as applications on the top, hardware on the bottom. Then, basically, the application does what the user wants. The hardware provides all the resources to make it possible. We need something in the middle to mediate between them to ensure all the interactions are safe, correct, not introducing security violations, and so on. That's the operating system. That's going to be the subject of our talk.

The operating system as a piece of software that's mediating the access to the hardware, can have an important and unexpected influence on the performance of our applications. Sometimes, we need to actually understand that. Most of the time, hopefully, we can just ignore the operating system and hope everything works for the best. Sometimes, for example, from changing from one operating system version to the next one, things might be unexpected. It might be different. Then one has to actually understand what's going on. In this talk, we're focusing on Linux, just because it's open source, available, and familiar to me. We're focusing on the task scheduler. Task scheduler is the thing that figures out what tasks should be running on what core at what time. What we're interested in is diagnosing performance regressions. Our target is large multicore machines, so server-class machines.

The Task Scheduler of the Linux kernel

The task scheduler of Linux kernel, what is its role? Basically, it has two things to do. One of them is to decide where each different task should run. This should run on core 27. This should run on core 148. Then, once those tasks have been placed on those cores, then maybe there's three different tasks on a particular core. It should decide which of those gets to run at a particular point in time. If you're familiar with Linux kernel source code, we're talking about the files core.c and fair.c. We'll think about how this task scheduler can impact the performance of applications.

Basically, the problem is that to do these things that a task scheduler needs to do, it needs to make decisions. The other problem is that actually a general-purpose operating system, such as Linux kernel, or Windows, or Mac, iOS, any of these operating systems that people are just using for doing random things, actually the operating system has absolutely no information on which it can base this decision. No real information. It can get some hints from previous behavior, but it doesn't know what's going to happen in the future.

Then the problem is that poor decisions can slow tasks down. This can have not just the short-term effect, but it can go on for some time and actually have a visible performance impact on how the application runs. This is a slide that actually we're going to come back to in the end. This is where we're going. This is an example of performance impact. This red line here is the vanilla Linux kernel. What we have on the x-axis is a bunch of runs. I've just sorted the runs by running time. The fastest runs are on the left. The slowest runs are on the right. It's not chronological. The y-axis is the running time. You can see that the running time, you can focus on the red line. That's the vanilla Linux kernel. The running time slowly increases. Some runs are a little bit faster than the others.

At the end, actually what we're concerned about is this thing over here. Things are going up sharply. Same inputs, same computation, but very different running time. These numbers here are random. How badly things go just depends on some unknown thing that happens during the execution. This is where we're going. You'll see this slide again later.

The Basic Issues for Scheduling (Work Conservation and Locality)

First, I'm going to start out, what are the basic issues for scheduling? One very important issue, which will come up over and over during the talk, is called work conservation. The basic, simple idea of work conservation, we have a task that's waking up, and where should we put it? Actually, there are four cores available. None of them are doing anything. We can put it anywhere. It doesn't really matter. There we have our task 1. We've put it on core 0. Now we have another task, task 2. It's waking up, we ask ourselves the same question, where should it go? It could go on core 1, core 2, or core 3. Those are all available. Putting it on core 0 would not be a very good choice. If we put task 2 on core 0, either it will have to wait for task 1 to finish, or maybe it will preempt task 1, and now task 1 will have to wait for task 2 to finish. That's all very unfortunate, because there's plenty of empty space over here for those other tasks to run.

The idea of work conservation is that no core should be overloaded if any core is idle. This is just like a concept that you can think about. It's not the overriding concept. There could be another good decision in certain situations. This is a concept that is important in many cases. Another concept, which is perhaps in contradiction with work conservation, is the concept of locality. Locality here, we already have task 1 running on core 0. We have the same questions again. We have a task 2, where should we put it? Now we have this little two bar thing here. It means it's a two-socket machine. We have basically two NUMA nodes. Now we might be concerned about memory accesses. We again have our situation with our three empty cores. If task 2 is sharing a lot of memory with task 1 or communicating a lot with task 1, then it could be a good idea to put task 2 on core 1.

On the other hand, if task 2 has no interaction with task 1, it might be a good idea to put it over here so that it doesn't destroy the cache of task 1. There are some different tradeoffs. Again, we have to know what's going to happen in the future. We have absolutely no idea about that information. That's what makes everything complicated.

A Challenge

Basically, we have seen these principles. If you do things right, things go well. If you do things wrong, things go badly. Overall, the task scheduler controls the application's access to the CPU, and so it controls the performance of the application. Work conservation, locality issues also, it's not like the cost of scheduling. It's going to impact all of the execution of the application after you have placed it in a particular place, and so it can have a real impact on the application performance.

Someone named Brendan Gregg, who works on a tool called perf, says we need to be able to understand the performance quickly and completely. This talk is pushing you in the direction of completely, so you're not just saying, this is faster, this is slower, I don't really know what's going on, but this is better because it's faster, something like that. Here we want to really understand what the operating system is doing and why, and how it's impacting the performance of the application. I've been working on this problem for the last year, actually, so I'm not sure if I can really say that we're doing things quickly, but at least we try to understand things completely. The problem that we have is the operating system, and the scheduler itself.

First, we don't want to look at the operating system in the first place, and then the scheduler is buried deep within that operating system, so how can we actually understand what it's doing, what decisions it's making at what point, and so on. That's what we're getting at in this talk.

Tooling

Fortunately, some help is available. There's a tool which is called trace-cmd. You run your application here, and trace-cmd record will record all the scheduling events, and then you can see precisely what's going on at every nanosecond, basically. Then you can use trace-cmd report on your trace, and then you get this output. This is obviously extremely abbreviated. You can have gigabytes of data coming out, because many things happen on your machine all the time, and this trace is going to include all of them. This trace gives you all the information. You know what command's running, what its PID is, what it's doing, what time, and so on. The trace is sorted chronologically, which is in some sense helpful, but it's hard to really trace through what's going on. Here things are happening on core 26, core 62, and so on. They don't really have anything to do with each other, so it's hard to look at this textual output and actually understand what's going on with the application.

Fortunately, there's also another tool that comes with it. This is called KernelShark. KernelShark will show you the different CPUs that are on your machine and what the CPU is doing at each unit of time. Then we have the same output we saw on the slide before when you want more detail. This is a very nice tool. You can zoom in, and then you get more information about a particular point. It's a bit frustrating to use if you have a very big machine, because here we see two cores. It's already taking up most of my slide just to see those two cores. It's a bit spacious, and it's a bit too friendly in certain directions that are not useful for all tasks. It's definitely useful for certain tasks, but maybe not for everything. Actually, in my own work, I found it useful. I wanted to have an overview of all of my cores at once to see what's going on everywhere, and so this tool was not really completely successful in that direction.

Basically, we want to be able to visualize these traces in some way, or we're going to see a bunch of different ways we can visualize the traces. We want to see all the activity and all the cores at once. We want it to scale to multiple gigabytes of data. I don't know about your applications. I don't know if that's enough to understand your applications, but that's at least enough for what I was looking into. I found it useful also to have something that would produce files that I could save, that I could compare, that I could share with my colleagues, and so I wanted to produce PDFs. Then, something that my colleagues complain about a lot is that I've totally given up on interactivity. With KernelShark, you can zoom. It's very nice. You see something immediately. With all of my tools, you have to actually run it again and say what range you want. I haven't actually found any drawing tool that really gave the interactivity and scalability up to the size of the data I was interested in.

Basically, I'm proposing four tools. This is the current state of the work. Dat2graph, it's a horizontal bar graph. It's like KernelShark. It shows you all the cores and what they're doing at any point in time. Running_waiting is a summary. How many tasks are running? How many tasks are waiting to be able to run? That gives you an idea of, what are the overloads on your machine? In some regions of your execution, are there some overloads? Stepper is like zooming in even more. It shows you all the cores, all the tasks on the cores, the command name, the PID. It's good when you know there's this problem at this particular point and you want to see exactly what's going on in the machine at that point. Finally, events, which interleaves custom events with the scheduling activity. Eventually, you'll need to make your own kernel events with your own trace_printks, basically printfs that print out the information that you want. This tool manages that and relates it to the scheduling. We'll see how all of these tools can be useful in debugging a particular performance problem. All these tools are publicly available. The URL will be at the end of the talk.

The example that we're going to look at is based on an application from the NAS benchmark suite. These are somehow high-performance computing applications. The application we're considering is BT, Block Tri-diagonal solver. The only thing that we need to know about this is it has N tasks on N cores. If we have 64 cores, we have 64 tasks. 64 tasks on 64 cores sounds like an extremely trivial scheduling problem. You just put one task on each core and let them do what they want, and then everything should be fine. Everything would be fine if the Linux kernel scheduler would actually do that. Unfortunately, we'll see that it does not. Here's a bunch of runs of the application, this is 20 runs. Again, I've sorted them by runtime. We have here the fastest run. There's a bunch of runs that are close to that. Maybe they have some other locality issues, for example.

Then you see, like I mentioned before, we have this upward trend at the far end. Most of our runs run perfectly well as we expect them to, but every now and then there's one that is much slower, and we want to know why. This dotted line compares the fastest run and the slowest run. You can see it's about a 20% overhead for the slowest runs. To start out with, we can just get to know our benchmark. This is a scientific computing application. The standard thing to do with scientific computing applications is just to pin all the threads onto the cores and let them do what they want. Our interest is to find out how the Linux kernel scheduler is going to manage these threads. We don't want to pin them, but it's interesting to compare the difference.

If we pin them, we get even more performance improvement, and we don't get this huge jump at the end. We just get a small increase. Here I have not actually pinned specific threads to specific cores. I have said the first half of the thread goes on one socket. This is a two-socket machine. Put half of them on one socket and put half of them on the other socket and let them do whatever they want beyond that. The fact that we see a difference between the execution times suggest that memory locality is actually an issue for this benchmark. That's something just like having some idea of what the benchmark does and what its issues are.

Now what we can do is we can make some runs and try to see what's going on in those runs. This is the tool dat2graph. We run the application with trace-cmd. We obtain this dat file. It's fairly big, maybe. Then we can pass the tool to dat2graph, and then we get output like this. In the x-axis, we have the execution time. In the y-axis, we have the set of cores. These lines here represent the different tasks that are running on the different cores. Basically, there is a different color for each PID, so for each different task, but there's only a finite number of colors. For example, here we have two green tasks. They're obviously not the same PID. It's a bit approximate, but it's never really been found that to be a problem in practice. Maybe we could use random numbers, but when I tried random numbers, the colors sometimes came out too close to each other anyway, and so it wasn't really beneficial.

Basically, what we see here is that these tasks run along. Sometimes here we have a task that was on this core. These two tasks have switched with each other, but there's not much other excitement going on here. Here's a somewhat slower run, not very much slower, just a little bit slower. Here is a very slow run. Before we were around 8, 8-and-a-half seconds, and now we have jumped up to over 10 seconds.

Gaps, and Their Impact

Now what we're going to do is we're going to look into, like how is this run different from the first run, and what's going on? Basically, the issue that we find is that once we start looking at these slower runs, we start to see these gaps. There's a gap here. There's a gap here. What does a gap mean? A gap means that the task was running along, and now one core is not being used at all. Using fewer cores, less CPU time available, so it seems natural that we should get slower. If we look here, this one actually has a huge gap. This is the slowest run, and it has a huge gap. These gaps seem to be important with respect to the performance. What's the impact of a gap? Maybe there's just nothing to do at those points, and in that case, the gap is just nothing to do, so we don't do anything, so it doesn't matter. You can think back to what I said previously. This is N tasks on N cores. There probably is something to do, so this is probably not our situation.

The other situation is that if some core is idle, that means that some other core is probably overloaded, and those overloaded cores are going to be running at half speed. Basically, you have two people standing in the same place, and they can't both walk forward at the same time. One of them will walk forward for a little while, and the other one will walk forward, and then another one will walk forward, and so they'll be going half as fast as all of the other tasks that can run freely by themselves. This is actually also an application that synchronizes very often. We have these two tasks that are walking forward slowly, and then we'll have all those other fast tasks, and they will be waiting for the slow ones. That will slow everybody down, and not just those two slow tasks. That can have a big performance impact.

Basically, we want to see which situation we're in. I've hinted that we're definitely in the second one, but we can see that in practice. The tool, running_waiting, is going to show us where there are overloads. Here again we have x-axis is the time. Now the y-axis is the number of threads that are either running, which are the green ones. We have a 64-core machine, and so it's not surprising that we have 64 running threads. When we are up at this line, that means that everything is as we hope it to be.

Then the red line is the total number of all of the threads, which means that the difference between the red line and the green line is the number of waiting threads. These are tasks that are waiting in the run queue of some core. They would like to run, they are able to run, but there's no space on that core for them to run. That's slowing things down. You see lots of red here. This is a graph, it's made with some graphing tool which has some resolution, and so on. What we're interested in is really the width of these lines, and since they're lines, they have a minimum width, and so this is perhaps a bit misleading. What we can do instead is we can focus on what we're really interested in, which is the application. The application threads would be the one that have been forked since the start of the application. If we just focused on the forked ones, then we have quite a different picture. This is the fast run. In the fast run, most of the time, the application is running at full speed on all of the cores, and things are as they should be.

On the other hand, this is the slow run. In the slow run, now you can clearly see that there is a problem. We saw the gap in this region here, and what we see from this particular graph is that, up to this point, we have 64 threads running. Now we have only 63 threads running at any point in time, but we have 64 threads that want to run. One of our application threads wants to run, but it's not able to run. This gap is indicating a work conservation problem. Basically, somewhere on the machine, there is an idle core. Somewhere else on the machine, there is a task that wants to use that idle core, but it's not able to get there for some reason. This is a long overload period. It's lasting from 7 seconds, that's like 3 seconds. It's basically one-third of the execution time. Things are not going as we hope they should go. This is really undesirable for the behavior of the application.

Now we can try to understand what happened, what caused this bad situation to arise? We can go back to dat2graph. Dat2graph has an option, which is color-by-command. Color-by-command colors all the tasks, not by their PID, by what command they're trying to execute. These blue lines are BT, that's the application we're interested in. At the beginning of our gap, we have a green line, and the green line is called migration. Then you have to ask yourself, so what does migration do? Migration is a task that comes from the kernel. It's a kernel thread, and its purpose is basically to take some happily running task on some CPU, yank it off that CPU, and put it some other place.

Basically, migration wakes up, it preempts the running task. Then the running task can be moved off to some other CPU, and then migration goes to sleep once its evil work has been done. Migration is used for load balancing. Now our question is, why do we need load balancing? We have N threads on N cores, the load should be balanced. The kernel should just deal with it. Clearly, the kernel was dealing with it before. We can see what's going on.

Is Load Balancing Needed?

Now we go to the stepper tool, where we see exactly what's happening at a particular point in the code. Here we have all of our cores, and here we have the running task. Here, this bar means, it separates the running task from the waiting tasks. Here we have a task that is waiting. It has just woken up, and it has not been able to run yet. It will get to run very shortly. This is the situation just before our migration tasks run. We can see down here, what is the event at the kernel level that's happening? The event, it says, sched_move_numa. This is a special kind of load balancing, which is actually not moves with respect to the load, it's with respect to the memory usage. NUMA balancing is, you study all of your tasks, and what memory are they accessing, and if you find out that a task that's running on one NUMA node is actually accessing a lot of data on another NUMA node, then you move it over there in the assumption that it will be happier.

Basically, what we want to do here is we are moving this task. This is one NUMA node, this is the other NUMA node. It's moving it from this NUMA node into this space. This looks like a really excellent idea, because now it will be over there with its own memory, and this space here looks like it's empty, so that will be a good place for it. We know we have N tasks on N cores, we have 64 of these BT threads, that space is not going to be empty very long.

There is some other BT task that was actually already planning to run on this socket and run in this place. 214 is the one that we moved, and even before this 214 task ever gets to run on core 51, the task that had wanted to be there before has woken up. Actually, here's another overloaded core, but there's another space for it, so hopefully that won't get moved over soon. This is the migration task here that will go away, then we'll have two slots, and this one will get moved in there. Basically, we are now in a situation where we have two sockets, each one has 32 cores, and this one over here has 31 threads running on it, and this one has 32 threads running on it. That's a bad thing.

Assessment

Now we understand exactly what's going on. We have our NUMA balancing, the goal is to address locality issues. Moving a task to another NUMA node, if it has a locality issue, that should improve its performance in theory. Remember, it might be important for BT, because BT, when we had everything perfectly placed according to the NUMA nodes, the performance was better. There are two types of NUMA balancing, one of them is when you switch cores, you take a thread here, a thread here, and you exchange them. That seems fine. We're going to stay with the same number of threads on each core, no work conservation issues.

The other one is the one that we actually had in our situation, which is move, and in that one you just take some thread over here and put it on the socket over here, and hope for the best. The problem is, in this case, we risk introducing an overload, and that's actually what happens in our case. We have an overload, but the scheduler is supposed to manage these kinds of issues for us. The really important question is, we have this overload, why does it not get corrected? Why does the scheduler not just realize that this is a very bad situation? It never resolves it, the application just ends. Why does it take so much time for it to figure out what to do, or why does it spend so much time not doing anything at all?

To understand that, we need to actually delve into the code, and figure out what the load balancer actually does, because that's what we hope will do the work for us. We have overload now, it's not a NUMA issue. We have overload, so load balancing should help us deal with overloads. Load balancing has several parts. First, it has a bunch of filters. The first filter is, should_we_balance? Basically, what happens with load balancing is, on every clock tick, each core gets to reflect on whether it wants to steal load from some other part of the machine. The problem is that we don't want all of those cores to be stealing tasks from other parts of the machine at the same time, because they will have the same vision of what's going on, and they will all naturally try to steal the same task. That task will have to bounce around the machine with no practical benefit. Should_we_balance addresses that issue.

Then, if we are the core that should be allowed to balance at the current moment, then we have to find who are we going to steal from. We find the busiest group, and in our situation, that's going to be the busiest socket. It should be obvious to you which is the busiest socket, it's the one with the 33 threads on it. Then we want to find the busiest queue. The core that is going to steal, it's just going to identify one other core that it should steal from, and then it will take some threads from that core. If it needs to, it can actually then go around, there's a loop, and it can steal from some other cores later. In our situation, we just have one extra thread on one socket that we will need to steal, so there's not that complexity.

Basically, we see if the current CPU is the one that should steal, and then we see if that CPU can find another CPU that it can steal from. If any of these conditions are violated, then we just abort, and then we try again in the future. If we manage to find a core that we can steal from, then we look through its different tasks and see if there's one that we can steal, or more that we can steal. If we are able to steal something, that's great, that's success. There's a bunch of heuristics here that help you try to decide which is the task that you should steal. If you're not able to steal anything, there's another kind of stealing, which is called active balance. Active balance is the last resort. Active balance is able to steal a task that's running. Otherwise, you only steal tasks that are just waiting, because they don't have any context, they're easier to steal somehow.

Custom Tracing

Basically, we have this flowchart here, and we want to understand, why do we not end up at the success places in the flowchart? Since we never manage to steal anything, nothing ever runs on this core, it seems that we always end up in the abort context. We want to figure out why does this unfortunate thing happen? All of the graphs that we have made before are based on events that exist already in the Linux kernel. In this case, we want to put in some specialized things, because we want to understand, actually, how we get through this function, and that's not something that other people have wanted to do before. Basically, we want to add custom events into the kernel. You just think about how you debug things in the same situation in user space, and you might put in prints, or you might use a debugger. It's not super easy to use a debugger at the kernel level, but you can put in prints. It's a nice command called trace_printk. It works just like ordinary printf at the user level with the same percents and formats and everything. It's very simple to use.

Then we have another tool, which is called events, which shows when these events happen and how that relates to the other scheduling activities. I've put trace_printks at the different places in this flowchart, like when we fail here, when we fail here, when we fail here, and when we fail here. The events tool has lots of options, and one of them is this cpu 40. Basically, I have compiled my kernel. I have installed my kernel. I have run my application under this new kernel. I have got a new trace. Then I want to print out the events, and I can use dat2graph to study the trace to figure out where the gap is and what core it's on. Now I use the events thing, because there's lots of events. I use the events thing to focus on the core that has all the idles. That's why we only see events on this core. There are lots of events that happen on the other ones, but they're not interesting. What we're interested in is, what does this core do and how does that prevent it from actually being able to steal anything.

You can look over here. Here we have the names of the little strings that I put in my trace_printks, so the names of my events. I also know how many of those events are and what percentage of the total number of events they are. If you do a little math, then you can see that here we start load balancing 864 times, and every single one of those times we fail for one of these different reasons. Most of the time we fail because we are not able to find a busiest group. This might seem surprising, because we clearly have a busiest group. One of them has 33 threads on it, one of them has 31 threads on it.

The Linux kernel has some different characteristics, like categories and different heuristics for deciding that one socket is busier than the other one. Often it fails just at this level. This is 693 out of 864. The next one is we succeed to find the busiest group, but now we don't find a busiest queue. These are little red diamonds. You can see that there are many green Xs, those find_busiest_group is failing. There are also many red diamonds, and those red diamonds are scattered around a lot. There's 149 of them.

Again, this is failing because we have a bunch of different heuristics. Even though we see that there is an imbalance between the different cores, the Linux kernel developers have decided that the correct heuristic in this situation is that that imbalance is not enough to motivate doing anything. It's surprising because we have an imbalance situation that persists for multiple seconds, which is a long time from the point of view of an operating system.

Again, it's like heuristics, they don't know the future, they don't really know that much about the past either. According to its strategy making process, it makes this particular decision. There are 21 cases where we managed to find a busiest group, we managed to find a busiest queue, but we still don't manage to steal anything. This is for the active balance. Active balancing is a violent operation because we're stealing a task that's actually happily running on some other core. There's a threshold, and you have to get over the threshold of failures before you can actually do this active balancing. There's these very few cases where we even reach active balancing, but we always fail to get over the threshold. Our counter value is only 0 or 1, and the threshold we have to get to is being greater than 4. All a bit arbitrary, we don't make it.

Observations

Basically, as I've summarized so far, the find_busiest_group fails a lot, but from time to time we get through it, we get to find_busiest_queue. Find_busiest_queue almost always decides that the load is balanced. This is due to something called a migration type, which is chosen to be migration_util. It's not very important what that means. The thing that will allow us to get into the active balancing is called migrate_task. We want to have it be migrate_task instead of migrate_util. What we're going to do now is we identify the problems, we're just going to violently fix those problems, and then see if it makes things work. I'm not necessarily going to propose a correction for the Linux kernel, I'm going to propose, do something that makes my application give the right performance.

Then once we validate that our hypotheses are correct, then we can at some future time think about what is the better way to get the Linux kernel. The problem with the Linux kernel is it has to work for everything, it doesn't have to just work for this specific application. That's the direction in which we're going. We're going to try to switch to migrate_task, get rid of this migrate_util thing, that will get us through the find_busiest_queue, then we will try to do actually normal stealing of waiting threads. That's going to fail as well, because actually all of our threads that are on that other socket, the NUMA balancing have decided they would prefer to be on that socket. Actually, we have a tradeoff between memory locality and work conservation. Then we fall down to the active balancing, that's the last hope.

As I said, the threshold doesn't get high enough. We can see how that happens here. If I changed from migrate_util to migrate_task, we still have most of our failures being find_busiest_group. Find_busiest_queue happens only one time. We end up with a lot of these need_active_balances with our counter, since we reach that code more often, the counter gets to 0 to 1 to 2 to 3 to 4, but it never gets to the 5 that we need to get over the threshold to actually do the active balancing. That suggests that the counter is the bottleneck, it's maybe too high, or maybe it shouldn't be there at all. To investigate the problem, we are just going to get rid of it.

Resulting Performance

We want to avoid the find_busiest_queue failures, we did that already. We want to avoid this threshold, this counter failure, and so we're going to do that now. Basically, I just modified the Linux kernel to make the situation be as I want it to be. Then we end up with this graph that you saw previously in the beginning. The red line is the vanilla Linux 6.7. The green line is when we use migrate_task. These lines, they're ordered in a certain way, but these numbers are actually fairly random, so using migrate_task can give you worse performance or better performance than the original Linux kernel. It doesn't really have any impact at all. It's like we fail early, or we fail later, but we still fail, so it doesn't make any difference.

The nice thing is that when we put the two changes together, we switch to migrate_task and we remove the threshold that we need to do the active balancing, then we have this blue line. This blue line shows a very slow increase in the running time, but the difference between the fastest time and the slowest time is not that great, actually I don't know what percentage it is. We can also look at this from a different point of view, which is the work conservation, that's what you see at the bottom. This is showing you what percentage of the time does the application spend with a work conservation problem. At the fastest times, with all of the different variants, it has basically no work conservation problem, maybe very small time.

The middle times, there's some work conservation problems, sometimes the issue maybe is work conservation, sometimes the issue maybe it's memory locality. It's kind of a tradeoff. When things go badly, we're spending up to a third of our time with a work conservation problem, that is with idle cores and overloaded cores. The nice thing that we can see is if we make all of our changes in terms of work conservation, basically the work conservation problem has gone away. There are executions that are slower than other executions, but the problem is not work conservation. Then we could go on and try to understand what the rest of the problem is, but I don't know what it is, I haven't done that yet. I just want to emphasize, these changes that I made, I don't consider them to be a solution that we should use for all applications, it's just a way of probing the system and understanding what's going on.

Conclusion

To understand the scheduler behavior, you have to have a precise understanding of what's going on at every point in time. What's going on at every point in time is like a really huge amount of information. You need some tools to help you organize the data, so you can zoom in on the part that's actually interesting for you. We have these four tools that we have found interesting in a general situation. Maybe we have too many tools, maybe we should just have one tool. It's awkward to have to remember the different lists of arguments that different tools have, and so maybe just one tool would be good. Maybe you could also have a domain-specific language that would let you program the thing that you are interested in seeing. You wouldn't have to program an entire visualization tool.

It's a lot of work to make the colors come out in a nice way, make everything visible, allow you to zoom, and all these kinds of things. You know what things you are interested in seeing, and so it'd be nice to be able to somehow easily specify what things you want to see, what their relationships to each other should be, what format they should be presented in, and so on. This is the URL where you can find the tools, https://gitlab.inria.fr/schedgraph/schedgraph.git. There's a little bit of documentation, and there's a link to a paper that gives a description of how it all works.

Questions and Answers

Participant 1: You mentioned the possibility of a DSL to specify visualization of large quantities of events. That seems to be something that would be pretty generically useful for the analysis of distributed systems as well, because of the problem of collecting lots of event data and then needing to somehow visualize it. It seems to be something that comes up a lot in all sorts of performance analysis, not only of a single system, but of larger systems as well.

Lawall: I don't even have an answer for my particular case. I did make a language and then I proposed it to some colleagues and I said, what does this specification do? They had no idea. I don't think my language was a success. So far, I don't have an answer. It is a good point that maybe such a language, even maybe these visualization tools, if you can also generate your data in the format that it expects, maybe it could be also interesting for other large-scale systems.

Participant 2: To what degree does the scheduler take into account the L0, L1, L2 caching and the different number of processes that they might encompass, or is it just not important?

Lawall: The first question is about the different levels of cache. I didn't emphasize it because we were only interested in the inter-socket thing, but there's actually a hierarchy of load balancing. The bottom of the hierarchy is the hyper-threads, and then there are the things on the same socket, and then across the machine. It favors moving things around locally. I think that addresses that issue.

Participant 2: Is there any way we can give hints to the operating system about how it should schedule?

Lawall: There are some knobs that you can turn. There's some runtime, I think it's more for favoring performance versus energy consumption. I am not sure if there's any hints that would address this specific very low-level issue.

Participant 3: Are there any specific gaps you saw in the visualization tools you were using, given the massive scale of the problems? When you're actually evaluating the movement of tasks around, is there any kind of quantification of how much it costs to really move a task, because context switching is always expensive? What's the threshold at which moving actually makes sense?

Lawall: The first question is about visualization tools. These graphs are all made with a quite old tool called JGraph. Basically, if your data is very large, then you may want to go have some coffee while it deals with the issue. Actually, I had an intern who used Plotly, I believe. The nice feature was that it allowed interactivity. The not-so-nice feature was that as soon as the data got a bit larger, actually nothing happened at all.

Participant 3: Task switching and context switching are always expensive.

Lawall: Part of it is the question that was already asked about the L0. Part of it is the fact that active balancing is the least favored option. It's actually very hard to get into that path. That's the most costly thing because it preempts active tasks. There are actually several context switches involved in that. The whole scheduler is just a large collection of heuristics. Sometimes they're very intertwined. A certain heuristic over here actually has some impact over there, things like that, and it's very complex.


Participant 1: You ended up adding printks and recompiling your kernel. Do you think it would be feasible to get the same information by using a kprobe and eBPF, or would that be too intrusive? A lot of people find, for better or worse, recompiling the kernel to be intimidating, but the eBPF tooling has gotten better.

Lawall: I'm not an expert on eBPF. I would just point out that what we're doing involves a specific if, or something like that; it's not on function boundaries. I don't know if eBPF is limited to function boundaries or not, but there are specific ifs that we need to get at. The specific change involved an if test, something like if (x && y), where I want to get rid of y. I'm not sure that eBPF can actually make that kind of change at the kernel level.
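For readers unfamiliar with the kind of edit being discussed, here is a toy illustration in plain C (not the kernel source) of why a function-boundary probe is not enough: the condition of interest sits in the middle of a function body, and the experiment removes one conjunct of that condition. The names x, y, and do_active_balance() are stand-ins.

/*
 * Toy illustration only.  The interesting condition lives inside the
 * function body, so probing or changing it means editing that exact
 * line rather than hooking a function entry or exit point.
 */
#include <stdbool.h>
#include <stdio.h>

static void do_active_balance(void)
{
    puts("active balance taken");
}

static void balance_vanilla(bool x, bool y)
{
    if (x && y)               /* both conditions must hold */
        do_active_balance();
}

static void balance_modified(bool x, bool y)
{
    (void)y;                  /* the experiment drops the second conjunct */
    if (x)
        do_active_balance();
}

int main(void)
{
    balance_vanilla(true, false);   /* prints nothing */
    balance_modified(true, false);  /* prints "active balance taken" */
    return 0;
}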

Design Systems 101

Mike's Notes

An excellent overview of what a Design System is from Therese Fessenden, of NN Group.

Including the relationships and definitions of:

  • style guide
  • component library
  • pattern library

This article provides guidelines for future work on the Pipi Design System Engine (dsg), which is designed to host users' design systems.

Resources

References

  • Reference

Repository

  • Home > Design System > 

Last Updated

24/04/2025

Design Systems 101

By: Therese Fessenden
NN Group: 11/04/2021

Therese Fessenden is a Senior Experience Specialist with Nielsen Norman Group and host of the NN/g UX Podcast. Her research focuses on understanding human behaviour, attitudes, and expectations in order to better orchestrate system and service design strategies.

Summary:

A design system is a set of standards to manage design at scale by reducing redundancy while creating a shared language and visual consistency across different pages and channels.

As UI design has evolved over the years, the scale and speed at which UI screens must be created has also increased. Not only are there millions of applications and billions of websites (with more created each year), but each of those apps and websites might have hundreds or thousands of pages (or screens). With this drastic expansion comes a dire need for organizations to streamline design work. So, many design teams leverage robust design systems to manage designs at scale.

Definition: A design system is a complete set of standards intended to manage design at scale using reusable components and patterns.

Why Use a Design System?

Design systems, when implemented well, can provide a lot of benefits to a design team:

Design (and development) work can be created and replicated quickly and at scale.

The primary benefit of design systems is their ability to replicate designs quickly by utilizing premade UI components and elements. Teams can continue to use the same elements over and over, reducing the need to reinvent the wheel and the risk of unintended inconsistency.

It alleviates strain on design resources, freeing them to focus on larger, more complex problems.

Since simpler UI elements have already been created and are reusable, design resources can spend less time tweaking visual appearance and more time on more-complex problems (like information prioritization, workflow optimization, and journey management). While this payoff might seem small when you create only a small number of screens, it becomes substantial when you must coordinate efforts across dozens of teams and thousands of screens.

It creates a unified language within and between cross-functional teams.

Especially when design responsibilities shift or when teams become geographically dispersed, a unified language reduces wasted design or development time around miscommunications. For example, the functionality or appearance of a dropdown menu would not be debated, since that term is reserved for a specifically defined element within the design system.

It creates visual consistency across products, channels, and (potentially siloed) departments.

Particularly when teams work in silos, where each product or channel operates independently of the others, the absence of an organization-wide design system can lead to inconsistent visual appearance and  experiences that seem fragmented or unrelated to the brand. Design systems provide a single source of components, patterns, and styles and unify disjointed experiences so that they are visually cohesive and appear to be part of the same ecosystem. As an added bonus, any major visual rebrands or redesigns can be managed at scale through the design system.

It can serve as an educational tool and reference for junior-level designers and content contributors. 

Explicitly written usage guidelines and style guides help onboard individual contributors who are new to UI design or content creation and also serve as a reminder for the rest of the contributors.

Why Not Use a Design System?

There are some potential hurdles and limitations which may prevent a design team from using a design system:

  1. Creating and maintaining a design system is a time-intensive activity which requires a dedicated team. Design systems, unfortunately, are not a one-and-done solution. At their best, they are constantly evolving as teams gather feedback from those who use them.  
  2. It takes time to teach others how to use the design system. Any design system, even if it were adapted from an existing one, needs instructions for use — otherwise there is a risk that it may be applied inconsistently or incorrectly across screens or across teams.
  3. There may be a perception that projects are static, one-off creations, which generally don’t require reusable components. Whether true or not, this perception may signal a lack of unified strategy across projects and a missed opportunity to increase efficiency.

Elements of a Design System

There are two important parts to a design system:

  • The design repository
  • The people who manage it

Design-System Repository

Design repositories can take many forms, but they often contain a style guide, a component library, and a pattern library.

Style Guide

Style guides contain specific implementation guidelines, visual references, and design principles for creating interfaces or other design deliverables. The most-common style guides tend to focus on branding (colors, typography, trademarks, logos, and print media), but style guides also offer guidance on content (such as tone of voice and language recommendations) and visual- and interaction-design standards (also known as front-end style guides). These guidelines are sometimes incorporated into the component library as well, to provide relevant guidance in context.

The 1976 NASA Graphics Standards Manual (NHB 1430.2) is an example of a thorough branding style guide. It offers much more than remarkably modern visual examples: guidelines on color pairings to improve visibility and readability, and explicit design principles, like “A sign should be thought of as a large-scale headline; therefore, language should be clear and concise. Brevity is desirable in order to communicate quickly, especially to drivers of vehicles.”

Mailchimp’s Content Style Guide contains robust guidelines for how to write different kinds of content so that they align with Mailchimp’s company values and company tone of voice.

Component Library

Component libraries (also known as design libraries) are what many people associate with design systems: these thorough libraries house predetermined, reusable UI elements and serve as a one-stop shop for designers and developers alike to learn about and implement specific UI elements. Creating these libraries takes significant time and resources. In addition to visual examples of components, they include:

  • Component name: a specific and unique UI component name, to avoid miscommunication between designers and developers
  • Description: a clear explanation for what this element is and how it is typically used, occasionally accompanied by do’s and don’ts for context and clarification
  • Attributes: variables or adjustments that can be made to customize or adapt the component for specific needs (e.g., color, size, shape, copy)
  • State: recommended defaults and the subsequent changes in appearance
  • Code snippets: the actual code excerpt for the element (some design systems go as far as sharing multiple examples and offering a “sandbox” environment to try out different component customizations)
  • Frameworks: front-end and back-end frameworks needed to implement the library (if applicable), to avoid painful and unnecessary debugging


Google’s Material Design system features a component library which includes implementation guidelines and code snippets (shown above) for specific operating systems and frameworks, as well as thorough design guidelines with usability do’s and don’ts in a separate tab.


IBM’s Carbon design system features usage, style, and code guidelines, as well as accessibility considerations and a code sandbox for designers and developers to visualize any customizations before implementation.

Pattern Library

Sometimes, the terms ‘component library’ and ‘pattern library’ are used synonymously; however, there is a distinction between these two types of libraries.  Component libraries specify individual UI elements, while pattern libraries feature collections of UI-element groupings or layouts. Pattern libraries are often thought of as less robust compared to component libraries, but they can be as thorough or as high-level as needed. They typically feature content structures, layouts, and/or templates. Much like the components, the patterns are meant to be reused and adapted.


The Atlassian design system identifies a number of reusable patterns including a page-header template. Not only does it show a visual example, but it also highlights the exact component that designers should leverage and explains how each component should be used.


While many public-sector websites have a long way to go, the US Web Design System (USWDS) is a great starting point to unify many disparate departments and agencies with clear guidelines. The USWDS specifies page templates (pictured above), as well as design principles, components, and coding specifications.

Design-System Team

A design system is only as effective as the team that manages it. Whether created or adapted, design systems require continuous maintenance and oversight to ensure they don’t become outdated, obsolete, or overcrowded with redundant entries or submissions. The size of this team can vary, given that design systems themselves can take on different sizes and levels of customization, but, at a minimum, the team should include 1 interaction designer, 1 visual designer, and 1 developer, each meant to help write interaction-design guidelines, create visual examples, and provide code snippets and implementation specifications for each element, respectively. Ideally, the team should also include a part-time researcher, part-time architect, and content writer, if these roles are explicitly determined in your organization.

Lastly, consider securing an executive sponsor (from leadership ranks) to orchestrate design-system efforts. While the lack of a sponsor won’t be a show-stopper, sponsors can secure money and resources, while also conveying the strategic importance of a design system to the rest of the organization.

How to Approach Design-System Adoption

There are generally three approaches to using a design system:

  • Adopting an existing design system
  • Adapting an existing design system
  • Creating your own proprietary or custom design system

There are pros and cons to each, but generally speaking, the more custom your design-system solution is, the more time and money it will take to implement. Thus, using an existing design system is the lowest-cost approach and requires the least time to implement. (It will still need more time than continuing design as usual, however, because you will have to either replace or update some UI elements and agree upon a standard.)

Investment in a custom design system will be worth it if the organization has particular needs that cannot be met by open-source design systems. As customizations and adjustments to the design system increase, the cost savings you may have gained from using the existing design system will diminish, and, in the long run, you may be better off creating your own design system anyway. Be sure you know what your organization needs before you embark on design system endeavors and evaluate the tradeoffs.

Depending on budget and needs, companies can select one of three approaches to design systems: adopt an existing system in its entirety, adapt an existing system to the company’s needs, or create an entirely new one.

Last, for a proof of concept or an initial prototype which is likely to change, creating a full-fledged design system is probably not going to generate a desirable ROI in the near term. The benefit, after all, is the replicability of design, which is in the future. Though it may be tempting to establish these from the outset, keep in mind that a design system should not be thought of as a portfolio of work, but rather as a functional toolkit or resource for designers and developers to work more quickly. That said, if you are doubting the usefulness of a design system, it might be worth considering the timescale you will use to evaluate your design work. Design systems are best when the company foresees years of future, replicable design work.

Conclusion

Design systems are made of many components, patterns, styles, and guidelines, which can help operationalize and optimize your design efforts. However, they are designed, managed, and implemented by people. The main factors to consider when creating a design system are the scale and replicability of your projects, as well as the resources and time available. When poorly implemented and maintained, design systems can become unwieldy collections of components and code; but, when implemented well, they can educate team members, streamline work, and enable designers to tackle complex UX problems.