Awesome CTO

Mike's Notes

This is a snapshot of a great list of CTO resources written by Dima Kuchin. It came from Dima's GitHub repository and was last updated 4 years ago. It might be handy one day.

Resources

Awesome CTO

— Hello, my name is Dima and I'm a CTO

— (chorus) Welcome, Dima

A curated and opinionated list of resources for Chief Technology Officers and VP R&D, emphasising startups and hyper-growth companies.

Contents

  • CTO Position
  • Hiring
  • People management
  • Career growth
  • Project management
  • Handbooks
  • Development process
  • Architecture
  • Technologies
  • Data
  • Startups
  • Due Diligence
  • Money / Finance
  • Related stuff
  • Product
  • Marketing
  • Books
  • More links

CTO Position

Hiring

People Management

Career growth

Project management

Handbooks

Development process

Architecture

Technologies

Data

Startups

Due Diligence

Money / Finance

Related stuff

Product

Other

              • Awesome CEO (GitHub)
              • Awesome TPM (GitHub)

Figma on Design Systems

                Mike's Notes

                Figma has created a website about Design Systems. It looks very useful.

                • Talks
                • Examples
                • Code
                • Where to start
                • Articles
                • Etc

                Resources

                James Simons: My Guiding Principles

                Mike's Notes

                I'm a fan of Jim Simons. A brilliant mathematician, he made a fortune and then donated billions to support mathematics and open scientific research through the Simons Foundation, which he and his wife Marilyn established. Marilyn was the President, and Jim was the Chair. 

                All the research results are made freely available. The foundation publishes Quanta, which is free and better than Nature. I often reproduce Quanta articles on this website.

                Jim died last year.

                Resources

                James Simons: My Guiding Principles

                By: James H. Simons
                Simons Foundation: January 22, 2020

                The chair of the Simons Foundation describes his five principles for building a successful organization.

Guiding Principles

IN LATE 2010, just after I stepped down from Renaissance and started full-time at the foundation, I gave a talk at MIT. Marilyn accompanied me to the talk, and on the way there, she suggested that at the end I discuss my values. I felt ‘values’ was not quite the right word, so I used ‘guiding principles’ instead. They are listed below, and I believe they have been useful in my life and careers. After setting them forth, I will give a number of examples of their effect in building the foundation.

1. DO SOMETHING NEW; DON’T RUN WITH THE PACK. I am not such a fast runner. If I am one of N people all working on the same problem, there is very little chance I will win. If I can think of a new problem in a new area, that will give me a chance.
2. SURROUND YOURSELF WITH THE SMARTEST PEOPLE YOU CAN FIND. When you see such a person, do all you can to get them on board. That extends your reach, and terrific people are usually fun to work with.
3. BE GUIDED BY BEAUTY. This is obviously true in doing mathematics or writing poetry, but it is also true in fashioning an organization that is running extremely well and accomplishing its mission with excellence.
4. DON’T GIVE UP EASILY. Some things take much longer than one initially expects. If the goal is worth achieving, just stick with it.
5. HOPE FOR GOOD LUCK!

The foundation began in 1994, and in its early years, it had no particular mission and was simply Marilyn’s and my vehicle for distributing charity. In 2003, after the foundation had grown considerably (in part due to principle 5!), Marilyn and I became interested in autism — its cause and possible methods of alleviating its symptoms. We didn’t know how to begin this effort, but in the summer of that year, a friend of ours agreed to convene a roundtable of outstanding neuroscientists as well as some people who were already working in the field. We learned a great deal from this: first, that autism is very largely genetic; and, second, that few great scientists were working in the field.

Our mission seemed clear: Attract great scientists, and begin with genetics. This decision was consistent with principle 1, since no one to our knowledge was taking this combined approach. A second decision was to bring Mike Wigler of Cold Spring Harbor Lab, a friend of mine and one of the world’s greatest geneticists, into the game. We were off and running.

Another decision Marilyn and I made that year was to focus the foundation almost exclusively on mathematics and science research. Again, this was consistent with principle 1, as very few U.S. foundations had such a focus, and we felt we could really make a difference.

After a few years of my overseeing the autism project, Marilyn and I realized that we needed an outstanding scientist to head the project, and we were lucky enough to hire Gerry Fischbach, the first scientist on the foundation staff. Not only was Gerry terrific, but he was indeed fun to work with. Together with Mike Wigler and a small staff, he created the Simons Simplex Collection. This was a cohort of almost 3,000 families that included both parents, one and only one child with autism, and at least one unaffected sibling. A great deal of effort was expended in designing and implementing this collection, and it has led to an enormous amount of great science. I consider this collection a beautiful thing — a fine example of principle 3.

The Simons Foundation Autism Research Initiative (SFARI) has now gone on for 16 years. When Gerry stepped down in 2013, Louis Reichardt, an outstanding scientist and leader, became its head. Our perseverance has borne fruit. Not only have we discovered numerous genetic causes of the condition, but our first drug trials have been initiated. Our patience to some extent exemplifies principle 4.

The next area we focused on was math and physical sciences (MPS), and at Marilyn’s suggestion, we brought on David Eisenbud to head that effort. David’s first step was to invite to the foundation a group of mathematicians, a group of theoretical physicists and a group of computer scientists to tell us what the foundation could do to advance each of their respective fields. The math and physics groups proposed various grant programs, which were subsequently put in place, but the computer science group wanted only one thing: an institute for theoretical computer science. There was no such institute in the world.

A competition was established among leading U.S. research universities, and after several winnowing rounds, Berkeley was selected. Not only was their proposed leader, Richard Karp, an outstanding and renowned computer scientist, but Berkeley also agreed to give us the entirety of a small building that they would renovate for our purposes. This institute is now in its sixth year and has been an outstanding success, attracting multitudes of visitors and being run in a beautiful manner. The creation of this institute exemplifies principles 1, 2 and 3.

                After a few years, Yuri Tschinkel succeeded David as head of MPS. Yuri is an excellent mathematician and a pleasure to work with. Under his leadership, a number of excellent grant-making programs have been initiated, as well as several collaborations (discussed below).

                The last grant-making area to be established was life science, headed by Marian Carlson, a member of the National Academy of Sciences and a totally fun person to work with. She too has initiated some innovative programs and collaborations (discussed below).

                As the programs grew, particularly SFARI, there was increasing need for IT personnel. This problem was solved by bringing on board Alex Lash, a very senior IT type at Memorial Sloan Kettering. I learned the hospital was quite unhappy to lose him, but Marilyn and I were delighted to hire him, and he gradually built an excellent team of almost 30 people who serve both the science and administrative sides of the house.

                In 2012, we came up with a new approach to grant-making, what we called collaborations. These would be goal-driven efforts involving a fairly large number of investigators from various institutions around the world, lasting as long as 10 years, or perhaps even more. To determine whether this was a good idea, we convened a weekend meeting of distinguished scientists from a broad set of fields. By the end of the meeting, we had decided this was indeed a good idea, provided the goal was extremely important, there was a reasonable chance of achieving it, and the leadership was outstanding.

                The first of these collaborations was Origins of Life, which was quickly followed by the Global Brain, an effort to understand the dynamics of the brain as a whole. Today we have 18 active collaborations covering life science, math and physical sciences. By and large these are going very well. This program is a clear example of principle 1, as we know of no other foundation or government organization that funds such efforts.

                In the course of the weekend meeting on collaborations, it was suggested that we create an institute devoted to data science. I liked this idea very much, since the wealth I had amassed was based on data science in the financial world, but instead of creating an institute on a university campus, I thought to assemble such an activity in-house. Marilyn was in full agreement. We were fortunate to recruit Leslie Greengard to head this effort, which became known as the Simons Center for Data Analysis (SCDA). Leslie not only is an outstanding applied mathematician, being a member of both the National Academy of Sciences and the National Academy of Engineering, but also holds an M.D. from Yale. He focused SCDA on computational biology and built up an excellent team. I consider this another example of principle 1, as nothing quite like this seemed to exist.

                SCDA worked so well that Marilyn and I decided to generalize the idea and create the Flatiron Institute for computational science. For this, of course, more space was required, and happily it existed right across the street! We recruited David Spergel from Princeton to build Computational Astrophysics, and then Antoine Georges from the Collège de France to build Computational Quantum Physics (working closely with Andy Millis). Finally, we determined that the fourth unit be Computational Mathematics, and that Leslie step down from Computational Biology to head this new unit. After a fairly long search, we decided to promote Mike Shelley, a group leader in the biology unit, to be its director. Underpinning this variegated research effort is the Scientific Computing Core, a team brilliantly headed by Nick Carriero and Ian Fisk.

                The Flatiron Institute has run magnificently in every way, its staff producing outstanding research and convening meetings and workshops that inspire other scientists to do the same. It clearly exemplifies principles 1, 2 and 3: it has originality and great leadership and is truly a thing of beauty.

                According to its bylaws, the Simons Foundation is intended to focus almost entirely on research in mathematics and science and to exist in perpetuity. If future leadership abides by these guiding principles, Marilyn and I believe the foundation will forever be a force for good in our society.

                Enshittification, tarpits and other things your mother never told you about

                Mike's Notes

                Here are some valuable resources about stuff you may have to deal with one day.

• Enshittification
• Tarpits

I am so disgusted with the degeneration of previously useful websites provided by the world's largest software companies that I took these measures to ensure that Pipi would never end up like that.

• No Investors
• No social media
• No sales, only word-of-mouth
• No ads
• No moats
• Open-source as much as possible
• Ownership by a foundation
• Only SaaS applications that are socially useful
• Built for experienced developer teams
• Open Handbook

AI companies are ignoring robots.txt files when they go out and scrape websites. That sucks.

                Resources

                Enshittification

                "Enshittification, also known as crapification and platform decay, is the term used to describe the pattern in which online products and services decline in quality over time. Initially, vendors create high-quality offerings to attract users, then they degrade those offerings to better serve business customers, and finally degrade their services to users and business customers to maximize profits for shareholders." - Wikipedia.

                Examples

                • Google Search
                • Facebook
                • Amazon

                Tarpit

                "A tarpit is a service on a computer system (usually a server) that purposely delays incoming connections. The technique was developed as a defense against a computer worm, and the idea is that network abuses such as spamming or broad scanning are less effective, and therefore less attractive, if they take too long. The concept is analogous with a tar pit, in which animals can get bogged down and slowly sink under the surface, like in a swamp." - Wikipedia

                Examples

                • Nepenthes
                • Iocane

                The creative process behind Pipi

                Mike's Notes

                I discovered this superb, filmed interview with author J.K. Rowling about her writing process. It's worth watching how this talented individual consciously uses a creative process in her work.

                Besides being in awe of her writing ability, I was also thinking about my creative process.

                I'm a pattern thinker. That's visual thinking.

                My creative process is similar to J.K. Rowling's, except I use images, relationships, 3D and 4D models, and patterns instead of words. I, too, have many ideas pouring out of my head like a firehose, just not in words.

                I, too, keep a notebook beside my bed, so when I wake up at 2 a.m., I can write a word or draw a diagram.

I keep my drawings in A4 3-hole ring binders. There must be almost 100 binders now, and cartons of drawings are waiting to be filed away correctly. I use colour highlighters a lot. Everything is colour-coded like this website.

                I also have to use assistive technology to write. Every word on this website has been rewritten 30 to 40 times by me and corrected by Grammarly. I'm a slow writer; writing a page takes a whole day.

                I find articles written by others I agree with and republish them on this engineering blog, expressing my thoughts in their words and adding a few notes to explain the connection.

I also have multiple kinds of synesthesia, which Wikipedia defines as "a perceptual phenomenon in which stimulation of one sensory or cognitive pathway leads to involuntary experiences in a second sensory or cognitive pathway."

                It was the same when I was a sculptor, making docos, building sets, doing landscape projects, and everything else, where I didn't need to write words. Reading and talking are easy. I have given hundreds of talks, usually technical, at conferences and workshops, and been on TV, radio, press interviews, etc.

But I could never describe this process in written words. Now I can. Watch J.K. Rowling describe it.

I also copied this post on my film-maker website so that people could understand how I work.

                Resources

                J K Rowling on Writing

                Jkrowling.com

                J.K. Rowling talks in depth for the first time about her writing

                J.K. Rowling is often asked questions by fans and budding writers about her writing process: where she writes, how she writes, her inspiration and her research, how a book comes about, from the germ of an idea to the editing process and eventual publication.

                Here for the first time, she responds to those questions, talking openly and in depth about her writing including Harry Potter, her other children’s books The Ickabog and The Christmas Pig, as well as writing as Robert Galbraith, the Cormoran Strike crime fiction series.

                Filmed in her writing room in Edinburgh and in a London pub, these three On Writing films provide a personal insight into J.K. Rowling’s writing world.

                Part 1

                Part 2

                Part 3

                Pipi self-hosts and the chicken-or-egg problem.

                Mike's Notes

                I wanted to write about why and how Pipi 9 is self-hosting and why this created a chicken-or-egg problem.

Pipi 9 is still largely headless, and I need a UI to build the UI.

                Resources

Pipi self-hosts and the chicken-or-egg problem.

                By: Mike Peters
                01/02/2025

                Mike is the inventor and architect of Pipi and the founder of Ajabbi.

According to Wikipedia:

                "An operating system is self-hosted when the toolchain to build the operating system runs on that same operating system. For example, Windows can be built on a computer running Windows.

                Before a system can become self-hosted, another system is needed to develop it until it reaches a stage where self-hosting is possible. When developing for a new computer or operating system, a system to run the development software is needed, but development software used to write and build the operating system is also necessary. This is called a bootstrapping problem or, more generically, a chicken or the egg dilemma.

                A solution to this problem is the cross compiler (or cross assembler when working with assembly language). A cross compiler allows source code on one platform to be compiled for a different machine or operating system, making it possible to create an operating system for a machine for which a self-hosting compiler does not yet exist. Once written, software can be deployed to the target system using means such as an EPROM, floppy diskette, flash memory (such as a USB thumb drive), or JTAG device. This is similar to the method used to write software for gaming consoles or for handheld devices like cellular phones or tablets, which do not host their own development tools.

                Once the system is mature enough to compile its own code, the cross-development dependency ends. At this point, an operating system is said to be self-hosted." - Wikipedia
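To make the bootstrapping idea above concrete, here is a toy Python sketch (the names, like pipi_src, are purely illustrative and not real Pipi code): a "compiler" is modelled as a function whose output depends only on the source it is given, which is why rebuilding with the new toolchain converges to a fixed point.

# Toy model of bootstrapping: an existing host toolchain builds the
# new system once; after that, the system can rebuild itself.
def build(toolchain: str, source: str) -> str:
    assert toolchain.startswith("compiler"), "need a working toolchain"
    # A correct compiler's output depends only on the source.
    return f"compiler[{source}]"

pipi_src = "pipi-source-code"      # illustrative stand-in
host = "compiler(host)"            # the pre-existing system breaks the loop

stage1 = build(host, pipi_src)     # cross-built on the host
stage2 = build(stage1, pipi_src)   # rebuilt by its own output

# Self-hosting test: rebuilding is a fixed point.
print(stage1 == stage2)            # True - the host is no longer needed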

                Part of the secret sauce of Pipi 9's success is its self-generating nature.

                It's like watching the workings of a living biological cell, where hundreds of processes maintain and interact with each other. 

                I had to manually create each engine and let them run against each other. It was a very slow and experimental process, like inventing a new cake recipe through trial and error. I stumbled across something that worked by accident.

                As a result, it is headless. It has also been designed to generate its own no-code front-end, an HTML user interface for humans.

                The current problem is that a no-code interface is needed to create the no-code interface. The lower layers of Pipi are supposed to help generate the UI and learn from user interactions, so a lot needs to be done, making it challenging to simulate.

                This means the front end will go through many generations as Pipi learns from feedback loops to improve the UI further. The backend will also self-evolve.

                So, it is slow, but progress is steady. And the more I get done, the easier and faster it will get.

                So, back to that darn chicken, or is it the egg?

                The half-life of code & the ship of Theseus

                Mike's Notes

                I found this via the excellent engineering newsletter from PostHog. It's an article by Erik Bernhardsson, originally published on his blog in 2016. He has a lot of thoughtful articles.

                The article has a very enticing name.

                To do

What is the half-life of Pipi?

                Resources

                The half-life of code & the ship of Theseus

                By: Erik Bernhardsson
                erikbern.com: 5/12/2016

As a project evolves, does the new code just add on top of the old code? Or does it replace the old code slowly over time? In order to understand this, I built a little thing to analyze Git projects, with help from the formidable GitPython project. The idea is to go back in history and run a git blame (making this somewhat fast was a bit nontrivial, as it turns out, but I'll spare you the details, which involve some opportunistic caching of files, picking historical points spread out in time, using git diff to invalidate changed files, etc.).
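The real code lives in Erik's Git of Theseus repository; a stripped-down sketch of the core idea, using GitPython, might look like the following (no caching or diff-based invalidation, so it is slow on anything big).

# Sketch: for a sampled commit, blame every file and bucket the
# surviving lines by the year they were last touched.
from collections import Counter
import git  # GitPython

def cohorts_at(repo, commit):
    counts = Counter()
    for blob in commit.tree.traverse():
        if blob.type != "blob":
            continue  # skip directories
        try:
            for old_commit, lines in repo.blame(commit.hexsha, blob.path):
                counts[old_commit.committed_datetime.year] += len(lines)
        except git.GitCommandError:
            continue  # e.g. binary files
    return counts

repo = git.Repo(".")
# Crude sampling: every 500th commit on the main history.
for commit in list(repo.iter_commits("HEAD"))[::500]:
    print(commit.committed_datetime.date(), dict(cohorts_at(repo, commit)))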

In a moment of clarity, I named it “Git of Theseus” as a terrible pun on the ship of Theseus. I'm a dad now, so I can make terrible puns. It refers to a philosophical paradox, in which the pieces of a ship are replaced over hundreds of years. If all the pieces are replaced, is it still the same ship?

                The ship wherein Theseus and the youth of Athens returned from Crete had thirty oars, and was preserved by the Athenians down even to the time of Demetrius Phalereus, for they took away the old planks as they decayed, putting in new and stronger timber in their places, in so much that this ship became a standing example among the philosophers, for the logical question of things that grow; one side holding that the ship remained the same, and the other contending that it was not the same.

                It turns out that code doesn't exactly evolve the way I expected. There is a “ship of Theseus” effect, but there's also a compounding effect where codebases keep growing over time (maybe I should call it “Second Avenue Subway” effect, after the construction project in NYC that's been going on since 1919).

                Let's start by analyzing Git itself. Git became self-hosting early on, and it's one of the most popular and oldest Git projects:


                This plots the aggregate number of lines of code over time, broken down into cohorts by the year added. I would have expected more of a decay here, and I'm surprised to see that so much code written back in 2006 is still alive in the code base – interesting!

                We can compute the decay for individual commits too. If we align all commits at x=0, we can look at the aggregate decay for code in a certain repo. This analysis is somewhat harder to implement than it sounds like because of various stuff (mostly because newer commits have had less time, so the right end of the curve represents an aggregate of fewer commits).
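A sketch of that aggregation step, with hypothetical per-commit survival numbers standing in for real blame output; the truncation issue shows up as fewer commits contributing at larger ages.

# Align every commit at t=0 and average the fraction of its lines
# surviving after each year (made-up numbers, for shape only).
from collections import defaultdict

survival = {
    "commit-a": [(0, 1.00), (1, 0.90), (2, 0.81), (3, 0.74)],
    "commit-b": [(0, 1.00), (1, 0.88), (2, 0.79)],
    "commit-c": [(0, 1.00), (1, 0.93)],  # newer commit: shorter curve
}

buckets = defaultdict(list)
for points in survival.values():
    for age, frac in points:
        buckets[age].append(frac)

for age in sorted(buckets):
    fracs = buckets[age]
    print(f"t={age}y: {sum(fracs) / len(fracs):.2f} ({len(fracs)} commits)")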

                For Git, this plot looks like this:


Even after 10 years, 40% of the lines of code are still present! Let's look at a broader range of (somewhat randomly selected) open source projects:


It looks like Git is somewhat of an outlier here. Fitting an exponential decay to Git and solving for the half-life gives approximately 6 years.


Hmm… not convinced this is necessarily a perfect fit, but as the famous quote goes: All models are wrong, some models are useful. I like the explanatory power of an exponential decay – code has an expected lifetime and a constant risk of being replaced.
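For the fit itself, something along these lines recovers a half-life from the decay rate; the survival numbers below are made up for illustration, not taken from any repo.

# Fit N(t) = N0 * exp(-lam * t) and convert the rate to a half-life.
import numpy as np
from scipy.optimize import curve_fit

def decay(t, n0, lam):
    return n0 * np.exp(-lam * t)

ages = np.array([0.0, 1.0, 2.0, 4.0, 6.0, 10.0])  # years since commit
fraction_left = np.array([1.0, 0.93, 0.85, 0.72, 0.60, 0.42])

(n0, lam), _ = curve_fit(decay, ages, fraction_left, p0=(1.0, 0.1))
print(f"half-life = {np.log(2) / lam:.1f} years")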

                I suspect a slightly better model would be to fit a sum of exponentials. This would work for a repo with some code that changes fast and some code that changes slowly. But before going down a rabbit hole of curve fitting, I reminded myself of von Neumann's quote: With four parameters I can fit an elephant, and with five I can make him wiggle his trunk. There's probably some way to make it work, but I'll revisit some other time.

                Let's look at a lot of projects in aggregate (also sampled somewhat arbitrarily):


In aggregate, the half-life is roughly ~3.33 years. I like that, it's an easy number to remember. But the spread is big between different projects. The aggregate model doesn't necessarily have super strong predictive power – it's hard to point to an arbitrary open source project and expect half of it to be gone 3.33 years later.

                Moar repos

                Apache (aka HTTPD) is another repo that goes way back:

                Rails:




                Beautiful exponential fit!

                Node:



                Wanna run it for your own repo? Again, code is available here.

                The monster repo of them all

Note that most of these repos took at most a few minutes to analyze, using my script. As a final test I decided to run it over the Linux kernel, which is HUGE – 635,229 commits as of today. This is 16 times larger than the second biggest repo I looked at (rails) and took multiple days to analyze on my shitty computer. To make it faster, I ended up computing the full git blame only for commits spread out by at least 3 weeks, and also limited it to .c files:
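That thinning step might look something like the sketch below; this is an assumption about the approach, not the exact code used for the Linux run.

# Keep only commits at least three weeks apart (iter_commits walks
# newest to oldest); the .c restriction happens later, at blame time.
import datetime
import git  # GitPython

repo = git.Repo(".")
picked, last = [], None
for commit in repo.iter_commits("HEAD"):
    when = commit.committed_datetime
    if last is None or last - when >= datetime.timedelta(weeks=3):
        picked.append(commit)
        last = when
print(f"kept {len(picked)} commits")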


The squiggly lines are probably from the sampling mechanism. But look at this beauty – a whopping 16M lines! The code contribution from each year's cohort is extremely smooth at this scale. Individual commits have absolutely no meaning at this scale – the cumulative sum of them is very predictable. It's like going from Newton's laws to thermodynamics.


Linux also clearly exhibits more of a linear growth pattern. I'm speculating that this has to do with its high modularity. The drivers directory has by far the largest number of files (22,091), followed by arch (17,967), which contains support for various architectures. This is exactly the kind of thing you would expect to scale very well with complexity, since they have a well-defined interface.

Somewhat off topic, but I like the notion of how well a project scales with complexity. Linear scaling is the ultimate goal, where each marginal feature takes roughly the same amount of code. Bad projects scale superlinearly, and every marginal feature takes more and more code.

                It's interesting to go back and contrast Linux to something like Angular, which basically exhibits the opposite behavior:



The half-life of a randomly selected line in Angular is about 0.32 years. Does this reflect on Angular? Is the architecture basically not as “linear” and consistent? You might say the comparison is unfair, because Angular is new. That's a fair point. But I wouldn't be surprised if it does reflect some questionable design. I don't mean to be shitting on Angular here, but it's an interesting contrast.

                Half-life by repository

A somewhat arbitrary sample of projects and their half-lives:

Project        Half-life (years)   First Commit
angular        0.32                2014
bluebird       0.56                2013
kubernetes     0.59                2014
keras          0.69                2015
tensorflow     1.08                2015
express        1.23                2009
scikit-learn   1.29                2011
luigi          1.3                 2012
backbone       1.48                2010
ansible        1.52                2012
react          1.66                2013
node           1.76                2009
underscore     1.97                2009
requests       2.1                 2011
rails          2.43                2004
django         3.38                2005
theano         3.71                2008
numpy          4.15                2006
moment         4.54                2015
scipy          4.62                2007
tornado        4.8                 2009
redis          5.2                 2010
flask          5.22                2010
httpd          5.38                1999
git            6.04                2005
chef           6.18                2008
linux          6.6                 2005

It's interesting that moment has such a high half-life, but the reason is that so much of the code is locale-specific. This creates more linear scaling, with a stable core of code and linear additions over time. express is an outlier in the other direction. It's 7 years old, but its code changes extremely quickly. I'm guessing this is partly because (a) the code lacks linear scaling and (b) it was probably one of the first major JavaScript open source projects to hit mainstream popularity, surfing on the Node.js wave. Possibly the code base also sucks, but I have no idea 😊

                Has coding changed?

I can think of three reasons why there's such a strong relationship between the year a project was initiated and its half-life:

                1. Code churns more early on in projects, and becomes more stable a while in
                2. Coding has changed from 2006 to 2016, and modern projects evolve faster
                3. There's some kind of selection bias where the only projects that survive are the scalable stable ones

Interestingly, I don't find any clear evidence of #1 in the data. The half-life for code written early in old projects is as high as for code written later. I'm skeptical about #3 as well, because I don't see why there would be a relation between survival and code structure (but maybe there is). My conclusion is that writing code has fundamentally changed in the last 10 years. Code really seems to change at a much faster rate in modern projects.

                By the way, see discussion on Hacker News and on Reddit!