How Is Science Even Possible?

From The Joy of Why

https://www.quantamagazine.org/how-is-science-even-possible-20240620

  • The link above has a podcast player link for the audio

How are scientists able to crack fundamental questions about nature and life? How does math make the complex cosmos understandable? In this episode, the physicist Nigel Goldenfeld and co-host Steven Strogatz explore the deep foundations of the scientific process.

The universe seems like it should be unfathomably complex. How then is science able to crack fundamental questions about nature and life? Scientists and philosophers alike have often commented on the “unreasonable” success of mathematics at describing the universe. That success has helped science probe some profound mysteries — but as the physicist Nigel Goldenfeld points out, it also helps that the “hard” physical sciences, where this progress is most evident, are in major ways simpler than the “soft” biological sciences.

In this episode, Goldenfeld speaks with co-host Steven Strogatz about the scientific importance of asking the right questions at the right time. They also discuss the mysterious effects of “emergence,” the phenomenon that allows new properties to arise in systems at different scales, imposing unexpected order on cosmic complexity.

Transcript

STEVEN STROGATZ: Albert Einstein once wrote, “The eternal mystery of the world is its comprehensibility.” It really is awesome when you think about it. The laws of nature, at least in physics, turn out to be amazingly simple. So simple that we human beings can discover those laws and understand them and use them to change the world.

But why is nature like this? Why is it so comprehensible? And why is math so uncannily effective at explaining it, not just in physics, but also in chemistry, in astronomy, and even in some parts of biology? In short, why is science even possible?

I’m Steve Strogatz and this is “The Joy of Why,” a podcast from Quanta Magazine, where my co-host, Janna Levin, and I take turns exploring some of the biggest mysteries in math and science today. In this episode, we’ll be speaking with physicist Nigel Goldenfeld about the mystery of nature’s comprehensibility.

Nigel holds the Chancellor’s Distinguished Professorship in Physics at the University of California, San Diego, where his research spans condensed matter theory, the theory of living systems, hydrodynamics and non-equilibrium statistical mechanics. Previously, he was a professor at the University of Illinois at Urbana-Champaign, and a founding member of its Institute for Genomic Biology, where he led the biocomplexity group and directed the NASA Astrobiology Institute for Universal Biology. In addition to being a fellow of the American Physical Society, the American Academy of Arts and Sciences, and the U.S. National Academy of Sciences, Nigel is also well known for authoring one of the standard — and I have to say, terrific — graduate textbooks in statistical mechanics.

Nigel, thanks so much for coming on the show.

NIGEL GOLDENFELD: Oh, it’s a pleasure to be here, Steve.

STROGATZ: Yes, it really is a pleasure for me. I am curious where we’re going to go with this. It’s such a really very profound philosophical question, this Einstein quote about nature’s comprehensibility, but I wonder what you think of it? I mean, let’s talk about both parts of it. Is the world really comprehensible, at least to some degree? And if it is, does that strike you as mysterious?

GOLDENFELD: So I think it’s a wonderful quote, and certainly one that inspired me, and I’m sure other people thinking about the research that we do. And I think the reason it’s important is because we’ve grown physics to such an extent that it now starts to impinge on other disciplines. You mentioned biology, but also, you know, I could mention economics and atmospheric sciences, climate change, all these sorts of things.

And as you start getting into these much more complex and complicated areas of science, you wonder how were we even able to do anything in physics, let alone these other things. And in fact, the reason these other fields are difficult is something that’s also not clear. You know, you could also ask what is the reason for the unreasonable ineffectiveness of mathematics in biology.

[STROGATZ laughs]

GOLDENFELD: When you start to think about it, you realize that when we talk about the effectiveness, we’re talking about problems where we’ve been lucky to make an impact. And so our sample is skewed.

We have a lot of successes in science. Some of the most accurate things that we know in science are in physics. You could say, “Well, that’s because, you know, we only talk about those problems, because those are the ones that actually worked. All the many other things that we try to do failed dismally, and we never ask about those. And our sample is somewhat biased.”

STROGATZ: Well, that’s great, this point that you’re making that we’re sort of assuming facts not necessarily in evidence here in saying that the world is comprehensible. Because as you say, there are these parts of science that we still have yet to really figure out — economics, parts of atmospheric science and so on.

So for listeners who aren’t necessarily following what we’re talking about here, think about the example from the 1850s or ’60s: James Clerk Maxwell figuring out the equations for how electricity and magnetism work.

It’s just four little equations that nowadays fit on a T-shirt — physics and math nerds like me and Nigel and maybe even you like those T-shirts. What’s crazy is that you can really understand almost everything there is to know about electricity and magnetism with the help of those equations and some clever math.

For instance, Maxwell himself figured out that a prediction from those equations is that something called electromagnetic waves could exist. And today, those are the basis for wireless communication, technology that we all use every day in our cell phones.

And so the question is: How is it possible that we, with our puny primate brains, can figure out these four equations that are so marvelous? And is it, as you suggested, Nigel, just that we’re asking questions whose answers are likely to be simple and ignoring the really hard ones? Or, I don’t know, how should we think about this? How is it possible Maxwell could have come up with these equations?

GOLDENFELD: Let me take another example, which is Einstein’s prediction of gravitational waves — which has been in the news a lot in the last couple of years. The story is that Einstein had this idea of thinking about somebody falling in an elevator. And they realized that the falling in an elevator is similar to what you get from gravity.

And so they came up with a principle of equivalence. And from that very, very slender insight, translated into mathematics through Riemannian geometry and tensor calculus and so on, which Einstein had to learn in order to do that, he was able to create this amazing mathematical edifice, which we call the general theory of relativity today, which is actually the theory of gravitation.

And it explains gravitation to a higher accuracy than Newton’s law of gravitation, and makes numerous predictions, of which the gravitational waves are one of the most spectacular. So that’s another fantastic example.

And it just boggles the mind that somebody could imagine that and create the science that makes these predictions. And, you know, a hundred years later using astonishing technology, we’re able to actually observe these things.

STROGATZ: It is. It seems almost like a miracle. It’s something that the physicist Eugene Wigner, in a famous essay in 1960, posed [as] “the unreasonable effectiveness of mathematics in the natural sciences,” and you already alluded to this phrase of his. What is unreasonable about it?

GOLDENFELD: So you and I have been talking about new qualitative phenomena that you predict, for example, from Faraday’s law and all these things that Maxwell had to work with. That’s one thing that’s very important about science, is that we can predict things that you would otherwise not expect.

But the unreasonable effectiveness that Wigner is talking about is the accuracy with which it makes those predictions.

So here’s another example. You look at, say, the quantum mechanics of an atom interacting with Maxwell’s electromagnetic field. When you take the electromagnetic field, you apply quantum mechanics to the interaction of that with an atom. You’re able to make predictions to something like 10 decimal places of accuracy. And those agree with experiments to all significant figures that the experiments, in theory, are applicable for. And that’s astonishing. And I think Wigner and Einstein wanted to know how could it be that such very simple mathematics has such great explanatory power.

And people may say, “Well, what do you mean it’s simple? You know, Einstein’s theory of relativity, general theory of relativity, is one of the most complicated pieces of mathematical physics that you can learn.” And that’s true.

But the physical insight that goes into it is literally very simple. Just, acceleration is literally like a gravitational force. And then being able to turn that into a mathematical equation, which you can then make simple predictions from, is really where the beauty and the amazingness lies. So, I think that’s one aspect of it.

There’s another thing, though, that is not talked about very much, which is that this idea that mathematics and physics is so powerful in its explanations makes another assumption. That assumption is reductionism.

This goes back to another quote of another founder of modern physics, Paul Dirac, who wrote down the relativistic wave equation, an equation that describes quantum mechanics connected with special relativity — it’s called the Dirac equation. And he rather arrogantly wrote that his equation describes most of physics and all of chemistry.

[STROGATZ laughs]

GOLDENFELD: And his idea is that basically — and it’s the same idea that motivates what today we call high-energy physics, but in an era with more bravado would be called elementary particle physics.

And there was the idea that you can just find the elementary building blocks of matter. And then once you’ve got those, all you have to do is put them together and you’ve explained everything in the world. And we know that that’s not true. And that’s the sort of fundamental insight that came out of physics around about 1950 or so. Led to the birth of what’s known as condensed matter physics, and is certainly operative on steroids when you look at biological phenomena, where just knowing the basic forces between atoms doesn’t explain, you know, why you can think.

So when we talk about the effectiveness, we’re talking about the effectiveness on very simple problems.

STROGATZ: Hmm, interesting distinction. So just to review some of these examples again to make sure I’m with you. With Maxwell, his equations, not so simple unless you know vector calculus or something equivalent. But then once you know that math, as I tried to emphasize, it’s just four little equations that can fit on a T-shirt. So simple in that way, and simple principles going into them.

But then your point seems to be, yes, but you can only predict simple phenomena like a propagating wave through a vacuum, whereas really complicated stuff, say, predicting patterns of thought in a human mind — I mean, this is the tricky part.

In principle, do we believe that it is actually somehow in the physics, but we just can’t figure out how to do the math to show phenomena like consciousness and emotion and all that? Or is there something else than what the physical laws imply?

GOLDENFELD: Well, I think there is. And this goes back to the question of why it is that we can do science at all.

If you truly believe that to understand, say, the phenomenon that we see in biology, you can get all of that, say, from Dirac’s equation or, you know, quantum mechanics and so on, then every time you try to understand something quantitatively — in biology or solid state physics, for example — you know, you’d have to worry about the radiative corrections to the mass of the top quark. And none of us think that all of those things that happen at such small scales inside a nucleon at very high energies have anything to do with, you know, why a bird can fly or stuff like that.

The fact that we can do science tells us that somehow these scales get separated through something which we typically call emergence. The great benefit of that is that we don’t have to solve everything all the way down in order to understand something.

STROGATZ: Interesting. So you’re saying worrying about quarks isn’t going to tell us anything about the behavior of the stock market tomorrow. We can somehow… It’s like, as if different scales in nature are insulated from each other, or something like that. What’s the language you would use? You spoke of separation.

GOLDENFELD: I talked of separation and I talked about emergence. And I’d like to give you another example of that which is very different from the one that people like Einstein and Wigner and Dirac and so on would’ve used, and they wouldn’t even have known about it.

So there’s a phenomenon in nature called a phase transition. The simple example is, you take a lump of ice and heat it up, and eventually it’ll melt into liquid. So it’ll go from the solid phase into the liquid phase. And then from there, if you heat it up further, it’ll go into the gas phase.

And another example would be if I took a magnet and I heated it up. It turns out that above a certain temperature, a magnet will stop being magnetic. There is a theory of that transition, the magnetic transition, and other transitions which are like it, such as how materials become superconducting and very exotic things like that. But the most interesting thing about the transition is that we can understand it using a branch of physics called “renormalization group theory.” And I’m not going to go into the technicalities of it, but what the theory predicts is that if you measure how magnetic something is very close to the temperature where it first becomes a magnet, whilst also applying a magnetic field, you get a certain magnetization that you can measure as a function of temperature and external magnetic field.

And you can do this for any magnet that you’d like. But it doesn’t really matter what the atoms are. And if you take the data and process it in a certain way, what you find is the results are the same for every single magnetic material. It doesn’t matter what it is. As you go just below the temperature where it first becomes magnetic, you find that it obeys a certain equation. And that equation is exactly the same for every material. And not just exactly the same: All the data from all the different magnetic materials, they all lie on one curve. And physicists call this universality. We completely understand that.

Now, the other thing that is amazing, though, is that we can make a theoretical prediction about what that curve should be if you process the data in the way that the theory tells you to do it. And when you take the data and you take the theoretical curve, it falls exactly on the experimental data.

OK, so that’s fantastic. This model of what a phase transition is is very successful and obviously extremely accurate. Not only does it predict this universality, but it also predicts exactly not just a number, but a whole function.
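The collapse of data from different magnets onto one curve can be illustrated with a toy calculation. The sketch below does not use the renormalization group. It assumes the much simpler Landau mean-field equation of state, h = a*t*m + b*m^3, where t is the reduced temperature (T - Tc)/Tc, h is the applied field, m is the magnetization, and a and b are hypothetical material-specific constants. Real magnets have exponents that differ from the mean-field values 1/2 and 3/2 used here, so this shows only the idea of "processing the data in a certain way," not the measured numbers.

```python
import math

def magnetization(a, b, t, h):
    """Solve the Landau equation of state  h = a*t*m + b*m**3  for m >= 0.

    a and b are material-specific (non-universal) constants, t is the reduced
    temperature and h > 0 is the applied field. For h > 0 the equation has
    exactly one positive root, which bisection finds.
    """
    def f(m):
        return a * t * m + b * m**3 - h

    lo = 0.0
    # Upper bracket chosen so that f(hi) >= 0 for any sign of t.
    hi = math.sqrt(a * abs(t) / b) + (h / b) ** (1.0 / 3.0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rescale(a, b, t, h, m):
    """Rescale field and magnetization with mean-field exponents
    (beta = 1/2, beta*delta = 3/2) and two material-specific amplitudes."""
    m_scaled = m * math.sqrt(b / a) / abs(t) ** 0.5
    h_scaled = h * math.sqrt(b) / a**1.5 / abs(t) ** 1.5
    return h_scaled, m_scaled

def universal_curve(m_scaled, t):
    """The single curve every material should follow after rescaling."""
    return math.copysign(1.0, t) * m_scaled + m_scaled**3

# Three made-up "materials" with different constants (assumed values).
materials = {"material A": (1.0, 1.0), "material B": (3.0, 0.5), "material C": (0.7, 2.0)}

max_deviation = 0.0
for name, (a, b) in materials.items():
    for t in (-0.10, -0.05, -0.01):      # just below the transition
        for h in (1e-4, 1e-3, 1e-2):     # small applied fields
            m = magnetization(a, b, t, h)
            h_scaled, m_scaled = rescale(a, b, t, h, m)
            deviation = abs(h_scaled - universal_curve(m_scaled, t))
            max_deviation = max(max_deviation, deviation)

print(f"largest distance from the common curve: {max_deviation:.2e}")
```

Under these assumptions the three made-up materials all land on the same curve below the transition, h_scaled = m_scaled^3 - m_scaled, up to rounding error. The only material-specific information left is in the two amplitudes used for the rescaling.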

And it’s a whole relationship that you can measure experimentally. So that’s true. And I like to say that it’s not really true that the model has given a precise prediction in agreement with experiment. It’s really a model of a model of a model of a model.

STROGATZ: [laughs] OK, what?

GOLDENFELD: Yeah. OK.

STROGATZ: You better explain that.

GOLDENFELD: Yes. A model of a model of a model of a model. So, so why is that? Well, so suppose you said to a scientist, OK, make a theory for me of a magnet. So they’d say, well, a magnet is made out of atoms. So in order to understand atoms and how they interact and become magnetic, I need to worry about the electrons on the atoms. I need to worry about the magnetic moments of those atoms. And so I make a model of the material based on quantum chemistry.

But that model is unimaginably complicated, and it gives you no hint that there could be something that doesn’t depend on atoms in it. Because the model itself is very specific to the particular atoms.

So then you say, well, really, that’s way too hard. Maybe a quantum chemist could simulate this and make a prediction. And if they did that, they would see that the prediction did agree with what you see experimentally, and does agree with what the theory predicts. But that’s a very huge computer calculation.

So then you say, let’s simplify it. Let’s just not worry about the atoms too much. Let’s just worry about how the electrons move around in the material. So you go ahead and do that, and you find you’ve got a complicated model of electronic structure.

STROGATZ: Sorry let me interrupt for a second, just to make sure that this whole model of a model thing is clear. So there was the real magnet, then there was the quantum chemistry model of the magnet, then there was the electronic structure model of the quantum chemistry model.

GOLDENFELD: Yes, well, now we’re going to go to the quantum Heisenberg model of the magnetic moments of the electrons inside the electronic structure, which came from the quantum chemistry. And that model is too hard.

So you say, OK, well, let’s throw away quantum mechanics. We’ll just make it classical. So you do that, and the model is still too complicated. So then you say, well, let’s take the thermodynamics, which is what everything depends upon in any case, and let’s do some kind of expansion of that. And that’s a model where you can finally do a calculation.

As you said, you’ve got one, two, three, four, five models of a model of a model of a model of a model of this material. And at each step along the way, you have made an approximation that would be rejected from every physics journal. Because everybody would say, “That’s an approximation you can’t justify. There’s no small quantity. No idea what you’re talking about. How can that possibly work?”

STROGATZ: I must also say that here in the math department, you know, people would be hysterical.

GOLDENFELD: Oh, yes. Oh, yes. They would be horrified. But the joke’s on them. Because, at the end of the day, you do this whole procedure, and then you find you make a prediction with no adjustable parameters, and it agrees precisely with experiment.

STROGATZ: Dun, dun, dun. [laughs]

GOLDENFELD: Dun, dun, dun. Every step along the way, the approximations you’re making are not systematic and not justifiable, at least ahead of time. And that, I think, is a fantastic way to articulate this mystery that you’re alluding to.

STROGATZ: Hmm, that is a marvelous exposition. I didn’t imagine this ahead of time while preparing for this interview, but I love it. And I think you’re really capturing the mystery. It’s like we have no right for this to work as well as it does. It’s as if nature is somehow acting in a very forgiving or convenient or cooperative manner for us. Like it’s helping us get lucky or something.

GOLDENFELD: Well, that’s the thing. This happens only under special circumstances. In this particular case, very close to a phase transition. So we understand how it works there. But these different levels of description that I alluded to, you know, all of these are different ways of describing something at different length and time and space scales. And as you go to each level, you kind of absorb all the complications of the level lower down into some parameter that is in the description that you’re talking about. And then once you’ve done that, you don’t need to worry about what happened below.

That, I think, is why we can solve this particular problem and why it works so accurately.
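The idea of absorbing a lower level into a parameter has a textbook toy version that can be checked directly. The sketch below is not the magnet calculation Goldenfeld describes. It assumes a one-dimensional ring of Ising spins (each +1 or -1) with nearest-neighbor coupling K, and sums out every second spin. What remains is the same kind of model on half as many spins, with a new coupling K' = (1/2) ln cosh(2K) and an overall prefactor. The function names are illustrative, and this one-dimensional model has no finite-temperature phase transition, so it illustrates only the bookkeeping.

```python
import itertools
import math

def ring_partition_function(n_spins, coupling):
    """Brute-force partition function of an Ising ring with n_spins spins.

    Each configuration has weight exp(coupling * sum_i s_i * s_{i+1}),
    with periodic boundary conditions.
    """
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n_spins):
        bond_sum = sum(spins[i] * spins[(i + 1) % n_spins] for i in range(n_spins))
        total += math.exp(coupling * bond_sum)
    return total

def decimate(coupling):
    """Sum out every second spin of the ring.

    Returns the coupling between the surviving spins and the constant
    prefactor contributed by each spin that was summed out.
    """
    new_coupling = 0.5 * math.log(math.cosh(2.0 * coupling))
    prefactor = 2.0 * math.sqrt(math.cosh(2.0 * coupling))
    return new_coupling, prefactor

K = 0.8   # coupling on the fine scale (arbitrary illustrative value)
N = 8     # number of spins on the fine scale (must be even)

K_coarse, factor = decimate(K)
fine = ring_partition_function(N, K)
coarse = factor ** (N // 2) * ring_partition_function(N // 2, K_coarse)

print(f"fine-scale description:   {fine:.6f}")
print(f"coarse-scale description: {coarse:.6f}")
```

The two numbers agree: everything the discarded spins did is captured by the single parameter K' plus a constant, so the coarse description never needs to refer back to the level below.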

STROGATZ: We’re going to take a short break and we’ll be right back.

STROGATZ: Alright, welcome back. I’m speaking with Nigel Goldenfeld about how we can model complex phenomena — and how we manage to do it so accurately.

GOLDENFELD: So, when we talk about how we can do science at all, here is an example which says the only reason you can do this sort of calculation is because there’s these separations of scales and energy and time and space.

When you start talking about, you know, physics being successful, and biology or economics or social interactions and things like that, can we expect to be able to do those things if there isn’t any obvious way that one can separate scales, and make sure that what happens at very small scales doesn’t affect what happens at large scales? And it may be that there’s some areas of science where that is not true. And then you may not be able to be successful in those things.

STROGATZ: Hmm. That’s an interesting point. I may be going off the rails here, but I’m thinking of something like economics, which you might want to think of as the byproduct of hundreds or thousands or millions of people and firms interacting through markets and so on. That it’s a kind of complex system, economics, where the smaller scale, the molecules or the atoms or the quarks are people making individual decisions that then aggregate into an economy or a market.

In your example, where the fussy behavior of the top quark doesn’t affect what’s happening to the birds flying overhead, here we might not have that separation. Like, individual decision makers can have an outsized impact on the economy? Is that the issue that makes economics so difficult or one of the issues?

GOLDENFELD: Yeah, I don’t know about economics per se, but I’ve given this some thought in terms of finance. So, finance is a very interesting example to think about emergence. So remember in finance, we have data. We know every single transaction that occurred. We know when it occurred, how much. We have every piece of information like that. And now the question is, can you make predictions based on it?

So let me give you an example. First of all, of course, we know that you can’t predict things very well, and not only can you not predict things into the future, you can’t even predict things into the past.

[STROGATZ laughs]

GOLDENFELD: So, there was a wonderful example of this, which was an event called the Flash Crash. Do you remember what that is?

STROGATZ: You should remind us. I’m not sure I remember when and what happened.

GOLDENFELD: On May the 6th, 2010, there was a trillion-dollar crash of the U.S. stock market. The Dow Jones plunged like a thousand points within a few minutes. And eventually it came back up again. And this was an unexpected event, and to this day, people aren’t really 100% sure what triggered that. It certainly wasn’t something that people expected at the time.

What actually happened, I believe, is that you have a cooperative phenomenon where a lot of people are doing algorithmic trading. They’re all more or less using the same signals to trigger their computer-guided trades, and I think the whole system just synchronized and crashed, and eventually people had to stop the thing happening by pulling from the network and things like this. So this is an example of extreme sensitivity cascading through the system because of collective properties of the whole financial system, properties that nobody even knew were there.

STROGATZ: Hmm. Yeah, it’s interesting to hear you use the word “cascade,” because that comes up in connection with the power grid, where sometimes you’ll have an event like a lightning storm somewhere and then because, as you say, there’s this connectivity, in this case, through high voltage transmission lines in the power grid, you can get propagating failures.

So this does seem to be another example where a small-scale event can propagate and have consequences at a much broader scale. So is this the idea why maybe the hard sciences are the easiest?

GOLDENFELD: Oh, I always say that the hard sciences are the easiest. Yes, the reason physics is so successful is because we only ask very simple questions.

STROGATZ: So the supposed soft sciences are, in a certain sense, you would say then, the hardest?

GOLDENFELD: So you have to ask a question. You know, what is the purpose of science? What do we want to be able to predict? So, let’s go back to my example about the phase transition. I talked about this example of how you can look at the behavior of a magnet very close to the temperature where it becomes a magnet. And there’s a universal phenomenon there, and we understand it exquisitely, and it’s wonderful and it’s amazing. So, the listener might get the impression that we understand everything about this and there’s nothing mysterious about it at all. But there is still one thing that I didn’t tell you. And that is this.

There is a temperature where every material becomes a magnet. But that temperature is different for each of the materials. And we don’t know how to predict that number very accurately. That number is not something that is universal, unlike the curve that I alluded to that tells you the response of a magnet.

That number depends on everything. All of the levels of description that I swept under the rug in order to explain what happens near a phase transition. All of those things come back to bite you when you want to know what is that actual critical temperature where the material first becomes a magnet.

STROGATZ: Hmm, interesting.

GOLDENFELD: So you have to ask the questions that you ask in science with an eye to saying, “First of all, let me ask the easy questions, the ones that don’t depend on too much. First I understand those things, and then later on we’ll get to the other ones.” And maybe never, but there’s a sort of rational order in which you would ask questions. And so science, in some sense, has to be realistic in what its goals are.

STROGATZ: Hmm, so then the resolution to our earlier question about why is science even possible, if I’m hearing you right, you’re suggesting that some things in nature could be described by the adage that you hear people say all the time, “Everything depends on everything else.” And some things in nature are not like that; not everything depends on everything else.

Am I on the right track there? That the ones where everything does depend on everything else are really going to be hard.

GOLDENFELD: Yeah. Yeah. And there’s no shortcut. And there’s other things where, if you ask the question in the right way, you can get an interesting answer, which is useful and it helps your understanding of the phenomena and so on. But if you want to know, you know, what the actual number is in degrees Fahrenheit, well, it’s not going to tell you that.

STROGATZ: Hmm. So then it seems like we’re coming to what some people might view as a disappointing cop-out of an answer, which is that science is possible because we restrict ourselves to the questions that have this kind of separation or an insulation that lets us do calculations where what’s happening here doesn’t depend on what’s happening out at Alpha Centauri.

And so it’s like we can answer the things that are easy in this sense, that they’re well separated. The others are just going to be hopeless forever? Is that the idea?

GOLDENFELD: Well, I don’t think it’s a cop-out. I think it’s a great advance to be able to say, “This question here, that’s an example of one of those things that you shouldn’t ask. And this question here is an example of one that you should.” So about 15 years or so ago, we came up with a theory that explains why there is one genetic code. It’s a general theory about the ability to express genes and make proteins and that’s what the genetic code is for, and also, by the way, it explains how life could have evolved so rapidly early on. So, it’s quite an interesting theory. And so often I’ll go and give a talk about this work, and people will ask me, “Well, why are there 20 amino acids of life?” OK? And I’ll say, “I haven’t a clue.”

[STROGATZ laughs]

GOLDENFELD: And so I think that’s an example of one of those questions that you shouldn’t ask. And I’ve got another reason for saying that. So the genetic code is literally a code book that goes from DNA, or actually messenger RNA, to amino acid, that then gets linked into a protein.

So it’s a kind of grammar, a language of molecular biology. So Francis Crick, who of course had with [James] Watson discovered the structure of DNA, wanted to try to understand why there are 20 amino acids in life. And he came up with an amazing and beautiful theory, which is mathematical. Can I tell you what the theory is? I don’t know if you know about it.

STROGATZ: So 20 amino acids, and there’s a theory for why 20?

GOLDENFELD: Yeah. So the theory is very simple. If you have a sequence of letters in threes — ACG, TAC, whatever, these correspond to certain nucleotides — you don’t know where the sequence starts. So really, whenever you read the genome, you should put commas in to tell you where the words start. So, Crick asked the question, “Well, can you make a code so that if I’ve got four nucleotide bases, what is the largest number of amino acids you can code for so that every string in this code can make sense without you having to put in the commas?”

STROGATZ: It’s a very natural question, a beautiful question.

GOLDENFELD: It’s a beautiful question, and he came up with an answer. And the answer was that the largest number of amino acids you can get is 20. Hence, 20 amino acids of life. So then you can enumerate all of these codes without commas that Francis Crick had postulated. You can enumerate them. And when the actual genetic code was discovered by [Marshall] Nirenberg and others five or six years later, the actual code is not one of the ones that he had predicted. It’s completely wrong.
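The number 20 can be reproduced with a short counting argument, sketched here in Python. A repeated triplet such as AAA AAA contains AAA again when read one letter out of frame, so the four triplets made of a single repeated base are excluded. The remaining 60 triplets fall into groups of three cyclic rotations (ACG, CGA, GAC), and at most one member of each group can be a codeword, because ACG ACG contains both CGA and GAC out of frame. That gives at most 60 / 3 = 20 codewords. The specific 20-word code built below (first letter earlier in the alphabet than the second, third letter no later than the second) is one convenient construction chosen for illustration, not necessarily one Crick wrote down, and the function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"

def cyclic_class(word):
    """All rotations of a word, e.g. ACG -> {ACG, CGA, GAC}."""
    return frozenset(word[i:] + word[:i] for i in range(len(word)))

def upper_bound():
    """Count rotation classes of size 3; each can supply at most one codeword."""
    classes = {cyclic_class("".join(w)) for w in product(BASES, repeat=3)}
    return sum(1 for c in classes if len(c) == 3)

def is_comma_free(code):
    """Check that no out-of-frame triplet in any pair of adjacent codewords
    is itself a codeword."""
    for first, second in product(code, repeat=2):
        joined = first + second
        if joined[1:4] in code or joined[2:5] in code:
            return False
    return True

def example_code():
    """One comma-free code of maximal size: triplets xyz with x < y >= z
    in alphabetical order (an illustrative construction)."""
    rank = {base: i for i, base in enumerate(BASES)}
    return {
        x + y + z
        for x, y, z in product(BASES, repeat=3)
        if rank[x] < rank[y] >= rank[z]
    }

code = example_code()
print("upper bound on codewords:", upper_bound())
print("size of example code:   ", len(code))
print("example is comma-free:  ", is_comma_free(code))
```

The bound and the construction meet at 20, which is what made the idea so appealing. As the conversation goes on to say, the real genetic code turned out not to be one of these codes.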

So, this is, if you like, the reasonable ineffectiveness of mathematics in biology. Because, in fact, the real code is a product of evolution. And there’s nothing special about the number 20.

So, this is an example of, you’ve got to ask the right question. You thought you could do science, biology in this case, using the same sort of elegant mathematical principles that are so powerful in physics, but you completely get egg on your face when you try them, without really understanding more about the scientific phenomena that are relevant in biology.

STROGATZ: And so would you generalize, then, to say that the role of history or evolution or contingency, those kinds of things, are another ingredient for why we might expect certain subjects to be difficult, or maybe not amenable to the elegance of math? Is that the issue?

GOLDENFELD: Well, it is, but it’s not completely hopeless. I mean, we did make a theory for the evolution of the genetic code, which did explain, you know, how is it that the world started 4.6 billion years ago? The last universal common ancestor of all life on Earth today was around 3.8 billion years ago.

So that means that in less than a billion years, life went from nothing to the architectural complexity of the modern cell. And then after that hardly evolved at all.

OK, so that is staggering. I mean, I don’t know, of course, the ultimate reason of how life evolved and so on, but at least when we made our theory of this, it explained why it evolves so rapidly, and it explained why the genetic code is so accurate and why there’s only one of them. So it explains some things, but not the other.

So we definitely understood — advanced in our understanding of basic science, but we were able to do that because we fully recognized, “Here’s a question that’s not going to be a good one to go after. Here’s a question that we might be able to do.” And I think one of the jobs of the scientist is to really ask the right questions in the right way. And that’s harder than it looks.

STROGATZ: Oh, that’s a very, very marvelous stopping point for us in a way, that part of the secret of science is the art of asking the right questions. There’s even a book with that name, isn’t there? Isn’t that Peter Medawar’s book, The Art of the Soluble?

GOLDENFELD: That’s right, but yes, I mean all science starts with asking questions. And if you don’t know how to ask questions, you can’t do science. Science is not the technology, the techniques of doing science. Of course, that’s how we’re able to do it, but fundamentally, it comes from asking questions.

STROGATZ: Probably a lot of our listeners are thinking: What about everything that’s going on today with machine learning, artificial intelligence, the possible existence of quantum computers that’s supposed to solve all kinds of problems once they really start to get serious?

Do you think that those kinds of technologies will help us deal with these intricately interwoven kinds of problems where everything or many things depend on each other, and we don’t have a good separation?

GOLDENFELD: Yes. Well, there’s two things I want to say about that. I’m really glad you raised that issue. So one of the things is the phenomenon of emergence. I mean, when people started building, you know, things like ChatGPT and so on, what those things are, are basically machines that can predict the next word. And nobody expected that those machines could pass the bar exam or medical exams or help people with their homework or help people write computer programs and so on. The range of applications has been staggering and a surprise even to the people who built these machines. And in fact, nobody really knows how they work.

In fact, if you look at the effectiveness of AI in solving problems, it also exhibits power-law relationships very much like the ones that you see in phase transitions. So one of the things I think is a great frontier for science is trying to understand how these machines are able to do so much more than what they were designed to do.

The other thing, where I think it’s important, is that what AI is very good at is discerning patterns in data, which are so complex that we don’t perceive as well as these machines.

So I think there’s great opportunity to use them to solve problems, which are very, very hard. The problem that I think is an ambitious problem I think could only be solved by using AI is trying to understand the origin of instinct. So how is it that instincts are coded in biological organisms? OK? We understand the genetic code and we understand how the proteins that go into living organisms, how they’re coded and so on.

But going from that level of description to the complexity of an organism like a fish that knows where to swim to in order to go to its breeding ground and seagulls and things like this. Clearly, we’ve somehow managed to encode very, very complex behavior. So this is something that reaches across all scales of living systems. And it’s hard for me to see, in principle, how something as complicated as instinct can be coded, but I think that AI would be able to perhaps be a tool that we could use to help us make a scientific discovery and not just, you know, build amazing technological machines.

STROGATZ: Hmm. Well, that is fascinating, Nigel. I knew it would be provocative and stimulating to talk to you and you’ve just, I think, demonstrated how the art of science is asking good questions with that question you’ve left us with. So thank you. We’ve been speaking with physicist Nigel Goldenfeld. It has been a really great pleasure to talk to you today. Thank you.

GOLDENFELD: Thank you.

STROGATZ: Thanks for listening. If you’re enjoying “The Joy of Why” and you’re not already subscribed, hit the subscribe or follow button where you’re listening. You can also leave a review for the show — it helps people find this podcast.

“The Joy of Why” is a podcast from Quanta Magazine, an editorially independent publication supported by the Simons Foundation. Funding decisions by the Simons Foundation have no influence on the selection of topics, guests or other editorial decisions in this podcast or in Quanta Magazine.

“The Joy of Why” is produced by PRX Productions; the production team is Caitlin Faulds, Livia Brock, Genevieve Sponsler, and Merritt Jacob. The executive producer of PRX Productions is Jocelyn Gonzales. Morgan Church and Edwin Ochoa provided additional assistance. From Quanta Magazine, John Rennie and Thomas Lin provided editorial guidance, with support from Matt Carlstrom, Samuel Velasco, Nona Griffin, Arleen Santana and Madison Goldberg.

Our theme music is from APM Music. Julian Lin came up with the podcast name. The episode art is by Peter Greenwood and our logo is by Jaki King and Kristina Armitage. Special thanks to the Columbia Journalism School and Bert Odom-Reed at the Cornell Broadcast Studios.

I’m your host, Steve Strogatz. If you have any questions or comments for us, please email us at quanta@simonsfoundation.org.

[Theme fades out]

The State of ...

Mikes Notes

Each year, Sacha Greif publishes regular surveys on using various web technologies. These surveys involve the opinions of tens of thousands of web developers and show trends over time. The reports are published in many languages and are considered authoritative.

Reports

Resources

"This whole thing started because of my own confusion. Way back in 2016 I knew I needed to step my web development game up, but I didn’t know where to start or which framework to pick. I figured other people might be struggling with the same issue, and so the State of JavaScript survey was born.

Later on, thanks to Raphaël Benitte’s dataviz expertise we were able to not only turn the State of JavaScript survey into one of the biggest developer surveys around, but even launch the State of CSS survey, which itself has gone on to play a key role in the ecosystem by informing browser vendors’ roadmaps.

Then Alexey Pyltsyn helped us manage volunteers to translate the surveys in over 25 languages, Sarah Fossheim gave us a hand with accessibility, Chris Kirk-Nielsen contributed some amazing t-shirt and logo designs, Kilian Valkhof pitched in to help with mobile testing, and Philip Jägenstedt helped unlock vital funding for the surveys as well as connect us with key players in the ecosystem.

And today, thanks to Eric Burel’s back-end know-how we’re building a new survey platform and using it to launch a third survey, the State of GraphQL.

Take all these people; add many more contributors, volunteers, and translators; and finally me, Sacha Greif; and what you get is an amazing community of developers spanning the globe. We needed a name for this collective, so say hello to Devographics." - Sacha Greif

HTML5 Boilerplate

Mikes Notes

I have been looking for a robust and proven base CSS example. I have found two that are open source and freely available, each with a community behind it. Both are on GitHub and have good documentation, code and examples. I have used the best of both.

Now in production.

CSS Templates

  • HTML5 Boilerplate
  • Motherplate

HTML5 Boilerplate

"HTML5 Boilerplate is an HTML, CSS and JavaScript template (or boilerplate) for creating HTML5 websites with cross-browser compatibility." - Wikipedia
"The web’s most popular front-end template.

HTML5 Boilerplate helps you build fast, robust, and adaptable web apps or sites. Kick-start your project with the combined knowledge and effort of 100s of developers, all in one little package." - HTML5 Boilerplate

Resources

Motherplate

"Motherplate: A Responsive SCSS Boilerplate for Web Designers.

This is a bare bones HTML/CSS framework. This is what I'll typically start off most web projects with.
It includes a CSS reset and a bunch of minimal boilerplate styles that should come in useful for any project, including a responsive grid, typography, buttons, icons and forms.
It is not as in depth as something like HTML5 Boilerplate and doesn't include styled components like Bootstrap.
It can be used for a static web project as is, or you can copy the CSS folder into an existing framework (e.g. Rails)." - Motherplate

CSS

  • base/config Put all your variables in here e.g. colors, padding, border radius - this helps with consistency across your project.

  • base/grid A basic responsive grid system with 12 columns.
  • base/ie Any styles that you need to add in order for Internet Explorer to work.
  • base/mixins Reusable Sass mixins e.g. clearfix.
  • base/print Basic print stylesheets to make your pages look better when printed.
  • base/responsive Add any global responsive styles here e.g. hide elements, show elements, resize elements.
  • base/shame Keep this to hand for any quick and dirty CSS you need to add but plan to tidy later.
  • base/type Basic styling for your typography.
  • components/alerts Alerts to notify or give feedback to the user
  • components/buttons Styles for any text links and/or buttons.
  • components/forms Some basic form styles.
  • components/media Styles for images, video etc.
  • components/nav Inline navigation.
  • components/other Other reusable styles that come in handy.
  • components/tables Styles for tables.
  • pages/home Styles that are specific to the homepage
  • pages/layout Global layout styles e.g. header, footer, logo etc.
  • main.scss This brings all the partials together.

Example

CSS
// Reset default browser styles using Normalize
@import "../../node_modules/normalize-scss/sass/_normalize.scss";

// Install FontAwesome for useful icons
@import "../../node_modules/font-awesome/scss/font-awesome.scss";

// Import fonts from Google
@import url(https://fonts.googleapis.com/css?family=Open+Sans:400,700|Lato:400,700);

// Set variables and reusable mixins
@import "base/config";
@import "base/mixins";

// Import typical layout styles
@import "base/grid";
@import "base/type";

// Import reusable modules
@import "components/media";
@import "components/buttons";
@import "components/tables";
@import "components/forms";
@import "components/alerts";
@import "components/nav";
@import "components/other";

// Specific project styles, add any section specific sass modules here
@import "pages/layout";
@import "pages/home";

// Additional styles to think about
@import "base/responsive";
@import "base/print";

// If IE support is needed
@import "base/ie";

// For anything quick and dirty that needs thrown in
@import "base/shame";

Resources

Things You Should Know About Databases

Relational Databases Explained

How Relational Databases Work. This post talks about how indexes and transactions work on the inside of relational databases.

By: Mahdi Yusuf

Architecture Notes: 27/07/2022

It is often surprising how little is known, even at a surface level, about how databases operate, considering they store almost all of the state in our applications. Yet they are foundational to the overall success of most systems. So today, I will explain the two most important topics when working with RDBMSs: indexes and transactions.

So, without fully getting into the weeds on database-specific quirks, I will cover everything you should understand about RDBMS indexes. I will touch briefly on transactions and isolation levels and how they can impact your reasoning about specific transactions.




Relational Databases Explained Infographic

What is an RDBMS?

A relational database is a digital database based on the relational data model, as proposed by E. F. Codd in 1970. A relational database management system (RDBMS) is used to maintain relational databases. Many relational database systems have an option of using the SQL (Structured Query Language) for querying and maintaining the database. Examples include MySQL and PostgreSQL.

What is an index?

An index is a data structure that decreases the look-up time for requested data. Indexes achieve this at the additional cost of storage, memory, and upkeep (slower writes), and in exchange they let us skip the tedious task of checking every table row.

Like an index in the back of a textbook, it helps you get to the right page. I am not a great fan of the book analogy, because it quickly falls apart as we dig deeper into database indexes, but it is an excellent way to introduce the topic.

Why do we need indexes?

Small amounts of data (think of an attendance list for a small class) are manageable, but larger amounts (think of a birth registry for a large city) are less so. Everything that used to be quick gets slower, eventually too slow.

Think about how your strategy would change if you had to find something on one page vs. a thousand pages of names. No, seriously, take a second and think.

No matter what you come up with, some database has at some point implemented almost every good strategy you can think of. As systems grow, they collect and store more data, eventually leading to the problem above.

We need indexes to help us get the relevant data we need as quickly as possible.

How do indexes work?



Read performance increases as you index the data, but that comes at the cost of write performance since you need to keep the index up to date.

One solution often proposed for the problem above is to store the data in the order in which you would search it. For example, if you want to search the list by name, you would sort the list by name. There are a few issues with that strategy. I will pose them mostly as questions for the reader here:

  • What if you want to search the data in multiple ways?
  • How would you deal with adding new data to the list? Is that fast?
  • How would you deal with updates?
  • What is the Big O complexity of these tasks?

Something to think about. Regardless of your original strategy, we definitely need a way to maintain order so we can quickly get relevant unordered data (more on that soon).

Let's take Figure 1.1 below.

The underlying data is spread around storage in no particular order and allocated seemingly at random. Nowadays, most production servers come with SSDs. There are still some cases where you would want spinning disks (HDDs), but honestly, those cases are becoming rarer as SSD prices come down significantly.

SSD vs. HDD

The main difference between a solid-state drive (SSD) and a hard disk drive (HDD) is how data is stored and accessed. HDDs use mechanical spinning disks and a moving read/write head to access data (latency), while SSDs use much faster memory chips, especially when reading many small files. Therefore, if the price isn’t an issue, SSDs are a better option — especially since modern SSDs are just about as reliable as HDDs.

Reading that small amount of data into memory is quite fast, and the data is relatively trivial to scan. But what if the data we are searching across can't be cached entirely in memory, or reading all the data from disk takes too long?

So here is where most developers go – I have seen this problem before; we need some dictionary (hash map) and a way to get to the specific row we are looking for without having to scan the slow disk, reading tons of blocks to see if the data we need is there.

These are called index leaf nodes. Each is built for a specific indexed column and stores the location of the matching row(s).

RowIDs indexes mapping to table data.

These index leaf nodes are the mapping between the indexed column and where the corresponding row lives on the disk. This gives us a quick way to get to a specific row if you reference it by indexed column. Scanning the index can be much faster since it is a compact representation (fewer bytes) of the column you are searching by. It saves you time reading a bunch of blocks looking for the requested data and is much more convenient to cache, further speeding up the entire process.
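To make the mapping concrete, here is a minimal Python sketch. The table contents, the column name last_name, and the use of list positions as row IDs are all illustrative assumptions; a real RDBMS keeps its leaf nodes in blocks on disk rather than in a dictionary.

```python
from collections import defaultdict

# Hypothetical table: the list position stands in for the row's location on disk.
rows = [
    {"id": 1, "last_name": "Nguyen"},
    {"id": 2, "last_name": "Smith"},
    {"id": 3, "last_name": "Garcia"},
    {"id": 4, "last_name": "Smith"},
]


def full_scan(table, column, value):
    """Check every row: the slow path that an index lets us skip."""
    return [row_id for row_id, row in enumerate(table) if row[column] == value]


def build_index(table, column):
    """Map each value of the indexed column to the row IDs where it occurs."""
    index = defaultdict(list)
    for row_id, row in enumerate(table):
        index[row[column]].append(row_id)
    return index


last_name_index = build_index(rows, "last_name")
matches = last_name_index.get("Smith", [])  # row IDs found without scanning the table
```

Both paths return the same row IDs; the difference is that the index only touches a compact structure keyed by the column, at the price of rebuilding or updating it on every write.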

These index leaf nodes are of uniform size, and we try to store as many of them as possible per block. Since this structure requires things to be sorted (logically, not physically on disk), we need to solve the problem of adding and removing data quickly; the good ol' linked list manages this, more specifically, a doubly linked list.

Blocks

In computing, a block is a grouping of bytes that usually contains a fixed number of records and is limited by a total length (the block length). So if we divide the block length by the number of bytes it takes to store a row, we get how many rows can be read from a single block.

At a very low level you can use this to reason about how performant your systems can be. Quick Maths™ can be very powerful when you are capacity planning.
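As a rough sketch of that kind of back-of-the-envelope calculation, the Python below divides a block length by a row size. The 8 KiB block and 200-byte row are assumed figures chosen for illustration, and the sketch assumes a row never spans two blocks.

```python
def rows_per_block(block_bytes: int, row_bytes: int) -> int:
    """Whole rows that fit in one block (a row is assumed not to span blocks)."""
    return block_bytes // row_bytes


def blocks_needed(total_rows: int, block_bytes: int, row_bytes: int) -> int:
    """Blocks required to hold total_rows, rounding up to a whole block."""
    per_block = rows_per_block(block_bytes, row_bytes)
    return -(-total_rows // per_block)  # ceiling division


# Assumed figures for illustration: 8 KiB blocks and 200-byte rows.
per_block = rows_per_block(8192, 200)
total_blocks = blocks_needed(1_000_000, 8192, 200)
```

With these assumed numbers, a full scan of a million rows means reading every one of those blocks, which is the cost an index is meant to avoid.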

The benefits here are twofold: it allows us to read the index leaf nodes both forward and backward and quickly rebuild the index structure when we remove or add new rows since we are just modifying pointers—potent stuff.

Linked List

A linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each piece points to the next. It is a data structure consisting of a collection of nodes representing a sequence together. In its most basic form, each node contains data and a reference (in other words, a link) to the next node in the series.
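The doubly linked variant used for index leaf nodes can be sketched as follows. The class and function names are hypothetical, and each node holds a single key for brevity, whereas a real leaf node holds many entries.

```python
class LeafNode:
    """One index leaf node in a doubly linked list (a simplified stand-in)."""

    def __init__(self, key):
        self.key = key
        self.prev = None
        self.next = None


def insert_after(node, new_node):
    """Splice new_node in after node by updating pointers only."""
    new_node.prev = node
    new_node.next = node.next
    if node.next is not None:
        node.next.prev = new_node
    node.next = new_node


def remove(node):
    """Unlink node from its neighbours; no other node has to move."""
    if node.prev is not None:
        node.prev.next = node.next
    if node.next is not None:
        node.next.prev = node.prev
    node.prev = node.next = None


def keys_forward(head):
    """Walk the list from the first node using the next pointers."""
    out = []
    while head is not None:
        out.append(head.key)
        head = head.next
    return out


def keys_backward(tail):
    """Walk the list from the last node using the prev pointers."""
    out = []
    while tail is not None:
        out.append(tail.key)
        tail = tail.prev
    return out


a, c = LeafNode(10), LeafNode(30)
insert_after(a, c)
b = LeafNode(20)
insert_after(a, b)  # logical order is now 10, 20, 30
```

Inserting 20 between 10 and 30 touched only four pointers, which is why sorted order can be maintained logically without rearranging anything physically.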

Since these leaf nodes aren't arranged physically on disk in order (remember pointers maintain the sorting in the doubly linked list), we need a way to get to the correct index leaf nodes.

Balanced Trees (B-Trees)


Structural difference BTrees vs. B+Trees

So you might wonder where you made a massive error to find yourself reading about the B-Trees you hated in school. I get it: these things are boring, but they are powerful and worth understanding.

B+Trees allow us to build a tree structure where each intermediate node points to the highest node value of its respective leaf nodes. This gives us a clear path to find the index leaf node that will point to the necessary data.

This structure is built from the bottom up so that an intermediate node covers all leaf nodes until we reach the root node at the top. The tree is called balanced because the depth is uniform across the entire tree.

B-Tree vs. B+Tree

The main difference B+ Trees show off is that intermediate nodes don't store any data on them. Instead, all the data references are linked to the leaf nodes, which allows for better caching of the tree structure.

Secondly, the leaf nodes are linked, so if you need to do an index scan, you can do a single linear pass rather than traversing the entire tree up and down and loading more index data from the disk.
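A heavily simplified sketch of both properties follows, using a single intermediate level. The keys, the row IDs such as "r4", and the function names are made up for illustration, and a list index stands in for the link from one leaf to the next; a real B+Tree has several levels and splits nodes as they fill up.

```python
from bisect import bisect_left

# Hypothetical leaf nodes: sorted (key, row_id) pairs. Leaf i links to leaf i + 1.
leaves = [
    [(3, "r7"), (8, "r2"), (12, "r9")],
    [(15, "r1"), (21, "r4"), (27, "r5")],
    [(30, "r3"), (42, "r8"), (57, "r6")],
]
# The intermediate node keeps only the highest key of each leaf, no row data.
highest_keys = [leaf[-1][0] for leaf in leaves]


def find_leaf(key):
    """Index of the first leaf whose highest key is >= key, or None."""
    i = bisect_left(highest_keys, key)
    return i if i < len(leaves) else None


def lookup(key):
    """Descend to exactly one leaf, then search inside it."""
    i = find_leaf(key)
    if i is None:
        return None
    for k, row_id in leaves[i]:
        if k == key:
            return row_id
    return None


def range_scan(low, high):
    """Find the starting leaf once, then follow leaf links in a single pass."""
    i = find_leaf(low)
    result = []
    while i is not None and i < len(leaves):
        for k, row_id in leaves[i]:
            if k > high:
                return result
            if k >= low:
                result.append(row_id)
        i += 1  # follow the link to the next leaf
    return result
```

Note that range_scan consults the intermediate node only once; after that it never goes back up the tree.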



How B+Trees are used in RDBMSs

Logarithmic Scalability

I want to take a brief aside here to hit home the power of this structure. Of course, most developers are aware of the exponential growth of data and, ideally, your company's valuations. But unfortunately, scale of data often works against you, and balanced trees are the first tool in your arsenal against it.

Given the number of items each intermediate node can reference (M) and the overall tree depth (N), we can reference M^N (M to the power of N) objects.

Here is a table illustrating the concept with an M value of 5.

Tree Height (N)    Index Leaf Nodes
3                  125
4                  625
5                  3125
6                  15625
7                  78125
8                  390625
9                  1953125

So as the number of index leaf nodes increases exponentially, the tree height grows incredibly slowly (logarithmically) relative to the number of index leaf nodes. This coupled with balanced tree height, allows for almost instant identification of relevant index leaf nodes that point to actual data on disk.
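The relationship can be checked with a few lines of Python. The function names are hypothetical, and the fanout of 100 in the test values is an assumed figure, used only to show how quickly capacity grows with a more realistic M.

```python
def max_leaf_nodes(fanout: int, height: int) -> int:
    """Leaf nodes reachable with the given fanout (M) and tree height (N)."""
    return fanout ** height


def height_needed(fanout: int, leaf_nodes: int) -> int:
    """Smallest height whose capacity covers leaf_nodes (integer arithmetic only)."""
    height, capacity = 0, 1
    while capacity < leaf_nodes:
        capacity *= fanout
        height += 1
    return height


# Reproduces the M = 5 table for heights 3 through 9.
table = {n: max_leaf_nodes(5, n) for n in range(3, 10)}
```

Going from a fanout of 5 to an assumed fanout of 100 drops the height needed for a million leaf nodes from 9 levels to 3.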

Ain't that a beautiful sight!

What is a transaction?

A transaction is a unit of work you want to treat as a single unit. Therefore, it has to either happen in full or not at all. I would argue most systems don't need to manage transactions manually, but there are situations where the increased flexibility is instrumental in achieving the desired effect. Transactions mainly deal with the I in ACID, Isolation.

What is ACID?

In computer science, ACID (atomicity, consistency, isolation, durability) is a set of properties of database transactions intended to guarantee data validity despite errors, power failures, and other mishaps.

  • A guarantee of Atomicity prevents updates to the database from occurring only partially, which can cause more significant problems than rejecting the whole series outright.
  • Consistency guarantees that a transaction can only move the database from one valid state to the next. This ensures that all data adheres to all defined database rules and prevents corruption by an illegal transaction.
  • Isolation determines how a particular action is shown to other concurrent system users.
  • Durability is the property that guarantees that transactions that have been committed will survive permanently.

These concepts are generally well understood, but their exact definitions may vary from one database system to another. So be sure to read up on each one for your production database.

Transactions can be managed automatically for you, so you aren't even aware they are taking place, or you can create them manually by wrapping your statements between BEGIN and COMMIT (or ROLLBACK).

We will focus on the time between BEGIN and COMMIT or ROLLBACK and what happens to various other transactions acting on the same data.

COMMIT/ROLLBACK

All manual transactions either end in successful COMMIT or ROLLBACK.

COMMIT durably persists the changes made by the current transaction.

ROLLBACK undoes the changes made by the current transaction.

When you aren't manually managing transactions, if all queries within a transaction are completed successfully, they are COMMITTED. If there is any failure, the changes during that transaction are ROLLED BACK to ensure the atomicity of the entire action.
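A manual transaction can be sketched with Python's built-in sqlite3 module. The accounts table, the CHECK constraint, and the transfer helper are assumptions made up for this example; other databases use the same BEGIN, COMMIT and ROLLBACK statements, but their drivers differ in detail.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK statements below are issued explicitly by us.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")


def transfer(conn, sender, receiver, amount):
    """Move amount between two accounts as a single unit of work."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, receiver),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, sender),
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint: undo the credit as well.
        conn.execute("ROLLBACK")
        return False


def balances(conn):
    """Current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # debit fails, credit is rolled back
```

In the second transfer the credit to bob has already been applied when the debit fails, so the ROLLBACK is what keeps the pair of updates atomic.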

Read Phenomena

Several read phenomena can occur depending on the isolation level, and understanding them is essential for debugging your systems and, honestly, for understanding what kinds of inconsistencies your system can tolerate.

Non-repeatable reads



Non-repeatable reads example

As in the image above, non-repeatable reads occur if you cannot get a consistent view of the data between two subsequent reads during your transaction. In specific modes, concurrent database modification is possible, and there can be scenarios where the value you just read can be modified, resulting in a non-repeatable read.

Dirty reads



Dirty read example

Similarly, a dirty read occurs when another transaction updates a row but hasn't committed the work, and your read returns the uncommitted (dirty) value. That value isn't a durable state change and may be inconsistent with the state of the database.

Phantom reads



Phantom read example

Phantom reads are another committed-read phenomenon, most commonly seen when you are dealing with aggregates. For example, you ask for the number of customers within a transaction. Between two subsequent reads, another customer signs up or deletes their account (committed), which results in you getting two different values if your database doesn't support range locks for these transactions.

Range Locks

Range locks are best described by illustrating all the possible lock levels.

  • Serialized Database Access: makes the database run queries one by one. Terrible concurrency, though the highest level of consistency.
  • Table Lock: locks the whole table for your transaction. Slightly better concurrency, but concurrent table writes are still slowed.
  • Row Lock: locks only the row you are working on. Even better concurrency than table locks, but if multiple transactions need this row, they will need to wait.

Range locks are between the last two levels of locks; they lock the range of values captured by the transaction and don't allow inserts or updates within the range captured by the transaction.

Isolation Levels



4 Isolation levels for SQL Standard

The SQL standard defines 4 standard isolation levels. These can and should be configured globally (insidious things can happen if we can't reliably reason about isolation levels).

REPEATABLE READ

Let's start with REPEATABLE READ. It is relatively straightforward to understand and sets the table for the remainder of the isolation levels. This isolation level ensures consistent reads within the transaction, based on the view established by the first read. This view is maintained in several ways; some affect the overall system's performance and others don't, but that is outside this post's scope.

See the graphic above; once we do our first read, that view is locked for the duration of the transaction, so anything that happens outside the context of this transaction is of no consequence, committed or otherwise.

This isolation level protects us from several known isolation issues, mainly non-repeatable and dirty reads. It does introduce a minor data inconsistency while the transaction is locked to a specific view of the database, so keeping transactions as short-lived as possible is beneficial here.

SERIALIZABLE

This operating mode is the most restrictive and consistent, since it effectively allows only one query to be run at a time.

All types of read phenomena are no longer possible, since the database behaves as though it runs the queries one by one, transitioning from one stable state to the next. There is more nuance here, but this is more or less accurate.

In this mode it is essential to have some retry mechanism, since queries can fail due to concurrency issues.

Newer distributed databases take advantage of this isolation level for consistency guarantees. CockroachDB is an example of such a database. Worth a look.

READ COMMITTED

This isolation mode is different from REPEATABLE READ in that each read creates its own consistent (committed) snapshot in time. As a result, this isolation mode is susceptible to non-repeatable and phantom reads if we execute multiple reads within the same transaction.

READ UNCOMMITTED

Alternatively, the READ UNCOMMITTED isolation level doesn't maintain any transaction locking and can see uncommitted data as it happens, leading to dirty reads. The stuff of nightmares... in some systems.
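The difference between the three levels that allow concurrency can be illustrated with a toy in-memory simulation. ToyDatabase, ToyReader and observe are hypothetical names, and the model is an assumption made for illustration: real databases enforce isolation with locks or multi-version storage, not with a dictionary copy.

```python
class ToyDatabase:
    """In-memory stand-in for a database: committed data plus pending writes."""

    def __init__(self, data):
        self.committed = dict(data)
        self.uncommitted = {}  # writes that have not been committed yet

    def write(self, key, value):
        self.uncommitted[key] = value  # visible only to READ UNCOMMITTED readers

    def commit(self):
        self.committed.update(self.uncommitted)
        self.uncommitted.clear()

    def rollback(self):
        self.uncommitted.clear()


class ToyReader:
    """A reading transaction running at one of three isolation levels."""

    def __init__(self, db, level):
        self.db, self.level, self.snapshot = db, level, None

    def read(self, key):
        if self.level == "READ UNCOMMITTED":
            # May return a dirty value that is later rolled back.
            return self.db.uncommitted.get(key, self.db.committed[key])
        if self.level == "READ COMMITTED":
            # Each read sees whatever is committed at that moment.
            return self.db.committed[key]
        if self.level == "REPEATABLE READ":
            # The first read fixes the view for the rest of the transaction.
            if self.snapshot is None:
                self.snapshot = dict(self.db.committed)
            return self.snapshot[key]
        raise ValueError(f"unknown isolation level: {self.level}")


def observe(level):
    """What one reader sees before a write, during it, and after its commit."""
    db = ToyDatabase({"balance": 100})
    reader = ToyReader(db, level)
    first = reader.read("balance")
    db.write("balance", 42)  # another transaction writes...
    during = reader.read("balance")
    db.commit()  # ...and then commits
    after = reader.read("balance")
    return first, during, after
```

Calling observe with each level shows the pattern described above: READ UNCOMMITTED picks up the write before it is committed (a dirty read), READ COMMITTED changes its answer once the write commits (a non-repeatable read), and REPEATABLE READ keeps returning the value from its first read.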

There you have it: The Things You Should Know About Databases.

If you enjoyed this, we have a ton more content like this on the way! We strive to make all these detailed and nuanced topics understandable and highlight where you would run into them!

Signing up or sharing it with someone who you think could benefit from this write up would be really appreciated.

Feedback is appreciated and can be directed at @myusuf3 on Twitter!

Resources

How To Design Complex Data Tables (+ Figma Kits)

By: Vitaly Friedman

LinkedIn: June 19, 2024

https://www.linkedin.com/pulse/how-design-complex-data-tables-figma-kits-vitaly-friedman-spaue/

Architecting a complex data table is quite an adventure. Wonderful work by Goldman Sachs team.

Complex data tables are difficult to get right. They always come along with filters, sorting, customization options, batch actions, cell states, pagination and a huge amount of data. Their purpose is usually to help people compare data points and find insights — yet navigating a table is often painfully slow and frustrating, especially on mobile.

Let’s explore practical techniques and useful Figma toolkits to help users find and compare the right data faster, without relying on endless horizontal scroll.

Architecting A Complex Data Table

When we start designing a complex data table, we need to first understand what features, states and accessories we actually need. Slava Shestopalov has put together a tree of table features — a practical overview of what goes into complex tables, along with all features, states, accessories that might need to be considered in the design process.



A comprehensive tree of features for a data table. Neatly put together by Slava Shestopalov.

In the design, we start with observing, collecting and prioritizing user needs. Based on them, we define a full set of complex functionality that we need — such as drag-n-drop, resizing, reshuffling or multi-sorting. These features will require separate accessibility considerations as all draggable controls must be keyboard-accessible due to WCAG 2.2 AA requirements.


The different types of cells, from the (incredible!) Goldman Sachs Design System.

Then, we define the different types of table cells that we need. Some of them will be accessible to everyone, others will have restrictions applied to them. So we discuss logic and permissions, such as read-only, comment-only or editable. We explore filtering, sorting and customization features. We discuss sticky headers and columns. And for each of them, we set default values, presets and templates.


For nested filters, you might consider an overlay with horizontal stacking, instead of a tree.

Eventually, we move to the fine little details of the data table design. Things like truncation, wrapping, stretching and resizing rules. We look at interaction design with validation rules and error messages. Some tables might require very long technical titles or localization, so stress test your design with very long and very short titles — this might also require compact, comfortable and condensed modes.


Data table design with a dedicated "Actions" buttons might perform better than hover actions.

And: whenever possible, try to avoid row hover actions: they often cause errors and rage clicks. Instead, use a standalone button ("Actions"), or a few buttons, on each row.

Drawing a table tree diagram like the one pictured above is a good way to document your decisions — and understand the beast that is actually in front of you. A data table might seem like just another regular component, but its complexity is often underrated and its effectiveness is often undermined. Especially when it comes to mobile display.

Complex Data Tables on Mobile

We often assume that customers expect data tables to appear exactly the same on mobile and on desktop. That's not necessarily true. What they do expect is that features that they heavily rely on for their work exist in all environments — but these features don't have to work or look exactly the same way.


Row-column-data-tables are terribly inefficient on mobile — you might consider cards instead. Example: Goldman Sachs Design System.

In general, row-column-data-tables are terribly inefficient on mobile — that's where users often struggle, making mistakes and scrolling back and forth to make sure that they are looking at the right piece of data.

Instead, it's a good idea to think about the data alone, rather than its tabular structure. See how to aggregate data and span it across fewer columns. Show only what users really need, then show more on tap. And while doing so, try to leave out unnecessary data and details and eliminate repetition. For example, we could abbreviate dates, long labels, units of measure and currency. Replace statuses and permissions with icons and badges.


Users rarely need all columns and rows at once. We can use drop-downs to navigate and explore data cells in bulk. By Joe Winter.

As for interaction design, expand rows to show details if your data doesn't need much vertical space, and use a drawer when your data does need it, preferably instead of modal dialogs rather than in addition to them. However, don’t rely on tooltips or hover to show critical details.

It's worth noting that users rarely navigate through all columns in the table. So let them show and hide columns, for example with a “Columns” button. There, let them also re-arrange, lock and reset columns. You could use tabs above the table to change the view, or use tabs within the table to jump between columns.


Clever: use tabs within the data table to navigate its columns. By Netty Konovalova.

For row actions, you might be better off with a bottom sheet (edit, delete, move). A helpful way to make the content more accessible is by re-grouping data from columns across multiple rows (pivoting). You could also combine columns within vertical accordions (stacked columns) and add a sticky filter in each column to help users navigate faster. Finally, if you do use pagination, show it above and below the data list.

Bottom line: Show only what users really need. Think “card”, not “row” to present a single record of data. Aggregate and re-group data across the table. You might not always need labels, but keep them available to screen reader users. And most importantly: re-organize, rethink and redesign data, rather than squeezing a multi-column table layout in a narrow mobile space.

Data Tables Figma Kits

Designing data tables in Figma from scratch is remarkably tedious and time-consuming. You can get off the ground with a few helpful kits, kindly shared and released by the community:


Data Tables Figma Kit, by Jordan Hughes.

A huge thank you to the contributors, authors and designers who put time, effort and energy into making these resources available for everyone to use!

Resources