Future

The Bike Shed

397: Dependency Graphs

Stephanie is consciously trying to make meetings better for herself by limiting distractions. A few episodes ago, Joël talked about a frustrating bug he was chasing down and couldn't get closure on, so he had to move on. This week, that bug popped up again and he chased it down! AND he got to use binary search to find its source–which was pretty cool!

Together, Stephanie and Joël discuss dependency graphs as a mental model, and while they apply to code, they also help when it comes to planning tasks and systems. They talk about coupling, cycles, re-structuring, and visualizations.

Transcript:

JOËL: Hello and welcome to another episode of The Bike Shed, a weekly podcast from your friends at thoughtbot about developing great software. I'm Joël Quenneville.

STEPHANIE: And I'm Stephanie Minn. And together, we're here to share a bit of what we've learned along the way.

JOËL: So, Stephanie, what's new in your world?

STEPHANIE: So, I'm always trying to make meetings better for me [chuckles], more tolerable or more enjoyable. And in meetings a lot, I find myself getting distracted when I don't necessarily want to be. You know, oftentimes, I really do want to try to pay attention to just what I'm doing in that meeting in the moment. In fact, just now, I was thinking about the little tidbit I had shared on a previous episode about priorities, where really, you know, you can only have one priority [laughs] at a time. And so, in that moment, hopefully, my priority is the meeting that I'm in.

But, you know, I find myself, like, accidentally opening Slack or, like, oh, was I running the test suite just a few minutes before the meeting started? Let me just go check on that really quick. And, oh no, there's a failure, oh God, that red is really, you know, drawing my eye. And, like, could I just debug it really quick and get that satisfying green so then I can pay attention to the meeting? And so on and so forth. I'm sure I'm not alone in this [laughs]. And I end up not giving the meeting my full attention, even though I want to be, even though I should be.

So, one thing that I started doing about a year ago is origami. [laughs] And that ended up being a thing that I would do with my hands during meetings so that I wasn't using my mouse, using my keyboard, and just, like, looking at other stuff in the remote meeting world that I live in. So, I started with paper stars, made many, many paper stars, [laughs] and then, I graduated to paper cranes. [laughs] And so, that's been my origami craft of choice lately.

Then now, I have little cranes everywhere around the house. I've kind of created a little paper crane army. [laughs] And my partner has enjoyed putting them in random places around the house for me [laughs] to find. So, maybe I'll open a cabinet, and suddenly, [laughs] a paper crane is just there. And I think I realized that I've actually gotten quite good at doing these crafts.

And it's been interesting to kind of be putting in the hours of doing this craft but also not be investing time, like, outside of meetings. And I'm finding that I'm getting better at this thing, so that seemed pretty cool. And it is mindless enough that I'm mentally just paying attention but, yeah, like, building that muscle memory to perfecting the craft of origami.

JOËL: I'm curious, for your army of paper cranes, is there a standard size that you make, or do you have, like, a variety of sizes?

STEPHANIE: I have this huge stack of, like, 500 sheets of origami paper that are all the same size. So, they're all about, let's say, two or three inches large. But I think the tiny ones I've seen, really small paper cranes, maybe that would be, like, the next level to tackle because working with smaller paper seems, you know, even more challenging.

JOËL: I'd imagine the ratio of, like, paper thickness to the size of the thing that you're making is different.

STEPHANIE: At this point, they say that if you make 1,000, then you bring good luck. I think I'm well on my way [laughs] to hopefully being blessed with good luck in this household of my little paper crane army.

JOËL: It's interesting that you mentioned the power of having something tactile to do with your hands during a meeting, and I definitely relate to that. I feel like it's so easy, even, like, mindlessly, to just hit Command-Tab when I'm doing things on a screen. Like, my hands are on the keyboard. If I'm not doing something, I'm just going to mindlessly hit Command-Tab. It's kind of like on your phone sometimes. I don't know if you do this, like, just scrolling side to side. You're not actually doing anything. You just want motion with your fingers.

STEPHANIE: Yes. I know exactly what you're talking about. And it's funny because it's a bit of a duality where, you know, when you are in your development workflow, you want things to be as quick and convenient as possible, so that Command-Tab, you know, is very easy. It's just built in, and that helps speed up your, you know, day-to-day work. But then it's also that little bit of mindlessness, I think, that can get you down the distraction path.

When I was first looking for something to do with my hands, to have, like, a little tactile thing to keep me focused in meetings, I did explore getting one of those fidget cubes; I have to say. [laughs] It's just a little toy, you know, that comes with a bunch of different settings for you to fidget with.

There's, like, a ball you can roll, you know, with your thumb, or maybe some buttons to click, and it gives you that really satisfying tactile experience. And I know they work really well for a lot of people, but I've really enjoyed the, I guess, the unexpected benefits [chuckles] of getting better at a hobby [laughs] while spending my time at my work.

Joël, what is new with you?

JOËL: So, a few episodes ago, I talked about a really kind of frustrating bug that I was chasing down that was due to some, like, non-determinism in the environment. And it kind of came, and then it went away. And I wasn't able to get sort of closure on that and had to move on. Well, this week, that bug popped up again, and this time, I was actually able to chase it down. So, that felt really exciting. And I got to use binary search to try to find the source of it, which made me feel really cool.

STEPHANIE: Oooh, do tell. What ended up being the issue?

JOËL: I'm connecting to an external Snowflake data warehouse, and ActiveRecord tries to fetch the schema and crashes as part of that with some cryptic error that originates from the C extension ODBC Ruby driver package. I figured out that it's probably something to do with, like, a particular table name or something in the table metadata when we're pulling this schema that we're not happy about. But I don't know which table is the one that it's not happy with.

Well, this time, I was able to figure out, by reading through some of the documentation, that I can pull subsets of the schema. So, I can pull the first n values of that schema, and it won't crash. It only crashes if I try to fetch the entire set, which is what is happening under the hood. At that point, you know, I could fetch each row individually, but there's hundreds of these. So, you know, I try, okay, what happens if I try to fetch 1,000 of these? Is it going to crash? Because it's a massive system. So, yes, I get a crash.

So, I know that a table less than a thousandth in the list of tables is what's causing the problems. So, okay, fetch 500 halfway in between there. It's still going to crash. Okay, 250, 125. I then kind of keep halving all the time until I find one that doesn't crash. And now I know that it is somewhere between the last crash and this one. So, I think it was between 125 and 250. And now I can say, okay, well, let's fetch the first, you know, maybe 200 tables, okay, that crashes. And I keep halving that space until you finally find it. And then, like, okay, so it's this one right here.

Now, the problem is the bad table actually crashes. So, I think it ended up being, like, number 175 or something like that. So, I never get to see the actual table itself. But because the list of tables is in alphabetical order, and I can see because I can fetch the first 174 and it succeeds, so I can tell what the previous 5, 6, you know, previous 174 are.

I can pretty easily go and look at the actual database and the list of tables and say, okay, well, it's in the same order. And the next one is this one, and hey, look, there is some metadata there that has some very long fields that are longer than one might expect, specifically going over a potentially implied 256-character limit. That seems somewhat suspicious. And, oh, if we remove this table, all of a sudden, everything works.

STEPHANIE: Wow, binary search, an excellent debugging tool [laughs] when you have no idea, you know, what could possibly be causing your issue.

JOËL: It's such a cool tool. Like, I'm always so happy when I get a chance to use it. The problem is, you need a way to be able to answer the question, like, have I found it? Yes or no? Or, generally, is it greater or less than this current position?

STEPHANIE: Well, that's really exciting that you ended up figuring out how to solve the bug. I know last time we talked about it, you kind of had left off in a space of, hopefully, we won't run into this issue again because it's no longer happening. But it seems like you were also set up this time around to be able to debug once it cropped up again.

JOËL: Yes. So, binary search is really cool. It's got this, like, very, like, fancy computer science name. But in reality, it's a fairly simple, straightforward technique that I use fairly frequently in my development. And there's another kind of computer sciency fancy-sounding concept that I use all the time. You've all heard me reference this multiple times on the show. You're right; we're finally doing it. This is the dependency graph episode.

STEPHANIE: Woo. [laughter] It's time. I'm excited to really dig into it because, you know, as someone who has heard you talk about it a lot, you know, and is maybe a little less familiar with graph theory and how, you know, it can be applied to my day to day work, I'm really excited to dig into a little bit about, you know, what a regular developer needs to know about dependency graphs to add to their toolbox of skills.

JOËL: So, I think at its core, the idea of a dependency graph is that you have a group of entities, some of which depend on each other. They can't do a task, or they can't be created unless some other subtasks or dependent actions take place. And so, we have a sort of formal structural way of describing these things. Visually, we often draw these things out where each of the pieces is like a little bubble or a circle, and then we draw arrows towards the things that it depends on.

So, if A cannot be done without B being done first, we draw an arrow from A to B. That's kind of how it is in the abstract. More concretely, this kind of thing shows up constantly throughout the work that we do because a lot of what we do as developers is managing things that are connected to each other or that depend on each other. We build complex systems out of smaller components that all rely on each other.

STEPHANIE: Yeah, I think it's interesting because I use the word dependency, you know, very frequently when talking about normal work that I'm doing, you know, dependencies as in libraries, right? That we've pulled into our application, or dependencies, like, talking about other classes that are referenced in this class that I'm working in. And I never really thought about what could be explored further or, like, what could be learned from really digging into those connections.

JOËL: It's a really powerful mental model. And, like you said, dependencies exist all over our work, and we often use that word. So, you mentioned something like packages, where your application depends on Rails, which in turn depends on ActiveRecord, which in turn depends on a bunch of other things. And so, you've got this whole chain of maybe immediate dependencies, and then those dependencies have dependencies, and those dependencies have dependencies, and it kind of, like, grows outward from there.

And in a very kind of simplistic model, you might think, oh, well, it's more, like, a kind of a tree structure. But oftentimes, you'll have things like branches on one side that connect back to branches on the other. And now you've got something that's no longer really tree-like. It's more of a sort of interconnected web, and that is a graph.

STEPHANIE: I think understanding the dependencies of your system has also become more important to me as I learn about things that can go wrong when I don't know enough about what my system is, you know, relying on that I had kind of taken for granted previously. I'm especially thinking about packages like we were mentioning, and, you know, not realizing that your application is dependent on this other library, right? That's brought in by a gem that you're using. And there's maybe, like, a security issue, right? With that.

And suddenly, you have this problem on your hands that you didn't realize before. And I know that that has been more of a common discussion now in terms of security practices, just being more aware of all the things that you are depending on as really our work becomes more and more interconnected with the things available to us with open source.

JOËL: I think where understanding the graph-like nature of this becomes really important is when you're doing something like an upgrade. So, let's say you do have a gem that has a security problem, and you want to upgrade it to fix that security issue. But the upgrade that includes the security patch is also a breaking upgrade. And so, now everything else in your system that depends on that gem or on that package is going to break unless you have them in a version that is compatible with the new version of that gem.

And so, you might have to then go downstream and upgrade those packages in a way that's compatible with your app before you can bring in the security patch. And a lot of that can be done automatically by Bundler. Bundler is software that is built around navigating dependency graphs like that and finding versions that are compatible with each other.

But sometimes, your code will need to change in order to upgrade one of these downstream gems so that you can then pull in the upgrade from the gem that needs a security patch. And so, understanding a little bit of that graph is going to be important to safely upgrading that gem.

STEPHANIE: So, I know another application of dependency graphs that you have thought about and written a blog post for is RSpec let declarations and how a lot of the time when we are using let, you know, we are likely calling other variables defined by let. And so, when you are encountering a test file, it can be really hard to grok what data is being set up in your test.

JOËL: Yeah, so that is really interesting because you can define something that will get executed in a lazy fashion if it gets referenced. But then not only is the let lazy and will not trigger unless it's referenced, but a let can reference other lets, which are also lazy, and only get triggered if they get referenced.

So, you might have a bunch of lets defined in any order you want throughout a file, and they're all kind of interconnected with these references to each other. But they only get triggered if something calls it directly or it's in this, like, chain of dependencies. And getting a grasp on what actually gets created, which lets will actually execute, which ones don't in a file can quickly get out of hand. And so, thinking of this in terms of a dependency graph has been a really helpful mental model for me to understand what's going on in a complex test file.

STEPHANIE: Yeah, absolutely. Especially when sometimes the lets are coming from all over the place, you know, maybe a describe block hundreds of lines away, or even a completely different file if you are using a shared context that's being pulled in. So, I can see why this was a complex problem that could be made a little simpler with plotting out a dependency graph.

And in preparation for this episode, I was doing a little bit of my own exploration on this because I certainly know, you know, the pain of trying to figure out what is being executed in my tests when there are a lot of lets that reference each other. And in the blog post, you kind of gave a little step-by-step of how you could start with creating a dependency graph for the test that you're working with.

And I was really curious if this process could be automated because, you know, I do enjoy, you know, pulling out the pen and paper [chuckles] every now and then. But I'm not, like, a particularly visual person. God forbid I, like, draw a circle, but then, like, don't have enough space for the rest of the circles. [laughs] So, I was really hoping for a tool that could do this for me, especially if, you know, you do, you have a lot of tests that you have to try to understand in a relatively short amount of time. And so, I ended up doing something kind of hacky with RSpec and overriding let definitions to automate this process.

JOËL: That's really cool. So, is the tool that you're trying to build something where you feed it in a spec file, and it gives you some kind of graphical representation like an SVG or something as output?

STEPHANIE: Yeah. I did consider that approach first, where you feed in the file, but then I ended up going with something more dynamic where you are running the test, and then as it gets executed, tracing the let definitions and then registering them to build your dependency graph.

JOËL: So, you've got some sort of internal modeling that describes a dependency graph. And then, somehow, you're going to turn that, you know, a series of Ruby objects into some kind of visual.

STEPHANIE: Yeah, exactly. And the bulk of that work was actually done with a library called RGL, which stands for just Ruby Graph Library. [laughs] And what's nice is that it has a really easy interface for plugging in the vertices and edges of the dependency graph that you want to build. And then, it is already hooked up with Graphviz to, you know, write the SVG to a file. And so, I ended up really just having to build up an array of my dependencies and the connections to each other and then feed it into the constructor of the graph.

JOËL: And for all of our listeners, you mentioned Graphviz. That is a third-party tool that can be installed on your machine that can generate these SVG diagrams from...I believe it has its own sort of syntax. So, you create, I believe it's dot, D-O-T, so dot dot file. And based off of that, it generates all sorts of things, but SVG being potentially one of them.

STEPHANIE: Yeah. The nice thing was that I actually didn't end up having to use the DSL of Graphviz because the RGL gem was doing them for me.

JOËL: Nice. So, it plugs in directly.

STEPHANIE: Yeah, exactly. And I was really curious about using this gem because I, you know, just wanted to write Ruby, especially to plug into other things that are already in Ruby. And I found that surprisingly easy, thanks to all of the RSpec config options that they make available to you, including an option to extend the example group class, which is actually where let and let bang is defined.

And so, I ended up overriding those classes and using, you know, the name of the let that you're defining and then the block to basically register the dependencies. And I also ended up exploring a little bit with using Ruby's built-in parser to figure out in the block that's being passed to the let, what parts of that block could potentially be a reference to another let.

JOËL: That's really cool. Did you get any fun results from that?

STEPHANIE: I did. It worked pretty well in being able to capture all of the let declarations, and other lets that it references. And so, I was able to successfully, you know, like, generate a visual dependency graph of all of the lets, so that was really neat.

The part that I was really kind of excited about trying next, though I didn't end up having time to yet, was figuring out which of those let values are executed by way of the let bang, right? Which is eager or what is referenced in the test that then gets executed as well. And so, the RGL library is pretty neat and has some formatting options, too, with the Graphviz output. So, you can change the font color or styling options for different, you know, nodes and edges. And so, I was really curious to pursue this further, maybe, and use it to show exactly what gets evaluated now that I have successfully mapped my let graph.

JOËL: Right. Because the whole point of this exercise is that not the entire graph is going to get evaluated. The underlying question is, what data actually gets created when my test runs? And so, you build out this whole dependency graph, and then you can follow a few simple rules to say, okay, this branch gets called, this branch gets called, this series of things gets called. And okay, this subset of let blocks trigger, and therefore this data has been created for my given test.

STEPHANIE: Yeah. Though I will say that even where I got so far to, just seeing all of the let definitions in a spec file was really helpful to have a better understanding, you know, if I do have to add a test in here, and I'm thinking about reaching for a pre-existing let declaration, to be like, oh, like, it actually, you know, goes on to reference all of these other things that may be factories [chuckles] that are created might make me, you know, think twice, or just have a little better understanding of what I'm really dealing with.

JOËL: Right. The idea that when you're calling out to a let, or a factory, or something else that's just a node in a large graph, you're not necessarily referencing just one thing. You might actually be referencing the head of a very long chain of things that maybe you don't intend to trigger the whole thing.

STEPHANIE: Yeah, exactly.

JOËL: So, in that sense, having a sort of visual or at least an idea of the graph can give you a much better sense of the cost of certain operations that you might have to do.

STEPHANIE: The cost of the operations certainly, especially when, you know, you are working in a legacy codebase, and you, you know, like, maybe don't know how everything plays together or is connected. And it's very tempting to just reach for [chuckles] the things that have been, you know, created or built for you. And I'm certainly guilty of that sometimes on this client project, where the domain is so complex, and there are so many associated models.

And I'm like, well, like, let me just, you know, use this let that already, you know, has a factory set up for what I think I need for this test. But then realizing, oh, actually, like, it is creating all these things, and do I really need them? I think it can be really challenging to unravel all of that in your head. And so, with this very scrappy tool that I [chuckles] built for my own purposes, you know, maybe it makes it, like, one step easier to try to fully understand what I'm working with and maybe do something different.

JOËL: One aspect that I think is really powerful about dependency graphs is that it takes this kind of, like, abstract concept that we oftentimes have an intuitive sense around, the idea that we have different components that depend on each other, and it shows it to us visually on, like, a 2D plane. And that can be really helpful to get an understanding or an overview of a system.

You mentioned that RGL uses Graphviz to generate some SVGs. A visual tool that I've been using to draw some of my dependency graphs has been mermaid.js. It has a syntax that's, like, a text-based syntax, but it's almost visual in that you have a piece of text and name of a node. And then, you'll draw a little ASCII arrow, you know, two dashes and a greater than sign to say this thing depends on, and then write another name, and just have a row, like, a bunch of entries to say; A depends on B. A also depends on C. C depends on D, and so on, and, like, build up that list. And then Mermaid will just generate that diagram for you.

STEPHANIE: Yeah. I've used Mermaid a few times. One really helpful use that I had for it was diagramming out a bunch of React components that I had and wanting to understand the connections between them. And I think you can even paste the Mermaid syntax into your GitHub pull request description, and it'll render as the graph image.

JOËL: Yeah, that's what's really cool is that Mermaid syntax has become embedded in a lot of other places in the past few years. So, it's really easy to embed graphs now into all sorts of things. You mentioned GitHub. It works in pull requests descriptions, comments, I think pretty much anywhere that Markdown is accepted. So, you could put one in your README if you wanted.

Another place that I use a lot, Obsidian, my note-taking tool, allows me to embed graphs directly in there, which is really much nicer than previously; sometimes, when I wanted to express something as a visual, I would use some sort of drawing tool to do something and export an image, and then embed that in my note. But now I can just put in this text, and it will automatically render that as a diagram.

And part of what's really nice about that is that then it's really easy for me to go and change that if I'm like, oh, but actually, I want to add one more connection in here. I don't have to re go back to, hopefully, a file that I've saved somewhere and, like, change an image file and re-export it. I just, you know, I add one line of text to my note, and it just works.

STEPHANIE: That's awesome. Yeah, the ability to change it seems really useful.

So, we've talked a little bit about tools for creating a visual aid for understanding our dependencies. And now that we have our graph, maybe we might have some concerning observations about what we see, especially when perhaps some of our dependencies are pointing back to each other.

JOËL: Yes. So, I think you're referencing cycles, in particular. That would be the formal term for it. And those are really interesting. They happen in dependency graphs. And I would say, in many cases, they can be a bit of a smell. There's definitely situations where they're fine. But there are things that you look at, and you're like, okay, this is going to be a more complex kind of tricky bit of the graph to work with.

Some cases, you just straight up can't have them. So, I want to say that the way RSpec lets are set up, you cannot write code that produces cycles. But you might have...I think Ruby allows classes to reference each other in such a way that it creates a cycle, and not all languages do that. So, Elm and F#, I believe, require that modules cannot reference each other. The fancy term for this is a dependent acyclic graph, or DAG, which basically just means that there are no cycles in that graph.

STEPHANIE: Yeah. What you said about classes referencing each other is very interesting because I've definitely seen that. And then, if I have to go about changing something, maybe even it's just the class name, right? Now there's no way in which I can really make just one change. I have to kind of do it all in one go.

JOËL: I think that's a common property of a cycle, and a graph is that changes that happen somewhere in that cycle often need to be all shipped together as one piece. You can't break it up into smaller chunks because everything depends on everything else. So, it has to be kind of boxed together and shipped as one thing.

STEPHANIE: And you'd mentioned that cycles, you know, can be a bit of a code smell. And if the goal is to be able to break it up so that it is a little bit more manageable to work with, how would you go about breaking a cycle?

JOËL: So, I think breaking a cycle is going to vary a little bit based on your problem domain. So, are you modeling a series of classes that are referencing each other? Is this a function call graph? Is this even, like, a series of tasks that you're trying to do? But typically, what you want to do is make sure that eventually, at some point, like, something doesn't loop back to referencing something higher up in your hierarchy. And so, oftentimes, it ends up being about what is allowed to know about what? Do you have higher-level concepts that can know and depend on lower-level concepts but not vice versa?

And again, we are talking about this a little bit at the abstract level. But in terms of, let's say, different code modules, or classes, or something like that, commonly, you might say, well, we want some sort of layering where we have almost, like, more primitive types of classes at the bottom. And they don't get to know about anything above them. But the ones above that might be more complex that are composed of smaller pieces know about the ones below them. And you might have multiple layers kind of like that that all kind of point down, but nothing points up.

STEPHANIE: That is a very common heuristic. [chuckles] I think you were basically just describing how I also understand creating React components, where you want to separate your presentational ones from your functional ones. And, yeah, it makes a lot of sense that as soon as you start adding that complexity of, you know, those primitive classes at the bottom, starting to, you know, point to things higher up or to know about things higher up, that is where a cycle may be accidentally introduced.

JOËL: It's interesting just how many design principles that we have in software. If you dig into them a little bit, you find out that they're about decoupling things, and oftentimes, it's specifically breaking up cycles. So, one way that you might have something like this that actually has dependency in the name, the dependency inversion principle, where what you're effectively doing is you're taking one of those dependency arrows, and you're flipping it the other way. So, instead of A depending on B, you're flipping it. Now B depends on A, and that can be enough to break a cycle.

STEPHANIE: So, one thing I've picked up from our conversations about dependency graphs is that oftentimes, you know, when you're trying to figure out where to start, you want to look for those areas or those nodes where there's nothing else that depends on it.

JOËL: Yeah. I think you have those nodes that, if this were a tree, you would call them the leaf nodes. In the case of a graph, I'm not sure if that's technically correct, but they don't depend on anything. They're kind of your base case. And so, you can, you know, if it's a function, you can run it. If it's a file, you can load it; if it's a class, also you can load it up and not have to do anything else because it has no dependencies. And knowing that those are there, I think, can be really useful in terms of knowing an order you might want to execute something in. And this is really interesting for one of my favorite uses of a graph, which is breaking down a series of tasks that you need to do.

So, commonly, you might say, okay, I have a large task I need to do. I break it down into a series of subtasks. And, you know, maybe I draw out, like, a bulleted list and, you know, task 1, 2, 3, 4, 5. The problem is that they're not necessarily just a flat list. They all have, like, orders, like dependencies between each other. So, maybe one has to happen before 2, but it also has to happen before 3, which needs to happen before two, and, like, there's all these interconnections.

And then, you find out that you can't ship them independently the way you thought initially. So, by building up a graph, you end up with something that shows you exactly what depends on what. And then, like you said, the parts that are really interesting where you can start doing work are the ones that have no dependencies themselves. Other things might depend on them, but they have no dependencies. Therefore, they can be safely built, shipped, deployed to production, and they can be done independently of the other subtasks.

STEPHANIE: Yeah. I was also thinking about things that could be done in parallel as well. So, if you do have multiple of those items with no dependencies, like, that is a really good way to be able to break up that work and, yeah, identify things that are not blocked.

JOËL: For a complex set of tasks, it's great to see, okay, these two pieces have no dependencies. We can have them be done in parallel, shipped independently. And then you can just kind of keep repeating that process. Because once all of the tasks that have no dependencies have been done, well, you can almost, like, remove them from the graph and see, okay, what's the new set of things that have no dependencies? And then, keep doing that until you've eventually done the whole graph.

And that may sound like, oh okay, we're just kind of using a little bit of intuition and working through the graph. It turns out that this is a, like, actual, like, formal thing. When it comes to graphs, it's a traversal algorithm called topological sort is the fancy name for it, and it basically, yeah, it goes through that. It gives you a list of nodes in order where each node that you're given has no dependencies that have not been evaluated yet.

So, it works from effectively to use our tree terminology, from the leaf nodes to the root, potentially roots plural, of the graph, and each step is independent. So that's a lot of, like, fancy terminology, and getting a little bit of, like, computer science graph theory into here.

So, my, like, general heuristic is that graphs should be evaluated from the bottom up when you're trying to evaluate each piece independently. So, when you do that, you get to do each piece independently, as opposed to if you're evaluating from the top down. So, starting from the one thing that depends on everything else, well, it can't be shipped until all of its dependencies have been shipped.

And all the transitional dependencies can't be shipped until their dependencies have been shipped. And so, you end up being not able to ship anything until you've built the entire graph. And that's when you end up with, you know, a 2,000-line PR that took you multiple weeks and might be buggy. And it's going to take a long time to review. And it's just not what anybody wants.

STEPHANIE: I'm glad you brought this up because I think this is where I am really curious to get better at because oftentimes, when I am breaking down a complex task, it's quite hard for me to see all of the steps that need to happen. And so, you know, you maybe start out with that, like, top-level node, like, the task that needs to be done as you understand it immediately. And it's really hard to actually identify the dependencies and, like, the smaller pieces along the way. And because you're not able to identify that, you think that you do have to just do it all in one go.

JOËL: Yeah, that sort of root node is typically the overarching task, the goal of what you want to do. And a common, I think, scenario for something like this would be, let's say, you're doing a Rails upgrade. And so, that root node is upgrade Rails. And a common thing that you might want to do is say, okay, let's go to the gem file, upgrade Rails, see what breaks, and then just keep fixing those things. That's working from the top down.

And you're going to be in a long-running branch, and you're going to keep fixing things, fixing things, fixing things until you have found all the things but done all the things. And then you do a big bang upgrade that may have taken you weeks. As opposed to if you're working from the bottom up, you try to figure out, okay, what are all the subtasks? And that might take some exploration. You might not know upfront.

But then you might say, okay, here, I can upgrade RSpec versus a dependency, or I need to change the interface of this class and ship all these pieces one at a time. And then, the final step is flipping that upgrade in the gem file, saying, okay, now I've upgraded Rails from 4 to 5, or whatever the version is that you're trying to do.

STEPHANIE: I think you've really hit the nail on the head when it comes to trying to do something but not knowing what subtasks may compose of it and getting into that problem of, you know, having not broken it down, like, enough to really see all the dependencies.

And, you know, maybe this is a conversation [chuckles] for another episode, but the skill of breaking up those tasks and exploring what those dependencies are, and being able to figure them out upfront before you start to just do that upgrade and then see what happens, that's definitely an area that I want to keep investing in. And I'm sure other people would be really curious about, too, to help them make their jobs easier.

JOËL: I think one tip that I've learned that's really fun and that connects into all of this is sometimes you do end up with a cycle in your dependencies of tasks. A technique for breaking that up is a pattern that I have pitched multiple times on the show: the strangler fig pattern. And part of why it's so powerful is that it allows you to work incrementally by breaking up some of these cycles in your dependency graph.

And one of the lessons that I've learned from that is that just because you have sort of an initial set of subtasks and you have a graph of them doesn't mean that you can't change them. If you're following strangler fig, what you're actually doing is introducing one or more new subtasks to that graph. But the way you introduce them breaks up that cycle. So, you can always add new tasks or split up existing ones as you get a better understanding of the work you need to do. It's not something that is fixed or set in stone upfront.

STEPHANIE: Yeah, that's a really great tip. I think next time, what I really want to explore, you know, your heuristic of going from bottom up, yeah, sure, it sounds all fine and dandy. But how to get to a point where you're able to see everything at the bottom, right? And, like, when you are tasked, or you do start with the thing at the top, like, the end goal. Yeah, I'm sure that's something we'll explore [chuckles] another day.

JOËL: On that note, shall we wrap up?

STEPHANIE: Let's wrap up.

Show notes for this episode can be found at bikeshed.fm.

JOËL: This show has been produced and edited by Mandy Moore.

STEPHANIE: If you enjoyed listening, one really easy way to support the show is to leave us a quick rating or even a review in iTunes. It really helps other folks find the show.

JOËL: If you have any feedback for this or any of our other episodes, you can reach us @_bikeshed, or you can reach me @joelquen on Twitter.

STEPHANIE: Or reach both of us at hosts@bikeshed.fm via email.

JOËL: Thanks so much for listening to The Bike Shed, and we'll see you next week.

ALL: Byeeeeeee!!!!!!

ANNOUNCER: This podcast is brought to you by thoughtbot, your expert strategy, design, development, and product management partner. We bring digital products from idea to success and teach you how because we care. Learn more at thoughtbot.com.

Support The Bike Shed

Episode source