Future

The Bike Shed

307: Walking Contradictions

On this episode, Chris talks about testing external services and dissects a tweet on refinements for Result. Steph talks about thoughbot's recent improvement to their feature flag system.

Links:

Transcript:

CHRIS: Hello and welcome to another episode of The Bike Shed, a weekly podcast from your friends at thoughtbot about developing great software. I'm Chris Toomey.

STEPH: And I'm Steph Viccari.

CHRIS: And together, we're here to share a bit of what we've learned along the way. So, Steph, what's new with you?

STEPH: Hey, Chris. Well, today is Summit Day at thoughtbot, and it's the day where all the bots gather, and we hang out, and we chat, and we play games. And it's a lot of fun. We're actually taking more of a respite this year just because life has been taxing. And so we decided to give people more of the day off. So we still had some fun events, but most of it is everybody gets a chill day. Do something that brings you joy is the theme of the day.

But we had Lightning Talks, which is my favorite thing that we do on Summit Day because I realize that I just work with the coolest people, and they have such interesting things to talk about. And we had such a variety of topics. So one of them, Alex Chen taught us acronyms in K-pop. And Sam Kapila, our resident foodie, taught us about a variety of spices. And one of my favorite talks was by Akshith Yellapragada, and it's the top 10 best limo entrances by The Bachelor, and it was phenomenal. And I really want to share some stuff that I learned with you.

CHRIS: The Bachelor like the TV show?

STEPH: Yeah, like the TV show. Are you familiar with it? Have you seen it before?

CHRIS: I am familiar with it. I know it exists. I know that there's a spinoff, The Bachelorette. And I believe we have now exhausted my information on the matter.

STEPH: [laughs] That's fair. For anyone that hasn't seen the show, the show revolves around a single person. For the bachelor, it's a single bachelor who dates a number of people over several weeks, and then they narrow down the people. There are elimination rounds, and the whole goal is for them to find their true love. So each week, someone is eliminated, and I think the show ends with a marriage proposal. So it's a wild show. It's something. [chuckles]

And in Akshith's talk, I learned some really fun terminology. The first one is the Crown, and this is actually an important building block because we're going to get to the rest of the terminology that uses this word, so we got to start here. So the first one is the Crown, and this is the person that everyone's competing for. So they're the star of the show. They're the one that everybody is hoping to fall in love with or will fall in love with them so they get a marriage proposal.

So then the other stuff that I've learned is all about the entrance because again, we're talking about the top 10 best entrances. And one of them is the sidecar entrance. So this is where the player, because yes, this is totally a game, has someone assist them in meeting the crown. So it could be like a family member, maybe it's like your grandma.

And then there's TOT, T-O-T, which is short for Trick Or Treat. And this person exits the limo wearing a costume. So it's someone wearing a shark costume. There was someone wearing a sloth costume where they really dedicated to the role, and they climbed a tree and hung from a branch. I don't know for how long but for long enough to really vibe with the role.

And then there's the Kringle, and this person brings a prop or a present to the Crown. And there's the Grandy, and this player arrives in something other than a limo. So the example that Akshith provided is someone arrived in a motorized cupcake.

CHRIS: Was the cupcake edible?

STEPH: I don't think so, fair question. [laughs]

CHRIS: So really just like a go-kart that looked like a cupcake, not really a motorized cupcake, if I'm going to meet pedantic about the thing, [chuckles] which I think is my job.

STEPH: Yes, it is a motorized non-edible cupcake, but that seems like something a next player should do. They should really up the game, and they should bring an edible motorized cupcake.

CHRIS: Yeah, because you get the visual novelty, but then you layer on top of it that it's actually something that you can now eat, and it's a double win.

STEPH: Ooh, and then you're a Grandy, and you're a Kringle because you arrived in something other than a limo, and it's a present.

CHRIS: I love how you have so deeply internalized this now that you're like, ooh, okay. I can remix here. I'm going to bring together the pieces. Yeah, all right. Yeah, this all makes sense.

STEPH: Yeah, it was a lot of fun. Those are most of my notes for today. I have some tech stuff too, but this felt like the most important thing to start the show with.

CHRIS: We use the phrase tech talk and nonsense to describe the show often, but I think nonsense and tech talk is the correct orientation.

STEPH: [chuckles]

CHRIS: Correct in terms of importance and chronological order, and whatnot. But yeah.

STEPH: I love that we start with a bit of nonsense. So I do have some tech stuff. But first, before I share any of that, what's going on in your world?

CHRIS: I'm sure there's plenty of nonsense in my world, but at the top of my list is some tech stuff. So someone on Twitter, Adam Lassek, reached out and he suggested related to the conversation and the back and forth that I've been having with myself around some of the data structures within the app that I'm building…So I've talked about the dry-monads result object, and there's this success and failure. And I wanted to introduce this new method called bimap, but I wanted to do it in a reasonable way. So I wrapped, and then I wrapped, and I wrapped things.

As an aside, former colleague and friend of the show, Joel Oliveira, sent a wonderful tweet which was a reference to the SNL video where they make a taco and put it inside of a pizza and put it inside of a bag. And that was his joke about it, which I really liked. That was an excellent reference. But in this case, Adam Lassek reached out and suggested if I'm that squeamish about monkey patching, which I am, have I considered refinements? And so he sent an image of a code sample, which is so kind of him to send that much detail over, but it was interesting because I know of refinements in Ruby. I know of that as an alternative to monkey patching, a more refined way, but a safer way, a more controlled way to alter code, but I've not actually used them.

STEPH: I'm not familiar with refinements. What is that?

CHRIS: Refinements are a way...so similar to monkey patching, where you say like, I'm going to reopen this class or this module and define a new method or redefine a method or do something like that, a refinement is a way to do that in a scoped manner. So I'll be honest, I'm not super familiar with them. I think I came into Ruby at a time where the community was moving away from monkey patching. And the dogmatic swing of the pendulum was like, that's a bad thing to do. And so even the refinements were introduced, as far as I understand it, to be a more controlled way to do it. So it's not just like, hey, cool. This module is redefined now in your app in a magical way that's really hard to figure out and hard for folks to debug refinements. You have to explicitly opt into within a certain lexical scope.

I'll be honest; I know that at the headline level. I don't actually know the ramifications or where and when you can use them and how you can. But I know that that was the idea is refinements are a way to do monkey patching but in a more controlled, more understandable manner, and so the code sample that Adam shared does that. And it's very interesting. As I'm looking at it, I'm like, okay, that's cool because I think it'll be a little bit safer.

But at the end of the day, my concern wasn't safety in this case because I was introducing a method that would be new, that would be additive to the API of this module that I'm working with, and so that I think of as a relatively safe operation. My hesitation was more around how does someone figure it out if they're working with this? And particularly, the name of the method that I was introducing was bimap so, B-I-M-A-P. And if someone sees that in our codebase and is like, "Bimap, where is this coming from?" Well, this is one of those dry-monad result objects. And they go to the code, and they try and look it up in the docs, and they're just not going to find anything.

And I can imagine losing a lot of time to try and chase that down. There are ways to figure it out. There's the method in Ruby, which is a wonderful trick for chasing things down. Or if you grep the codebase, you'd find it. But I think I'm possibly over-indexed on worrying about that lost time, that moment. But I've lost that time so many times in my life where I'm like, I can't grep for this. I can't Google for this. And so I have so strongly moved in the direction of being like, everything should be grepable, everything should be googleable. Those are the two of the things that I believe about software. I think I believe a bunch of stuff.

STEPH: I think we have a full episode that talks about what we believe in software.

CHRIS: I believe we do.

STEPH: Cool. Thanks. Yeah, I have not heard of refinements. That sounds really interesting. I really like that bit about everything should be grepable, and everything should be googleable, googling everything. I kind of agree with that one. We live in a world where we're always doing bespoke things so that one feels a little bit harder that we're always going to be able to Google it. But then that encourages people to constantly publish the bespoke work that they're doing so then others can benefit from that work. But the grepable, I absolutely agree with that one. It's so frustrating where I see a method, but I cannot find its definition. And then having the ways to figure out where that method is defined to then find its definition is crucial.

CHRIS: Yeah, it's interesting. I definitely feel that way very strongly. And it's in such stark contrast to Rails. Rails is like, hey, don't worry. There's going to be a lot of methods. You don't need to worry about where they come from, or why they exist, or what they are, or what they do. Well, probably what they do. But all of the magic inflections on database tables,, and suddenly you have methods named after every column. That's both very magical and hard to grep for or impossible to grep for, but it also leaks the entire structure of your database into your application in a way that I've always felt a little bit complicated about. And so explicitness, grepability, those are things that I care about.

There's another one, delegates in Rails, that I sometimes pause around using especially when it's like delegates 19 methods to user prefix user. And so you end up with methods that are like username. And that's a delegation to the user object to get the name method off of it, but it creates the method user_name. And you're never going to be able to grep for that. And it saves like a little bit of code, definitely, but it saves this very obvious, very knowable code. So this one I actually shy away from using delegates in most cases, and I'll just write out the methods manually because sometimes I like to hear the clackety of my keyboard. There's a reason I have a clackety keyboard.

STEPH: You want to get your money's worth. You want to clackety as much as possible. Yeah, I'm also not a fan of delegates. This may be a lie, but I don't know that I've actually ever used it. I've worked with it, but I can't think of a time that I've implemented delegates. Maybe that's a lie, but I'm going to say it anyways because that feels true, at least in the last couple of years.

CHRIS: I feel like that could be true for the last couple of years. I would be surprised if you have never even added to a delegates line. Because that's the thing, you can just keep shoveling stuff into them as well. So I would put money on you having used it at some point and then just forgotten about it. But who knows, maybe not.

STEPH: This is where we play two truths and a lie and that one's my lie. [laughs] Yeah, that's also fair about adding to it because if that's already defined and it's easier to add to it, I don't know. Who knows what past Stephanie has done, probably some wild stuff.

CHRIS: It's unknowable at this point. It's lost to the sands of time. But looping back to the core thing of this refinement and the module, I think I'm leaning in the direction of doing that and unwinding my wrapping and wrapping layer thing. Because obviously, as I talked about...I think it was the previous episode or maybe two episodes ago. There was conceptual complexity to the additional wrapping layer. Even as I was fully in the context of working on that, I was still getting myself confused in either triple wrapping or then unwrapping too much or whatever. And these are the concerns with this type of code. So moving away from that feels better, having just a single layer of context wrapping around a given value.

And then the other thing it's actually just a lot less code, and it's less prone to error, I think. That's my hope. I have to look into exactly how refinements get used, but I noticed in a couple of places that sometimes we were wrapping with this local value object that gave us the bimap method, and sometimes we were forgetting to. And so, I could see that being a very subtle, easy way to introduce failures into the app that would be hard to catch just by looking at it.

So I think having a more global refinement...although I think that's sort of a contradiction, a global refinement because I think refinements are meant to be local. But anyway, I'm going to look into it because it's a much more concise code sample than what I have. Yeah, I'm going to poke at that a little bit. But it was an interesting exploration of some different things. And then it forced me to consider why am I so resistant to monkey patching at this point, especially in this particular case where I think it's okay-ish?

STEPH: That's a good question. Do you have any insights? I am also resistant to monkey patching. I feel that pain and also that timidness of diving into that space. But I'm curious, have you figured out any other reasons that you really prefer to avoid it?

CHRIS: I think this one falls into that sort of...what's the word? Like tribal knowledge of we've been burned by it in the past and therefore we build almost a...religious is too strong of a word but that sort of cultural belief. This is a thing that we do not do because of the bad things that we've experienced in the past. And there are a lot of things that fall into that experiential negative space.

So with monkey patching, things that I know we can run into is if I introduced this bimap method, but I introduce it subtly differently than the library will eventually, then they could eventually introduce it themselves. And suddenly, I have this fork of my code expects it to work this way, but you've now implemented it that way. I no longer can upgrade. This is a critical piece of infrastructure in my app. I've just painted myself into a corner by doing this. Whereas if I do this wrapping layer, that's my code. I own that. It's not going to be a problem in that same way.

There's also the subtlety, the grepability that sort of thing is a concern in my mind. Like, is this our code? Is this their code? Is this an engine? Being able to find code within a codebase, I think, is a critical thing. And so that's a part of the hesitation. I also know longer ago prototypes...I want to say Prototype JS was the name of the project, but it was one that was just like, yeah, JavaScript doesn't have enough stuff in the standard library. So we're just going to override everything and add all of these wonderful methods sort of in the way that Active Support does, which is an interesting comparison.

But the JavaScript community definitely moved away from Prototype. And now JavaScript is a language or the standard runtime that's available in most JavaScript engines. It has a lot of the methods, but there are conflicts, and stuff gets weird, and it's all complicated. But again, as I thought of it, Active Support is a complete contradiction to everything I'm saying. Active Support just adds whatever to anything, 2.days.ago. Why does the number 2 have a days method? Because it's great, that's why. But I'm just a walking contradiction, I guess.

STEPH: Everything you said really resonates with me. And I'm just trying to reason with myself like yes, Active Support uses a lot of this, a lot of metaprogramming, and adds everything it wants to. So why does that feel okay? And I wonder if it comes down to one is more almost like an agreed standard. It's built by a team, and it's maintained by a team, and then it's used by a large number of people, and then you get that feedback. Or maybe it's not even just a team, but it's a larger community versus if it's internal to your software team, maybe that doesn't feel like a big enough group or if it just needs...Rails is also documented. So maybe that's part of it, too, is if you are going to dive into that space, it's easy to discover, and it's well-documented as if you are building an open-source project that other people are going to use. Like, you designed for the intent of people to use this pattern that you've introduced, then perhaps that's when it starts to feel okay. ,

But the experiences I have had is where people basically will add some dynamic programming or monkey patch an existing feature. And then that's very hard to find and has surprising results, or it gets outdated. So I guess it comes down to who are you designing for? Are you designing for more of an open-source community, or you're at least designing for the people behind you that are going to be using this? Or is this a one-off adventure that you have chosen for yourself and future developers to discover? [chuckles]

CHRIS: Yeah, I think that's a good summary, although I'm open to the fact that I exist in a state of contradiction. I'm also fine with that, to be clear. [chuckles] But I think what you said is true, and I think there is subtlety and nuance and reasons that it's okay in one context and less okay in others. And that idea of just like, I don't know, this is one of those things that I got in my head that I've done the thinking a long time ago to decide this is a thing I don't do.

So now, in order to override that, I would have to do so much thinking. I would have to be like, all right, well, my brain tells me, no, but I'm going to go reread everything about monkey patching right now to convince myself that it's okay or to fully get the context and the subtlety and the nuance. And so sometimes we have to rely on that heuristic knowledge of monkey patching, nope, don't do that. That's not a thing, but other stuff is fine. And well, Active Support is fine because it's Rails. But it is interesting to observe contradictions and be like, huh, look at me go. All right. Well, moving on.

STEPH: It's our lizard brain that's saying, "Hey, there's danger here." [laughs]

CHRIS: Exactly.

STEPH: I rather like living in a world of contradictions, or at least I find it that I'm drawn to them. And maybe that's also one of the things that I really like about consulting is because then I join all these different teams, and I hear all these different opinions. So as I'm forming these opinions around something like tests are great, I really like tests, and then someone's like, "I really hate tests." I'm like, "Cool. Let's talk. I want to understand why you don't like this thing that I think is wonderful because then I'm really interested." So I find that I'm often really drawn to contradictions as I like hearing opinions that are very different than mine and finding out why people have a different opinion than mine.

CHRIS: Yeah, the world is full of contradictions. So it's, I think, at least a useful way to exist in the world, to be open to them and to enjoy exploring them. But yeah, I'll update in future weeks if I do end up going the refinements route. I'll let you know if anything interesting falls out of that.

And now we're going to take a quick break to tell you about today's sponsor, Orbit. Orbit is mission control for community builders. Orbit offers data analytics, reporting, and insights across all the places your community exists in a single location. Orbit's origins are in the open-source and developer relations communities. And that continues today with an active open source culture in an accessible and documented API.

With thousands of communities currently relying on Orbit, they are rapidly growing their engineering team. The company is entirely remote-first with team members around the world. You can work from home, from an Orbit outpost in San Francisco or Paris, or find yourself a coworking spot in your city

The tech stack of the main orbit app is Ruby on Rails with JavaScript on the front end. If you're looking for your next role with an empathetic product-driven team that prides itself on work-life balance, professional development, and giving back to the larger community, then consider checking out the Orbit careers page for more information. Bonus points if working in a Ruby codebase with a Ruby-oriented team gives you a lot of joy. Find out more at orbit.love/weloveruby.

STEPH: So we made a recent improvement to our feature flag system, which I'm really excited about, that we have found a way to improve that workflow because it felt really great that we're...well, okay, I should say that with a caveat. It felt really great that we're using feature flags to ensure that the main branch is always in a deployable state. But it did not feel great around how tedious it was becoming to add all of the feature flags specifically because each time we're adding a feature flag, we're having to add a migration. So we're having to run a migration, add the feature flag column, and then we can interact with that feature flag. And that part's okay. It was more removing that feature flag once we're done with it, that that part was starting to feel tedious because then that's becoming a two-deploy process.

So one change is to remove the code that's relying on that feature flag. And then the second deploy was to actually drop that column because we wanted it to be safe to make sure that the code wasn't trying to reference a database column that didn't exist anymore, which is what happened at one point at first when we weren't doing the two-deploy process.

So the improvement that Chris White came up with is where we're now using a Postgres JSONB column. And it's here that we actually have a feature flag YAML file. And we can have the name of the feature flag. We have a description of the purpose of the feature flag. And we have an enabled property on there, so then we can turn it on and off. The benefit of this is now we don't have to do that two-deploy process. And we also don't have to run a migration for when we're adding a new feature flag. So we can add it to the feature flag file, we can load it in, and then we can set that property to say, "Yes, this is enabled," or "No, it's not." And that has just simplified our feature flag process.

One tricky bit that I believe the team ran into is around enabling this with Active Admin because Active Admin was just relying on those database columns to then turn something on or off. But then we've added some methods that work well with Active Admin that then say, "Read from here when you're checking to see if something is enabled," or "Look at this list to see which feature flags can be turned on and off." So it's been a really nice improvement, and everybody on the team seems to be in favor of the ways that we've improved this. So it's been really nice. So I wanted to come back and bring an update on how we've simplified our feature flag system.

CHRIS: That definitely sounds like a nice improvement, the ability to just more regularly iterate around that or taking away the pain, any pain associated with using feature flags. Because they are such a nice thing to have, but there's that overhead. Then you start to have that voice in your head that's like, do I really need a feature flag for this? Could I just sneak this one in? And we always regret that.

I had a similar thing this week where I wrote some code. I didn't quite write as many tests as I should have. And it was wildly broken, just like all of the connection points through everything were broken. But then it pushed me in an interesting direction where I was like, well, what I'm going to do is write an integrated test. It was basically an event coming in from a webhook that then enqueued a job, which did a thing, which then spit out an email. But it was broken at like three layers, and I was very embarrassed, if we're being honest. But, I don't know, I was just having a low energy afternoon, and I did not write the test, which I know I'm supposed to do.

So similarly, any pain that we can take out of these things that we're supposed to do, any way that we can pave the happy path, I'm all about those. I'm intrigued because I think we've talked about this before, but it sounds like you guys have a very home-grown feature flag system. Is that true?

STEPH: We do.

CHRIS: Is there something about it that makes it unique to your situation, or was it just like that's what happened? Someone early on was like, "We need feature flags. I can just do the simplest thing that works," and then that's where you're at now or?

STEPH: You're asking a very good question. And I'm trying to recall what led us to the state that we're in because I feel like we had this same discussion several episodes back when we were introducing the home-grown feature flag system. And I was like, there are reasons, but I didn't really dive into those reasons because it felt very custom to the application. But now I've forgotten what those reasons were. So I think you ask a great question where it'd be worth revisiting to confirm that yes, there's a reason for this home-grown version versus using something like Flipper.

CHRIS: I'm glad I'm at least consistent over time in the questions that I ask and the heuristics that I have. This does feel like one of those things. It's not quite like crypto where I'd be like, we can never write our own crypto. But a feature flag system, I would be really intrigued if there are things that they are just workflows or functionality that you really need that are not supported by any of the existing solutions that are out there. I think audit trails is an interesting one. I think Flipper has a hosted product at this point that does that, but the local version wouldn't necessarily. So maybe that's a thing that you want to get. Again, I'd just be really interested. It sounds like the current state of the world that you have is enabled or disabled; just broadly, that's it. Those are the two states for any given flag. Is that true?

STEPH: It is. There's nothing complex with the flags in that nature. And then we use naming to indicate if something is more for beta, so if it's a change that we're making to the codebase, but it's a feature flag that we plan on removing, versus maybe it's a feature flag for enterprise customers.

CHRIS: Oh, interesting. I wouldn't think of using a feature flag in that context where it's going to be like a persistent, long-lived; this is conditional logic around some state or some property of the viewer. I think of feature flags as a way to gate code conditionally based on a point in time. And the reason I asked about the enabled-disabled basically like the Boolean state for your flags is when I've worked with feature flags in the past, I've liked having the ability to say, for this user or these users, or this group of users, which we've named this is our beta list…and it's the ten people that just really love the product and are happy to bump into some rough edges. And so we'll put things on for them first or even like percentages, so roll it out to 10% and then 50% and so on. And I think the larger an application and user base gets, the more that sort of thing starts to feel right.

STEPH: Yeah, we certainly have some complexity around where each customer can really specify which features that they want. And then the features also differ a bit for each customer. So we are in a world where we're pretty customized or configurable for different customers. And whether that's something that we could simplify, that would certainly be a good question or something to pursue.

But part of this also feels like our decision may have been based around what the system was already doing, and we're looking for ways to make slow improvements versus trying to redesign the whole thing. Because initially, the way we were customizing all of these different features for customers was in a YAML file. And that part was painful because then, anytime we wanted to make a change, it required a deploy. So the introduction of feature flags is really to get away from having to deploy to then make a small change like that.

But now that we're in the space that we can easily configure that change and do that on the fly and not have to issue a deploy, I think we're now in a good space to reassess. And the team may have some really good answers. Perhaps I'm just not recalling as to why we've chosen the more home-grown feature flags. But yeah, I'll visit that topic and report back. Because I've been coasting along on our new system and enjoying it, but you're asking some really good questions.

CHRIS: I mean, as an aside, if you're coasting along and really enjoying it, then maybe you don't need to ask any questions. It's still interesting. I would be intrigued to know. But if it's not causing you any pain, then you probably shouldn't change it. Because frankly, changing out the feature flag system is going to be non-trivial, I'm pretty sure. You could feature flag the feature flag system, and then you can transition from one to the other. You need a third feature flag system for that. But anyway, I digress. [chuckles]

STEPH: You referenced crypto earlier. So I think I like the feature flag, the feature flag system. We should have some crypto flags in there somewhere. I think that's a thing too. But I think the main goal if I'm looking into changing it would be, circling back to what we were talking about earlier, is discoverability, so having a home-grown feature flag system. How easy is it for…if nobody was around on the team and there was someone new working with it, how easy would it be for them to turn something on or off? And if that's easy, then that's great. Then I think we've got a great home-grown system. If that's challenging, then I definitely think it's worth reassessing.

And now a quick break to hear from today's sponsor, Scout APM.

Scout APM is leading-edge application performance monitoring that's designed to help Rails developers quickly find and fix performance issues without having to deal with the headache or overhead of enterprise platform feature bloat. With a developer-centric UI and tracing logic that ties bottlenecks to source code, you can quickly pinpoint and resolve those performance abnormalities like N+1 queries, slow database queries, memory bloat, and much more.

Scout's real-time alerting and weekly digest emails let you rest easy knowing Scout's on watch and resolving performance issues before your customers ever see them. Scout has also launched its new error monitoring feature add-on for Python applications. Now you can connect your error reporting and application monitoring data on one platform.

See for yourself why developers call Scout their best friend and try our error monitoring and APM free for 14 days; no credit card needed. And as an added-on bonus for Bike Shed listeners, Scout will donate $5 to the open-source project of your choice when you deploy. Learn more at scoutapm.com/bikeshed. That's scoutapm.com/bikeshed.

CHRIS: One of the things that's been interesting working lately in the app that I'm building is thinking about testing. We have a number of interactions with third-party services. Frankly, a lot of the app is that at this point. We have a handful of different external data providers systems that we're interacting with, webhooks and flows and things like that. And so we had to make that decision that you always have to make in these sorts of situations which is, how are we going to test this?

And there's a wonderful blog post on the thoughtbot blog called Faking External Services in Tests with Adapters. It's by the one and only German Velasco. And it is a beautiful summary of the different approaches that you can take, but it really dials into one, which is the adapter pattern. There's also a weekly iteration episode on Upcase with Joël Quenneville, which discusses a little bit more of an exploration of the different options. There are sort of a handful of different options that we can consider your whereas the blog post by German talks specifically about the adapters approach.

But to talk about them briefly, there's one where you can go all the way outside your app, spin up a fake service. Typically, we would do this with Capybara Discoball, which is a wonderfully named project. But it allows you to spin up a little Sinatra app type thing such that your web application is still making quote, unquote "real HTTP requests." This external service is going to catch that and respond with whatever canned data or structured responses that you want.

But you still have the ability in that to, say, tell it to create data beforehand or be in a certain state or respond with certain data or have any stateful persistence. So if you create a record in that external system, and then later you query for it, that system can do that. But it has the complexities of now, your test suite is running different systems. And do you have thread-safety or all that kind of stuff? So that's a particularly complex end of the spectrum. At the lowest end would be stubbing and mocking. You just take whatever external clients you have, and you're mocking the API calls in them. That's the lowest end. And that's the one, especially for feature specs, those I try and avoid. Then there's a middle ground of like WebMock or VCR, those sort of things where you're saying whenever you see an HTTP request that looks like this, respond in this way. You record the cassettes, all that kind of stuff.

And then there's the one that we've settled on, which is the adapters. So the client that we've introduced in our local codebase to interact with any of these third-party systems internally has a class attribute, a cattr_accessor in the Rails parlance, I believe. And that allows us to switch out the backend. And so we have a real HTTP backend, and that's the one that actually runs in production and a test in-memory backend. And that in-memory backend can implement whatever logic. We're ending up with one of them almost recreating this external service, sort of re-coding some of their inconsistencies or oddities but also features and whatnot.

But it feels like it has struck just the right balance, and it allows our feature specs to be very rich, very real. We start up the world, and we say, "Hey, external service be in this state." And then I'm going to go visit the page. I'm going to see the data. But we are almost making real HTTP requests. It's very close. It's always an interesting choice to make here. I'm very happy with the one that we've made, but it's still not perfect. There are always going to be trade-offs between the different options here. But it's always interesting revisiting this and being like, which one am I going to choose today?

STEPH: I feel like my natural progression when testing external services; I always start with WebMock, and then I progress to using adapters. And then from there, I go to actually replacing the HTTP service that is receiving and then returning a response, like you mentioned to Capybara Discoball earlier. So I can certainly see what you like about the adapter pattern. You mentioned that you're coding some of the inconsistencies. That feels very real. I'm curious if you have an example of how you've had to manage that recently.

CHRIS: A specific example would be the external API responds with certain error codes or error structures. So it's an error. It has a status of a number and then a reason, or sometimes instead of a key that is reason; it’s the message. So it's like, oh, okay, I see that in this endpoint, you respond with reason, and then this endpoint you respond with message. So now, do I encode that into my fake? I guess I do. So my adapter now implements things like that. There are cases where it's inconsistency where I'm like, well, this is the way they behave. So I would like our test suite to exist in the context of that because then our app is getting exercise in a real way.

But in some cases, it's like little bits of logic validation that an external system might do if that's an important part of the flow. The app that we're building has a lot of forms and a lot of data validation and things like that. And so, we want to make sure that we have robust handling around that robust messaging to the user so that it's very clear what they need to do and how they need to respond to things. And so putting in little bits of that like, oh, that's how you format a phone number, okay, cool. Our fake will also format phone numbers in that way, things like that.

STEPH: Every time the topic of testing external services comes up, I really, really want VCR to be the answer. I really like the idea of being able to validate that...because you'd mentioned that we're programming the expected return from this other service. And it's very easy to get out of sync with those actual responses. And then we don't really have a great way to stay up to date other than we wait for production or staging environment to fail. And then we realize something has changed, and we have to go and update either our mock or our adapter. And maybe that doesn't happen often if you're working with an external service that is very good about broadcasting when they have a breaking change.

But if you're working with a less stable endpoint, then I always want VCR to really work. But it's just one of those areas where I'm like, yes, that's the thing that I want. I want this idea where I can rerun my tests in a way that they actually hit that service and record the response. But then I have felt pain [chuckles] from working with VCR and how it's configured, and how people have used it. It's one of those where I don't blame the library. I like the library. But the way people have implemented it and test I have felt a lot of pain from that.

CHRIS: Yeah, I definitely agree with that. It feels like it's nice if you can push the mocking all the way out to that layer. Because like right now, our codebase has code in it that is subtly changing the behavior for a test, and I don't like that. It's only the swapping out of the adapter, so it's a very minimal thing. And we try and push all logic away from that such that the test adapter is as similar as possible to the real production situation. But it's enough difference that I agree I would like if VCR would just like, I catch the HTTP requests, and I respond with the same thing and sometimes we can pass through.

I do think one of the fundamental limitations, or at least very hard to get right things, would be sequential requests. So I post to this endpoint in the external service, which creates some data. And then later, when I make a GET request to their endpoint, I should get back that data that I just created. That's, I guess, doable because you can have sequential requests, have cassettes that are first this request, then that request, then that request. And it knows that, like scope them to a given spec. But that feels extra difficult. And it does, again to your comment, the maintainers of that project do a wonderful job, but it's a really hard target to hit.

STEPH: Well, and one of the other hard requirements with using a tool like VCR is then that external service really needs that sandbox staging environment that you can use. So that way you can create this data, you can rerun your test. So they're actually going to hit this real environment. They're going to create this data and that not have any harmful effects. And then you can record fetching that data. So it requires a lot of pieces to fall into place for it to work well. But then I was just thinking as you're talking about adapters, I'm like, yeah, I love the adapter pattern. I've really enjoyed that one for testing as well. But then I immediately start to think, oh, well, what happens when it gets out of sync, and how do we know that it got out of sync? And I don't have a great answer to that.

CHRIS: Production blows up, obviously.

STEPH: Production blows up, and then we go update our adapter. That's very calm. [laughs]

CHRIS: It would be great if CI could more proactively catch that or...yeah, I agree. I would love if VCR would work because that facet of it is so attractive. But [chuckles] I've never gotten to walk exclusively the happy path with VCR. So here we are. This is a classic case of here's four options as to how we can think about this hard and important thing that we do in our codebases, and they all have trade-offs much like everything else in software.

STEPH: I'm going to add this to my developer bucket list to live in a world where I can easily validate if an external API has changed or not and then also have tests that know when something has broken before production does.

CHRIS: Ooph, dare to dream. I like it.

STEPH: I'm a dreamer.

CHRIS: I want to live in that world. Well, with that wonderful dream to take us out, should we wrap up?

STEPH: Let's wrap up. The show notes for this episode can be found at bikeshed.fm.

CHRIS: This show is produced and edited by Mandy Moore.

STEPH: If you enjoyed listening, one really easy way to support the show is to leave us a quick rating or a review in iTunes as it helps other people find the show.

CHRIS: If you have any feedback for this or any of our other episodes, you can reach us at @_bikeshed on Twitter. And I'm @christoomey.

STEPH: And I'm @SViccari.

CHRIS: Or you can email us at hosts@bikeshed.fm.

STEPH: Thanks so much for listening to The Bike Shed, and we'll see you next week.

All: Byeeeeeeeee.

Announcer: This podcast was brought to you by thoughtbot. thoughtbot is your expert design and development partner. Let's make your product and team a success.

Sponsored By:

Support The Bike Shed

Episode source