Experimentation: what’s in it for backend developers?
An interview with Tech Lead Frank Hu about the benefits of experimenting on the features your users can’t see.
Here at Optimizely, the Product and Engineering teams use experimentation in their daily workflows. It’s so tightly woven into the fabric of the organization that it can be easy to forget that experimentation isn’t top of mind for every team. I recently heard a question from a backend developer who works at another company. The inquiry was along the lines of, “Isn’t A/B testing and experimentation just for driving conversions? How would I use it on the things I work on in the background that a user cannot see?”
I rattled off several things one might test in that scenario: algorithm optimizations, API improvements, architecture frameworks, and migrations. All of them are opportunities to use experimentation to measure the impact of the changes you are introducing, whether or not the work has a direct impact on the UI.
The goal of a backend experiment could be to measure an improvement in latency or storage. Not all tests are designed to directly drive clicks, purchases, and engagement. In many cases, the goal of a test might just be to quantify the difference between the before and after scenarios.
Travis Beck, an engineer at Optimizely, wrote a blog post illustrating an experiment that improved backend performance. It left me wanting to hear more about these kinds of experiments and the mindset of the engineers who create them. I sat down with Frank Hu, a Senior Software Engineer at Optimizely, to dig into what goes into a backend experiment, and what might come out.
Perri: Hi Frank, thanks for taking the time to talk with me. I was hoping you could tell me a little bit about your engineering background and role here at Optimizely?
Frank: Sure. Before coming to Optimizely, I was a founding member of a Y Combinator startup called MailTime, a mobile solution to modernize the way people interact with email. Think of it as more of an instant messaging interface for email. Following that, I came to Optimizely. I’ve been here for two-and-a-half years now.
I started out as an engineer on our monetization team, managing the backend of how features mapped to our pricing plans. Now I’m the technical lead of our data infrastructure team and I’m responsible for all the event data our customers send to Optimizely. That’s about 10 billion events every day that we return in real time for use in our customers’ experiment reports.
Perri: Before coming to work at Optimizely, what was your experience and perspective on A/B testing and experimentation?
Frank: We did A/B testing at my startup before I joined Optimizely. As a much smaller team, the experiments were mainly focused on the UI. We had built our own method for bucketing users into their respective test variations. We used our analytics tool to try to understand what happened when the test went live, but it didn’t necessarily tell us what had changed, or anything about statistical significance. Needless to say, understanding the metrics was a pain point for the engineer who had to handle the traffic splitting and build out the full UI for each variation of the experiment. It was a long process.
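As an aside for readers who have not built this themselves: home-grown bucketing like the kind Frank describes is often done by hashing a user ID. The following is a minimal Python sketch under that assumption. It is not MailTime’s or Optimizely’s actual implementation, and the function name `bucket_user`, the experiment key, and the variation names and weights are all hypothetical.

```python
import hashlib
from typing import Sequence, Tuple


def bucket_user(user_id: str, experiment_key: str,
                variations: Sequence[Tuple[str, float]]) -> str:
    """Deterministically assign a user to a variation.

    `variations` is a sequence of (name, weight) pairs whose weights
    are assumed to sum to 1.0.
    """
    # Hash the experiment key together with the user ID so that the same
    # user can land in different buckets for different experiments.
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode("utf-8")).digest()
    # Turn the first 8 bytes of the hash into a roughly uniform number
    # between 0 and 1.
    point = int.from_bytes(digest[:8], "big") / 2**64
    cumulative = 0.0
    for name, weight in variations:
        cumulative += weight
        if point < cumulative:
            return name
    # Fallback in case floating-point rounding leaves a tiny gap at the top.
    return variations[-1][0]


if __name__ == "__main__":
    # Hypothetical 50/50 split for a backend change.
    split = [("control", 0.5), ("new_algorithm", 0.5)]
    print(bucket_user("user-42", "ranking_test", split))
```

Because the assignment depends only on the hash, a given user always sees the same variation without any stored state. As Frank points out, though, splitting traffic is the easy part; a sketch like this says nothing about which metrics moved or whether the difference is statistically significant.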
Perri: Tell me about one of the early and impactful experiments you ran once you got to Optimizely. What were the results?
Frank: Back when I was on the monetization team at Optimizely, we overhauled the way our functionality mapped to billing and plans. We refactored the code so that every feature could be toggled on/off and made accessible to accounts under different plans using Optimizely Full Stack.
The refactor was a big deal because, before that, business analysts were making the calls about which features should be a part of which plans. It was not a very data-driven process, and any time we wanted to change the plan structure, the engineers had to create a new model in the database. We had over 300 features. It was a big project and a ton of work.
The results were worth the upfront investment. By breaking out every feature with feature flags, our product managers were able to manage the plan structure rather than relying entirely on the engineers. And by approaching the change as a test, we were able to leverage data to understand how the plans should be structured to begin with.
All of this translated into huge time savings for product managers, the engineers responsible for features, and the engineers responsible for the code releases. It allowed for asynchronous collaboration and significantly reduced dependencies around feature releases. Finally, it provided a new level of transparency where people were not required to dig deeply into the code every time they wanted to understand what features were being built and updated.
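To make the idea concrete, here is a minimal Python sketch of features gated by flags that are mapped to plans. It illustrates the general pattern only and does not use the Optimizely Full Stack SDK; the `FeatureFlag` and `FlagRegistry` classes, the feature key, and the plan names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class FeatureFlag:
    key: str
    enabled: bool = True                            # global on/off toggle
    plans: Set[str] = field(default_factory=set)    # plans that include the feature


class FlagRegistry:
    """In-memory stand-in for a flag service that non-engineers could edit."""

    def __init__(self) -> None:
        self._flags: Dict[str, FeatureFlag] = {}

    def register(self, flag: FeatureFlag) -> None:
        self._flags[flag.key] = flag

    def set_plans(self, key: str, plans: Set[str]) -> None:
        # Changing the plan structure is a data change, not a new database model.
        self._flags[key].plans = set(plans)

    def is_enabled(self, key: str, plan: str) -> bool:
        flag = self._flags.get(key)
        # Unknown flags are treated as off.
        return bool(flag and flag.enabled and plan in flag.plans)


if __name__ == "__main__":
    registry = FlagRegistry()
    registry.register(FeatureFlag("advanced_reports", plans={"enterprise"}))
    print(registry.is_enabled("advanced_reports", "starter"))   # False
    registry.set_plans("advanced_reports", {"enterprise", "starter"})
    print(registry.is_enabled("advanced_reports", "starter"))   # True
```

The design point is that application code only ever asks `is_enabled`, so moving a feature between plans requires no code release.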
Perri: Does experimentation serve a different purpose for backend processes than for UI features?
Frank: Yes, they are different. With UI experiments, you really don’t know what the impact or results will be. With the backend, you often know that your proposed solution will likely be better than the status quo, but you don’t know exactly how much better it will be. Experimentation can help you determine if your solution is done or if you still have work to do to achieve your desired results.
With the backend, you’re dealing with complex systems, which inherently means your tests will be complex too. For example, you may be considering introducing a microservice vs. a service mesh and you may have a defensible theory, but without testing in production with real data, how will you know if it is truly the best in practice? Sure, you can use benchmarks and samples to test distributed systems. But sampling is not representative of the true production environment. It’s very difficult to replicate a production environment.
Statistically significant experiments run in production allow you to truly know the impact with real data and you can evaluate the tradeoffs.
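As a rough illustration of what checking significance on a backend metric can involve, the Python sketch below runs a simple two-sided permutation test on latency samples from two variations. This is one generic technique chosen for the example, not the statistics engine Optimizely uses; the function name `permutation_p_value` and the sample numbers are hypothetical.

```python
import random
from statistics import mean
from typing import Sequence


def permutation_p_value(control: Sequence[float], treatment: Sequence[float],
                        n_permutations: int = 2000, seed: int = 0) -> float:
    """Estimate how likely the observed difference in means is under pure chance."""
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    observed = abs(mean(treatment) - mean(control))
    pooled = list(control) + list(treatment)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly relabel the measurements and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        if diff >= observed:
            extreme += 1
    # The +1 terms keep the estimate away from an exact zero.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up latencies in milliseconds, for illustration only.
    old_path = [100 + (i % 7) for i in range(50)]
    new_path = [90 + (i % 7) for i in range(50)]
    print(permutation_p_value(old_path, new_path))
```

A small p-value suggests the latency difference is unlikely to be noise. Real latency data is often skewed, so comparing percentiles rather than means may be a better fit in practice.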
Perri: Do you have any advice for backend engineers who want to get started but are not yet regularly using experimentation in their workflows?
Frank: Well, experiment-driven development is not as straightforward as using a visual editor like Optimizely Web to drag and drop UI changes for testing. I think about experimentation as a new way to refactor code. There are some parallels with Test Driven Development for sure. And when it comes to performance tuning, it can only really be solved in the production environment. Full Stack experimentation makes it easy for you to test in production.
The last thing I’d recommend keeping in mind when getting started with testing in your product is that as the complexity of your system increases, the complexity of your experiment setup will too. And this actually makes it even more important to do the testing.
Taking an experimentation-driven approach to software development can have huge advantages for validating your work, uncovering the best solutions, and streamlining communication and processes between teams. Ready to move from theory to action? Dive into the Full Stack developer docs and learn how to put experimentation and feature management into practice. Looking for a free solution to get started with today? Optimizely Rollouts offers feature flagging for free and is built using the same Full Stack SDKs and platform.