in blog | Screaming At My Screen |
---|---|
original entry | Safely rewriting complex code |
There is a lot of advice out there telling you to never rewrite anything and to improve it in small increments instead. This is sound advice for services, modules or large chunks of code. Yet sometimes rewriting a complex part of your system will be inevitable. This does not mean the above advice does not apply. You still want to minimise risk and, where possible, ship smaller increments instead of rewriting 2000 lines of code in one go. No matter how big you decide to make the change, some guardrails and a safety net will come in handy.
For basically any Python project one of the first developer dependencies I install is pytest. For Django projects I always add pytest-django. For feature flags my preferred library is Django Waffle. The code examples show how to use these three libraries to make a rewrite less scary. But there is no functionality exclusive to these three libs. You can easily replicate the approach with alternatives or in other languages and frameworks.
The complete code and Django project can be found here.
One of the pytest features I really appreciate is `parametrize`. You give it a list of values and pytest passes each one to your test function as an argument. Inside the test you can hand the value to the function you want to cover and check the output.
Let us create a slug for a string. As with all code samples, please keep in mind to not use them in production. You will certainly see why.
def simple(name: str) -> str:
    name = name.replace(" ", "-")
    name = name.lower()
    return name
For a function that generates a slug you might want to test multiple inputs. You want spaces replaced, leading and trailing separators removed from the output, punctuation replaced and multiple dashes collapsed into one.
Instead of writing multiple tests we will use `parametrize`.
import pytest

@pytest.mark.parametrize(
    "name",
    (
        "foo bar",
        "baz zab",
        "1 2 3 4 5",
    ),
)
def test_simple(name):
    # The slug must not contain any spaces.
    assert " " not in simple(name)
Pytest will run three separate tests, one for each element in the tuple. If a test fails, the input is shown as well, so you can debug the individual test case instead of guessing what the problem is.
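To cover the requirements listed earlier (spaces, surrounding whitespace, punctuation, repeated dashes), each parameter can also carry its expected output. The following is only a sketch: `slugify` is a hypothetical, stricter variant written for this example and not part of the original project, and the `id` values are labels I made up so that a failing case is easy to spot in the test report.

```python
import re

import pytest


def slugify(name: str) -> str:
    """Hypothetical stricter slug function, for illustration only."""
    name = name.strip().lower()
    # Replace every run of characters that is not a-z or 0-9 with one dash.
    name = re.sub(r"[^a-z0-9]+", "-", name)
    # Drop dashes left over at the beginning or end.
    return name.strip("-")


@pytest.mark.parametrize(
    ("name", "expected"),
    (
        pytest.param("foo bar", "foo-bar", id="spaces"),
        pytest.param("  foo bar  ", "foo-bar", id="surrounding-whitespace"),
        pytest.param("foo, bar!", "foo-bar", id="punctuation"),
        pytest.param("foo - - bar", "foo-bar", id="repeated-dashes"),
    ),
)
def test_slugify(name, expected):
    assert slugify(name) == expected
```

Running pytest with `-v` lists each case under its id, for example `test_slugify[punctuation]`.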
Now imagine you have a function with a few hundred lines of code doing multiple computations on some input values before returning the output. Rewriting this will be tricky. Partially rewriting it might not be possible. So let us do it the second best way and make sure we keep the risk of deploying the code to a minimum.
First of all I would suggest making sure you test the function as well as you can. More often than not I found small issues with the new implementation by simply running the tests for the old code against the new one.
Next, rename the old function. I prefixed it with `deprecated_` and the rewrite with `rewrite_`. Make sure the accepted inputs are the same. A function with the original name will have exactly three lines of code.
from waffle import switch_is_active

def complex_function(x: int, y: int) -> int:
    # Route to the rewrite only while the switch is on.
    if switch_is_active("new_complex"):
        return rewrite_complex_function(x, y)
    return deprecated_complex_function(x, y)
We are using a Waffle switch, which lets us run either the deprecated code or the rewrite. Something goes wrong in production that you could not catch in dev and QA? Turn off the switch and go back to the drawing board. Realistically you might even want to make this a flag and only roll it out to a small subset of your production traffic to begin with.
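The same dispatch works without Django or Waffle. Below is a minimal sketch, assuming a plain dict stands in for the switch storage. `SWITCHES` and this `switch_is_active` are stand-ins I wrote for the example and not Waffle's implementation, and both function bodies are placeholders for the real old and new code.

```python
# Hypothetical in-memory switch store, a stand-in for a real feature flag library.
SWITCHES: dict[str, bool] = {"new_complex": False}


def switch_is_active(name: str) -> bool:
    # Unknown switches count as off, so the old code stays the default.
    return SWITCHES.get(name, False)


def deprecated_complex_function(x: int, y: int) -> int:
    return x * y  # placeholder for the old implementation


def rewrite_complex_function(x: int, y: int) -> int:
    return x * y  # placeholder for the rewrite


def complex_function(x: int, y: int) -> int:
    # Callers keep using the original name; the switch picks the path.
    if switch_is_active("new_complex"):
        return rewrite_complex_function(x, y)
    return deprecated_complex_function(x, y)
```

Setting `SWITCHES["new_complex"] = True` routes calls to the rewrite, and setting it back to `False` is the rollback.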
All we now have to do is update our tests to call `complex_function` with the switch on and off. As long as the input is the same the output should be the same.
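import pytest
from waffle.testutils import override_switch

@pytest.mark.parametrize("switch", (True, False))
@pytest.mark.django_db
def test_complex_function(switch):
    # Runs once against the rewrite and once against the deprecated code.
    with override_switch("new_complex", active=switch):
        assert complex_function(2, 2) == 4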
For input and output to be the same you have to port the code exactly as is, including any bugs you spot along the way.
This is fine as a first step. Your rewrite should provide better test coverage and make fixing the bugs you ported easier.
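One way to check that the port really is exact, bugs included, is to call both implementations directly over a range of inputs and collect every disagreement. The sketch below is an assumption-heavy illustration: `find_mismatches` is a hypothetical helper, and both function bodies are made-up stand-ins. The stand-ins are written so that they disagree for negative `y`, to show what an accidentally "fixed" bug looks like in the report.

```python
import itertools
from typing import Callable, Iterable


def deprecated_complex_function(x: int, y: int) -> int:
    # Stand-in for the old code: repeated addition, returns 0 for negative y.
    total = 0
    for _ in range(y):
        total += x
    return total


def rewrite_complex_function(x: int, y: int) -> int:
    # Stand-in for the rewrite: behaves differently for negative y.
    return x * y


def find_mismatches(
    old: Callable[..., int],
    new: Callable[..., int],
    inputs: Iterable[tuple],
) -> list[tuple]:
    """Return (args, old_result, new_result) for every input where they differ."""
    mismatches = []
    for args in inputs:
        expected, actual = old(*args), new(*args)
        if expected != actual:
            mismatches.append((args, expected, actual))
    return mismatches


# Every (x, y) pair with both values between -2 and 2.
inputs = list(itertools.product(range(-2, 3), repeat=2))
mismatches = find_mismatches(
    deprecated_complex_function, rewrite_complex_function, inputs
)
for args, expected, actual in mismatches:
    print(f"{args}: old={expected} new={actual}")
```

An empty list means the rewrite matches the old behaviour for the inputs you tried. Anything else is either a porting mistake or a bug you fixed too early.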
Once you have tested and verified your rewrite, it is time to remove the deprecated code and the switch, and to drop the `rewrite_` prefix. Then you can tackle the bugs and start improving the code.