How to find excellent refactoring opportunities

I was doing some code review today and I realized after the fact that the refactoring I wrote for the review was a great example of how I iteratively evolved a piece of code into something really great. So here's a step-by-step approach to finding and performing a code refactoring you can be proud of.

Prerequisite: Read Refactoring

The first thing you should do is read Martin Fowler's Refactoring. You're super in luck too because the 2nd Edition is going to be in JavaScript.

Since you can't get that now, go read the Ruby Edition. The Ruby edition is better because it's easier to relate to in the front-end world since Ruby is a simple scripting language and way less verbose than the original version in Java.

Why do you need to read this book? Because it's all tactics. This book has something like 40+ techniques for finding code smells and refactoring them into something beautiful. You probably already knew the really simple ones like Extract Method and Extract Class, but the more advanced ones are where it really shines.

One client I worked with had this heavy reliance on using switch/case statements. Those aren't inherently bad but more often than not you can write something cleaner if you replace the conditional with polymorphism. Stuff like this is gold, pure gold!

Refactoring is gold

An example in refactoring

Here's the (obfuscated) code I was working with to begin our journey:

const addEvents = (events, label, segment) => events.push([label, segment]);

const main = () => {
  const [upcoming, past] = partition(
    arr, date => today.diff(date, 'day') <= 0
  );

  const events = []

  if (upcoming.length > 0) {
    addEvents(events, 'upcoming stuff', upcoming);
  }

  if (past.length > 0) {
    const byMonth = groupBy(past, date =>
      moment(date).format('YYYY-MM');
    );
    const months = Object.keys(byMonth).sort().reverse();
    months.forEach((month, idx) => addEvents(events, `${idx} months ago`, month);
  }

  return events;
}

This code is pretty simple - I've got two arrays of dates: upcoming dates, and past dates (including today). I just need to let the world know what's coming up and what is in the past, so it's not a simple push to an array because each chunk is split off by month (except for the future, which is just labeled future stuff), and each section is really a tuple of label and date.

So what immediately got me tweakin' on this code?

Single Responsibility Principle

By now the SOLID acronym is engrained in my brain. If you don't know what it is, thoughtbot wrote a great intro about what SOLID is so definitely check that out.

The most obvious one that people seem to remember is the first one, Single Responsibility Principle. It's probably the easiest because the definition really says it all. Your class/function/whatever should do one thing and do it well. In this case, I see a bunch of stuff, so my first thought is to break this function up into multiple functions.

How do I know how to break it up? Stuff like if statements are literal blocks of code. Since everything is essentially scoped to blocks in JavaScript, anything inside of a bracket is a good candidate to break out into its own function.

I'm also going to cheat and skip a step, because if statements also provide another subtle hint: the if keyword can usually be refactored into a guard clause. Why? Because it is already guarding some code against being run within your function.

If we rewrite the if in guard clause notation (meaning, exit early before the function has a chance to run), along with the SRP refactoring, we get:

const addEvents = (events, label, segment) => events.push([label, segment]);

const addUpcomingEvents = (events, upcoming) => {
  if (upcoming.length === 0) return;
  addEvents(events, 'upcoming stuff', upcoming);
};

const getPastMonths = past => {
    return groupBy(past, date => moment(date).format('YYYY-MM'));
};

const addPastEvents = (events, past) => {
  if (past.length === 0) return;
  const months = Object.keys(getPastMonths(past)).sort().reverse();
  months.forEach((month, idx) => addEvents(events, `${idx} months ago`, month);
};

const main = () => {
  const [upcoming, past] = partition(
    arr, date => today.diff(date, 'day') <= 0
  );

  const events = [];

    addUpcomingEvents(events, upcoming);
    addPastEvents(events, past);

  return events;
}

Mutation is ugly

Already this looks way better. main()'s single responsibility is populating the array of events and returning them. Each extracted function has one responsibility based on its function definition. So I guess we can call it a day, right? The next smell I see is this pattern:

const myPoorFriendArray = [];

// a world of hurt, warping our pristine, empty friend
blackBoxFunction(myPoorFriendArray);
moreTorture(myPoorFriendArray);
theGauntlet(myPoorFriendArray);

// we chew him up and spit him out, never knowing the horrors he experienced
return myPoorFriendArray;

Ya feel me on this one? Our boy myPoorFriendArray is now an honorary X-Man: mutated from his original, empty form, into some chimera of various parts of the code. Kanye said it better than I can, but this is the stuff I DON'T LIKE.

Maybe you come from an OO world like Java and mutation is just a regular, everyday occurrence. If you're woke and you come from a functional background like Elm, you can see that mutants are bad. We don't want no X-Men fighting for us on the coding battlegrounds (sorry Gambit, love you).

So what's the fix here?

  1. Send the empty array in as an argument to our first add function
  2. The function will take in the empty array and make a copy
  3. The copy is filled out and returned

The argument is not mutated, and we remove state dependencies. If you've done any functional programming, terms like stateless and immutability are everyday terms, and that's what we're striving for.

Our stateless code doesn't have side effects, a nasty code smell that makes bugs more difficult to track down because state persists across functions. Without state, what you put into it is what you get out of it each and every time. Making your functions predictable, and thus less likely to include errors you didn't test for.

Here's what this change looks like:

const addEvents = (events, label, segment) => events.push([label, segment]);

// we didn't touch the middle functions...

const main = () => {
  const [upcoming, past] = partition(
    arr, date => today.diff(date, 'day') <= 0
  );

  return addPastEvents(addUpcomingEvents([], upcoming), past);
}

Now we're getting somewhere! But for the savvy reader, something should still seem off. Can you see what's still wrong?

Negate the state

We didn't really remove the state from the add functions. Yeah, we gave it a starting point in our main() function, but the addEvents() function is still adding events to a singular events array which is getting passed around like a rag doll. How do I know this?

Array.prototype.push is a mutable function. It provides an interface to add stuff to an array and returns the number of elements in the enlarged array. How do we do this in an immutable way?

Array.prototype.concat is the immutable equivalent of push(). It takes an array, combines it with another array, and those two are combined into a newly-allocated array. So we can modify the above slightly to be:

const addEvents = (events, label, segment) => {
  if (!label) return events;
  return events.concat([[label, segment]]);
};

Now we never touch our inputs. events remains as pristine as it was the day it arrived in this humble function, and instead we use concat() to combine events with the new array we've made, which is the array of our tuple (a point of clarification: tuples don't exist in JavaScript, but I'm using that term to describe our 2-item array to make the distinction clearer here).

Now I'm in audit mode: where else do I need to follow this pattern of not touching the arguments and ensuring I return an Array type?

Provide type safety

If you aren't using TypeScript, this is a great way to practice some proactive type safety to ensure that your function returns objects of the same type regardless of the output. That means don't return an Array if you have stuff, but return undefined or null if you exit early (like in a guard clause). Oh crap, I'm doing that. Looks like another opportunity to refactor!

// unchanged but illustrating that all functions, including this one, always return an Array
const addEvents = (events, label, segment) => {
  if (!label) return events;
  return events.concat([[label, segment]]);
};

const addUpcomingEvents = (events, upcoming) => {
  if (upcoming.length === 0) return events;
    return addEvents(events, 'upcoming stuff', upcoming);
};

const getPastMonths = past => {
    return Object.keys(groupBy(past, date => moment(date).format('YYYY-MM'))).sort().reverse();
};

const addPastEvents = (events, past) => {
  if (past.length === 0) return events;
  return getPastMonths(past).reduce((acc, month, idx) => {
      return addEvents(acc, `${idx} months ago`, month), events);
  };
};

Lots to unpack here. So in addUpcomingEvents() all we did was make sure we return an Array and not undefined (ending with just return; is shorthand for returning an undefined object). We do this because concat() returns an array, so we want to make sure all return statements provide the same type of object in any given function.

Next I did some refactoring of getPastMonths() to handle the sorting an reversing, because the groupBy function technically returns an Object, and a way for it to return an Array is to grab the Keys (which is an Array) and do our necessary transformations to that array object.

Finally, addPastEvents() starts out the same as the upcoming function by ensuring our guard clause returns an Array type. The next part is a bit wilder. Originally we were taking an array and iterating over it using Array.prototype.forEach. The problem is that this iterator doesn't return an Array like we need it to. It simply gives us a platform to view every item in our array.

We also know that in the end, we want one array object that adds in all of the past events. When you think about needing to combine things into one and return that combined object, think of using Array.prototype.reduce.

In this case, I knew I needed to add the events (by month) into our events array, and return that newly combined array, using events as the starting point. The reduce function takes two arguments, a callback on how to combine stuff, and an optional initial object to begin with.

The callback is probably the most confusing part - what is this nested return trying to do? The reduce() callback argument has two arguments of its own: an accumulator object, which is initialized to our initial object (if you leave it out, it defaults to undefined), and the current item that is being iterated on. There are two additional optional arguments: the index of the item you are iterating on, and the original array object you called reduce() on. Since we need the index argument to label our months, I added that in.

So with all of that said, the reduce() function is basically saying:

For each past month in my array, add it (with its label) onto my accumulated array, which is the cumulation of every previous iteration, starting with my initial events array.

The final result

Refactored code is good code

It was at this point I called it a day and was satisfied with my refactoring for code review I mentioned in the beginning. The final product looks like this:

const addEvents = (events, label, segment) => {
  if (!label) return events;
  return events.concat([[label, segment]]);
};

const addUpcomingEvents = (events, upcoming) => {
  if (upcoming.length === 0) return events;
    return addEvents(events, 'upcoming stuff', upcoming);
};

const getPastMonths = past => {
    return Object.keys(groupBy(past, date => moment(date).format('YYYY-MM'))).sort().reverse();
};

const addPastEvents = (events, past) => {
  if (past.length === 0) return events;
  return getPastMonths(past).reduce((acc, month, idx) => {
      return addEvents(acc, `${idx} months ago`, month), events);
  };
};

const main = () => {
  const [upcoming, past] = partition(arr, date => today.diff(date, 'day') <= 0);
  return addPastEvents(addUpcomingEvents([], upcoming), past);
}

To summarize, the major things we covered in this refactoring are:

  1. Simplify functions down to a single responsibility
  2. Eliminate side effects by using immutable functions and not modifying arguments
  3. Ensure type safety by verifying all return statements provide the same type of object

This gives us code that is easier to read, less prone to bugs, and easier to test.

Is there anything else you would refactor? Did I miss something? Hit me up on Twitter and let me know what else could make this code better.


Get the FREE UI crash course

Sign up for our newsletter and receive a free UI crash course to help you build beautiful applications without needing a design background. Just enter your email below and you'll get a download link instantly.

A new version of this app is available. Click here to update.