The Law of Demeter Creates More Problems Than It Solves

January 22, 2020

Most developers, when invoking the “Law of Demeter” or when pointing out a “Demeter Violation”, do so when a line of code has more than one dot: person.address.country.code. Like the near-pointless SOLID Principles, Demeter, too, suffers from making a vague claim that drives developers to myopically unhelpful behavior.

While writing SOLID is not Solid, I found the backstory and history of the principles really interesting. They were far flimsier than I had expected, and much more vague in their prescription. The problem was in their couching as “principles” and the overcomplex code that resulted from their oversimplification. Demeter is no different. It aims to help us manage coupling between classes, but when blindly applied to core classes and data structures, it leads to convoluted, over-de-coupled code that obscures behavior.

What is this Law of Demeter?

Update Jan 24, 2020: My former colleague Glenn Vanderburg pointed me to what he believes is the source of the “Law of Demeter”, which looks like a fax of the IEEE Software magazine in which it appears! It’s on the Universitatea Politehnica Timisoara’s website.

It does specifically mention object-oriented programming, and it states a few interesting things. First, it mentions pretty explicitly that they have no actual proof this law does what it says it does (maybe then don’t call it a law? I dunno. That’s just me). Second, it presents a much more elaborate and nuanced definition than the paper linked below. The definition of terms alone is almost a page long and somewhat dense.

Suffice it to say, I stand even more firm that this should not be called a “Law” and that the way most programmers understand it, by counting dots, is absolutely wrong. This paper is hard to find and pretty hard to read (due to both its text and its presentation). I would be surprised if anyone invoking Demeter in a code review has read and understood it.

It’s hard to find a real source for the Law of Demeter, but the closest I could find is a page on Northeastern’s website.

End of Update

This page on Northeastern’s website summarizes the Law as stated in the paper above:

Each unit should have only limited knowledge about other units: only units “closely” related to the current unit.

The page then attempts to define “closely related”, which I will attempt to restate without the academic legalese:

  • A unit is some method meth of a class Clazz.
  • Closely related units are:
    • other methods of Clazz.
    • objects passed into meth as arguments.
    • objects returned by other methods of Clazz.
    • any instance variables of Clazz.

Anything else should not be used by meth. So for example, if meth takes an argument arg, it’s OK to call a method other_meth on arg (arg.other_meth), but it’s not OK to call a method on that (arg.other_meth.yet_another_meth).
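Here is a minimal sketch of how I read those rules. Arg, Inner, and the sibling method are names I made up purely for illustration; only meth, Clazz, other_meth, and yet_another_meth come from the description above.

```ruby
# Hypothetical classes, invented only to illustrate the rule.
class Inner
  def yet_another_meth
    "inner value"
  end
end

class Arg
  def other_meth
    Inner.new
  end
end

class Clazz
  def initialize(collaborator)
    @collaborator = collaborator
  end

  # Another method of Clazz, so it is "closely related" to meth.
  def sibling
    @collaborator
  end

  def meth(arg)
    allowed = [
      arg.other_meth,           # a method on an argument
      sibling,                  # another method of Clazz
      sibling.other_meth,       # a method on an object returned by a Clazz method
      @collaborator.other_meth  # a method on an instance variable
    ]

    # Not allowed under this reading: a method on whatever arg.other_meth returned.
    violation = arg.other_meth.yet_another_meth

    [allowed.length, violation]
  end
end

Clazz.new(Arg.new).meth(Arg.new)  # => [4, "inner value"]
```

Note that the "violation" runs just fine. The Law is a claim about design, not something Ruby enforces.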

It’s also worth pointing out that this “Law” was not developed for the sake of object-oriented programming, but for help defining aspect-oriented programming, which you only tend to hear about in Java-land, and even then, not all that much.

That all said, this advice seems reasonable, but it does not really allow for nuance. Yes, we want to reduce coupling, but doing so has a cost (this is discussed at length in the book). In particular, it might be preferable for our code’s coupling to match that of the domain.

It also might be OK to be overly coupled to our language’s standard library or to the framework components of whatever framework we are using, since that coupling mimics the decision to be coupled to a language or framework.

Code Coupling can Mirror Domain Coupling

Consider this object model, where a person has an address, which has a country, which has a code.

Class diagram of the object model.
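If it helps to see the diagram as code, here is one way the model might be written down. Using Struct is my assumption for the sketch, as are the name and contractor fields and the sample values; the only things the examples rely on are address, country, code, postcode, and contractor?.

```ruby
# A sketch of the object model as plain data structures (Struct is an assumption).
Country = Struct.new(:code, keyword_init: true)
Address = Struct.new(:postcode, :country, keyword_init: true)
Person  = Struct.new(:name, :address, :contractor, keyword_init: true) do
  def contractor?
    !!contractor
  end
end

person = Person.new(
  name: "Pat Example",   # made-up sample data
  contractor: false,
  address: Address.new(
    postcode: "20001",
    country: Country.new(code: "US")
  )
)

person.address.country.code  # => "US"
```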

Suppose we have to write a method to figure out taxes based on the country code of a person. Our method, determine_tax_method, takes a Person as an argument. The basic logic is:

  • If a person is in the US and a contractor, we don’t do tax determination.
  • If they are in the US and not a contractor, we use the US-based tax determination, which requires a zipcode.
  • If they are in the UK, we use the UK-based determination, which requires a postcode.
  • Otherwise, we don’t do tax determination.

Here’s what that might look like:

class TaxDetermination
  def determine_tax_method(person)
    case person.address.country.code
    when "US"
      if person.contractor?
        NoTaxDetermination.new
      else
        USTaxDetermination.new(person.address.postcode)
      end
    when "UK"
      UKTaxDetermination.new(person.address.postcode)
    else
      NoTaxDetermination.new
    end
  end
end

If address, country, and code are all methods, according to the Law of Demeter, we have created a violation, because we are depending on the class of an object returned by a method called on an argument. In this case, the return value of person.address is an Address and thus not a “closely related unit”, so calling country on it (and then code on the resulting Country) is not allowed.

But is that really true?

Person has a well-defined type. It is defined as having an address, which is an Address, another well-defined type. That has a country, well-defined in the Country class, which has a code attribute that returns a string. These aren’t objects to which we are sending messages, at least not semantically. These are data structures we are navigating to access data from our domain model. The difference is meaningful!

Even still, it’s hard to quantify the problems with a piece of code. The best way to evaluate a technique is to compare code that uses it to code that does not. So, let’s change our code so it doesn’t violate the Law of Demeter.

A common way to do this is to provide proxy methods on an allowed class to do the navigation for us:

class TaxDetermination
  def determine_tax_method(person)
    case person.country_code
    #           ^^^^^^^^^^^^           
    when "US"
      if person.contractor?
        NoTaxDetermination.new
      else
        USTaxDetermination.new(person.postcode)
        #                             ^^^^^^^^
      end
    when "UK"
      UKTaxDetermination.new(person.postcode)
      #                             ^^^^^^^^
    else
      NoTaxDetermination.new
    end
  end
end

How do we implement country_code and postcode?

class Person
  def country_code
    self.address.country.code
  end

  def postcode
    self.address.postcode
  end
end

Of course, country_code now contains a Demeter Violation, because it calls a method on the return type of a closely related unit. Remember, self.address is allowed, and calling methods on self.address is allowed, but that’s it. Calling code on country is the violation. So…another proxy method.

class Person
  def country_code
    self.address.country_code
    #            ^^^^^^^^^^^^
  end
end


class Address
  def country_code
    self.country.code
  end
end

And now we comply with the Law of Demeter, but what have we actually accomplished? All of the methods we’ve been dealing with are really just attributes returning unfettered access to public members of a data structure.

We’ve added three new public API methods to two classes, all of which require tests, which means we’ve incurred both an opportunity cost in making them and a carrying cost in their continued existence.

We also now have two ways to get a person’s country code, two ways to get their postcode, and two ways to get the country code of an address. It’s hard to see this as a benefit.
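You can shave off some of the typing with Forwardable from Ruby’s standard library, but the duplication stays. In this sketch the constructors and the sample values are my own assumptions, added only so the example runs by itself.

```ruby
require "forwardable"

class Country
  attr_reader :code

  def initialize(code)
    @code = code
  end
end

class Address
  extend Forwardable

  attr_reader :country, :postcode

  # Generates Address#country_code, which calls @country.code
  def_delegator :@country, :code, :country_code

  def initialize(country, postcode)
    @country = country
    @postcode = postcode
  end
end

class Person
  extend Forwardable

  attr_reader :address

  # Generates Person#country_code and Person#postcode, forwarding to @address
  def_delegators :@address, :country_code, :postcode

  def initialize(address)
    @address = address
  end
end

person = Person.new(Address.new(Country.new("UK"), "SW1A 1AA"))

# Both routes to each value still exist side by side.
person.country_code == person.address.country.code  # => true
person.postcode == person.address.postcode          # => true
```

The proxy methods are shorter to write this way, but they are still three extra public methods that have to be tested and maintained.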

For classes that are really just data structures, especially when they are core domain concepts that drive the reason for our app’s existence, applying the Law of Demeter does more harm than good. And when you consider that most developers who apply it don’t read the backstory and simply count dots in lines of code, you end up with myopically overcomplex code with little demonstrable benefit.

But let’s take this one step further, shall we?

Violating Demeter by Depending on the Standard Library

Suppose we want to send physical mail to a person, but our carrier is a horrible legacy US-centric one that requires being given a first name and last name. We only collected full name, so we fake it out by looking for a space in the name. Anyone with no spaces in their names is handled manually by queuing their record to a customer service person via handle_manually.

class MailSending
  def send_mailer(person)
    fake_first_last = /^(?<first>\S+)\s(?<last>.*)$/

    match_data = fake_first_last.match(person.name)

    if match_data
      legacy_carrier(match_data[:first], match_data[:last])
    else
      handle_manually(person)
    end
  end
end

This has a Demeter violation. A Regexp (created by the /../ literal) returns a MatchData if there is a match. We can’t call methods on an object returned by one of our closely related units’ methods. We can call match on a Regexp, but we can’t call a method on what that returns. In this case, we’re calling [] on the returned MatchData. How do we eliminate this egregious problem?
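To make the two objects involved concrete, here is what the regexp from send_mailer does with a couple of names. The names are sample data I picked for the sketch.

```ruby
fake_first_last = /^(?<first>\S+)\s(?<last>.*)$/

# A name containing a space yields a MatchData with named captures.
with_space = fake_first_last.match("Mary Ann Smith")
with_space[:first]  # => "Mary"
with_space[:last]   # => "Ann Smith"

# A name without a space does not match at all.
fake_first_last.match("Cher")  # => nil
```

Everything after the first space lands in last, which is about as good as this kind of guess gets.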

We can’t make proxy methods for first name and last name in Person, because that method will have the same problem as this one (it also would put use-case specific methods on a core class, but that’s another problem). We really do need to both match a regexp and examine its results. But the Law does not allow for such subtlety! We could create a proxy class for this parsing.

class LegacyFirstLastParser
  FAKE_FIRST_LAST = /^(?<first>\S+)\s(?<last>.*)$/
  def initialize(name)
    @match_data = name.match(FAKE_FIRST_LAST)
  end

  def can_guess_first_last?
    !@match_data.nil?
  end

  def first
    @match_data[:first]
  end

  def last
    @match_data[:last]
  end
end

Now, we can use this class:

class MailSending
  def send_mailer(person)
    parser = LegacyFirstLastParser.new(person.name)
    if parser.can_guess_first_last?
      legacy_carrier(parser.first, parser.last)
    else
      handle_manually(person)
    end
  end
end

Hmm. LegacyFirstLastParser was just plucked out of the ether. It definitely is not a closely-related unit based on our definition. We’ll need to create that via some sort of private method:

class MailSending
  def send_mailer(person)
    parser = legacy_first_last_parser(person.name)
    #        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    if parser.can_guess_first_last?
      legacy_carrier(parser.first, parser.last)
    else
      handle_manually(person)
    end
  end

private

  def legacy_first_last_parser(name)
    LegacyFirstLastParser.new(name)
  end
end

Of course, legacy_first_last_parser has the same problem as send_mailer, in that it pulls LegacyFirstLastParser out of thin air. This means that MailSending has to be given the class, so let’s invert those dependencies:

class MailSending
  def initialize(legacy_first_last_parser_class)
    @legacy_first_last_parser_class = legacy_first_last_parser_class
  end

  def send_mailer(person)
    parser = legacy_first_last_parser(person.name)
    if parser.can_guess_first_last?
      legacy_carrier(parser.first, parser.last)
    else
      handle_manually(person)
    end
  end

private

  def legacy_first_last_parser(name)
    @legacy_first_last_parser_class.new(name)
  end
end

This change now requires changing every single use of the MailSending class to pass in the LegacyFirstLastParser class. Sigh.
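Here is roughly what a call site looks like afterward. The bodies of legacy_carrier and handle_manually are stand-ins I invented so the sketch runs, and Person is reduced to a one-field Struct for the same reason.

```ruby
class LegacyFirstLastParser
  FAKE_FIRST_LAST = /^(?<first>\S+)\s(?<last>.*)$/

  def initialize(name)
    @match_data = name.match(FAKE_FIRST_LAST)
  end

  def can_guess_first_last?
    !@match_data.nil?
  end

  def first
    @match_data[:first]
  end

  def last
    @match_data[:last]
  end
end

class MailSending
  def initialize(legacy_first_last_parser_class)
    @legacy_first_last_parser_class = legacy_first_last_parser_class
  end

  def send_mailer(person)
    parser = legacy_first_last_parser(person.name)
    if parser.can_guess_first_last?
      legacy_carrier(parser.first, parser.last)
    else
      handle_manually(person)
    end
  end

  private

  def legacy_first_last_parser(name)
    @legacy_first_last_parser_class.new(name)
  end

  # Stand-in (assumption): the real carrier integration is not shown.
  def legacy_carrier(first, last)
    [:mailed, first, last]
  end

  # Stand-in (assumption): the real customer service queue is not shown.
  def handle_manually(person)
    [:manual, person.name]
  end
end

Person = Struct.new(:name)  # simplified for this sketch

# Before: MailSending.new
# After: every caller has to know about, and supply, the parser class.
mail_sending = MailSending.new(LegacyFirstLastParser)

mail_sending.send_mailer(Person.new("Ada Lovelace"))  # => [:mailed, "Ada", "Lovelace"]
mail_sending.send_mailer(Person.new("Cher"))          # => [:manual, "Cher"]
```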

Is this all better code? Should we have not done this because Regexp and MatchData are in the standard library? The Law certainly doesn’t make that clear.

Just as with all the various SOLID Principles, we really should care about keeping the coupling of our classes low and the cohesion high, but no Law is going to guide us to the right decision, because it lacks subtlety and nuance. It also doesn’t provide much help once we have a working understanding of coupling and cohesion. When a team aligns on what those mean, code can be discussed directly. You don’t need a Law to help have that discussion and, in fact, talking about it is a distraction.

Suppose we kept our discussion of send_mailer to just coupling. It’s pretty clear that coupling to the language’s standard library is not a real problem. We’ve chosen Ruby, and switching programming languages would be a total rewrite, so coupling to Ruby’s standard library is fine and good.

Consider discussing coupling around determine_tax_method. We might have decided that since people, addresses, and countries are central concepts in our app, code that’s coupled to them and their interrelationship is generally OK. If these concepts are stable, coupling to them doesn’t have a huge downside. And the domain should be stable.

Damn the Law.