As mentioned in the original post, I’m realizing that the SOLID principles are not as…solid as it would seem. The first post outlined the problems I see with the Single Responsibility Principle, and in the second, I recommended ignoring the Open/Closed Principle, since it is confusing and most reasonable interpretations give bad advice. Now, let’s talk about the Liskov Substitution Principle, which, as it turns out, is not design advice at all.
This principle states that “Objects in a program should be replaceable with instances of their subtypes without altering the correctness of the program”. To understand this, we need to know what “correctness of the program” means.
To figure that out, it’s useful to trace where this principle came from, and, as it happens, it was neither developed nor coined by Barbara Liskov, for whom the principle is named.
Liskov and Jeannette Wing did author a paper that attempts to define subtypes in a way that relates to program correctness. In it, they state that if we use an object y in place of an object x, but y does not have all the same properties as x, then y is not a subtype of x.
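The requirement from that paper (Liskov and Wing, “A Behavioral Notion of Subtyping”, 1994) is often quoted; rendered roughly in their notation, it reads:

```latex
% Liskov & Wing's subtype requirement (my rendering of their wording):
\text{Let } \phi(x) \text{ be a property provable about objects } x \text{ of type } T.\\
\text{Then } \phi(y) \text{ should be true for objects } y \text{ of type } S,\\
\text{where } S \text{ is a subtype of } T.
```

Note that this is a statement about provable properties, not a piece of design advice, which foreshadows the problem discussed below.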
Robert Martin’s paper on the principle doesn’t make a very strong case for what problem the principle is trying to solve. It presents some convoluted examples to justify the principle’s existence, but it gives no direction on how to understand or apply it.
I’m tempted to write it off as just confusing and vague, but I’m very bothered by the insistence on the use of “correctness”.
What, exactly, is “program correctness”?
Wikipedia defines program correctness as:
[An algorithm is considered correct] with respect to a specification [when it behaves as specified]. Functional correctness refers to the input-output behavior of the algorithm (i.e., for each input it produces the expected output).
This definition seems reasonable; however, we are yet again faced with the requirement of having a specification. Not only is a specification rarely present in the development of most software, but agile software development (ironically developed and championed by Martin) often eschews having one anyway, preferring to iterate on the software with user feedback.
So I’m left struggling with how I’m supposed to evaluate my design based on correctness, which requires a specification, which I don’t have.
But even a you’ll-know-it-when-you-see-it definition of correctness still leads us down a strange path.
Suppose we wish to sort the contents of a bunch of files. Say we have a directory of files, and we wish to produce a single file with all their lines sorted. We want to defer the details of how the sorting is done to a passed-in object, so our central routine might look like so:
def sort_files(files_dir, destination_file, sorter)
  # Gather the full paths of every file in the directory
  files_in_dir = Dir.children(files_dir).map { |f| File.join(files_dir, f) }
  sorter.sort_contents(files_in_dir, destination_file)
end
The caller of sort_files can provide any implementation for sorter. And, as long as those implementations do not change the correctness of the program, we will consider our design good, because it does not violate the Liskov Substitution Principle.
Consider two possible sorting algorithms. The first, which we’ll call MemQuicksort, reads all the files’ lines into memory and does a quicksort on them. It then writes the sorted results to destination_file. This seems to satisfy the program’s requirements.
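To make the comparison concrete, here is a minimal sketch of what such a sorter could look like. The class name comes from the discussion above; the implementation details are my assumption, and Ruby’s built-in sort is not literally a quicksort, but the salient property holds:

```ruby
# Hypothetical sketch of the in-memory strategy: every line from every
# file is read into a single array, sorted, and written out in one pass.
# The entire data set must fit in memory at once.
class MemQuicksort
  def sort_contents(files, destination_file)
    lines = files.flat_map { |path| File.readlines(path, chomp: true) }
    File.write(destination_file, lines.sort.join("\n") + "\n")
  end
end
```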
Now suppose that we have another implementation called FileMergeSort, which uses a merge sort to sort the files on disk, avoiding reading every single line into memory. It requires more disk space, but not as much memory. This, too, would seem to satisfy the program’s requirements. Both implementations, given the same input, produce the same output.
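Here is one way FileMergeSort might be sketched (again my assumption, not a definitive implementation): sort each input file individually into a temporary “run” on disk, then merge the runs line by line, so the full data set never needs to be in memory at once.

```ruby
require "tempfile"

# Hypothetical sketch of the disk-based strategy: each input file is
# sorted on its own and written to a temporary file, then the sorted
# runs are merged line by line. Only one file's lines are ever held in
# memory at once, and the merge itself streams from disk.
class FileMergeSort
  def sort_contents(files, destination_file)
    runs = files.map do |path|
      run = Tempfile.new("sorted_run")
      lines = File.readlines(path, chomp: true).sort
      run.write(lines.map { |l| l + "\n" }.join)
      run.rewind
      run
    end
    merge_runs(runs, destination_file)
  ensure
    runs&.each(&:close!)
  end

  private

  # K-way merge: repeatedly emit the smallest line currently at the head
  # of any run, refilling that run's head from disk as we go.
  def merge_runs(runs, destination_file)
    File.open(destination_file, "w") do |out|
      heads = runs.map { |r| r.gets&.chomp }
      until heads.all?(&:nil?)
        i = heads.each_index.reject { |j| heads[j].nil? }.min_by { |j| heads[j] }
        out.puts(heads[i])
        heads[i] = runs[i].gets&.chomp
      end
    end
  end
end
```

Both classes respond to the same sort_contents(files, destination_file) message and, given the same input files, write identical bytes to the destination; only their resource profiles differ.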
Or do they?
These two implementations fundamentally change how the software will behave, and isn’t that considered an “output”? Depending on circumstances outside the control of the source code (namely the amount of disk space, the amount of memory, and the size of the files), the program might not work at all. Or it might work more slowly than we’d like. Or it might cost too much to run because of the memory required.
You see, there are more inputs to our program than just the directory where the files are, the destination file, and the sort algorithm to use. There are some implicit inputs, such as the computer on which the program will run, the memory it’s been allocated, the size of the disk, etc.
This means that our definition of correctness likely has to account for all the inputs and all the outputs, including the program’s actual behavior. Right? And if so, how could any subtype not affect some of these in some way? The whole reason we create subtypes is to change behavior.
This tells me that all subtypes violate the principle, depending on the definition of correctness that we’re using. And much like discussing the Single Responsibility Principle usually involves debating what a “responsibility” is instead of the code in question, I can’t help but think that the Liskov Substitution Principle devolves into a debate about “correctness” rather than talking about the code.
My takeaway here is that focusing on subtypes is not the right lens through which our designs should be analyzed. It doesn’t provide any clarity about how to improve our designs. It’s hard to even see this principle as design advice on any level.
My advice: This is not design guidance, ignore it, stop talking about subtypes, and focus on building software to solve the problems you have.
Next up, the Interface Segregation Principle, another prescription for making flexible code when it’s not called for.