Connascence of Algorithm

Connascence of algorithm is the next level of connascence that we’re going to be looking at. We’re still in the static category of connascence, at level 4 (or 5, there seems to be some disagreement on the order of connascence of algorithm and connascence of position).

Connascence of algorithm is similar to connascence of meaning, though it tends to be broader than agreeing on the meaning of one variable or value. Connascence of algorithm refers to when multiple components must agree on an algorithm used to perform a particular task. The obvious candidates for this are encoding, encrypting and decrypting, and hashing. There are also more subtle cases of connascence of algorithm, such as data validation.

Consider the following:
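A minimal sketch of what these classes might look like. The names Dinosaur, DinosaurTransmitter and DinosaurReceiver come from the text; the DinosaurMessage wrapper and the heavily simplified FASTQ conversion (fixed identifier, placeholder quality line) are assumptions made purely for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;

class Dinosaur {
    private final String dna; // plain format, e.g. "GATTACA"

    Dinosaur(String plainDna) {
        this.dna = plainDna;
    }

    String getDna() {
        return dna;
    }
}

// Hypothetical wire message: Base64 payload plus its checksum.
class DinosaurMessage {
    final byte[] payload;
    final byte[] checksum;

    DinosaurMessage(byte[] payload, byte[] checksum) {
        this.payload = payload;
        this.checksum = checksum;
    }
}

class DinosaurTransmitter {
    DinosaurMessage transmit(Dinosaur dinosaur) {
        String dna = dinosaur.getDna();
        // Plain -> FASTQ (simplified: fixed id, placeholder quality scores).
        char[] quality = new char[dna.length()];
        Arrays.fill(quality, 'I');
        String fastq = "@dinosaur\n" + dna + "\n+\n" + new String(quality);
        // FASTQ text -> UTF-8 bytes -> Base64.
        byte[] payload = Base64.getEncoder().encode(fastq.getBytes(StandardCharsets.UTF_8));
        return new DinosaurMessage(payload, md5(payload));
    }

    private static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}

class DinosaurReceiver {
    Dinosaur receive(DinosaurMessage message) {
        // Must use the same hash algorithm as the transmitter.
        if (!Arrays.equals(md5(message.payload), message.checksum)) {
            throw new IllegalArgumentException("Checksum mismatch");
        }
        // Base64 -> UTF-8 text, mirroring the transmitter in reverse.
        String fastq = new String(Base64.getDecoder().decode(message.payload), StandardCharsets.UTF_8);
        // FASTQ -> plain: the sequence is the second line.
        return new Dinosaur(fastq.split("\n")[1]);
    }

    private static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Every step in transmit has a mirror image in receive, and nothing but convention keeps the two in sync.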

Wow. So much connascence of algorithm! Anyone would think it was an arbitrary example designed to prove a point…

Anyway, let’s take a closer look at the classes we have here. We have a Dinosaur that is created from DNA that has to be in a certain format. The dinosaur itself probably doesn’t care that much, but it makes it interesting for the other classes that rely on Dinosaur.

The DinosaurTransmitter and DinosaurReceiver are where the bulk of the connascence of algorithm lies. Our transmitter gets the plain DNA of the dinosaur and transforms the format into FASTQ for easier data transmission. We then encode the DNA into bytes with a Base64 encoder, using the bytes of the DNA in UTF-8 format. To ensure correct delivery of the entire payload, we’re also passing along an MD5 hash as a checksum.

And here lies our problem: the DinosaurReceiver is completely separate from the DinosaurTransmitter (maybe even a different application completely), yet it has to know about each and every one of those algorithms, both individually and as the overall “transmission” algorithm made up of all of those steps. We have to agree on using MD5 as the hashing algorithm for the checksum, and we have to agree that the DNA is being sent Base64 encoded. We have to agree that the bytes being sent are UTF-8 encoded, that we’re using the FASTQ format to transmit the data, and that the Dinosaur has to be created with DNA in plain format. If we change any of these encodings or hashing algorithms in one place, we have to change them in the other.

Another more subtle variation on connascence of algorithm is in data validation- where multiple software components must agree on the algorithm to use to ensure valid data. Let’s see what this might look like in action.
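As a rough sketch, assuming two hypothetical layers (a DinosaurNameController and a NamedDinosaur domain class) and an invented naming rule of 1 to 30 characters, starting with a letter and otherwise containing only letters, spaces and hyphens. Both layers are kept in Java here so the example runs as one unit; the same duplication can just as easily span languages.

```java
import java.util.regex.Pattern;

// Hypothetical web layer: rejects bad names before they reach the domain.
class DinosaurNameController {
    private static final Pattern VALID_NAME = Pattern.compile("[A-Za-z][A-Za-z -]{0,29}");

    boolean accepts(String name) {
        return name != null && VALID_NAME.matcher(name).matches();
    }
}

// Hypothetical domain layer: enforces the same rule, written a different way.
class NamedDinosaur {
    private final String name;

    NamedDinosaur(String name) {
        if (name == null || name.isEmpty() || name.length() > 30) {
            throw new IllegalArgumentException("Name must be 1 to 30 characters");
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean letter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
            // Spaces and hyphens are allowed anywhere except the first character.
            boolean separator = i > 0 && (c == ' ' || c == '-');
            if (!letter && !separator) {
                throw new IllegalArgumentException("Invalid character at index " + i);
            }
        }
        this.name = name;
    }

    String getName() {
        return name;
    }
}
```

The regular expression and the hand-written loop encode the same rule, so loosening one without the other would let the layers disagree about what a valid name is.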

We’ve put safeguards in each of our application layers here, to make sure the name we’re assigning our dinosaur is valid. However, this has introduced connascence of algorithm: if the validation algorithm ever changes in one place, it has to change in all the others in order to remain correct and consistent across all layers.

Fortunately for us, connascence of algorithm can also be refactored into connascence of name in most cases. Let’s take our first example of transmitting a dinosaur. Consider the following changes:
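One way this refactoring might look, as a sketch only. DinosaurTransformer, DnaTransformer, Charset and MessageDigest are named in the text; the DinosaurMessage wrapper, the method names on DnaTransformer and the simplified FASTQ handling are assumptions.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;

class Dinosaur {
    private final String dna; // plain format

    Dinosaur(String plainDna) { this.dna = plainDna; }

    String getDna() { return dna; }
}

// Hypothetical wire message: Base64 payload plus its checksum.
class DinosaurMessage {
    final byte[] payload;
    final byte[] checksum;

    DinosaurMessage(byte[] payload, byte[] checksum) {
        this.payload = payload;
        this.checksum = checksum;
    }
}

// Simplified stand-in: fixed id and placeholder quality line.
class DnaTransformer {
    String plainToFastq(String plainDna) {
        char[] quality = new char[plainDna.length()];
        Arrays.fill(quality, 'I');
        return "@dinosaur\n" + plainDna + "\n+\n" + new String(quality);
    }

    String fastqToPlain(String fastq) {
        return fastq.split("\n")[1]; // the sequence is the second line
    }
}

class DinosaurTransformer {
    // Both directions refer to the same named charset, hash and DNA transformer.
    private static final Charset CHARSET = StandardCharsets.UTF_8;
    private static final String HASH_ALGORITHM = "MD5";
    private final DnaTransformer dnaTransformer = new DnaTransformer();

    DinosaurMessage encode(Dinosaur dinosaur) {
        String fastq = dnaTransformer.plainToFastq(dinosaur.getDna());
        byte[] payload = Base64.getEncoder().encode(fastq.getBytes(CHARSET));
        return new DinosaurMessage(payload, checksum(payload));
    }

    Dinosaur decode(DinosaurMessage message) {
        if (!Arrays.equals(checksum(message.payload), message.checksum)) {
            throw new IllegalArgumentException("Checksum mismatch");
        }
        String fastq = new String(Base64.getDecoder().decode(message.payload), CHARSET);
        return new Dinosaur(dnaTransformer.fastqToPlain(fastq));
    }

    private static byte[] checksum(byte[] data) {
        try {
            return MessageDigest.getInstance(HASH_ALGORITHM).digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}

class DinosaurTransmitter {
    private final DinosaurTransformer transformer = new DinosaurTransformer();

    DinosaurMessage transmit(Dinosaur dinosaur) {
        return transformer.encode(dinosaur);
    }
}

class DinosaurReceiver {
    private final DinosaurTransformer transformer = new DinosaurTransformer();

    Dinosaur receive(DinosaurMessage message) {
        return transformer.decode(message);
    }
}
```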

We’ve extracted the majority of the connascence of algorithm into a DinosaurTransformer class that the DinosaurTransmitter and DinosaurReceiver can refer to by name. The transformer still contains connascence of algorithm within itself, since the decode method must do the opposite of the encode method, but we’ve tried to extract as much of the algorithmic similarity as possible into named concepts: both methods refer to exactly the same Charset, MessageDigest and DnaTransformer explicitly, reducing the connascence to connascence of name. The transmitter and receiver are now connascent with the DinosaurTransformer by name, but the connection they had with each other through using the same algorithms in the same order is now broken. One must encode, the other must decode, and neither cares how it happens.

In our second example the connascence is more difficult to reduce: these are algorithms written in different languages for the same purpose, with no easy way of unifying them. And that’s probably ok. We eliminate as much connascence as we can given the constraints that we have, and remain conscious of the connascence that is left behind.


Connascence of Meaning

Connascence of meaning is the first type of connascence that we might want to do something about, rather than just being a natural part of programming itself. Connascence of meaning refers to when multiple components must agree on the meaning of values being used. Connascence of meaning is also called connascence of convention, as it can be better described as multiple components sharing a convention for the meaning of particular values. We’re going to use both of these terms throughout.

Consider the following classes:
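A minimal sketch of two such classes, assuming a Dinosaur that stores its diet as an int and a hypothetical FoodFactory that has to interpret it. The food strings are invented for illustration.

```java
class Dinosaur {
    private final String name;
    private final int diet; // 1 = carnivore, 2 = herbivore (by convention only)

    Dinosaur(String name, int diet) {
        this.name = name;
        this.diet = diet;
    }

    String getName() {
        return name;
    }

    int getDiet() {
        return diet;
    }
}

// Hypothetical consumer that must share the Dinosaur's diet convention.
class FoodFactory {
    String foodFor(Dinosaur dinosaur) {
        if (dinosaur.getDiet() == 1) {
            return "meat";
        }
        if (dinosaur.getDiet() == 2) {
            return "plants";
        }
        throw new IllegalArgumentException("Unknown diet code: " + dinosaur.getDiet());
    }
}
```

Nothing stops a caller from writing new Dinosaur("Bessie", 7), and the compiler cannot say whether 1 was meant as carnivore or herbivore.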

These classes are coupled together by connascence of meaning- they both have to agree on the convention that using the number 1 for a dinosaur’s diet means carnivore, and 2 means herbivore. Presumably when they get some omnivorous dinosaurs, they’re going to have to agree that 3 means omnivore too.

A more common and insidious example of connascence of meaning in Java is agreeing on the meaning of null values.

Consider the following classes:
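A sketch of how these three classes might be written, under the assumptions that timeLaid is seconds since the epoch, that the lab exposes a single hatch method, and that the registry has a register method. Only the class names, get(), timeLaid and the 10 minute rule come from the text.

```java
import java.util.HashMap;
import java.util.Map;

class Dinosaur {
    private final String name;

    Dinosaur(String name) { this.name = name; }

    String getName() { return name; }
}

class DinosaurEgg {
    private final long timeLaid; // seconds since the epoch, by convention

    DinosaurEgg(long timeLaid) {
        this.timeLaid = timeLaid;
    }

    Dinosaur hatch(String name) {
        if (!isReadyToHatch()) {
            return null; // null means "not ready to hatch yet"
        }
        return new Dinosaur(name);
    }

    private boolean isReadyToHatch() {
        long now = System.currentTimeMillis() / 1000;
        return now - timeLaid > 600; // 600 seconds = 10 minutes
    }
}

class DinosaurRegistry {
    private final Map<String, Dinosaur> dinosaurs = new HashMap<>();

    void register(Dinosaur dinosaur) {
        dinosaurs.put(dinosaur.getName(), dinosaur);
    }

    Dinosaur get(String name) {
        return dinosaurs.get(name); // null means "dinosaur not found"
    }
}

class DinosaurLab {
    private final DinosaurRegistry registry;

    DinosaurLab(DinosaurRegistry registry) {
        this.registry = registry;
    }

    Dinosaur hatch(DinosaurEgg egg, String name) {
        if (registry.get(name) != null) {
            return null; // null because the name is already registered
        }
        Dinosaur dinosaur = egg.hatch(name);
        if (dinosaur == null) {
            return null; // null because the egg is not ready
        }
        registry.register(dinosaur);
        return dinosaur;
    }
}
```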

We’ve got a whole bunch of connascence of meaning over here around the meaning of null.

DinosaurEgg has some connascence over the meaning of timeLaid- all callers must agree with the convention of time laid referring to seconds. We also see our first example of the infamous “meaning of null”, where in this case null refers to the fact that the egg could not be hatched, because it wasn’t laid more than 10 minutes ago.

Our DinosaurRegistry uses a null return value when calling get() to indicate that the specified Dinosaur could not be found. Any callers must agree on the convention that null in this case means “dinosaur not found”.

DinosaurLab is the first place we experience this connascence of meaning across different components, as it has to interact with both DinosaurEgg and DinosaurRegistry, and as such has to agree on the convention of null. The DinosaurLab agrees on the meaning of null for the DinosaurRegistry, and does a null check on get() to make sure the specified name is not already registered for a Dinosaur (We don’t want a T-Rex and a Triceratops to both be called Bessie- what if you get confused when someone asks you to go give Bessie her medicine and wander off to the T-Rex enclosure? Not good things). The lab also has to agree on the convention of an egg returning null if it can’t be hatched, and checks whether the dinosaur returned from the egg is null or not.

You’ll notice that the DinosaurLab returns null in two different places, and for two different reasons. In the registry and the egg, null had one meaning and one cause. Not brilliant, but at least the calling classes only had to agree on the convention of null meaning one thing. In the lab, the null convention is used to convey that an egg could not be hatched, both because the egg is not ready to be hatched, and because the name is already registered. When our caller gets a null back from the lab, what do they do? Null means an egg couldn’t be hatched, but we have no idea why. Do we wait before trying again because the egg isn’t ready, or do we change its name because there’s already a T-Rex called Princess?

Fortunately, there are things we can do to reduce connascence of meaning into connascence of type and connascence of name.

For our first example, we can easily remove the convention of particular numbers meaning particular diets by introducing our own Diet type, so we can reference the diet of the dinosaur by name.

Consider the following change:
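A sketch of the change, assuming the same shape of Dinosaur as before and a hypothetical FoodFactory as the consumer. The Diet enum is named in the text; the food strings are invented.

```java
import java.util.Objects;

enum Diet {
    CARNIVORE,
    HERBIVORE
}

class Dinosaur {
    private final String name;
    private final Diet diet;

    Dinosaur(String name, Diet diet) {
        this.name = name;
        this.diet = Objects.requireNonNull(diet, "diet");
    }

    String getName() {
        return name;
    }

    Diet getDiet() {
        return diet;
    }
}

// Hypothetical consumer: it now depends on the names in Diet, not on magic numbers.
class FoodFactory {
    String foodFor(Dinosaur dinosaur) {
        switch (dinosaur.getDiet()) {
            case CARNIVORE:
                return "meat";
            case HERBIVORE:
                return "plants";
            default:
                throw new IllegalStateException("Unhandled diet: " + dinosaur.getDiet());
        }
    }
}
```

Adding omnivores later becomes a matter of adding Diet.OMNIVORE and handling it wherever the enum is switched on, rather than remembering what 3 means.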

We’ve introduced a new enum Diet to describe the diet of our Dinosaur, and no longer have to rely on knowing that 1 = carnivore and 2 = herbivore, we’ve got it defined right there in the name and type. This is much better than relying on the implicit convention we were using before, as it’s much harder to mess up.

In our second example, the easiest way to remove the different conventions and meanings of null is to just not use null any more. Instead of using null to represent a negative case, we’re going to throw exceptions. We’re also going to introduce some new methods to call- so we don’t have to use the new exceptions we’re throwing to control the flow of our program.

Consider the following changes:
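A sketch of how the reworked classes might look. EggNotReadyToHatchException, DinosaurNotFoundException, isReadyToHatch() and get() are named in the text; the use of unchecked exceptions, java.time.Instant for timeLaid, and the names INCUBATION_TIME, isNameAvailable, register and DinosaurNameTakenException are all assumptions.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

class EggNotReadyToHatchException extends RuntimeException {
    EggNotReadyToHatchException(String message) { super(message); }
}

class DinosaurNotFoundException extends RuntimeException {
    DinosaurNotFoundException(String message) { super(message); }
}

// Hypothetical name for the lab's "name already registered" failure.
class DinosaurNameTakenException extends RuntimeException {
    DinosaurNameTakenException(String message) { super(message); }
}

class Dinosaur {
    private final String name;

    Dinosaur(String name) { this.name = name; }

    String getName() { return name; }
}

class DinosaurEgg {
    private static final Duration INCUBATION_TIME = Duration.ofMinutes(10);
    private final Instant timeLaid;

    DinosaurEgg(Instant timeLaid) {
        this.timeLaid = timeLaid;
    }

    public boolean isReadyToHatch() {
        Duration age = Duration.between(timeLaid, Instant.now());
        return age.compareTo(INCUBATION_TIME) > 0;
    }

    Dinosaur hatch(String name) {
        if (!isReadyToHatch()) {
            throw new EggNotReadyToHatchException("Egg laid at " + timeLaid + " is not ready");
        }
        return new Dinosaur(name);
    }
}

class DinosaurRegistry {
    private final Map<String, Dinosaur> dinosaurs = new HashMap<>();

    boolean isNameAvailable(String name) {
        return !dinosaurs.containsKey(name);
    }

    void register(Dinosaur dinosaur) {
        dinosaurs.put(dinosaur.getName(), dinosaur);
    }

    Dinosaur get(String name) {
        Dinosaur dinosaur = dinosaurs.get(name);
        if (dinosaur == null) {
            throw new DinosaurNotFoundException("No dinosaur called " + name);
        }
        return dinosaur;
    }
}

class DinosaurLab {
    private final DinosaurRegistry registry;

    DinosaurLab(DinosaurRegistry registry) {
        this.registry = registry;
    }

    Dinosaur hatch(DinosaurEgg egg, String name) {
        if (!registry.isNameAvailable(name)) {
            throw new DinosaurNameTakenException("Name already registered: " + name);
        }
        Dinosaur dinosaur = egg.hatch(name); // may throw EggNotReadyToHatchException
        registry.register(dinosaur);
        return dinosaur;
    }
}
```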

Our DinosaurEgg now throws an EggNotReadyToHatchException if the egg is not ready to hatch, instead of returning null. Now instead of callers relying on the meaning of null being “not ready to hatch”, they can rely on the name of the exception that is thrown, which reduces our connascence of meaning over the convention of null down to connascence of name (the name of the exception that is thrown). We’ve also made the isReadyToHatch() method public, so that callers don’t have to rely on try-catching an exception to control program flow. We got rid of the ambiguity around timeLaid by changing it to an explicit date. We also got rid of the implicit meaning of “600” being 10 minutes, by making it a named constant so we can more easily understand the meaning of the value.

DinosaurRegistry throws a DinosaurNotFoundException when get() is called for a name it doesn’t know, as it shouldn’t be the usual case that calling get() doesn’t return a dinosaur. Now people won’t have to rely on knowing the meaning of a returned null; they can instead rely on the name of the exception that is thrown. We’ve put another method in place to check whether a name is available, again so we don’t have to rely on try-catch to control our program flow. We probably shouldn’t have been using get() for checking names in the first place, really.

Our DinosaurLab can now use our two modified classes, safe in the knowledge that the only things it now needs to care about are the names of the new methods to call, or alternatively the names of the new exceptions thrown from the existing methods. We’ve also removed the conflated meaning of null being returned out of the lab, by introducing new exceptions for each individual reason for failure.

Connascence of meaning is the lowest level of connascence that we can realistically do something about- and even though it is such a low level, it doesn’t mean we should just let it be. The examples we used today were small, but if left to grow unchecked could grow to tangle up the whole codebase in collective understanding of the meaning of certain values- the problem becomes much worse when it’s a DinosaurHandler, ShippingContainer, FoodFactory, Dinosaur, DinosaurLab, DinosaurEnclosure, etc. that all rely on the convention of 1 meaning herbivore and 2 meaning carnivore… or was it the other way around?