- Scala Design Patterns
- Ivan Nikolov
- 1635 words
- 2021-07-16 12:57:27
Linearization
As we already saw, traits offer a form of multiple inheritance. In such cases, the hierarchy is not necessarily linear, but forms an acyclic graph that needs to be flattened upon compilation. What linearization does is this: it specifies a single linear order for all of the ancestors of a class, including both the regular superclass chain and the parent chains of all of the traits.
We will not have to deal with linearization in traits that contain no code. However, if we use mixins, we will have to consider it. The following will be affected by linearization:
- Method definitions
- Variables (both mutable, var, and immutable, val)
We already saw a simple example of linearization previously. Things, however, can get much more complicated and unexpected if the rules of linearization are not clear.
Rules of inheritance hierarchies
Before looking into linearization rules, we need to be clear on some inheritance rules in Scala:
- In Java, even if a class does not explicitly extend another one, its superclass will be java.lang.Object. The same holds for Scala, where the equivalent base is AnyRef.
- There is a similarity between directly extending a trait and extending the trait's superclass and mixing the trait in using the with keyword.
Note
In older Scala versions, there was another type called ScalaObject that was implicitly added to all traits and classes.
Using those rules, we can always get to a canonical form for all traits and classes, where the base class is specified using extends and then all traits are added using the with keyword.
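The equivalence between the two forms can be sketched as follows. The names Connector, Loggable, DirectService, and CanonicalService are hypothetical and used only for illustration; the check relies on the JVM superclass reported by getSuperclass.

```scala
// Hypothetical names, used only to illustrate the canonical form.
class Connector {
  def connect(): String = "connected"
}

// A trait that extends a class: its superclass is Connector.
trait Loggable extends Connector {
  def log(msg: String): String = s"[log] $msg"
}

// Form 1: extend the trait directly; the trait's superclass
// (Connector) is implicitly used as the superclass of the class.
class DirectService extends Loggable

// Form 2 (canonical): name the superclass with `extends`,
// then mix the trait in using `with`.
class CanonicalService extends Connector with Loggable

object CanonicalFormDemo {
  def main(args: Array[String]): Unit = {
    // Both classes end up with Connector as their superclass.
    println(classOf[DirectService].getSuperclass == classOf[Connector])
    println(classOf[CanonicalService].getSuperclass == classOf[Connector])
  }
}
```

Both declarations should behave the same way; the canonical form simply makes the superclass explicit.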
Linearization rules
Linearization rules in Scala are defined and exist in order to ensure well-defined behavior. The rules state the following:
- The linearization of any class must include the unmodified linearization of any class (but not trait) that it extends.
- The linearization of any class must include all the classes and mixin traits in the linearization of any trait it extends, but the mixin traits are not bound to appear in the same order as they appear in the linearization of the traits being mixed in.
- Each class or trait in the linearization can appear only once. Duplicates are ignored.
We already saw in some of the previous examples that it is not possible to mix in traits that have different base classes or to mix in a trait into a class when their base classes differ.
How linearization works
In Scala, linearizations are listed from left to right, where the right-most class is the most general, for example, AnyRef. While doing linearization, Any is also added to the hierarchy list. This, combined with the rule that any class must include the linearization of its superclass, means that the superclass linearization will appear as a suffix of the class linearization.
Let's see an example with some really simple classes:
class Animal extends AnyRef
class Dog extends Animal
The linearization of these two classes will be respectively:
Animal -> AnyRef -> Any
Dog -> Animal -> AnyRef -> Any
Let's now try and formalize an algorithm that describes how a linearization is calculated:
- Start with the following class declaration: class A extends B with T1 with T2
- Reverse the order of the list except the first item and drop the keywords. This way, the superclass will come as a suffix: A T2 T1 B.
- Each item gets replaced with its linearization: A T2L T1L BL.
- Concatenate the list elements using the right-associative concatenation operation: A +: T2L +: T1L +: BL.
- Append the standard AnyRef and Any classes: A +: T2L +: T1L +: BL +: AnyRef +: Any.
- Evaluate the preceding expression. Due to the right-associative concatenation, we start from the right and move to the left. In each step, we skip any element that has already appeared on the right-hand side. In our case, when we get to BL, we will not add the AnyRef and Any that it also contains; we will just add the remaining elements of BL and then go on. At T1L, we will again skip anything that was added before, and so on, until we reach A.
In the end, after the linearization finishes, we will have a list of classes and traits without duplicates.
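The steps above can be sketched in a few lines of Scala. This is a simplified model, not the compiler's implementation: Decl, linearize, and the declaration table are hypothetical names, a declaration is reduced to a name plus its parents in source order, and no well-formedness checks are performed.

```scala
object LinearizationSketch {
  // Hypothetical model: a name plus its parents in source order
  // (the `extends` parent first, then each `with` parent).
  final case class Decl(name: String, parents: List[String])

  def linearize(name: String, decls: Map[String, Decl]): List[String] =
    name match {
      case "Any"    => List("Any")
      case "AnyRef" => List("AnyRef", "Any")
      case _ =>
        val declared = decls(name).parents
        // A declaration without parents implicitly extends AnyRef.
        val parents = if (declared.isEmpty) List("AnyRef") else declared
        // Reverse the parents and replace each with its linearization.
        val parts = parents.reverse.map(p => linearize(p, decls))
        // Evaluate from the right, skipping elements already added.
        val tail = parts.foldRight(List.empty[String]) { (part, acc) =>
          part.filterNot(x => acc.contains(x)) ++ acc
        }
        name :: tail
    }

  // class A extends B with T1 with T2, where B, T1, T2 have no parents.
  val decls: Map[String, Decl] = List(
    Decl("B", Nil),
    Decl("T1", Nil),
    Decl("T2", Nil),
    Decl("A", List("B", "T1", "T2"))
  ).map(d => d.name -> d).toMap

  def main(args: Array[String]): Unit =
    println(linearize("A", decls).mkString(" -> "))
}
```

For the declaration used in the steps, this sketch yields A -> T2 -> T1 -> B -> AnyRef -> Any, with the superclass linearization as a suffix.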
Initialization
Now that we know what happens during linearization, we can understand how instances are created. The rule is that constructor code is executed in the reverse of the linearization order. This means that, going from right to left, first the Any and AnyRef constructors will be invoked and then the actual class constructor will be called. Also, the superclass constructor will be called before the actual class or any of its mixins because, as we have already mentioned previously, it is added as a suffix.
Having in mind that we traverse the linearization from right to left also means that after the superclass constructor is called, the mixin trait constructors will be called. Here, they will be called in the order in which they appear in the original class definition (because of the right to left direction and the fact that their order is reversed when the linearization is created).
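A minimal sketch of this ordering follows, assuming a hypothetical Base class and two hypothetical traits, First and Second. Each constructor body appends its name to a shared log so that the order can be inspected afterwards.

```scala
import scala.collection.mutable.ListBuffer

object InitOrderSketch {
  // Shared log that every constructor body appends to (illustrative only).
  val log: ListBuffer[String] = ListBuffer.empty[String]

  class Base { log += "Base" }
  trait First { log += "First" }
  trait Second { log += "Second" }

  // Linearization: Widget -> Second -> First -> Base -> AnyRef -> Any
  class Widget extends Base with First with Second { log += "Widget" }

  def main(args: Array[String]): Unit = {
    log.clear()
    new Widget
    // Constructors run in the reverse of the linearization order.
    println(log.mkString(" -> "))
  }
}
```

The log ends up as Base -> First -> Second -> Widget: the superclass first, then the traits in the order they are written in the class definition, and finally the class itself.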
Method overriding
When overriding a method in a subclass, you may want to call the original implementation as well. This is achieved by prefixing the method name with the super keyword. The developer can also qualify the super keyword with a trait type, thus calling the method in that specific trait. We already saw an example of this earlier in the chapter, where we called super[A].hello(). In that example, we had mixins with the same methods; however, the methods themselves did not refer to super, but just defined their own implementations.
Let's see an example here, where we actually refer to the super class when overriding a method:
class MultiplierIdentity {
  def identity: Int = 1
}
Let's now define two traits that respectively double and triple the identity in our original class:
trait DoubledMultiplierIdentity extends MultiplierIdentity {
  override def identity: Int = 2 * super.identity
}

trait TripledMultiplierIdentity extends MultiplierIdentity {
  override def identity: Int = 3 * super.identity
}
As we saw in some of the previous examples, the order in which we mix in the traits matters. We will provide three implementations, where we first mix in DoubledMultiplierIdentity and then TripledMultiplierIdentity. The first one will not override the identity method, which is equivalent to using the following super notation: super.identity. The other two will override the method and refer to a specific parent:
// first Doubled, then Tripled
class ModifiedIdentity1 extends DoubledMultiplierIdentity with TripledMultiplierIdentity

class ModifiedIdentity2 extends DoubledMultiplierIdentity with TripledMultiplierIdentity {
  override def identity: Int = super[DoubledMultiplierIdentity].identity
}

class ModifiedIdentity3 extends DoubledMultiplierIdentity with TripledMultiplierIdentity {
  override def identity: Int = super[TripledMultiplierIdentity].identity
}
Let's do the same thing as shown in the preceding code, but this time, we first mix in TripledMultiplierIdentity and then DoubledMultiplierIdentity. The implementations are similar to the preceding ones:
// first Tripled, then Doubled
class ModifiedIdentity4 extends TripledMultiplierIdentity with DoubledMultiplierIdentity

class ModifiedIdentity5 extends TripledMultiplierIdentity with DoubledMultiplierIdentity {
  override def identity: Int = super[DoubledMultiplierIdentity].identity
}

class ModifiedIdentity6 extends TripledMultiplierIdentity with DoubledMultiplierIdentity {
  override def identity: Int = super[TripledMultiplierIdentity].identity
}
Finally, let's use our classes:
object ModifiedIdentityUser {
  def main(args: Array[String]): Unit = {
    val instance1 = new ModifiedIdentity1
    val instance2 = new ModifiedIdentity2
    val instance3 = new ModifiedIdentity3
    val instance4 = new ModifiedIdentity4
    val instance5 = new ModifiedIdentity5
    val instance6 = new ModifiedIdentity6
    System.out.println(s"Result 1: ${instance1.identity}")
    System.out.println(s"Result 2: ${instance2.identity}")
    System.out.println(s"Result 3: ${instance3.identity}")
    System.out.println(s"Result 4: ${instance4.identity}")
    System.out.println(s"Result 5: ${instance5.identity}")
    System.out.println(s"Result 6: ${instance6.identity}")
  }
}
The example shows a multiple inheritance hierarchy, where we can see a diamond relationship exactly as in the previous figure in which we explained what it means. We have all the possibilities here in terms of the order of mixing DoubledMultiplierIdentity and TripledMultiplierIdentity, as well as how we call the identity base method.
So what would the output of this program be? One would expect that in the cases where we don't override the identity method, it would call the identity method of the right-most trait. Since in both cases the right-most trait calls the super method of the class it extends, the results should be 3 for ModifiedIdentity1 and 2 for ModifiedIdentity4. Let's see this here:
Result 1: 6
Result 2: 2
Result 3: 6
Result 4: 6
Result 5: 6
Result 6: 3
The preceding output is rather unexpected. This is, however, how the Scala type system works. With linearization under multiple inheritance, the calls to the same method are chained from right to left according to the order in which the traits appear in the class declaration. Note that if we did not use the super notation, we would have broken the chain, as can be seen in some of the preceding examples.
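To make the chaining visible, here is a small sketch in which each trait records its own name before delegating to super. The names Source, StepA, StepB, and Stopper are hypothetical; Stopper deliberately does not call super, which illustrates how the chain gets broken.

```scala
object SuperChainSketch {
  class Source {
    def trace: List[String] = List("Source")
  }

  // Each of these traits adds its name and delegates to super.
  trait StepA extends Source {
    override def trace: List[String] = "StepA" :: super.trace
  }
  trait StepB extends Source {
    override def trace: List[String] = "StepB" :: super.trace
  }

  // This trait does not call super, so the chain stops here.
  trait Stopper extends Source {
    override def trace: List[String] = List("Stopper")
  }

  // Linearization: Chained -> StepB -> StepA -> Source -> AnyRef -> Any
  class Chained extends StepA with StepB

  // Linearization: Broken -> StepB -> Stopper -> StepA -> Source -> AnyRef -> Any
  class Broken extends StepA with Stopper with StepB

  def main(args: Array[String]): Unit = {
    println((new Chained).trace.mkString(" -> "))
    println((new Broken).trace.mkString(" -> "))
  }
}
```

In Chained, the call travels from StepB to StepA and finally to Source. In Broken, it stops at Stopper, so StepA and Source are never reached.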
Note
The previous example is rather amusing and proves how important it is to know the rules of linearization and how linearization works. Not being aware of this feature could result in a serious pitfall, which could lead to critical mistakes in your code.
My advice would still be to try and avoid cases of diamond inheritance, even though one can argue that this way some quite complex systems can be implemented seamlessly and without writing too much code. A case such as the preceding one could make programs really hard to read and maintain in the future.
You should be aware that linearization exists everywhere in Scala—not just when dealing with traits. This is just how the Scala type system works. This means that it is a good idea to be aware of the order in which constructors are called in order to avoid mistakes and generally, to try and keep the hierarchies relatively simple.
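One common mistake related to constructor order can be sketched as follows. The names Greeting, StrictGreeter, and LazyGreeter are hypothetical. The sketch assumes a trait whose val is computed from a member that the subclass defines; because the trait body runs before the subclass body, a strict val in the subclass is not yet assigned at that point.

```scala
object ValInitSketch {
  trait Greeting {
    def name: String
    // Evaluated while the trait body runs, before the subclass body.
    val message: String = s"Hello, $name"
  }

  // The strict val is assigned only after Greeting's body has run,
  // so Greeting sees the field's default value (null) at that time.
  class StrictGreeter extends Greeting {
    val name: String = "Scala"
  }

  // A lazy val is computed on first access, so it is ready
  // when Greeting asks for it.
  class LazyGreeter extends Greeting {
    lazy val name: String = "Scala"
  }

  def main(args: Array[String]): Unit = {
    println((new StrictGreeter).message)
    println((new LazyGreeter).message)
  }
}
```

With the strict val the message comes out as "Hello, null", while the lazy val variant produces "Hello, Scala". Using a lazy val or a def in the subclass is one possible way around this, not the only one.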