The sad state of extension methods in the JVM ecosystem

May 7, 2019 8 minutes read

I spend most of my time writing Java code. It isn’t perfect, yet I find it good enough as my primary language. But sometimes I dream about the future of Java.
Given the latest activity around the Java language design (see Project Amber), the future looks brighter JEP by JEP, but one feature I really miss, and which Amber doesn’t discuss, is… extension methods.

The problem

The Java language was heavily influenced by OOP design. We have a class, and we have instances of this class. The class defines its methods, and it can be extended to have more methods. Simple.

Consider java.lang.String class. It has around 60+ methods defined, and some of them are purely utility methods like String#toUpperCase().

We can instantiate it:

var s = new String(bytes);

We can of course call the methods of it:

System.out.println(s.toUpperCase());

But how does this class scale?
Imagine if we had 500 different utility methods defined in java.lang.String.
Did somebody mention “modular Java”?
Plus, there will always be some method that is missing but that somebody wants.

What we have in Java

String manipulation is very common, and there are utility libraries like Apache Commons or Guava that help perform common operations on Strings:

var myString = StringUtils.leftPad(Strings.repeat(StringUtils.swapCase(StringUtils.chomp(s, " ")), 3), 80);

Thanks to static methods, we can easily apply methods that are not defined in java.lang.String. Yay! But…

You may have noticed that my blog engine rendered that example with a horizontal scrollbar. It is indeed pretty long!
We can try to improve it:

var myString = StringUtils.leftPad(
    Strings.repeat(
        StringUtils.swapCase(
            StringUtils.chomp(s, " ")
        ),
        3
    ),
    80
);

While it helps a bit, it is still far from perfect! The utility class names make it harder to focus on the code, and the methods’ arguments are hard to follow.

Can you say what this code is doing (and how) after spending 2 seconds reading it? I can’t.

One can say “use static imports”:

var myString = leftPad(repeat(swapCase(chomp(s, " ")), 3), 80);

But IMO it only makes the code harder to read.

You have to know where the method is defined now.
Is it a method from the current class?
Or maybe from its parent?
Or maybe this is an interface default method?
What if there is a method with the same name in the class hierarchy?

Luckily, Java isn’t the only language in the JVM ecosystem, and we can take a look at how this problem is solved in other JVM languages.

In other languages

Groovy

One of the biggest advantages of Groovy when it was introduced to the Java community was its extension modules and default methods:

def file = new File(...)
def contents = file.getText('utf-8')

Where:

// ResourceGroovyMethods.java registered via o.c.groovy.runtime.ExtensionModule
public class ResourceGroovyMethods {

    // ...

    public static String getText(File file, String charset) throws IOException {
        return IOGroovyMethods.getText(newReader(file, charset));
    }

}

Now, once such extension methods are defined, we can call them on instances without having to modify the JDK.
Remember list.each { println(it) }? This is an extension method too!

Note that it uses the global registry of extension methods. Two modules define the same method? Oh well…
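
For reference, the registration mentioned in the comment above is a small descriptor file on the classpath. A sketch of what it might look like for a custom module (the module name, version, and class name are made up for illustration; Groovy 2.5+ looks for the file under META-INF/groovy/, older versions under META-INF/services/):

```properties
# META-INF/groovy/org.codehaus.groovy.runtime.ExtensionModule
# All values below are hypothetical.
moduleName=my-string-extensions
moduleVersion=1.0
extensionClasses=com.example.MyStringExtensions
```

Every descriptor found on the classpath gets loaded, which is where the global nature comes from.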

Kotlin

Kotlin extension functions are easier to define because they are not global, but importable and operate on this:

// listExtensions.kt
package listExtensions // needed so that 'import listExtensions.swap' resolves

fun MutableList<Int>.swap(index1: Int, index2: Int) {
    val tmp = this[index1] // 'this' corresponds to the list
    this[index1] = this[index2]
    this[index2] = tmp
}

import listExtensions.swap

val l = mutableListOf(1, 2, 3)
l.swap(0, 2) // 'this' inside 'swap()' will hold the value of 'l'

You decide which function to import, and different Kotlin files may use different extension methods.
There are a bunch of helpful extensions in Kotlin’s stdlib.

Note that unlike Groovy, you can’t write your extension functions in Java. Only in Kotlin.
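
Calling them from Java is a different story. A top-level Kotlin extension function compiles to an ordinary static method whose first parameter is the receiver, in a class named after the file (ListExtensionsKt for listExtensions.kt). Here is a rough sketch of how the swap function above looks from the Java side. The class is a hand-written stand-in, not real compiler output, and KotlinInteropDemo is a made-up name:

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written approximation of what Java sees for listExtensions.kt.
final class ListExtensionsKt {
    private ListExtensionsKt() {}

    // The Kotlin receiver ('this') becomes the first parameter.
    static void swap(List<Integer> receiver, int index1, int index2) {
        Integer tmp = receiver.get(index1);
        receiver.set(index1, receiver.get(index2));
        receiver.set(index2, tmp);
    }
}

class KotlinInteropDemo {
    static List<Integer> swapped() {
        List<Integer> l = new ArrayList<>(List.of(1, 2, 3));
        // Java gets no 'l.swap(0, 2)' sugar, only the static call.
        ListExtensionsKt.swap(l, 0, 2);
        return l;
    }
}
```

So the nice call-site syntax stays on the Kotlin side; Java callers are back to utility-class style.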

Scala

I have very little experience in Scala, so I just googled the topic and this is what I found:

implicit class ExtendedInt(val value: Int) extends AnyVal {
  def sumWith(other: Int) = value + other
}

val myInt = 1.sumWith(2)

~~Me being me: of course there is implicit! Everything is implicit in Scala 😅~~

But the truth is… I think none of them are good, and Java should not implement them like that 😱

Can it be better?

I’m sure there are many other JVM languages, but these are the most popular ones, so I will focus on them.

What’s wrong with Groovy? Its global nature!
With Kotlin & Scala? Extensions must be defined in the same language, so they can’t reuse existing Java libraries.

Also, they all have one thing in common: the call-site syntax.

No matter which language it is, it looks like this:

myObj.myExtensionMethods(arg1, arg2, arg3)

It creates a bunch of problems:

If you’re not using an IDE (e.g. reviewing code on GitHub), you need to know where the method comes from: the original object or some extension.
If there is a method with the same name in the object, which one will be selected? That’s up to the language.
It doesn’t make the IDE’s life any easier.
As a maintainer of various OSS libraries, I find it bad because some junior developers (and sometimes not only junior ones) assume that these methods are defined in the library that defines the original type, and may flood its issue tracker with unrelated reports.

But does it really have to use the same syntax as a normal method call?

When I was working with JavaScript (and RxJS), I came across the bind operator proposal which looks like this:

function toStringWithPrefix(prefix) {
    return prefix + this
}

let s = email.indexOf("@")::toStringWithPrefix("@ symbol location is: ")

And it gets replaced with:

let s = toStringWithPrefix.call(email.indexOf("@"), "@ symbol location is: ")

As you can see, it uses a different syntax (double colon) to separate normal calls from “extensions”.

Because the bind operator relies on this, the JS community later decided to focus on another proposal, the pipeline operator, which also uses a different syntax:

function toStringWithPrefix(prefix) {
    // Returns lambda
    return self => prefix + self
}

let s = email.indexOf("@") |> toStringWithPrefix("@ symbol location is: ")

You may question the operator’s syntax, but the idea of not using the method call syntax is nice!
And it works with some existing libraries like Lodash too!

I guess JavaScript folks have had enough fun with the prototypes and same-syntax extensions 😅
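
The same left-to-right flow can be approximated in today’s Java with a tiny wrapper, at the cost of some extra ceremony. A minimal sketch, where Pipe, then, and PipeDemo are hypothetical names and not an existing API:

```java
import java.util.function.Function;

// Hypothetical helper: wraps a value so functions can be applied left to right.
final class Pipe<T> {
    private final T value;

    private Pipe(T value) { this.value = value; }

    static <T> Pipe<T> of(T value) { return new Pipe<>(value); }

    // Rough equivalent of 'value |> fn'.
    <R> Pipe<R> then(Function<? super T, ? extends R> fn) {
        return new Pipe<>(fn.apply(value));
    }

    T get() { return value; }
}

class PipeDemo {
    // Mirrors the JavaScript example: returns a function instead of relying on 'this'.
    static Function<Object, String> toStringWithPrefix(String prefix) {
        return self -> prefix + self;
    }

    static String describe(String email) {
        return Pipe.of(email.indexOf("@"))
                .then(toStringWithPrefix("@ symbol location is: "))
                .get();
    }
}
```

It reads in the right order, but the Pipe.of(...) and .get() noise is exactly the kind of thing a dedicated syntax would remove.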

Back to Java

⚠️ WARNING!
This section is just the result of my (sick?) imagination and nothing else.
It does not exist (at least not yet), nor is it planned, AFAIK.

How can we apply this knowledge to Java? Let’s define the requirements:

It should not be global
The syntax should make it clear that we’re calling an extension method, for both readability and IDEs
Ideally, it should support existing Java libraries like Apache Commons or Guava, so that they can be used “as is”
Extra points if it can be done without changing the JVM spec / JVM implementation
It should work with nulls too!
Other JVM languages should be able to call these methods

Extension method definition

For the definition, I like how Groovy and Lombok do it.

Here is an example from Project Lombok:

import lombok.experimental.ExtensionMethod;

@ExtensionMethod({java.util.Arrays.class, Extensions.class})
public class ExtensionMethodExample {
  public void test() {
    int[] intArray = { 5, 3, 8, 2 };
    intArray.sort(); // nice!

    String iAmNull = null;
    iAmNull.toTitleCase(); // null
    "hELlO, WORlD!".toTitleCase(); // Hello, world!
  }
}

class Extensions {
  public static String toTitleCase(String self) {
    if (self == null || self.isEmpty()) return self;
    return ""
        + Character.toTitleCase(self.charAt(0))
        + self.substring(1).toLowerCase();
  }
}

What we have here:

The toTitleCase extension method is defined as a static method
It works with nulls
We can reuse existing utility methods like the java.util.Arrays class

What’s left:

It still looks like a regular method call
We cannot pick individual methods, because the @ExtensionMethod annotation takes whole classes
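
The null-safety is a consequence of how the call gets compiled: Lombok rewrites an extension call into a plain static call with the receiver as the first argument. A sketch of the roughly equivalent code without Lombok (DesugaredExample is a name made up for this illustration):

```java
import java.util.Arrays;

class Extensions {
    static String toTitleCase(String self) {
        if (self == null || self.isEmpty()) return self;
        return ""
            + Character.toTitleCase(self.charAt(0))
            + self.substring(1).toLowerCase();
    }
}

class DesugaredExample {
    static int[] sorted() {
        int[] intArray = { 5, 3, 8, 2 };
        Arrays.sort(intArray); // was: intArray.sort()
        return intArray;
    }

    static String nullCase() {
        String iAmNull = null;
        // was: iAmNull.toTitleCase(); no dereference happens, so no NPE
        return Extensions.toTitleCase(iAmNull);
    }

    static String title() {
        return Extensions.toTitleCase("hELlO, WORlD!"); // was: "hELlO, WORlD!".toTitleCase()
    }
}
```

Nothing here needs JVM support, which matters for the “no JVM changes” requirement above.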

UX

While I really like the :: syntax from JS, it is already used in Java for method references.
Since this part is imaginary, I will stick to :: for lack of a better option for now:

@ExtensionMethod({java.util.Arrays.class, Extensions.class})
public class ExtensionMethodExample {
  public void test() {
    int[] intArray = { 5, 3, 8, 2 };
    intArray::sort(); // nice!

    String iAmNull = null;
    iAmNull::toTitleCase(); // null
    "hELlO, WORlD!"::toTitleCase(); // Hello, world!
  }
}

Now, the only missing part is the imports. We could borrow the C#-style using keyword:

package com.example;

using java.util.Arrays.*; // import-like star imports
using com.example.Extensions.toTitleCase; // use concrete method

public class ExtensionMethodExample {
  public void test() {
    int[] intArray = { 5, 3, 8, 2 };
    intArray::sort(); // nice!

    String iAmNull = null;
    iAmNull::toTitleCase(); // null
    "hELlO, WORlD!"::toTitleCase(); // Hello, world!
  }
}

Neat, isn’t it?

What it gives us

If we rewrite our original String example, we get:

// Before:
var myString = StringUtils.leftPad(
    Strings.repeat(
        StringUtils.swapCase(
            StringUtils.chomp(s, " ")
        ),
        3
    ),
    80
);

// After
var myString = s::chomp(" ")::swapCase()::repeat(3)::leftPad(80);

Maybe the string examples are not that convincing. But consider the following examples:

myList.stream()
    ::ofType(String.class)
    ::collectList();

or:

myList::map(Object::toString);

// where map is:
public static <I, O> List<O> map(List<I> self, Function<I, O> mapper) {
    // TODO optimize later ;)
    return self.stream().map(mapper).collect(Collectors.toList());
}

or:

User user = null;

String name = user::transform(it -> it.getName())::or("Anonymous");

// Where:
public static <I, O> O transform(@Nullable I self, Function<I, O> mapper) {
    // skip the mapper for a null receiver instead of throwing an NPE inside it
    return self != null ? mapper.apply(self) : null;
}

public static <T> T or(@Nullable T self, T fallback) {
    return self != null ? self : fallback;
}
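
Since the :: syntax is imaginary, the closest runnable form today is the plain static call. A minimal sketch, assuming transform is meant to tolerate a null receiver (that null check is my assumption, and User, NullSafeExtensions, and NullSafeDemo are hypothetical names):

```java
import java.util.function.Function;

class User {
    private final String name;

    User(String name) { this.name = name; }

    String getName() { return name; }
}

class NullSafeExtensions {
    // Assumption: a null receiver skips the mapper and propagates null.
    static <I, O> O transform(I self, Function<? super I, ? extends O> mapper) {
        if (self == null) return null;
        return mapper.apply(self);
    }

    static <T> T or(T self, T fallback) {
        return self != null ? self : fallback;
    }
}

class NullSafeDemo {
    static String nameOf(User user) {
        // Imagined syntax: user::transform(it -> it.getName())::or("Anonymous")
        String name = NullSafeExtensions.transform(user, User::getName);
        return NullSafeExtensions.or(name, "Anonymous");
    }
}
```

Both a null user and a user with a null name end up as "Anonymous", without a single explicit null check at the call site.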

There is also the Flow.Publisher<T> type in Java 9+, which does not define any operators and is not user-friendly without them:

Flow.Publisher<User> publisher = ...;

return publisher
    ::filter(it -> it.getAge() >= 18) // generic operator for any Flow.Publisher
    ::asFlux() // convert to Project Reactor's Flux
    .map(it -> it.getName()::or("Anonymous"))
    ::onErrorLogAndReturnEmpty();
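
To give an idea of what such a generic operator could look like as a static method, here is a simplified sketch of filter. FlowExtensions, FlowDemo, and fromList are hypothetical names; the code ignores details a production operator must handle (a throwing predicate, for instance), and the demo source is synchronous and not a spec-complete Publisher:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Flow;
import java.util.function.Predicate;

// Hypothetical holder for static "operators" over any Flow.Publisher.
class FlowExtensions {
    // Forwards matching items; requests one more item from upstream for
    // every dropped item so the downstream demand is still honoured.
    static <T> Flow.Publisher<T> filter(Flow.Publisher<T> self, Predicate<? super T> predicate) {
        return downstream -> self.subscribe(new Flow.Subscriber<T>() {
            private Flow.Subscription upstream;

            @Override public void onSubscribe(Flow.Subscription subscription) {
                upstream = subscription;
                downstream.onSubscribe(subscription);
            }

            @Override public void onNext(T item) {
                if (predicate.test(item)) {
                    downstream.onNext(item);
                } else {
                    upstream.request(1);
                }
            }

            @Override public void onError(Throwable error) { downstream.onError(error); }

            @Override public void onComplete() { downstream.onComplete(); }
        });
    }
}

class FlowDemo {
    // Demo-only synchronous source that emits list items on request.
    static <T> Flow.Publisher<T> fromList(List<T> items) {
        return subscriber -> subscriber.onSubscribe(new Flow.Subscription() {
            private final Iterator<T> iterator = items.iterator();
            private long pending;
            private boolean emitting;
            private boolean done;

            @Override public void request(long n) {
                pending = (pending + n < 0) ? Long.MAX_VALUE : pending + n; // cap on overflow
                if (emitting) return; // re-entrant call: the running loop picks up the demand
                emitting = true;
                while (!done && pending > 0 && iterator.hasNext()) {
                    pending--;
                    subscriber.onNext(iterator.next());
                }
                if (!done && !iterator.hasNext()) {
                    done = true;
                    subscriber.onComplete();
                }
                emitting = false;
            }

            @Override public void cancel() { done = true; }
        });
    }

    static List<Integer> adults(List<Integer> ages) {
        List<Integer> result = new ArrayList<>();
        // Imagined syntax: fromList(ages)::filter(age -> age >= 18)
        FlowExtensions.filter(fromList(ages), age -> age >= 18)
            .subscribe(new Flow.Subscriber<Integer>() {
                @Override public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
                @Override public void onNext(Integer age) { result.add(age); }
                @Override public void onError(Throwable error) { throw new IllegalStateException(error); }
                @Override public void onComplete() { }
            });
        return result;
    }
}
```

The operator lives entirely outside Flow.Publisher, so the interface itself can stay as small as it is today.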

Conclusion

With functional programming emerging in Java, extension methods would help keep the original types small and also remove a lot of verbosity from users’ code.

Library maintainers would also benefit from them.
Just look at the number of methods defined in Reactor’s Flux! 😅