/ DEDUPLICATION, API

Deduplication trick in legacy code

It might happen that you need to deduplicate a list of items…​ coming from legacy code. The class - let’s call it LegacyObject has already implementations for equals() and hashCode(). It’s not possible to change the implementation, for fear of breaking the running code. And unfortunately, the Java API doesn’t offer a distinctBy() feature.

In that case, a cheap trick is to create a wrapper class around LegacyObject, with the desired implementation:

public class LegacyObject {

  private final UUID id;
  private final String foo;
  private final int bar;

  public LegacyObject(UUID id, String foo, int bar) {
    this.id = id;
    this.foo = foo;
    this.bar = bar;
  }

  @Override
  public int hashCode() {
    return Objects.hash(id);
  }

  // Implementation of equals() using only the id field
  // Getters
}

public class DeduplicateWrapper {

  private final LegacyObject object;

  public DeduplicateWrapper(LegacyObject object) {
    this.object = object;
  }

  public LegacyObject getObject() {
    return object;
  }

  @Override
  public int hashCode() {
    return Objects.hash(object.getFoo());
  }

  // Implementation of equals() using only the foo field of the wrapped object
}

With that, it’s a no-brainer to deduplicate a collection using the stream API:

List<LegacyObject> duplicates = ...;

duplicates.stream()
    .map(DeduplicateWrapper::new)
    .distinct()
    .map(DeduplicateWrapper::getObject);

It works also using pre-Java 8 code, but with much more ceremony involved:

List<LegacyObject> deduplicated = new ArrayList<>();
Set<DeduplicateWrapper> wrappers = new HashSet<>();
for (LegacyObject duplicate: duplicates) {
  wrappers.add(new DeduplicateWrapper(duplicate));
}
for (DeduplicateWrapper wrapper: wrappers) {
  deduplicated.add(wrapper.getObject());
}

Of course, if you’re lucky enough being able to use Kotlin, this becomes a no-brainer:

val duplicates: List<LegacyObject> = ...
duplicates.distinctBy { it.foo }

Alternatively, third-party libraries exist that can get the job done.

Nicolas Fränkel

Nicolas Fränkel

Nicolas Fränkel is a technologist focusing on cloud-native technologies, DevOps, CI/CD pipelines, and system observability. His focus revolves around creating technical content, delivering talks, and engaging with developer communities to promote the adoption of modern software practices. With a strong background in software, he has worked extensively with the JVM, applying his expertise across various industries. In addition to his technical work, he is the author of several books and regularly shares insights through his blog and open-source contributions.

Read More
Deduplication trick in legacy code
Share this