Guava caches have weird sizes

Recently i solved a bug in an open source project relating to google guava caches and kotlin collections


Recently i found a stack trace in an open source project i sometimes contribute to. Very rarely some features throw a NoSuchElementException:

Caught an NoSuchElementException in ChumBucketHider at CheckRenderEntityEvent: null

Caused by java.util.NoSuchElementException: null
    at com.google.common.cache.LocalCache$HashIterator.nextEntry(LocalCache.java:4343)
    at com.google.common.cache.LocalCache$KeyIterator.next(LocalCache.java:4362)
    at kotlin.collections.CollectionsKt___CollectionsKt.toSet(_Collections.kt:1347)
    at SH.utils.TimeLimitedSet.toSet(TimeLimitedSet.kt:31)
    at SH.features.fishing.ChumBucketHider.onCheckRender(ChumBucketHider.kt:57)
    ...

This exception seemed to happen only fairly rarely and up until then nobody could figure out what the problem was. There were some suspicions regarding threading, but nothing concrete. Since it was fairly rare, the fix that was eventually implemented was just to surround the access with a try-catch and try again at some later point. Let’s look at what this function call was actually supposed to do.

TimeLimitedSet is a class based on guava’s Cache. It is used as a list of entities that get automatically removed after 5 seconds. While the Cache offers some basic methods for checking if a key is present, if you want to iterate over all entries you need to use the asMap() method. This returns a live view of the cache, meaning changes in the cache get reflected in the returned map as well.

The interesting part about that returned Map is not actually the live-ness of it, but rather that it is does not have the correct size. While this is documented for the size() method of the Cache object, it is not part of the interface contract of a regular Map or Collection. This can cause some confusion (as it did here), when other pieces of code are not aware that the size() might be incorrect.

Kotlin offers a bunch of functions for creating collections, such as iterable.toSet(), iterable.toList() and more. The resulting collections are read only variants of those collections, and have optimizations for small collection sizes, such as 0 or 1:

public fun <T> Iterable<T>.toSet(): Set<T> {
    if (this is Collection) {
        return when (size) {
            0 -> emptySet()
            1 -> setOf(if (this is List) this[0] else iterator().next())
            else -> toCollection(LinkedHashSet<T>(mapCapacity(size)))
        }
    }
    return toCollection(LinkedHashSet<T>()).optimizeReadOnlySet()
}

If you don’t know Kotlin this might be a but confusing, but the gist is this: the this refers to the iterable that is being converted. when (size) is used as a switch over the .size() function of the iterable collection (if it is a Collection). If there is only 1 element, it will use iterator().next() to extract that element and turn it into an unmodifiable singleton set.

Here is where things go wrong for the guava cache. Since the cache map can return higher numbers than what is actually present in the map if an entry hasn’t been evicted yet, Kotlin is being mislead. The size can be 1 but the actual iteration will be empty, causing a crash.

The fix for this is to just use a regular new SomeSet() constructor, since those do not use the small collection optimization.

This was an interesting find for me. I haven’t worked much with guava caches and they definitely have some quirks i wasn’t ware of, but they are well documented if you know what to look for.