Skip to content

Avoid allocating Collections.nCopies in ScopeStack #16258

@lukaseder

Description

@lukaseder

ScopeStack is quite on the hot path for SQL rendering. Inside its internals, we're using nCopies to conveniently allocate a list of null values:

    private void set0(List<V> list, V value) {
        int l = scopeLevel + 1;
        int size = list.size();
        if (size < l)
            list.addAll(nCopies(l - size, null));
        list.set(scopeLevel, value);
    }

This could be avoided easily with a loop:

    private void set0(List<V> list, V value) {
        int l = scopeLevel + 1;
        int size = list.size();
        if (size < l) {
            int nulls = l - size;
            for (int i = 0; i < nulls; i++)
                list.add(null);
        }
        list.set(scopeLevel, value);
    }

A profiling session shows that we're allocating 6 of these lists per rendering:

image

That seems to be a low amount of overhead, but it's not negligible as shown in benchmarks:

Before

BenchmarkExecution.testExecuteOnH2            thrpt    7  89742.402 ┬▒ 2245.476  ops/s
BenchmarkExecution.testExecuteOnMock          thrpt    7  97799.574 ┬▒ 1438.548  ops/s
BenchmarkImplicitJoin.testRenderExplicitJoin  thrpt    7  97711.156 ┬▒ 1667.901  ops/s
BenchmarkImplicitJoin.testRenderImplicitJoin  thrpt    7  80856.308 ┬▒ 1379.261  ops/s

After

BenchmarkExecution.testExecuteOnH2    thrpt            7   91395.750 ┬▒ 3148.779  ops/s
BenchmarkExecution.testExecuteOnMock  thrpt            7   99775.405 ┬▒ 1039.559  ops/s
BenchmarkImplicitJoin.testRenderExplicitJoin  thrpt    7  101599.864 ┬▒ 2519.052  ops/s
BenchmarkImplicitJoin.testRenderImplicitJoin  thrpt    7   81835.718 ┬▒ 2357.342  ops/s

Alternative

An alternative would be to keep addAll(), which tends to be faster for larger argument lists because it can better resize the ArrayList internal array:

    private final V create0(K key, List<V> list) {
        V result = constructor.apply(key, scopeLevel);
        set0(list, result);
        return result;
    }

    private void set0(List<V> list, V value) {
        int l = scopeLevel + 1;
        int size = list.size();
        if (size < l)
            list.addAll(nCopiesOfNulls(l - size));
        list.set(scopeLevel, value);
    }

    static final List<?> NULLS_1 = nCopies(1, null);
    static final List<?> NULLS_2 = nCopies(2, null);
    static final List<?> NULLS_3 = nCopies(3, null);
    static final List<?> NULLS_4 = nCopies(4, null);

    private static <V> List<V> nCopiesOfNulls(int size) {
        switch (size) {
            case 1: return (List) NULLS_1;
            case 2: return (List) NULLS_2;
            case 3: return (List) NULLS_3;
            case 4: return (List) NULLS_4;
            default: return nCopies(size, null);
        }
    }

Given that we're usually not going too deeply into new scopes (e.g. for nested selects), this looks like an unnecessary optimisation. In another benchmark, things don't get faster this way.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions