-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Closed
Description
ScopeStack is quite on the hot path for SQL rendering. Inside its internals, we're using nCopies to conveniently allocate a list of null values:
private void set0(List<V> list, V value) {
int l = scopeLevel + 1;
int size = list.size();
if (size < l)
list.addAll(nCopies(l - size, null));
list.set(scopeLevel, value);
}This could be avoided easily with a loop:
private void set0(List<V> list, V value) {
int l = scopeLevel + 1;
int size = list.size();
if (size < l) {
int nulls = l - size;
for (int i = 0; i < nulls; i++)
list.add(null);
}
list.set(scopeLevel, value);
}A profiling session shows that we're allocating 6 of these lists per rendering:
That seems to be a low amount of overhead, but it's not negligible as shown in benchmarks:
Before
BenchmarkExecution.testExecuteOnH2 thrpt 7 89742.402 ┬▒ 2245.476 ops/s
BenchmarkExecution.testExecuteOnMock thrpt 7 97799.574 ┬▒ 1438.548 ops/s
BenchmarkImplicitJoin.testRenderExplicitJoin thrpt 7 97711.156 ┬▒ 1667.901 ops/s
BenchmarkImplicitJoin.testRenderImplicitJoin thrpt 7 80856.308 ┬▒ 1379.261 ops/s
After
BenchmarkExecution.testExecuteOnH2 thrpt 7 91395.750 ┬▒ 3148.779 ops/s
BenchmarkExecution.testExecuteOnMock thrpt 7 99775.405 ┬▒ 1039.559 ops/s
BenchmarkImplicitJoin.testRenderExplicitJoin thrpt 7 101599.864 ┬▒ 2519.052 ops/s
BenchmarkImplicitJoin.testRenderImplicitJoin thrpt 7 81835.718 ┬▒ 2357.342 ops/s
Alternative
An alternative would be to keep addAll(), which tends to be faster for larger argument lists because it can better resize the ArrayList internal array:
private final V create0(K key, List<V> list) {
V result = constructor.apply(key, scopeLevel);
set0(list, result);
return result;
}
private void set0(List<V> list, V value) {
int l = scopeLevel + 1;
int size = list.size();
if (size < l)
list.addAll(nCopiesOfNulls(l - size));
list.set(scopeLevel, value);
}
static final List<?> NULLS_1 = nCopies(1, null);
static final List<?> NULLS_2 = nCopies(2, null);
static final List<?> NULLS_3 = nCopies(3, null);
static final List<?> NULLS_4 = nCopies(4, null);
private static <V> List<V> nCopiesOfNulls(int size) {
switch (size) {
case 1: return (List) NULLS_1;
case 2: return (List) NULLS_2;
case 3: return (List) NULLS_3;
case 4: return (List) NULLS_4;
default: return nCopies(size, null);
}
}Given that we're usually not going too deeply into new scopes (e.g. for nested selects), this looks like an unnecessary optimisation. In another benchmark, things don't get faster this way.