Optimisations and cleanups#173
Conversation
b29b665 to
4041d2b
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
🚀 New features to boost your workflow:
|
90a3ec7 to
32492a9
Compare
onno-vos-dev
left a comment
There was a problem hiding this comment.
Haven't had the time to look through all commits yet but sharing these two review comments as a start
| -define(ISUM_POS, 2). | ||
| -define(FSUM_POS, 3). | ||
| -define(WIDTH, 16). | ||
| -define(WIDTH, erlang:system_info(schedulers_online)). |
There was a problem hiding this comment.
Maybe instead of fixing this to be schedulers_online we can set this up as an application:get_env(<application>, <some_name>, 16) so the original remains preserved but we open up the loophole for setting a higher value.
There was a problem hiding this comment.
Another issue is that schedulers_online can change value dynamically at runtime.
There was a problem hiding this comment.
Right. Should use just schedulers, as that can't change at runtime.
On the other hand I'm not sure it'd help to make it configurable, currently the algorithm that chooses the key would do X band (?WIDTH - 1), so most values would mean skewing the distribution of schedulers to each key and some keys would see more contention than others. Sensibly, if we're keying on the scheduler id, the only values that make sense are multiples (or halves) of the number of schedulers.
| -define(FSUM_POS, 4). | ||
| -define(BUCKETS_START, 5). | ||
| -define(WIDTH, 16). | ||
| -define(WIDTH, erlang:system_info(schedulers_online)). |
There was a problem hiding this comment.
Maybe instead of fixing this to be schedulers_online we can set this up as an application:get_env(<application>, <some_name>, 16) so the original remains preserved but we open up the loophole for setting a higher value.
There was a problem hiding this comment.
This solution has the same issue as @mikpe mentioned above, we don't want to let users change that value dynamically. another issue with defaults - they are rarely changed :)
32492a9 to
ebd885f
Compare
ebd885f to
eb19b61
Compare
eb19b61 to
e2c2b50
Compare
|
About the change to use number of schedulers for the width, what if I have a very large number of schedulers, will it still work? Or is there a limit in ets? |
| -define(ISUM_POS, 2). | ||
| -define(FSUM_POS, 3). | ||
| -define(WIDTH, 16). | ||
| -define(WIDTH, erlang:system_info(schedulers)). |
There was a problem hiding this comment.
X band (?WIDTH - 1) - it seems that WIDTH is expected to be a power of 2
| -define(FSUM_POS, 4). | ||
| -define(BUCKETS_START, 5). | ||
| -define(WIDTH, 16). | ||
| -define(WIDTH, erlang:system_info(schedulers)). |
There was a problem hiding this comment.
X band (?WIDTH - 1) - it seems that WIDTH is expected to be a power of 2
Note that the histogram table actually requires read concurrency, as on every insert first parameters need to be looked up. Otherwise this makes it more flexible to configure any table independently and precisely, may the need arise.
Examine the condition inlined in the helper instead of passing a function object. Also ensure that the entry point has checked the types, when looping on the inner function, the JIT compiler will know already that the inputs are lists and numbers, skipping dynamic type checking on every iteration. The `lists` module implements such optimisation extensively.
|
On @lhoguin comment, I'd say bigger width should just make reads slower as there'd be more rows to read, but inserts should be faster as there'd be less contention. On the other hand, these are educated guesses, not measured points. On @DenysGonchar comment, actually that's a very good point. In general, what we want is to ensure the distribution of values is as uniform as possible across schedulers, I just ran some checks and it seems like indeed my proposal wouldn't be good enough either. For that matter, I'm just taking this change out of this PR, will later on when I find time for it run some actual statistics about distributions, and benchmark different solutions. The rest of the changes still apply. Thank you for your comments 🙇🏽 |
e2c2b50 to
72ed85c
Compare
This PR proposes a miscellaneous of updates and improvements to the repo. Commits are well organised by topics and with good descriptions, so that'd be the best way to review the whole thing, as some changes are not related to each other – but are some times too small to be moved to their own PR, which I could easily do if any of you would prefer me to do anyways.
EDIT: #173 (comment)