Skip to content

Conversation

@nyckyta
Copy link
Contributor

@nyckyta nyckyta commented May 19, 2025

There are multiple issues that appear in the current version of the implementation:

  1. Unnecessary sorting is done before each selection. SUS does not require sorting, copy and sort look like wasting of resources, can be critical on algorithms requiring highly intensive computation.
  2. Distance between points in SUS must be equally sized intervals. The previous version had random size interval between zero's and first points.
  3. Actual selection of population from points and cumulative probability had a bug. Selection of i-th element has been done through analyzing (i-1)-th interval. Previously, the 0-th element has always been skipped. As a result first interval actually always corresponded to the second element creating a skew in distribution.

For example, if we had probabilities == [0.7, 0.2, 0.1], and points == (0.12, 0.45, 0.78). It can be seen that these parameters will produce [1, 1, 2] population (values are indexes of parents from original populations). Obviously, it is wrong.

It ia also worth to mention, that this fix shows better results on performance tests done for SUS. Previous versions did fail some of the tests. This version actually passes all performance tests.

@jenetics
Copy link
Owner

Hi @nyckyta, thank you for the fix. Do you mind to add also a unit-test, which will verify the fix? :)

@jenetics jenetics added the bug label May 20, 2025
@jenetics jenetics added this to the v8.3.0 milestone May 20, 2025
@nyckyta
Copy link
Contributor Author

nyckyta commented May 21, 2025

Sure. Give me few days, I will add some tests.

There are multiple issues that appear in the current version of the implementation:
1. Unnecessary sorting is done before each selection. SUS does not require sorting,
copy and sort look like wasting of resources, can be critical on algorithms
requiring highly intensive computation.
2. Distance between points in SUS must be equally sized intervals. The previous
version had random size interval between zero's and first points.
3. Actual selection of population from points and cumulative probability had a bug.
Selection of i-th element has been done through analyzing (i-1)-th interval.
Previously, the 0-th element has always been skipped. As a result first interval
actually always corresponded to the second element creating a skew in distribution.

For example, if we had probabilities == [0.7, 0.2, 0.1], and points ==
(0.33, 0.66, 0.99). It can be seen that these parameters will produce [1, 1, 1]
population (values are indexes of parents from original populations). Obviously,
it is wrong.

It ia also worth to mention, that this fix shows better results on performance
tests done for SUS. Previous versions did fail some of the tests. This version
actually passes all performance tests.
@nyckyta
Copy link
Contributor Author

nyckyta commented May 22, 2025

@jenetics , done

@jenetics jenetics changed the base branch from master to releases/r8.3.0 May 22, 2025 20:30
@jenetics jenetics merged commit 31ee7ac into jenetics:releases/r8.3.0 May 22, 2025
4 checks passed
jenetics added a commit that referenced this pull request May 22, 2025
Signed-off-by: Franz Wilhelmstötter <franz.wilhelmstoetter@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants