Skip to content

new fast isValidCell#968

Merged
ajfriend merged 48 commits into
uber:masterfrom
ajfriend:new_isValidCell
Mar 10, 2025
Merged

new fast isValidCell#968
ajfriend merged 48 commits into
uber:masterfrom
ajfriend:new_isValidCell

Conversation

@ajfriend

@ajfriend ajfriend commented Feb 17, 2025

Copy link
Copy Markdown
Collaborator

Taking over from #496.

New isValidCell should be faster in all cases. Relies on bit manipulation instead of loops.

In the last correctness check, we use compiler intrinsics, so we'll want to verify that that approach works on all platforms. We can always fall back to a loop, and that would still be faster (see the benchmarks plot below). However, we need to ensure that we have the right test to evaluate which approaches are available on each platform. I'm not currently sure how to do that in an exhaustive way.

@coveralls

coveralls commented Feb 17, 2025

Copy link
Copy Markdown

Coverage Status

coverage: 98.786% (+0.002%) from 98.784%
when pulling 52c7146 on ajfriend:new_isValidCell
into 42450ce on uber:master.

@ajfriend ajfriend marked this pull request as draft February 18, 2025 04:06
@ajfriend

Copy link
Copy Markdown
Collaborator Author

OK, current approach breaks some rules:

/home/runner/work/h3/h3/src/h3lib/lib/h3Index.c: In function ‘_isValidCell_const’:
/home/runner/work/h3/h3/src/h3lib/lib/h3Index.c:297:14: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]

@ajfriend

Copy link
Copy Markdown
Collaborator Author

Speedup results on my M3 mac are promising. This is for 8_14_null_100.

  • old is the original algorithm
  • final is the final algorithm using intrinsics
  • fallback is the final algorithm, but will the fallback for when intrinsics are not availab.e
image

Speedups, comparing the fastest run for each name and algo:

┌───────────────┬─────────┬─────────┬─────────┐
│     name      │  algo1  │  algo2  │ speedup │
│    varchar    │ varchar │ varchar │ double  │
├───────────────┼─────────┼─────────┼─────────┤
│ 2_8           │ old     │ final   │     6.4 │
│ 8_14          │ old     │ final   │     9.1 │
│ 8_14_null_10  │ old     │ final   │     7.8 │
│ 8_14_null_100 │ old     │ final   │     8.8 │
│ 8_14_null_2   │ old     │ final   │     5.0 │
└───────────────┴─────────┴─────────┴─────────┘

@ajfriend

Copy link
Copy Markdown
Collaborator Author

Latest benchmarks of final algo:

┌───────────────┬──────────────┬───────┐
│     name      │ microseconds │ runs  │
│    varchar    │    double    │ int64 │
├───────────────┼──────────────┼───────┤
│ 2_8           │      145.976 │   500 │
│ 8_14          │      145.646 │   500 │
│ 8_14_null_10  │      157.995 │   500 │
│ 8_14_null_100 │      149.786 │   500 │
│ 8_14_null_2   │      145.231 │   500 │
└───────────────┴──────────────┴───────┘

@ajfriend ajfriend marked this pull request as ready for review February 22, 2025 20:47
@ajfriend ajfriend mentioned this pull request Feb 22, 2025
@ajfriend ajfriend changed the title [wip] new fast isValidCell new fast isValidCell Feb 22, 2025
@ajfriend

ajfriend commented Feb 22, 2025

Copy link
Copy Markdown
Collaborator Author

TODO: We test 32 bit windows in CI. Should we do the same for mac and linux?

@isaacbrodsky

Copy link
Copy Markdown
Collaborator

TODO: We test 32 bit windows in CI. Should we do the same for mac and linux?

I don't think there is an equivalent Mac 32-bit.
Linux we can test a x86 or ARM 32-bit configuration

Comment thread src/h3lib/lib/h3Index.c
[4] = 1, [14] = 1, [24] = 1, [38] = 1, [49] = 1, [58] = 1,
[63] = 1, [72] = 1, [83] = 1, [97] = 1, [107] = 1, [117] = 1};

if (isBaseCellPentagonArr[base_cell]) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to use (perhaps an inlined version of) _isBaseCellPentagon?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll run some benchmarks and see if there's any difference.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely slower with the naive use of _isBaseCellPentagon. Test setup in the latest commit:

image

I'll figure out the right extern and inline incantation and try that next.

Another alternative might be to just expose the BaseCellData directly.

More generally for the H3 lib as a whole, I'm not sure if there's a benefit to using a struct of arrays vs an array of structs.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think doing this will be a little involved, so I'm proposing doing it later :)

#984

@ajfriend

ajfriend commented Feb 25, 2025

Copy link
Copy Markdown
Collaborator Author

TODO: We test 32 bit windows in CI. Should we do the same for mac and linux?

I don't think there is an equivalent Mac 32-bit. Linux we can test a x86 or ARM 32-bit configuration

Think that's a blocker for this PR, or can I create a separate task/PR to add that to CI? @isaacbrodsky

@isaacbrodsky

isaacbrodsky commented Feb 25, 2025

Copy link
Copy Markdown
Collaborator

TODO: We test 32 bit windows in CI. Should we do the same for mac and linux?

I don't think there is an equivalent Mac 32-bit. Linux we can test a x86 or ARM 32-bit configuration

Think that's a blocker for this PR, or can I create a separate task/PR to add that to CI? @isaacbrodsky

I don't think it's a blocker, but we could certainly merge it in first.

My sense is Linux x86/i*86 is much less common today and probably not an architecture we need to be too concerned with testing.

I'll open a PR to include Linux arm64. I don't think Github provides Linux arm32 runners, at least as standard: https://docs.github.com/en/actions/using-github-hosted-runners/using-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories

edit: #974

@ajfriend

ajfriend commented Mar 6, 2025

Copy link
Copy Markdown
Collaborator Author

Issues trying to inline:

image image

@dfellis

dfellis commented Mar 6, 2025

Copy link
Copy Markdown
Collaborator

I'm not super familiar with inline functions in C, but going by this StackOverflow answer you need inline int ... in the header, and extern int ... in the implementation.

@ajfriend

ajfriend commented Mar 7, 2025

Copy link
Copy Markdown
Collaborator Author

I'm not super familiar with inline functions in C, but going by this StackOverflow answer you need inline int ... in the header, and extern int ... in the implementation.

Nice. That works for me locally, but with some errors. Let's see now it does in CI.

@ajfriend

ajfriend commented Mar 9, 2025

Copy link
Copy Markdown
Collaborator Author

Note the follow-up: #984

@ajfriend ajfriend merged commit 306a7cb into uber:master Mar 10, 2025
@ajfriend ajfriend deleted the new_isValidCell branch March 10, 2025 01:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants