Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use fast HPACK comparisons when not checking sensitive headers #9259

Merged
merged 1 commit into from
Oct 24, 2019

Conversation

carl-mastrangelo
Copy link
Member

Motivation:
Constant time comparison functions are used to compare HTTP/2 header
values, even if they are not sensitive.

Modification:
After checking for sensitivity, use fast comparison.

Result: Faster HPACK table reads/writes

@netty-bot
Copy link

Can one of the admins verify this patch?

@@ -59,21 +60,16 @@ final int size() {

@Override
public final int hashCode() {
// TODO(nmittler): Netty's build rules require this. Probably need a better implementation.
return super.hashCode();
throw new UnsupportedOperationException();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn’t it possible that the user calls this somehow ? Why not just leave it as it is ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unsure, but I made it throw because it would be unsafe to use this without knowing it's insensitive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm still this seems wrong

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove hashCode and equals. WDYT?

HpackHeaderField other = (HpackHeaderField) obj;
// To avoid short circuit behavior a bitwise operator is used instead of a boolean operator.
return (HpackUtil.equalsConstantTime(name, other.name) & HpackUtil.equalsConstantTime(value, other.value)) != 0;
throw new UnsupportedOperationException();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same... equals should never throw

@normanmaurer
Copy link
Member

@carl-mastrangelo do you have a benchmark by any chance?

@normanmaurer
Copy link
Member

@netty-bot test this please

Copy link
Member

@normanmaurer normanmaurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two questions

@carl-mastrangelo
Copy link
Member Author

The effect is not so pronounced, but it helps in the larger cases. My guess why insensitive is slower for small headers is that there are multiple lookups involved in the dynamic and static table. It only gets faster when the headers are already in the table.

BEFORE:

Benchmark                     (duplicates)  (limitToAscii)  (sensitive)  (size)  Mode  Cnt       Score       Error  Units
HpackEncoderBenchmark.encode          true            true         true   SMALL  avgt   10    2572.595 ±    16.184  ns/op
HpackEncoderBenchmark.encode          true            true         true  MEDIUM  avgt   10   19580.815 ±   397.780  ns/op
HpackEncoderBenchmark.encode          true            true         true   LARGE  avgt   10  379456.381 ±  2059.919  ns/op
HpackEncoderBenchmark.encode          true            true        false   SMALL  avgt   10     730.579 ±     8.116  ns/op
HpackEncoderBenchmark.encode          true            true        false  MEDIUM  avgt   10    2087.590 ±    84.644  ns/op
HpackEncoderBenchmark.encode          true            true        false   LARGE  avgt   10   11725.228 ±    89.298  ns/op
HpackEncoderBenchmark.encode          true           false         true   SMALL  avgt   10     555.971 ±     5.120  ns/op
HpackEncoderBenchmark.encode          true           false         true  MEDIUM  avgt   10    2831.874 ±    41.801  ns/op
HpackEncoderBenchmark.encode          true           false         true   LARGE  avgt   10   36054.025 ±   179.504  ns/op
HpackEncoderBenchmark.encode          true           false        false   SMALL  avgt   10     340.337 ±     3.313  ns/op
HpackEncoderBenchmark.encode          true           false        false  MEDIUM  avgt   10    1006.817 ±     8.942  ns/op
HpackEncoderBenchmark.encode          true           false        false   LARGE  avgt   10    8784.168 ±   164.014  ns/op
HpackEncoderBenchmark.encode         false            true         true   SMALL  avgt   10    2561.934 ±    27.056  ns/op
HpackEncoderBenchmark.encode         false            true         true  MEDIUM  avgt   10   22061.105 ±   154.533  ns/op
HpackEncoderBenchmark.encode         false            true         true   LARGE  avgt   10  435744.897 ±  8853.388  ns/op
HpackEncoderBenchmark.encode         false            true        false   SMALL  avgt   10    2737.683 ±    47.142  ns/op
HpackEncoderBenchmark.encode         false            true        false  MEDIUM  avgt   10   22385.146 ±    98.430  ns/op
HpackEncoderBenchmark.encode         false            true        false   LARGE  avgt   10  408159.698 ± 12044.931  ns/op
HpackEncoderBenchmark.encode         false           false         true   SMALL  avgt   10     544.213 ±     3.279  ns/op
HpackEncoderBenchmark.encode         false           false         true  MEDIUM  avgt   10    2908.978 ±    31.026  ns/op
HpackEncoderBenchmark.encode         false           false         true   LARGE  avgt   10   36471.262 ±  1044.010  ns/op
HpackEncoderBenchmark.encode         false           false        false   SMALL  avgt   10     609.305 ±     4.371  ns/op
HpackEncoderBenchmark.encode         false           false        false  MEDIUM  avgt   10    3223.946 ±    23.505  ns/op
HpackEncoderBenchmark.encode         false           false        false   LARGE  avgt   10   39975.152 ±   655.196  ns/op

AFTER:

Benchmark                     (duplicates)  (limitToAscii)  (sensitive)  (size)  Mode  Cnt       Score      Error  Units
HpackEncoderBenchmark.encode          true            true         true   SMALL  avgt   10    2580.732 ±   39.935  ns/op
HpackEncoderBenchmark.encode          true            true         true  MEDIUM  avgt   10   19963.569 ±  417.324  ns/op
HpackEncoderBenchmark.encode          true            true         true   LARGE  avgt   10  361053.507 ± 2141.450  ns/op
HpackEncoderBenchmark.encode          true            true        false   SMALL  avgt   10     774.761 ±    3.446  ns/op
HpackEncoderBenchmark.encode          true            true        false  MEDIUM  avgt   10    1601.366 ±   11.409  ns/op
HpackEncoderBenchmark.encode          true            true        false   LARGE  avgt   10    4852.643 ±   43.546  ns/op
HpackEncoderBenchmark.encode          true           false         true   SMALL  avgt   10     551.301 ±    3.420  ns/op
HpackEncoderBenchmark.encode          true           false         true  MEDIUM  avgt   10    2821.285 ±   11.290  ns/op
HpackEncoderBenchmark.encode          true           false         true   LARGE  avgt   10   35519.282 ±  194.778  ns/op
HpackEncoderBenchmark.encode          true           false        false   SMALL  avgt   10     251.308 ±    2.342  ns/op
HpackEncoderBenchmark.encode          true           false        false  MEDIUM  avgt   10     509.551 ±    3.531  ns/op
HpackEncoderBenchmark.encode          true           false        false   LARGE  avgt   10    1648.384 ±   17.924  ns/op
HpackEncoderBenchmark.encode         false            true         true   SMALL  avgt   10    2782.013 ±  151.077  ns/op
HpackEncoderBenchmark.encode         false            true         true  MEDIUM  avgt   10   21969.109 ±   63.417  ns/op
HpackEncoderBenchmark.encode         false            true         true   LARGE  avgt   10  415850.888 ± 1507.624  ns/op
HpackEncoderBenchmark.encode         false            true        false   SMALL  avgt   10    2699.697 ±   34.675  ns/op
HpackEncoderBenchmark.encode         false            true        false  MEDIUM  avgt   10   23074.582 ±  105.401  ns/op
HpackEncoderBenchmark.encode         false            true        false   LARGE  avgt   10  405572.113 ± 2954.830  ns/op
HpackEncoderBenchmark.encode         false           false         true   SMALL  avgt   10     543.590 ±    2.325  ns/op
HpackEncoderBenchmark.encode         false           false         true  MEDIUM  avgt   10    2943.186 ±   12.283  ns/op
HpackEncoderBenchmark.encode         false           false         true   LARGE  avgt   10   36131.091 ±  119.945  ns/op
HpackEncoderBenchmark.encode         false           false        false   SMALL  avgt   10     601.432 ±    5.279  ns/op
HpackEncoderBenchmark.encode         false           false        false  MEDIUM  avgt   10    3225.028 ±   46.043  ns/op
HpackEncoderBenchmark.encode         false           false        false   LARGE  avgt   10   39672.871 ±  585.702  ns/op

@carl-mastrangelo
Copy link
Member Author

@normanmaurer I am going to send a followup PR. Disabling the huffman coder makes this ridiculously faster, to the point where this PR makes only a minor difference.

@normanmaurer
Copy link
Member

@carl-mastrangelo so should we close this then ?

@carl-mastrangelo
Copy link
Member Author

@normanmaurer I think it will still have a minor benefit for large header values, just small headers won't see it. The two changes are independent, but the other is more important.

@normanmaurer
Copy link
Member

normanmaurer commented Jun 20, 2019 via email

@carl-mastrangelo
Copy link
Member Author

I removed the two methods. I also applied this change on top of the huffman change. There is still benefit for larger header values I think. Results are a little noisy, but still give insight:

DIFFBASE

Benchmark                     (duplicates)  (limitToAscii)  (sensitive)  (size)  Mode  Cnt     Score     Error  Units
HpackEncoderBenchmark.encode          true            true         true   SMALL  avgt    5   379.473 ± 133.815  ns/op
HpackEncoderBenchmark.encode          true            true         true  MEDIUM  avgt    5  1118.772 ±  89.258  ns/op
HpackEncoderBenchmark.encode          true            true         true   LARGE  avgt    5  5366.828 ±  89.746  ns/op
HpackEncoderBenchmark.encode          true            true        false   SMALL  avgt    5   284.401 ±   2.088  ns/op
HpackEncoderBenchmark.encode          true            true        false  MEDIUM  avgt    5   922.805 ±  10.796  ns/op
HpackEncoderBenchmark.encode          true            true        false   LARGE  avgt    5  8727.831 ± 462.138  ns/op
HpackEncoderBenchmark.encode          true           false         true   SMALL  avgt    5   337.093 ±  22.585  ns/op
HpackEncoderBenchmark.encode          true           false         true  MEDIUM  avgt    5   693.689 ±  16.351  ns/op
HpackEncoderBenchmark.encode          true           false         true   LARGE  avgt    5  5616.786 ±  98.647  ns/op
HpackEncoderBenchmark.encode          true           false        false   SMALL  avgt    5   286.708 ±  13.765  ns/op
HpackEncoderBenchmark.encode          true           false        false  MEDIUM  avgt    5   906.279 ±  32.338  ns/op
HpackEncoderBenchmark.encode          true           false        false   LARGE  avgt    5  8304.736 ± 128.584  ns/op
HpackEncoderBenchmark.encode         false            true         true   SMALL  avgt    5   351.381 ±  15.547  ns/op
HpackEncoderBenchmark.encode         false            true         true  MEDIUM  avgt    5  1188.166 ±   7.023  ns/op
HpackEncoderBenchmark.encode         false            true         true   LARGE  avgt    5  6876.009 ±  48.117  ns/op
HpackEncoderBenchmark.encode         false            true        false   SMALL  avgt    5   434.759 ±   8.619  ns/op
HpackEncoderBenchmark.encode         false            true        false  MEDIUM  avgt    5   954.588 ±  58.514  ns/op
HpackEncoderBenchmark.encode         false            true        false   LARGE  avgt    5  8534.017 ± 552.597  ns/op
HpackEncoderBenchmark.encode         false           false         true   SMALL  avgt    5   223.713 ±   4.823  ns/op
HpackEncoderBenchmark.encode         false           false         true  MEDIUM  avgt    5  1181.538 ±  11.851  ns/op
HpackEncoderBenchmark.encode         false           false         true   LARGE  avgt    5  6670.830 ± 267.927  ns/op
HpackEncoderBenchmark.encode         false           false        false   SMALL  avgt    5   424.609 ±  27.477  ns/op
HpackEncoderBenchmark.encode         false           false        false  MEDIUM  avgt    5  1003.578 ±  53.991  ns/op
HpackEncoderBenchmark.encode         false           false        false   LARGE  avgt    5  8428.932 ± 102.838  ns/op

AFTER

Benchmark                     (duplicates)  (limitToAscii)  (sensitive)  (size)  Mode  Cnt     Score     Error  Units
HpackEncoderBenchmark.encode          true            true         true   SMALL  avgt    5   359.901 ±  12.893  ns/op
HpackEncoderBenchmark.encode          true            true         true  MEDIUM  avgt    5  1070.761 ±  11.828  ns/op
HpackEncoderBenchmark.encode          true            true         true   LARGE  avgt    5  5340.499 ±  62.160  ns/op
HpackEncoderBenchmark.encode          true            true        false   SMALL  avgt    5   214.511 ±  11.789  ns/op
HpackEncoderBenchmark.encode          true            true        false  MEDIUM  avgt    5   424.725 ±  39.835  ns/op
HpackEncoderBenchmark.encode          true            true        false   LARGE  avgt    5  1391.647 ±  11.890  ns/op
HpackEncoderBenchmark.encode          true           false         true   SMALL  avgt    5   344.233 ±  32.231  ns/op
HpackEncoderBenchmark.encode          true           false         true  MEDIUM  avgt    5  1065.387 ±  22.257  ns/op
HpackEncoderBenchmark.encode          true           false         true   LARGE  avgt    5  5294.608 ± 197.011  ns/op
HpackEncoderBenchmark.encode          true           false        false   SMALL  avgt    5   211.384 ±  12.504  ns/op
HpackEncoderBenchmark.encode          true           false        false  MEDIUM  avgt    5   415.328 ±   3.515  ns/op
HpackEncoderBenchmark.encode          true           false        false   LARGE  avgt    5  1403.609 ±   3.463  ns/op
HpackEncoderBenchmark.encode         false            true         true   SMALL  avgt    5   332.837 ±   5.381  ns/op
HpackEncoderBenchmark.encode         false            true         true  MEDIUM  avgt    5  1160.590 ±  23.529  ns/op
HpackEncoderBenchmark.encode         false            true         true   LARGE  avgt    5  6855.142 ± 676.847  ns/op
HpackEncoderBenchmark.encode         false            true        false   SMALL  avgt    5   409.124 ±  38.120  ns/op
HpackEncoderBenchmark.encode         false            true        false  MEDIUM  avgt    5  1479.240 ±  24.127  ns/op
HpackEncoderBenchmark.encode         false            true        false   LARGE  avgt    5  7261.508 ± 106.516  ns/op
HpackEncoderBenchmark.encode         false           false         true   SMALL  avgt    5   344.280 ±   6.342  ns/op
HpackEncoderBenchmark.encode         false           false         true  MEDIUM  avgt    5  1174.367 ±  41.517  ns/op
HpackEncoderBenchmark.encode         false           false         true   LARGE  avgt    5  6193.716 ± 452.406  ns/op
HpackEncoderBenchmark.encode         false           false        false   SMALL  avgt    5   294.076 ±  13.072  ns/op
HpackEncoderBenchmark.encode         false           false        false  MEDIUM  avgt    5  1508.438 ±  21.544  ns/op
HpackEncoderBenchmark.encode         false           false        false   LARGE  avgt    5  8798.796 ± 126.874  ns/op

@normanmaurer
Copy link
Member

@netty-bot test this please

Copy link
Member

@ejona86 ejona86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can name lookups always use the constant equals and value lookups always use variable equals? Value lookups are where we'd expect the most gain from the variable equals.

@@ -145,7 +146,7 @@ static int getIndex(CharSequence name) {
* Returns the index value for the given header field in the static table. Returns -1 if the
* header field is not in the static table.
*/
static int getIndex(CharSequence name, CharSequence value) {
static int getIndexSensitive(CharSequence name, CharSequence value) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any usage of this. And I wouldn't expect any usage since sensitive header values won't use the HPACK table.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, no use, removed.

* Returns the lowest index value for the header field name in the dynamic table. Returns -1 if
* the header field name is not in the dynamic table.
*/
private int getIndexInsensitive(CharSequence name) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much does this buy us? I'd expect not much. Since we only pay the equals check if there is a hash collision, and these are header names which are short, it seems like there's not room for gain here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, but it's about code clarity. This method says it doesn't care about security, as opposed to getIndexSensitive.

Also in the case the header names change frequently, I'd expect there would be frequent collisions and (evictions).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ejona86 WDYT ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me the sensitive/nonsensitive split just is a bit more confusing. I'd have preferred dumping a comment on the previous getIndex: // use constant-time comparison since used with sensitive headers. And for the value look ups // not used for sensitive headers, so no need to use constant-time comparison.

If we really like the split names, then I'd just have getIndexInsensitive() call getIndexSensitive() in the implementation.

Header name changes doesn't have anything to do with frequency of collisions. Only the number of entries plays a part in collision rate. I still maintain that for header names the difference between constant- vs variable-time comparison is well below what we should care about. I'd much rather optimize for code clarity and size here by not duplicating the getIndex implementation.

@normanmaurer
Copy link
Member

@netty-bot test this please

@carl-mastrangelo
Copy link
Member Author

@normanmaurer I would like to get this in, if there aren't any other concerns. There are always going to be differences of opinion, so we need to decide on what is a blocker and what isn't. I feel that since this code is faster, correct, and doesn't pose a significant maintainability burden, it should be committed.

@normanmaurer
Copy link
Member

@carl-mastrangelo I am not strong on this... If @ejona86 is not "strongly" against it I think we can merge... @ejona86 WDYT ?

@ejona86
Copy link
Member

ejona86 commented Jul 1, 2019

Why are those both getNameIndexSensitive and getNameIndexInsensitive? It is in the name of performance, right? But there has not been any performance benefit shown from that change. Benefits are for the value change, not the name change.

So the name change should be dropped. The name change adds code for no benefit.

I don't care if it is renamed or the like. Avoiding the second implementation is all I'm concerned about.

@normanmaurer
Copy link
Member

@carl-mastrangelo ^^

@carl-mastrangelo
Copy link
Member Author

@normanmaurer @ejona86 PTAL

@carl-mastrangelo
Copy link
Member Author

@normanmaurer should be ready again

@CodingFabian
Copy link
Contributor

I can confirm a positive impact. While on a micro level it is very little, over time it does sum up, and seems to me as a sensible optimization.

@normanmaurer
Copy link
Member

@bryce-anderson can you take a look as well ?

@normanmaurer normanmaurer added this to the 4.1.43.Final milestone Oct 22, 2019
@normanmaurer
Copy link
Member

@CodingFabian thanks for confirming :)

@normanmaurer
Copy link
Member

@netty-bot test this please

Copy link
Contributor

@bryce-anderson bryce-anderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a comment to justify why we are now using *Insensitive methods would be helpful for future security-oriented readers.

Motivation:
Constant time comparison functions are used to compare HTTP/2 header
values, even if they are not sensitive.

Modification:
After checking for sensitivity, use fast comparison.

Result: Faster HPACK table reads/writes
@carl-mastrangelo
Copy link
Member Author

@bryce-anderson I added a class comment about it.

@normanmaurer any chance this can make it into 4.1.43 ?

@normanmaurer
Copy link
Member

@netty-bot test this please

@normanmaurer normanmaurer merged commit e8e7a20 into netty:4.1 Oct 24, 2019
@normanmaurer
Copy link
Member

@carl-mastrangelo merged... thanks :)

normanmaurer pushed a commit that referenced this pull request Oct 24, 2019
Motivation:
Constant time comparison functions are used to compare HTTP/2 header
values, even if they are not sensitive.

Modification:
After checking for sensitivity, use fast comparison.

Result: Faster HPACK table reads/writes
@carl-mastrangelo carl-mastrangelo deleted the insensitive branch October 24, 2019 06:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants