First off, your work here is very cool. Thank you!
The context for all of this is a version control system that is trying to determine the delta between two nodes (client and server initially, P2P nodes later).
Question 1
I want to make sure I understand the security considerations here. The README states:
> Although this library is the artifact of a research project,
> it is of relatively high quality and should be suitable for
> deployment in production systems where workload given
> to the library is trusted, i.e., not injected by malicious actors.
If we pick a fresh random key for each SipHash interaction, does that mitigate the risk of accepting payloads from malicious users? One remaining concern is denial of service through forced computation (re-hashing), but we can rate limit the endpoint to contain that. Are there other issues I am not aware of?
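To make the per-interaction key idea concrete, here is a minimal sketch of what I have in mind. `SessionKey` and `NewSessionKey` are hypothetical names of mine, not part of the library, and I am assuming the key is the usual 128-bit SipHash key split into two 64-bit halves. How the key would be passed into the library's hashing is exactly the part I am unsure about.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
)

// SessionKey is a hypothetical 128-bit SipHash key, split into two
// 64-bit halves. It is not a type from the library.
type SessionKey struct {
	K0, K1 uint64
}

// NewSessionKey draws a fresh key from the OS CSPRNG. The intent is to
// call it once per reconciliation so a remote party cannot precompute
// colliding inputs against a fixed key.
func NewSessionKey() (SessionKey, error) {
	var buf [16]byte
	if _, err := rand.Read(buf[:]); err != nil {
		return SessionKey{}, err
	}
	return SessionKey{
		K0: binary.LittleEndian.Uint64(buf[:8]),
		K1: binary.LittleEndian.Uint64(buf[8:]),
	}, nil
}
```

Both sides would need to agree on the key before any coded symbols are produced, so I assume it would have to be sent in the first request.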
Question 2
What actually needs to be sent over the wire? I see we have a coded symbol which contains a hashed symbol which contains a symbol.
I am curious about two scenarios:
- We want the remote party to know the values of the symbols that are missing. (e.g. Server: I am missing values A, B, C, D and you are missing values X, Y, Z.)
- We only want some index or identifier for each symbol from our set that the remote party is missing, plus the values of their symbols. (e.g. Server: I am missing indices 1, 2, 3, 4 and you are missing values X, Y, Z.) I am not sure this is possible, but it seems lighter to send just the uint64s rather than the full symbols.
Should I just be sending the following from the coded symbol and reconstructing it on the other side (assuming T is []byte)?

    type CodedTransport struct {
        Count  int64
        Hash   uint64
        Symbol []byte
    }
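For reference, this is roughly how I would serialize that struct. It is only a sketch of my own framing, not anything defined by the library: `Marshal`, `Unmarshal`, and the 20-byte big-endian header (count, hash, symbol length) are my assumptions.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// CodedTransport mirrors the three fields I believe must cross the wire.
type CodedTransport struct {
	Count  int64
	Hash   uint64
	Symbol []byte
}

// headerLen is 8 bytes of count, 8 bytes of hash, 4 bytes of symbol length.
const headerLen = 20

// Marshal encodes the struct as a fixed header followed by the symbol bytes.
func (c CodedTransport) Marshal() []byte {
	buf := make([]byte, headerLen+len(c.Symbol))
	binary.BigEndian.PutUint64(buf[0:8], uint64(c.Count))
	binary.BigEndian.PutUint64(buf[8:16], c.Hash)
	binary.BigEndian.PutUint32(buf[16:20], uint32(len(c.Symbol)))
	copy(buf[headerLen:], c.Symbol)
	return buf
}

// Unmarshal reverses Marshal and rejects truncated or oversized input.
func Unmarshal(buf []byte) (CodedTransport, error) {
	if len(buf) < headerLen {
		return CodedTransport{}, errors.New("coded transport: short header")
	}
	n := binary.BigEndian.Uint32(buf[16:20])
	if uint64(len(buf)-headerLen) != uint64(n) {
		return CodedTransport{}, errors.New("coded transport: length mismatch")
	}
	sym := make([]byte, n)
	copy(sym, buf[headerLen:])
	return CodedTransport{
		Count:  int64(binary.BigEndian.Uint64(buf[0:8])),
		Hash:   binary.BigEndian.Uint64(buf[8:16]),
		Symbol: sym,
	}, nil
}
```

The count is signed in my struct, so the sketch round-trips negative values through the `uint64` cast unchanged.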
Question 3
I plan to use this over HTTP initially which, unfortunately, does not allow for streaming symbols. Do you have any advice for this "one-shot" scenario? I was thinking of sending 100 symbols and, if decoding fails, retrying with 10x as many until we get a successful decode.
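The retry loop I have in mind looks roughly like the sketch below. `RetryWithGrowth` and `ErrBudgetExceeded` are hypothetical names, and the `attempt` callback stands in for one HTTP round trip that sends `n` coded symbols and reports whether the other side decoded successfully. Nothing here calls the library.

```go
package main

import "errors"

// ErrBudgetExceeded is returned when no attempt within the cap decodes.
var ErrBudgetExceeded = errors.New("reconcile: symbol budget exceeded")

// RetryWithGrowth calls attempt with start symbols, then start*factor,
// and so on, until attempt reports success or the next size would
// exceed max. It returns the symbol count of the successful attempt.
func RetryWithGrowth(start, factor, max int, attempt func(n int) (bool, error)) (int, error) {
	if start <= 0 || factor < 2 {
		return 0, errors.New("reconcile: start must be > 0 and factor >= 2")
	}
	for n := start; n <= max; {
		ok, err := attempt(n)
		if err != nil {
			return 0, err
		}
		if ok {
			return n, nil
		}
		// Stop before multiplying if the next size would pass the cap
		// (this also avoids integer overflow for very large caps).
		if n > max/factor {
			break
		}
		n *= factor
	}
	return 0, ErrBudgetExceeded
}
```

With `start = 100` and `factor = 10` this tries 100, 1000, 10000 symbols and so on; the cap is there so a peer with a huge difference cannot make the loop grow without bound.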