
Checksum

A checksum is a data block used to detect errors in digital data during transmission or storage, generated by a checksum function or algorithm. While checksums verify data integrity, they do not ensure data authenticity and can be implemented in various algorithms such as parity checks, sum complements, and fuzzy checksums. Effective checksum algorithms aim to minimize the likelihood of undetected errors by spreading valid data representations across a larger set of possible messages.

Uploaded by

SUM Dev

Checksum

A checksum is a small-sized block of data derived from another block of digital data for the purpose
of detecting errors that may have been introduced during its transmission or storage. By themselves,
checksums are often used to verify data integrity but are not relied upon to verify data
authenticity.[1]

The procedure which generates this checksum is called a checksum function or checksum algorithm.
Depending on its design goals, a good checksum algorithm usually outputs a significantly different
value, even for small changes made to the input.[2] This is especially true of cryptographic hash
functions, which may be used to detect many data corruption errors and verify overall data integrity;
if the computed checksum for the current data input matches the stored value of a previously computed
checksum, there is a very high probability the data has not been accidentally altered or corrupted.

[Figure: Effect of a typical checksum function (the Unix cksum utility)]
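
The comparison of a recomputed checksum against a stored one can be sketched in Python as follows. This is only an illustration: the choice of SHA-256 (from the standard hashlib module), the sample data, and the helper name checksum are assumptions and do not come from the source.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string (illustrative choice)."""
    return hashlib.sha256(data).hexdigest()

original = b"The quick brown fox jumps over the lazy dog"
stored = checksum(original)  # computed earlier and kept alongside the data

# Later: recompute the checksum and compare it with the stored value.
print(checksum(original) == stored)   # True: the data is unchanged

# A one-character change produces a different digest, so the mismatch is detected.
corrupted = b"The quick brown fox jumps over the lazy cog"
print(checksum(corrupted) == stored)  # False
```

A match only indicates that accidental corruption is very unlikely; it says nothing about who produced the data.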

Checksum functions are related to hash functions, fingerprints, randomization functions, and
cryptographic hash functions. However, each of those concepts has different applications and therefore
different design goals. For instance, a function returning the start of a string can provide a hash
appropriate for some applications but will never be a suitable checksum. Checksums are used as
cryptographic primitives in larger authentication algorithms. For cryptographic systems designed to
verify both data integrity and data authenticity, see HMAC.
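
As a hedged illustration of how a keyed construction differs from a plain checksum, the sketch below uses Python's standard hmac module with SHA-256. The key, the message, and the helper name verify are hypothetical values chosen for the example, not part of the source.

```python
import hashlib
import hmac

key = b"shared-secret"                    # hypothetical key known to both parties
message = b"transfer 100 to account 42"   # hypothetical message

# The sender computes a tag that depends on both the key and the message.
tag = hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

print(verify(key, message, tag))                          # True
print(verify(key, b"transfer 900 to account 42", tag))    # False: message altered
print(verify(b"wrong-key", message, tag))                 # False: wrong key
```

Unlike a plain checksum, the tag cannot be recomputed by someone who modifies the message without knowing the key.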

Check digits and parity bits are special cases of checksums, appropriate for small blocks of data (such as
Social Security numbers, bank account numbers, computer words, single bytes, etc.). Some error-
correcting codes are based on special checksums which not only detect common errors but also allow the
original data to be recovered in certain cases.

Algorithms

Parity byte or parity word


The simplest checksum algorithm is the so-called longitudinal parity check, which breaks the data into
"words" with a fixed number n of bits, and then computes the bitwise exclusive or (XOR) of all those
words. The result is appended to the message as an extra word. In simpler terms, for n=1 this means
adding a bit to the end of the data bits to guarantee that there is an even number of '1's. To check the
integrity of a message, the receiver computes the bitwise exclusive or of all its words, including the
checksum; if the result is not a word consisting of n zeros, the receiver knows a transmission error
occurred.[3]

With this checksum, any transmission error which flips a single bit of the message, or an odd number of
bits, will be detected as an incorrect checksum. However, an error that affects two bits will not be
detected if those bits lie at the same position in two distinct words. Also swapping of two or more words
will not be detected. If the affected bits are independently chosen at random, the probability of a two-bit
error being undetected is 1/n.
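
The longitudinal parity check and its blind spots can be sketched as follows, assuming 8-bit words (n = 8). The function names and sample values are illustrative only.

```python
from functools import reduce
from operator import xor

def parity_word(words):
    """Bitwise XOR of all words; with 8-bit words this is a parity byte."""
    return reduce(xor, words, 0)

def parity_ok(words_with_checksum):
    """A frame is accepted if the XOR of all words, checksum included, is zero."""
    return parity_word(words_with_checksum) == 0

message = [0x12, 0x34, 0x56]
frame = message + [parity_word(message)]      # append the checksum word

single_flip = [0x13, 0x34, 0x56, frame[3]]    # bit 0 flipped in one word
double_flip = [0x13, 0x35, 0x56, frame[3]]    # bit 0 flipped in two words
swapped     = [0x34, 0x12, 0x56, frame[3]]    # two words exchanged

print(parity_ok(frame))        # True
print(parity_ok(single_flip))  # False: single-bit error detected
print(parity_ok(double_flip))  # True: same-position two-bit error goes undetected
print(parity_ok(swapped))      # True: reordering goes undetected
```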

Sum complement
A variant of the previous algorithm is to add all the "words" as unsigned binary numbers, discarding any
overflow bits, and append the two's complement of the total as the checksum. To validate a message, the
receiver adds all the words in the same manner, including the checksum; if the result is not a word full of
zeros, an error must have occurred. This variant, too, detects any single-bit error. The modular sum
is used in SAE J1708.[4]
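
A minimal sketch of the sum complement, assuming 8-bit words by default; the function names and the bits parameter are illustrative and not taken from SAE J1708 or any other standard.

```python
def sum_complement(words, bits=8):
    """Two's complement of the modular sum of the words (overflow bits discarded)."""
    mask = (1 << bits) - 1
    total = sum(words) & mask      # add as unsigned numbers, drop overflow
    return (-total) & mask         # two's complement within the word size

def sum_ok(words_with_checksum, bits=8):
    """A frame is accepted if all words, checksum included, sum to zero."""
    mask = (1 << bits) - 1
    return (sum(words_with_checksum) & mask) == 0

message = [0x12, 0x34, 0x56]
check = sum_complement(message)
print(hex(check))                        # 0x64
print(sum_ok(message + [check]))         # True
print(sum_ok([0x13, 0x34, 0x56, check])) # False: single-bit error detected
print(sum_ok([0x34, 0x12, 0x56, check])) # True: reordering is still undetected
```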

Position-dependent
The simple checksums described above fail to detect some common errors which affect many bits at
once, such as changing the order of data words, or inserting or deleting words with all bits set to zero.
The checksum algorithms most used in practice, such as Fletcher's checksum, Adler-32, and cyclic
redundancy checks (CRCs), address these weaknesses by considering not only the value of each word but
also its position in the sequence. This feature generally increases the cost of computing the checksum.
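
To illustrate position dependence, the sketch below implements the common 16-bit variant of Fletcher's checksum (two running sums modulo 255) and compares it with a plain XOR parity byte on reordered input. It is a simplified example under those assumptions, not a reference implementation.

```python
from functools import reduce
from operator import xor

def fletcher16(data: bytes) -> int:
    """Fletcher-16: sum1 tracks byte values, sum2 accumulates sum1 at every
    step, so each byte's contribution to sum2 depends on its position."""
    sum1 = 0
    sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1

def xor_parity(data: bytes) -> int:
    """Position-independent parity byte, for comparison."""
    return reduce(xor, data, 0)

print(hex(fletcher16(b"abcde")))                 # 0xc8f0
print(xor_parity(b"ab") == xor_parity(b"ba"))    # True: swap not detected
print(fletcher16(b"ab") == fletcher16(b"ba"))    # False: swap detected
```

The second running sum is the extra work that makes the result order-sensitive, which is the added computing cost mentioned above.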

Fuzzy checksum
The idea of fuzzy checksum was developed for detection of email spam by building up cooperative
databases from multiple ISPs of email suspected to be spam. The content of such spam may often vary in
its details, which would render normal checksumming ineffective. By contrast, a "fuzzy checksum"
reduces the body text to its characteristic minimum, then generates a checksum in the usual manner. This
greatly increases the chances of slightly different spam emails producing the same checksum. The ISP
spam detection software, such as SpamAssassin, of co-operating ISPs, submits checksums of all emails to
the centralised service such as DCC. If the count of a submitted fuzzy checksum exceeds a certain
threshold, the database notes that this probably indicates spam. ISP service users similarly generate a
fuzzy checksum on each of their emails and request the service for a spam likelihood.[5]
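
The general idea can be sketched as follows. The normalization rule (lowercase, keep letters only), the use of SHA-256, the threshold value, and the class name FuzzyDatabase are all assumptions made for illustration; this is not the algorithm used by DCC, iXhash, or SpamAssassin.

```python
import hashlib
import re
from collections import Counter

def fuzzy_checksum(body: str) -> str:
    """Reduce the text to a crude 'characteristic minimum' (letters only,
    lowercased), then checksum it in the usual manner."""
    reduced = re.sub(r"[^a-z]+", "", body.lower())
    return hashlib.sha256(reduced.encode("ascii")).hexdigest()

class FuzzyDatabase:
    """Hypothetical central service that counts submitted fuzzy checksums."""

    def __init__(self, threshold: int = 2):
        self.threshold = threshold
        self.counts = Counter()

    def submit(self, body: str) -> None:
        self.counts[fuzzy_checksum(body)] += 1

    def probably_spam(self, body: str) -> bool:
        return self.counts[fuzzy_checksum(body)] > self.threshold

db = FuzzyDatabase(threshold=2)
for variant in ("Buy CHEAP pills now!!! Order #1234",
                "buy cheap pills now. order #9876",
                "Buy cheap pills NOW, order #5555"):
    db.submit(variant)

print(db.probably_spam("BUY cheap pills now; order #0001"))  # True
print(db.probably_spam("Meeting moved to 3 pm tomorrow"))     # False
```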

General considerations
A message that is m bits long can be viewed as a corner of the m-dimensional hypercube. The effect of a
checksum algorithm that yields an n-bit checksum is to map each m-bit message to a corner of a larger
hypercube, with dimension m + n. The 2^(m+n) corners of this hypercube represent all possible received
messages. The valid received messages (those that have the correct checksum) comprise a smaller set,
with only 2^m corners.
A single-bit transmission error then corresponds to a displacement from a valid corner (the correct
message and checksum) to one of the m + n adjacent corners. An error which affects k bits moves the
message to a corner which is k steps removed from its correct corner. The goal of a good checksum
algorithm is to spread the valid corners as far from each other as possible, to increase the likelihood
that "typical" transmission errors will end up in an invalid corner.
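
The hypercube picture can be checked by brute force for a tiny case. The sketch below assumes m = 3 message bits and a single even-parity bit as the checksum (n = 1); it counts the valid corners and measures how far apart they are. The names and the choice of parity are illustrative.

```python
from itertools import combinations, product

def codeword(bits):
    """Append an even-parity bit to an m-bit message (so n = 1)."""
    return bits + (sum(bits) % 2,)

def hamming(a, b):
    """Number of bit positions in which two corners differ."""
    return sum(x != y for x, y in zip(a, b))

m, n = 3, 1
all_corners = list(product((0, 1), repeat=m + n))
valid = [codeword(bits) for bits in product((0, 1), repeat=m)]
min_distance = min(hamming(a, b) for a, b in combinations(valid, 2))

print(len(all_corners))  # 16, i.e. 2^(m+n) possible received messages
print(len(valid))        # 8, i.e. 2^m valid messages
print(min_distance)      # 2: any single-bit error lands on an invalid corner
```

With a minimum distance of 2, every one-step move from a valid corner is invalid, but some two-step moves reach another valid corner, which matches the undetected two-bit errors of the parity check.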

See also
General topic:
Algorithm
Check digit
Damm algorithm
Data rot
File verification
Fletcher's checksum
Frame check sequence
cksum
md5sum
sha1sum
Parchive
Sum (Unix)
SYSV checksum
BSD checksum
xxHash

Error correction:
Hamming code
Reed–Solomon error correction
IPv4 header checksum

Hash functions:
List of hash functions
Luhn algorithm
Parity bit
Rolling checksum
Verhoeff algorithm

File systems:
Bcachefs, Btrfs, ReFS and ZFS – file systems that perform automatic file integrity checking using checksums

Related concepts:
Isopsephy
Gematria
File fixity

References
1. "Definition of CHECKSUM" (https://www.merriam-webster.com/dictionary/checksum). Merriam-Webster. Archived (https://web.archive.org/web/20220310132715/https://www.merriam-webster.com/dictionary/checksum) from the original on 2022-03-10. Retrieved 2022-03-10.
2. Hoffman, Chris (30 September 2019). "What Is a Checksum (and Why Should You Care)?" (https://www.howtogeek.com/363735/what-is-a-checksum-and-why-should-you-care/). How-To Geek. Archived (https://web.archive.org/web/20220309114058/https://www.howtogeek.com/363735/what-is-a-checksum-and-why-should-you-care/) from the original on 2022-03-09. Retrieved 2022-03-10.
3. Fairhurst, Gorry (2014). "Checksums & Integrity Checks" (https://erg.abdn.ac.uk/users/gorry/eg3576/checksums.html). Archived (https://web.archive.org/web/20220408011213/https://erg.abdn.ac.uk/users/gorry/eg3576/checksums.html) from the original on April 8, 2022. Retrieved March 11, 2022.
4. "SAE J1708" (https://web.archive.org/web/20131211152639/http://www.kvaser.com/zh/about-can/related-protocols-and-standards/50.html). Kvaser.com. Archived from the original (http://www.kvaser.com/zh/about-can/related-protocols-and-standards/50.html) on 11 December 2013.
5. "IXhash" (https://cwiki.apache.org/confluence/display/spamassassin/iXhash). Apache. Archived (https://web.archive.org/web/20200831125801/https://cwiki.apache.org/confluence/display/spamassassin/iXhash) from the original on 31 August 2020. Retrieved 7 January 2020.

Further reading
Koopman, Philip; Driscoll, Kevin; Hall, Brendan (March 2015). "Cyclic Redundancy Code and Checksum Algorithms to Ensure Critical Data Integrity" (http://www.tc.faa.gov/its/worldpac/techrpt/tc14-49.pdf) (PDF). Federal Aviation Administration. DOT/FAA/TC-14/49. Archived (https://web.archive.org/web/20150518073214/http://www.tc.faa.gov/its/worldpac/techrpt/tc14-49.pdf) (PDF) from the original on 2015-05-18.
Koopman, Philip (2023). "Large-Block Modular Addition Checksum Algorithms". arXiv:2302.13432 (https://arxiv.org/abs/2302.13432) [cs.DS (https://arxiv.org/archive/cs.DS)].

External links
Additive Checksums (C) (http://www.netrino.com/Embedded-Systems/How-To/Additive-Checksums) theory from Barr Group
Practical Application of Cryptographic Checksums (http://www.peterjockisch.de/texte/computerartikel/Kryptographische-Pruefsummen/Kryptographische-Pruefsummen_EN.html) *A4 (https://www.peterjockisch.de/Checksums_A4.pdf) *US-Letter (https://www.peterjockisch.de/Checksums_US-Letter.pdf) *US-Letter two-column (https://www.peterjockisch.de/Checksums_US-Letter_2.pdf)
Checksum Calculator (https://codebeautify.org/checksum-calculator)
Open source python based application with GUI used to verify downloads. (https://github.com/CRPrinzler/HASH-verify)

Retrieved from "https://en.wikipedia.org/w/index.php?title=Checksum&oldid=1290925990"
