... | @@ -6,7 +6,7 @@ In several places we use a truncated URL-safe base 64 encoding of a SHA-512 chec |
SHA-512 is a [cryptographic hash function](https://en.wikipedia.org/wiki/Cryptographic_hash_function) specified in [NIST FIPS 180-4](http://dx.doi.org/10.6028/NIST.FIPS.180-4).

A cryptographic hash function is quick to compute and produces a relatively short result. Collisions (i.e., two different arguments that give the same value) are not only unlikely to occur by chance; because the function is cryptographic, they are also difficult to construct deliberately.

For this reason the SHA-512 checksum can be safely used to identify a long value (for example the content of a file) using just the short generated checksum.
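As a minimal sketch of this idea, the following Python snippet uses the standard `hashlib` module to compute a SHA-512 checksum and compare two values through their checksums. The helper name `sha512_checksum` and the sample byte strings are illustrative assumptions, not part of the original text.

```python
import hashlib


def sha512_checksum(data: bytes) -> bytes:
    """Return the raw 64-byte (512-bit) SHA-512 digest of data.

    Hypothetical helper name, chosen only for this illustration.
    """
    return hashlib.sha512(data).digest()


# Sample values (assumptions): two identical contents and one different.
content_a = b"the content of a file"
content_b = b"the content of a file"
content_c = b"the content of another file"

checksum_a = sha512_checksum(content_a)
checksum_b = sha512_checksum(content_b)
checksum_c = sha512_checksum(content_c)

# Equal contents always give equal checksums; different contents give
# different checksums except in the (practically negligible) collision case.
same_content = checksum_a == checksum_b
different_content = checksum_a != checksum_c
```

The checksum is always 64 bytes long regardless of the size of the input, which is what makes it usable as a short stand-in for the full value.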
... | @@ -16,14 +16,14 @@ As the name implies SHA 512 generates 512 bit checksums, which are relatively lo |
Luckily the construction of SHA-512 allows for easy truncation while maintaining the good cryptographic properties, as discussed in detail in section 5.1 of [SP 800-107](http://csrc.nist.gov/publications/nistpubs/800-107-rev1/sp800-107-rev1.pdf) "Truncated Message Digest".
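A hedged sketch of such a truncation in Python: the digest is computed in full and only its leftmost bytes are kept. The function name `truncated_sha512` and the length of 21 bytes (168 bits) are assumptions made for illustration.

```python
import hashlib


def truncated_sha512(data: bytes, n_bytes: int) -> bytes:
    """Return the leftmost n_bytes of the SHA-512 digest of data.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not 1 <= n_bytes <= 64:
        raise ValueError("n_bytes must be between 1 and 64")
    full_digest = hashlib.sha512(data).digest()
    # Truncation keeps the leftmost bytes and discards the rest.
    return full_digest[:n_bytes]


# Illustrative length (assumption): 21 bytes, i.e. 168 bits.
short_digest = truncated_sha512(b"some long value", 21)
```

A shorter truncation gives a more compact identifier at the cost of a higher collision probability, so the length should be chosen with the expected number of values in mind.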
## URL-safe Base 64 encoding
The SHA-512 checksum is a binary sequence that can contain bytes that are not valid characters, so to represent it as a string it has to be encoded. [Hex encoding](https://en.wikipedia.org/wiki/Hexadecimal) is often used, but it needs 2 characters (16 bits) to represent 8 bits, making the encoded value twice as long as the binary data.
Because we want the truncated value to be as short as possible, we use the URL-safe variant of [Base 64](https://en.wikipedia.org/wiki/Base64), which encodes values using alphanumeric characters plus '-' and '\_'.
This encodes 6 bits in each 8-bit character, making for shorter values.

For internal Gids (unique identifiers of the metadata) we use 28 characters, which correspond to 168 bits of the checksum.
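The encoding and truncation described above can be sketched in Python with the standard `hashlib` and `base64` modules. The function name `truncated_b64_checksum` is hypothetical, and the real Gid format may include details not covered here (a prefix, for instance); the sketch only illustrates that 28 URL-safe Base 64 characters carry the first 168 bits of the checksum.

```python
import base64
import hashlib


def truncated_b64_checksum(data: bytes, n_chars: int = 28) -> str:
    """URL-safe Base 64 encoding of the SHA-512 digest, cut to n_chars.

    Hypothetical helper; 28 characters * 6 bits = 168 bits of checksum.
    """
    digest = hashlib.sha512(data).digest()
    # urlsafe_b64encode uses '-' and '_' instead of '+' and '/'.
    encoded = base64.urlsafe_b64encode(digest).decode("ascii")
    return encoded[:n_chars]


sample = b"some metadata content"  # sample input (assumption)
short_id = truncated_b64_checksum(sample)

# Length comparison for the full 64-byte digest.
full_digest = hashlib.sha512(sample).digest()
hex_length = len(full_digest.hex())                       # 2 chars per byte
b64_length = len(base64.urlsafe_b64encode(full_digest))   # includes '=' padding
```

Because 168 bits is a whole number of bytes (21) and a whole number of Base 64 characters (28), truncating the encoded string gives the same result as encoding the first 21 bytes of the digest.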
# Conclusion

A truncated URL-safe Base 64 encoding of a SHA-512 checksum can be an effective way to create short unique values that depend only on longer values. Such values are reproducible, and their generation can be distributed because no central authority is needed. Depending on the truncation length, the collision probability ranges from unlikely to effectively impossible.