Security Researchers Engineered a SHA-1 Collision, Which I Will Try to Explain


This morning, security researchers working with Google announced that they have successfully created a technique for engineering a collision in the SHA-1 hash function, potentially endangering systems and web services that still use SHA-1 for security purposes. Nobody’s in any immediate danger, but the breakthrough demonstrates how even the measures we once thought impenetrable can be cracked by anyone with enough computing power.

Hash functions like SHA-1 take a file as input and generate a fixed-length string of characters from its contents, a one-of-a-kind fingerprint of sorts. That string of characters, known as a hash, helps determine file integrity. An unaltered file should generate the same hash every time it is run through the function. If you were sending sensitive information, you might use a hash to verify that what was transmitted had not been tampered with. Web log-in systems, for instance, store and compare hashes of passwords, in order to check them without exposing the password itself.
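To make the fingerprint idea concrete, here is a minimal sketch using Python's standard `hashlib` module. The helper name `sha1_hex` and the two sample messages are made up for illustration; they are not from the researchers' announcement.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 hash of `data` as 40 hexadecimal characters."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical messages, chosen only for this example.
original = b"Transfer $100 to Alice"
tampered = b"Transfer $900 to Alice"

# The same input always produces the same hash.
print(sha1_hex(original) == sha1_hex(original))  # True

# Changing a single character produces a different hash,
# which is how a receiver can spot tampering.
print(sha1_hex(original) == sha1_hex(tampered))  # False
```

Comparing the hash you computed against the hash the sender published is the whole integrity check; it only works as long as nobody can produce a second file with the same hash.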

Most importantly, different inputs should generate different hashes. When different inputs have the same hash, that’s known as a collision.
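A real SHA-1 collision is far out of reach for a laptop, but you can watch one happen on a deliberately weakened version. The sketch below keeps only the first 6 hex characters (24 bits) of each SHA-1 hash and searches for two different inputs that share that shortened value. The function names and the `message-N` inputs are hypothetical, and this is plain trial and error, not the researchers' technique.

```python
import hashlib


def truncated_sha1(data: bytes, hex_chars: int = 6) -> str:
    """Return only the first `hex_chars` hex characters of the SHA-1 hash."""
    return hashlib.sha1(data).hexdigest()[:hex_chars]


def find_toy_collision(hex_chars: int = 6):
    """Try inputs one by one until two of them share a truncated hash."""
    seen = {}  # truncated hash -> the first message that produced it
    counter = 0
    while True:
        message = f"message-{counter}".encode()
        tag = truncated_sha1(message, hex_chars)
        if tag in seen:
            return seen[tag], message, tag
        seen[tag] = message
        counter += 1


first, second, tag = find_toy_collision()
print(first, second, tag)
```

The two messages collide on the shortened hash while their full 160-bit SHA-1 hashes still differ. Every extra bit kept in the hash makes the search longer, which is why the full-length function was considered safe for so long.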

Collisions are very, very rare, and nearly impossible to create intentionally. SHA-1 produces a 160-bit hash, which means 2^160 possible outputs (a 49-digit number), so many that finding a collision by brute force (going through possibilities one by one) was considered impossible because it would take so long, even using a computer. Even the researchers’ shortcut took 9,223,372,036,854,775,808 SHA-1 computations (that’s 2^63, one more than the maximum value of a 64-bit signed integer, which I’m sure needs no explanation). But times change, and the SHA-1 algorithm is outdated now.
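For a rough sense of scale, the arithmetic can be checked in a few lines of Python. SHA-1 hashes are 160 bits long, and a generic brute-force collision search needs on the order of 2^80 attempts (the "birthday" estimate, the square root of the output space). The 9,223,372,036,854,775,808 figure is the number of SHA-1 computations the researchers reported for their attack. The rate of one billion hashes per second is purely an assumption for illustration, not a measured number.

```python
SHA1_BITS = 160

# Number of distinct hashes SHA-1 can produce.
total_outputs = 2 ** SHA1_BITS

# Generic brute-force collision search: roughly the square root of the output space.
birthday_work = 2 ** (SHA1_BITS // 2)

# SHA-1 computations reported for the researchers' attack (2^63).
reported_work = 9_223_372_036_854_775_808

# Largest value a 64-bit signed integer can hold, for comparison.
INT64_MAX = 2 ** 63 - 1

# ASSUMPTION: a made-up machine computing one billion hashes per second.
ASSUMED_HASHES_PER_SECOND = 1_000_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_needed(work: int, rate: int = ASSUMED_HASHES_PER_SECOND) -> float:
    """Years to perform `work` hash computations at `rate` hashes per second."""
    return work / rate / SECONDS_PER_YEAR


print(f"possible outputs: {total_outputs:,}")
print(f"attack speed-up over brute force: {birthday_work // reported_work:,}x")
print(f"brute force at the assumed rate: {years_needed(birthday_work):,.0f} years")
print(f"reported attack at the assumed rate: {years_needed(reported_work):,.0f} years")
```

The speed-up works out to 2^17, or 131,072 times fewer computations than brute force, which is what turns "impossible" into "expensive."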

Marc Stevens and Pierre Karpman of CWI Amsterdam, and Elie Bursztein, Ange Albertini, and Yarik Markov of Google were able to generate a collision using Google’s resources, concluding that “a well-funded attacker can craft a collision. The attacker could then use this collision to deceive systems that rely on hashes into accepting a malicious file in place of its benign counterpart.”

“Well-funded” is the key phrase here. The attack required “6,500 years of CPU computation to complete the attack first phase” and “110 years of GPU computation to complete the second phase.” That requires a lot of hardware resources, though it’s hardly unobtainable.

Also worth noting is that SHA-1 is not the most in-vogue cryptographic hash function. SHA-256 and SHA-3 are more current and secure algorithms that haven’t been broken (yet), and most leading companies that take security seriously are already using them. All of the mainstream web browsers already warn users visiting websites that rely on SHA-1, or soon will. Chrome and the rest of Google’s products have already been upgraded, and Firefox will also soon flag SHA-1 protection as insecure. The team’s method for generating the collision won’t be publicized for 90 days, and in the meantime, that gap should encourage anyone still using SHA-1 to upgrade to a more secure option.
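In code, moving off SHA-1 is often as small a change as naming a different algorithm. The sketch below uses Python's standard `hashlib` module to hash the same input with SHA-1, SHA-256, and SHA3-256; the `digest_bits` helper and the sample data are hypothetical, and a real migration also has to deal with stored hashes, certificates, and anything else that depends on the old values.

```python
import hashlib


def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the length, in bits, of the hash `algorithm` produces for `data`."""
    hex_digest = hashlib.new(algorithm, data).hexdigest()
    return len(hex_digest) * 4  # each hex character encodes 4 bits


# Hypothetical input, chosen only for this example.
data = b"the same file contents"

for algorithm in ("sha1", "sha256", "sha3_256"):
    print(algorithm, digest_bits(algorithm, data), hashlib.new(algorithm, data).hexdigest())
```

The newer functions produce 256-bit hashes instead of 160-bit ones, so there is far more room before collisions become practical.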

So is there anything to worry about? Not really. But the breakthrough is a reminder that systems that we once considered impenetrable are quickly falling victim to faster and faster processing power. And, of course, this team is only the first to publicize their findings. You can pretty much assume that government agencies like the NSA have been hacking away at this, and the more secure hash functions previously mentioned, for years. Full cybersecurity is a moving target that we can’t ever expect to hit.
