View on GitHub

PolyPasswordHasher

A Password hashing scheme that prevents an attacker from cracking passwords individually and efficiently.

Download this project as a .zip file Download this project as a tar.gz file

Password database disclosures have caused companies billions of dollars in damages. Typically a password server securely obtains a password from the user, performs a salted hash on the password, and then checks if that entry matches what is in the password database under that user name. Hackers have proven adept at stealing password databases from dozens of companies. An attacker with the password database can try different passwords for a user and check if there is a match.

PolyPasswordHasher stores password data in a different way (technical details). To understand the concept, imagine a flat piece of paper. Instead of storing the salted password hash, PolyPasswordHasher combines this value with a point on the piece of paper. All of the correct passwords when salted will produce points that lie on the same line. (For the cryptographically knowledgeable reader, the actual construct used is a Shamir Secret Share instead of points in a multi-dimensional space.) The information about this line is a secret that is not stored on disk. When an attacker tries to crack passwords, they will simultaneously guess multiple passwords (the line), which makes it amazingly hard to crack passwords.

When the server restarts, it also needs to recompute the line. However, through the normal login process, users provide many correct passwords and so the server finds the line without problems. This technique is very efficient for the server to compute and requires very little additional storage or memory; this is done only once by the server. For an attacker, however, recomputing a line must be done after each guess, making guessing really expensive. Better yet, client computers do not need to be changed in any way to use this. There are free, open source implementations available for PolyPasswordHasher. So hopefully this technique will be coming to your favorite service to protect your password!

News

21/04/17 We showcased PPH in the NYU tandon research expo. You can read more about it here.
20/04/16 A proof of concept of PPH with error-correcting secret sharing schemes is available now. These schemes allow for faster secret recovery for the server, while requiring the attackers to perform more computation. You can take a look at this implementation here.
10/05/15 We are now hosting the Java version of PolyPasswordHasher in our organziation. You can take a look at it by clicking here.
09/16/15 PolyPasswordHasher was featured on Hacker News! There were some good comments and discussion and people submitted some improvements to our documentation.
09/02/15 Our node node.js implementation is done. We are currently exploring ways to integrate it into passport as an authentication backend.
08/25/15 We are integrating yubikeys as an alternative authentication mechanism for PolyPassswordHasher. When protector accounts opt for yubikey hardware, shares are protected with a cryptographic key, increasing the security of all accounts in the database.
06/27/15 We are exploring the usage of error-correcting secret sharing schemes. We are in the process of developing a reference implementation using McEliece's error-correcting secret sharing scheme.

FAQ

"How hard is it to crack passwords stored using this technique?"

Suppose that three people have passwords that are each randomly chosen and 6 characters long. A typical laptop can crack those passwords in about 1 hour.

If you take the same passwords and protect them with PolyPasswordHasher, every computer on the planet working together cannot crack the password in 1 hour. In fact, to search the key space, it would take every computer on the planet longer than the universe is estimated to have existed.

"What about a service like Facebook or Gmail where anyone can register an account?"

It is possible to use accounts that contribute to the line (protector accounts) as a key to encrypt other account credentials (shielded accounts). So an attacker can know any number of those shielded accounts and cannot crack other shielded account or the protector accounts.

"What if an attacker figures out the sensitive information (line / Shamir Secret Share)?"

This would require an attacker to be able to read memory from a running server. While the communication between the client and server are typically encrypted, this only prevents an attacker on the network from eavesdropping on the password. The server typically obtains the client's password in plain-text and then performs the salted hash. So if an attacker can read memory on a running server, they can steal passwords regardless of how they are protected.

However, even if an attacker can recover the line / Shamir Secret Share, the account passwords are still salted and hashed. (Salting and hashing is the industry standard best practices for password protection today.) So even if an attacker can read memory, stored accounts still have protection equal to the best case today.

Fortunately, most prior password database breaches did not result in the attacker gaining access to memory on a running server. The most commonly disclosed cause of password database compromise is SQL injection, which does not imply the attacker can read arbitrary memory. So in many cases, an attacker that can steal the database does not have this access.

"If the attacker can't check individual accounts, how does the server check the first account after rebooting?"

The basic technique described here would require some number of users to provide correct login information before authorizing any of them. However, our paper discusses an extension that gets around this issue. You can leak a small amount of information about the password hashes to allow checking. Using the example above, this is similar to leaking the last few digits of the points. An attacker still has a huge number of things to guess, but the server can check and eliminate most wrong passwords right after reboot.

"Does this mean I can choose very weak passwords on sites that use PolyPasswordHasher?"

Extremely weak passwords are a bad idea. An attacker can guess poor passwords and just try them even if they do not have the database. So do not use them regardless of the storage technology.

Another thing PolyPasswordHasher does not protect against is password reuse. Do not use the same password on multiple sites no matter how strong it is! We recommend using a password manager that generates a separate, strong password per site.

"What secure hash algorithm does PolyPasswordHasher use?"

Any secure hashing algorithm can be used with PolyPasswordHasher. The Python reference implementation uses SHA256.

"Where can I get an implementation of PolyPasswordHasher?"

There is a basic reference implementation for Python that is written for readability. There are also implementations in C and Ruby available.

We are currently testing a Django implementation that can be easily deployed.

We would love to see a PHP implementation for common frameworks because many password hash breaches occur on these systems.

Please contact us if you have an implementation of PolyPasswordHasher that you would like us to list.

"How do I get more information about PolyPasswordHasher?"

Our paper should be a good starting point. If you have a question that is not covered there, feel free to email our mailing list.