How to choose the right password

Passwords are vital, but how should we select them?

Passwords seem to be the modern version of the medieval hairshirt.

They seem to exist as an irritant to today's online life. You want access to your PC? Password, please. You want to add a Facebook status? Password! You want to check your bank account online? Password needed!

So, how do you create good ones? In fact, what are good ones? How do you remember them? How can you reduce the irritation?

In order to authenticate yourself to the systems you use every day – to prove to them that you are who you say you are – you use a password. This password, in theory anyway, is known only to yourself and the system you are trying to access – be it Facebook, Twitter, your bank, your email, your blog or anything else. It is a secret not to be revealed to third parties.

There is another essential piece to the authentication puzzle – your username – but this is generally your email address or your name in some concatenated form, and is easily discoverable. Your password is therefore the 'open sesame' that reveals everything about you. How can you make sure that your privacy remains intact and that the secret persists?

Let's approach the question from the viewpoint of a black hat hacker who wants to impersonate you on some system. To raise the stakes, let's assume that the system is your bank and the hacker wants to test your credit limit. How can he get your password?

Watch and learn

The first way is the simplest: he watches you as you type in your password. That way it doesn't matter how strong or weak your password is; the hacker just watches you enter it. I'm going to assume that you'd be aware of someone watching over your shoulder, so the question becomes: how else could a hacker 'watch' you?

Back in March 2011, RSA (producer of the SecurID systems used by corporations and the US Department of Defense) was hacked. Someone managed to gain access to internal systems and networks and steal secrets pertaining to the SecurID two-factor authentication key.

A couple of months later, the stolen information was used in an attempt to hack into Lockheed Martin, a defence contractor that relies on SecurID. How was the RSA breach carried out? Simple – it was a phishing attack.

An email purporting to be about 2011 recruitment plans and containing an Excel spreadsheet was sent to several low-profile staff members at RSA, seemingly from a recruitment agency. The spreadsheet contained an embedded Adobe Flash object that in turn contained a zero-day vulnerability. Once the spreadsheet was opened, this malware installed a backdoor onto the machine, which gave the attackers access to the PC and the network.

At that point all bets are off. The attacker could install a keylogger and track exactly what you type at login screens – there goes a password. Even worse, they could download your system password files (those used by the Windows Security Accounts Manager) and then crack them with a program like Ophcrack, which uses techniques like rainbow tables to reverse the hashed login data. There go all your passwords.

In fact, that last scenario brings up the whole subject of cracking passwords. There are two stages: guessing the password using some algorithm – usually brute force, trying every possible combination of characters – and then validating the guess against the system being hacked.
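The two stages can be sketched in a few lines of Python. This is a minimal illustration rather than a real cracking tool: the function names are made up for this example, the 'system' is assumed to be nothing more than an unsalted SHA-256 hash held in memory, and the alphabet is kept tiny so that the search finishes instantly.

```python
import hashlib
from itertools import product

def guesses(alphabet, max_length):
    """Stage 1: generate every candidate password, shortest first."""
    for length in range(1, max_length + 1):
        for chars in product(alphabet, repeat=length):
            yield "".join(chars)

def is_valid(candidate, target_hash):
    """Stage 2: validate a candidate (assumption: an unsalted SHA-256 hash)."""
    return hashlib.sha256(candidate.encode("utf-8")).hexdigest() == target_hash

def crack(target_hash, alphabet, max_length):
    """Try every candidate in turn; return the match, or None if there isn't one."""
    for candidate in guesses(alphabet, max_length):
        if is_valid(candidate, target_hash):
            return candidate
    return None

target = hashlib.sha256(b"cab").hexdigest()
print(crack(target, "abc", 3))  # prints: cab
```

With three letters and a maximum length of three there are only 39 candidates to try. Every extra character, in either the alphabet or the password, multiplies the work.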

The issue with validating passwords is that many systems have built-in safeguards. Generally you only get so many attempts at trying a password before the system locks out the account being tried. Sometimes the system will also deliberately delay resetting the login screen by a few seconds to make trying many passwords extremely slow.
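As a rough sketch of those two safeguards, here is a toy login check in Python. The class name, the threshold of three attempts and the three-second delay are all assumptions made for the example; the delay is only added up rather than actually waited for, and a real system would store a salted hash instead of the password itself.

```python
import hmac

class ThrottledLogin:
    """Toy login check with a lockout threshold and a per-failure delay."""

    def __init__(self, password, max_attempts=5, delay_seconds=3.0):
        # Simplification: a real system would keep a salted hash, not the password.
        self._password = password.encode("utf-8")
        self.max_attempts = max_attempts
        self.delay_seconds = delay_seconds
        self.failures = 0
        self.locked = False
        self.total_delay = 0.0  # simulated waiting time; nothing actually sleeps

    def attempt(self, guess):
        if self.locked:
            return "locked"
        if hmac.compare_digest(guess.encode("utf-8"), self._password):
            self.failures = 0
            return "ok"
        self.failures += 1
        self.total_delay += self.delay_seconds
        if self.failures >= self.max_attempts:
            self.locked = True
        return "denied"

login = ThrottledLogin("correct horse", max_attempts=3)
for guess in ["123456", "password", "qwerty", "correct horse"]:
    print(guess, "->", login.attempt(guess))
```

The first three guesses are denied, and the fourth is refused even though it is correct, because the account locked after the third failure.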

Note that a standalone Windows 7 machine has account lockout disabled by default, whereas a PC on a corporate network might have it enabled. If what is being attacked is a file rather than a live system – say the victim is using a password manager and the hacker has managed to capture the password file – the hacker's job is made much easier.

In essence, the online safeguards (limited number of password attempts, delay between attempts) are no longer in play and the hacker has free rein to try as many passwords as they like as quickly as possible. This is where the strength of the password comes into play.
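To get a feel for the difference, here is a back-of-the-envelope calculation in Python. Both guess rates are assumptions picked purely for illustration: one attempt every three seconds for a throttled online login, and a billion attempts per second against a captured file.

```python
def worst_case_seconds(alphabet_size, length, guesses_per_second):
    """Time needed to try every password of exactly `length` characters."""
    return alphabet_size ** length / guesses_per_second

ONLINE_RATE = 1 / 3   # assumed: one guess every three seconds due to the delay
OFFLINE_RATE = 1e9    # assumed: a billion guesses per second against a file

SECONDS_PER_YEAR = 365 * 24 * 3600

for name, rate in [("online", ONLINE_RATE), ("offline", OFFLINE_RATE)]:
    years = worst_case_seconds(26, 8, rate) / SECONDS_PER_YEAR
    print(f"{name}: {years:.6g} years to try every 8-letter lowercase password")
```

The absolute numbers depend entirely on the assumed rates, but the gap between the two cases is the point: the same password that is out of reach through a login screen can fall quickly once the attacker holds the file.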

Strength in numbers

When we access a new resource for which we have to create a password, we're generally given some guidelines for creating a strong password and discouraged from using weak ones. The guidelines usually include making passwords longer than some defined minimum (say, eight characters), not using normal words, using upper and lower case letters, and using numbers and punctuation symbols.

With luck, the screen where you enter your new password will have some kind of visual cue to show how good it is, like a progress bar coloured from red (bad) to green (good). The worst systems are those that limit your password to a low character count, restrict the characters used to just lowercase letters and digits, and so on. Such restrictions will automatically produce weak passwords.

The strength of a password is measured by its entropy, expressed as a number of bits. The more bits of entropy a password has, the harder it will be to crack.
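For a password whose characters are each chosen independently and uniformly at random, the entropy is simply the length multiplied by the base-2 logarithm of the number of possible characters. The sketch below assumes exactly that; passwords chosen by people are far more predictable, so treat these figures as upper bounds. The function name and the alphabet sizes are choices made for this example.

```python
import math

def entropy_bits(length, alphabet_size):
    """Entropy of a password whose characters are picked uniformly at random."""
    return length * math.log2(alphabet_size)

for label, size in [
    ("lowercase letters", 26),
    ("letters and digits", 62),
    ("printable ASCII", 95),
]:
    print(f"8 characters, {label}: {entropy_bits(8, size):.1f} bits")
```

Eight random lowercase letters come to about 37.6 bits. Each extra bit doubles the number of guesses a brute-force search needs, which is why both a longer password and a bigger character set help.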

Entropy is a concept from information theory, and is a measure of a message's unpredictability. For example, a series of tosses of a fair coin is unpredictable (we can't say what's coming next) and so has maximum entropy. Text in English – this article, for example – is fairly predictable in that we can make judgments about what's going to come next. The letter E appears far more often than Q; if there is a Q, it's likely that the next character will be U; and so on.
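A small Python sketch makes the coin example concrete. It applies Shannon's formula to the frequency of each individual character in a string. That is a simplification: it ignores context, such as U following Q, so on English text it will report a higher figure than a model that takes context into account. The function name is made up for this example.

```python
import math
from collections import Counter

def bits_per_character(text):
    """Shannon entropy based on single-character frequencies only."""
    counts = Counter(text)
    total = len(text)
    return sum((n / total) * math.log2(total / n) for n in counts.values())

print(bits_per_character("HTHTTHHT"))  # 1.0: heads and tails are equally likely
print(bits_per_character("HHHHHHHH"))  # 0.0: completely predictable
```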

It's estimated that English text has an entropy of between one and 1.5 bits per (8-bit) character. In another sense, entropy is a measurement of how compressible a message is – how much fluff we can discard in compressing a message and still be able to reconstitute the original message at a moment's notice. If you like, the compressed message contains just the information content of the message.

We've all compressed a text file in a zip file to get 70-80 per cent compression or more; that figure is just an expression of how low the entropy of the text is.
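You can see the effect with Python's built-in zlib module, which uses the same family of compression as zip files. The sketch compares a short English sample with the same number of random bytes. The sample is an arbitrary choice, and a passage this short will not compress as well as a long text file does.

```python
import os
import zlib

def compression_ratio(data):
    """Compressed size as a fraction of the original size."""
    return len(zlib.compress(data, 9)) / len(data)

english = (
    b"Entropy is a concept from information theory, and is a measure of "
    b"a message's unpredictability. For example, a series of tosses of a "
    b"fair coin is unpredictable (we can't say what's coming next) and so "
    b"has maximum entropy. Text in English is fairly predictable in that "
    b"we can make judgments about what's going to come next."
)
random_bytes = os.urandom(len(english))  # unpredictable by design

print(f"English text: {compression_ratio(english):.0%} of original size")
print(f"Random bytes: {compression_ratio(random_bytes):.0%} of original size")
```

The random bytes should come out no smaller than they went in, because there is no fluff to discard. The English sample shrinks because much of it is predictable.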