In response to the news that researchers have discovered a trove of 1.4 billion credentials on the dark web, Terry Ray, CTO of Imperva, made the following comments:
“I’ve been speculating for some time now about the aggregation of data from breaches. This is a fine, though basic, example of what I’ve been proposing. I’ve suggested that it would be possible to take stolen identity data, such as names, addresses, employer, spouse’s name, children’s names, etc., anything identifiable, and combine that with various other breaches to find common data points linking people to people, people to companies, companies to data, etc., which would possibly be useful in targeted phishing or extortion attacks.
“There certainly have been enough breaches to expose personally identifiable information in quantities useful in such analytics. This particular dataset is quite large overall, but what I find most interesting is how similar the common passwords were to a smaller, older dataset we at Imperva Inc. analyzed in 2010.
“For example, in our 2010 research of 32 million credentials, we found that ‘123456’ was the most common password by a factor of 4 over the next most common password. In this latest batch of 1.4 billion credentials, I suppose people are starting to learn better passwords, wink, since ‘123456’ is only the most common by a factor of 3. Slowly but surely.
“Further, 9 of the top 20 passwords in this latest dataset, or 45 percent, were in our top 20 passwords in 2010. This isn’t overly surprising, since this large batch is a combination of 252 prior breaches, but it does give a clear indication that there are still some very poor password practices being used, and it further validates that users tend to use the same poor passwords across many different sites.
“I don’t think it will be long before aggregated datasets containing much more than passwords are sold on the dark web, given the breadth of data we know has been stolen over the years. When you consider how quickly one can change their password, datasets like this one are only valid as long as users continue to make poor choices in password usage. Stolen names, addresses, family member names, etc., don’t change nearly as often, if ever for some, so a more extensive analytic dataset would hold its value far longer and would likely be very popular in some hands.”