Security Breaches, Identity Fraud, and Unknowns
February 14th, 2009
There’s a minor hubbub at Wired and CyberCrime & Doing Time over the most recent Javelin Identity Theft Survey (consumer version here; the full report will set you back a cool $3,000). The report surveyed 4,784 people to find out whether they had experienced identity fraud and, if so, whether they knew how the perpetrator accessed their data. Javelin’s big claim is this:
Despite the hefty blame . . . placed on the Internet and cyber-crime, online identity theft methods (phishing, hacking and malware) only accounted for 11% of fraud cases in 2008. The truth is, most known cases of fraud occur through traditional methods, when a criminal has direct, physical access to the victim’s information.
The report has a chart purporting to show the sources of data used to commit identity fraud. For example, here’s a partial list of categories:
| Source of data | Javelin’s reported share |
| --- | --- |
| Stolen while making an online purchase | 1% |
| “Hackers, viruses, or spyware” on a home or work computer | 9% |
| Phishing | 1% |
| Stolen from a company in a data breach | 11% |
| “Primarily business controlled” data stolen while making a purchase | 19% |
| From a lost wallet, purse, etc. | 43% |
Critics level two main charges against the report. First, they note that it was sponsored by a bank and an online identity protection company, creating a potential source of bias. But their main complaint is that the report—or at least the summary chart and Javelin’s 11% number—ignored the cases where identity fraud had an unknown source.
They have a point. Of the roughly ten percent of phone survey respondents who said they had experienced identity fraud (482 out of 4,784 people called), 65% (313 people) had no idea how their data was obtained. Javelin threw away the unknowns and calculated its percentages based on the 169 people who said they knew how their information was obtained. That 9% who said their data was accessed by hackers, viruses, or spyware? That’s 15 out of the 169. It’s also 15 out of the 482 who experienced identity fraud, or about 3%.
A more accurate survey result would look like this:
| Source of data | Share of all identity fraud victims |
| --- | --- |
| Stolen while making an online purchase | <1% |
| “Hackers, viruses, or spyware” on a home or work computer | 3% |
| Phishing | <1% |
| Stolen from a company in a data breach | 4% |
| “Primarily business controlled” data stolen while making a purchase | 7% |
| From a lost wallet, purse, etc. | 15% |
| Unknown | 65% |
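For anyone who wants to check the arithmetic, here is a minimal Python sketch of the rescaling. It assumes the only inputs are the counts quoted above (482 victims, 169 with a known source) and Javelin's rounded published percentages. The raw per-category counts are not public apart from the 15 hacker/virus/spyware cases, so the results are approximate. The dictionary keys and the `rescale` helper are my own shorthand, not anything from the report.

```python
# Counts quoted in the post: 482 fraud victims, 169 of whom could name a source.
VICTIMS = 482
KNOWN = 169

# Javelin's published shares, as a percent of the 169 known-source cases.
# This is the same partial list shown above, so it does not sum to 100.
javelin_pct = {
    "online purchase": 1,
    "hackers, viruses, or spyware": 9,
    "phishing": 1,
    "data breach": 11,
    "business-controlled data stolen during a purchase": 19,
    "lost wallet, purse, etc.": 43,
}


def rescale(pct_of_known, known=KNOWN, total=VICTIMS):
    """Convert a percent of known-source cases into a percent of all victims."""
    return pct_of_known * known / total


rescaled = {name: rescale(pct) for name, pct in javelin_pct.items()}
unknown_pct = 100 * (VICTIMS - KNOWN) / VICTIMS

for name, pct in rescaled.items():
    print(f"{name}: {pct:.1f}%")
print(f"unknown: {unknown_pct:.1f}%")
```

Every known category shrinks by the same factor, 169/482, or roughly a third. The ranking of the categories does not change; what changes is how much of the total picture they can claim to explain.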
The problem boils down to this: did the 313 people who said they didn’t know how their information was obtained have their data stolen in proportionately the same ways as the 169 people who could identify a source? Javelin, by tossing out those 313 unknowns, seems to think so. But there are good reasons why the knowns may not adequately represent the unknowns. For example, not all methods of data theft have the same visibility to the victim—most people know when their wallets have been stolen; the same is not always true of data stolen from a business. And do phishing and social engineering victims usually know they’ve been had?
Unfortunately, the critics take this point and then leap too far, claiming that all of the unknown cases must have come from data breaches, malware, phishing, and other online sources. That’s a reasonable conjecture, but that’s all it is. It replaces a poor assumption (that the sub-sample accurately represents the full sample) with unabashed speculation. Both are interesting, but neither is reliable data.
The most honest approach would be to put that 65% in the margin of error for each category. But “online identity theft methods only accounted for somewhere between 5% and 70% of all identity thefts” doesn’t make nearly as catchy a headline.
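The range in that hypothetical headline can be reproduced with a few lines of Python. This is a rough sketch, assuming the only inputs are the 482 and 169 counts and Javelin's rounded 11% figure for online methods. The `bounds` helper is a name I made up for illustration. The low end assumes none of the unknown cases were online, and the high end assumes all of them were.

```python
VICTIMS = 482              # respondents who reported identity fraud
KNOWN = 169                # victims who could name how their data was obtained
UNKNOWN = VICTIMS - KNOWN  # the 313 who could not


def bounds(pct_of_known):
    """Return (low, high) percent of all victims for one category.

    low:  none of the unknown cases belong to the category
    high: every unknown case belongs to the category
    """
    known_cases = pct_of_known / 100 * KNOWN
    low = 100 * known_cases / VICTIMS
    high = 100 * (known_cases + UNKNOWN) / VICTIMS
    return low, high


# Javelin's 11% for phishing, hacking, and malware combined.
low, high = bounds(11)
print(f"online methods: between {low:.0f}% and {high:.0f}% of all victims")
```

With the rounded inputs this lands at about 4% to 69%, the same ballpark as the figures in the headline above. The width of the range is the whole story: it is exactly the 65% of victims who did not know what happened to them.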