Too Many Law Schools in Minnesota?
March 9th, 2009 § 4 Comments
This has nothing to do with data security, but I couldn’t resist the chance to obsess over some numbers. Last week, Mark Cohen at the Minnesota Lawyer blog posted a question that came up in a previous comment thread: are there too many law schools in Minnesota?
Whether we have too many law schools (or, more to the point, law students) is a slippery question. It’s like asking whether we have too many lawyers—it depends on whether you want to be one, or hire one.
I thought it might be interesting to compare the Twin Cities law school situation with other metro areas. Specifically, I wanted to look at two measures:
- The number of new law graduates produced in each area per year, as a proportion of the total population of the area, and
- The overall matriculation rate of the schools in each area.
The first is a measure of supply, from the legal market perspective—the higher an area’s per capita production of new grads compared to other areas, the more likely it is that the market may be oversaturated. The second measures demand—if far more students apply to schools than attend, there may be demand for more law school seats (from students, if not employers).
The results are listed below, based on 2007 US Census data for primary statistical areas and the LSAC’s Guide to ABA-Approved Law Schools. I combined some of the census’s primary areas where it made sense because of school locations. I included roughly the fifty largest areas (reduced a bit because of combining).
The results confirmed what a lot of people already think: the Twin Cities produces a relatively high number of law school grads compared to its population. The area’s 936 graduates per year work out to 264.5 graduates per million in population. Only San Diego, Boston, and Washington, DC put out more law grads per capita. Boston and D.C. are probably net exporters of new lawyers, and D.C. may have more lawyers per capita than anywhere else. That leaves San Diego as the only metro area producing significantly more law grads per capita than MSP, but note that Los Angeles, just a bit to the north of San Diego, has a particularly low rate of law grad production.
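The per-capita figure is simple arithmetic, and a minimal sketch makes it easy to check. The function name `grads_per_million` is my own label, and the three areas are just a sample of the rows in the table; the populations and graduate counts are the ones listed there.

```python
def grads_per_million(graduates: int, population: int) -> float:
    """Return annual law graduates per one million residents."""
    return graduates / population * 1_000_000


# (2007 population, graduates per year), copied from the table in this post.
areas = {
    "San Diego, CA": (2_974_859, 1_170),
    "Minneapolis/St. Paul & St. Cloud Area, MN/WI": (3_538_781, 936),
    "Los Angeles, CA": (17_755_322, 1_868),
}

# Compute the rate for each sampled area, rounded as in the table.
rates = {
    name: round(grads_per_million(grads, pop), 2)
    for name, (pop, grads) in areas.items()
}

# Print from highest to lowest per-capita production.
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.2f} grads per million")
```

Running the same function over every row should reproduce the Grads/M column.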
Number of law grads per million population:
| Area | 2007 Pop | Grads/Yr | Grads/M |
| San Diego, CA | 2,974,859 | 1,170 | 393.30 |
| Boston, (MA/RI/NH) | 7,476,689 | 2,483 | 332.10 |
| Washington, DC, Baltimore, MD, Northern Virginia | 8,241,912 | 2,735 | 331.84 |
| Minneapolis/St. Paul & St. Cloud Area, MN/WI | 3,538,781 | 936 | 264.50 |
| Oklahoma City, Tulsa, OK | 2,217,670 | 585 | 263.79 |
| Indianapolis, Bloomington, Lafayette, IN | 2,423,956 | 638 | 263.21 |
| Detroit, Flint, Lansing, & Grand Rapids, MI | 7,257,206 | 1,811 | 249.55 |
| Columbus, OH | 1,982,252 | 462 | 233.07 |
| Birmingham, Montgomery, Tuscaloosa, AL | 1,811,555 | 418 | 230.74 |
| Little Rock, AR | 1,277,040 | 288 | 225.52 |
| San Francisco-San Jose, CA | 7,264,887 | 1,617 | 222.58 |
| St. Louis, MO/IL | 2,866,517 | 634 | 221.17 |
| Albany, NY CSA | 1,148,416 | 251 | 218.56 |
| New York, NY/NJ/CT/PA | 21,961,994 | 4,777 | 217.51 |
| Philadelphia (PA/NJ/DE/MD) | 6,385,461 | 1,384 | 216.74 |
| Milwaukee & Madison, WI | 2,353,600 | 501 | 212.87 |
| San Antonio, & Austin, TX | 3,588,836 | 750 | 208.98 |
| Cleveland-Akron-Elyria, OH CSA | 2,896,968 | 599 | 206.77 |
| Sacramento, CA | 2,397,691 | 487 | 203.11 |
| Kansas City (KS/MO), Lawrence, & Topeka, KS | 2,396,108 | 472 | 196.99 |
| Chicago, IL | 9,745,165 | 1,908 | 195.79 |
| Portland, Eugene, Salem, Corvallis, OR | 3,100,110 | 571 | 184.19 |
| Denver-Aurora-Boulder, CO | 2,998,878 | 547 | 182.40 |
| Hartford, CT | 1,306,151 | 238 | 182.21 |
| Miami, FL | 5,413,212 | 936 | 172.91 |
| Buffalo, Rochester, Syracuse, NY | 3,056,474 | 519 | 169.80 |
| Orlando, Jacksonville, St. Petersburg, Sarasota, Gainesville, Tallahassee FL | 8,167,737 | 1,385 | 169.57 |
| Greenville, Columbia, Charleston, SC | 2,615,644 | 437 | 167.07 |
| Charlotte-Greensboro-Raleigh, NC | 5,448,974 | 842 | 154.52 |
| Pittsburgh, PA | 2,446,703 | 375 | 153.27 |
| Houston, TX | 5,729,027 | 874 | 152.56 |
| Albuquerque, NM | 835,120 | 114 | 136.51 |
| Salt Lake City & Provo, UT | 2,180,009 | 288 | 132.11 |
| Atlanta, Athens, Macon GA | 6,200,339 | 809 | 130.48 |
| Seattle-Tacoma, WA | 4,038,741 | 518 | 128.26 |
| Richmond, VA | 1,212,977 | 149 | 122.84 |
| Cincinnati (OH/KY/IN) | 2,176,749 | 267 | 122.66 |
| Nashville, Memphis, Knoxville, TN | 3,911,091 | 475 | 121.45 |
| Los Angeles, CA | 17,755,322 | 1,868 | 105.21 |
| Phoenix, AZ | 4,179,427 | 410 | 98.10 |
| Dallas-Fort Worth & Waco, TX | 6,726,533 | 626 | 93.06 |
| Virginia Beach, Norfolk, Newport News, VA-NC | 1,658,754 | 143 | 86.21 |
| Louisville (KY/IN) | 1,369,024 | 112 | 81.81 |
| Las Vegas, NV | 1,880,449 | 142 | 75.51 |
That’s the supply side. On the demand side, Minneapolis-St. Paul ends up with the tenth-highest matriculation rate per application, suggesting that law school applicants are more likely to be able to attend a school in the area than in most others. Also note that all three areas with higher Grads/Million rates have lower matriculation rates: 8.14% in Boston, 6.32% in DC, and 10.95% in San Diego.
Matriculation as a percentage of applications (the number of applications and matriculations are the total for all schools in the area):
| Area | Apps | Matric. | Matric/Apps |
| Virginia Beach, Norfolk, Newport News, VA-NC | 575 | 153 | 26.61% |
| Kansas City (KS/MO), Lawrence, & Topeka, KS | 2,960 | 490 | 16.55% |
| Birmingham, Montgomery, Tuscaloosa, AL | 2,979 | 479 | 16.08% |
| Oklahoma City, Tulsa, OK | 3,524 | 551 | 15.64% |
| Louisville (KY/IN) | 1,099 | 168 | 15.29% |
| Pittsburgh, PA | 3,190 | 485 | 15.20% |
| Salt Lake City & Provo, UT | 1,796 | 268 | 14.92% |
| Greenville, Columbia, Charleston, SC | 2,912 | 421 | 14.46% |
| Cincinnati (OH/KY/IN) | 2,284 | 324 | 14.19% |
| Minneapolis/St. Paul & St. Cloud Area, MN/WI | 7,401 | 984 | 13.30% |
| Buffalo, Rochester, Syracuse, NY | 3,584 | 469 | 13.09% |
| Columbus, OH | 3,618 | 439 | 12.13% |
| Houston, TX | 7,750 | 938 | 12.10% |
| Albany, NY CSA | 2,065 | 246 | 11.91% |
| Orlando, Jacksonville, St. Petersburg, Sarasota, Gainesville, Tallahassee FL | 17,687 | 2,055 | 11.62% |
| Little Rock, AR | 2,597 | 299 | 11.51% |
| Cleveland-Akron-Elyria, OH CSA | 5,665 | 638 | 11.26% |
| Milwaukee & Madison, WI | 4,378 | 488 | 11.15% |
| Detroit, Flint, Lansing, & Grand Rapids, MI | 10,633 | 1,184 | 11.14% |
| San Diego, CA | 11,786 | 1,291 | 10.95% |
| San Antonio, & Austin, TX | 6,771 | 729 | 10.77% |
| Miami, FL | 12,110 | 1,294 | 10.69% |
| St. Louis, MO/IL | 6,773 | 698 | 10.31% |
| Indianapolis, Bloomington, Lafayette, IN | 7,164 | 728 | 10.16% |
| Portland, Eugene, Salem, Corvallis, OR | 5,766 | 556 | 9.64% |
| Albuquerque, NM | 1,175 | 111 | 9.45% |
| Denver-Aurora-Boulder, CO | 5,920 | 555 | 9.38% |
| Dallas-Fort Worth & Waco, TX | 6,983 | 640 | 9.17% |
| Las Vegas, NV | 1,713 | 153 | 8.93% |
| Seattle-Tacoma, WA | 5,769 | 505 | 8.75% |
| Atlanta, Athens, Macon GA | 11,799 | 1,029 | 8.72% |
| Richmond, VA | 1,886 | 160 | 8.48% |
| Sacramento, CA | 6,569 | 541 | 8.24% |
| Boston, (MA/RI/NH) | 31,362 | 2,552 | 8.14% |
| Nashville, Memphis, Knoxville, TN | 6,334 | 512 | 8.08% |
| New York, NY/NJ/CT/PA | 66,008 | 5,053 | 7.66% |
| Hartford, CT | 2,824 | 216 | 7.65% |
| Chicago, IL | 26,707 | 1,925 | 7.21% |
| Phoenix, AZ | 5,827 | 408 | 7.00% |
| Philadelphia (PA/NJ/DE/MD) | 18,200 | 1,247 | 6.85% |
| Los Angeles, CA | 29,623 | 1,997 | 6.74% |
| Charlotte-Greensboro-Raleigh, NC | 14,718 | 970 | 6.59% |
| San Francisco-San Jose, CA | 26,219 | 1,687 | 6.43% |
| Washington, DC, Baltimore, MD, Northern Virginia | 48,501 | 3,066 | 6.32% |
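To check the claim that every area out-producing the Twin Cities per capita also admits a smaller share of its applications, here is a rough sketch using the four relevant rows from the two tables. The function name and the shortened area labels are mine. Note that the rate is per application, not per applicant, since one person can apply to several schools.

```python
def matriculation_rate(applications: int, matriculants: int) -> float:
    """Return matriculants as a percentage of applications received."""
    return matriculants / applications * 100


# (grads per million, applications, matriculants), copied from the two tables.
areas = {
    "San Diego": (393.30, 11_786, 1_291),
    "Boston": (332.10, 31_362, 2_552),
    "Washington/Baltimore": (331.84, 48_501, 3_066),
    "Minneapolis/St. Paul": (264.50, 7_401, 984),
}

msp_supply, msp_apps, msp_matric = areas["Minneapolis/St. Paul"]
msp_rate = matriculation_rate(msp_apps, msp_matric)

# Areas that produce more grads per capita than MSP but matriculate
# a smaller share of their applications.
more_selective = [
    name
    for name, (supply, apps, matric) in areas.items()
    if supply > msp_supply and matriculation_rate(apps, matric) < msp_rate
]

print(f"MSP matriculation rate: {msp_rate:.2f}%")
print("Higher supply, lower matriculation rate:", more_selective)
```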
This doesn’t prove Minnesota has too many law schools. But it does show that we put out a large number of law graduates for an area of our size, and it’s easier for students to get into a school here than in most cities. Whether that’s “too many” is left as an exercise for the economy.
Security is Not a Checklist
February 16th, 2009 § 1 Comment
In the security profession, we have a maxim that security is not a product. It’s a reminder that security doesn’t result from plugging in devices, but through continuous integration of security into design, development, management, and operations. I’d add another maxim: security is not a checklist.
When I was in QSA training a few years back, our trainer claimed that no one who was PCI DSS compliant had ever suffered a data breach. He hedged this bold statement by suggesting that anyone who had been certified as PCI DSS compliant and later suffered a breach must have fallen out of compliance by the time the breach happened. It was an entertaining exercise in circular logic: PCI DSS prevents security breaches, so obviously anyone who suffered a security breach couldn’t have been PCI DSS compliant.
Well, Heartland Payment Systems, which may have suffered the largest breach in history (giving executives at TJX something to celebrate), was certified as PCI DSS compliant. That suggests at least three possibilities:
- Heartland was PCI DSS compliant when they were audited, but fell out of compliance by the time they were breached;
- Heartland wasn’t PCI DSS compliant, but their QSA said they were; or
- PCI DSS doesn’t actually prevent compliant organizations from suffering a breach.
Each of the first two possibilities would be reasonable. A PCI DSS assessment is a snapshot in time, and businesses are constantly changing. And because QSAs are paid by the companies they assess, it’s fair to wonder whether they are truly independent. The third is more than reasonable. It’s certain: PCI DSS compliance doesn’t guarantee security. That should be obvious. But maybe it’s not.
PCI’s strength and weakness is that it’s a checklist of detailed requirements. Its specificity is an improvement over laws like HIPAA, which calls for protecting against “reasonably anticipated threats” while considering the size of the organization and the costs of the security measures. It’s a flexible approach, but it doesn’t provide many answers. Are firewalls required? Does internal traffic have to be encrypted? It depends.
As a checklist, PCI DSS is more to the point. Companies know exactly what’s expected. They have to have firewalls between untrusted networks and any cardholder data environment (PCI DSS Requirement 1.2), install personal firewall software on laptops (Requirement 1.4), use anti-virus software (Requirement 5.1), and so on. There’s very little “it depends” in the PCI DSS requirements.
But companies sometimes think the checklist is all they need—that once they’ve checked “compliant” next to all the requirements, they’re done (until the next audit rolls around). They fall into the trap of thinking that a checklist item intended to mandate a minimum level of adequate security is also the most they need to do. They forget that being able to answer “yes, we have a process” to a checklist item is not as important as whether that process works. Then, when data is lost, they point to the checklist and ask what more they were supposed to do. That’s when a reasonableness standard starts looking awfully good.
The checklist is necessary, because there’s too much wiggle room and too much ambiguity without one. But just as security is not a product, it is also not a checklist. It is, as always, a process—one that a checklist can inform, and sometimes measure, but never complete.
Security Breaches, Identity Fraud, and Unknowns
February 14th, 2009
There’s a minor hubbub at Wired and CyberCrime & Doing Time over the most recent Javelin Identity Theft Survey (consumer version here—the full report will set you back a cool $3,000). The report surveyed 4,784 people to find out if they had experienced identity fraud, and, if so, if they knew how the perpetrator accessed their data. Javelin’s big claim is this:
Despite the hefty blame . . . placed on the Internet and cyber-crime, online identity theft methods (phishing, hacking and malware) only accounted for 11% of fraud cases in 2008. The truth is, most known cases of fraud occur through traditional methods, when a criminal has direct, physical access to the victim’s information.
The report has a chart purporting to show the sources of data used to commit identity fraud. For example, here’s a partial list of categories:
| Stolen while making an online purchase: | 1% |
| “Hackers, viruses, or spyware” on a home or work computer: | 9% |
| Phishing: | 1% |
| Stolen from a company in a data breach: | 11% |
| “Primarily business controlled” data stolen while making a purchase: | 19% |
| From a lost wallet, purse, etc.: | 43% |
Critics level two main charges against the report. First, they note that it was sponsored by a bank and an online identity protection company, creating a potential source of bias. But their main complaint is that the report—or at least the summary chart and Javelin’s 11% number—ignored the cases where identity fraud had an unknown source.
They have a point. Of the roughly ten percent of phone survey respondents who said they had experienced identity fraud (482 out of 4,784 people called), 65% had no idea how their data was obtained. Javelin threw away the unknowns, and calculated its percentages based on the 169 people who said they knew how their information was obtained. That 9% who said their data was accessed by hackers, viruses, or spyware? That’s 15 out of the 169. It’s also 15 out of the 482 who experienced identity fraud.
A more accurate survey result would look like this:
| Stolen while making an online purchase: | <1% |
| “Hackers, viruses, or spyware” on a home or work computer: | 3% |
| Phishing: | <1% |
| Stolen from a company in a data breach: | 4% |
| “Primarily business controlled” data stolen while making a purchase: | 7% |
| From a lost wallet, purse, etc.: | 15% |
| Unknown: | 65% |
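The adjusted percentages come from changing the denominator from the 169 respondents who knew the source to all 482 who reported fraud. A minimal sketch of that rescaling follows. It assumes the published percentages are rounded shares of the 169, so the results are approximate, and the shortened category labels are mine.

```python
KNOWN_SOURCE = 169      # victims who said they knew how their data was taken
UNKNOWN_SOURCE = 313    # victims who did not know
ALL_VICTIMS = KNOWN_SOURCE + UNKNOWN_SOURCE  # 482

# Javelin's published shares, as percentages of the 169 known-source cases.
reported = {
    "Online purchase": 1,
    "Hackers, viruses, or spyware": 9,
    "Phishing": 1,
    "Data breach": 11,
    "Business-controlled data during a purchase": 19,
    "Lost wallet, purse, etc.": 43,
}


def rescale(pct_of_known: float) -> float:
    """Convert a share of known-source cases into a share of all victims."""
    return pct_of_known * KNOWN_SOURCE / ALL_VICTIMS


rescaled = {category: rescale(pct) for category, pct in reported.items()}
rescaled["Unknown"] = UNKNOWN_SOURCE / ALL_VICTIMS * 100

for category, share in rescaled.items():
    # Show anything under one percent as "<1%", as in the table.
    label = "<1%" if share < 1 else f"{share:.0f}%"
    print(f"{category}: {label}")
```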
The problem boils down to this: did the 313 people who said they didn’t know how their information was obtained have their data stolen in proportionately the same ways as the 169 people who could identify a source? Javelin, by tossing out those 313 unknowns, seems to think so. But there are good reasons why the knowns may not adequately represent the unknowns. For example, not all methods of data theft have the same visibility to the victim—most people know when their wallets have been stolen; the same is not always true of data stolen from a business. And do phishing and social engineering victims usually know they’ve been had?
Unfortunately, the critics take this point and then leap too far, claiming that all of the unknown cases must have come from data breaches, malware, phishing, and other online sources. That’s a reasonable conjecture, but that’s all it is. It replaces a poor assumption (that the sub-sample accurately represents the full sample) with unabashed speculation. Both are interesting, but neither is reliable data.
The most honest approach would be to put that 65% in the margin of error for each category. But “online identity theft methods only accounted for somewhere between 5% and 70% of all identity thefts” doesn’t make nearly as catchy a headline.
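That range can be worked out as a worst-case bound: the low end assumes none of the unknown cases were online, and the high end assumes all of them were. This is a rough sketch, assuming the 11% online share applies to the 169 known-source cases; `share_bounds` is a hypothetical helper name.

```python
def share_bounds(known_cases: float, unknown_cases: int, total_cases: int):
    """Return (lower, upper) percentage bounds for one category.

    lower: none of the unknown cases belong to the category.
    upper: all of the unknown cases belong to the category.
    """
    lower = known_cases / total_cases * 100
    upper = (known_cases + unknown_cases) / total_cases * 100
    return lower, upper


# About 11% of the 169 known-source cases were attributed to online methods.
online_cases = 0.11 * 169

low, high = share_bounds(online_cases, unknown_cases=313, total_cases=482)
print(f"Online methods: between {low:.0f}% and {high:.0f}% of all cases")
```

With these inputs the bounds come out at roughly 4% and 69%, in the same ballpark as the rounded figures in the mock headline.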
Sen. Feinstein Reintroduces Federal Data Breach Notification Bill
January 10th, 2009
Senator Dianne Feinstein re-introduced her federal data breach notification bill this week. This is the Senator’s fourth attempt to pass a data breach bill, having introduced similar bills in 2003, 2005, and 2007.
The bill looks a lot like Sen. Feinstein’s previous bills. Most importantly, this year’s bill, like previous bills, would preempt state data breach laws. That would be good for businesses, which currently have to track forty-seven data breach notification laws enacted by states, the District of Columbia, and two territories. A federal data breach notification law would replace all these different laws with a single standard for notification. Consumers, however, might be better off without a national breach notification law, because companies usually comply with whichever state law demands the most of them, rather than adjusting their notifications by state. A federal data breach notification law therefore has to be carefully written so that it doesn’t end up reducing consumer protection.
Preemption is almost certain to be part of any national data breach notification law. Only 6.7% of Americans live in states without data breach notification laws, and they probably get breach notifications as a side effect of other states’ laws. So a national law is no longer necessary to ensure notification, but it could still be used to create uniform requirements—which means preemption.
Sen. Feinstein’s previous bills didn’t get far, even when TJX and ChoicePoint were in the news. This year, we have no data breach poster child, and lots of other priorities. It’s a new Congress and a new administration, so anything could happen, but unless there’s another big public breach I’d be surprised if this bill gets much attention this year.
Imperfect but Still Useful: Data Destruction and MD5
January 8th, 2009
We techies sometimes have an unfortunate tendency to be absolutists.
For example, consider secure data destruction. Ask a group of techies how to securely dispose of a disk full of sensitive data, and you’ll get a discussion about Gutmann, magnetic force microscopes, massive electromagnets, 35-pass overwrites, shredding, drilling, crushing, melting—pretty much everything up to and including throwing it into the fires of Mount Doom. We get caught up in the extreme cases—how to protect data from shadowy figures with infinite time and infinite resources. But unless you’re a government and your hard drive has vital state secrets on it (in which case, go ahead and use the Mount Doom method), it just doesn’t matter that much. Almost any method of data destruction is so much better than nothing that any differences between methods are usually insignificant.
Plenty of data breach announcements have come from companies that improperly disposed of media. In each of these cases, the problem was not that the media was only overwritten once instead of thirty-five times, but that the media hadn’t been erased or encrypted at all. It’s similarly hard to imagine a court holding someone negligent for “merely” using a three-pass wipe to erase data. We shouldn’t get so caught up in edge cases that we ignore the center.
The MD5 certificate hack is another example. MD5 has been “broken” for a while, but the term “broken” gets tossed around so much for crypto algorithms that it’s meaningless. There’s a difference between the way MD5 is “broken” and, say, the way a Caesar cipher is “broken.” MD5 is “broken” in that a cluster of 200 PS3s can create a fake CA certificate in a few days. A Caesar cipher is “broken” in that a kid with a pencil can solve it in a few minutes. Treating both as equally “broken” is silly. Cryptographic strength is not a binary question of whether something is “valid” or “broken,” but a matter of the computational power needed to find an original plaintext without the key. “Broken” implies that an algorithm is either perfect or useless, when most flaws merely lower the computational cost of working around the algorithm.
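To make the Caesar end of that spectrum concrete, here is a minimal sketch of the whole attack: with only 25 possible shifts, you can simply try them all. The function names and the sample message are my own illustrations.

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions, leaving other characters alone."""
    shifted = []
    for ch in text:
        if ch in string.ascii_lowercase:
            shifted.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            shifted.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            shifted.append(ch)
    return "".join(shifted)


def brute_force(ciphertext: str):
    """Return every candidate plaintext, one per possible shift (1 through 25)."""
    return [(shift, caesar_shift(ciphertext, -shift)) for shift in range(1, 26)]


ciphertext = caesar_shift("attack at dawn", 3)
candidates = brute_force(ciphertext)

# The entire key space fits on one screen; a reader picks out the English line.
for shift, guess in candidates:
    print(f"{shift:2d}: {guess}")
```

Twenty-five guesses is the entire cost of the attack, which is a very different kind of “broken” than needing a cluster of game consoles for a few days.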
An interesting question is what effect, if any, the certificate hack has on other uses of MD5 as a hashing algorithm. The exploit focuses on web certificates, and some (but not all) of the cleverness is in how the researchers craft signing requests that get a CA to sign a “real” certificate with a signature that also fits a counterfeit certificate. A web authentication certificate has a certain structure that makes it hard to create a meaningful collision, and they figured this out. But they also developed a “sophisticated and highly optimized method for computing MD5 collisions,” which might have broader implications than just certificates.
For example, it’s not clear what this means for use of MD5 in forensics, where one-way hashes are used to show that a hard drive hasn’t been modified. There are obvious differences between certificates and hard drives. Certificates are small (about 1KB), and hard drives are big. Certificates have a carefully defined structure. Hard drives also have a structure, but that structure has more free space that might be used to create collision blocks. If one has to look at all the data to see any signs of hash trickery, it will be easier to do that with a small certificate than with a large hard drive. Someone with better crypto knowledge than me could opine on these factors, but they illustrate that getting a CA to sign a certificate that collides with a fake certificate is different from modifying a hard disk and keeping the same hash.
Assume, however, that the exploit is equally useful for hard disks—that with 200 PS3s, one could create an entire disk with the same MD5 hash as another disk, then use that fake disk as evidence against someone in court. Would a drive hashed with MD5 be thrown out of evidence because of that weakness?
I don’t think so. When used to authenticate a copy of a hard drive, the purpose of the MD5 hash is twofold: (1) to show that the data was not accidentally modified from the original, and (2) to prove that the data was not maliciously modified. MD5 is still good enough for the first purpose—if it takes 200 PS3s to create a collision, that’s enough to show that a hashed hard drive wasn’t accidentally modified. It’s a little weaker as proof against intentional modification, but 200 PS3s would still involve a lot of work to forge evidence. An opposing party would probably have to do more than merely allege the possibility of forged evidence; the burden would probably still be on that party to show inauthenticity. The best choice, of course, is to use something other than MD5 in forensics when possible. But MD5 is still a whole lot better than nothing.
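One low-cost way to hedge is to record a second, stronger hash alongside MD5 when the image is acquired. The sketch below is an illustration only, not a description of any particular forensic tool: it reads a file once and computes both MD5 and SHA-256 with Python’s standard `hashlib`. The function name `hash_image` and the throwaway temp file standing in for a disk image are assumptions.

```python
import hashlib
import os
import tempfile


def hash_image(path: str, chunk_size: int = 1024 * 1024) -> dict:
    """Read the file once in chunks and return its MD5 and SHA-256 hex digests."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}


# Demo: a small temporary file stands in for a disk image.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"evidence" * 1000)
    image_path = tmp.name

recorded = hash_image(image_path)               # hashes noted at acquisition
unchanged = hash_image(image_path) == recorded  # later re-check, nothing touched

with open(image_path, "ab") as image:           # simulate an accidental change
    image.write(b"\x00")
modified = hash_image(image_path) != recorded

os.remove(image_path)
print("unchanged copy verified:", unchanged)
print("modification detected:", modified)
```

Either digest alone catches the accidental change. The second hash matters only for the deliberate-forgery case, since a forger would have to produce a collision in both algorithms at once.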
Not everything falls into an easy distinction between “perfect” and “broken.” Technical measures can be less than perfect but still useful, and sometimes they are even the best option available.