Commentaries On The 9/11 Commission Report
The Accuracy of Data Matching
By Christopher Effgen
The Disaster Center
August 8, 2004
Congress is now considering the 9/11 Commission's recommendation that a Markle Foundation plan be adopted, a plan that would integrate databases across Federal, State and local lines. Government databases would be matched against commercial databases to help fight the war on terror. However, as the Markle Foundation has pointed out, such a system, created at great cost, would be a wasted resource if it were not used for other purposes.
In "An Example of Data Matching," we investigated the operation of a data-matching program run by the US Census Bureau, wherein applicants for employment were subject to a name-based criminal history check prior to being considered for employment. In this article we examine the accuracy of the database that was used, in order to explore some of the consequences of using data-matching programs.
That earlier article referred to a finding by the FBI and Search Inc. (a Justice Department funded corporation) indicating that approximately 11.7% of people with a criminal history are not identified when a name-check is performed. Because it relied on the name-based criminal history database, the Census Bureau failed to identify approximately 60,000 applicants who may have presented a risk to the public. This is only one of the issues involved with using data-matching schemes. The data-matching scheme failed to identify everyone with a criminal history because, while the records existed, either they were not in a form that could be automatically accessed, or the individual used a form of identification that did not connect them to the record.
The Census Bureau conducted 3,519,831 background checks on applicants for employment. The FBI's automated name-match program returned results indicating that 2,682,111 (76.2%) of the applicants did not have a criminal history. These individuals were immediately placed in a pool from which applicants were made available for consideration for employment. After they cleared the background check, whether an applicant would be considered further depended on their test and skill scores relative to other applicants. Whether an applicant was then hired would depend upon an interview. Because the Census Bureau relied on the name-check, no references would be contacted.
The 76.2% instant clearance rate roughly corresponds with the instant proceed-to-sale rate (77%) of Brady background checks for gun purchases during 2001-2002. In a Brady check, these individuals would be allowed to purchase a gun immediately. By statute, a final resolution of a Brady background check must be reached within three days. In 2000, 2.0% of people were identified as having records that would prohibit them from purchasing a gun.
The consequences of not instantly clearing a name-based background check were a delay in being placed in the pool from which applicants would be selected, additional costs in processing applications, and in some instances arrest. That an individual who was subject to a background check may be arrested should go without saying, considering the nature of the database. It took the Census Bureau an average of 14.5 days to manually review the criminal history records returned by the FBI.
The Census Bureau reduced the percentage of people who were identified as having a criminal history from 23.8% (837,720) to 15.6% (550,002) by manually comparing the information submitted with applications against the criminal histories returned by the FBI. That number was again reduced from 15.6% (550,002) to 9.4% (331,429) by determining that, while some applicants had a criminal history, it was not significant enough to prevent them from being considered for employment. In total, the Census Bureau withheld 837,720 applications from consideration for an average of 14.5 days and cleared 506,175 of those applicants for consideration, in order to identify the 331,545 (9.4%) individuals who would be denied consideration for employment.
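The screening funnel described above can be checked with a few lines of arithmetic. The following is a minimal sketch that recomputes each share from the counts stated in the article rather than copying the percentages; the variable and function names are illustrative assumptions, not part of any Census Bureau or FBI system.

```python
# Counts as stated in the article; names are illustrative only.
total_checks = 3_519_831
cleared_instantly = 2_682_111
flagged_after_manual_review = 550_002
cleared_after_review = 506_175

# Everyone not cleared instantly was flagged by the automated name check.
flagged_by_name_check = total_checks - cleared_instantly

# Those still denied once manual review had cleared some applicants.
denied = flagged_by_name_check - cleared_after_review


def share(count, total=total_checks):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


for label, count in [
    ("cleared instantly", cleared_instantly),
    ("flagged by name check", flagged_by_name_check),
    ("flagged after manual review", flagged_after_manual_review),
    ("denied consideration", denied),
]:
    print(f"{label}: {count:,} ({share(count)}%)")
```

Working from the counts makes it easy to see how each stage of manual review narrows the group that the automated check first flagged.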
Both the Brady and the Census Bureau background check processes provide the opportunity for an appeal. In the year 2000, 14% of the people denied the right to purchase a firearm filed an appeal, and 21% of those won their appeal. Of the people denied consideration for employment by the Census Bureau, 8.3% filed an appeal, and 37% of those won. The right of appeal in both cases means that an individual who has been identified as having an arrest record is prohibited from enjoying a right or exercising a privilege, and must prove that they are entitled to it in order to enjoy it. For Census Bureau applicants this process could take as long as 105 days.
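One way to read the appeal figures is to combine them: the share of all denials that ended up being overturned is the appeal rate multiplied by the success rate among those who appealed. The sketch below performs that multiplication with the rates quoted above. The resulting percentages are derived here for illustration and are not stated in the source reports; the function name is an assumption.

```python
def overall_reversal_rate(appeal_rate, success_rate):
    """Share of all denials overturned: P(appeal) * P(win | appeal)."""
    return appeal_rate * success_rate


# Rates as quoted in the article.
brady = overall_reversal_rate(appeal_rate=0.14, success_rate=0.21)
census = overall_reversal_rate(appeal_rate=0.083, success_rate=0.37)

print(f"Brady denials overturned on appeal:  {brady:.1%}")
print(f"Census denials overturned on appeal: {census:.1%}")
```

These figures count only successful appeals. They say nothing about incorrectly denied people who never appealed.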
Two studies have been undertaken to determine the accuracy of name-based background checks versus fingerprint checks:
On May 18, 2000, David R. Loesch, Assistant Director in Charge of the Criminal Justice Information Services (CJIS) Division of the Federal Bureau of Investigation, stated before the House Judiciary Committee:
"Of these 6 million noncriminal justice inquiries, 977,426, or some 16.2 percent of the names checked, resulted in "hits". In light of the 8.7 percent of civil applicants which are historically identified by fingerprints with criminal records, we can conclude that at least 7.5 percent (the 16.2 percent less the 8.7 percent), are "false positives." To translate such percentage into real terms, had name checks been made of the 6.9 million civil applicants, some 517,500 "false positives" would have resulted, and these individuals would have been, at least temporarily, falsely identified with criminal records. Stigmatization is the least of the consequences of "false positives": because of the confidentiality typically associated with personnel decisions and criminal history record information, an applicant incorrectly identified may well not have an opportunity to challenge the incorrect determination and denial of employment or volunteer opportunities."
The percentage of people who were identified as having a criminal history by the Census Bureau was 15.6%. If the percentage of Census applicants with criminal records had been the same as the 8.7% of civil applicants that the FBI reports are historically identified through fingerprint checks, the Census Bureau should have identified 306,225 people as having a criminal history. Yet the Census Bureau's process identified 550,002 people as having a criminal history. The difference of 243,777 should be people who were falsely identified as having a criminal history.
The other study that investigated the accuracy of name-checks was the Interstate Identification Index Name Check Efficacy: Report of the National Task Force to the U.S. Attorney General, July 1999. This study involved employment-related background checks performed in the State of Florida.
That report indicated that 5.5% of applicants were falsely identified as having a criminal history as a result of name checks. (It also established that 11.7% of people with a criminal history would have been cleared for employment if name-checks alone were performed.) To estimate the number of people who were falsely identified as having a criminal history by this method, we multiply the total number of applicants for employment, 3,519,831, by 5.5%. Doing so, we find that 193,590 applicants would be falsely associated with a criminal history.
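The two false-positive estimates, one from the FBI fingerprint baseline and one from the Florida study, can be reproduced side by side. This is a minimal sketch using only the figures quoted in the article; the variable names are illustrative, and products are truncated to whole people to match the article's figures.

```python
# Figures quoted in the article; names are illustrative.
total_applicants = 3_519_831
identified_by_census = 550_002     # flagged after manual review
fingerprint_hit_rate = 0.087       # FBI: civil applicants found with records
florida_false_positive_rate = 0.055

# Estimate 1: anything above the fingerprint baseline is treated as false.
expected_with_history = int(total_applicants * fingerprint_hit_rate)
estimate_from_fbi_baseline = identified_by_census - expected_with_history

# Estimate 2: apply the Florida study's false-positive rate to all applicants.
estimate_from_florida_rate = int(total_applicants * florida_false_positive_rate)

print(f"Expected with a criminal history: {expected_with_history:,}")
print(f"False positives (FBI baseline):   {estimate_from_fbi_baseline:,}")
print(f"False positives (Florida rate):   {estimate_from_florida_rate:,}")
```

The two methods rest on different assumptions: the first assumes Census applicants resemble the FBI's civil applicant population, and the second assumes the Florida error rate carries over to the Census checks.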
Another factor in this process is that about 40% of criminal history records do not have a disposition of the arrest associated with the record. This was probably the reason why the Census Bureau decided to use arrest rather than conviction records as the basis for determining whom to consider for employment. What this means is that even when a correct association is made with a criminal history, there is a fair probability that the match has been made with an arrest record only.
The final factor we should consider is that every database has an error rate. The primary search criteria used for name-based searches of criminal history records are name, date of birth, race and sex. Male names comprise approximately 70% of arrest records. Certain races have a greater percentage of individuals with criminal history records. Errors in data matching generally follow the same characteristics that enable the system to correctly generate a result. Thus, if you are male, of certain races, and have a common name, you are more likely to be falsely associated with the criminal history of another.
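To make the mechanism concrete, here is a small sketch of matching on name, date of birth, race and sex. Every record below is invented for illustration, the race codes are placeholders, and `person_id` is a hypothetical stand-in for a fingerprint-verified identity that a real name check would not have. It is not a description of how the FBI's system is implemented.

```python
from collections import namedtuple

# Hypothetical data: person_id stands in for a fingerprint-verified identity.
Record = namedtuple("Record", "person_id name dob race sex")

criminal_records = [
    Record(1, "JOHN SMITH", "1970-01-01", "X", "M"),
    Record(2, "ROBERT JONES", "1965-05-05", "Y", "M"),
]

applicants = [
    Record(1, "JOHN SMITH", "1970-01-01", "X", "M"),  # same person as record 1
    Record(3, "JOHN SMITH", "1970-01-01", "X", "M"),  # different person, same key
    Record(2, "BOB JONES", "1965-05-05", "Y", "M"),   # has a record, other name
    Record(4, "MARY DOE", "1980-02-02", "X", "F"),    # no record at all
]


def match_key(record):
    """The fields a name-based search compares."""
    return (record.name, record.dob, record.race, record.sex)


def classify(applicant, records):
    """Compare the name-check outcome with the true identity."""
    matched_ids = {r.person_id for r in records
                   if match_key(r) == match_key(applicant)}
    has_record = any(r.person_id == applicant.person_id for r in records)
    if matched_ids:
        if applicant.person_id in matched_ids:
            return "true positive"
        return "false positive"
    return "false negative" if has_record else "true negative"


results = [classify(a, criminal_records) for a in applicants]
for applicant, outcome in zip(applicants, results):
    print(applicant.person_id, applicant.name, "->", outcome)
```

The second applicant is flagged only because he shares every search field with someone else, while the third is missed because the name on file differs from the name on the application.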
The justification for engaging in this process was that it was the most cost-effective means for processing and clearing a large number of people for consideration of employment.
The consequence of this practice was that approximately 60,000 people with criminal histories were not detected and approximately 200,000 people were falsely identified as having a criminal history.
Both the use of government-collected criminal history databases and Federal hiring procedures are highly regulated activities. The Census Bureau did not comply with these regulations when it engaged in this process. Similar processes, relying on commercially collected and therefore unregulated databases, are now being used to determine whether people should have the right to vote, to rent apartments, or to be considered for credit. The lack of regulation of these activities is creating consequences both for individuals and for the culture at large.
This document is located at
http://www.disastercenter.com/911_8.htm
Christopher Effgen is the owner of the Disaster Center web site, and has been active in reporting about disasters by digital means since the site was established in 1996. He has authored articles dealing with a wide variety of disaster-related topics, including risk/threat management, neural networks, and the science of disaster communication, and has compiled numerous disaster-related statistics (many of which are hosted on this site). He is active as a participant in national and international forums promoting disaster mitigation towards the goal of sustainable development.