By Brian Calvert, Infoglide Senior Software Architect
The recent “Christmas Bomber” incident prompted many posts about applying technology to close the gaps that allowed it to happen. For example, David Loshin wrote a piece for BeyeNETWORK about a “master terrorist system” while Lawrence Dubov suggested improving the watch list process using entity resolution. While technology is a critical component of any solution, some specific issues with the technology are important to understand.
In an address this week, President Obama outlined the shortcomings in people, processes, and technologies that gave the now-infamous Christmas Bomber the opportunity to try to take down a Detroit-bound flight.
President Obama identified three major problem areas:
It’s now clear that shortcomings occurred in three broad and compounding ways. First, although our intelligence community had learned a great deal about the al Qaeda affiliate in Yemen called al Qaeda in the Arabian Peninsula — that we knew that they sought to strike the United States, and that they were recruiting operatives to do so — the intelligence community did not aggressively follow up on and prioritize particular streams of intelligence related to a possible attack against the homeland.
Second, this contributed to a larger failure of analysis — a failure to connect the dots of intelligence that existed across our intelligence community, and which together could have revealed that Abdulmutallab was planning an attack.
Third, this in turn fed into shortcomings in the watch-listing system which resulted in this person not being placed on the no-fly list; thereby allowing him to board that plane in Amsterdam for Detroit.
CNN highlighted one additional failing that’s relevant to the topic of Identity Resolution (my emphasis):
A timeline provided by the State Department officials, who spoke on condition of anonymity, showed that an initial check of the suspect based on his father’s information failed to disclose he had a multiple-entry U.S. visa. The reason was that AbdulMutallab’s name was misspelled. “That search did not come back positive,” said one official, who called it a quick search without using multiple variants of spelling.
What are the specific technology issues?
While the details of the technologies used by the State Department have not been disclosed, the story is typically the same in government and industry. Simple exact-match lookups are not enough. “John Kennedy” will not match “Jhon Kennedy” with standard database lookups. Furthermore, some technologies rely on strategies that actually destroy the forensic integrity of the data: they force it into pre-existing molds in a variety of ways in order to perform similarity matching. We’ve addressed the many challenges of matching names on this blog in the past, especially in “Playing the Name Game with Terrorist Watch Lists and Shoplifter Databases”.
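To make the “Jhon Kennedy” problem concrete, here is a minimal Python sketch. It is not the State Department's system or our product's algorithm; the function names and the 0.85 threshold are assumptions chosen for illustration, and the similarity score comes from the standard library's difflib module rather than from a specialized name-matching algorithm.

```python
from difflib import SequenceMatcher


def exact_match(query, record):
    """Mimic a standard equality lookup: the strings must be identical."""
    return query == record


def similarity(query, record):
    """Return a 0.0-1.0 similarity score, ignoring letter case."""
    return SequenceMatcher(None, query.lower(), record.lower()).ratio()


def fuzzy_match(query, record, threshold=0.85):
    """Treat two names as a candidate match when the score clears the
    threshold. The 0.85 default is an arbitrary illustrative value."""
    return similarity(query, record) >= threshold


if __name__ == "__main__":
    stored = "John Kennedy"
    typed = "Jhon Kennedy"
    print(exact_match(typed, stored))   # False: one transposition defeats equality
    print(fuzzy_match(typed, stored))   # True: the names are still highly similar
```

Note that the stored name is never altered or re-encoded here; the comparison works on the data as entered, and only the threshold decides how tolerant the search is.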
Indexing is one approach that can fail. It tries to turn common names, known variations, and nicknames into identical, easily matched tokens. So John, Jack, and Johnny might all translate to “F12391”, facilitating a quick match. But what happens when John’s name, like AbdulMutallab’s, is misspelled? “Jhon” will not map to the common code, and the match will fail. Encoding is another common example that we have addressed. Algorithms like “soundex” attempt to translate words into a fuzzy phonetic equivalent. But the promise of these algorithms falls short, especially when they encounter misspellings, nicknames, and cultural variations.
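The indexing and encoding strategies described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's implementation: the nickname table is made up (only the token “F12391” comes from the text), the function names are hypothetical, and the soundex function follows the commonly published American Soundex rules.

```python
# Hypothetical nickname index: known variants share one token.
NICKNAME_INDEX = {"john": "F12391", "jack": "F12391", "johnny": "F12391"}


def index_match(a, b):
    """Match only when both names are present in the index with the same token."""
    token_a = NICKNAME_INDEX.get(a.strip().lower())
    token_b = NICKNAME_INDEX.get(b.strip().lower())
    return token_a is not None and token_a == token_b


# Soundex digit for each coded consonant; vowels, H, W, and Y carry no code.
SOUNDEX_CODES = {}
for digit, letters in (("1", "BFPV"), ("2", "CGJKQSXZ"), ("3", "DT"),
                       ("4", "L"), ("5", "MN"), ("6", "R")):
    for letter in letters:
        SOUNDEX_CODES[letter] = digit


def soundex(name):
    """Return the four-character American Soundex code for a name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    previous = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate consonants with the same code
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this, so a repeated code can recur
    return (result + "000")[:4]


if __name__ == "__main__":
    print(index_match("John", "Jack"))   # True: both are in the table
    print(index_match("Jhon", "John"))   # False: the misspelling has no token
    print(soundex("John"), soundex("Jack"))            # J500 J200
    print(soundex("Catherine"), soundex("Katherine"))  # C365 K365
    print(soundex("Robert"), soundex("Rupert"))        # R163 R163
```

In this sketch Soundex happens to give “Jhon” and “John” the same code, but it separates the nickname pair John and Jack, it separates the spelling variants Catherine and Katherine because the first letter is kept verbatim, and it merges the unrelated names Robert and Rupert. The index, meanwhile, matches only what someone thought to put in the table.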
So while merging all information into a common view or improving watchlist management might be part of the solution, they will still fail if the technology used to merge or search is not up to the task.
Not all identity resolution technologies are the same. Ours can be configured using a number of strategies to fit particular customer performance requirements, sensitivity to false positives or false negatives, and Similarity Search behaviors, including specialized name algorithms that catch misspellings, nicknames, and ordering variations.
Although the consequences are grimmer in homeland security situations, the challenges are the same for financial, healthcare, gaming, state and local government, and marketing applications. While it remains to be seen what improvements the US government will make to the people, processes, and technology used to secure the country, it’s easy to see that simple misspellings need not break that system or, for that matter, any other.