Unobtrusive Measures and Identity Resolution
Thursday, April 1st, 2010
By Mike Betron, Infoglide Director of Marketing
For decades, researchers in the social sciences have used “unobtrusive measures” as defined originally in a 1966 book by Webb, Campbell, Schwartz, and Sechrest. The idea is to collect and analyze data without disturbing the subjects of the study. For example, instead of surveying subjects to find out how many candy bars they eat each day, the subjects’ garbage is searched and the number of candy wrappers is tallied.
Social science researchers are driven to unobtrusive measures when they encounter or anticipate either intentional or unintentional bias in their subjects’ responses. For example, in the study above, a subject may be inclined to understate the number of candy bars consumed (either intentionally or unintentionally) to appear in a better light. In the case of fraud, bad actors have an even stronger motivation to deliberately distort information because they don’t want to get caught.
Data analysis using unobtrusive measures can be extremely effective for discovering fraud and risk because bad actors often provide different versions of identifying data to avoid detection. For example, suppose you’re responsible for the placement of foster children in safe homes. A key requirement is to avoid placing a child in a home where a registered sex offender lives. However, what if no sex offender shares an official address with the foster home candidate, but one does in fact live in the foster home or has a relationship with the foster home owner? How can identity resolution be used to alert you to that fact?
By using sophisticated algorithms to measure and score similarity between data fields, non-obvious relationship analytics (NORAn) helps users discover relationships between people that would otherwise go undetected. In our foster home example, NORAn could be applied to uncover the fact that a resident of the candidate foster home shares a sequential phone number with another person who shares the same address as a registered sex offender. The information highlighted in red shows matches that are at least partial (all information is fictitious).
When we see these results, it’s not hard to speculate that Sally may be Jane’s mother or daughter and that John is a boyfriend who is living with her. If Sally and John visit Jane, there could be significant risk to any foster child living with Jane. Although we can’t determine that there is a relationship here with 100% certainty, the statistical probability of a potential link to a known sex offender is high enough to warrant further investigation.
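To make the idea of scoring similarity between data fields more concrete, here is a minimal Python sketch of the chain described above: a resident whose phone number is sequential with another person's, where that person's address closely matches a registered offender's. It is only an illustration, not how NORAn or any Infoglide product actually works. The function names, the 0.9 address threshold, the "one digit apart" definition of a sequential phone number, and all of the records are assumptions made up for this example.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def address_similarity(a, b):
    """Return a 0.0-1.0 similarity score between two address strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def phones_sequential(a, b, max_gap=1):
    """Assumed rule: same length, different numbers, at most max_gap apart."""
    da = "".join(ch for ch in a if ch.isdigit())
    db = "".join(ch for ch in b if ch.isdigit())
    if not da or len(da) != len(db):
        return False
    return 0 < abs(int(da) - int(db)) <= max_gap


def find_indirect_links(resident, people, offenders, address_threshold=0.9):
    """Find people tied to the resident by phone and to an offender by address."""
    links = []
    for person in people:
        if not phones_sequential(resident["phone"], person["phone"]):
            continue
        for offender in offenders:
            score = address_similarity(person["address"], offender["address"])
            if score >= address_threshold:
                links.append((person["name"], offender["name"], round(score, 2)))
    return links


# Fictitious records, invented purely for illustration.
resident = {"name": "Resident A", "phone": "512-555-0141", "address": "12 Oak St, Austin TX"}
people = [
    {"name": "Person B", "phone": "512-555-0142", "address": "48 Elm Ave., Apt 3, Austin TX"},
    {"name": "Person C", "phone": "512-555-0199", "address": "7 Pine Rd, Austin TX"},
]
offenders = [{"name": "Offender D", "address": "48 Elm Ave Apt 3, Austin, TX"}]

links = find_indirect_links(resident, people, offenders)
print(links)
```

A link found this way is a lead for a human to investigate, not a conclusion. In practice the thresholds would need tuning against real data to keep false alarms manageable.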
Another area of social services where this is relevant is the home visit, when a social worker needs to make sure that no one with a record of violent crime lives at the house. What if a violent offender has listed his official address as his mother’s house but spends about half his time at his girlfriend’s house? A social worker who is conducting a home visit at the girlfriend’s house would want to know that.
Fortunately, a major pizza delivery company sells its data, and that data turns out to be pretty accurate because people who want their pizza delivered have to give the right address. By tapping into this data as well as other internally or externally available data, identity resolution technology can uncover the fact that the violent offender periodically uses that address as his residence.
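As a rough sketch of how delivery records might surface an unlisted residence, the Python below counts how often a person has orders delivered to each address and flags any frequently used address that differs from the official one. It assumes the records have already been resolved to a single person ID and that addresses are already normalized, which is the hard part in practice. The field names, the `min_orders` cutoff, and the data are all hypothetical.

```python
from collections import Counter


def frequent_addresses(person_id, deliveries, min_orders=3):
    """Return (address, count) pairs the person used at least min_orders times."""
    counts = Counter(d["address"] for d in deliveries if d["customer_id"] == person_id)
    return [(addr, n) for addr, n in counts.most_common() if n >= min_orders]


def unlisted_residences(person_id, official_address, deliveries, min_orders=3):
    """Frequently used delivery addresses that differ from the official address."""
    return [
        (addr, n)
        for addr, n in frequent_addresses(person_id, deliveries, min_orders)
        if addr != official_address
    ]


# Fictitious delivery records, invented purely for illustration.
deliveries = (
    [{"customer_id": "P1", "address": "9 birch ln"}] * 4
    + [{"customer_id": "P1", "address": "3 maple ct"}] * 3
    + [{"customer_id": "P2", "address": "9 birch ln"}]
)

flagged = unlisted_residences("P1", "3 maple ct", deliveries)
print(flagged)
```

A simple count like this ignores when the orders were placed. A fuller version would probably weight recent deliveries more heavily.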
Many health and human services organizations use data matching technology with varying levels of sophistication but, as these examples show, data matching alone is not enough. Social workers also need to be made aware of non-obvious relationships and be able to understand them.

