Comments for Identity Resolution Daily http://identityresolutiondaily.com All About Identity and Entity Resolution Mon, 26 Oct 2009 19:08:03 +0000

Comment on Avoiding False Positives: Analytics or Humans? by Ken O'Connor http://identityresolutiondaily.com/639/avoiding-false-positives-analytics-or-humans/#comment-701 Thu, 15 Oct 2009 14:47:36 +0000

Robert,

Excellent post on a highly sensitive issue.

I would like to use an analogy to highlight the intrinsic advantage analytics have over humans.

Yahoo used to be the leading Search Engine. Yahoo employed people to decide which websites should get the highest ranking. In contrast, from day one Google used the “Algorithm”. Yahoo’s approach was never sustainable, given the exponential growth in the number of websites.

I have worked on the development of Anti-Money Laundering (AML) systems. AML systems perform Financial Transaction Monitoring. They could not function without analytics. They monitor Transaction Activity on millions of accounts. The purpose of the analytics is to identify “Transaction Activity that is unusual when compared to an account holder’s peers”. The AML system alerts a human to study the unusual activity. The human then seeks to “explain away” the unusual activity as ‘normal’, e.g. a one-off sale of an asset. If the human cannot find a good reason for the unusual transaction activity, they report it to the authorities as “Suspicious”.
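To make the peer comparison concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular AML product works: the function name, the peer-group labels, the sample totals and the threshold of 5 are all assumptions, and the median-based score is just one simple way to measure "unusual when compared to peers".

```python
from statistics import median


def flag_unusual_accounts(totals, peer_group, threshold=5.0):
    """Return accounts whose total is far from their peer-group median.

    totals: account id -> total transaction amount (hypothetical input)
    peer_group: account id -> peer-group label (hypothetical input)
    """
    # Collect every account's total under its peer group.
    by_group = {}
    for account, total in totals.items():
        by_group.setdefault(peer_group[account], []).append(total)

    flagged = []
    for account, total in totals.items():
        peers = by_group[peer_group[account]]
        centre = median(peers)
        # Median absolute deviation: a spread measure that a single
        # extreme account cannot inflate the way a standard deviation can.
        mad = median(abs(value - centre) for value in peers)
        if mad > 0 and abs(total - centre) / mad > threshold:
            flagged.append(account)
    return flagged


# Illustrative data: six retail accounts, one with a very large total.
totals = {"A1": 1000, "A2": 1100, "A3": 950, "A4": 1050, "A5": 1020, "A6": 25000}
peer_group = {account: "retail" for account in totals}
print(flag_unusual_accounts(totals, peer_group))  # ['A6']
```

The flagged account is only an alert. As described above, a human still has to decide whether the activity can be explained as normal or should be reported.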

In my opinion, AML systems provide a good example of the pragmatic combining of analytics and humans - for the good of society.

I completely agree with your quote: “analytics are ethically neutral and the risk of something going ‘to the dark side’ is the risk that comes from the people involved, with or without analytics.”

Rgds Ken

Comment on Applying Identity Resolution to Patient Identification Integrity by Henrik Liliendahl Sørensen http://identityresolutiondaily.com/605/applying-identity-resolution-to-patient-identification-integrity/#comment-685 Mon, 10 Aug 2009 05:56:13 +0000

Avoiding duplicate patients can be a very different task depending on which country you are in.

In Scandinavia every citizen is assigned a unique citizen ID that is used throughout healthcare as well as in other areas such as elections, driving licenses, welfare and so on.

A recent improvement is that the ID is assigned to newborns by health care staff, as close to the root as possible, as one might put it.

More here: http://liliendahl.wordpress.com/2009/08/05/sweden-meets-united-states/

Comment on The Myth of Matching: Why We Need Entity Resolution by Steve Sieloff http://identityresolutiondaily.com/493/the-myth-of-matching-why-we-need-entity-resolution/#comment-679 Mon, 13 Jul 2009 22:57:17 +0000

John,

Another great post and on point! I find it very interesting to link “point in time” occupancies to the current-state location of an entity. Public records, while fruitful, are spotty in availability and lack many standard data quality measures. Name distributions within a given geography (zip or zip+4) help in making links between names with materially different addresses. Take Zawarek Timonsky at 123 Main St and Zawarek Timonsky at 456 Elm Dr in the same zip code, where only one Zawarek first name and 3 Timonsky surnames are known: the unique combination creates a high degree of confidence that we are talking about the same person, even with differing addresses.
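A rough sketch of that reasoning in Python follows. It is an assumption-laden illustration rather than a description of any vendor's method: the record layout, the zip code "78701", the population of 100 and the independence assumption between first names and surnames are all hypothetical.

```python
from collections import Counter


def expected_name_matches(records, zip_code, first, last):
    """Estimate how many residents of zip_code would carry this exact
    first/last name combination if first names and surnames were
    distributed independently. Returns None for an unknown zip code."""
    in_zip = [r for r in records if r["zip"] == zip_code]
    if not in_zip:
        return None
    first_counts = Counter(r["first"] for r in in_zip)
    last_counts = Counter(r["last"] for r in in_zip)
    return first_counts[first] * last_counts[last] / len(in_zip)


# Hypothetical reference population for one zip code: 100 residents,
# one Zawarek, three Timonskys, and 97 people named John Smith.
records = (
    [{"zip": "78701", "first": "Zawarek", "last": "Timonsky"},
     {"zip": "78701", "first": "Anna", "last": "Timonsky"},
     {"zip": "78701", "first": "Piotr", "last": "Timonsky"}]
    + [{"zip": "78701", "first": "John", "last": "Smith"}] * 97
)

rare = expected_name_matches(records, "78701", "Zawarek", "Timonsky")
common = expected_name_matches(records, "78701", "John", "Smith")
print(rare < 1 < common)  # True
```

A value far below 1 suggests that two records carrying the name in that zip code are unlikely to be different people, which is the intuition behind linking them despite differing addresses. A value well above 1, as with the common name here, gives no such support.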

As for the example of St. in an address not always meaning Street, it is clear that the software causing the incorrect classification and standardization is not looking at both the keyword AND the pattern or semantics in which the keyword or phrase is used. This type of semantic parsing and standardization is gaining traction in document classification and phrase searching (e.g. Google).
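As a small illustration of looking at both the keyword and its position, here is a hedged Python sketch. The rule it uses (St followed by a capitalized name means Saint, otherwise Street) is an assumed heuristic, not how any specific data quality tool works, and the sample addresses and the word list are made up for the example.

```python
# Words that may follow a street type without being part of a saint's name.
# This list is illustrative and far from complete.
NON_NAME_FOLLOWERS = {
    "apt", "apartment", "unit", "suite", "ste", "floor", "fl",
    "n", "s", "e", "w", "ne", "nw", "se", "sw",
}


def expand_st(address: str) -> str:
    """Expand 'St'/'St.' to 'Saint' or 'Street' based on what follows it."""
    tokens = address.split()
    expanded = []
    for i, token in enumerate(tokens):
        core = token.rstrip(".,")
        if core.lower() != "st":
            expanded.append(token)
            continue
        # A trailing comma marks the end of the street part of the address.
        trailing_comma = "," if token.endswith(",") else ""
        next_core = tokens[i + 1].rstrip(".,") if i + 1 < len(tokens) else ""
        followed_by_name = (
            not trailing_comma
            and next_core[:1].isupper()
            and next_core.lower() not in NON_NAME_FOLLOWERS
        )
        expanded.append(("Saint" if followed_by_name else "Street") + trailing_comma)
    return " ".join(expanded)


for sample in ["123 Main St", "12 St. Stephen's Green", "45 St. John St."]:
    print(expand_st(sample))
```

The same keyword is expanded differently depending on the pattern around it. A production parser would need much richer context than this single positional rule.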

Keep up the thought-provoking articles!

Steve

Comment on What’s the Data Quality Business Message? by Dylan Jones http://identityresolutiondaily.com/594/what%e2%80%99s-the-data-quality-business-message/#comment-678 Thu, 09 Jul 2009 12:39:59 +0000

Hi Robert,

I would have to agree with Ted but I think “reasonably poor” is a little gracious!

I think that part of the problem is that historically, the data quality technology companies employed a lot of very technically minded people who lacked the specific vertical experience that so many businesses need.

What I have definitely seen in recent months is a much sharper focus on business products that don’t put much emphasis on DQ, although underneath the covers is the same old DQ engine. A couple of vendors are doing this well but the vast majority are just pushing the tired old messages.

There has been plenty of online research, for example, demonstrating that if your message is focused on you (the company) and not you (the customer), the engagement can be lost, literally in seconds.

Looking forward to your new messaging. Why not use your community here to help you shape it?

Comment on The Growing Role of Identity Resolution in MDM by Dan Power http://identityresolutiondaily.com/527/the-growing-role-of-identity-resolution-in-mdm/#comment-674 Tue, 16 Jun 2009 20:07:45 +0000

Recognizing that identity resolution is a core element of MDM is not the same thing as having the best technology in that area.

I think the IBM and Initiate identity resolution offerings are pretty strong, but for those who have chosen to license Informatica’s solution, that could be increasingly problematic, as I expect Informatica to gradually transition to being a full-fledged MDM vendor itself.

I think you’ll see the Informatica licensees eventually develop their own identity resolution solutions, or perhaps acquire companies like Infoglide.

Comment on The Growing Role of Identity Resolution in MDM by Clayton Forrester http://identityresolutiondaily.com/527/the-growing-role-of-identity-resolution-in-mdm/#comment-673 Tue, 16 Jun 2009 11:33:06 +0000

Of course Identity Resolution is very important for MDM - but most of the serious MDM vendors have recognized this for years!

Of the vendors you list in your post - IBM owns Entity Analytics, Initiate has their own sophisticated identity resolution, and Oracle, Siperian and Purisma/D&B have been licensing Informatica’s Identity resolution for at least 4 years.

The only one who has not got the “message” is SAP.

Comment on The Growing Role of Identity Resolution in MDM by Dan Power http://identityresolutiondaily.com/527/the-growing-role-of-identity-resolution-in-mdm/#comment-665 Thu, 21 May 2009 18:12:15 +0000

Jairaj, you’re right that robust Identity Resolution is critical, particularly in the financial services industry. False positives and false negatives can carry big consequences from a risk and exposure management perspective.

IBM’s Entity Analytic Solution and Infoglide’s Identity Resolution Engine are the two leading products, and D&B’s corporate hierarchy information can be helpful as well, when you’re trying to build business-to-business relationships.

Wikipedia has a good write-up on the subject at http://en.wikipedia.org/wiki/Identity_resolution

Comment on The Growing Role of Identity Resolution in MDM by Jairaj http://identityresolutiondaily.com/527/the-growing-role-of-identity-resolution-in-mdm/#comment-664 Thu, 21 May 2009 16:15:23 +0000

Agreed, identity resolution is critical in certain industries. Risk management and the reduction of exposure to defaults by clientele have made banks focus aggressively on identity resolution. Considering that every person is related to every other within six degrees of separation, whom you know has a huge bearing on how your identity gets resolved. Recently, while I was attending an information management technical conference hosted by IBM, there was huge interest in the Entity Analytic Solutions offering, which has many features where identity is resolved based on anonymous factors and relationship factors.

Comment on Stuck in the Middle by Francois Wolf http://identityresolutiondaily.com/513/stuck-in-the-middle/#comment-656 Sat, 02 May 2009 19:10:57 +0000

The debate between freedom and security is as old as the democratic ideal. As one of the pillars of individual freedom, privacy is a matter of capital concern for citizens in all free societies. Modern terrorism, with its willingness to break the most basic rules of human behavior, poses a special challenge to leaders who can apply advanced knowledge that may be effective but may also undermine some of our basic democratic values.

Technology like identity resolution, developed and deployed by responsible and moral leaders, can safeguard privacy while substantially “hardening” our societies as targets for evildoers. Just as a missile launch sequence can be designed to make an unauthorized firing virtually impossible, protocols can be created to make privacy a top concern while scanning massive amounts of personal data for people to whom freedom is abhorrent. Technology experts who are on the front lines of the defense of our societies need to cultivate strength and good judgment as they handle the double-edged swords developed to guarantee the survival of our ideals. It’s a mission that is full of pitfalls but one that must be carried out. No one ever said that Democracy is a suicide pact.

Comment on The Myth of Matching: Why We Need Entity Resolution by Daragh O Brien http://identityresolutiondaily.com/493/the-myth-of-matching-why-we-need-entity-resolution/#comment-638 Sun, 29 Mar 2009 09:47:49 +0000

John,

Great post. You may recall the slides I’ve used at IAIDQ conferences about my name and how it got me into information quality at an early age.

13+ spelling variants, can be male/female, can be miskeyed as Tara, or mangled to be Darren, Daryn, Daryl (also a male/female name), Dora (hence my love of exploring). And let’s not get started on my home address as a kid which seems to still confuse data quality tools (here’s a hint… St. in an address is not always an abbreviation of “street”). I have other examples…

I think one of the mental gear-shifts that needs to be made when looking at these issues is to remember that data is a representation of a real world thing (in this case a person). It is not the thing itself. When we are elbow-deep in the data it can be all too easy to lose sight of that.

Looking forward to the follow-ups to this.
