Archive for the ‘Customer Data Integration’ Category

Architectures for Entity Resolution-Part 2

Wednesday, March 10th, 2010

By John Talburt, PhD, CDMP, Director, UALR Laboratory for Advanced Research in Entity Resolution and Information Quality (ERIQ)

In the last post we examined how entity resolution (ER) systems are actually implemented, starting with the most basic merge/purge process and heterogeneous join systems. Both of these approaches focus on collecting equivalent references from among the sources provided, either as a large batch of references in a single file, or through queries against a federation of databases.  The entity identities found by these ER systems are transient in the sense that they depend upon the sources input into the process.  When different sources are provided, different identities will emerge.

On the other hand, there are ER systems that retain and manage identity information.  By doing this they are able to “recognize” the same identity over time and assign that identity the same entity identifier (sometimes called “persistent identifiers” or “persistent links”).  In Customer Data Integration (CDI) applications, these kinds of systems are sometimes called Customer Recognition Systems.

Two major types of ER systems perform identity management.  The first type is the “identity resolution” system.  It is most effective in situations where a fairly stable set of known identities of interest exists, such as the set of vendors or customers of a company, a set of products, or the students enrolled in a school.  The attributes of these identities are pre-loaded into the system and assigned identifiers.  When a reference is given to the system, it then decides whether the reference is to one of the known identities, and if so, returns the identifier of that identity.
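As a rough illustration of the pre-load and lookup flow described above, here is a minimal Python sketch. The class name `IdentityResolver`, the choice of name plus postal code as identifying attributes, and the exact-match rule after normalization are all assumptions made for the example; production systems typically use far richer attributes and fuzzy matching.

```python
from typing import Dict, Optional, Tuple


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(value.lower().split())


class IdentityResolver:
    """Hypothetical identity resolution store: known identities are loaded first."""

    def __init__(self) -> None:
        # Maps (normalized name, normalized postal code) -> identifier.
        self._index: Dict[Tuple[str, str], str] = {}

    def preload(self, identifier: str, name: str, postal_code: str) -> None:
        """Register a known identity and the identifier assigned to it."""
        self._index[(normalize(name), normalize(postal_code))] = identifier

    def resolve(self, name: str, postal_code: str) -> Optional[str]:
        """Return the identifier of the matching known identity, or None."""
        return self._index.get((normalize(name), normalize(postal_code)))


resolver = IdentityResolver()
resolver.preload("C001", "Acme Supply Co", "72204")
resolver.preload("C002", "Baker Tools", "78701")

known = resolver.resolve("  ACME  supply co ", "72204")
unknown = resolver.resolve("Unknown Vendor", "00000")
print(known, unknown)
```

A reference that matches no pre-loaded identity simply returns `None`; the system does not create a new identity for it.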

Identity resolution systems can operate in either batch or transactional mode.  In cases where there are a large number of pre-stored identities, the performance of batch operations can be improved through distributed processing where the identities are partitioned over multiple processors and resolved in parallel.
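The partitioning idea can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, identities are keyed by a single already-normalized string, matching is exact, and a thread pool stands in for the separate processors or machines a real deployment would use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional, Tuple
from zlib import crc32


def partition_of(key: str, n_partitions: int) -> int:
    """Stable hash so an identity and a reference to it land in the same partition."""
    return crc32(key.encode("utf-8")) % n_partitions


def build_partitions(identities: Dict[str, str], n_partitions: int) -> List[Dict[str, str]]:
    """Split the pre-stored identities (key -> identifier) across partitions."""
    partitions: List[Dict[str, str]] = [dict() for _ in range(n_partitions)]
    for key, identifier in identities.items():
        partitions[partition_of(key, n_partitions)][key] = identifier
    return partitions


def resolve_batch(references: List[str], partitions: List[Dict[str, str]]) -> List[Optional[str]]:
    """Resolve a batch of references, one worker per partition, preserving input order."""
    n = len(partitions)
    grouped: List[List[Tuple[int, str]]] = [[] for _ in range(n)]
    for position, key in enumerate(references):
        grouped[partition_of(key, n)].append((position, key))

    def resolve_group(i: int) -> List[Tuple[int, Optional[str]]]:
        return [(pos, partitions[i].get(key)) for pos, key in grouped[i]]

    results: List[Optional[str]] = [None] * len(references)
    with ThreadPoolExecutor(max_workers=n) as pool:
        for group_result in pool.map(resolve_group, range(n)):
            for pos, identifier in group_result:
                results[pos] = identifier
    return results


identities = {"acme supply co": "C001", "baker tools": "C002", "cole freight": "C003"}
partitions = build_partitions(identities, 2)
batch_result = resolve_batch(["baker tools", "nobody", "acme supply co"], partitions)
print(batch_result)
```

Because each reference is routed by the same hash used to place the identities, every worker only needs its own slice of the identity store.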

However, there are many situations where the identities are not necessarily known in advance, or where the entities are known but simply not organized in such a way that they can be easily pre-loaded.  For example, suppose two companies merge and each company has its own customer database. The customers are identified in different ways in each database, and furthermore, for the customers of one company, poor systems and practices leave no confidence that the master records are free of duplicates across business lines or company locations.

The type of system often applied in these situations is an “identity capture” system.  The identity capture architecture can be seen as a hybrid of  merge/purge and identity resolution systems.  It supports identity management and persistent identifiers, but without starting with a preloaded set of identities.  In my next post, we’ll delve deeper into the identity capture process.
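To make the contrast with identity resolution concrete, here is a minimal Python sketch of the capture idea: the store starts empty, assigns a new persistent identifier the first time an identity is seen, and returns that same identifier on later encounters. The class name `IdentityCapture`, the identifier format, and the exact-match rule on normalized name and postal code are assumptions for illustration only, not a description of any particular product.

```python
import itertools
from typing import Dict, Tuple


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(value.lower().split())


class IdentityCapture:
    """Hypothetical identity capture store: no identities are pre-loaded."""

    def __init__(self) -> None:
        self._index: Dict[Tuple[str, str], str] = {}
        self._counter = itertools.count(1)

    def capture(self, name: str, postal_code: str) -> str:
        """Return the persistent identifier for a reference, creating one if needed."""
        key = (normalize(name), normalize(postal_code))
        if key not in self._index:
            # First time this identity is seen: mint and remember an identifier.
            self._index[key] = f"E{next(self._counter):06d}"
        return self._index[key]


store = IdentityCapture()
first_batch = [store.capture("Jane Doe", "72204"), store.capture("JANE  DOE", "72204")]
later_batch = [store.capture("jane doe", "72204"), store.capture("John Roe", "78701")]
print(first_batch, later_batch)
```

The identifier assigned in the first batch is returned again in the later batch, which is what distinguishes this from the transient identities of a plain merge/purge run.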

Is MDM Dead?

Wednesday, March 3rd, 2010

By Mike Shultz, Infoglide Software CEO

Andrew White of Gartner recently posed a question about whether master data management (MDM) is dead. He didn’t actually suggest that the demise of master data management is imminent. He was challenging whether our current terminology adequately clarifies the current reality about MDM and associated product areas.

Certainly the terms describing many markets and types of products are being associated with MDM. Jackie Roberts of DATAForge pointed out that the definition of MDM now seems to include “data integrity, data quality, entity resolution, matching, data integration, governance, metrics and analysis.”

While entity resolution was mentioned in her list, our obsessive focus on entity resolution (aka identity resolution) leads to the conclusion that, rather than being subsumed, its role is growing. Wayne Eckerson at TDWI seems to agree that identity resolution is a critical component of the recent MDM acquisitions. In his post about the acquisitions by Informatica and IBM of Siperian and Initiate Systems, respectively, he described the two transactions this way:

“You could say that Siperian is mostly MDM, but with identity resolution and other capabilities, whereas Initiate is mostly about identity resolution, but with MDM and other capabilities.”

Identity resolution is becoming an integral part of many product areas. Within MDM itself, creating a single-entity view is best done with an identity resolution engine. Data mining is greatly enhanced by the addition of entity resolution. Dan Power of Hub Solution Designs wrote about how key identity resolution is to data matching. We’ve talked about how social CRM can resolve identities of individuals across multiple disparate data sources using identity resolution, as well as “rationalize multiple variations and errors and anomalies that block finding existing customers within their systems”.

Although identity resolution technology has been years in the making, it has only recently risen into the consciousness of most analysts and customers. Because of its ability to bring enhanced clarity to ambiguous data, advanced identity resolution is now beginning to have a significant impact across many data-centered disciplines.

Identity Resolution Daily Links 2010-03-01

Monday, March 1st, 2010

By the Infoglide Team

IT-Director.com: The Informatica Event

[Philip Howard] “To begin with, the company talked about its acquisition of Siperian. I have already commented on this but one point that emerged at the conference was the way that Informatica describes Siperian as infrastructure MDM as opposed to application MDM. This is a hitherto unrecognised distinction (with respect to terminology) in the MDM market. Informatica distinguishes the former from the latter by saying that infrastructure MDM is domain and data model independent.”

Workforce Management: Medical Clinic Owners Plead No Contest to $60 Million Workers’ Compensation Fraud

“Investigators alleged that the pair purchased thousands of workers’ compensation client referrals from an attorney television advertising service. Clients were then sent to doctors who had a relationship with Premier, which would handle billing and collection work in return for a 50 percent fee for money they collected. Clients were then sent to attorneys who had a business relationship with Fish and Bacino, investigators allege. ‘Getting kickbacks for referring medical payments is illegal and drives up the costs in the system,’ California Insurance Commissioner Steve Poizner said in a statement.”

SignalScape: DC Police Chief Cathy Lanier Describes How Technology Is Changing Police Work in the Capitol

“The MPD also established a fusion center, which is responsible for the national capitol region. From a homeland security perspective, Chief Lanier said that the center collects and stores crime and terror alerts into a data warehouse.”

Injured Workers’ Law Firm Blog: Insurance Fraud Is a Huge Crime

“The fraudulent claims that can be made through insurance companies are categorized as being soft or hard. Soft fraud is the most common type of fraud and usually takes place when someone exaggerates a claim being made. Hard fraud takes place when someone deliberately plans a deceptive act such as a collision or the theft of their vehicle.”

Identity Resolution Daily Links 2010-02-09

Tuesday, February 9th, 2010

By the Infoglide Team

ovum: Informatica finally plugs MDM gap

“MDM now creates another competitive front for Informatica against rivals and complicates some partial relationships - notably Oracle, which includes Informatica’s identity resolution software as part of its Siebel Universal Customer Master (UCM) MDM engine, as well as some parts of its data quality software. Informatica also has OEM relationships with IBM and DataFlux for address cleansing that might need revisiting.”

ovum: IBM acquires Initiate Systems to strengthen healthcare solutions

“Being acquired by a large player such as IBM also raises the question of whether Initiate will be able to unfold its potential under the large IBM umbrella, or whether it will wither and sink into oblivion alongside the multitude of applications in IBM’s broad portfolio. This will be a test of how well IBM integrates small but high-performing companies.”

TMCnet Healthcare Technology: ECRI Guides Hospitals on Electronic Health Record Implementation

“Electronic health records, or ‘EHRs,’ are the future of medical record keeping. The American Recovery and Reinvestment Act, or ‘ARRA,’ includes incentive payments for hospitals that adopt an EHR, but the timetable for implementation is tight. To qualify for the full payment, hospitals will require proving ‘meaningful use’ by October 2012.”

2010 TDI Fraud Conference: Texas Workers’ Compensation Fraud

“Workers’ comp fraud indicators… Frequent additions and cancellations of coverage, especially if several business entities appear to be owned or controlled by the same person or group”

Identity Resolution Daily Links 2010-02-05

Friday, February 5th, 2010

[Post from Infoglide] And Then There Were Two

“IBM announced today that it plans to buy MDM vendor Initiate Systems.  As hypothesized here in this blog last week, the move was not entirely unexpected, but on the heels of last week’s announcement by Informatica to purchase Siperian, it certainly creates yet another wave in the marketplace.  More moves are certain to take place as competing companies align – and realign – their Single Entity View (SEV) strategies.  The key to this realignment will be for current industry players to maximize their functionality beyond ‘playing with matches’.  That dated view of fuzzy matching is no longer enough.  Not for the large data quality vendors.  Certainly not for the customer.”

Information Week: Global CIO: IBM Data Strategy Is Flawed, Say Kalido And Informatica

“Noting that Initiate’s product is specifically designed to handle only certain types of data—customer data and product data—Kalido CEO Hewitt says, ‘Where they have struggled is in mastering multiple domains, even though they advertise their products as such. The problem is that as you add domains, the complexity of the data relationships expands exponentially. So one domain might have 100 relationships, two domains 300 relationships, 10 domains 3,000 relationships. So when one master data element changes, hundreds of relationships could change, which requires a governance process to manage it.’”

Columbia Daily Tribune: Networks advance child-trafficking investigation

“Watson called up a contact at the El Paso Intelligence Center (EPIC), a fusion center that combines intelligence from federal law enforcement and state and military sources. Watson also called a friend at U.S. Immigration and Customs Enforcement and asked him to prepare a ‘serious incident report.’ ICE mobilized an officer specializing in human trafficking within minutes, Watson said.”

ITBusinessEdge: How Big Deals Affect MDM Competitors, Customers

“But the general upheaval in MDM aside, the IBM deal is interesting in another way. IBM has downplayed this as an MDM acquisition, positioning it more as buying into two verticals, health care and government. Gartner’s Andrew White writes that at one point during the briefing, IBM was asked what the Initiate acquisition meant for MDM. IBM responded it reflects a ‘verticalization of MDM.’ White writes that’s good news for health care customers, but ‘troubling for IBM MDM product strategy.’”

Identity Resolution Daily Links 2009-09-28

Monday, September 28th, 2009

[Post from Infoglide] Social CRM, CDI, and Identity Resolution

“In her well-read book on CDI, Jill Dyché offers a definition of CDI that also seems to describe social CRM. Try reading her definition of CDI, replacing ‘CDI’ with ‘social CRM’: CDI is a set of procedures, controls, skills and automation that standardize and integrate customer data originating from multiple sources.”

Concord Monitor: Don’t play games when giving your name

“What do they want? Your date of birth, your gender and your middle initial. This information will be relayed to the TSA, and the TSA will match the information against information maintained by the Terrorist Screening Center (an arm of the FBI that gathers and consolidates watch lists). The theory is that a 12-year-old boy named John X. Doe can more easily be separated from John Z. Doe, who happens to be a 37-year-old man with a history of making bombs, if additional information is collected during the booking process. Once TSA has cleared you, you’ll be issued a boarding pass.”

pressdemocrat.com: Achieving paperless health care

“Medical record-keeping, until recently, relied on rooms full of paper files that were easily misplaced and filled with hurried, handwritten entries that could be hard to read. Electronic records hold orderly, keyboard-entered data that never leaves a hard drive and have the potential to move seamlessly from a primary care provider’s office to an emergency room or specialist’s suite.”

ebizQ: MDM Becoming More Critical in Light of Cloud Computing

[David Linthicum] “We’re moving from complex federated on-premise systems, to complex federated on-premise and cloud-delivered systems.   Typically, we’re moving in these new directions without regard for an underlying strategy around MDM, or other data management issues for that matter.”

Homeland Security: I&A Reconceived: Defining a Homeland Security Intelligence Role

“There are currently 72 fusion centers up and running around the country (a substantial increase from 38 centers in 2006).  I&A has deployed 39 intelligence officers to fusion centers nationwide, with another five in pre-deployment training and nearly 20 in various stages of administrative processing.  I&A will deploy a total of 70 officers by the end of FY 2010, and will complete installation of the Homeland Secure Data Network (HSDN), which allows the federal government to share Secret-level intelligence and information with state and local partners, at all 72 fusion centers.”

Identity Resolution Daily Links 2009-09-25

Friday, September 25th, 2009

By the Infoglide Team

[Post from Infoglide] Social CRM, CDI, and Identity Resolution

“In her well-read book on CDI, Jill Dyché offers a definition of CDI that also seems to describe social CRM. Try reading her definition of CDI, replacing ‘CDI’ with ‘social CRM’: CDI is a set of procedures, controls, skills and automation that standardize and integrate customer data originating from multiple sources(1).”

Charleston Daily Mail: Former owner of WVa trucking company sentenced

“Leonard Cline formerly owned H & H Trucking. The insurance commissioner says he defrauded the old state workers’ compensation system of more than $500,000 in unpaid premiums, penalties and claims for benefits over about 10 years.”

WTVQ: Eight People Indicted for Insurance Fraud

“The US attorney’s office says the suspects intentionally damaged insured automobiles owned by other conspirators then filed claims.”

KansasCity.com: Push for electronic medical records picks up steam

“With or without health care reform this year, electronic medical records are picking up steam. Recent technological advances are easing the transition for doctors and hospitals, and there’s the little matter of the Health Information Technology for Economic and Clinical Health Act. The act, part of last spring’s stimulus package, included billions of dollars to ‘advance the use of health information technology.’ There’s plenty of advancing to do, with one group estimating that less than half the hospitals and only one in five physicians are equipped to fully use electronic records. ‘The United States is far more advanced in grocery store technology than in medical records technology,’ said Steve Lieber, president and chief executive officer of the Healthcare Information and Management Systems Society in Chicago.”

pnj.com: Man charged with workers’ comp fraud

“Florida Chief Financial Officer Alex Sink announced the arrest today in a news release. In the release, Sink said her Division of Insurance Fraud said Soto is charged with falsifying employment numbers with the intent of avoiding higher workers’ compensation premium payments.”

Federal News Radio: Update: Identity management in the Obama administration

“The alphabet soup of identity management programs from the Bush administration — HSPD-12, TWIC, Real ID, and many more — have gotten little attention publicly during the first nine months of the Obama presidency. But that doesn’t mean identity management has been ignored totally, says one senior administration official.”

London Evening Standard: Lloyd’s chief warns of more insurance fraud

“Lloyd’s of London’s chief executive Richard Ward today warned the deep recession would increase the number of fraudulent claims being made against the insurance market.”

Computerworld: Laptop searches at airports infrequent, DHS privacy report says

“The U.S. Department of Homeland Security’s annual privacy report card revealed more details on the agency’s controversial policy involving searches of electronic devices at U.S. borders. . . . For instance, numbers released in the report indicate that warrantless searches of electronic devices at U.S. borders are occurring less frequently than some privacy and civil rights advocates might have feared. Of the more than 144 million travelers that arrived at U.S. ports of entry between Oct. 1, 2008 and May 5, 2009, searches of electronic media were conducted on 1,947 of them, the DHS said. Of this number, 696 searches were performed on laptop computers, the DHS said. Even here, not all of the laptops received an ‘in-depth’ search of the device, the report states. A search sometimes may have been as simple as turning on a device to ensure that it was what it purported to be. U.S. Customs and Border Protection agents conducted ‘in-depth’ searches on 40 laptops, but the report did not describe what an in-depth search entailed. . . . The report chronicled similar efforts to monitor the privacy implications of a range of projects that privacy groups are also watching. Examples include Einstein 2.0 network monitoring technology that improves the ability of federal agencies to detect and respond to threats, and the Real ID identity credentialing program. The DHS’s terror watch list program, its numerous data mining projects and the secure flight initiative were also mentioned in the report.”

Social CRM, CDI, and Identity Resolution

Wednesday, September 23rd, 2009

By Robert Barker, Infoglide Senior VP & Chief Marketing Officer

In her well-read book on CDI, Jill Dyché offers a definition of CDI that also seems to describe social CRM. Try reading her definition of CDI, replacing “CDI” with “social CRM”:

CDI is a set of procedures, controls, skills and automation that standardize and integrate customer data originating from multiple sources(1).

In fact, Ray Wang of A Software Insider’s Point of View suggests that social CRM initiatives could be made more effective by leveraging MDM technology. In a recent post he listed key questions that social CRM and other relationship management initiatives like CDI have to answer:

1.    Do we know the identity of the individual?
2.    Can we tell if there are any apparent and potential relationships?
3.    Are they advocates or detractors?
4.    How do we know whether or not we have a false positive?
5.    What products and services have been purchased in the past?
6.    Have we assessed how much credit risk we can be exposed to?
7.    What pricing and entitlements are customers eligible for?

So how exactly can social CRM systems resolve identities of individuals across multiple disparate data sources? How can they rationalize multiple variations and errors and anomalies that block finding existing customers within their systems?

The obvious answer is identity resolution. We highlighted in an earlier post that Dyché declared that identity resolution supports and enhances five of the eight core MDM functions enumerated in her book with Evan Levy. Similarly, identity resolution is critical in accurately answering key questions about identity in social CRM.

Ray’s list of questions can be divided into two sets. Accurately answering the first set related to identity and relationships (questions 1, 2, and 4) is critical to answering the rest of the questions. If we blow it on identity, it is impossible to make sense of social CRM data.

Social media marketing and social CRM are becoming more and more mainstream. If you want to get more familiar with social media marketing and social CRM, Paul Gillin’s recent book is a great way to get started.

If you’re already familiar and want to comment or take issue with this post, let us hear from you.

(1)Dyché, Jill and Levy, Evan. Customer Data Integration: Reaching a Single Version of the Truth. John Wiley & Sons, Inc. 2006. Page 274.

Identity Resolution Daily Links 2009-08-14

Friday, August 14th, 2009

[Post from Infoglide] Vetting Sharks and Whales

“If you’re not in the casino industry, the title of this post may be meaningless, but for casino managers, ‘sharks’ are the bad guys and ‘whales’ are the good guys. Sharks are people who try to defraud the casino through illegal activities, while whales are the high rollers who are apt to win $20,000 one trip and lose $25,000 the next. If there’s any environment where you’d be motivated as a businessperson to know as much as you can about who you’re dealing with, it’s a casino.”

DATAWARE HOUSING: Business Intelligence and Identity Recognition—IBM’s Entity Analytics

“This article will define master data management (MDM) and explain how customer data integration (CDI) fits within MDM’s framework. Additionally, this article will provide an understanding of how MDM and CDI differ from entity analytics, outline their practical uses, and discuss how organizations can leverage their benefits.”

Workers’Comp Kit Blog: Failure to Pay Workers Compensation Premiums

“A New York asbestos contractor failed to pay $1.6 Million in workers’ compensation premiums and will serve four years in prison. Upon his release he will be deported to his home country as he is an illegal immigrant… He repeatedly changed the name of his company.”

The TSA Blog: Secure Flight Q&A II

“Each one of these layers alone is capable of stopping a terrorist attack. In combination their security value is multiplied, creating a much stronger, formidable system. A terrorist who has to overcome multiple security layers in order to carry out an attack is more likely to be pre-empted, deterred, or to fail during the attempt.”

Identity Resolution Daily Links 2009-07-17

Friday, July 17th, 2009

[Post from Infoglide] iPhones, Identity Resolution, and Cloud Computing

“A personal favorite saying for years has been “invention is the mother of necessity” (a twist on the original saying, of course). It aptly conveys what has driven the high tech industry for the last several decades. Principles like Moore’s Law and its equivalent for the internet have created unanticipated waves of computing and networking power. All that available power has released the combined creativity of tens of thousands of engineers and marketers who dreamed up ways of interacting and managing our lives and businesses that were inconceivable 30 years ago…”

Liliendahl on Data Quality: Match Destinations

“When matching party data – names and addresses – very often it is not just only about hitting similar records, but also about performing some form of transformation with the data before, during and after the hitting.”

Tech Law Notes: Health IT & Open Source

“Repeatedly, I hear the refrain that this stimulus money is going to go to systems that can be put to a “meaningful use,” and that is going to exclude rogue open source Health IT developers from being funded, squelching innovation in the market place.  I imagine that complying with the security regulations under HIPAA probably hinder innovation, too, but they increase the reliability of the system vendors that remain in the market place and reduce the risk to the data of patients that might be in their computer systems.”

The Data Doghouse: People, Process & Politics: Integration Portfolio

“Existing IT projects may be under the label of: Corporate Performance Management (CPM), Master Data Management (MDM), Customer Data Integration (CDI), Product Information Management (PIM), Enterprise Information Management (EIM), Data Warehousing (DW) and Business Intelligence (BI).”
