Identity Resolution Daily Links 2008-12-12

December 12th, 2008

[Post from Infoglide] Part Deux: If Only Data Quality Were That Simple

“Applying generic algorithms to data attributes with wildly varying characteristics simply can’t match the accuracy of applying a family of deterministic analytics, each built around specific characteristics of a particular attribute type.”

Data Value Talk: The added value of an integrated customer view

“So it appears that the data itself plays a crucial role in the lack of an integrated customer view. Or more accurately, the better the data - the better the customer view.  And the better the matching of customer records across separate systems the better the integrated customer view. So Data Quality and Matching (Identity Resolution) determine in large parts the quality of the integrated customer view and the added value that it delivers.”

Marion Star: Muzzle loading and compensation

“Investigators from the Ohio Bureau of Workers’ Compensation, posing as gun enthusiasts, twice visited SMS. Those visits consisted primarily of small talk about guns and ammo. McGraw discussed some pistols that he had recently sold and invited one of the investigators to bring in an allegedly defective gun, telling them he would ‘take a look at it.’”

Intelligent Enterprise: ‘Surround Strategy:’ A Prediction for 2009

“Rather than trying to remodel the data warehouse to accommodate fresher and more detailed operational data (near real-time activity in operational systems, process logs, etc.), these data sources will operate in parallel (or horizontally, whichever word you like) as complementary feeds to analytics. It takes too long and is too expensive to expand the data warehouse concept to do this.”

New York State Insurance Department: Cortland Woman Accused of Workers’ Comp Fraud

“Horton is charged with making false statements and submitting false testimony to the Workers’ Compensation Board to receive benefits. She claimed that an April 2006 back injury she suffered while she was a health aide prevented her from working or attending school. Investigators learned that she was attending school full-time.”

Gartner: When is SOA, DOA? When it’s without MDM!

[Andrew White] “Clearly, if every SOA-based application interaction had to incur the costs of data reconciliation, mapping, clean up etc, then the cost of building and maintaining that SOA-based application would exceed what it costs today without SOA.  The bottom line: SOA needs MDM to help with the evolution of the information infrastructure.”

The State Journal: Insurance Fraud Unit Wins 45 Convictions This Year

“Since January 2007, the fraud unit has received 1,703 case referrals for review from those in the insurance industry and private citizens. After reviewing the referrals, field investigators have been assigned 397 cases to pursue. During that time, [West Va. Insurance Commissioner Jane] Cline said, 292 criminal cases have been referred to various prosecuting authorities, as well as in-house prosecutors who have been assigned to the unit on a full-time basis. Further, the fraud unit has secured indictments on 84 individuals for 294 felony counts and successfully obtained 73 convictions, including 45 in 2008.”

Part Deux: If Only Data Quality Were That Simple

December 10th, 2008

By Robert Barker, Infoglide Senior Vice President & Chief Marketing Officer

During the past two weeks, Philip Howard at Bloor Research has raised interesting questions about the nature and efficiency of data quality solutions in a series of posts entitled “The problem with data quality solutions.” Last week I responded on his blog and posted an expanded discussion of the same points here.

His fourth installment opens some interesting new topics. Perhaps the best approach is to lift some quotes and then respond below.

“Where I will comment is on the importance of understanding relationships not just between data elements but also between data and applications and even between data and the business. Understanding data relationships is arguably the most important factor whenever you are moving and transforming data, especially in data migration and data archiving environments but also for moving data into a warehouse and similar applications.” We agree that finding non-obvious connections is crucial to building effective data quality solutions. Many technologies fall short in this regard. They are unable to evaluate relationships based on similarity when data is inconsistent. Philip’s simple example baffles many technologies:

“A typical case might be where one application required a five digit numeric field and another application requires the same five numbers plus an additional two alphabetic characters. So, here’s a question for data quality vendors: can your software tell the difference?”  Applying generic algorithms to data attributes with wildly varying characteristics simply can’t match the accuracy of applying a family of deterministic analytics, each built around specific characteristics of a particular attribute type.
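To make the quoted example concrete, here is a minimal sketch of an attribute-specific, deterministic comparison for that kind of code field. The two field formats come from the quote; the function names, the result labels, and the case-insensitive treatment of the suffix are assumptions made for illustration, not a description of any vendor's product.

```python
import re

# Hypothetical field formats taken from the quoted example:
# one application stores five digits, another stores the same
# five digits plus two alphabetic characters.
FIVE_DIGITS = re.compile(r"\d{5}")
FIVE_DIGITS_TWO_LETTERS = re.compile(r"(\d{5})([A-Za-z]{2})")


def classify_code(value):
    """Return (format_name, numeric_core) for a value, or (None, None)."""
    value = value.strip()
    if FIVE_DIGITS.fullmatch(value):
        return "five_digit", value
    match = FIVE_DIGITS_TWO_LETTERS.fullmatch(value)
    if match:
        return "five_digit_plus_suffix", match.group(1)
    return None, None


def compare_codes(a, b):
    """Compare two code values using knowledge of both formats."""
    format_a, core_a = classify_code(a)
    format_b, core_b = classify_code(b)
    if core_a is None or core_b is None:
        return "unrecognized"
    if core_a != core_b:
        return "different"
    if format_a == format_b:
        # Same format: the full values must agree (suffix compared
        # case-insensitively, which is an assumption of this sketch).
        return "same" if a.strip().upper() == b.strip().upper() else "different"
    # Same five digits, but one value carries the two-letter suffix.
    return "same_core_different_format"
```

A generic string comparison would only report that "12345" and "12345AB" are fairly similar. A comparator written for this attribute type can say that the numeric cores agree and that the formats differ, which is the distinction the question asks about.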

He goes on: “Unfortunately, discovering relationships is not just about profiling your database. There may be relationships that exist across data sources (and types of data source) that you need to understand; and then there is the application factor. While it may not be theoretically correct from a purist data management perspective the fact is that many data relationships are defined within applications so, in one way or another, you really need to discover these.”  We couldn’t have articulated it any better. Many data quality solutions assume a higher degree of order than actually exists in the real world. Being able to deal with ambiguity (e.g., data sometimes missing, data entered in wrong fields) distinguishes the best technologies from their more simplistic brethren.

This post is getting a little long, so we’ll continue this discussion next week. In the meantime, we’d like to hear your reaction.

Identity Resolution Daily Links 2008-12-08

December 8th, 2008

By the Infoglide Team

Supply & Demand Chain Executive: Avoiding the Big Bang Backlash of MDM Implementations

“A strong MDM strategy touches so many parts of the enterprise that it may take years to define, evangelize and implement, but that doesn’t mean that MDM needs to sit quietly on the sidelines until that time. Rather, an evolutionary approach to moving forward with MDM could in fact unlock the door to broader acceptance of an enterprise-wide MDM strategy.”

Government Computer News: Better privacy for better security

“Although the two complement each other, it is not easy to provide both security and privacy because using data for security can expose it. This means that the two concerns have to be balanced. To pursue one end at the expense of the other is self-defeating, the experts said.”

Poughkeepsie Journal: Saugerties man charged with fraud

“A Saugerties man faces insurance fraud and other charges after he allegedly accepted more than $77,800 in workers’ compensation benefits despite appearing well. Joseph A. Gambino, 58, Simmons Street, was arrested Wednesday after investigators produced videotapes showing him moving furniture and riding a motorcycle. During medical exams, he had been leaning heavily on a cane, according to the Ulster County District Attorney’s Office.”

FRAUDWAR: How to Legally Buy Hot Merchandise

“The Internet has opened new avenues for criminals to fence stolen merchandise. This has made it easier to sell stolen merchandise and there are many who believe that it contributes to the problem. The most recent survey by the National Retail Federation estimates that Organized Retail Crime is a $30 billion a year issue. Their most recent Organized Crime Survey showed that e-fencing on traditional auction sites has grown by six percent. In response to this, they are even pushing bills in Congress to force the auction sites to allow more access to law enforcement and retailers, who are attempting to shut down this activity.”

Identity Resolution Daily Links 2008-12-05

December 5th, 2008

By the Infoglide Team

[Post from Infoglide] Identity Resolution Daily: If Only Data Quality Were That Simple

“The most effective approach blends several best-of-class techniques, and it scales without compromising performance. A multifaceted solution combines an extensive rules base for nicknames and abbreviations, heuristics, semantics, and a large array of public and proprietary algorithms and other types of analytics.”

MarketWatch: Fraudulent Doctor Surrenders License, Repays $144K to Texas Mutual

“Between January 2003 and March 2006, Shanti and his clinic over-billed workers’ compensation carriers for pain management services in excess of hours actually attended by patients, according to the indictments.”

IT-Director: The problem with data quality solutions part 4

“A typical case might be where one application required a five digit numeric field and another application requires the same five numbers plus an additional two alphabetic characters. So, here’s a question for data quality vendors: can your software tell the difference?”

SecurityInfoWatch: Police, private security given access to ORCIN database

“‘Right now information sharing between loss prevention security and police officers is very limited to who you know,’ said ORCIN Founder Rudy Bravo. ‘This way, if you go onto the Web site and post information (about a retail crime) and send it, it will be sent out to all our members.’”

KARE11.com: Retailers report rise in ‘organized’ shoplifting

“What isn’t clear is whether the apparent rise in organized shoplifting is due mostly to the Internet, where auction sites offer sellers an easy way to make money on never-before-used products, or if the shoplifting is rising as the economy gets worse.”

TravelAgentCentral: ASTA Alerts Agents to 2009 Secure Flight Changes

“ASTA said it is essential that travel agents take steps now to prepare for this new set of procedures and offered specific guidelines agents can use to comply with the new data collection rules.”

If Only Data Quality Were That Simple

December 3rd, 2008

By Robert Barker, Infoglide Senior Vice President & Chief Marketing Officer

In a set of three recent posts, industry analyst Philip Howard of Bloor Research compares different types of data quality technologies. Setting aside the fact that the focus was on two specific companies, let’s examine the key conclusions.

First, “next generation” data quality solutions must employ “improved matching with less human involvement.” While no one will argue that “better results, less cost” should be (and is!) a goal of all matching technologies, implying that mathematical modeling and semantic analytics alone can solve every problem ignores the breadth of requirements and attribute types across multiple industries. For example, a solution that’s great at matching product data may fail miserably at identifying insider trading on Wall Street.

Another key point made in the posts: most products require the user to “tinker around with your guesses and see if your match percentage improves” and that “means a lot of manual work, not just to begin with but on an on-going basis.” In reality, users often list configurability as a top criterion in choosing a solution for complex problems, and our experience is that in most instances the amount of ongoing adjustment after the initial learning phase is minimal. “One size fits all” works OK with t-shirts but not so well with data.

A final conclusion is that “all the leading products have been built using out-of-date technology that has now been superseded.” In point of fact, both companies cited have been around for years, and all leading companies (including mine) continually evolve their technology. Perhaps more importantly, a key requirement for all but the simplest problems is a solution that can incorporate newer, better analytics as they emerge, rather than locking the customer into a single “my way or the highway” approach that works well for some classes of data attributes but not so well on others.

The most effective approach blends several best-of-class techniques, and it scales without compromising performance. A multifaceted solution combines an extensive rules base for nicknames and abbreviations, heuristics, semantics, and a large array of public and proprietary algorithms and other types of analytics. As important as matching is, a strong solution will enable easy integration with existing systems and can evolve as requirements grow and new analytics emerge.
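As a rough illustration of blending a rules base with a generic algorithm, the sketch below checks a small nickname table first and falls back to a string similarity ratio only when the rules do not settle the question. The table contents and function names are hypothetical, and Python's standard `difflib.SequenceMatcher` stands in for the "public and proprietary algorithms" mentioned above; it is not a description of Infoglide's implementation.

```python
from difflib import SequenceMatcher

# Tiny illustrative nickname table; a real rules base would be far larger.
NICKNAMES = {
    "bob": "robert",
    "rob": "robert",
    "bill": "william",
    "will": "william",
    "liz": "elizabeth",
    "beth": "elizabeth",
}


def canonical_first_name(name):
    """Lowercase the name and map a known nickname to its formal form."""
    cleaned = name.strip().lower()
    return NICKNAMES.get(cleaned, cleaned)


def string_similarity(a, b):
    """Generic character-based similarity in the range 0.0 to 1.0."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def first_name_score(a, b):
    """Apply the rules base first, then fall back to generic similarity."""
    if canonical_first_name(a) == canonical_first_name(b):
        return 1.0
    return string_similarity(a, b)
```

With this arrangement "Bob" and "Robert" score as a full match because of the rule, while "Jon" and "John" get a high but partial score from the fallback. New analytics can be added as further steps without changing the callers.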

Stimulating conversation about the range of solutions available to address data quality problems is a highly desirable activity. However, considering only one or two vendors (including Infoglide!) for any solution can limit your thinking about how best to address your unique requirements.

Identity Resolution Daily Links 2008-12-01

December 1st, 2008

By the Infoglide Team

Intelligent Enterprise: Master Data Management Adoption Going Strong

“Business units and IT departments collaborate, cleanse, publish and protect common information assets that are shared across the enterprise. Gartner, however, cautioned that there is no single technology that meets all MDM user requirements, and products vary by technology, industry, data domain and use case, and many span multiple domains.”

Insurance & Financial Advisor: New York man arrested for workers’ compensation fraud

“Queen is accused of accepting $3,000 in workers’ compensation benefits from the New York State Insurance Fund and stating that he was not working as the result of a job-related knee injury. He later collected $7,000 in workers’ compensation benefits from AIG insurance, claiming an injury to the same knee that prevented him from working.”

CIO: The Ugly Truth About “One Version of the Truth”

“‘Many organizations spend months and endure significant costs to obtain the reporting and analysis capabilities that BI promises,’ Hatch writes, ‘only to find that different ‘versions of the truth’ still exist without any definite way of determining which one is real or accurate.’”

Chronicle Herald: Flight rules raise privacy worries

“The name, gender and birth date of Canadians flying from Toronto to destinations such as Cuba, Mexico or even Europe will be transmitted by airlines to the TSA under its Secure Flight program, to take effect next year. The agency will then vet the names against security watch lists aimed at keeping dangerous people on the ground.”

DN/Online: Ask me about my panties

“Retail theft is on the rise and the National Retail Federation said in its 2008 report that 68 percent of retailers have been able to identify or recover stolen merchandise and gift cards on online auction sites, 61 percent more than last year. The report also indicated that 63 percent have seen an increase in e-fencing (selling stolen items on online auction sites) activity in the last year.”

Seacoastonline.com: Beware organized crime online

“Organized retail crime involves the organized theft of retail merchandise that is resold to consumers through online auction sites and through other outlets, like local flea markets. These organized crime rings target over-the-counter pharmaceuticals, baby formula, tobacco cessation products, pregnancy strips, diabetic test strips, cosmetics and similar types of personal care items.”

bobsguide: Recession gives rise to online fraud fears

“Business Journal said the study by fraud detection specialist 41st Parameter and the Merchant Risk Council found that 84 per cent of respondents are concerned that internet retailers will face a ‘slight or substantial’ increase in fraudulent activity.”

Identity Resolution Daily Links 2008-11-21

November 21st, 2008

By the Infoglide Team

NOTE: We will start posting again after Thanksgiving. Happy Turkey Day!

datanomic: Are we nearly there yet?

“In the 1990s, Customer Relationship Management promised, amongst other things, to provide us with a single view of customers, but the ideal fragmented into a number of different disciplines, largely dictated by technology vendors.  Instead of a single customer view, most organisations have multiple, often inconsistent views of their customers and prospects delivered through an assortment of Sales Force Automation, Analytical CRM and Campaign Management systems each propagating their own database.”

Scamtypes: 5 Types Of Social Networking Scam - #1 The Fake Identity

“Setting up a new profile on the major social networking sites is an incredibly simple thing to do. For criminals this presents a tremendous opportunity as it allows them to affiliate themselves with just about any identity, whether that is a real person or not. For some, a fake identity may just be a means of having fun online, however warped that intention may be. For others, far more sinister motives guide them, from arranging risky meetings to making abusable connections and many other shady reasons.”

Conde Nast Daily Traveler: Bush Officials Claim a Kinder, Gentler Airport Security

“And sometime in January, you will start giving your birth date, home address, and full legal name when you make an airline reservation–all part of a ‘secure flight’ initiative that will reduce the number of innocent people who are falsely flagged as potential terrorists because their names resemble those of actual bad guys.”

The Bunker Blog: Macy’s Loss Prevention Agent Arrested For Assisting Shoplifters

“One of the alleged shoplifters was the sister of the loss prevention agent. The 24 year old LP agent had been working for Macy’s since February, and his manager suspected something was going on, so a surveillance was conducted on the LP agent by the manager.”

Evolution of Security: Why?

“More than 23 million passengers were screened at our checkpoints last year during the holiday season, and many of those passengers travel infrequently. Those are the travelers we’d most like to reach. Passenger feedback has shown us that people are more willing to comply with security procedures if they understand the ‘why’ behind the measure.”

Leveraging Identity Resolution Data Sources

November 19th, 2008

By Robert Barker, Infoglide Senior Vice President & Chief Marketing Officer

Ever have this experience? You’re searching Google for specific examples of a topic when you come across an already compiled and complete list of related examples – what a find! I had this experience recently when looking for contact information for people we wanted to invite to a marketing event, and voila! – I found a list of them that was 90% complete and accurate. Without that find, the project could have taken a couple days longer.

Aggregations of this type abound. Besides lists that individuals put together and post on the web, public and private databases offer all sorts of information on people that is useful in addressing multiple types of business problems and opportunities. Here are a few links to “people data” that you may not be aware of, and there are many more:

NETR Online

Who’s Who

Identities contained in multiple databases with varying schemas as well as ambiguous and sometimes missing attributes can be resolved to deliver a clear picture of a person and activities they are involved in. Here’s an example that illustrates how you can draw on multiple data sources to solve a complex problem.

Everyone knows about insider trading, especially with the recent allegations about Mark Cuban. Essentially, someone uses confidential knowledge about a financial transaction to buy or sell stock to their personal advantage.

Many illegal insider stock trades can be readily identified. How? By “similarity searching” across records of stock trades, associated timelines (who knew what and when about the event), and public company financial institution data (e.g., CapitalIQ), then finding hidden relationships using biographical information (e.g., Who’s Who), background screening and residential information (e.g., ChoicePoint), and other public and private sources.

There are many more cases where identity resolution can exploit available data sources to address complex problems. Making sense of these massive amounts of data by aggregating and sifting through them requires an ability to score the results accurately. Just as importantly, you need to be able to configure the scoring to fit the specific problem, i.e., the solution must be tuned to meet unique requirements.
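One way to picture configurable scoring is a weighted average of per-attribute similarities, where the weights are supplied per problem and attributes missing from either record are skipped. The sketch below is only an illustration under those assumptions: the record fields, weights, and function names are hypothetical, and `difflib.SequenceMatcher` is a stand-in for whatever similarity measures a real product would use.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Generic character-based similarity in the range 0.0 to 1.0."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def score_records(rec_a, rec_b, weights):
    """Weighted average similarity over attributes present in both records.

    Attributes that are missing (None or empty) in either record are
    skipped, and the remaining weights are renormalized.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for attribute, weight in weights.items():
        value_a = rec_a.get(attribute)
        value_b = rec_b.get(attribute)
        if not value_a or not value_b:
            continue  # tolerate missing data instead of penalizing it
        weighted_sum += weight * similarity(value_a, value_b)
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0


# Two hypothetical configurations for the same pair of records.
record_a = {"name": "John Smith", "city": "Austin", "phone": None}
record_b = {"name": "John Smith", "city": "Dallas", "phone": "555-0100"}
name_heavy = {"name": 0.8, "city": 0.2, "phone": 0.5}
city_heavy = {"name": 0.2, "city": 0.8, "phone": 0.5}
```

Scoring `record_a` against `record_b` with `name_heavy` gives a higher result than with `city_heavy`, because the names agree and the cities do not. The phone weight has no effect here since one record lacks a phone number.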

Solving complex business problems often requires knowing more about who you’re dealing with and their relationships. Vast amounts of data are accessible online via APIs and web services, and they can be incorporated into new kinds of online applications that once were impossible.

Identity Resolution Daily Links 2008-11-17

November 17th, 2008

[Post from Infoglide] Identity Resolution Daily: Proud of Our Heritage

“When we examine our company’s roots, we see that our heritage is finding bad guys. That’s what David Wheeler set out to do when he saw that detectives had a critical need for better tools for criminal investigations. That is what we are beginning to do in the great State of Washington to identify businesses trying to cheat on their workers’ compensation premiums. From desktops to mainframes and everything in between, our roots have spread and have helped keep us stable as the winds of change have buffeted us about.”

Miami Herald: Workers’ compensation investigator accused of fraud

“In September, according to an arrest warrant, Vega visited Pipe Designs Inc., 7710 NW 72nd Ave., in Miami-Dade. The company did not have any workers’ compensation coverage, Vega found. Vega told owner Ronald Triana that he would lower the hefty penalty — between $27,000 and $30,000 — if Triana gave him a $2,500 money order with the payee information blank, according to the warrant.”

onestopclick: MDM ‘driving software development’

“Studies carried out by IT industry analyst Gartner indicate the necessity for firms to increase the effectiveness of their database development, while reducing costs and meeting compliance requirements, is driving the take-up of MDM technologies.”

Computing SA: IT downturn: every cloud has a silver lining

“Open source data integration, data quality, and extraction, transformation and loading (ETL) applications will flourish in these conditions because they are less costly to obtain, widely supported and constantly updated.”

opodo: Travellers reminded of Esta regulations

“Jim Forster, British Airways’ government and industry affairs manager, said: ‘The US is our biggest overseas market and we have been working hard to advise our visa waiver customers that they must apply to the Department of Homeland Security well in advance of travel.’”

Identity Resolution Daily Links 2008-11-14

November 14th, 2008

By the Infoglide Team

Windsor Star: OLG coughs up cash

“The insider policy is to guard against fraud, such as a retailer who tells a customer a ticket is not a winner and then tries to claim the prize. In the spring of 2007, a provincial ombudsman issued a scathing report on the OLG and said Ontario store owners and their families had collected tens of millions of dollars in fraudulent claims.”

MarketWatch: Worldwide thefts cost retailers US $104 billion annually - Survey

“This year’s survey, the most complete analysis of global shrink ever conducted, reports key findings on retail shrinkage and crime in 36 countries and on five continents, based on data from a confidential survey of 920 large retailers with combined sales of U.S. $814 billion and 115,612 operating retail outlets…’This sum represents a tax imposed on honest people by retail criminals of $229.73 per household or $71.12 for every single person in the 36 countries surveyed,’ said Professor Bamfield, Director of the Centre for Retail Research.”

Jackson Citizen Patriot: Kids learn online dangers

“Internet predators lurk on networking Web sites, such as Facebook and MySpace, or in chat rooms, looking for young victims. ‘Everybody you meet online is an Internet stranger,’ Malik Williams, an Internet-safety presenter from the Michigan Cyber Safety Initiative, told more than 50 fifth-graders Tuesday at Concord Elementary School. ‘That’s why it’s important to keep yourself safe.’”

Bucyrus TelegraphForum: Bucyrus experiences rash of break-ins

“‘There is a new trend in what we call e-fencing. Thieves are selling their stolen items on the Internet versus just selling them outright. They can get up to 70 percent or more of the value if they sell on the Internet versus selling them on the streets, where they only get about 30 percent of the value,’ Teets said.”

SFGate: Ex-S.F. firefighter’s workers’ comp problem

“Indeed, if Hijjawi were trying to hide her fitness quest, she wasn’t doing a very good job. Our own Google search turned up records showing her running in marathons in Lake Tahoe, Los Angeles, Honolulu and elsewhere. From 2001 to 2006, according to records on the Web site Athlinks, Hijjawi ran in no fewer than a dozen marathons. And her biography on another site shows she was taking on even bigger challenges, including the Canada 2005 Ultraman super triathlon competition - in which competitors swim 6.2 miles, ride a bike for 170 miles and run 52 miles, twice the distance of a marathon. Completing it took her more than 33 hours.”

Bunker Blog: Update On Cops Involved In Major Shoplifting Ring

“Kevin Burchell and Clifford Barber, both police officers, worked with two others, one of them an employee at the Walmart the items were taken from; to get up to $200,000.00 worth of merchandise out of the store and onto an eBay site. According to the latest report, Barber was the mastermind behind the scheme, and sold the items on eBay and to friends and acquaintances.”
