Welcome to Understanding Link Analysis. The purpose of my site is to discuss the methods behind leveraging visual analytics to discover answers and patterns buried within data sets.

Visual analytics provides a proactive response to threats and risks by holistically examining information. Unlike traditional data mining, visualizing information makes patterns of activity that run contrary to normal activity surface within very few occurrences.

We can dive into thousands of insurance fraud claims to discover clusters of interrelated parties involved in a staged accident ring.

We can examine months of burglary reports to find a pattern leading back to a suspect.

With the new generation of visualization software our team is developing, we can dive into massive data sets and visually find new trends, patterns and threats that would take hours or days using conventional data mining.

The eye processes information much more rapidly when it is presented as images; this has been true since we first learned to read as children. As our instincts develop over time, so does our ability to process complex concepts through visual identification. This is the power of visual analysis that I focus on in this site.

All information and data used in articles on this site is randomly generated with no relation to actual individuals or companies.

Visualizing Organized Retail Fraud

Organized fraud poses the highest risk to any commerce platform, from banking, eCommerce and insurance companies to brick-and-mortar retail chains and individual stores. In a retail environment, opportunistic shoplifting (the single shoplifter who steals for themselves) is the lowest-level threat, followed by internal fraud, which represents a moderate risk.

Organized shoplifting rings can have hundreds of participants, both internal and external to the retail organization, and can steal over $1 million in merchandise and cash in a single month.

These rings do not limit themselves to shoplifting merchandise from stores. They are highly complex criminal organizations with the resources to attack merchants on a number of fronts simultaneously, committing:

  • Credit card fraud
  • Refund and rebate fraud
  • Price switching and price altering through the production of bar codes
  • Burglary by "sleep in" thefts from stores after hours

These rings have market intelligence on which merchandise brings the most value and have fronts in place to sell the merchandise they steal within hours. They operate as cells that come together as a complete organization that can span the entire country or, in the case of hybrid retail theft rings that target designers, can be international.

Combating organized fraud requires the same type of intelligence gathering, case management and analysis that is present in law enforcement organizations and financial institutions. It requires the centralization of incident information across the entire franchise.

Records of shoplifters can no longer be housed store by store. They have to be centralized and databased across the entire franchise so that visual analysis can find the relationship between the shoplifter caught in Atlanta, Georgia and the one caught in California who were both stealing massive amounts of drill bits.

Databasing Retail Theft Information

As with most organized fraud rings, participants come equipped with a wide variety of identities and cover stories. However, this can end up being a vulnerability when databasing and visual analysis are employed. Because organized ring suspects get caught both individually and en masse, those identities end up getting reused from time to time, thus establishing a link between the participants.

Because rings focus on certain merchandise, what they steal can be as important a link between suspects as their identities in visual analysis. Visually, we can group together shoplifting suspects by the type of merchandise they specialize in stealing and moving. This is the same way police analyze how a bank robber commits his crimes to link that individual to multiple bank robberies.
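Before any charting, the grouping idea can be tried on raw records. The following is only a rough sketch in Python: the record layout, the suspect names and the `group_by_merchandise` helper are all made up for illustration and are not part of any particular visualization product.

```python
from collections import defaultdict

# Hypothetical incident records: (suspect name, merchandise category)
incidents = [
    ("A. Sample", "drill bits"),
    ("B. Example", "Drill Bits "),
    ("C. Placeholder", "razor blades"),
    ("A. Sample", "razor blades"),
]

def group_by_merchandise(records):
    """Map each merchandise category to the set of suspects caught stealing it."""
    groups = defaultdict(set)
    for suspect, category in records:
        # Normalize case and spacing so the same category entered two ways still matches
        groups[category.strip().lower()].add(suspect)
    return dict(groups)

groups = group_by_merchandise(incidents)
for category, suspects in sorted(groups.items()):
    print(category, sorted(suspects))
```

Any category shared by suspects from different stores or states is a candidate link worth charting.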

Refund information is essential to centralize and integrate with shoplifting data. If an organized ring can refund stolen merchandise for full price, as opposed to selling it to a fence for a fraction of the retail price, it will. Ring members are experts in reproducing receipts and often partner with internal employees to assist them in refunding merchandise.

While we are talking about internal employees, have you ever wondered what the employee you caught last week stealing high-end small electronics is doing with them? The average employee steals twenty to thirty times before being caught.

If you caught an employee handing out 20 USB flash drives to an unknown customer without charging for them, the employee may actually have handed out 400 or more of them (20 drives on each of at least twenty occasions) before you found out. Unless you are a major-league computer pack rat, I doubt any one person needs 400 USB flash drives. Those are being refunded to your sister store five states away.
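The arithmetic behind that estimate is simple enough to sketch in Python. The function name and its default values are my own illustration; they just apply the twenty-to-thirty-thefts figure to the quantity seen in the one incident that was caught.

```python
def estimated_units_lost(units_per_theft, thefts_low=20, thefts_high=30):
    """Return a (low, high) estimate of units lost before the employee was caught."""
    return units_per_theft * thefts_low, units_per_theft * thefts_high

low, high = estimated_units_lost(20)
print(low, high)  # 400 600
```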

When centralizing data for analysis in a retail investigation environment, consider integrating the following information to maximize the potential for visual analysis:

  • External theft suspect information including detailed information regarding the items being stolen and the methods or tools used.
  • Internal theft suspect information, including employment details such as what department they worked in, who they worked for and with, and their responsibilities
  • Refund and rebate information, particularly refunds without receipt
  • Gift card information, integrated with your theft database
  • Any store-specific credit card data, integrated with your theft database

Visualizing Retail Theft Data

For this scenario, I am an analyst for a large retail chain tasked with identifying organized fraud. I have the ability to query my central database of store incidents and export the data for import into my visual analysis program.

As organized fraud rings are nomadic and multistate, I need the widest sample of data possible to establish links between pieces of identifying information, which has an average accuracy rate of less than 30%.

I am going to review my data and determine what fields I need to extract in order to create unique identifiers for my subjects and the merchandise involved.

By reviewing the data I have determined that I am going to create person entities by utilizing the name and DOB of my incident suspects. I will be linking my person entities to addresses, phones, incidents and merchandise.

Merchandise is an entity that is specific to retail and very tricky to use in visual analysis. For the type of merchandise to be a relevant entity in analysis, we have to make the type of merchandise stolen unique, but not so unique that we end up with nothing better than a one-to-one relationship.

Ideally, utilizing the SKU or product number would be best. However, if that data isn't captured, a combination of quantity and merchandise description could create a unique identity.
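As a minimal sketch of how those identifiers might be built before import, here is one way to do it in Python. The function names, the key formats and the normalization rule are assumptions of mine, not features of any specific analysis tool.

```python
import re

def normalize(text):
    """Lowercase and collapse punctuation and spacing so small typing differences still match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def person_key(name, dob):
    """Unique person identity: normalized name combined with the recorded DOB."""
    return f"{normalize(name)}|{dob}"

def merchandise_key(sku, quantity, description):
    """Prefer the product number; fall back to quantity plus description when it is missing."""
    if sku and sku.strip():
        return f"sku:{sku.strip().upper()}"
    return f"desc:{quantity}x {normalize(description)}"

print(person_key("Pat  SAMPLE", "1980-03-14"))
print(merchandise_key(None, 12, "Cobalt Drill-Bits"))
```

Leaving quantity out of the fallback key would make merchandise less unique and link more incidents together, so that choice is worth tuning against your own data.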

If I were dealing with a large amount of data, I might specifically query incidents with a merchandise total over $500 or a quantity over 5 so that I am not trying to link together every teenage kid across the country who boosted the latest Dave Matthews CD.
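A filter along those lines could look like the following Python sketch. The field names and sample incidents are hypothetical; only the two thresholds come from the scenario described here.

```python
# Hypothetical incident rows exported from the central database
incidents = [
    {"incident_id": "I-001", "merchandise_total": 18.99, "quantity": 1},
    {"incident_id": "I-002", "merchandise_total": 742.50, "quantity": 3},
    {"incident_id": "I-003", "merchandise_total": 120.00, "quantity": 24},
]

def is_candidate(incident, min_total=500.0, min_quantity=5):
    """Keep incidents with a merchandise total over $500 or a quantity over 5."""
    return (incident["merchandise_total"] > min_total
            or incident["quantity"] > min_quantity)

candidates = [i for i in incidents if is_candidate(i)]
print([i["incident_id"] for i in candidates])  # ['I-002', 'I-003']
```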

Let's take a look at my import specification:

I have linked persons to address, phone and SSN for identifiable information. For my person entities, I have created a unique identifier by combining the name and the DOB; however, since that is not going to make any sense as a label on my chart, I am only utilizing the name as the label.
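For readers without a charting package at hand, the same specification can be roughed out in Python. This is only a sketch under my own assumptions: the row layout, the `build_links` helper and the sample values are invented for illustration.

```python
# Hypothetical flat rows exported from the incident database
rows = [
    {"name": "Pat Sample", "dob": "1980-03-14", "address": "1 Sample St",
     "phone": "555-0100", "ssn": ""},
    {"name": "Pat Sample", "dob": "1985-03-14", "address": "1 Sample St",
     "phone": "", "ssn": ""},
]

def build_links(rows):
    """Turn each row into (person, link type, linked value) tuples."""
    links = []
    for row in rows:
        person = {
            "id": f"{row['name'].strip().lower()}|{row['dob']}",  # unique identity
            "label": row["name"],                                   # text shown on the chart
            "attributes": {"dob": row["dob"]},                      # kept for DOB comparison
        }
        for field in ("address", "phone", "ssn"):
            value = row.get(field, "").strip()
            if value:  # skip fields that were not captured
                links.append((person, field, value))
    return links

links = build_links(rows)
for person, field, value in links:
    print(person["label"], person["id"], field, value)
```

The two rows share a label but carry different identities, so they stay separate entities that happen to link to the same address.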

I can also utilize the DOB field in the entity attributes so that I can compare multiple individuals with the same name and find any sequence in the DOBs that were used. DOB patterning is easy if you remember that when people make up a DOB, they will keep either the day, month or year from their actual DOB but make up the other two.

If a person makes up a year, unless they are a true professional, they won't be able to do the math when questioned. They will usually add or subtract years from their real DOB in a systematic way.
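Both patterns can be checked mechanically. The sketch below, in Python, is one possible approach rather than an established method: the helper names and the sample dates are hypothetical.

```python
from datetime import date

def shared_dob_parts(a, b):
    """Return which of day, month and year two dates of birth have in common."""
    parts = []
    if a.day == b.day:
        parts.append("day")
    if a.month == b.month:
        parts.append("month")
    if a.year == b.year:
        parts.append("year")
    return parts

def year_offsets(dobs):
    """Year differences between consecutive DOBs (sorted), to expose a systematic step."""
    ordered = sorted(dobs)
    return [later.year - earlier.year for earlier, later in zip(ordered, ordered[1:])]

# Hypothetical DOBs recorded for suspects who share one name
recorded = [date(1980, 3, 14), date(1985, 3, 14), date(1990, 3, 14), date(1980, 7, 2)]

print(shared_dob_parts(recorded[0], recorded[1]))
print(shared_dob_parts(recorded[0], recorded[3]))
print(year_offsets(recorded[:3]))
```

Matching parts or evenly spaced years do not prove two records are the same person, but they tell me which same-name entities deserve a closer look.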

I am going to execute my import specification and see if any clusters of interrelated incidents appear.

With the visualization software I am using, the clusters are going to be organized with the largest to the left, decreasing to the right. I am going to focus on the largest cluster of interrelated entities and see if I can identify any organized rings.
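I do not know how any given product computes its layout, but the underlying idea of clusters ordered by size can be approximated with connected components. Here is a minimal Python sketch; the `find_clusters` function and the sample links are hypothetical.

```python
from collections import defaultdict

def find_clusters(links):
    """Group linked entities into connected clusters, largest first."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        # Walk everything reachable from this entity
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(component)
    return sorted(clusters, key=len, reverse=True)

# Hypothetical links between persons and their identifying information
links = [
    ("person:C", "address:9 Example Ave"),
    ("person:A", "phone:555-0100"),
    ("person:B", "phone:555-0100"),
    ("person:B", "address:1 Sample St"),
]

clusters = find_clusters(links)
for cluster in clusters:
    print(len(cluster), sorted(cluster))
```

Persons A and B land in the same cluster because they share a phone number, while person C sits in a smaller cluster alone.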

What we see in this cross section from my chart is a group of individuals who are associated by address, phone or incident. What is interesting is that the individuals in this small subset of my chart are all related, yet they committed different retail theft schemes, from internal theft to credit card fraud to refund fraud. This is exactly how organized shoplifting rings operate: by attacking a retail institution using varying methods and resources.

From the information I have developed through my visual analysis, I have located an organized fraud ring impacting my retail organization. Through my visualization, I can uncover the methods the ring is utilizing and the types of merchandise it is targeting, enabling me to alert the stores in the states being impacted by the ring.

I can brief my in-store security regarding the ring's method of operation, the times they hit, the involvement of potential internal employees and the places in the store they need to be focusing on to catch the group in action.

Once my in-store resources have been alerted and have apprehended additional suspects following the same pattern uncovered by the analysis, I can integrate the new information into my link analysis chart to compile a complete case visualization for my company's inside counsel and law enforcement.


While visual analysis can be effectively deployed to locate organized theft, it can also be integrated into numerous platforms to combat threats such as gift card fraud, credit card fraud, register or cash shortages and personnel social networking environments.

By proactively identifying organized rings and other fraud threats, you strengthen your organization's deterrent to future events. Fraud always follows the path of least resistance. By centralizing your theft intelligence in a database and leveraging visual analysis, you can drive organized rings to easier targets of opportunity.