Welcome to Understanding Link Analysis. The purpose of my site is to discuss the methods behind leveraging visual analytics to discover answers and patterns buried within data sets.

Visual analytics provides a proactive response to threats and risks by holistically examining information. Unlike traditional data mining, visualizing information lets patterns of activity that run contrary to normal activity surface within very few occurrences.

We can dive into thousands of insurance fraud claims to discover clusters of interrelated parties involved in a staged accident ring.

We can examine months of burglary reports to find a pattern leading back to a suspect.

With the new generation of visualization software our team is developing, we can dive into massive data sets and visually find new trends, patterns and threats that would take hours or days using conventional data mining.

The eye processes information much more rapidly when it is presented as images. This has been true since children started learning to read. As our instinct develops over time, so does our ability to process complex concepts through visual identification. This is the power of visual analysis that I focus on in my site.

All information and data used in articles on this site is randomly generated with no relation to actual individuals or companies.

Using Visual Analysis to Combat Call Center Fraud

Your company's call center processes thousands of transactions each week. Its agents are the face of your company and, in most cases, are empowered to access customer data and financial data and to grant concessions to dissatisfied customers.

Preventing and detecting internal fraud and data leaks in your call center can be a daunting task because of the sheer number of customers and agents involved. Adding to the complexity of proper oversight, a great number of these call centers are outsourced, often overseas, which makes access to complete audit trails of activity more difficult.

From the perspective of executives and managers of call center companies, protecting your clients' data and property is one of your top operational priorities. Leaks of client data or fraud involving your clients' merchandise could damage your reputation, put your current contracts at risk and leave you open to potential litigation.

Well-thought-out and enforced privacy and operational policies go a long way toward protecting your call center's or your clients' data. However, it is unrealistic to think that all the employees in your call center are going to play by the rules. Just as in every business, there are going to be a few who are looking to profit from their position.

This is where a solid fraud analysis and investigation program will provide a further layer of protection that can proactively identify potential fraud and data loss at the earliest stages. Because of the volume of transactions, leveraging visual analysis is the best way to look for associations between call center agents and the customers they are interacting with.

The Threat Levels That Exist In Call Centers

There are three distinct threat levels which exist in most call centers. The level of investigative analysis and investigation should be based on the potential threat level which exists in the center.

  • Low - Agents who mistakenly leak information or allow concessions to customers by failing to follow proper procedures
  • Medium - Outsourced or temporary agents who have little or no company loyalty and no stake in the company's success, but who have access to customer information or the ability to send concessions (free merchandise, repair replacements, etc.)
  • High - Fraudulent agent groups within a call center: agents placed in the center by criminal organizations, or friends of corrupt agents, for the sole purpose of stealing customer information or converting concessions for personal use.

Visual analytics can effectively address medium and high threat levels within the call center organization by identifying clusters of interrelated activity, customers, customer attributes and activity logs that connect the call center agents to the individuals they are in contact with or sending merchandise to. Always remember, fraud follows the path of least resistance. By shoring up your fraud prevention defenses through visual analysis, you encourage those organizations wanting to penetrate and corrupt your call center to search elsewhere.

Identifying Call Center Concession Fraud

Probably one of the most difficult analytical tasks will be the identification of call center agents who are converting merchandise or concessions for personal use. This type of activity costs companies thousands of dollars every week in misappropriated goods.

The difficulty in discovering this activity through data mining is that the activity itself is three-dimensional, involving the call center agents, the customers and the customers' attributes. To accurately analyze it, we have to look at not only the activity of the agent but also the relationships of the customers to their attributes and even the relationships between the call center agents and the customers.

This often involves the extraction and mining of data from multiple data sources, and the layering of that data in a visual analysis. This is the scenario we are going to employ in this example because, if you are able to leverage visual analysis to proactively identify this activity, analyzing other internal issues such as data or financial theft, which need only a single-source visualization, will be much easier to accomplish.

So let's start with a scenario for this analysis. I am a fraud investigator for a call center and conduct a monthly analysis of concessions sent out by my agents to detect any fraud or theft which may be occurring.

Like all analysis, the first step is the planning, extraction and cleaning of the data for import into our visual analysis tool. Since I am analyzing concession fraud, I am going to need to extract data out of my agent activity database which will give me the service requests, type of concession and date of concession. Next, I will need to extract data from my customer and shipping database to find relationships between the customers and where the concessions were shipped.

In all fraud analysis, we are looking to leverage the weakest link of the scheme. In concession fraud, the weakest link in the scheme is the shipping address. If you are a call center agent and are either converting your company's merchandise for personal use or sending concessions to friends, the one piece of information that will have to be accurate is the shipping address. This is the main entity in my visualization that I am going to focus on.

In my first step, I download all transactions from the call center which are coded as concession transactions including the service request number, the service request date, the customer ID, the agent name or number, the concession which was sent out and if possible, the tracking number of the concession package.

In my next step, I download the customer and shipping information from my database. I want to ensure that I capture all fields in the shipping database which will accurately and uniquely identify the location the concessions were sent to. If the agent is involved in fraud, they are going to alter the names and phones. However, the address will have to be correct for the scheme to work. Agents who are very good at committing this type of fraud will alter the addresses enough to avoid detection but still ensure delivery. We can counteract this tactic in visual analysis by conducting semantic matches between entities which will detect patterns of inverted numbers, names, slight misspellings or the addition of small pieces of information in the shipping address.
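To make the idea of catching slightly altered addresses concrete, here is a minimal sketch using Python's standard difflib module. It is only an illustration of fuzzy comparison, not the matching method of any particular visual analysis tool. The function names, the sample addresses and the 0.85 threshold are all assumptions of mine and would need tuning against real data.

```python
from difflib import SequenceMatcher


def address_similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1, ignoring case and extra spaces."""
    clean_a = " ".join(a.lower().split())
    clean_b = " ".join(b.lower().split())
    return SequenceMatcher(None, clean_a, clean_b).ratio()


def likely_same_address(a: str, b: str, threshold: float = 0.85) -> bool:
    """Flag two addresses as a probable match (threshold is an assumed value)."""
    return address_similarity(a, b) >= threshold


# Hypothetical, made-up addresses: the second one inverts two digits.
print(likely_same_address("4521 Oak Ridge Drive", "4512 Oak Ridge Drive"))
print(likely_same_address("4521 Oak Ridge Drive", "17 Harbor Lane"))
```

A character-level ratio like this catches inverted digits and small misspellings, but it will also pair genuinely different neighbors on the same street, so any match it produces should be treated as a lead to review rather than a finding.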

Now that I have downloaded the data I need for my visual analysis, since this data came from two different sources, I am going to need to join the two tables or files of data to make a flat file or view for import.
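As a rough sketch of that join, assuming both extracts share a customer ID field, the two sources can be flattened in plain Python. The field names and sample rows below are hypothetical and stand in for whatever your own agent activity and shipping databases export.

```python
# Hypothetical extract from the agent activity database.
concessions = [
    {"service_request": "SR-1001", "customer_id": "C-01", "agent": "A-17", "concession": "replacement unit"},
    {"service_request": "SR-1002", "customer_id": "C-02", "agent": "A-17", "concession": "replacement unit"},
    {"service_request": "SR-1003", "customer_id": "C-99", "agent": "A-22", "concession": "accessory kit"},
]

# Hypothetical extract from the customer and shipping database.
shipping = [
    {"customer_id": "C-01", "name": "Pat Doe", "address": "4521 Oak Ridge Drive", "phone": "555-0101", "email": "pat@example.com"},
    {"customer_id": "C-02", "name": "Sam Roe", "address": "4521 Oak Ridge Drive", "phone": "555-0102", "email": "sam@example.com"},
]


def join_on_customer(concession_rows, shipping_rows):
    """Return (flat_rows, unmatched_rows) joined on customer_id."""
    shipping_by_customer = {row["customer_id"]: row for row in shipping_rows}
    flat, unmatched = [], []
    for txn in concession_rows:
        ship = shipping_by_customer.get(txn["customer_id"])
        if ship is None:
            # Keep transactions with no shipping record for manual review.
            unmatched.append(txn)
        else:
            flat.append({**txn, **ship})
    return flat, unmatched


flat_rows, unmatched_rows = join_on_customer(concessions, shipping)
print(len(flat_rows), "joined rows;", len(unmatched_rows), "without a shipping record")
```

I keep the unmatched transactions in their own list rather than dropping them, since a concession with no shipping record is itself worth a second look.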

Once I have joined my data and cleaned it of nulls and bad values, I am ready to import it into my visualization tool. The schema that I will use for this analysis will be the call center agent linked to the service request to the customer to the customer's shipping address, phone and email.
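The schema described above can be sketched as a set of links between typed entities. This is only an illustration of the data shape, not the import format of any specific tool, and the field names are assumed to match a hypothetical flat file with one row per concession.

```python
def rows_to_links(rows):
    """Turn flat rows into (entity, entity) links following the schema:
    agent -> service request -> customer -> address / phone / email."""
    links = set()
    for row in rows:
        agent = ("agent", row["agent"])
        request = ("service_request", row["service_request"])
        customer = ("customer", row["customer_id"])
        links.add((agent, request))
        links.add((request, customer))
        for attribute in ("address", "phone", "email"):
            value = row.get(attribute)
            if value:  # skip nulls and empty values left after cleaning
                links.add((customer, (attribute, value)))
    return links


# Hypothetical sample row.
sample = [{
    "agent": "A-17", "service_request": "SR-1001", "customer_id": "C-01",
    "address": "4521 Oak Ridge Drive", "phone": "555-0101", "email": "",
}]
for link in sorted(rows_to_links(sample)):
    print(link)
```

Tagging each value with its entity type keeps a phone number and a customer ID that happen to share the same text from collapsing into one node.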

After completion of the initial import of all my data, I expect the visualization to appear as in the example below. The reason for such large clusters of data is the call center agent entity. Through normal transactions, multiple agents are going to be linked together by the customers they share; this is not an indicator of fraud.

The good thing is that for my first cluster analysis of this data, I am not even going to use the call center agent at all in my visualization. By filtering out the call center agent and temporarily hiding that entity, I can focus on customer clusters linked by address and service request. By looking for groups of customers which are linked together, I can identify possible destinations for fraudulent concessions, friends of agents who are being shipped merchandise or, in the case of external fraud, individuals who are taking advantage of my call center reps to gain free merchandise.
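Outside of a visualization tool, the same idea can be approximated by grouping customers into connected components while leaving the agent out of the graph entirely. The sketch below is my own simplified illustration; the field names and sample rows are hypothetical, and it links customers only through service requests and shipping addresses.

```python
from collections import defaultdict


def customer_clusters(rows):
    """Group customers linked by shared addresses, ignoring agents.
    Returns lists of customer IDs, largest cluster first."""
    neighbors = defaultdict(set)
    for row in rows:
        customer = ("customer", row["customer_id"])
        for other in (("address", row["address"]),
                      ("service_request", row["service_request"])):
            neighbors[customer].add(other)
            neighbors[other].add(customer)

    seen, clusters = set(), []
    for start in neighbors:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first walk over one connected component
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(neighbors[node] - seen)
        clusters.append(sorted(v for kind, v in component if kind == "customer"))
    clusters.sort(key=len, reverse=True)
    return clusters


# Hypothetical rows: three customers share one address, one stands alone.
rows = [
    {"service_request": "SR-1", "customer_id": "C-01", "address": "4521 Oak Ridge Drive"},
    {"service_request": "SR-2", "customer_id": "C-02", "address": "4521 Oak Ridge Drive"},
    {"service_request": "SR-3", "customer_id": "C-03", "address": "4521 Oak Ridge Drive"},
    {"service_request": "SR-4", "customer_id": "C-04", "address": "17 Harbor Lane"},
]
print(customer_clusters(rows))
```

Sorting the result by size mirrors what a charting tool does when it lays out the largest clusters first.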

I am moving on to my next step and hiding my call center agents to look for customer clusters. Don't worry, we will bring the agents back shortly. I am going to focus on the largest clusters of interrelated customers, which my visualization tool is going to sort for me from left to right in my chart.

My largest cluster of interrelated customers is involved in ten different service requests in which the customer was shipped a concession by the call center agent. There are seven different names associated with this cluster. However, they are all linked to the same address, which is extremely suspicious.
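The two numbers I read off the cluster, service requests and distinct names per address, can also be tallied directly from the flat file. This is a small sketch with hypothetical field names and made-up rows, not output from the analysis above.

```python
from collections import defaultdict


def summarize_by_address(rows):
    """Return (address, request_count, distinct_name_count) tuples,
    with the busiest addresses first."""
    requests = defaultdict(set)
    names = defaultdict(set)
    for row in rows:
        requests[row["address"]].add(row["service_request"])
        names[row["address"]].add(row["name"])
    summary = [(addr, len(requests[addr]), len(names[addr])) for addr in requests]
    summary.sort(key=lambda item: (item[1], item[2]), reverse=True)
    return summary


# Hypothetical rows for illustration only.
rows = [
    {"service_request": "SR-1", "name": "Pat Doe", "address": "4521 Oak Ridge Drive"},
    {"service_request": "SR-2", "name": "Sam Roe", "address": "4521 Oak Ridge Drive"},
    {"service_request": "SR-3", "name": "Pat Doe", "address": "4521 Oak Ridge Drive"},
    {"service_request": "SR-4", "name": "Lee Poe", "address": "17 Harbor Lane"},
]
for address, request_count, name_count in summarize_by_address(rows):
    print(address, request_count, name_count)
```

An address with many requests and many different names is the pattern described above; many requests under a single name may simply be one customer with a troublesome product.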

My next step is to leverage visual semantic searching across all my customer entities and attributes to detect attempts to create the appearance of two separate addresses by changing small details in the data, such as "123 Main Street" and "123 Main St." A strong visual analysis program, including the tool used in the example, i2, will incorporate smart matching to detect these entities, such as in the example below.
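If you want to pre-clean addresses before import, a simple normalization pass can fold this kind of variation together. To be clear, this is not how i2 or any other product implements smart matching; it is a minimal sketch of my own, and the suffix table is a small assumed sample rather than a complete list.

```python
import re

# Assumed sample of street suffix abbreviations; a real list would be longer.
SUFFIXES = {
    "street": "st", "avenue": "ave", "road": "rd",
    "drive": "dr", "boulevard": "blvd", "lane": "ln",
}


def normalize_address(address: str) -> str:
    """Lowercase, strip punctuation and abbreviate common street suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", address.lower()).split()
    return " ".join(SUFFIXES.get(token, token) for token in tokens)


print(normalize_address("123 Main Street") == normalize_address("123 main st."))
```

Normalizing first and comparing second keeps exact matching cheap, and leaves fuzzier comparison for the addresses that still refuse to line up.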

Once smart matching is completed and I have ensured that my clusters contain all linked entities and data, I am going to break out my largest cluster and bring in the call center agent or agents who created the service requests linked to the customer cluster.

Now before we go further, there are two possible scenarios which may be occurring with large clusters of interrelated customers, both of which will be detectable when we bring back our call center agents. First, there is a group of customers who are taking advantage of my company and call center by acquiring concessions or merchandise through false pretenses. If this is the case, then the service requests will be linked to different agents. The second will be an agent sending out concessions or merchandise to individuals fraudulently, in which case there will be two indicators: all of the service requests will be linked to the same agent, and the customer profile will have been created by the call center agent because no service call ever existed.
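The two scenarios can be expressed as a rough triage rule over the service requests in one cluster. The sketch below is my own simplification: the field names are hypothetical, and real clusters will often be mixed, so anything that does not fit cleanly is sent for manual review rather than labeled.

```python
def classify_cluster(requests):
    """Triage one customer cluster. Each request is a dict with the handling
    'agent' and the user who created the customer profile, 'profile_created_by'."""
    agents = {r["agent"] for r in requests}
    self_created = [r for r in requests if r["profile_created_by"] == r["agent"]]

    # Scenario two: one agent handled everything and also created the profiles.
    if len(agents) == 1 and len(self_created) == len(requests):
        return "possible agent fraud"
    # Scenario one: requests spread over several agents, none created the profiles.
    if len(agents) > 1 and not self_created:
        return "possible customer abuse"
    return "needs review"


# Hypothetical cluster: every request handled and set up by the same agent.
cluster = [
    {"agent": "A-17", "profile_created_by": "A-17"},
    {"agent": "A-17", "profile_created_by": "A-17"},
]
print(classify_cluster(cluster))
```

A label from a rule like this is only a starting point for the investigator, never a conclusion.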

Let's bring in the call center agent entity and see which scenario exists. From the visualization we see below, all of the interrelated customers and service requests are linked to two agents, and multiple concessions were sent to the same address.

The same call center agent linked to multiple service requests

Group of customers all linked to the same address

Call center agent linked to multiple service requests to customers linked to same address

To complete my analysis, I am going to examine who created the customer profiles for each of the customers in my visualization and also incorporate the call logs for the agents during the dates and times these service requests were created, to determine whether an actual call was inbound to the agent when the service requests were created.
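The call log check can be sketched as a time window comparison: for each service request, was the same agent on an inbound call at or shortly before the moment the request was created? The field names, sample records and the 15 minute allowance for after-call work below are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def requests_without_inbound_call(requests, call_log, grace_minutes=15):
    """Return service request numbers created when the agent had no inbound
    call in progress or recently ended (grace period is an assumed value)."""
    grace = timedelta(minutes=grace_minutes)
    flagged = []
    for req in requests:
        had_call = any(
            call["agent"] == req["agent"]
            and call["start"] <= req["created"] <= call["end"] + grace
            for call in call_log
        )
        if not had_call:
            flagged.append(req["service_request"])
    return flagged


# Hypothetical records for illustration only.
call_log = [
    {"agent": "A-17", "start": datetime(2024, 3, 4, 10, 0), "end": datetime(2024, 3, 4, 10, 12)},
]
requests = [
    {"service_request": "SR-1", "agent": "A-17", "created": datetime(2024, 3, 4, 10, 15)},
    {"service_request": "SR-2", "agent": "A-17", "created": datetime(2024, 3, 4, 14, 30)},
]
print(requests_without_inbound_call(requests, call_log))
```

A request with no matching call is not proof of fraud on its own, since callbacks and system outages can produce the same gap, but combined with a shared shipping address it is a strong indicator.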


From the visualization examples shown, we have identified two call center agents who are actively engaged in fraudulently shipping concessions (merchandise) to individuals for the purpose of converting it for their own use.

All of the service requests in this cluster occurred over a two-week period, and all were linked to different individuals living at the same address.

For a strong proactive deterrent to this type of fraud, concessions should be visualized on a regular schedule based on the velocity of calls, number of agents and locations, so that the analyst doesn't end up in information overload. For example, if my organization has ten call centers with 100 agents in five different countries handling 1000 transactions a month, I might want to schedule my analysis on a weekly basis to best identify the activity without being overwhelmed by the data.

By leveraging visual analysis in my fraud investigation and deterrence program, I can add another level of security for my call center, company and client, allowing for timely identification of fraudulent internal and external schemes.