Welcome to Understanding Link Analysis. The purpose of my site is to discuss the methods behind leveraging visual analytics to discover answers and patterns buried within data sets.

Visual analytics provides a proactive response to threats and risks by holistically examining information. Unlike traditional data mining, visualizing information lets patterns that run contrary to normal activity surface within very few occurrences.

We can dive into thousands of insurance fraud claims to discover clusters of interrelated parties involved in a staged accident ring.

We can examine months of burglary reports to find a pattern leading back to a suspect.

With the new generation of visualization software our team is developing, we can dive into massive data sets and visually find new trends, patterns and threats that would take hours or days using conventional data mining.

The eye processes information much more rapidly when it is presented as images, and this has been true since we first learned to read as children. As our instinct develops over time, so does our ability to process complex concepts through visual identification. This is the power of visual analysis that I focus on in my site.

All information and data used in articles on this site is randomly generated with no relation to actual individuals or companies.

Leveraging Visual Analytics for eCommerce Fraud Detection

In 2010, online fraud cost merchants $2.7 billion in losses in the U.S. and Canada alone. Losses stemming from unauthorized credit card transactions accounted for $100 million of that total, and the trend is expected to increase 23% in 2011.

Online fraud is the new Wild West of crime, where anyone from individuals to organized rings can operate anywhere in the world while remaining practically invisible. At the same time, the percentage of online sales is expected to double by 2014, with more households purchasing goods and services online than from brick-and-mortar locations.

Based on these trends, shying away from online commerce is not an option, so retailers and online merchants have little choice but to reach for effective fraud solutions to level the playing field against those robbing them armed with nothing more than an Internet connection.
Compounding the issue, your site is open 24 hours a day, 7 days a week, and your checkout aisle is limited only by your bandwidth, so minutes can mean thousands of dollars in potential fraud losses.

Using Data Visualization to Combat Online Fraud

While online merchants lose the protection of face-to-face "know your customer" transactions, a wide variety of data is captured during online purchase sessions that can be used effectively to combat fraud.

Data points such as IP address and associated IP geo-location data, machine and device fingerprinting IDs, shipping and billing address attributes, and item purchasing velocities can all be leveraged within visual analytics to detect abnormal patterns in online buying behavior indicative of fraud.

As opposed to data mining, visual analytics gives us the capability to distinguish patterns that differentiate themselves from normal transaction flow. The more transactions that occur on your system, the more fraud instances have to occur before they become statistically relevant in data mining, and thus the more losses the merchant has to realize. This is simply not an acceptable fraud prevention strategy.

Within visual analytics in SynerScope, data is examined holistically, through multiple perspectives of hierarchy. Because we are visualizing the entire transaction flow, abnormal activity within a visualization presents itself much earlier, sometimes in as few as two transactions.

Visualizing eCommerce Driven Transaction Flow

For this example I am going to import several months of online transactions into the SynerScope visualization. SynerScope's ability to visualize large amounts of data resides in the tool's ability to leverage the natural hierarchies that exist within data sources. By utilizing this hierarchy, relationships and events are bundled and positioned within the visualization to give the user a complete, holistic view of activity, relationship and time.

Within our current example I have utilized a hierarchy from our customers based on specific identifiers captured within the transaction flow including IP State, IP City and the specific IP address utilized at the time of transaction. I am relating that activity to the product item numbers available on my site, the quantity ordered and the price of the transaction.
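To make the idea of this hierarchy concrete outside of the tool, here is a minimal Python sketch of how transactions could be nested by IP state, IP city and IP address, with each leaf holding the item, quantity and price of an order. This is not how SynerScope works internally; the field names and the sample records are assumptions made up purely for illustration.

```python
from collections import defaultdict

# Hypothetical transaction records. Field names and values are
# illustrative assumptions, not an export format from any tool.
transactions = [
    {"ip_state": "TX", "ip_city": "Dallas", "ip": "192.0.2.10",
     "item": "A100", "qty": 2, "price": 499.00},
    {"ip_state": "TX", "ip_city": "Dallas", "ip": "192.0.2.10",
     "item": "A100", "qty": 2, "price": 499.00},
    {"ip_state": "TX", "ip_city": "Austin", "ip": "192.0.2.77",
     "item": "B200", "qty": 1, "price": 25.00},
    {"ip_state": "NY", "ip_city": "New York", "ip": "198.51.100.9",
     "item": "A100", "qty": 1, "price": 249.50},
]


def build_hierarchy(rows):
    """Nest orders as state -> city -> IP -> list of (item, qty, price)."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for r in rows:
        order = (r["item"], r["qty"], r["price"])
        tree[r["ip_state"]][r["ip_city"]][r["ip"]].append(order)
    return tree


hierarchy = build_hierarchy(transactions)

# Walk the hierarchy and print the order count at each IP address.
for state, cities in hierarchy.items():
    for city, ips in cities.items():
        for ip, orders in ips.items():
            print(state, city, ip, len(orders))
```

Each level of nesting corresponds to one level of the hierarchy described above, so order counts can be rolled up by address, city or state.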

The reason for this particular hierarchy is to examine the relationship and velocity between the locations being used online to conduct the transactions and the specific items being purchased. If we equate this to shoplifting activity at traditional stores, we know that professional shoplifters tend to target merchandise which has a higher retail value or an increased resale value on the black market. Since the transactions occur online, the shoplifter no longer has to worry about size and concealment.

From the highest level within the SynerScope relationship view, we can already identify several IP addresses with large velocities of online transaction activities or orders. SynerScope provides a visual representation of velocity and interactivity (or betweenness) through weighted size of attributes and linearization or width of connecting bundles.

From our visualization I see a large velocity of orders being placed by a series of IP addresses in Dallas, Texas.
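The same notion of velocity can be approximated in a few lines of Python by counting orders per IP address and comparing each count to what is typical. This is only a rough sketch under my own assumptions: the sample addresses are made up, and the rule of flagging anything at or above three times the median count is an arbitrary threshold I chose for illustration, not something taken from SynerScope.

```python
from collections import Counter
from statistics import median

# Hypothetical list with one IP address per order placed (made-up data).
order_ips = (
    ["192.0.2.10"] * 9
    + ["192.0.2.11"] * 2
    + ["198.51.100.5"] * 1
    + ["203.0.113.7"] * 2
    + ["203.0.113.8"] * 1
)


def high_velocity_ips(ips, factor=3):
    """Return {ip: order_count} for IPs whose count is at least
    `factor` times the median order count across all IPs."""
    counts = Counter(ips)
    typical = median(counts.values())
    return {ip: n for ip, n in counts.items() if n >= factor * typical}


flagged = high_velocity_ips(order_ips)
print(flagged)
```

In the visualization, this count is what the weighted size of an entity and the width of its bundle convey at a glance.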

I am going to highlight the activity by hovering over the bundles connected to a specific IP address to gain a better understanding of it. You will notice from the picture that, as I hover over the bundle, the relationship view indicates which products this IP is associated with and the sequential event viewer displays the time ranges in which these orders were placed.

From the relationship view, this IP is associated with a large number of product orders, and the interactive sequential view shows that the orders are all being placed in bursts of online activity during very short time spans. This is highly suspect activity that runs contrary to normal online ordering patterns.

To gain a complete understanding of the activity, I can surface the underlying data by right-clicking on the highlighted bundle and displaying the data within SynerScope's data stream view.

By surfacing the underlying data, I gain granular detail into the order information and notice a trend where multiple separate orders are being placed, in some cases, within minutes of each other.
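For readers who want to check this kind of pattern directly against the order data, the sketch below groups one IP address's order timestamps into bursts, where a burst is a run of orders each placed within a few minutes of the previous one. The five-minute gap, the minimum of three orders and the sample timestamps are all assumptions of mine for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical order timestamps for a single IP address (made-up data).
order_times = [
    datetime(2011, 3, 1, 9, 0),
    datetime(2011, 3, 1, 9, 2),
    datetime(2011, 3, 1, 9, 3),
    datetime(2011, 3, 1, 9, 5),
    datetime(2011, 3, 1, 13, 30),
    datetime(2011, 3, 1, 18, 0),
    datetime(2011, 3, 1, 18, 1),
]


def find_bursts(timestamps, max_gap=timedelta(minutes=5), min_orders=3):
    """Group sorted timestamps into runs where consecutive orders are at
    most `max_gap` apart; keep only runs with at least `min_orders`."""
    bursts, current = [], []
    for t in sorted(timestamps):
        # A gap larger than max_gap closes the current run.
        if current and t - current[-1] > max_gap:
            if len(current) >= min_orders:
                bursts.append(current)
            current = []
        current.append(t)
    # Do not forget the final run.
    if len(current) >= min_orders:
        bursts.append(current)
    return bursts


bursts = find_bursts(order_times)
for burst in bursts:
    print(len(burst), "orders between", burst[0], "and", burst[-1])
```

With the sample data, only the four morning orders form a burst; the lone afternoon order and the pair of evening orders fall below the minimum.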

By experience, the user understands that this is very high risk activity that would not normally occur within traditional online shopping patterns. The user at this point can intervene in the processing and shipping of these orders to mitigate the risk.

Visually Mining Into Data

Now that I have detected and mitigated the "low hanging fruit" displayed on my initial visual import, I am going to begin drilling down into my transactional data by utilizing SynerScope's sequential event viewer to discover what other fraud patterns may be hiding in my data.

You will notice in the associated picture that new patterns and entity velocities have surfaced. This time I will focus on a specific product type, quantity and price that is showing a very high velocity based on the bundle width and the entity weight.

As I hover over and highlight the connecting bundle in the relationship view associated with this product, you can immediately observe that this specific product group is associated with a group of IP addresses originating in New York City.

Through the sequential event view, we also notice that the orders are being placed in very short time spans with unusual velocity, which raises a red flag.

I surface the underlying data within the data stream view and notice online transactions with very high, identical dollar amounts being placed in groups, minutes apart from each other. Additionally, all of the orders are for the same product, same quantity and same price.
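The pattern described here, identical product, quantity and price repeated within minutes across several IP addresses, can also be expressed as a simple grouping. The Python sketch below is an illustration under my own assumptions: the order tuples are made up, prices are stored in cents to avoid floating point comparisons, and the thresholds of three orders within thirty minutes are arbitrary.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical orders: (timestamp, ip, item, quantity, price_in_cents).
orders = [
    (datetime(2011, 3, 1, 14, 0), "198.51.100.1", "TV-55", 2, 189999),
    (datetime(2011, 3, 1, 14, 3), "198.51.100.2", "TV-55", 2, 189999),
    (datetime(2011, 3, 1, 14, 4), "198.51.100.1", "TV-55", 2, 189999),
    (datetime(2011, 3, 1, 14, 9), "198.51.100.3", "TV-55", 2, 189999),
    (datetime(2011, 3, 1, 10, 0), "203.0.113.7", "CABLE", 1, 999),
    (datetime(2011, 3, 1, 8, 0), "203.0.113.8", "TV-55", 1, 94999),
    (datetime(2011, 3, 1, 20, 0), "203.0.113.9", "TV-55", 1, 94999),
]


def repeated_orders(rows, min_count=3, max_span=timedelta(minutes=30)):
    """Flag (item, qty, price) combinations ordered at least `min_count`
    times where the first and last order are within `max_span`."""
    groups = defaultdict(list)
    for ts, ip, item, qty, price in rows:
        groups[(item, qty, price)].append((ts, ip))

    flagged = {}
    for key, hits in groups.items():
        hits.sort()  # order by timestamp
        span = hits[-1][0] - hits[0][0]
        if len(hits) >= min_count and span <= max_span:
            flagged[key] = {
                "orders": len(hits),
                "ips": sorted({ip for _, ip in hits}),
                "span": span,
            }
    return flagged


flagged = repeated_orders(orders)
for key, info in flagged.items():
    print(key, info["orders"], "orders from", len(info["ips"]), "IPs in", info["span"])
```

Note that this sketch measures the time span of the whole group, so a combination that also sells legitimately throughout the day would not be flagged; a production rule would need a sliding time window instead.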

Based on the visual analysis of this activity provided by SynerScope, the user has identified a high risk trend and may take steps to immediately mitigate the risk to the merchant.

In Conclusion:

As you can see from this demonstration, by leveraging visual analytics, a user is able to quickly and intuitively identify abnormal trends and patterns in online transaction activity. The data, when rendered visually, makes spotting potential threats within the online transaction system much easier, as they surface against the stream of normal transactions, essentially helping the user find "the needle in the haystack".

For a complete, full-motion demonstration of leveraging visual analytics for eCommerce fraud detection, please view the video below.