Welcome


Welcome to Understanding Link Analysis. The purpose of my site is to discuss the methods behind leveraging visual analytics to discover answers and patterns buried within data sets.

Visual analytics provides a proactive response to threats and risks by holistically examining information. Unlike traditional data mining, visualizing information lets patterns of activity that run contrary to normal activity surface within very few occurrences.

We can dive into thousands of insurance fraud claims to discover clusters of interrelated parties involved in a staged accident ring.

We can examine months of burglary reports to find a pattern leading back to a suspect.

With the new generation of visualization software our team is developing, we can dive into massive data sets and visually find new trends, patterns and threats that would take hours or days using conventional data mining.

The eye processes information much more rapidly when it is presented as images; this has been true since children started learning to read. As our instinct develops over time, so does our ability to process complex concepts through visual identification. This is the power of visual analysis that I focus on in my site.

All information and data used in articles on this site is randomly generated with no relation to actual individuals or companies.

Pushing Fraud Upstream is the Goal

The ultimate goal of any fraud program is to push detection of suspect activity as far upstream as possible. In reality, what often happens is that companies become entrenched in reactive analytics at the transaction or loss level without figuring out how the threat made it through the door in the first place. A large percentage of suspect or fraudulent activity can be detected at time of entry or account creation, before a single transaction is made, and often much more easily than at the transaction level.



Transaction level fraud detection is an essential component of any fraud prevention platform, and there is much to be leveraged at the transaction level that lends itself to robust protection, but for this layer to start working a fraudulent transaction has to take place. If transaction level detection is your company’s first line of defense, it’s the same as leaving your door unlocked for the burglar because you have a camera inside. For the strongest protection from fraud, take a layered approach to fortifying your platform, starting with the behavior of users the minute they enter the door; that gives you a first chance at profiling risk before a loss can occur.

Most of fraud prevention is pattern detection: interrelationships between activities that should be random increase the likelihood of fraud. The more interrelationships, the higher the risk of fraud, and the best way to target these interrelationships is to aim detection at the activities that generate the most velocity. Know your enemy as early as possible.

Locking the front door



With any luck the vast majority of users utilizing your site are legitimate. They browse your site, set up an account and transact in very distinct patterns based on the user funnel you have established for them. Since most companies’ goal is to build the site with a specific user behavior in mind, it is likely that your usability engineers already have metrics on how users interact with the site in the way it was intended, and the data can be as granular as page click and movement logging.

This is exactly where we want our fraud platform’s first layer of security to start: I want to be able to detect when an account is created by someone with a high probability of committing fraud. Organized fraud has several weaknesses that can be exploited. First, they need a network that hides their identity and location. Second, they need a financial network to execute transactions and pay for goods or extract funds (depending on your business) and to move that money to a safe harbor. Third, because they are a business, they need to create many accounts in a short period of time to get a return on investment.

The reason for building a strong fraud prevention system at the account creation layer is to take advantage of that weakness in the organized fraud scheme. Fraudsters may spread attributes and velocities across transactions more effectively, but in order to commit the fraud in the first place they have to create multiple accounts on your site, and out of sheer necessity they must do it in a way that runs contrary to the behavior of your legitimate users.

Start by examining velocity signatures on your site by attributes captured at the account creation stage. You should be able to establish a baseline of legitimate account creation based on attributes such as user agent, IP address, tracking cookie and device fingerprint. In my case I was examining a series of organized fraud activities and started looking at user behavior on entry for indicators where I could tie the activity to a set or signature of actions and velocities from attributes. I found that even under the most extreme circumstances, a normal user would never create more than X accounts from the same user agent and IP in any given session, and any time I found a browser agent with an account creation velocity of more than X that was tied to other behavioral red flags, there was a 95% chance that the account would engage in fraud. In most cases the velocity was much higher than X, and those were “low hanging fruit” detections; working my way down to 3 still provided a reliable indication when coupled with several signature indicators and behaviors (unfortunately I can’t tell everything; you never know who is reading this).
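As a rough illustration of the velocity check described above, here is a minimal Python sketch. It is not the rule set I actually ran: the function name flag_velocity, the event tuple layout, and the use of a fixed time window as a stand-in for a "session" are all assumptions made for the example, and the threshold is left as a parameter because the real value of X is not disclosed here.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_velocity(events, threshold, window):
    """Return the (user_agent, ip) pairs that created more than `threshold`
    accounts inside any time span of length `window`.

    `events` is an iterable of (timestamp, user_agent, ip) tuples, one per
    account creation. The layout is hypothetical; adapt it to your own logs.
    """
    by_key = defaultdict(list)
    for ts, user_agent, ip in events:
        by_key[(user_agent, ip)].append(ts)

    flagged = set()
    for key, stamps in by_key.items():
        stamps.sort()
        start = 0
        for end, ts in enumerate(stamps):
            # Advance the left edge until the span fits inside the window.
            while ts - stamps[start] > window:
                start += 1
            # end - start + 1 is the number of creations in the current span.
            if end - start + 1 > threshold:
                flagged.add(key)
                break
    return flagged
```

Calling flag_velocity(events, threshold=3, window=timedelta(hours=1)) would return every user agent and IP pair that created more than three accounts within an hour. On its own this is only a candidate list; as noted above, the indicator became reliable when coupled with other signature and behavior flags.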

In looking at the signature of the entry and account establishment behavior I could link multiple organized fraud instances together to gain an understanding of the scope of activity and the specific methods being used that differed from the way normal users would interact with the site.

By looking at this activity through visual analytics you can see the multiple layers of interrelated attributes that are created. This represents the activity of a single organized fraud ring creating accounts in a one hour period.
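Before anything is drawn, the link analysis behind a picture like this comes down to grouping accounts that share attribute values. The sketch below is one simple way to do that with a union-find structure; the function name link_clusters and the account dictionary layout are hypothetical, and a real link chart would also keep the attribute nodes themselves rather than only the account groupings.

```python
from collections import defaultdict


def link_clusters(accounts):
    """Group accounts that share any attribute value, directly or through
    a chain of other accounts.

    `accounts` maps account_id -> {attribute_name: value}, for example
    {"ip": "...", "device": "..."}. Returns a list of sets of account ids.
    """
    parent = {acc: acc for acc in accounts}

    def find(x):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (attribute_name, value) -> first account that used it
    for acc, attrs in accounts.items():
        for name, value in attrs.items():
            if value is None:
                continue  # a missing value should not link accounts together
            key = (name, value)
            if key in first_seen:
                union(first_seen[key], acc)
            else:
                first_seen[key] = acc

    clusters = defaultdict(set)
    for acc in accounts:
        clusters[find(acc)].add(acc)
    return list(clusters.values())
```

Large clusters built from accounts created within the same hour are the ones worth pulling into a visualization first.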




Next, look at the sequence of activities when creating the account. Normal users take a certain mean minimum time when establishing a new account. Normal users also don’t create an account and disappear; if they are going to take the time to create the account in the first place, they are going to browse and interact with your site for a certain amount of time. When your intention is organized fraud, however, you have to create a hundred accounts in a short amount of time. There are two things to look for. One, do you have users who are creating an account in a fraction of the time your legitimate users are taking? Two, do you have a number of accounts created within a short time frame where the page click logs show the exact (key word) pattern of account creation over and over? Even for a simple form, ten people will not fill out the form in the same sequence, in the same way, without making a mistake.
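The two checks above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production rule: the function name signup_flags, the record layout, and all three parameters are hypothetical, the baseline time is assumed to come from known legitimate traffic, and the list of signups is assumed to already be limited to a short time frame.

```python
from collections import Counter


def signup_flags(signups, baseline_seconds, speed_ratio, min_repeats):
    """Apply the two account creation checks to a batch of signups.

    `signups` is a list of dicts with the (hypothetical) keys:
      "account": account id
      "seconds": time taken to complete account creation
      "clicks":  ordered list of page or field events from the click log
    Returns (too_fast, repeated): two sets of account ids.
    """
    # Check 1: created in a fraction of the time legitimate users take.
    cutoff = baseline_seconds * speed_ratio
    too_fast = {s["account"] for s in signups if s["seconds"] < cutoff}

    # Check 2: the exact same click sequence repeated over and over.
    pattern_counts = Counter(tuple(s["clicks"]) for s in signups)
    repeated = {
        s["account"]
        for s in signups
        if pattern_counts[tuple(s["clicks"])] >= min_repeats
    }
    return too_fast, repeated
```

Accounts that land in both sets are the strongest candidates, since scripted creation tends to be both fast and identical from one run to the next.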

When committing fraud at the transaction level, organized fraud can spread these transactions and attributes over a much wider set of data points, but in account creation these indicators are much more condensed, both from need and by design of the account creation process on most sites.

Another key indicator to look for at account creation is network attributes and the relationship between multiple accounts on certain domains and IPs. Again, because of the nature of the account creation process, these network indicators become more condensed and dynamic when examined through visual analysis. Do you have abnormal user agent attributes jumping across multiple high risk domains and IPs within a short time span? For example, a user agent creates 10 accounts on one IP, jumps to another IP and creates another 10. In examining my own data, I never found a single case where a user would create legitimate accounts or transactions while jumping across network attributes. The key is understanding what the clear majority of legitimate users do in order to detect the much smaller percentage that don’t.
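The IP jumping example above can be expressed as a simple check. The following Python sketch is only an assumption about how such a rule might look: the function name ip_hoppers, the event layout and the default values are hypothetical. Since a plain user agent string can be shared by many legitimate users, in practice you would likely key on a device fingerprint or an otherwise abnormal user agent instead.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def ip_hoppers(events, window, min_ips=2, min_accounts_per_ip=10):
    """Return user agents that, within `window`, created at least
    `min_accounts_per_ip` accounts on each of at least `min_ips` IPs.

    `events` is an iterable of (timestamp, user_agent, ip) tuples,
    one per account creation.
    """
    by_agent = defaultdict(list)
    for ts, user_agent, ip in events:
        by_agent[user_agent].append((ts, ip))

    flagged = set()
    for agent, rows in by_agent.items():
        rows.sort()
        # Try each creation as the start of a window (quadratic, fine for a sketch).
        for i, (start_ts, _) in enumerate(rows):
            per_ip = Counter()
            for ts, ip in rows[i:]:
                if ts - start_ts > window:
                    break
                per_ip[ip] += 1
            busy_ips = sum(1 for n in per_ip.values() if n >= min_accounts_per_ip)
            if busy_ips >= min_ips:
                flagged.add(agent)
                break
    return flagged
```

The defaults mirror the example of 10 accounts on one IP followed by 10 on another; they are not tuned values.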



If you find a case where the network attributes are creating a pattern like the one below, that is a high risk cluster of interrelationships which shouldn’t exist (left).




The impulse is always to throw fraud detection at the threat “you can see”, at the transactional level. Fraud detection and risk scoring at the transaction level has a number of challenges. The activity is much more dispersed, so finding a pattern is harder; this is not true at account creation or site interaction. Transaction level fraud behavior is also much closer to normal user behavior. There is a difference, but detecting and measuring it is much more complex, because by the time a fraudster makes it to the transaction flow they are likely doing the same things normal users do; this is not true at account creation.

By coupling together behavior, risk attributes, network forensics and pattern analysis to create a high risk signature and running it against account creation, or even earlier in the entry phase, your fraud system can begin risk scoring and mitigation before the first transaction is ever attempted. I have worked at fraud prevention with many different types of companies, and regardless of whether it’s social media, ecommerce or FinTech fraud, the rule for account velocity has held true in targeting organized fraud across all of these business types.
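To make the idea of a combined high risk signature concrete, here is a minimal Python sketch of a weighted score over the account creation indicators discussed in this article. Everything in it is illustrative: the SignupSignals fields, the risk_score function and especially the weights are hypothetical placeholders, not values taken from a real fraud platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignupSignals:
    """Indicators observed for one account at creation time."""
    high_velocity: bool           # too many accounts from one user agent and IP
    too_fast: bool                # created in a fraction of the normal time
    repeated_click_pattern: bool  # exact same click sequence as other signups
    ip_hopping: bool              # same agent jumping across IPs


# Hypothetical weights, chosen only so the example sums to 100.
WEIGHTS = {
    "high_velocity": 40,
    "too_fast": 20,
    "repeated_click_pattern": 25,
    "ip_hopping": 15,
}


def risk_score(signals, weights=WEIGHTS):
    """Sum the weights of every indicator that fired for this signup."""
    return sum(w for name, w in weights.items() if getattr(signals, name))
```

A score like this can be computed the moment the account is created, so mitigation can start before the first transaction. In practice the weights would come from your own confirmed fraud cases rather than being set by hand.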


While watching the news as I wrote this article, I heard that Facebook is hiring 3,000 people to moderate video in an effort to remove offensive or dangerous content from the site faster, and it made me wonder: could the people who create this type of content be cohorted and risk assessed based on the way they create and interact with their accounts?