06/01 2020

AWS re:Invent 2019 – How to detect fraud with AWS’s new ML service?

Nextlink’s technical editors were very interested in Amazon Fraud Detector, the new fraud detection service that the AWS CEO announced during the Executive Keynote session on the second day of re:Invent 2019. They therefore attended the session that introduced it, which is the topic we would like to share: how can machine learning be used for fraud detection?

At the start of the session, the vice president of fraud and financial crime at the US financial institution Charles Schwab & Co. shared how the company catches online financial fraud in its daily operations. He first listed the most common types of fraud: new-account fraud (with suspicious domains or accounts registered in specific countries), online account takeovers, payment fraud, guest purchases (made without a registered account), and social engineering fraud. Because fraud patterns change constantly, Charles Schwab uses algorithms and an in-house monitoring system to find common factors across fraud cases and improve the accuracy of its internal fraud detection. He closed by proposing an overall fraud prevention strategy in four steps: prevention first, then daily detection, repair, and finally deterrence.

“Machine Learning can contribute to the first two of the above four steps: prevention and detection,” said the general manager of fraud prevention at AWS. An ML model learns general patterns from a large amount of data and uses them to assess the risk of transaction events. When scammers make minor adjustments to a scam, the ML model still treats them as suspicious because they behave differently from legitimate customers.

Training an ML model requires three elements:

Data

Training a model requires a large amount of sample data: at least 10,000 samples, and the more the better. Each sample includes the value to be predicted and the independent variables, which can be categorical or numerical.

Labels

Each sample needs to be labeled with the result you want the model to predict. A numerical target calls for regression, and a categorical target calls for classification.

Algorithm

Options range from simple linear regression to complex deep neural networks.
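
As a minimal sketch of how the three elements fit together, the Python below trains a small logistic regression (one of the simpler algorithms) on synthetic labeled data with one numerical and one categorical variable. The field names (`amount`, `email_domain`), the example domains, and the dataset itself are assumptions made up for illustration, and only 200 samples are generated to keep the example fast. It is not a description of how Amazon Fraud Detector works internally.

```python
import math
import random

# Hypothetical categories for the categorical variable.
DOMAINS = ["mail.example", "corp.example", "tempmail.example"]


def make_dataset(n_per_class=100, seed=0):
    """Build synthetic samples and labels (1 = fraud, 0 = legitimate)."""
    rng = random.Random(seed)
    samples, labels = [], []
    for _ in range(n_per_class):
        # Legitimate: small amounts, ordinary domains.
        domain = "mail.example" if rng.random() < 0.9 else "corp.example"
        samples.append({"amount": rng.uniform(10, 200), "email_domain": domain})
        labels.append(0)
        # Fraud: large amounts, mostly throwaway domains.
        domain = "tempmail.example" if rng.random() < 0.8 else "mail.example"
        samples.append({"amount": rng.uniform(500, 1000), "email_domain": domain})
        labels.append(1)
    return samples, labels


def encode(sample):
    """Scale the numerical variable and one-hot encode the categorical one."""
    one_hot = [1.0 if sample["email_domain"] == d else 0.0 for d in DOMAINS]
    return [sample["amount"] / 1000.0] + one_hot


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=300, lr=0.5):
    """Fit logistic regression weights with full-batch gradient descent."""
    rows = [encode(s) for s in samples]
    weights, bias = [0.0] * len(rows[0]), 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def score(sample, weights, bias):
    """Return the model's estimated probability that the sample is fraud."""
    x = encode(sample)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


samples, labels = make_dataset()
weights, bias = train(samples, labels)
fraud_score = score({"amount": 950.0, "email_domain": "tempmail.example"}, weights, bias)
legit_score = score({"amount": 20.0, "email_domain": "mail.example"}, weights, bias)
print(f"suspicious: {fraud_score:.2f}, ordinary: {legit_score:.2f}")
```

The large purchase from a throwaway domain should score well above the small purchase from an ordinary domain, because the labels taught the model which combinations of values tend to go with fraud.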

Of course, these three elements are only the preparation needed to build the initial ML model. Afterwards, it is necessary to keep recording the events confirmed as fraud, retrain the model, and tune the detection rules.
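
The record-and-retrain cycle can be sketched roughly as follows. This is only an illustration under stated assumptions: the `FeedbackLoop` class, the `retrain_every` policy, and the toy `fraud_rate_by_country` function standing in for real model training are all hypothetical and do not come from the session.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]
History = List[Tuple[Event, int]]


def fraud_rate_by_country(history: History) -> Dict[str, float]:
    """Toy stand-in for model training: the fraud rate seen per country."""
    totals: Dict[str, int] = {}
    frauds: Dict[str, int] = {}
    for event, label in history:
        country = event["country"]
        totals[country] = totals.get(country, 0) + 1
        frauds[country] = frauds.get(country, 0) + label
    return {c: frauds[c] / totals[c] for c in totals}


class FeedbackLoop:
    """Collects confirmed outcomes and retrains after every N new labels."""

    def __init__(self, train_fn: Callable[[History], object], retrain_every: int):
        self.train_fn = train_fn
        self.retrain_every = retrain_every
        self.history: History = []
        self.pending = 0   # labels recorded since the last retraining
        self.model = None  # latest trained model, if any

    def record(self, event: Event, is_fraud: bool) -> bool:
        """Store a labeled event; return True if this call retrained the model."""
        self.history.append((event, 1 if is_fraud else 0))
        self.pending += 1
        if self.pending >= self.retrain_every:
            self.model = self.train_fn(self.history)
            self.pending = 0
            return True
        return False


loop = FeedbackLoop(fraud_rate_by_country, retrain_every=2)
first = loop.record({"country": "AA"}, is_fraud=True)
second = loop.record({"country": "AA"}, is_fraud=False)  # triggers retraining
third = loop.record({"country": "BB"}, is_fraud=False)
```

In a real system the retraining trigger would more likely depend on time or on measured model performance than on a fixed count.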

Before ML models, detecting fraud was very difficult and required a lot of manpower. Scammers change their methods often, and earlier approaches hard-coded the detection logic, so overall detection efficiency could not be raised effectively. However, using ML for fraud detection is not simple either, because it demands a lot of up-front investment: ML experts are hard to find and expensive, general-purpose models cannot be used for specific purposes, the early stages take a lot of time, and finding errors in a model is like finding a needle in a haystack.

That’s why AWS introduced Amazon Fraud Detector, a fraud detection service that makes it easy for companies to use machine learning to detect online fraud promptly and at scale. It aims to build high-quality fraud detection ML models faster, block scammers from the start, build in online fraud expertise, and give fraud control teams more control. You save historical data as CSV files, upload them to S3, and link them to Amazon Fraud Detector to create a model from a detection template. The template checks and enriches the data, performs feature engineering, selects algorithms, trains and tunes the model, validates its performance, hosts the model, and finally produces an API for the detection logic. This API can be applied to online transactions according to the rules you set. In practice, Amazon Fraud Detector gives each customer transaction a score based on the judgment of the ML model. If the score is too high, the transaction may be fraudulent.
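
The idea of applying rules to the model’s score can be sketched in a few lines of Python. This is an illustration only: the 0 to 1000 score range, the thresholds, and the outcome names (`block`, `review`, `approve`) are assumptions chosen for the example, and the `Rule` and `decide` names are hypothetical rather than part of any AWS API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    name: str
    condition: Callable[[float], bool]  # evaluated against the model score
    outcome: str


# Rules are checked in order and the first match wins (assumed thresholds).
RULES: List[Rule] = [
    Rule("high_risk", lambda s: s > 900, "block"),
    Rule("medium_risk", lambda s: s > 700, "review"),
    Rule("low_risk", lambda s: True, "approve"),
]


def decide(score: float, rules: List[Rule] = RULES) -> str:
    """Map a transaction's fraud score to an outcome using ordered rules."""
    if not 0 <= score <= 1000:
        raise ValueError(f"score out of assumed range 0-1000: {score}")
    for rule in rules:
        if rule.condition(score):
            return rule.outcome
    raise LookupError("no rule matched; add a catch-all rule")


for s in (950, 800, 100):
    print(s, decide(s))
```

Keeping the thresholds in rules rather than in the model lets a fraud team tighten or relax decisions without retraining.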

Amazon Fraud Detector’s main features include pre-built fraud detection templates and the automated creation of fraud detection models on demand. Past evaluations, detection logic, and results can be reviewed through a graphical interface. Users can also integrate Amazon SageMaker for other use cases: for example, using Amazon Fraud Detector to detect fraudulent customer transactions and SageMaker to check the risk of account data leaks.

Although Amazon Fraud Detector is still in preview, once it becomes generally available it can be applied across industries to help catch online fraud and make the overall online transaction environment more secure.