Financial organisations devote substantial staff time to manual review and monitoring of transactions. Often an entire team spends its days examining transactions and identifying commonly occurring counterparties, destinations and patterns. The aim of this review is to detect potential fraud and the misuse of accounts for money laundering or tax evasion. We see a clear opportunity here for automated data analytics solutions to make transaction analysis more efficient and to discover patterns automatically.
In the previous two parts of our blog series, we described how data analytics solutions can support the detection of “static” KYC incoherencies, such as missing risk flags, and “dynamic” ones, like a mismatch between transactional geography and the “Countries of business” KYC section. In this last post of the series, we consider the most dynamic and hardest-to-detect violation: a mismatch between transactional behaviour and the account purpose stated in the KYC.
If we asked a client review officer to share their “gut feeling” for how accounts with different stated purposes behave, they would probably say the following. A typical inheritance account experiences one or two huge inflows, after which it is slowly (or quickly) drained by relatively infrequent outflows. Savings accounts usually have regular but infrequent inflows of about the same volume. The most active type is the commercial account, which performs many transactions and has a huge turnover and wide geography. But is there a way to translate this expertise into mathematical objects and apply cluster detection to them?
Step One – Aggregate Transactions into a Vector
The first step in a scientific approach to understanding transactional behaviour is feature engineering. Which characteristics could we design to describe an account’s transactional behaviour? The illustration below offers some examples.
Figure 1. Aggregated characteristics of transactional behaviour help gain a high-level understanding
From the above characteristics we can already gain a higher-level understanding of transactional behaviour and even try to guess the corresponding account’s purpose. Account #111111 is likely a savings or a salary account, with regular, mostly domestic, inflows. Account #222222 shows more active behaviour: frequent transactions, large turnover, wide geography; it is likely to be a commercial account. Finally, account #333333 looks like an inheritance account, with one large inflow and regular outflows.
As we see, aggregated transactional behaviour already offers strong evidence of the real account purpose. However, manual allocation based on gut feeling can only get us so far. A more efficient approach is to use clustering algorithms, which automatically detect cluster structure in the space of accounts’ aggregated transactional behaviour.
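To make the aggregation step concrete, here is a minimal sketch in Python. The `Transaction` record, the sign convention for amounts, the home country `"CH"`, the 12-month window and the specific descriptors are all our own assumptions for illustration; a real feature set would be designed together with compliance experts.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass
class Transaction:
    booked_on: date
    amount: float   # assumption: positive = inflow, negative = outflow
    country: str    # counterparty country code


def aggregate(transactions, home_country="CH", months_observed=12):
    """Collapse one account's transactions into a fixed set of descriptors."""
    inflows = [t.amount for t in transactions if t.amount > 0]
    outflows = [-t.amount for t in transactions if t.amount < 0]
    n = len(transactions)
    domestic = sum(t.country == home_country for t in transactions)
    return {
        "tx_per_month": n / months_observed,
        "avg_inflow": mean(inflows) if inflows else 0.0,
        "avg_outflow": mean(outflows) if outflows else 0.0,
        "turnover": sum(inflows) + sum(outflows),
        "n_countries": len({t.country for t in transactions}),
        "domestic_share": domestic / n if n else 0.0,
    }


# Hypothetical inheritance-like account: one large inflow, small regular outflows.
history = [
    Transaction(date(2023, 1, 10), 500_000.0, "CH"),
    Transaction(date(2023, 2, 1), -2_000.0, "CH"),
    Transaction(date(2023, 3, 1), -2_000.0, "CH"),
    Transaction(date(2023, 4, 1), -2_000.0, "CH"),
]
features = aggregate(history)
```

Every account ends up with the same set of keys, so the values can be read off in a fixed order to form the vector that a clustering algorithm expects.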
Step Two – Discover Cluster Structure
Once each account is described by a multi-dimensional vector, it is straightforward to run a clustering algorithm on these vectors. Choices include K-means, hierarchical clustering, density-based clustering, etc. Regardless of the specific algorithm chosen, a likely outcome would look as follows.
Figure 2. Application of a clustering algorithm helps efficiently detect clusters in the space of aggregated transactional behaviour
For the purpose of illustration, we reduce the number of account descriptors, or dimensions, to two: average frequency of transactions and average volume. Each dot represents one account, with its transactional behaviour aggregated over the last year. Two outliers pop up instantly: a savings account and an inheritance account that the algorithm allocated to the cluster of commercial accounts. Chances are that these accounts were either mislabelled from the beginning or are being misused. They could even be serving as money mule accounts for money laundering or other fraudulent activities. These outliers require further investigation and would be very hard to spot in a manual review without clustering.
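The idea can be sketched in a few lines of Python. This is a toy illustration under our own assumptions, not a production method: the account IDs and numbers are invented, K-means is just one of the possible algorithm choices and is written here in plain Python with a deterministic farthest-point initialisation, and an account is flagged when its stated purpose differs from the majority purpose of its cluster.

```python
from collections import Counter
from math import dist
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so frequency and volume are on comparable scales."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]


def kmeans(points, k, iters=50):
    """Plain K-means; initial centroids are picked by farthest-point search."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(mean(c) for c in zip(*members))
    return labels


def flag_mismatches(accounts, k=3):
    """Return (id, stated purpose, cluster majority purpose) for mismatches."""
    labels = kmeans(standardize([a[2] for a in accounts]), k)
    majority = {
        j: Counter(a[1] for a, l in zip(accounts, labels) if l == j).most_common(1)[0][0]
        for j in set(labels)
    }
    return [(a[0], a[1], majority[l]) for a, l in zip(accounts, labels) if a[1] != majority[l]]


# Hypothetical data: (id, stated purpose, (transactions per month, average volume))
accounts = [
    ("ACC-01", "savings", (1.0, 1000)), ("ACC-02", "savings", (1.5, 1200)),
    ("ACC-03", "savings", (2.0, 900)), ("ACC-04", "savings", (1.0, 1500)),
    ("ACC-05", "inheritance", (0.3, 60000)), ("ACC-06", "inheritance", (0.5, 80000)),
    ("ACC-07", "inheritance", (0.4, 70000)),
    ("ACC-08", "commercial", (40, 6000)), ("ACC-09", "commercial", (55, 8000)),
    ("ACC-10", "commercial", (50, 7000)), ("ACC-11", "commercial", (60, 9000)),
    ("ACC-12", "savings", (48, 7500)), ("ACC-13", "inheritance", (52, 6500)),
]
flagged = flag_mismatches(accounts)
```

With this toy data, `flagged` contains ACC-12 and ACC-13, the savings and inheritance accounts that behave like commercial ones. Standardising the features first matters: without it, the volume axis would dominate the distance and the frequency signal would be lost.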
This analysis could be embedded into the periodic review process. Executed regularly, the cluster detection algorithm monitors transactional behaviour and picks up deviations and anomalies at each run. Accounts with detected inconsistencies are moved up in the periodic review priority and submitted to a compliance officer. The picture below offers a schematic overview of the process.
Figure 3. Regular monitoring of transactional behaviour offers support of periodic reviews and improves KYC compliance
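The prioritisation step could be as simple as the following sketch. The `ReviewItem` record, its fields and the ordering rule (flagged accounts first, then by scheduled review date) are hypothetical choices of ours; an actual review workflow will have its own data model and rules.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ReviewItem:
    account_id: str
    next_review: date  # scheduled date of the next periodic review


def prioritise(queue, flagged_ids):
    """Put accounts with detected inconsistencies first, then order by due date."""
    flagged = set(flagged_ids)
    return sorted(queue, key=lambda item: (item.account_id not in flagged, item.next_review))


# Hypothetical review queue; ACC-13 was flagged by the latest monitoring run.
queue = [
    ReviewItem("ACC-01", date(2024, 3, 1)),
    ReviewItem("ACC-13", date(2024, 9, 1)),
    ReviewItem("ACC-08", date(2024, 1, 15)),
]
ordered = prioritise(queue, flagged_ids={"ACC-13"})
```

Here ACC-13 jumps to the front of the queue even though its regular review is scheduled last, while the remaining accounts keep their date order.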
In this blog series we focused on the use of customised data analytics to achieve and maintain high KYC quality. We have seen that KYC compliance is an essential prerequisite for successful implementation of AML, fraud and corruption detection, and how challenging KYC monitoring can be. Data analytics solutions support the detection of static KYC incoherencies, such as missing risk flags, and dynamic violations, such as inconsistent transactional behaviour or risky transaction geography. Finally, we considered different options for integrating the data analytics engine into business processes. At the onboarding stage, the engine detects static inconsistencies. When used for periodic reviews, the data analytics solutions detect accounts with deviating transactional behaviour and prioritise them.
In combination with smartKYC, a cross-channel, multilingual client screening tool, data analytics solutions provide continuous, fluid risk monitoring and help ensure uninterrupted compliance. If you would like to know more about this topic, take a look at our recent webinar series.
Part 1 replay can be found here
Part 2 replay can be found here