Debunking Myths: Why Machine Learning for Finance is Applicable, Accessible (Part 1 of 2)


Editor’s Note: Misperceptions about machine learning lead many finance teams to avoid investing in related technologies. Let’s focus on two of the four misperceptions today and discuss the rest later this week.


A recent PwC study found that over the next two to three years “basic and intermediate AI,” or machine learning, will be the single most important technology impacting the finance function.

It’s easy to see where the respondents are coming from: The finance function is, by nature, forward-looking, and machine learning’s ability to make accurate predictions will lead to faster, more informed decisions.

Despite this, a recent Workday survey found that only 35% of corporate finance teams are making extensive use of advanced analytics, including machine learning, in key finance areas such as planning, budgeting, and forecasting.

This raises the question: Why is there such a disconnect between the perceived benefits of machine learning and the application of the technology?

We believe there are four misperceptions leading to approximately two-thirds of corporate finance teams not investing in machine learning:

  • Machine learning requires big data
  • Machine learning requires staffing data scientists
  • The cost is high with limited or unknown benefit
  • It is more science fiction than true science

We will break down these misperceptions to show how applicable and accessible machine learning can be in the finance function.

Misperception: Big data is needed

Currently, the perception of many finance professionals is that machine learning is a solution that requires big data.

Most FP&A departments don’t have big data. Machine learning can produce extremely effective results with small or medium data sets as well.

To illustrate this, we devised an experiment where 30 inputs were manipulated through a variety of mathematical techniques and then randomly weighted to generate an output.

To confuse any prediction algorithm further, we gave half of the inputs no weighting at all.

Only the inputs and the single output were fed into our platform; all manipulations and weightings were hidden. By any conventional statistical method, our outputs resembled chaos.

We then tested selected prediction algorithms on a scale from 10 to 100,000 observations.
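The setup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the platform or the algorithms used in the experiment: the hidden transforms, the weight range, the noise level, and the ridge regression on a cubic feature expansion are all stand-ins chosen for the example, and the observation counts stop at 5,000 to keep the run short.

```python
import numpy as np

rng = np.random.default_rng(0)
N_INPUTS = 30

# Hypothetical hidden manipulations; the article does not say which were used.
TRANSFORMS = [lambda x: x, np.square, lambda x: np.sin(2 * x), np.exp]

# Random weights, with half of the inputs given no weight at all.
weights = rng.uniform(0.5, 2.0, N_INPUTS) * rng.choice([-1.0, 1.0], N_INPUTS)
weights[N_INPUTS // 2:] = 0.0


def make_data(n):
    """Return n observations of the 30 raw inputs and the single output."""
    X = rng.uniform(-1.0, 1.0, size=(n, N_INPUTS))
    hidden = np.column_stack(
        [TRANSFORMS[j % len(TRANSFORMS)](X[:, j]) for j in range(N_INPUTS)]
    )
    y = hidden @ weights + rng.normal(0.0, 0.1, n)  # assumed noise level
    return X, y


def expand(X):
    """Intercept plus x, x^2 and x^3 for every input (assumed model form)."""
    return np.hstack([np.ones((len(X), 1)), X, X**2, X**3])


def fit_ridge(X, y, alpha=1.0):
    """Ridge regression solved in closed form; sees only inputs and output."""
    Z = expand(X)
    A = Z.T @ Z + alpha * np.eye(Z.shape[1])
    return np.linalg.solve(A, Z.T @ y)


def r_squared(coef, X, y):
    """Share of output variance explained on held-out data."""
    resid = y - expand(X) @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)


X_test, y_test = make_data(2000)
scores = {}
for n in (10, 100, 1000, 5000):
    X_train, y_train = make_data(n)
    scores[n] = r_squared(fit_ridge(X_train, y_train), X_test, y_test)
    print(f"{n:>5} observations: out-of-sample R^2 = {scores[n]:.3f}")
```

The model never sees the transforms or the weights, only the raw inputs and the output, which mirrors the experiment's design. Accuracy here is measured as out-of-sample R-squared, which is one possible choice of metric.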

The results showing the accuracy of the algorithms by number of observations are illustrated in the graph below.

While certain algorithms struggled with smaller data sets, others learned very quickly, showing decent results with as few as 1,000 observations. Most algorithms were over 90 percent predictive once the number of observations increased to 5,000.

This is a clear example that machine learning tools don’t need a lot of data to predict better than humans.

We also see that certain algorithms are a better fit for certain data than others. Analysts tend to use the same tools for all problems and data sets, but with today’s processing power we can match the right tool to each problem.

Measuring the impact of data size on the accuracy of selected algorithms using Alvarez & Marsal’s Machine Learning Platform

To use a real-world example, we were able to utilize machine learning to produce a model to estimate S&P Credit Ratings using only publicly available data with approximately 40,000 quarterly observations of public company filings.

Our model, SCRE (Sample Credit Rating Estimator), started with those observations and 400 features (data components) per observation to produce a final model that required fewer than 10 features and outperformed any of the commonly used models.
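As a rough illustration of how a large set of candidate features can be narrowed to a handful, the sketch below runs greedy forward selection on synthetic data. It is not the method behind SCRE, which the article does not describe; the data, the three informative columns, and the helper names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_candidates = 500, 40

# Synthetic data: only columns 3, 17 and 29 drive the output (hypothetical).
X = rng.normal(size=(n_obs, n_candidates))
y = 2.0 * X[:, 3] - 1.5 * X[:, 17] + 1.0 * X[:, 29] + rng.normal(0.0, 0.5, n_obs)


def sse(cols):
    """Sum of squared errors of a least-squares fit on the given columns."""
    Z = np.column_stack([np.ones(n_obs), X[:, cols]])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    return float(resid @ resid)


def forward_select(max_features):
    """Add, one at a time, the feature that most reduces the fit error."""
    selected = []
    for _ in range(max_features):
        remaining = [j for j in range(n_candidates) if j not in selected]
        best = min(remaining, key=lambda j: sse(selected + [j]))
        selected.append(best)
    return selected


chosen = forward_select(3)
print("Selected features:", sorted(chosen))
```

In practice the stopping point would be chosen by checking accuracy on held-out data rather than fixed in advance as it is here.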

Misperception: Data scientists are required

With the need for big data removed from the picture, the job description of people who leverage machine learning to make more accurate predictions stops resembling that of a data scientist and starts resembling that of a traditional analyst: someone who can not only work with numbers but also understand the meaning behind those numbers.

A finance function will never get completely sanitary or clean data; it will always be heavily contextual.

No matter what you are doing with the data, you will need people who can understand its context and limits. Those are your analysts and managers, not necessarily a data scientist.

It will be important to find the right partner or staffer who understands how to apply and interpret the machine learning results, but that person isn’t necessarily a data scientist.

While having resources with knowledge of the underlying models and concepts is undeniably important, that knowledge would be useless without team members who understand the underlying data and business.

This understanding will guide how you tune your models and which data you feed them for training.



About the Authors

Chandu Chilakapati, Managing Director, and Devin Rochford, Director, are with Alvarez & Marsal, Valuation Services.

Copyright © 2018 Association for Financial Professionals, Inc. All rights reserved.