Building AI into IO Products

As part of the program, I took part in an expert panel on Machine Learning and Artificial Intelligence in IO products. Thank you to Vivian Wei from AON Global Products who led the panel, and to my fellow panelists: Joseph Abraham, Vice President Assessment Solutions at PSI; Richard Justenhoven, Director Global Products at AON; and Richard Landers, Associate Professor of IO Psychology at Minnesota.

In this blog post, I’ve pulled out some reflections based on our panel discussion. I have focused on what’s useful for IOs who are just starting to think about ML, or who are considering using ML-based products.

Of course, the views in this post are my own, and I’ve left all the good stuff in the panel discussion where it belongs. Make sure you check it out.

Welcome, stranger

As an IO entering the world of ML, I certainly felt like a stranger. The statistical methods I was used to were developed back when computer time was expensive and sample sizes were small. ML is a younger discipline, born into a world with abundant data and the computing power to match. This has freed ML from (some of) the simplifying assumptions that have constrained traditional psychological statistics and allowed it to pursue a different set of goals.

IOs will find a lot in ML surprising or counterintuitive. To an IO, having more predictors in your equation than people in your dataset doesn’t seem right. But this kind of situation is not uncommon in ML, and the field has developed methods for handling problems like this. To an IO, adding bias to a prediction model sounds a little odd, especially when you’re trying to reduce adverse impact. In the ML world, adding bias (a better term might be anchoring) means deliberately constraining your model so that it is not too responsive to noise in the data, including noise associated with protected groups.
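To make these two points concrete, here is a minimal sketch in Python with NumPy. It assumes the "added bias" takes the form of a ridge penalty, which is one common choice and not something named in the panel. The data are simulated, and names such as n_people, n_predictors, and ridge_fit are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated (not real) data: 20 people, 50 predictors, only 3 of which matter.
# Everything is centered at zero, so no intercept is fitted.
n_people, n_predictors = 20, 50
X = rng.normal(size=(n_people, n_predictors))
true_weights = np.zeros(n_predictors)
true_weights[:3] = [1.0, -0.8, 0.5]
y = X @ true_weights + rng.normal(scale=0.5, size=n_people)

# Ordinary least squares needs X'X to be invertible. Its rank equals the rank
# of X, which cannot exceed the number of people, so with 50 predictors there
# is no unique OLS solution.
rank = np.linalg.matrix_rank(X)
print(f"rank of X: {rank} (predictors: {n_predictors})")


def ridge_fit(X, y, penalty):
    """Hypothetical helper: ridge weights from (X'X + penalty * I) w = X'y."""
    n_cols = X.shape[1]
    return np.linalg.solve(X.T @ X + penalty * np.eye(n_cols), X.T @ y)


# A larger penalty anchors the weights closer to zero, so the model follows
# the training data (signal and noise alike) less closely.
weights_light = ridge_fit(X, y, penalty=0.1)
weights_heavy = ridge_fit(X, y, penalty=10.0)

for label, weights in (("light", weights_light), ("heavy", weights_heavy)):
    train_error = np.mean((y - X @ weights) ** 2)
    print(f"{label} penalty: weight norm={np.linalg.norm(weights):.2f}, "
          f"training error={train_error:.3f}")
```

The heavier penalty gives smaller weights and a looser fit to the training sample. That looser fit is the "bias" being added: the model gives up some responsiveness to this particular dataset in exchange for stability.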

I remember you

Despite the differences, IOs will find plenty of old friends in ML. Many ML methods are adapted regression equations. IOs who have used regression, or who remember it from their undergraduate training, will find these regression-based ML methods very familiar.

For example, imagine you have a client who wants to predict success in a leadership training program. Your predictors include about 30 personality traits, job history data, and supervisory evaluations of readiness. A method like elastic net regression will help you to produce a regression equation that narrows in on the most important variables and that is optimized for subsequent prediction on as-yet-unseen data. Helpfully, the regression equation that it produces can be used in the same way as regression equations derived from more traditional methods, making it easy to apply in practice. A method like the Generalized Additive Model (GAM) will produce a more complex model, but one that handles nonlinear relationships more effectively.
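The leadership-training example can be sketched in code. What follows is a small, self-written coordinate-descent version of elastic net in Python with NumPy, run on simulated data. It is meant to show the idea only; in practice you would more likely reach for a library implementation such as scikit-learn's ElasticNet. The sample sizes, penalty settings, and names like elastic_net_fit are assumptions for illustration, and only three of the 30 simulated traits are built to matter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated (not real) data: 200 trainees, 30 trait scores, and a training
# outcome that depends on only the first three traits.
n_trainees, n_traits = 200, 30
traits = rng.normal(size=(n_trainees, n_traits))
outcome = (1.0 * traits[:, 0] - 0.8 * traits[:, 1] + 0.5 * traits[:, 2]
           + rng.normal(size=n_trainees))

# Hold out the last 50 trainees to stand in for as-yet-unseen data.
X_train, X_test = traits[:150], traits[150:]
y_train, y_test = outcome[:150], outcome[150:]

# Standardize using training statistics only, so the holdout stays unseen.
mean, sd = X_train.mean(axis=0), X_train.std(axis=0)
X_train, X_test = (X_train - mean) / sd, (X_test - mean) / sd
intercept = y_train.mean()


def elastic_net_fit(X, y, alpha=0.3, l1_ratio=0.5, n_sweeps=200):
    """Hypothetical helper: elastic net weights by cyclic coordinate descent.

    Minimizes  mean squared error / 2
               + alpha * l1_ratio * sum(|w|)
               + alpha * (1 - l1_ratio) / 2 * sum(w ** 2).
    """
    n_rows, n_cols = X.shape
    weights = np.zeros(n_cols)
    residual = y.astype(float).copy()
    col_scale = (X ** 2).sum(axis=0) / n_rows
    for _ in range(n_sweeps):
        for j in range(n_cols):
            residual += X[:, j] * weights[j]      # take predictor j back out
            rho = X[:, j] @ residual / n_rows
            # The L1 part can set a weight to exactly zero (dropping the
            # predictor); the L2 part shrinks whatever survives.
            shrunk = np.sign(rho) * max(abs(rho) - alpha * l1_ratio, 0.0)
            weights[j] = shrunk / (col_scale[j] + alpha * (1 - l1_ratio))
            residual -= X[:, j] * weights[j]      # put the updated one back in
    return weights


weights = elastic_net_fit(X_train, y_train - intercept)
kept = np.flatnonzero(weights)

# Score the holdout trainees exactly as with a traditional regression equation.
predicted = intercept + X_test @ weights
r_squared = 1 - (np.sum((y_test - predicted) ** 2)
                 / np.sum((y_test - y_test.mean()) ** 2))

print(f"traits kept: {len(kept)} of {n_traits} -> {kept.tolist()}")
print(f"holdout R-squared: {r_squared:.2f}")
```

Because the result is an ordinary set of weights plus an intercept, scoring a new trainee is the familiar weighted sum. The penalty strength would normally be chosen by cross-validation; the fixed values here are placeholders.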

Of course, these methods lie at the surface of a very deep pool, and many IOs will find themselves quickly out of their depth. That’s where the experts from data science, computer science, and statistics will come in handy.

Forming a team

Putting an ML-based tool into production is going to require a multidisciplinary team that goes far beyond IOs. In my own work, the team has included people from data science, software engineering, UI/UX, and graphic design.

When I think about the value of an IO in a team like this, I focus on issues such as:

  • What is the human behavior that generated the data, and would we theoretically expect that behavior to be relevant for what we’re trying to do with the data?
  • How is the data being pre-processed or cleaned before it’s entered into the algorithm, and is this appropriate?
  • What are the legal and ethical implications of what we’re doing with the data?
  • How would our algorithms and processes be explained and presented to users, in a way that facilitates effective decision making?

Stand on the shoulders of giants

Once you start moving into specialized areas, such as speech data, social networks, or image data, you’ll find specialized services and tools that have been built specifically to handle this kind of information. For example, Google’s BERT and BigBird language models are publicly available for natural language processing applications and are used by many AI and ML providers.

On the positive side, third-party tools give AI and ML products access to technology that performs much better than anything that could be developed in-house. These tools allow providers to focus on their specific application, with the publicly available tools doing what they do best.

On the negative side, there is a cottage industry of providers with little technical expertise whose products are just thin wrappers around these tools, and whose understanding of how to use the tools appropriately may be very limited.

Where to from here?

I believe that AI and ML provide IOs with an extended toolbox of methods that can be used effectively to solve workplace problems. Sometimes, the traditional tools we learned as a standard part of our training will be perfectly fine. In situations with strong theory, with only a small number of variables, and with broadly linear relationships, methods like OLS regression will perform well and are easy to apply. Once situations become complex enough, however, using ML becomes almost obligatory.

AI and ML are well represented at SIOP this year, and I think it is almost certain that this interest will only increase over time. I hope you take the time to view our panel discussion and to take advantage of the many other AI and ML sessions at SIOP.