Posted by Josh Millet on Tue, Apr 06, 2010 @ 07:00 PM
A blog post on Huffington Post caught our eye last week (Employment Testing for the Priesthood Can Prevent Child Abuse). The title concisely states the thesis, but aside from a few sensible-sounding comments about pre-employment testing from a headhunter, the rest of the article unfortunately offers no logical support for the overreaching title. It also perpetuates some misconceptions about pre-employment testing, which is what I want to address here.
It should come as no surprise to regular readers of this blog that we're big believers in the utility of pre-employment testing, but it's also important to recognize the limits on the power of such tests. They are not a panacea that can solve all the HR-related problems of any organization, even when used properly.
This blogger's argument unfortunately rests on an inaccurate idea of how pre-employment testing works. To begin with, let's set aside the fact that joining the priesthood is a lengthy process of study and commitment, making comparisons with other "you're hired, you start tomorrow" work settings inappropriate. And let's also set aside the obvious fact that the church's current difficulties would seem, from an organizational perspective, to be much more about review and internal management than about hiring per se.
All that said, if any organization could in fact push the envelope on pre-employment testing, the Church would be one of the few. After all, questions about religious faith, sexual orientation, marital status--which would be illegal to ask at, say, the post office--are perfectly relevant to ask prospective priests. We would imagine that the latitude to pursue mental health testing for prospective priests would be wide (much like for police officers), in contrast to more traditional work settings where such inquiries are usually not legally permitted. We sometimes have people call us up and say that they just had to fire a crazy person, and ask us if we have tests that can help screen out such employees in the future. Tests for psychopathology, however, are generally not permitted in the US in the context of pre-employment screening, because of the Americans with Disabilities Act, which makes it illegal to administer anything that can be construed as a medical exam as a condition for offering employment in most settings.
But even if the church had the latitude to pursue intense clinical testing, what exactly would these tests be looking for? The author of the blog post doesn't offer any help on this question. And even if there were a test that would indicate predilections for certain kinds of deviant behavior, the fact is that no employee assessment tool is a perfect predictor, guaranteed to screen out everyone who is a "bad apple". In fact, employee assessment tools are about increasing your hiring accuracy rate, or decreasing your likelihood of hiring a "bad apple" of whatever kind, not about ensuring it will never happen.
In a future blog we'll discuss in greater detail how pre-employment tests offer information and utility, but not certainty. Many prospective clients we speak with who are just beginning to investigate pre-employment testing expect that tests should somehow provide a perfect rank-ordering of how a group of new hires will perform on the job. (Sadly, the marketing departments at some of our competitors do not try very hard to clear up this misperception.) Such a faith in the power of testing is unrealistic. In fact, the pursuit of perfection in testing can be highly counter-productive: if the threshold for selection is set so high that it absolutely minimizes (but never eliminates) the chance of failure, it will also screen out many employees who would have been excellent performers. These cases are called false negatives, and they represent a costly yet unobserved error in lost opportunity.
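To make the trade-off concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption for illustration: the applicants are simulated, the validity of 0.5 is a made-up value rather than a figure for any real test, and the function names are hypothetical. It counts bad hires and false negatives as the cutoff score rises.

```python
import random


def simulate_applicants(n, validity, seed=0):
    """Return (test_score, true_performance) pairs for n simulated applicants.

    Both values are standard normal, and `validity` is the assumed
    correlation between the test score and later job performance.
    """
    rng = random.Random(seed)
    applicants = []
    for _ in range(n):
        performance = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        score = validity * performance + (1 - validity ** 2) ** 0.5 * noise
        applicants.append((score, performance))
    return applicants


def screening_errors(applicants, cutoff, good_bar=0.0):
    """Count both kinds of error for a given cutoff score.

    bad_hires: selected (score >= cutoff) but performance below good_bar.
    false_negatives: rejected (score < cutoff) but performance at or above good_bar.
    """
    bad_hires = sum(1 for s, p in applicants if s >= cutoff and p < good_bar)
    false_negatives = sum(1 for s, p in applicants if s < cutoff and p >= good_bar)
    return bad_hires, false_negatives


if __name__ == "__main__":
    pool = simulate_applicants(n=10_000, validity=0.5)
    for cutoff in (0.0, 1.0, 2.0):
        bad, missed = screening_errors(pool, cutoff)
        print(f"cutoff={cutoff:.1f}  bad hires={bad}  false negatives={missed}")
```

Raising the cutoff can only shrink the group that gets hired, so the count of bad hires can only fall or stay the same, while the count of rejected good performers can only rise or stay the same.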
It is the horrible nature of the documented and alleged cases of abuse that makes us all want to consider ways to ensure it never happens again. Our point is that faith in some unspecified pre-employment tests over and above what the Church is already doing at the selection level is probably misplaced. Such tests don't exist, would never be failsafe, and would likely exclude large numbers of potentially valuable employees.
Posted by Josh Millet on Wed, Mar 03, 2010 @ 11:53 AM
As the focus of stock market watchers, economists, and politicians turns to Friday's release of the February nonfarm payrolls and unemployment numbers, we thought we'd weigh in again on hiring trends. In the past few days a number of payroll companies have released reports based on their own February numbers. They contain some hopeful signs. On Monday the Intuit Small Business Employment Index, which tracks hiring at companies of fewer than 20 employees, reported another uptick in hiring. The SurePayroll Small Business Scorecard, another measure of small business hiring, reported that in February hiring year-to-date increased 1.9%. And then today the granddaddy of all these private sector reports, the ADP payroll report, noted that non-farm payrolls declined by the least since February 2008. (It's a sign of how bleak the jobs picture is when small job losses are being celebrated as good news.)
It should be noted that the ADP number, unlike the other two, is not focused exclusively on small businesses, as ADP's customers include large corporations as well as smaller ones. It stands to reason that for small and medium-sized businesses the employment trends may be better than in the economy as a whole, as small and medium-sized businesses often recover more quickly and begin hiring sooner coming out of downturns than do larger corporations.
For what it's worth, here are our two cents on the latest hiring trends. Our Hiring Activity Index is based mostly on small and medium-sized businesses, though, unlike Intuit's, it is not confined to very small businesses: 95% of our customers have between 10 and 1000 employees. And what do our numbers show? The HAI numbers for February look pretty good: the 66.8 reading on the index (meaning 66.8% of our customers were hiring) is up a point from 65.7 in January, and up significantly from the 58.8 level of February 2009.
Here's hoping that Friday's government numbers look OK too.
Posted by Josh Millet on Thu, Jan 14, 2010 @ 04:40 PM
As some of you may be aware, Criteria's pre-employment testing software includes some assessments that were originally created by a research partnership between NASA and Harvard. There's an article in NASA's annual Spinoff publication about the collaboration that produced the MRAB, a test that was originally created by Dr Stephen Kosslyn, a Harvard University psychology professor and a member of Criteria's Scientific Advisory Board. If you are interested you can read the article here.
Posted by Eric Loken on Tue, Nov 24, 2009 @ 06:50 PM
I'll admit I'm in a curmudgeonly mood because I feel like I'm wasting time writing about something so obvious. But we've been implicated in a strange argument that erupted in the blogosphere last week, so I'm compelled to write a few words to clear our name. As we mentioned in our last post, a few days ago Steven Pinker reviewed Malcolm Gladwell's latest book and criticized him rather harshly for several shortcomings. Gladwell appears to have made things worse for himself in a letter to the editor of the NYT by defending a manifestly weak claim from one of his essays – the claim that NFL quarterback performance is unrelated to the order they were drafted out of college. The reason we're implicated is that Pinker identified an earlier blog post of ours as one of three sources he used to challenge Gladwell (yay us!). But Gladwell either misrepresented or misunderstood our post in his response, and admonishes Pinker by saying "we should agree that our differences owe less to what can be found in the scientific literature than they do to what can be found on Google."
Well, here's what you can find on Google. Follow this link to request the data for NFL quarterbacks drafted between 1980 and 2006. Paste the data into a spreadsheet and make a simple graph of touchdowns thrown (as of 2008) versus order of selection in the draft to create the picture below.

The graph includes 373 QBs with a correlation of -.40. If you take the log of TDs the correlation increases to -.57. But correlation can be misleading here because the data are heavily skewed and stacked at zero. Instead, just focus on the perfectly transparent visual display. What is the probability that a quarterback throws 50 or more touchdowns if picked early in the draft? Is the probability lower for QBs picked later in the draft? If you were going to predict performance, would you want to know the draft position of the QB before you made your prediction? The answer to this last question is an unequivocal yes.
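For readers who want to run this kind of check themselves, here is a rough Python sketch. The eight rows of data are invented purely for illustration and are not the NFL data set; the function names are hypothetical, and taking log(1 + TDs) is just one common workaround for the zeros (the log of zero is undefined), assumed here rather than taken from the analysis above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def share_reaching(picks, tds, first_pick, last_pick, threshold=50):
    """Share of QBs drafted in [first_pick, last_pick] with at least `threshold` TDs."""
    group = [td for pick, td in zip(picks, tds) if first_pick <= pick <= last_pick]
    return sum(1 for td in group if td >= threshold) / len(group)


# Invented toy data, NOT the real draft data: draft position and career TDs.
picks = [1, 5, 20, 50, 100, 150, 200, 250]
tds = [200, 120, 60, 0, 10, 0, 0, 0]

raw_r = pearson(picks, tds)
log_r = pearson(picks, [math.log1p(td) for td in tds])  # log(1 + TDs) handles zeros
early = share_reaching(picks, tds, 1, 32)
late = share_reaching(picks, tds, 33, 260)
print(f"raw r={raw_r:.2f}  log r={log_r:.2f}  early share={early:.2f}  late share={late:.2f}")
```

The share-by-draft-range function is the code version of the question posed above: what fraction of early picks reach 50 touchdowns, and is that fraction lower for later picks?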
So how do you make this plain-as-day-association disappear? You can eliminate some of the data by declaring it off limits. For example, an economist named David Berri has recently published an article claiming that the correct way to look at the above data is by filtering some observations and making some transformations. (I am working from his blog post here as the journal article is not yet available at my library.) On his blog, Berri says he restricts the analysis to QBs who have played more than 500 downs, or for 5 years. He also looks at per-play statistics, like touchdowns per game, to counter what he considers an opportunity bias. Because early draft picks are given more opportunity to play, there is a natural correlation between draft order and playing time which might inflate the career statistics like total touchdowns.
Fair enough, but you have to be careful about writing off one source of covariance as a bias in need of correction. Longevity in the NFL is a function of opportunity and success. To attribute all the covariance between playing time and draft order to some sort of opportunity bias is to dramatically redefine the logic of the question. Does anyone believe that NFL owners and coaches are just "socially promoting" their early draft picks to run up these gaudy production stats, while equally able QBs with the misfortune of being selected later in the draft sit idly by and watch? Yes, there are Tom Bradys sitting on the bench... but very, very few quarterbacks picked 199th in the draft are remotely as good as Brady proved to be, whereas several QBs picked in the early rounds are as good. You can't look at the above graph and not agree that there is some association between draft order and probability of being a high producer. It doesn't make sense to say that graph is an illusion due to uncorrected factors.
Even when I do take a few chops at the above data, I can't eliminate the strong correlation. The correlation is still there when I do TDs per game. It's there when I restrict the data for at least 100 pass attempts. The correlation is even bigger when I do TD per game for QBs picked in the first 100 positions of the draft. I can't get the association to go away, and I'm going to let these graphs stand as a challenge to Gladwell's statement that no prediction is possible regarding the future success of NFL quarterbacks. The consensus of the predictive information reflected in draft order out of college unambiguously does predict future performance.
This Thanksgiving kids everywhere will choose sides for pick-up games of football. Oh how silly are these kids who make alternating choices to fill up two teams! Just let Sally pick the first 10 players and let Johnny pick the next 10 and let the games begin. After all, where no prediction is possible, everything else is just prejudice, right?
Posted by Josh Millet on Thu, Nov 19, 2009 @ 02:50 PM
We're going to file this one in the "we told you so" file! The other day the famous Harvard psychologist Steven Pinker reviewed Malcolm Gladwell's newest essay collection in the New York Times, and it wasn't pretty. Pinker savages Gladwell, concluding:
"Unfortunately he wildly overstates his empirical case. It is simply not true that a quarterback’s rank in the draft is uncorrelated with his success in the pros, that cognitive skills don’t predict a teacher’s effectiveness, that intelligence scores are poorly related to job performance or (the major claim in “Outliers”) that above a minimum I.Q. of 120, higher intelligence does not bring greater intellectual achievements. The reasoning in “Outliers,” which consists of cherry-picked anecdotes, post-hoc sophistry and false dichotomies, had me gnawing on my Kindle."
Pinker's conclusions echo the arguments we made in this humble blog about a year ago. Read them again here and here. Gladwell is a great story-teller, and a gifted writer--but he should not be considered an authoritative voice on how we conduct social science or public policy, or for that matter employee testing.
Posted by Eric Loken on Tue, Oct 13, 2009 @ 05:37 PM

Last Thursday morning there was a bump in the futures markets as the latest weekly numbers for initial unemployment insurance claims were released. The numbers were apparently better than expected, and the futures markets reacted positively. Jobs data are, now more than ever, an important economic indicator driving investor sentiment.
In previous blogs here and here we've discussed whether the utilization rates of our software tools for employers to test prospective employees could serve as an advance indicator of the national jobs picture. A usage metric we track called the Hiring Activity Index represents the percentage of our clients who are actively testing prospective employees in a given month. Earlier this year we focused on the upticks we saw in the HAI in February and March, which we thought were harbingers of better (or at least "less bad") jobs data to come.
Turns out we were right about that, and we thought we would share what the trends look like with 21 months' worth of data. We've plotted the initial unemployment claims data (weekly numbers, smoothed over a month) with the monthly Criteria Corp Hiring Activity Index. The trends look similar, and indeed they correlate very well. The correlation is -.79, showing excellent correspondence between the rise and fall of the jobs data and the HAI. Furthermore, when predicting the jobless claims on the basis of the concurrent HAI and the HAI from the two previous months, using a lagged regression model, the multiple R is .93 (adjusted R² = .85).
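For the curious, the set-up behind a lagged regression like this can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the short series in the usage lines are placeholders rather than our data, and the adjustment shown is the standard adjusted R² formula, 1 - (1 - R²)(n - 1)/(n - p - 1), which is not necessarily the exact routine a given statistics package applies.

```python
def lagged_rows(target, predictor, lags=2):
    """Pair each target value with the concurrent predictor and its `lags` prior values.

    The first `lags` months are dropped because they lack a full history.
    """
    rows = []
    for t in range(lags, len(target)):
        history = [predictor[t - k] for k in range(lags + 1)]  # [x_t, x_(t-1), ...]
        rows.append((target[t], history))
    return rows


def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjusted R^2: penalizes R^2 for the number of predictors used."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Placeholder numbers only, to show the shape of the design rows.
claims = [10, 20, 30, 40]
hai = [1, 2, 3, 4]
print(lagged_rows(claims, hai))
```

If all 21 months are used, two lags leave 19 rows for fitting three predictors, which is one reason a short time series deserves caution.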
The point is that real time utilization data for an employee assessment service with a modest client base of small and medium sized businesses can provide very good prediction of national trends. We see this as similar to reports earlier this year that Google searches for flu related topics mapped on closely to CDC data for the spread of influenza. That was also an example where a real-time indirect indicator predicted definitive data that would be available later.
We mostly see this finding as validation of our earlier interpretation of the HAI. There are a number of caveats, including that the time series are short, and that we have used the non-seasonally adjusted UI numbers. Presumably our client base follows a seasonality similar to that of the initial claims data, and that inflates the correlation. Furthermore, the time series for the HAI also represents the growth and development of our company (our client base grows by a factor of 5 across the time span), so the data change meaning somewhat across time.
We don't expect to move the financial markets with these data. But we do take them as an indication that our services and our clients are moving with the times.
Posted by Josh Millet on Thu, Sep 17, 2009 @ 07:50 PM
Back in March we noted in a post that garnered some attention from other bloggers that our metric for measuring the level of hiring activity among our customer base (called, unimaginatively, the Hiring Activity Index) had edged upwards in the spring, after reaching its nadir in December and January. Although the national employment picture remains ugly, in June we saw the HAI recover to its highest level since before the stock market crash of the fall. In June the HAI was 65.9, and in July it was 66.3, levels not seen since the summer of '08. August saw it dip a little to 62.9. The uptick in hiring activity among our customers, we hope, is another sign of stabilization in the employment picture, at least as far as small and medium-sized businesses are concerned.
Among professional economists there is a virtual consensus that unemployment will continue to climb well into 2010, and peak at a rate well past 10% some time next year. My opinion, for what it's worth, is that the unemployment picture, while still bleak, will not get much worse before it begins to stabilize and, eventually, recover. In fact, if the HAI is any indication, small and medium-sized businesses are already beginning to pick up the pace of hiring--let's hope we see more companies following this example soon.
Posted by Josh Millet on Thu, Jul 30, 2009 @ 12:30 PM
Yesterday's NY Times reports on a dispute concerning the posting of the Rorschach Inkblots on Wikipedia. With all 10 inkblots posted, along with common answers, many psychologists argue the test has been compromised. Free information advocates argue that the test no longer had copyright protection, and therefore posting it is perfectly acceptable. It's fitting that a debate about the grand-daddy of all projective tests (tests in which ambiguous stimuli provoke reactions that reveal aspects of a person's mental state) should elicit this kind of polarized reaction.
I think most practicing professional psychologists would feel some degree of concern, knowing full well that they would not want their measures to be gamed, or to become so culturally exposed as to become irrelevant and invalid. However the advocates for posting the inkblots sound indifferent — one is quoted as laughing at the idea that a German publisher might be considering legal action, while the person who posted the Rorschach says he doesn't care what experts think, he wants to be shown the actual damage caused by his actions.
At one point the article says that those opposed to the posting of the test feel that it is akin to posting a future version of the SAT. This is an interesting point. Suppose we were discussing someone attempting to post next October's SAT on Wikipedia. There would be no more laughing at the prospect of Educational Testing Service pressing legal action; Wikipedia would be doing everything in its power to absolve itself. Presumably it would be an absolutely clear-cut situation. However, there would be very little empirical evidence of the degree to which the SAT was compromised: the "no brainer" status would all be about the legal copyright. It seems like the free information advocates are failing to recognize that the hypothetical case of the compromised SAT and the actual case of the Rorschach are both worrisome to testing professionals because of the potential to undermine the integrity of the tests.
We at Criteria are also concerned about test security, as is the testing industry as a whole (Click here for a Wall Street Journal article related to test security). For some of our tests, like our neurocognitive tests, exposure isn't really an issue. These are performance tests that show real time processing and responses. Other tests, such as our employment personality tests, are more like the Rorschach in that they do not have absolute correct and incorrect answers, although it is true that certain response profiles are considered more or less appropriate for certain jobs.
For special cases there is an interesting compromise between free information advocates and testing professionals. In some cases, it might be possible for the full range of testing materials to be freely available. Consider what ETS has done with the writing prompts for the GRE taken annually by half a million applicants to graduate schools. The prompts for which the students must write a short essay are posted online here. There are hundreds of prompts, although on test day an applicant will only encounter one. If they have prepared answers to each and every one — well more power to them; they have shown a remarkable degree of perseverance which ought to count for something. If they have memorized answers from a website, their work will easily be recognized as generic (think about that the next time you play the lottery numbers from your fortune cookie, and you have to share the Powerball prize with 110 other people).
We at Criteria think that open source testing might also have a place in undergraduate education. As of now, multiple choice tests in college classes are difficult to keep secure. Too many sororities and fraternities have files with pirated versions of tests. One solution to this is to build out a universe of potential test items. Students could have access to the open source version with all the possible testing materials, and come the time for the final exam, the instructor could use a subset of the items for the test. We're working on ways to make this possible; we'll keep you posted. We're just saying that if there were 2000 Rorschach inkblots, having them up on Wikipedia wouldn't be such a compromising event. And that guy from Saskatoon probably wouldn't have bothered posting them in the first place, making it even less of an issue.
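As a rough illustration of the item-universe idea, here is a small Python sketch. The bank size of 2000 echoes the inkblot example above; the exam length, the number of memorized items, and the function names are all hypothetical.

```python
import random


def draw_exam(item_bank, exam_length, seed=None):
    """Pick `exam_length` distinct items at random from the public item bank."""
    rng = random.Random(seed)
    return rng.sample(item_bank, exam_length)


def expected_known_items(bank_size, exam_length, memorized):
    """Expected number of exam items a student has already memorized.

    Each exam slot is equally likely to hold any bank item, so the expected
    overlap is exam_length * memorized / bank_size.
    """
    return exam_length * memorized / bank_size


bank = list(range(2000))  # item IDs standing in for real questions
exam = draw_exam(bank, exam_length=50, seed=1)
print(len(exam), expected_known_items(bank_size=2000, exam_length=50, memorized=100))
```

The larger the bank relative to the exam, the less a pirated file of memorized answers is worth, and a student who really does learn the whole bank has arguably learned the material.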
Posted by Josh Millet on Fri, Jun 12, 2009 @ 05:37 PM
Well, we've now put the Microsoft skills tests live on HireSelect (read about it in the press release.) The integration of these tests took longer than we expected, and right now they are in beta release, because we are still gathering customer feedback and making sure they are bug-free.
One of the great things about delivering software over the web is that when we make changes we get immediate feedback from our customers. If a new feature isn't intuitive, we hear about it right away and can immediately improve it; if there's a bug that we didn't catch in QA, we learn about it right away once a customer encounters it. While constantly upgrading our software does introduce the possibility of minor technical glitches, this is far outweighed, we believe, by the fact that our customers don't have to wait six months for the next release to see new features they want, as is the case with companies that are still distributing their software in shrink-wrapped CDs.
Posted by Josh Millet on Tue, Apr 14, 2009 @ 07:02 PM
In my last post I described our customer service test and the kinds of personality traits that it measures. People who have high levels of cooperativeness, patience, and personal diplomacy tend to be well suited for customer service roles. The use of personality tests is even more widespread, however, in helping select salespeople, because there's a lot of research that shows that people with certain personality traits tend to be successful in sales roles across a wide range of industries. Most personality tests that are designed to help select salespeople look for outgoing, fairly aggressive people who tend to be competitive and highly motivated. This general profile of a stereotypical sales professional is probably not all that surprising. But what kind of research underlies this type of "sales profiling"?
The sales aptitude test featured in HireSelect is called the Sales Achievement Predictor. The professors who created the test validated it in part by comparing the 15 personality traits it measures to job performance data for various samples of salespeople. The highest performing salespeople tended to be competitive, outgoing, highly motivated, assertive individuals. For example, in a sample of 156 real estate sales professionals whose test scores were compared to their job performance, the highest correlations were observed in the following traits: Achievement, Motivation, Initiative, Assertiveness, Competitiveness, Goal-Orientation, and Extraversion (the correlations were .53, .43, .42, .38, .38, and .36, respectively). Interestingly, low or even negative correlations were observed for Cooperativeness and Patience, suggesting that when it comes to sales, being too patient or too cooperative can sometimes be a liability rather than an asset. We've conducted numerous case studies with our customers that essentially confirm these findings: the most successful salespeople tend to be competitive, assertive, and relatively impatient individuals--in short, nearly the opposite of the type of people who are best suited to customer service. To be sure, the type of personality that is best suited for a particular sales role can vary from one organization to another, and from one industry to another, depending on the nature of the sales process and the sales culture in a given environment. But the basic building blocks of what personality traits you should look for in selecting salespeople are remarkably consistent across all industries.
Click here to read more about our sales aptitude test.