Scorecard ratings overview – Greenhouse Support

Permissions: —

Product tier: Available for all subscription tiers

If you've ever seen or filled out a scorecard in Greenhouse Recruiting, you likely noticed our unique measurement scale, which uses a combination of colors and emojis to assess candidates.

We created our scorecard rating setup using science-based survey methods that account for the various ways an interviewer could interpret a scorecard. Our carefully crafted scorecard rating system provides your organization with the best possible platform to reduce bias and individual differences.

In this article, we explain why we excluded numbers and text/worded scales, and why we ultimately chose color and symbolic scales for Greenhouse Recruiting scorecards. We also share some interesting facts about how even the smallest tweaks to a measurement scale can change the way people respond to surveys.

Psychometrics

Our choice of measurement scale for scorecards was far from random. In fact, our choice was heavily informed by research in the field of psychometrics. Psychometrics is the study of quantitative measurement practices in the social sciences.

A psychometrician generally researches best practices in evaluating the quality of metrics (e.g., survey items), measurement scales (e.g., Strongly Disagree to Strongly Agree), and other related factors that affect how accurately you measure the behavior or process you’re trying to capture.¹

Numerical scales' impact on question responses

Have you noticed the wide variety of numerical scales used in survey questions to represent the same set of response choices? The following 5-point Likert scale is a common set of choices available as a range of answers to a survey question:

Strongly Disagree	Disagree	Neither Agree nor Disagree	Agree	Strongly Agree
1	2	3	4	5

And so is this variation of the same scale:

Strongly Disagree	Disagree	Neither Agree nor Disagree	Agree	Strongly Agree
-2	-1	0	1	2
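
As a small illustration of why the two tables above are "the same scale," the sketch below codes the five labels both ways and checks that the codings differ only by a constant offset. The names `UNIPOLAR`, `BIPOLAR`, and `coding_offsets` are hypothetical and are not part of Greenhouse Recruiting.

```python
# The five response labels from the tables above.
LABELS = [
    "Strongly Disagree",
    "Disagree",
    "Neither Agree nor Disagree",
    "Agree",
    "Strongly Agree",
]

# Two numeric codings of the same labels (illustrative names).
UNIPOLAR = {label: i + 1 for i, label in enumerate(LABELS)}  # 1 to 5
BIPOLAR = {label: i - 2 for i, label in enumerate(LABELS)}   # -2 to 2


def coding_offsets():
    """Return the difference between the two codings for each label."""
    return {label: UNIPOLAR[label] - BIPOLAR[label] for label in LABELS}


# Every label differs by the same constant, so the codings carry
# identical information; only the numbers shown to the respondent change.
print(set(coding_offsets().values()))
```

Because the codings are arithmetically equivalent, any change in how people answer has to come from how they perceive the numbers, not from the labels themselves.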

Several studies on the use of Likert scales have demonstrated that participants' responses will vary based on the numbers shown (or not shown).² Survey participants tend to evaluate and assign different weights to numbers, which may introduce bias in a survey's results, especially in small sample sizes.

Traditional academic research tends to collect data from larger samples (for example, 200–300 respondents). This helps balance out the variation in responses that comes from people’s differing interpretations of a scale. However, these sample sizes are generally much larger than those you will see in the hiring process.

Example: As a Hiring Manager, you'll likely receive candidate ratings from 5 to 10 people and even fewer responses on each of the individual attributes to be evaluated.
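
To make the sample-size point concrete, here is a minimal simulation sketch. It assumes a purely hypothetical model in which each rater's score is a "true" score of 3 on a 1 to 5 scale, shifted by -1, 0, or +1 to stand in for individual interpretation of the scale. The function name `panel_mean_spread` and all parameters are assumptions made for illustration, not Greenhouse Recruiting behavior or data.

```python
import random
import statistics


def panel_mean_spread(panel_size, trials=1000, true_score=3, seed=0):
    """Estimate how much a panel's average rating varies between panels.

    Each simulated rater reports true_score shifted by -1, 0, or +1
    (a stand-in for individual interpretation), clipped to the 1-5 range.
    Returns the standard deviation of the panel averages across trials.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    means = []
    for _ in range(trials):
        ratings = [
            min(5, max(1, true_score + rng.choice([-1, 0, 1])))
            for _ in range(panel_size)
        ]
        means.append(statistics.mean(ratings))
    return statistics.stdev(means)


small_panel = panel_mean_spread(5)    # a typical interview panel
large_panel = panel_mean_spread(250)  # a typical academic sample

print(f"Spread of the average with 5 raters:   {small_panel:.3f}")
print(f"Spread of the average with 250 raters: {large_panel:.3f}")
```

Under these assumptions, the average from a 5-person panel varies far more from one panel to the next than the average from 250 respondents, which is why individual interpretation matters so much at hiring-process scale.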

This means it's important for your organization to be aware of when and where biases may emerge. Any step Greenhouse Recruiting can take to reduce error and bias improves the quality of your candidate evaluations.

Cultural and regional response differences to worded measurement scales

Did you know that residents of the United States tend to respond far more positively to most survey questions than residents of other parts of the world?³ These differences in survey responses do not reflect any cultural differences in optimism or agreement. In fact, people in the United States tend to be less positive and trusting of institutions as a whole.⁴

In the U.S., respondents will likely respond to a question as Strongly Agree unless they have a clear reason to disagree with the statement. In other parts of the world, such as in mainland China,³ respondents will respond more neutrally unless they have a clear justification for strongly agreeing with the statement.

Excluding a worded scale helps your Hiring Managers be confident that each candidate is evaluated consistently across interviewers and is not subject to bias from different individuals’ interpretations of Strongly Agree or Strongly Disagree.

Colors vs. words

People tend to respond more consistently and powerfully to colors than they do to words. Humans process colors and symbols in a different region of the brain than words or numbers.⁵ We also more readily associate colors with a negative or positive valuation based on properties such as opacity.⁶

Greenhouse Recruiting uses shades of red, yellow and green for our scorecard rating icons to help provide you with more consistent scorecard results across individuals and teams.

Sources

1. Psychometric Society. What is psychometrics? https://www.psychometricsociety.org/content/what-psychometrics
2. Weijters, B., Cabooter, E., & Schillewaert, N. (2010). The effect of rating scale format on response styles: The number of response categories and response category labels. https://doi.org/10.1016/j.ijresmar.2010.02.004
3. Lee, J. W., Jones, P. S., Mineyama, Y., & Zhang, X. E. (2002). Cultural differences in responses to a Likert scale. Research in Nursing & Health, 25(4), 295–306. https://doi.org/10.1002/nur.10041
4. Twenge, J. M., Campbell, W. K., & Carter, N. T. (2014). Declines in trust in others and confidence in institutions among American adults and late adolescents, 1972–2012. Psychological Science, 25(10), 1914–1923. https://doi.org/10.1177/0956797614545133
5. Peterson, B. S., et al. (1999). An fMRI study of Stroop word-color interference: Evidence for cingulate subregions subserving multiple distributed attentional systems. Biological Psychiatry, 45(10), 1237–1258. https://doi.org/10.1016/S0006-3223(99)00056-6
6. Piotrowski, C., & Armstrong, T. (2012). Color Red: Implications for applied psychology and marketing research. Psychology and Education: An Interdisciplinary Journal, 49, 55–57.