
Can Your Personas be Biased?

Detecting Demographic Bias in Automatically Generated Personas

In research led by Joni Salminen, we investigate the existence of demographic bias in automatically generated personas by producing personas from YouTube Analytics data.

Despite the intended objectivity of the methodology, we find elements of bias in the data-driven personas. Bias is highest under an exact-match comparison and decreases when comparing at the age or gender level alone. It also decreases as the number of generated personas increases.

For example, generating a smaller number of personas resulted in the underrepresentation of female personas. This suggests that a larger set of personas gives a more balanced representation of the user population, while a smaller set amplifies bias.

Researchers and practitioners developing data-driven personas should consider the possibility of algorithmic bias, even unintentional, in their personas by comparing the personas against the underlying raw data.
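One way to sketch such a comparison is to measure how far the demographic shares in the persona set drift from the shares in the raw data, for instance with total variation distance. The snippet below is a minimal illustration, not the method from the paper; the data, attribute names, and the choice of distance metric are all assumptions for the example.

```python
from collections import Counter

def demographic_bias(raw_records, personas, attribute):
    """Compare the distribution of a demographic attribute (e.g. 'gender')
    between raw user records and a generated persona set.

    Returns the total variation distance in [0, 1]:
    0 means identical shares, 1 means completely disjoint."""
    raw_counts = Counter(r[attribute] for r in raw_records)
    persona_counts = Counter(p[attribute] for p in personas)
    raw_total = sum(raw_counts.values())
    persona_total = sum(persona_counts.values())
    categories = set(raw_counts) | set(persona_counts)
    return 0.5 * sum(
        abs(raw_counts[c] / raw_total - persona_counts[c] / persona_total)
        for c in categories
    )

# Hypothetical data: a 60/40 female/male audience, but only 1 of 5
# generated personas is female -- an underrepresentation of women.
raw = [{"gender": "female"}] * 60 + [{"gender": "male"}] * 40
personas = [{"gender": "female"}] + [{"gender": "male"}] * 4

print(round(demographic_bias(raw, personas, "gender"), 2))  # → 0.4
```

The same function can be applied per attribute (gender, age bracket, country) to see at which level of comparison the persona set diverges most from the underlying audience.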

Read full research

Salminen, J., Jung, S.-G., and Jansen, B. J. (2019). Detecting Demographic Bias in Automatically Generated Personas. ACM CHI Conference on Human Factors in Computing Systems (CHI 2019) (Extended Abstract), Glasgow, United Kingdom, 4-9 May. Paper No. LBW0122.

By Jim Jansen

Dr. Jansen is a Principal Scientist in the social computing group of the Qatar Computing Research Institute, a professor with the College of Science and Engineering, Hamad Bin Khalifa University, and an adjunct professor with the College of Information Sciences and Technology at The Pennsylvania State University. He is a graduate of West Point and holds a Ph.D. in computer science from Texas A&M University, along with master's degrees from Texas A&M (computer science) and Troy State (international relations). Dr. Jansen served in the U.S. Army as an enlisted infantry soldier and a commissioned communications officer.