Imaginary People Representing Real Numbers: Generating Data-Driven Personas Using the APG System

The APG system for automatic persona creation from real data is both a working system and an ongoing research project. With APG, we have developed a methodology that automates the creation of imaginary people, known as personas, by processing complex behavioral and demographic data of social media audiences.

To show the value of APG, we use data from a popular social media account. The data contains more than 30 million interactions by viewers from 198 countries engaging with more than 4,200 online videos produced by a global media corporation.

We demonstrate that the APG methodology has several novel accomplishments, including:

  • (a) identifying distinct user behavioral segments based on the user content consumption patterns;
  • (b) identifying impactful demographic groupings; and
  • (c) creating rich persona descriptions by automatically adding pertinent attributes, such as names, photos, and personal characteristics.
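
The three steps above can be illustrated with a minimal sketch. This abstract does not specify the algorithms APG uses, so everything here is an assumption made for illustration: the data, the `NAMES` lookup, the `build_personas` function, and the deliberately simple rule of grouping audiences by their dominant topic are all hypothetical stand-ins, not the APG implementation.

```python
from collections import defaultdict

# Hypothetical aggregated data: interaction counts per demographic group and topic.
# A demographic group is (gender, age band, country). All values are made up.
INTERACTIONS = {
    ("female", "25-34", "US"): {"news": 120, "sports": 30, "cooking": 50},
    ("male", "18-24", "IN"): {"news": 40, "sports": 200, "cooking": 10},
    ("male", "35-44", "GB"): {"news": 300, "sports": 60, "cooking": 20},
    ("female", "18-24", "BR"): {"news": 15, "sports": 25, "cooking": 90},
}

# Hypothetical lookup used to give each persona a human-readable name.
NAMES = {
    ("female", "US"): "Emily",
    ("male", "IN"): "Arjun",
    ("male", "GB"): "Oliver",
    ("female", "BR"): "Ana",
}


def build_personas(interactions, names):
    """Return one persona per behavioral segment (illustrative stand-in only)."""
    # (a) Behavioral segments: group demographic groups by their dominant topic.
    segments = defaultdict(list)
    for group, counts in interactions.items():
        dominant_topic = max(counts, key=counts.get)
        segments[dominant_topic].append(group)

    personas = []
    for topic, groups in sorted(segments.items()):
        # (b) Impactful demographic: the group with the most interactions on the topic.
        top_group = max(groups, key=lambda g: interactions[g][topic])
        gender, age_band, country = top_group
        # (c) Enrichment: attach a name and descriptive attributes.
        personas.append({
            "name": names.get((gender, country), "Unnamed"),
            "gender": gender,
            "age_band": age_band,
            "country": country,
            "top_topic": topic,
            "interactions": interactions[top_group][topic],
        })
    return personas


if __name__ == "__main__":
    for persona in build_personas(INTERACTIONS, NAMES):
        print(persona)
```

Only aggregated counts enter the sketch, never individual user records, which mirrors the privacy-preserving aggregated data the approach relies on. A real system would likely replace the dominant-topic rule with a proper segmentation method.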

We have validated the approach for automatically creating personas, which we call data-driven personas, by implementing the methodology in an actual working persona system (APG).

We then evaluate the data-driven persona creation methodology using quantitative techniques. We examine the accuracy of predicting the content preferences of personas, the stability of the personas over time, and the generalizability of the method by applying it to two other datasets.

Research findings show the data-driven persona approach can develop rich personas representing the behavior and demographics of real audiences, and it does this using privacy-preserving aggregated online social media data from major online platforms.

Results of the data-driven persona creation approach, and its instantiation in the APG system, have implications for media companies and other organizations distributing content via online platforms. There are at least nine benefits of data-driven personas.

By Jim Jansen

Dr. Jansen is a Principal Scientist in the social computing group of the Qatar Computing Research Institute, a professor with the College of Science and Engineering, Hamad bin Khalifa University, and an adjunct professor with the College of Information Sciences and Technology at The Pennsylvania State University. He is a graduate of West Point and has a Ph.D. in computer science from Texas A&M University, along with master's degrees from Texas A&M (computer science) and Troy State (international relations). Dr. Jansen served in the U.S. Army as an enlisted Infantry soldier and a commissioned communications officer.