Farrukh, Muhammed Umar, Wainwright, Richard, Crockett, Keeley ORCID: https://orcid.org/0000-0003-1941-6201, McLean, David ORCID: https://orcid.org/0000-0001-7894-5176 and Dagnall, Neil ORCID: https://orcid.org/0000-0003-0657-7604 (2023) Building Actionable Personas Using Machine Learning Techniques. In: 2022 IEEE Symposium Series on Computational Intelligence (IEEE SSCI), 04 December 2022 - 08 December 2022, Singapore.
Accepted Version
Available under License In Copyright.
Abstract
Personas are quantifiable and describable ways of grouping people based on their behaviours. They are valuable to businesses because they enable a better understanding of the customer base. The creation of personas from survey data requires establishing the client requirements, building a quantifiable personality scale, developing personality questions for a survey, and human subjective analysis. In this work, we have utilised clustering to automate the persona development process. We have developed a real-world survey for children (from 17 countries) which included 25 personality-based questions (based on the OCEAN model), 22 questions that captured purchase behaviour, and other general features from the children’s landscape. There were 63,969 completed questionnaires with a high proportion of categorical features, which were preprocessed to allow different segmentation methods to be tested. Preliminary results with simple K-means and a Euclidean distance function demonstrated that this combination was inappropriate for the survey data set. A novel distance function for K-means clustering has been developed, which handles a mixture of feature types and allows the importance of each feature to be varied, using a linearly weighted distance method. The function also incorporates the haversine distance function to provide a distance between two locations, enabling potential cultural differences to be examined. We have also implemented a Gaussian Mixture Model on the same feature set to compare the results and to examine the limitations of Gaussian models. Our novel approach generated clusters based on a combination of features including personality, consumer behaviour and location, which has demonstrated key cultural differences across the globe. Results from our novel approach show that location distance is one of the key features when constructing personas, as culture (location) has a significant effect on the way children answer survey questions.
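The abstract gives no formula for the distance function, so the following is only a minimal sketch of what a linearly weighted distance over mixed feature types with a haversine term might look like. Everything specific in it is an assumption for illustration and not taken from the paper: the names `haversine_km` and `mixed_distance`, the respondent dictionary layout, numeric features pre-scaled to [0, 1], simple mismatch counting for categorical features, and normalising the location term by half the Earth's circumference.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Largest possible great-circle distance; used to scale location into [0, 1].
MAX_DISTANCE_KM = math.pi * EARTH_RADIUS_KM


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def mixed_distance(a, b, weights):
    """Hypothetical linearly weighted distance between two survey respondents.

    Each respondent is a dict with:
      "numeric":     list of values assumed pre-scaled to [0, 1]
      "categorical": list of category labels
      "location":    (latitude, longitude) in degrees
    `weights` maps each of those three keys to a non-negative importance.
    """
    # Mean absolute difference over numeric features.
    numeric = sum(abs(x - y) for x, y in zip(a["numeric"], b["numeric"])) / len(a["numeric"])
    # Fraction of categorical features that differ (simple mismatch).
    categorical = sum(x != y for x, y in zip(a["categorical"], b["categorical"])) / len(a["categorical"])
    # Haversine distance scaled into [0, 1].
    location = haversine_km(*a["location"], *b["location"]) / MAX_DISTANCE_KM

    total_weight = sum(weights.values())
    return (
        weights["numeric"] * numeric
        + weights["categorical"] * categorical
        + weights["location"] * location
    ) / total_weight
```

Raising `weights["location"]` relative to the other two makes geographically distant respondents less likely to share a cluster, which is one way the varying feature importance described above could be explored. Note that plugging a custom distance into K-means also requires a matching centroid update (for example, a mode for categorical features), which this sketch does not cover.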