Annelies Verhaeghe, Managing Partner and Chief Platform Officer, Human8
How AI is transforming the insights industry
There’s no way around synthetic data
The rise of generative AI has brought with it the rise of synthetic data. This data is artificially generated through machine learning techniques rather than observed and collected from real-world sources. To create marketing personas, for example, researchers train AI models on extensive internet data or existing research data. Advances in generative AI tools are making synthetic data so accessible that it will become an integral part of every research project. We’re not far from a future where every quantitative research piece will include a percentage of synthetic data. This allows researchers to compare analysis results between human and synthetically generated data, learn from the differences and improve their methodologies.
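As a minimal sketch of what comparing human and synthetic results could look like, the snippet below contrasts the answer distributions of two small samples on a 5-point scale. The function names, the sample answers and the choice of total variation distance as the comparison measure are illustrative assumptions, not a method described in this article.

```python
from collections import Counter


def answer_shares(responses, options):
    """Return the share of respondents choosing each answer option."""
    if not responses:
        raise ValueError("responses must not be empty")
    counts = Counter(responses)
    total = len(responses)
    return {opt: counts.get(opt, 0) / total for opt in options}


def total_variation_distance(shares_a, shares_b):
    """Half the summed absolute share differences (0 = identical, 1 = no overlap)."""
    return 0.5 * sum(abs(shares_a[opt] - shares_b[opt]) for opt in shares_a)


# Hypothetical answers to one 5-point rating question (illustrative data only).
OPTIONS = [1, 2, 3, 4, 5]
human = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]
synthetic = [4, 4, 4, 4, 5, 4, 3, 4, 4, 5]

human_shares = answer_shares(human, OPTIONS)
synthetic_shares = answer_shares(synthetic, OPTIONS)
gap = total_variation_distance(human_shares, synthetic_shares)

print(f"Distribution gap between human and synthetic answers: {gap:.2f}")
```

A gap close to zero would suggest the synthetic respondents answer much like the human ones on this question. A larger gap signals a difference worth investigating before the two sources are blended.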
Working with synthetic data comes with many advantages. Firstly, you can generate results at a low cost, and at a very high speed. You have no panel costs and with the right instructions you can get out of the system what you want, often within a matter of seconds. Secondly, when conducting research around a niche profile, the arduous task of recruiting hard-to-find people becomes a thing of the past. Instead, you can easily incorporate synthetic profiles to supplement your sample, simplifying the process significantly. And thirdly, unlike research involving real consumers, you don’t have to worry about privacy issues or Intellectual Property Protection (IPP).
Of course, the quality of synthetic data depends on the quality of the underlying model and datasets, and additional verification steps are necessary to ensure reliability and validity. But relying on synthetic data carries other risks as well. The first and most essential one is: can you trust the data? AI's sources for generating output are unclear, raising questions about data ownership and whose opinions are represented. Second, if data ownership is unclear, who is ultimately responsible for decisions made on the basis of AI-generated input? Think about deciding to launch a product based on AI input, only for it to fail in the market. Who is really responsible for this? And third, models are limited by their input, so they can miss out on the here and now or on current consumer trends. Think, for example, of ChatGPT, which was long restricted to data from before September 2021. It all comes down to having access to high-quality training data that represents real people.
We need guidance on how to deal with synthetic data within our industry. I expect it’s only a matter of time before clear guidelines and norms are developed on, for example, the percentage of synthetic data we can use in a sample. After all, relying 100% on artificially generated data doesn’t give you the full picture. We put this to the test with qualitative data from an online insight community, where we asked participants about their perception of the brand Range Rover. We did the same with synthetic data and used an AI system to visualize the output. In both cases the system represented the brand as a lion, but with a very different brand positioning. One could question whether the synthetic data shows Land Rover’s positioning as perceived by consumers or the brand’s desired positioning.
This example shows that primary research remains essential to complement synthetic data and guide business decisions. AI models are only as smart as the input they receive. Since many will use algorithms trained on the same data, high-quality primary data will become the true differentiator.
Boosting insight activation
Today, many insight professionals are bombarded with an array of free AI tools and are convinced that these will increase the level of DIY and commoditization in our industry. I firmly believe, however, that the true value of AI models lies in their role as activation tools, bringing large groups of internal stakeholders closer to the people that matter to your brand. Just think about how the models can be trained on your own primary data to craft personas that are tailored to your brand and category. Instead of reading lengthy reports, your stakeholders can quickly chat with one of the personas to immerse themselves in its world and get answers to their most burning questions.
Training AI models on owned data
Training AI models on brands’ proprietary, primary data will become the secret weapon of the future. These models can be fine-tuned to address the specific challenges and opportunities of an organization, leading to much more accurate and tailored insights. At Human8, we are also experimenting with training our very own AI research assistant on insight community data for different use cases. Think of identifying the main themes in conversations and unearthing insights, but also detecting outliers. Naturally, we are highly committed to data privacy and confidentiality, so the content remains within the confines of a client project.
However, with great power comes great responsibility. Training these models entails a significant responsibility for ethical decision-making. Without vigilant management, these systems may inadvertently perpetuate societal biases. So having diverse and representative datasets, along with continuous monitoring, is key to mitigating potential biases. It’s important to be able to explain all the decisions you make and to be transparent about them. For instance, you could intentionally opt to give more weight to the voices of minority groups when training your AI model, as long as you have valid reasons and disclose this approach. This transparency is pivotal in safeguarding the integrity of your business decisions.
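To make the idea of deliberately giving more weight to minority voices concrete, here is a minimal sketch that derives per-group weights from the shares observed in a dataset and the shares you want to reach. The group labels, the target shares and the function name are hypothetical. How such weights are then passed to a model depends on the training tool, which is not covered here.

```python
from collections import Counter


def group_weights(groups, target_shares):
    """Weight per group so that weighted group shares match the target shares."""
    counts = Counter(groups)
    total = len(groups)
    weights = {}
    for group, target in target_shares.items():
        if counts[group] == 0:
            raise ValueError(f"no records for group {group!r}")
        observed = counts[group] / total
        # A weight above 1 amplifies a group, a weight below 1 dampens it.
        weights[group] = target / observed
    return weights


# Hypothetical dataset: 8 records from a majority group, 2 from a minority group.
groups = ["majority"] * 8 + ["minority"] * 2

# Assumed target: the minority group should count for 40% of the training signal.
targets = {"majority": 0.6, "minority": 0.4}

weights = group_weights(groups, targets)
record_weights = [weights[g] for g in groups]

# Reporting the weights is what makes the choice transparent and explainable.
for group, weight in weights.items():
    print(f"{group}: weight {weight:.2f}")
```

Keeping the weights in a plain, reportable form like this is one way to support the disclosure described above: anyone reviewing the model can see which groups were amplified and by how much.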
AI is rightfully generating a lot of excitement in the research industry, but the question you should ask yourself is not “How can I use this technology?” but “What need does it solve?” Using tech for the sake of tech is not a business model. Solving your clients’ needs with a relevant proposition in a smart way is.
About the author
Annelies is Chief Platform Officer at Human8. In her role, she oversees all digital platforms used to create deep and meaningful connections with people. She also drives projects that aim to support people through the use of technology. As a pioneer in new technologies, Annelies has driven innovation in research. In the past, she was instrumental in developing new solutions for ...