I attended the Publisher Research Council (PRC) presentation of their latest research endeavour: they are fervent proselytisers of data fusion.
Providing some context, the PRC Research Consultant, Peter Langschmidt, recalled the 2013 future-proofing study conducted by Jos Kuper on behalf of SAARF. This pointed to international best practice being a “hub and spoke” model, which entailed using an establishment survey as the hub to fuse currency and brand surveys together. This model was followed in 2016 when the PRC and BRC (Broadcast Research Council) collaborated to launch the now-discontinued Establishment Survey.
Fortunately, the science of fusion has evolved: Langschmidt cheerily explained that “now any survey can be fused with any other survey.” One should now think of a DNA helix or chain as the model, which suggests the possibilities are endless.
He enthusiastically extolled data fusion as the methodology for the NEW Millennium. Fusion is the process of integrating multiple data sources by using common linking variables (fusion hooks) to match two or more datasets at the respondent level and create one unified database. Statistical analytics and modelling are used to create this single dataset. It is cost-effective and convenient, ultimately delivering a single-source survey without any of the drawbacks inherent in the old AMPS approach, such as respondent fatigue.
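To make the idea of matching on fusion hooks concrete, here is a minimal sketch in Python. It is a toy nearest-neighbour match, not the method the PRC or Nielsen actually uses; the hook variables, field names and respondent records are all hypothetical.

```python
# Toy illustration of respondent-level fusion via nearest-neighbour matching.
# Hook variables, field names and records are hypothetical, not PRC/Nielsen data.

HOOKS = ("age_band", "sem_group", "metro")  # common linking variables ("fusion hooks")


def hook_distance(a, b):
    """Count how many fusion hooks differ between two respondents."""
    return sum(a[h] != b[h] for h in HOOKS)


def fuse(recipients, donors, donated_fields):
    """Copy each recipient's closest donor's answers onto the recipient record."""
    fused = []
    for recipient in recipients:
        # min() keeps the first donor on ties; real fusions use richer rules.
        donor = min(donors, key=lambda d: hook_distance(recipient, d))
        record = dict(recipient)
        for field in donated_fields:
            record[field] = donor[field]
        fused.append(record)
    return fused


readership = [
    {"id": "R1", "age_band": "25-34", "sem_group": 5, "metro": True, "reads_daily_paper": True},
    {"id": "R2", "age_band": "50+", "sem_group": 1, "metro": False, "reads_daily_paper": False},
]
purchases = [
    {"id": "P1", "age_band": "50+", "sem_group": 1, "metro": False, "buys_luxury_biscuits": False},
    {"id": "P2", "age_band": "25-34", "sem_group": 5, "metro": True, "buys_luxury_biscuits": True},
]

fused = fuse(readership, purchases, ["buys_luxury_biscuits"])
```

Each fused record now carries both the readership answer and a purchase answer borrowed from the most similar respondent in the other survey, which is what allows the two surveys to be analysed as if they were one.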
An elegant solution
One might wonder why, if fusion is such an elegant solution, it was not used sooner; Langschmidt explained that it has been made possible by the exponential rise in computing power. (To illustrate this point, he showed how the transistor count on computer chips has grown a staggering 4 347 826 times between 1971 and 2018.)
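For a sense of what that growth factor implies, the short calculation below converts it into a number of doublings. It assumes steady exponential growth over the whole period, which is a simplification of my own rather than something stated in the presentation.

```python
import math

growth_factor = 4_347_826   # growth in transistor count quoted in the presentation
years = 2018 - 1971         # 47 years

# How many times the count must have doubled to grow by this factor.
doublings = math.log2(growth_factor)

# Average time per doubling, assuming steady exponential growth.
years_per_doubling = years / doublings

print(f"{doublings:.1f} doublings, one every {years_per_doubling:.1f} years")
```

That works out to roughly 22 doublings, or one about every two years.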
In 2018, the PRC pioneered the process locally with Nielsen, combining PAMS (Publisher Audience Measure Survey) and the Nielsen CPS (Consumer Panel). This meant that media decision-makers had access to the publishers’ readership currency alongside automotive, banking, cellular and retail data, as well as data on over 3 000 FMCG brands.
Langschmidt spent some time outlining the advantages of the Nielsen CPS data: it has a solid sample of 4 000 households, demographically and geographically representative of all South African households, and it tracks actual, audited household purchases rather than claimed ones, eliminating the vagaries of consumer memory.
He also went through the painstaking process Nielsen employs to gather the data. The national average number of FMCG brands bought in the past 12 months was 136. As one might anticipate, SEM Supergroup 1 bought 14% fewer brands in that time, while the comfortable SEM Supergroup 5 was able to acquire 23% more.
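As a rough guide to what those percentages mean in brand counts, the arithmetic can be sketched as follows. It assumes both percentages are expressed relative to the national average of 136, which the presentation implied but did not spell out; the derived counts are my own calculation, not figures shown on the day.

```python
national_average = 136  # FMCG brands bought in the past 12 months

# Assumption: both percentages are relative to the national average.
supergroup_1 = national_average * (1 - 0.14)  # 14% fewer brands
supergroup_5 = national_average * (1 + 0.23)  # 23% more brands

print(round(supergroup_1), round(supergroup_5))
```

On that reading, SEM Supergroup 1 households bought about 117 brands and SEM Supergroup 5 households about 167.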
He also used some rather pretty charts to illustrate that the SEM Supergroup 1 was likely to buy a larger range of brands in staple categories such as laundry soap bars, rice and hand and body cream than SEM Supergroup 5.
By contrast, the affluent SEM Supergroup 5 purchased a greater range of brands in the more indulgent categories such as luxury biscuits, coffee and boxed assortments. This makes perfect sense: the lower-end consumer will be deal shopping, whilst the upper-end consumer has the ability to indulge in variety.
Langschmidt also demonstrated the SEM profile by CPS: reassuringly, low-priced soups and maas skewed to the lower end, while aftershave, pet snacks and treats, and shaving preparations skewed to the upper end.
He then turned to the PRC’s latest endeavour, the third fusion in the chain, which adds in the Nielsen Global People Products and Platforms (PPP) study to provide a granular view of people’s media consumption across various platforms. It covers TV, print and digital consumption by devices/media across dayparts. This digital study was carried out by Nielsen in 15 markets, with local fieldwork conducted from September to November 2018. The sample comprised 1 104 past-month online users, aged 15 years and older. Naturally, the methodology was an online survey, which could be completed via any connected device. Key demographic quotas were set to ensure that the sample was representative of the past-month online population. The data was weighted to IHS population estimates to deliver a fully representative sample.
Of course, it is crucial to remember that the study focuses only on the online population, and that the sample was largely metropolitan/large urban. To put this into perspective, it is worth referring to the Jan-Jun 2019 Establishment Survey, which shows that 66% of our population accessed the internet in the past month; this rises to 77% in metropolitan areas over the same period. Media decision-makers, who tend to be early adopters of tech and committed digerati, need to remember that a significant section of the local population is excluded from this study.
The Digital Consumer Survey 2019 Report underscored this: “the media landscape in South Africa may have changed inexorably, but to sketch the digital landscape, to the exclusion of other traditional media platforms, is premature.”
Langschmidt outlined the wealth of data available in the fused study. In the finance category, spanning 11 financial institutions, the questions cover where accounts or cards are held, the consumer’s main bank and personal usage of credit cards. In the automotive category, 56 brands are covered, and questions address the number of vehicles in the household, personal ownership or usage of a vehicle, and whether the vehicle was obtained new or second-hand.
The cellular information covers the type of mobile phone and network provider, whilst the retail section covers 12 food and grocery outlets, 25 furniture and appliance stores and 22 fashion stores, as well as the use of store cards. Over 200 FMCG categories with more than 3 000 brands are covered by heavy, medium and light consumption. Media consumption covers standard live TV viewing, time-shifted TV viewing and OTT/streaming (Netflix/YouTube/Showmax). Newspaper and magazine readership are covered across paper and digital formats. For radio, both live and online streaming listening are covered, while the internet usage section spans social media, devices and apps, as well as tech in the home.
Some examples of the data were provided: standard live broadcast on a TV set remains the most favoured way to view TV. This holds true even amongst Makro customers, who average 2.5 ways of watching TV. They are, however, particularly partial to watching TV or movie clips or entire shows streamed on sites like YouTube and Vimeo, and over 40% use internet TV subscription services such as Netflix, Showmax and Amazon Prime. YouTube is the dominant online source for TV/movies, with Netflix and Showmax falling into second and third places respectively.
A “day in the life” graph was used to show the activity of online adults by platform and by daypart; unsurprisingly, this painted a compelling picture for reading. This pitch was supported by a graph showing that most people watch TV and use the internet at the same time, mostly using their smartphones. A slide from Kantar TNS’s 2017 Media Engagement Study showed, by contrast, that readers’ attention is focussed.
The PRC has more developments in the pipeline: in April, the new 15 000-sample PAMS 2019 survey will be released. Pilot fusions with Brand Mapp and Narratiive digital data have been tested. The intention is to add the Narratiive data every three months via an adaptation of fusion called stacking, to provide information on the titles’ constantly growing digital audiences. It is also possible for agencies and marketers to fuse their own bespoke surveys with the PRC studies. CEO Josephine Buys indicated that a ballpark figure for such an exercise would be between R250 000 and R360 000.
Certainly, the PRC has been eminently forward-thinking and there is a remarkable amount of data which can be put to good use by the industry. However, there is one area that does give me cause for concern: weighting. When analysing data from the categories/brands in PAMS, any weight can be used; when working with categories/brands in PAMS BRANDS, household weights must be used; and when using some of the questions in PPP, you must use a matched universe, i.e. code past month internet access. The PRC needs to ensure that all potential data users, not all of whom will have attended the presentations, are educated about these parameters.
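To show why these weighting parameters matter, here is a small Python sketch. The records, field names and weights are entirely invented for illustration and do not reflect the structure of the PRC datasets or of any analysis software; the point is only that the choice of weight and universe changes the answer.

```python
# Hypothetical fused records; field names and weights are invented for illustration.
respondents = [
    {"online_past_month": True, "person_weight": 1000, "household_weight": 400,
     "streams_tv": True, "buys_maas": False},
    {"online_past_month": True, "person_weight": 1000, "household_weight": 300,
     "streams_tv": False, "buys_maas": True},
    {"online_past_month": False, "person_weight": 2000, "household_weight": 300,
     "streams_tv": False, "buys_maas": True},
]


def weighted_share(records, flag, weight_field, universe=None):
    """Weighted share of records with `flag` set, optionally within a filtered universe."""
    base = [r for r in records if universe is None or r[universe]]
    total = sum(r[weight_field] for r in base)
    hits = sum(r[weight_field] for r in base if r[flag])
    return hits / total


# Household purchase data: use household weights.
maas_households = weighted_share(respondents, "buys_maas", "household_weight")

# Online-only question analysed on everyone (wrong) versus on the matched universe.
streaming_unmatched = weighted_share(respondents, "streams_tv", "person_weight")
streaming_matched = weighted_share(
    respondents, "streams_tv", "person_weight", universe="online_past_month"
)
```

In this toy data, the streaming figure is 25% when calculated on all respondents but 50% within the past-month online universe, a gap large enough to mislead anyone who skips the filter.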