AI drives modern marketing, optimising content, targeting audiences and shaping creative decisions. But what happens when flawed inputs lead it astray?
In early 2025, Paramount faced public ridicule after releasing a film promo voiced entirely by AI. Viewers labelled the narration “shockingly bad AI slop”, and critics slammed the studio for prioritising cost savings over quality and authenticity. Ouch.
Around the same time, Activision tested AI-generated imagery for its Guitar Hero mobile ads, only to delete the posts after fans derided them as lifeless and unconvincing. These missteps show how low-effort automation can degrade brand perception, and they underline the need for human oversight at every stage of content creation.
When promotional content feels more bot than brand, it becomes clear: without data integrity, even the most powerful AI can go off-course.
Metrics matter: Don’t let bad KPIs hijack your ads
Every time a metric is defined, it shapes an AI’s objectives. Optimising for the wrong key performance indicators (KPIs) – like clicks over conversions, or view count over brand impact – often results in algorithms chasing surface-level wins while missing long-term value.
A short-sighted KPI can skew campaigns and erode customer trust.
Mapping each KPI to a real business goal can help: align click rates with purchase intent, for example, or tie engagement scores to brand sentiment. When a major retailer optimised only for add-to-cart rates, cart abandonment soared.
A refined KPI set – blending add-to-cart, checkout completion and post-purchase NPS – cut wastage by 23% in the next quarter.
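To make that blend concrete, here is a minimal sketch in Python of how such a composite score might be computed; the weights and the NPS rescaling are illustrative assumptions, not the retailer’s actual formula.

```python
# Minimal sketch of a blended KPI. The weights and the NPS rescaling are
# illustrative assumptions, not the retailer's actual formula.
def blended_kpi(add_to_cart_rate: float,
                checkout_completion_rate: float,
                post_purchase_nps: float,
                weights: tuple[float, float, float] = (0.3, 0.5, 0.2)) -> float:
    """Combine funnel metrics into one score; NPS (-100..100) rescales to 0..1."""
    nps_scaled = (post_purchase_nps + 100) / 200
    w_cart, w_checkout, w_nps = weights
    return (w_cart * add_to_cart_rate
            + w_checkout * checkout_completion_rate
            + w_nps * nps_scaled)

print(round(blended_kpi(0.18, 0.62, 42.0), 3))  # 0.506
```

Weighting checkout completion above raw add-to-cart keeps the optimiser pointed at finished purchases rather than abandoned carts.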
Another helpful practice is standardising metric definitions. Recording each formula, data source and purpose in a shared repository under version control prevents confusion when teams scale campaigns across channels. If “engagement rate” suddenly spikes, you can quickly trace whether it means raw clicks, time on page or social shares. Speedy root-cause analysis keeps campaigns on track.
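As a sketch of what one registry entry might look like (the field names and the example definition are hypothetical, not a standard schema):

```python
# Sketch of a shared metric registry kept under version control: each KPI
# records its formula, data source and purpose, so a sudden spike can be
# traced to a concrete definition. Fields and values are illustrative.
METRICS = {
    "engagement_rate": {
        "formula": "unique_interactions / impressions",
        "data_source": "web_analytics.events",
        "purpose": "compare creative variants within a channel",
        "version": "2.1",
        "owner": "growth-analytics",
    },
}

def describe(name: str) -> str:
    m = METRICS[name]
    return f"{name} v{m['version']}: {m['formula']} (source: {m['data_source']})"

print(describe("engagement_rate"))
```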
Since business goals and consumer behaviours evolve, regular reviews of KPIs remain important. TikTok’s early ad clients, for instance, initially focused on view counts and ended up optimising for loopable two-second clips that hurt brand recall. A shift to dwell-time metrics led to an 18% improvement in recall within three months.
Trust AI, verify with humans
Automation brings speed and scale, but it doesn’t replace context. While AI might catch schema mismatches (issues in the expected structure or format of data) or statistical outliers, it often misses tonal missteps or cultural nuance. That’s where human oversight remains essential.
Combining automated tools with scheduled human review – like assigning data stewards to sign off on outputs – is one way to build accountability into the process. Maintain change logs for every edit, noting who approved what and why.
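A lightweight change log along those lines could be as simple as the sketch below; the fields, names and example entry are hypothetical.

```python
# Sketch of an approval trail for AI outputs: every edit records who
# signed off and why. All names and fields here are hypothetical.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ChangeLogEntry:
    artefact: str      # e.g. an ad caption or a model config
    change: str        # what was edited
    rationale: str     # why it was edited
    approved_by: str   # the data steward who signed off
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

change_log: list[ChangeLogEntry] = []
change_log.append(ChangeLogEntry(
    artefact="summer_campaign_caption_v3",
    change="Replaced AI-generated tagline",
    rationale="Tone clashed with brand voice in the UK market",
    approved_by="j.smith (data steward)",
))
print(change_log[0].timestamp)
```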
It also helps to roll out updates in phases, testing new models on small, representative audiences first. That painful response to the Paramount Pictures trailer could have been avoided with a pilot review round.
Cross-functional collaboration can also strengthen oversight. Marketers and data scientists each offer valuable perspectives, and regular forums or feedback loops between teams can help spot gaps or potential issues early. When a global CPG brand added cross-team forums to discuss botched AI captions, approval time dropped by 40% while user complaints halved.
Stick to the benchmark
Without a stable reference point, it’s difficult to tell whether a model is improving or deteriorating. Benchmarks provide that anchor. They catch drift, bias and degradation before flawed outputs reach your audience.
Establishing gold-standard datasets that reflect a mix of use cases and edge cases – and then locking them down – can serve as a baseline. Pairing internal benchmarks with third-party fairness scans introduces a layer of independent oversight.
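One simple way to “lock down” such a dataset is to record its checksum at curation time and verify it before every benchmark run – a minimal sketch, with a hypothetical file path:

```python
# Sketch of locking down a gold-standard dataset: record its SHA-256 once
# at curation time, then refuse to run benchmarks if the file has changed.
# The file path below is hypothetical.
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_benchmark(path: Path, expected_hash: str) -> None:
    actual = sha256_of(path)
    if actual != expected_hash:
        raise RuntimeError(
            f"Benchmark dataset {path} has changed: {actual} != {expected_hash}")

# Recorded once, when the dataset is curated and locked:
# GOLDEN_HASH = sha256_of(Path("benchmarks/golden_v1.csv"))
# Checked before every evaluation run:
# verify_benchmark(Path("benchmarks/golden_v1.csv"), GOLDEN_HASH)
```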
When xAI’s Grok chatbot began spouting conspiracy theories about “white genocide” in South Africa, the team traced the cause back to a noisy training subset. A robust bias scan could have caught the red flags before public launch.
Running old and new models in parallel on small traffic samples is another useful technique. If your fresh AI delivers 5% more engagement but also doubles hallucination rates (how often the model makes things up or returns false results), you’ve found an imbalance that you’ll want to correct before exposing your full audience to an error-prone model.
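A simplified sketch of that champion/challenger gate, using the engagement and hallucination trade-off just described; the metric names and the promotion threshold are illustrative assumptions:

```python
# Sketch of a champion/challenger gate: promote the new model only if
# engagement improves without hallucinations ballooning. The metric
# names and the 1.2x hallucination ceiling are illustrative assumptions.
def should_promote(old: dict, new: dict,
                   max_hallucination_ratio: float = 1.2) -> bool:
    engagement_lift = new["engagement"] / old["engagement"]
    hallucination_ratio = new["hallucination_rate"] / old["hallucination_rate"]
    return engagement_lift > 1.0 and hallucination_ratio <= max_hallucination_ratio

# The trade-off described above: +5% engagement, but double the hallucinations.
old_model = {"engagement": 0.040, "hallucination_rate": 0.010}
new_model = {"engagement": 0.042, "hallucination_rate": 0.020}
print(should_promote(old_model, new_model))  # False - hold the rollout
```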
Publishing benchmark results internally promotes transparency and continuous improvement. In one financial services firm, sharing monthly benchmark scores across data, marketing and compliance teams cut unplanned drift incidents (unexpected changes in the model’s behaviour or performance over time) from 12 to two in six months.
Three moves for reliable AI
Standardise metrics
- Define every KPI clearly. Record formulas, data sources and intended use.
- Enforce naming conventions and version control.
Build hybrid validation
- Pair automated checks, such as schema validation and outlier detection (spotting data points that don’t fit the usual pattern), with human oversight; a minimal sketch follows this list.
- Log every change and its rationale. Use canary testing (rolling out to a small group first to catch problems early) for new models.
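A minimal sketch of those two automated checks – schema validation plus a z-score outlier flag – with hypothetical column names and an assumed three-sigma threshold:

```python
# Sketch of the two automated checks above: a schema check against expected
# columns/types and a z-score outlier flag. Column names and the 3-sigma
# threshold are illustrative assumptions.
import statistics

EXPECTED_SCHEMA = {"campaign_id": str, "clicks": int, "spend": float}

def validate_schema(row: dict) -> list[str]:
    errors = []
    for col, col_type in EXPECTED_SCHEMA.items():
        if col not in row:
            errors.append(f"missing column: {col}")
        elif not isinstance(row[col], col_type):
            errors.append(f"{col}: expected {col_type.__name__}")
    return errors

def flag_outliers(values: list[float], threshold: float = 3.0) -> list[int]:
    mean, stdev = statistics.mean(values), statistics.stdev(values)
    return [i for i, v in enumerate(values)
            if stdev and abs(v - mean) / stdev > threshold]

print(validate_schema({"campaign_id": "sc-01", "clicks": "12", "spend": 9.5}))
# ['clicks: expected int']
print(flag_outliers([10, 11, 9, 10, 12, 10, 11, 9, 10, 12, 10, 11, 95]))
# [12] - the 95 is flagged
```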
Lock down benchmarks
- Curate untouchable datasets. Test every release against them.
- Share benchmark results across teams to align strategy and performance.
Getting the basics right still matters. When we use clear metrics, human checks and solid benchmarks, we help AI work as it should. That means fewer nasty surprises, more accurate results and campaigns we can actually trust.
Data professionals play a bigger role than many realise. It’s not just about models and dashboards – it’s about shaping how AI shows up in the world. The way we define success, the data we feed in, and the standards we set all influence how these systems behave.
At the end of the day, it comes down to choices: will your AI promote clarity or confusion, progress or noise? With the right guardrails in place, it becomes a powerful ally – and your best tool for building smarter campaigns that actually deliver.
When we make our metrics matter, vet inputs rigorously and return to the basics of credible analysis, we uphold the true promise of commercial AI: delivering meaningful, trustworthy insights that drive genuine progress.
Ryan Campher is VML South Africa’s data lead.