Why Removing Ethnicity from Consumer Data Sets Could Do More Harm Than Good (Guest Blog)


We know that consumer data sets are a valuable tool for businesses to understand, serve and target consumers. 

When accurate, consumer data can reveal behaviors, messaging preferences and purchasing habits, and it can improve targeting precision and ROI. But recent U.S. state privacy laws, while well-meaning, are starting to prohibit the collection of “sensitive” information, specifically details like race and ethnicity, that can actually drive a more positive and relevant experience for every consumer.

Scott McKinley, CEO, Truthset


Some argue that removing ethnicity data is necessary to protect individual privacy and prevent discrimination. That’s a valid (and warranted) concern, but there’s another side of the coin: race and ethnicity data is essential to accurately represent diverse audiences.

For brands, agencies and publishers, race and ethnicity data is often used to reach specific groups with marketing messages and products tailored to their interests and needs. Without this information, certain communities may be inadvertently excluded or overlooked. That could mean both lost revenue for businesses and an overall lack of representation for marginalized audience groups.

For instance, a beauty brand may want to reach a variety of consumers (men and women, people with darker and lighter skin tones, and so on) with specific messaging. Without ethnicity data, it could end up delivering the same message to a Hispanic man as to an Asian-American woman. Or perhaps a hair product is designed specifically for Black consumers’ needs.

There’s a concentrated audience to speak to there, and without ethnicity data, intentionally helpful targeting and customization turn into unnecessary guesswork.

This is without even digging into the existing issues around race and ethnicity data, which only get worse when that data is simply cut out of data sets. At Truthset, we found that just 74% of ethnicity data for African-American consumers was accurate in Q4 2022. For Asian-Americans, 78% was accurate, and for Hispanic Americans, 89%. Those numbers get almost twice as bad once source data passes through the onboarding and modeling processes common in programmatic advertising.

This tells us there’s already a problem serving these audiences. So why are we willing to make it worse?

Having access to race and ethnicity data can actually allow advertisers, agencies and publishers to address systemic marketing discrimination, identifying where blind spots and inequality have existed in the past so they can actively avoid them going forward.

We can put safeguards in place to prevent data from being used improperly. Beyond that, you could contend there’s a greater chance of addressing societal concerns around race (and other sensitive information) by quantifying how well these populations are represented.

This is becoming increasingly crucial across many demographic and psychographic lines as the monoculture largely breaks down. What counts as culturally relevant on a large scale may now speak to just a quarter of the U.S. population, and perhaps only a small subsection of a brand or publisher’s intended audience. Access to accurate ethnicity data helps fight tired ideas, opening doors for brands and consumers that may not have had a chance to thrive in the past.

There is more work to be done to ensure that ethnicity data is collected and used accurately and ethically. But the prospect of that hard work shouldn’t deter the advertising industry from doing it.

Working with state and federal regulators, the entire industry can plot a way forward that benefits all parties involved and ensures that future privacy initiatives both keep consumers safe and serve them better experiences.

Scott McKinley

Scott McKinley is founder and CEO of Truthset.