Centering racial equity in data collection

This section will help you understand how to center racial equity when collecting data and describes best practices and resources that can be used.

Intro summary

An approach that centers racial equity is essential in all parts of the data cycle, starting with data collection. Collecting and disaggregating race and ethnicity data is crucial for identifying and understanding health inequities. It is equally important to understand how race and ethnicity data have been used to reinforce systemic racism, and to take steps to ensure that racial equity is the focus throughout the data collection process. This section introduces the definitions of race and ethnicity and explains how these categories can be used to better understand risk and protective factors across communities.

Defining race and ethnicity 

The Massachusetts Department of Public Health’s Race, Ethnicity, and Language Standards¹ include the following definitions of race and ethnicity:

Race is defined as any one of the groups that humans are often divided into based on physical traits regarded as common among people of shared ancestry, or as a group of people sharing a common cultural, geographical, linguistic, or religious origin or background². Race is a socially constructed concept and has no genetic basis. It has been described as “only a rough proxy for socioeconomic status, culture, and genes, but it precisely captures the social classification of people in a race-conscious society such as the United States. … That is, the variable ‘race’ is not a biological construct that reflects innate differences, but a social construct that precisely captures the impacts of racism.”³

Ethnicity refers to a person’s heritage, culture, ancestry, or sometimes the country where their family was born. Ethnicity may also be a measure of shared cultural practices and beliefs and may suggest the need for improved sensitivity to linguistic needs.

Overall, race and ethnicity are not, in and of themselves, risk factors that explain disparities in health outcomes; they are only markers used to better understand risk factors related to inequities¹.

Exploring your data 

It’s important to look at how your race and ethnicity data are collected, be aware of how terminology may have shifted over time, and identify opportunities for improving the completeness and accuracy of the data.

Here are some important considerations:

  • What are your data collection methods (e.g., survey, in-person, over the phone)?
  • Do the terminology and definitions of race and ethnicity vary between data sources or over time?
  • Are participants able to identify as more than one race or ethnicity? To respect self-identification, participants should be able to select all of the race and ethnicity groups with which they identify. 

It is important to explicitly document these data considerations because they provide context when it’s time to analyze your data. For example, some programs may have initially collected race data as a single-select option and later allowed respondents to select all races with which they self-identify. How might this impact your analysis and the results?

When exploring your race and ethnicity data, you may realize there is a large amount of missing data. It's important to consider the reason why data are missing. Frequently, data are missing because systems do not support the collection of race and ethnicity data, even when there is a regulatory requirement. This may be due to the lack of understanding of the importance of collecting this information, a lack of data collection standards, discomfort with talking about or acknowledging race and ethnicity, and assumptions that asking about race and ethnicity makes people of color uncomfortable. Consider how these missing data could potentially distort your analysis and/or interpretation. If more than 20% of your data are missing, then you can use Plan-Do-Study-Act cycles or other quality improvement methods to address missing data. Visit Data to Action for additional quality improvement ideas.   
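The 20% guideline above can be checked with a few lines of code. The following is a minimal sketch, not a prescribed method. It assumes that a value counts as missing when it is empty (`None`) or blank; the function names `percent_missing` and `needs_quality_improvement` are hypothetical, and your program may need to treat codes such as "unknown" differently.

```python
def percent_missing(values):
    """Return the percentage (0-100) of entries that are None or blank.

    Assumption: only None and empty/whitespace-only values count as missing.
    """
    values = list(values)
    if not values:
        return 0.0
    missing = sum(1 for v in values if v is None or str(v).strip() == "")
    return 100.0 * missing / len(values)


def needs_quality_improvement(values, threshold=20.0):
    """Flag a field when more than `threshold` percent of entries are missing."""
    return percent_missing(values) > threshold


# Hypothetical example: placeholder labels stand in for recorded responses.
race_field = ["Category A", None, "", "Category B", "  "]
print(percent_missing(race_field))
print(needs_quality_improvement(race_field))
```

A result above the threshold is a prompt to investigate why the data are missing, not a finding in itself.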

Best practices for collecting data about race and ethnicity   

Data are vulnerable to inaccuracies, incompleteness, and biases like selection bias (i.e., the individuals included in the data are not random or do not represent the intended population) and confirmation bias (i.e., data are used to confirm pre-existing beliefs). Decontextualized data showing differences by race and ethnicity have been used to perpetuate harmful narratives about communities of color and reinforce systemic racism and oppression. To center racial equity in data collection, it is necessary to consider the risks associated with the process and take steps to ensure your program’s data collection reduces bias and contextualizes data appropriately⁴.

Below are some best practices when centering racial equity in data collection (adapted from the AISP Centering Racial Equity Throughout Data Integration Toolkit). 

  • Include agency staff and community partners in defining which data to collect.
  • Collaborate to develop a shared data collection agenda that is connected to practice, policy, and research.
  • Work with staff to support equity-oriented data collection practices. 
  • Include qualitative stories to contextualize quantitative data. 
  • Explore reasons why people “opt out” of providing data for surveys and other data collection efforts and use their feedback to minimize harm in future data collection processes.

While these best practices provide recommendations for centering racial equity in data collection generally, collecting data specifically about race and ethnicity often requires additional considerations. The following framework (Table 3.1), adapted from Chapter 3: Collect Diversity Data of the DPH Making CLAS Happen Manual⁵, provides guidelines for developing a standard process when collecting race and ethnicity data.

Table 3.1. Process for collecting race and ethnicity data

When should you ask for data? Ask for information on race and ethnicity early in the encounter (e.g., during enrollment, admission, or registration).
Who will collect the information? While this depends on the size and needs of your program, it is helpful to have the staff who are first to interact with clients (e.g., enrollment, admissions, or reception staff) collect the data.
How will you communicate about the data collection process? Some individuals or communities might be hesitant to provide information about race and ethnicity, so always be sensitive to concerns and explain why you are collecting data and how they will be used. Develop a script to address concerns and explain confidentiality.
How will you collect information? Self-reported race and ethnicity data are the most consistent and valid source of information because they reflect how individuals describe themselves. Individuals should be able to select more than one category and have the option not to answer if they choose.
What information will you collect? Be consistent in the types of data you collect to make it easier to compare and analyze data in the future. For specific recommendations about what information to collect, reference the resources below.
What tools and systems will you use to collect and store data? Use standard collection instruments and store data in a standard electronic format. To ensure the accuracy of data collection, it can help to use instruments (forms, surveys, etc.) that conform to updated guidelines.
How will you train staff? Because collecting information about race and ethnicity can be sensitive, it is important to provide ongoing data training and evaluation to staff. A standard process and practice can help staff collect information more comfortably and accurately.
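One way to reflect the "select more than one category" and "option not to answer" guidance in an electronic format is sketched below. This is an illustrative assumption, not a format taken from the DPH standards: the class `RaceEthnicityResponse`, the function `record_response`, and the placeholder labels in `EXAMPLE_CATEGORIES` are all hypothetical. Use the categories from the standards your program follows.

```python
from dataclasses import dataclass

# Placeholder labels only; substitute the categories from your adopted standard.
EXAMPLE_CATEGORIES = frozenset({"Category A", "Category B", "Category C"})


@dataclass(frozen=True)
class RaceEthnicityResponse:
    selected: frozenset = frozenset()  # every category the person self-identified with
    declined: bool = False             # True when the person chose not to answer


def record_response(selections, allowed_categories, declined=False):
    """Store a self-reported response, allowing multiple selections or a decline."""
    chosen = frozenset(selections)
    if declined:
        if chosen:
            raise ValueError("A declined response should not include selections.")
        return RaceEthnicityResponse(declined=True)
    unrecognized = chosen - frozenset(allowed_categories)
    if unrecognized:
        raise ValueError(f"Unrecognized categories: {sorted(unrecognized)}")
    # An empty selection without a decline is kept as-is so it can be counted as missing.
    return RaceEthnicityResponse(selected=chosen)


multi = record_response(["Category A", "Category B"], EXAMPLE_CATEGORIES)
no_answer = record_response([], EXAMPLE_CATEGORIES, declined=True)
```

Keeping "declined" separate from an empty selection lets a program distinguish a person's choice not to answer from data that were never collected.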

Data sovereignty

Some communities, like Indigenous and tribal populations, have been particularly harmed by inequitable data collection practices. Many data collection practices omit and misidentify Native people or collect information that is used solely to present negative narratives. To be equitable, data practices should center Indigenous data sovereignty, the right of Indigenous Peoples and Nations to govern data about their peoples, land, and resources⁶. It is important to know that each Tribe is its own sovereign entity. Individual Tribes should be consulted prior to any data collection or analysis that focuses on their Tribe.

The following resources provide recommendations for data collection and analysis that are informed by Indigenous values and practices.

Reflection 

Now that you understand how to center racial equity in your data collection processes and take into consideration ways in which data can be used to reinforce structural racism, reflect with your team on the following: 

  • Do you have a good understanding of the definitions of race and ethnicity? 
  • Have you taken steps to improve the completeness and accuracy of your data, and made efforts to reduce bias? 
  • Have you engaged community partners in planning for and collecting data that are meaningful and clearly connected to practice, policy or research? 
  • Have you developed a standard process for collecting race and ethnicity data and identified any trainings needed to support staff who collect this information? 

Check in with your team to determine if you are ready to disaggregate your data to understand what they say about differences in health outcomes by race and ethnicity.  

Resources 

The following resources can provide additional information about data collection, including specific recommendations for collecting information on race, ethnicity, language, and other standards:  

1 Massachusetts Department of Public Health. MDPH Race, Ethnicity, and Language Standards. Available on request by emailing DPHDataStandards@mass.gov

2 Race. Merriam-Webster Dictionary.

3 Jones CP. Levels of Racism: A Theoretic Framework and a Gardener’s Tale. Am J of Public Health. August 2000, Vol. 90, No. 8.

4 Hawn Nelson, A., Jenkins, D., Zanti, S., Katz, M., Berkowitz, E., et al. (2020). A Toolkit for Centering Racial Equity Throughout Data Integration. Actionable Intelligence for Social Policy, University of Pennsylvania.

5 Massachusetts Department of Public Health. Office of Health Equity. Making CLAS Happen: Six Areas for Action

6 Carroll SR, Rodriguez-Lonebear D, Akee R, Lucchesi A, Richards JR. Indigenous Data in the Covid-19 Pandemic: Straddling Erasure, Terrorism, and Sovereignty. Published June 11, 2020. 

7 Urban Indian Health Institute. Best Practices in American Indian and Alaska Native Data Collection

8 Urban Indian Health Institute. Decolonizing Data Guidebook
