Get started conducting thematic analysis

This guide helps you get started with thematic analysis. Use this method to help you learn about constituents' experiences from their feedback, social media posts, email, and other unstructured data.

What is thematic analysis?

Thematic analysis helps you analyze unstructured data, such as text from feedback surveys, social media comments, or emails.  Themes help capture common needs that come up in this kind of data.

For example, think of all the ways somebody might ask to contact you in a web survey. They might write, “Why can’t I find your phone number?” They might also say, "I need to email you" or "Phone?!" or "Where's your phone num?" To group all of these, you might assign them a theme such as "contact info." This makes the core idea easier to track. You can even reuse this theme to help you understand different data, such as a call center call or a constituent interview. 

Tracking themes helps us prioritize what to fix and persuade stakeholders to invest in fixes. It does this by:

  • Helping us see which issues are most common
  • Helping us avoid paying too much attention to "squeaky wheels" (that is, particularly angry comments that might not represent the most critical issues)
  • Letting us track themes over time to confirm that our improvements are working

How is “thematic analysis” different from reading feedback or social listening? 

You may already have a process where you review feedback or social media posts. Maybe you even identify action items and distribute them to stakeholders. This can be valuable and result in quick improvements. If you don’t have experience reviewing constituent feedback, this is a good way to get started.

Thematic analysis takes more time and effort but can deliver more value. Developing themes helps us identify and group constituents’ experiences. The “identifying” process helps us understand people’s challenges. The “grouping” helps us focus on the most common issues rather than overreacting to “squeaky wheel” comments. Together, they make it possible to track and report on issues over time.

How to use this guide

Conducting thematic analysis is a valuable but time-consuming activity. It requires planning, setup, collaboration, and routines. This step-by-step guide helps you preview what this work is like. It also helps you understand what you can learn from doing it. You'll need:

  • Some unstructured text data, like web feedback, social media posts, or constituent emails
  • A teammate who can collaborate with you. If you don't have a teammate available, Mass Digital can support you for this exercise. To get help, schedule a consult.

You can finish all the steps in a few hours total.

Step 1: Gather some data

Gather a small dataset. Mass.gov feedback data, which is available in the CMS, is an easy thing to start with. 

You can try this out with as few as 20 messages, though it's better to aim for 50 or more, especially if they're short. 

In addition to the messages, collect each message’s source. If this is Mass.gov feedback, the source is the page the message was submitted on. If it is a social media post, the source may be the social media site or the post that the constituent was replying to.

For this exercise, limit yourself to data from related sources. For example, take feedback from one or more pages about eligibility. Don’t combine social media posts about a live event with feedback from a page about an unrelated grant application.

Step 2: Put the data in a spreadsheet

We recommend using Excel or a similar tool for your first round of analysis. 

  1. Add a column for where the message came from. You can call this "Source" or "Page," whatever makes sense for the type of data you have.
  2. Add the messages to a column titled "Message."
  3. Add a third column titled "Theme."

When you’re done, you’ll have a dataset that looks like this one, which uses feedback from a page on reporting cybersecurity incidents: 

| Source | Message | Theme |
| --- | --- | --- |
| Report a cybersecurity incident | please add how to report a phishing scam by text (not always getting these by email) | |
| Report a cybersecurity incident | some of the referenced crimes that Inhad been violated by and assaulted are much more complex and had been elaborated as a result of compromised officials with anti-american ways of proceedures. | |
| Report a cybersecurity incident | I’m getting a text which I think is a scam | |
| Report a cybersecurity incident | I want to report a cyber incident and you don’t seem to care. We got sucked into the smishing deal because the claim was we had not paid Mass. Pike tolls ON THE DAY WE WERE ON THE PIKE. How in the world did that happen? | |
| Report a cybersecurity incident | Your chatbot is useless and does not get me to a point where I can chat with someone when I can't find the limited answer options that it presents. These types of chatbots are a waste of money and only aggravate the tax paying consumer. | |
| Report a cybersecurity incident | let me connect with a person! | |
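
If you would rather build this starting spreadsheet with a script than by hand, the following is a minimal sketch of one way to do it. It assumes Python, and the helper names `build_rows` and `write_csv` and the file name `feedback.csv` are hypothetical, not part of any Mass.gov tool. It produces CSV text, which Excel can open.

```python
import csv
import io

# Column names from Step 2. "Theme" starts empty and is filled in during Step 3.
COLUMNS = ["Source", "Message", "Theme"]

def build_rows(source, messages):
    """Return one row per message, each with an empty Theme cell."""
    return [{"Source": source, "Message": m.strip(), "Theme": ""} for m in messages]

def write_csv(rows, handle):
    """Write rows to any file-like object as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)

rows = build_rows(
    "Report a cybersecurity incident",
    ["I’m getting a text which I think is a scam", "let me connect with a person!"],
)

# Written to memory here. To make a file that Excel can open, pass
# open("feedback.csv", "w", newline="", encoding="utf-8-sig") instead.
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

Keeping the source in its own column, rather than folding it into the message text, makes it easier to filter by page or site later.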

Step 3: Draft themes

Read each message. For each, try to understand what the constituent’s experience is. Remember that we’re not diagnosing the constituent’s problem. We're identifying what they think it is, or what they are experiencing.  

For example, one row of our sample data reads, “I want to report a cyber incident and you don’t seem to care. We got sucked into the smishing deal because the claim was we had not paid Mass. Pike tolls.” What’s important here could be: 

  • The constituent needs to report a “smishing incident.” (Smishing refers to a scam text message.)
  • The constituent feels that we (EOTSS) don't care
  • The smishing incident is about Mass Pike tolls

We could draft themes for each of these in separate columns. When we're starting out, it can be helpful to use people's words as draft themes.

| Message | Theme 1 | Theme 2 | Theme 3 |
| --- | --- | --- | --- |
| I want to report a cyber incident and you don’t seem to care. We got sucked into the smishing deal because the claim was we had not paid Mass. Pike tolls ON THE DAY WE WERE ON THE PIKE. How in the world did that happen? | Report smishing | EOTSS doesn’t care | Mass Pike tolls |

Remember that we are focused on what the constituent "knows" to be true. We do this even if we believe the constituent is wrong or confused. For example, it doesn't help us to think, "EOTSS actually does care," or "Mass Pike tolls are managed by MassDOT, not EOTSS." We stick with the person's reported experience. We need to know if lots of people believe that "EOTSS doesn't care." This helps us learn if we have a trust problem to address. 

Look for chances to reuse themes 

Be on the lookout for opportunities to reuse the themes you draft. For example, this dataset includes other messages that are about text message scams. We can reuse our “Report smishing” theme for these:

| Message | Theme 1 | Theme 2 | Theme 3 |
| --- | --- | --- | --- |
| I want to report a cyber incident and you don’t seem to care. We got sucked into the smishing deal because the claim was we had not paid Mass. Pike tolls ON THE DAY WE WERE ON THE PIKE. How in the world did that happen? | Report smishing | EOTSS doesn’t care | Mass Pike tolls |
| please add how to report a phishing scam by text (not always getting these by email) | Report smishing | | |
| I’m getting a text which I think is a scam | Report smishing | | |
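
If you keep your data in a script rather than a spreadsheet, one risk is ending up with near-duplicate themes such as "Report smishing" and "report  Smishing". The sketch below shows one possible way to check for an existing theme before adding a new one. It assumes Python, and the helpers `normalize`, `themes_in_use`, and `reuse_or_add` are hypothetical names chosen for illustration.

```python
def normalize(theme):
    """Lowercase and collapse whitespace so spelling variants compare equal."""
    return " ".join(theme.lower().split())

def themes_in_use(rows):
    """Map each normalized theme to the spelling that was first used for it."""
    seen = {}
    for row in rows:
        for theme in row["themes"]:
            seen.setdefault(normalize(theme), theme)
    return seen

def reuse_or_add(draft, seen):
    """Return the existing spelling of a theme if there is one, else register the draft."""
    return seen.setdefault(normalize(draft), draft.strip())

rows = [
    {
        "message": "please add how to report a phishing scam by text (not always getting these by email)",
        "themes": ["Report smishing"],
    },
]

seen = themes_in_use(rows)

# A differently typed draft resolves to the theme we already have.
theme = reuse_or_add("report  Smishing", seen)
rows.append({"message": "I’m getting a text which I think is a scam", "themes": [theme]})
print(theme)
```

This only catches differences in capitalization and spacing. Deciding whether two differently worded themes mean the same thing is still a judgment call for the researcher.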

Step 4: Create a theme dictionary

This is the most important step in this guide.

Once you have drafted themes for your data, try to define them. This is difficult, critical work. Your definitions help your team members use your themes the same way you do. This means that 2 researchers can analyze the same data and come to the same conclusions.

So what is a "theme definition"? A definition tells us what a theme means and distinguishes it from other themes. For example, let’s try defining “Report smishing."

| Theme name | Theme definition |
| --- | --- |
| Report smishing | Looking to report a text message scam, or what they think is a text message scam. Includes messages that don’t mention reporting (e.g. that just express concern about scammy texts). Does not cover email scams. |

This example definition does 3 things: 

  • Says what types of messages you should apply the theme to
  • Reinforces that the theme applies to a possible gray area (messages about smishing that don’t mention reporting, even though the theme has the word “reporting” in it)
  • Says what’s excluded (email scams) 

The definition is more important than the theme’s name. The definition, not the name, tells us which messages the theme should apply to.
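
A theme dictionary can also be kept as structured data. The sketch below is one possible layout, assuming Python. The class `ThemeDefinition` and its field names are hypothetical, and treating all 3 parts of a definition as expected is our assumption based on the example above, not a rule. It flags draft themes that still need a definition.

```python
from dataclasses import dataclass, field

@dataclass
class ThemeDefinition:
    applies_to: str = ""                             # what messages the theme covers
    gray_areas: list = field(default_factory=list)   # borderline cases that are included
    excludes: list = field(default_factory=list)     # what the theme does not cover

    def missing_parts(self):
        """Return the names of the parts that are still empty."""
        parts = {
            "applies_to": self.applies_to.strip(),
            "gray_areas": self.gray_areas,
            "excludes": self.excludes,
        }
        return [name for name, value in parts.items() if not value]

# Keyed by theme name. The second entry is a draft theme with no definition yet.
theme_dictionary = {
    "Report smishing": ThemeDefinition(
        applies_to="Looking to report a text message scam, or what they think is a text message scam.",
        gray_areas=["Messages that don’t mention reporting, such as ones that just express concern about scammy texts."],
        excludes=["Email scams."],
    ),
    "EOTSS doesn’t care": ThemeDefinition(),
}

to_finish = {
    name: definition.missing_parts()
    for name, definition in theme_dictionary.items()
    if definition.missing_parts()
}
print(to_finish)
```

Here only "EOTSS doesn’t care" is reported, because it was drafted in Step 3 but has not been defined.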

Step 5: Have a teammate try using your theme dictionary on your data

Test your dictionary by having a teammate try to use it on the same data. (You may have to help them learn how to apply themes, first.) 

After they’ve done it, discuss any differences in how you both applied themes. This will help you hone your dictionary by revising definitions and updating theme names.  
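
To find the differences to discuss, you could compare the two sets of theme assignments message by message. The following is a rough sketch, assuming Python and assuming both people coded the same messages. The function `compare_coding` is a hypothetical helper, the teammate's coding shown here is invented for illustration, and the agreement figure is a simple share of matching messages, not a formal reliability statistic.

```python
def compare_coding(mine, theirs):
    """Compare two coders' themes per message.

    Both arguments map a message to the set of themes assigned to it.
    Returns (share of messages with identical themes, list of disagreements).
    Messages that appear only in `theirs` are ignored.
    """
    disagreements = []
    for message, my_themes in mine.items():
        their_themes = theirs.get(message, set())
        if my_themes != their_themes:
            disagreements.append({
                "message": message,
                "only_mine": sorted(my_themes - their_themes),
                "only_theirs": sorted(their_themes - my_themes),
            })
    agreement = 1 - len(disagreements) / len(mine) if mine else 0.0
    return agreement, disagreements

mine = {
    "I’m getting a text which I think is a scam": {"Report smishing"},
    "please add how to report a phishing scam by text (not always getting these by email)": {"Report smishing"},
}
# Invented for illustration: the teammate left the second message untagged.
theirs = {
    "I’m getting a text which I think is a scam": {"Report smishing"},
    "please add how to report a phishing scam by text (not always getting these by email)": set(),
}

agreement, disagreements = compare_coding(mine, theirs)
print(agreement)
for item in disagreements:
    print(item["message"], item["only_mine"], item["only_theirs"])
```

Each disagreement is a prompt for conversation. In this invented case, the teammate may have read "phishing" as an email scam, which suggests the definition needs a clearer note about text messages.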

If you don't have a teammate, Mass Digital can support you for this exercise. To get help, schedule a consult.

Step 6 (optional): Continue with more data and revise your theme dictionary

Add another 25 to 50 messages to your dataset. You may notice the need for new themes. You might also notice that some themes “squish together” issues that should be separated. Here is an example of a theme that covers too much ground. The first table shows the theme and its definition:

| Theme name | Theme definition |
| --- | --- |
| Can’t find info | Questions/complaints about needing to know something, including where documents are or missing info from webpages. Does not cover questions about how to fill out applications. |

The next table shows this theme being applied:

| Message | Theme |
| --- | --- |
| could not find a notice to quit form | Can’t find info |
| Give a simple rental agreement to use. | Can’t find info |
| How can I get assistance for bills to get paid | Can’t find info |

The researcher who created this theme decided to split up “Can’t find info” because it covered too much ground. In this example, the first 2 messages are about looking for forms or documents. The third is a question about whether a service exists or whether it covers a particular situation. The researcher created new themes to replace “Can’t find info,” 2 of which were:

| Theme name | Theme definition |
| --- | --- |
| Need form/document | Looking for document downloads, a form to fill out, sample forms to use. |
| Service coverage | Questions or comments about what a service can be used for. Might be asking for more info, if the service covers a specific use, or expressing confusion about what the service is for. |
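
When you split a theme, every message that carried the old theme needs to be re-read and reassigned. The sketch below shows one way to pull those messages and apply your decisions, assuming Python. The helpers `messages_with_theme` and `replace_theme` are hypothetical, and the reassignments follow the reading of the 3 messages described above.

```python
OLD_THEME = "Can’t find info"

def messages_with_theme(rows, theme):
    """Return the rows that currently carry the given theme."""
    return [row for row in rows if theme in row["themes"]]

def replace_theme(row, old, new):
    """Swap one theme for another on a single row, leaving other themes in place."""
    row["themes"] = [new if t == old else t for t in row["themes"]]

rows = [
    {"message": "could not find a notice to quit form", "themes": [OLD_THEME]},
    {"message": "Give a simple rental agreement to use.", "themes": [OLD_THEME]},
    {"message": "How can I get assistance for bills to get paid", "themes": [OLD_THEME]},
]

# Decisions made by a person after re-reading each message.
decisions = {
    "could not find a notice to quit form": "Need form/document",
    "Give a simple rental agreement to use.": "Need form/document",
    "How can I get assistance for bills to get paid": "Service coverage",
}

for row in messages_with_theme(rows, OLD_THEME):
    replace_theme(row, OLD_THEME, decisions[row["message"]])

# Anything still listed here has not been reassigned yet.
leftover = messages_with_theme(rows, OLD_THEME)
print(leftover)
```

The script only applies the decisions. Choosing the new theme for each message is still the researcher's work.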

Step 7: Assess what you've learned

The last step is to review your data. Which themes did you assign most? Does this surprise you? Do the themes in your dictionary represent the issues you were expecting?

You can also make a bar chart to see which themes appear most. (See below if you need help doing this in Excel.) 
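
If your theme counts are in a script rather than a spreadsheet, a plain-text bar chart is a quick substitute. This is a minimal sketch, assuming Python. The function `text_bar_chart` is a hypothetical helper, and the counts are taken from the 3-row sample table in "Look for chances to reuse themes."

```python
def text_bar_chart(counts, symbol="#"):
    """Return one line per theme, most frequent first, with a bar of symbols."""
    if not counts:
        return []
    width = max(len(name) for name in counts)
    # Sort by count (highest first), then alphabetically to break ties.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{name.ljust(width)}  {symbol * n} {n}" for name, n in ordered]

counts = {"Mass Pike tolls": 1, "Report smishing": 3, "EOTSS doesn’t care": 1}

lines = text_bar_chart(counts)
for line in lines:
    print(line)
```

With only 3 messages, the chart is not meaningful. It becomes more useful once you have coded 50 or more.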

You can also use these themes as the basis for a scenario walkthrough. Try to investigate why people might be having the experiences they are having. You may also be able to learn more by speaking with stakeholders or people who interact with constituents frequently. 

Adding up themes across columns in Excel

This guide encourages you to assign multiple themes to each message. However, this makes it more difficult to count theme totals, since they might be spread across multiple columns. We've created a model Excel file to demonstrate counting themes across columns.
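
The same tally can be sketched outside Excel. The example below assumes Python and assumes your theme columns are named "Theme 1," "Theme 2," and "Theme 3," as in the sample tables. The function `count_themes` is a hypothetical helper and is not how the model Excel file works.

```python
from collections import Counter

THEME_COLUMNS = ["Theme 1", "Theme 2", "Theme 3"]

def count_themes(rows, columns=THEME_COLUMNS):
    """Count how many messages carry each theme, across all theme columns."""
    counts = Counter()
    for row in rows:
        # A set keeps a theme entered in two columns of one row from counting twice.
        themes = {(row.get(col) or "").strip() for col in columns}
        themes.discard("")  # skip empty cells
        counts.update(themes)
    return counts

rows = [
    {
        "Message": "I want to report a cyber incident and you don’t seem to care. We got sucked into the smishing deal because the claim was we had not paid Mass. Pike tolls ON THE DAY WE WERE ON THE PIKE. How in the world did that happen?",
        "Theme 1": "Report smishing",
        "Theme 2": "EOTSS doesn’t care",
        "Theme 3": "Mass Pike tolls",
    },
    {
        "Message": "please add how to report a phishing scam by text (not always getting these by email)",
        "Theme 1": "Report smishing",
        "Theme 2": "",
        "Theme 3": "",
    },
    {
        "Message": "I’m getting a text which I think is a scam",
        "Theme 1": "Report smishing",
        "Theme 2": "",
        "Theme 3": "",
    },
]

counts = count_themes(rows)
for theme, n in counts.most_common():
    print(theme, n)
```

Rows shaped like these are what Python's `csv.DictReader` produces when it reads a spreadsheet saved as CSV with a header row.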
