
Introduction to qualitative content analysis
Qualitative content analysis helps researchers systematically explore large volumes of text—finding patterns, mapping meaning, and generating insights without reducing everything to numbers. It’s a flexible method that brings structure to qualitative data without locking researchers into rigid frameworks.
In this guide, we break down what qualitative content analysis is, how it differs from thematic analysis, and what each step looks like in practice—from prepping your data to presenting your results. We also cover the roles of inductive and deductive coding, show how qualitative data analysis software such as NVivo and ATLAS.ti can streamline the workflow, and walk through examples that show what qualitative content analysis looks like in real-world application.
If you are new to the method, you may want to start with Lumivero’s overview of qualitative data analysis before proceeding.
What is qualitative content analysis?
Content analysis for qualitative research is a systematic way to organize and interpret textual or visual data. Unlike quantitative content analysis, which counts words or codes, the qualitative form attends to meaning and context.
Researchers work with interview transcripts, survey comments, policy documents, or media posts, isolating segments relevant to the study question. These segments are grouped into data-driven or theory-guided categories that describe the phenomenon. The procedure balances structure with flexibility, letting a study stay grounded in the source material. The technique grew from early studies of mass communication and now supports research in health, education, and sociology.
Several styles appear in the literature on qualitative content analysis. Hsieh and Shannon describe conventional, directed, and summative approaches, each matching a different analytic aim.
Conventional analysis lets categories grow from the data, directed analysis starts from prior theory, and summative analysis examines frequency or intensity of chosen terms while also interpreting latent meaning. All three require transparent documentation so readers can see how findings were reached.
Trustworthiness in qualitative content analysis hinges on clear coding rules, reflexive memo writing, and, when useful, more than one coder discussing discrepancies. Especially when a data set is analyzed by multiple researchers, establishing reliability is an important step in presenting a trustworthy analysis. Reliability statistics such as Krippendorff’s α can sit alongside qualitative criteria like credibility and transferability (see guidance on trustworthiness). Guidance emphasizes that even when counts appear, interpretation—not prediction—remains the goal.
Peer debriefing, participant checks, and reflexive journals give additional evidence that interpretations align with the source context. These practices signal analytic integrity to journal reviewers.
Qualitative data analysis software such as NVivo and ATLAS.ti can speed up qualitative content analysis through segmenting, coding, and memo management without changing the logic of the method. The same workflow applies to focus groups, field notes, or social media screenshots. Selecting the right platform for your research depends on team size, file types, and reporting needs, which this guide will address later. Next, we consider where content analysis sits on the qualitative–quantitative spectrum.
Is content analysis qualitative or quantitative?
Content analysis is occasionally labeled a mixed method because it borrows tools from both qualitative and quantitative traditions. The distinction depends on how researchers handle the material.
Qualitative orientation. When the goal is to explain how meanings form, researchers focus on context and on exploring in-depth insights into specific phenomena. Segments of text are coded for latent or manifest meaning, and categories remain open to refinement. Interpretive notes, memos, and iterative comparisons guide decisions. Frequencies may be noted but serve only to highlight emphasis, not to test hypotheses. Credibility checks—peer review, member feedback, reflexive journals—support the trustworthiness standards common in qualitative work.
Quantitative orientation. A quantitative orientation emphasizes measuring and categorizing content so that conclusions can rest on statistical evidence. A study shifts toward the quantitative side when codes are fixed before data review and counts drive the analysis. In media research, for example, a team might tally the number of times a policy frame appears across newspapers to make generalizable claims. Reliability statistics such as Cohen’s kappa coefficient then become central to judging rigor. This approach is useful for large datasets where volume makes deep interpretive work impractical.
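To make the reliability side concrete, here is a minimal sketch of how such an agreement statistic can be computed outside dedicated QDA software, assuming two coders have each assigned one predefined category per segment. The category labels and data are hypothetical; the calculation uses scikit-learn's cohen_kappa_score.

```python
# A minimal sketch of an intercoder agreement check for a fixed coding frame.
# The segments, coders, and category labels below are hypothetical.
from sklearn.metrics import cohen_kappa_score

coder_a = ["policy", "policy", "economy", "health", "policy", "health"]
coder_b = ["policy", "economy", "economy", "health", "policy", "policy"]

kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa: {kappa:.2f}")  # values near 1.0 indicate strong agreement
```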
Hybrid practice. Many studies combine both logics. Categories may stem from theory (deductive), yet analysts still read for nuance. Counts can illustrate prominence, while narrative summaries address how and why patterns matter. Software packages such as NVivo and ATLAS.ti support both counting- and meaning-focused coding, letting teams move back and forth across the spectrum. Choosing one balance over another rests on the research question: Is the aim to map frequencies, interpret meanings, or accomplish both? Recognizing this continuum clarifies why content analysis can appear in either a qualitative or quantitative research section of a journal’s methods guidelines.
Types of qualitative content analysis
Researchers choose between deductive and inductive content analysis based on how firmly theory shapes the study question. Both approaches follow the same broad workflow—preparing data, coding, categorizing, and interpreting—but they differ in when categories are defined and how flexible coding remains during analysis.
Deductive qualitative content analysis
Deductive analysis starts with an existing framework—often a set of concepts drawn from prior literature or policy documents. The analyst converts those concepts into a category matrix before coding begins. Each segment of text is assessed for its fit within this predetermined structure; segments that do not fit are noted separately or marked as “other” for later scrutiny.
Because the coding frame is fixed early, intercoder agreement statistics can be calculated at an early stage. Any disagreements prompt targeted revisions to category definitions rather than wholesale restructuring. This strategy works well when the aim is to confirm a model, monitor policy uptake, or map theory across new contexts. A study of patient safety reports, for example, might code reports against predefined error types to assess frequency and severity across hospital units. The deductive route keeps the focus on testing or extending an established explanation rather than generating new constructs.
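To show the mechanics, the sketch below applies a predetermined category matrix to incident-report segments, setting aside non-fitting segments as "other." The categories, keyword cues, and reports are all hypothetical; real deductive coding rests on analyst judgment, with keyword matching at most a first pass.

```python
# Illustrative sketch of applying a predefined (deductive) category matrix.
# Categories, keyword cues, and reports are hypothetical examples.
category_matrix = {
    "medication error": ["dose", "dosage", "wrong drug"],
    "communication failure": ["handover", "not informed", "miscommunication"],
    "equipment issue": ["pump", "monitor", "device"],
}

def assign_category(segment: str) -> str:
    text = segment.lower()
    for category, cues in category_matrix.items():
        if any(cue in text for cue in cues):
            return category
    return "other"  # segments that do not fit are set aside for later scrutiny

reports = [
    "Patient received a double dose due to an unclear chart.",
    "Night shift was not informed about the allergy.",
    "Infusion pump alarm failed during transfer.",
    "Family raised a concern about parking.",
]
for report in reports:
    print(assign_category(report), "->", report)
```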
Inductive qualitative content analysis
Inductive analysis reverses that order by letting categories evolve during close reading of the material. Analysts begin with open coding—marking any meaningful segment without forcing it into a preset scheme. Codes that capture similar meanings are then merged into higher-order categories, which can be refined through constant comparison until no new categories emerge.
Memo writing tracks analytic questions and tentative links between categories. This flexibility allows unanticipated themes to surface, making inductive analysis suitable for exploratory studies or contexts with limited prior research. A project examining first-generation college students’ social media posts, for instance, might reveal coping strategies or support networks not noted in earlier literature.
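As a rough illustration of the merging step, the sketch below collapses hypothetical open codes into higher-order categories and counts the result. In practice the merge decisions come from constant comparison and memo writing, not from a lookup table.

```python
# Sketch of merging open codes into higher-order categories (inductive logic).
# Open codes and groupings are hypothetical examples only.
from collections import Counter

open_codes = [
    "long wait times", "nurse encouragement", "peer advice on nausea",
    "confusing bills", "nurse encouragement", "long wait times",
    "peer advice on scheduling", "confusing bills",
]

merge_map = {
    "long wait times": "logistics",
    "confusing bills": "financial strain",
    "nurse encouragement": "interpersonal care",
    "peer advice on nausea": "crowdsourcing tips",
    "peer advice on scheduling": "crowdsourcing tips",
}

category_counts = Counter(merge_map[code] for code in open_codes)
print(category_counts.most_common())
```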
For a practical guide to this workflow, read “Navigating Inductive Content Analysis in Qualitative Research,” which outlines techniques for code merging, abstraction, and audit trails that bolster transparency.
Content analysis vs. thematic analysis in qualitative research
Both content analysis and thematic analysis convert raw language data into organized findings, yet they serve different analytic aims. Each begins with familiar steps—data familiarization, coding, categorizing, and pattern seeking—so researchers moving from one to the other will recognize much of the workflow. Both methods also accept either deductive or inductive logic; a study can start with theory-driven codes or allow categories to surface during reading. Reflexive memo writing, peer checks, and research software tools such as NVivo or ATLAS.ti support rigor in both approaches.
In the context of content analysis, it is important to distinguish between conceptual and relational analysis. Conceptual analysis focuses on the existence and frequency of specific concepts within data, while relational analysis seeks to understand how those concepts relate to one another. By outlining the processes and considerations involved in both types of analysis, researchers can ensure methodological rigor and understand the implications for research outcomes.
The key divergence lies in how each method treats meaning units and what outcome it seeks. Content analysis emphasizes the frequency and distribution of clearly bounded units—words, sentences, or images—to answer “what is present and how often.” Categories stay relatively close to the text, and counting introduces a quasi-quantitative lens without changing the interpretive nature of the work.
Thematic analysis focuses on broader patterns that cut across a dataset, asking “what does this pattern tell us about the underlying phenomenon.” Themes can integrate context, latent meaning, and participants’ own framing even when specific words change. As a result, thematic analysis often produces narrative accounts that link several codes into an overarching story, whereas content analysis typically yields a structured list of categories and subcategories with illustrative extracts.
The two qualitative research methods also differ in output granularity. Managers of survey comment data may prefer content analysis for its succinct category counts that feed directly into dashboards, while ethnographers may lean on thematic analysis to craft rich descriptions of social processes. Choosing between them depends on the research question, dataset size, and desired level of abstraction.
For practical considerations about software features that can accommodate either method, see Lumivero’s comparison of QDA software options.
Aspect | Content analysis | Thematic analysis
Primary focus | Frequency and distribution of meaning units | Patterned meaning across the dataset |
Typical unit of analysis | Word, phrase, sentence, image | Segment, episode, or full account |
Outcome | Hierarchical categories with counts (optional) | Themes that explain a phenomenon |
Role of counting | Common, supports comparisons | Optional, usually secondary |
Level of abstraction | Closer to surface meaning | Ranges from surface to latent meaning |
Ideal use cases | Large comment sets, media studies, policy tracking | Experience-based studies, identity work, process tracing |
Common software tasks | Code-and-count, matrix queries | Code comparison, theme mapping |
Reporting style | Tables or charts of category frequencies plus examples | Narrative synthesis with illustrative quotes |
Qualitative content analysis steps – from data collection to the analysis process
Qualitative content analysis moves through a repeatable workflow that preserves the link between raw material and final findings. The eight steps below outline a typical path from data collection to write-up. Teams often cycle back to earlier steps as insights develop, so treat the sequence as an organizing scaffold rather than a rigid checklist.
Prepare and familiarize with the data
Gather all relevant sources—interview transcripts, focus group recordings, policy documents, or social media captures—in a single project space. Transcribe recordings verbatim, correct obvious errors, and anonymize identifying details to protect participants. Export the cleaned files to a text-searchable format such as .docx or .txt, then read through each file at least once without coding. Use margin notes or memos to capture first impressions, interesting turns of phrase, and possible analytic angles. Early familiarization sharpens subsequent coding by clarifying how participants express ideas, which terminology they use naturally, and where contextual cues appear.
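A minimal sketch of that cleanup pass is shown below, assuming plain-text transcripts in a hypothetical transcripts folder. The patterns are illustrative only; automated substitution never replaces a human anonymization check.

```python
# Sketch of a first-pass anonymization step over plain-text transcripts.
# Folder names and patterns are hypothetical; review output manually.
import re
from pathlib import Path

EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")
PHONE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")

def anonymize(text: str) -> str:
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

Path("cleaned").mkdir(exist_ok=True)
for path in Path("transcripts").glob("*.txt"):  # hypothetical source folder
    cleaned = anonymize(path.read_text(encoding="utf-8"))
    (Path("cleaned") / path.name).write_text(cleaned, encoding="utf-8")
```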
Define the unit of analysis
A unit of analysis is the smallest chunk of data that will receive a code. It can range from a single word to a full paragraph, depending on the research question. For a study on stigma language, a word or phrase may suffice; for work on workplace culture, a sentence or short paragraph provides more context. Choose the unit deliberately, state the choice in the method section, and keep it stable across the dataset so counts and comparisons remain meaningful. When multiple units are plausible, pilot each option on a small subset to see which one captures the intended concept with the least ambiguity.
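The sketch below illustrates how the choice of unit changes what gets coded, comparing sentence and paragraph units on an invented passage. The splitting rules are deliberately simple and would need refinement for real transcripts.

```python
# Sketch comparing two possible units of analysis on the same (hypothetical) passage.
import re

passage = (
    "The team never debriefs after incidents. People just move on.\n\n"
    "Training is offered, but only new hires attend."
)

paragraph_units = [p.strip() for p in passage.split("\n\n") if p.strip()]
sentence_units = [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]

print(len(paragraph_units), "paragraph units")  # broader context per unit
print(len(sentence_units), "sentence units")    # finer-grained counts
```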
Develop or refine the coding frame
Create a coding frame that lists category names, brief definitions, and inclusion–exclusion examples. In deductive projects, categories derive from theory or policy guidelines; in inductive projects, they grow from open coding. Either way, aim for mutual exclusivity at the lowest level so a single unit fits one category unless a justified overlap exists. Where hierarchical relationships matter, arrange categories into parent-child branches to reflect increasing specificity. A well-defined frame reduces coder drift, supports transparency, and speeds later retrieval tasks. Store the frame in software notes or a shared document that updates automatically as revisions occur.
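One way to keep such a frame in a shared, machine-readable form is sketched below. The category names, definitions, and example segments are hypothetical and stand in for whatever your study requires.

```python
# Sketch of a coding frame kept as structured data (hypothetical categories).
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Category:
    name: str
    definition: str
    include: List[str] = field(default_factory=list)  # example segments that fit
    exclude: List[str] = field(default_factory=list)  # near-misses that do not fit
    parent: Optional[str] = None                       # for parent-child branches

coding_frame = [
    Category(
        name="restorative practices",
        definition="References to repairing harm or reintegrating the student.",
        include=["students meet with an integrity mentor"],
        exclude=["students may appeal a sanction"],  # procedural, not restorative
    ),
    Category(
        name="punitive focus",
        definition="Language centered on penalties or exclusion.",
        include=["violations result in automatic suspension"],
    ),
]

for category in coding_frame:
    print(category.name, "-", category.definition)
```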
Pilot test the coding frame
Apply the draft frame to 5 to 10 percent of the dataset. If working in a team, code the same files independently, then compare segment-to-code matches. Note disagreements, vague definitions, or missing categories. Revise wording, merge overlapping codes, or split large codes that obscure important distinctions. Retest until coders reach an acceptable level of agreement or, in single-coder studies, until the frame feels stable. Piloting prevents large-scale re-coding later and produces an audit trail that shows how analytic decisions evolved.
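A quick sketch of the comparison step follows, assuming each coder's pilot assignments were exported as a simple mapping from segment ID to code. The data are invented, and simple percent agreement is shown only as a starting point for the discussion of discrepancies.

```python
# Sketch of comparing two coders' pilot coding on shared segments (hypothetical data).
coder_a = {"seg01": "logistics", "seg02": "financial strain", "seg03": "interpersonal care"}
coder_b = {"seg01": "logistics", "seg02": "interpersonal care", "seg03": "interpersonal care"}

shared = sorted(set(coder_a) & set(coder_b))
matches = sum(coder_a[s] == coder_b[s] for s in shared)
print(f"Agreement: {matches}/{len(shared)} = {matches / len(shared):.0%}")

disagreements = [s for s in shared if coder_a[s] != coder_b[s]]
print("Discuss and refine definitions for:", disagreements)
```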
Code the full dataset
Work through each file systematically, assigning codes to every qualifying unit. Keep reflexive memos close at hand to note uncertainties, emerging ideas, or contextual factors that influence meaning. Segment boundaries should remain consistent with the chosen unit of analysis; avoid expanding or shrinking units to force a fit. In software, use color coding or labels to flag units needing a second look. If double coding for reliability, set checkpoints (e.g., every fifth file) to discuss discrepancies and adjust the frame as needed. Coding is complete when all data are handled and no unresolved segments remain.
Summarize categories and identify patterns
Export code reports or run matrix queries to view all segments linked to each category. Write concise summaries that capture the gist of each category in plain language. Note relative prevalence, typical context, and any notable exceptions. Look for co-occurrences between categories, temporal sequences, or contrasts across participant groups. Basic counts can highlight emphasis, while qualitative comparison clarifies how meanings differ. Where visualization helps, generate bar charts or cluster maps within the software; these outputs aid interpretation without supplanting narrative insight.
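If coded segments are exported from the software, the same prevalence and co-occurrence counts can be reproduced with a few lines of general-purpose code, as in the hypothetical sketch below.

```python
# Sketch of summarizing exported coded segments (hypothetical codes and data).
from collections import Counter
from itertools import combinations

coded_segments = [
    {"codes": ["logistics", "financial strain"]},
    {"codes": ["interpersonal care"]},
    {"codes": ["logistics", "interpersonal care"]},
    {"codes": ["logistics", "financial strain"]},
]

frequency = Counter(code for seg in coded_segments for code in seg["codes"])
cooccurrence = Counter(
    pair
    for seg in coded_segments
    for pair in combinations(sorted(seg["codes"]), 2)
)

print("Category prevalence:", frequency.most_common())
print("Co-occurring pairs:", cooccurrence.most_common())
```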
Check trustworthiness and reliability
Establish credibility by tracing every claim back to supporting segments and memos. Invite a peer to review the coding frame and a sample of coded text, asking whether the interpretation appears reasonable. If applicable, share preliminary findings with participants for comment. Calculate intercoder reliability statistics such as Krippendorff’s α when using a fixed frame or rely on negotiated agreement when the frame evolves inductively. Maintain an audit trail that records coding decisions, memo entries, and frame revisions so external readers can follow the analytic logic.
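Where a numeric check is wanted, the third-party krippendorff Python package (pip install krippendorff) can compute α from a coder-by-segment matrix, as in the sketch below. The category IDs are hypothetical, and missing codings are marked as NaN.

```python
# Sketch of a Krippendorff's alpha check using the third-party `krippendorff` package.
# Rows are coders, columns are segments, values are numeric category IDs (hypothetical).
import numpy as np
import krippendorff

reliability_data = np.array([
    [1, 2, 3, 3, 2, 1, np.nan],  # coder A; NaN marks a segment left uncoded
    [1, 2, 3, 2, 2, 1, 1],       # coder B
])

alpha = krippendorff.alpha(reliability_data=reliability_data,
                           level_of_measurement="nominal")
print(f"Krippendorff's alpha: {alpha:.2f}")
```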
Report the findings
Describe the dataset, unit of analysis, coding frame construction, and reliability checks in the method section. Present each major category (or theme, if categories were later aggregated) with a succinct definition and one or two illustrative quotations. When counts are informative, include a simple table that lists category frequencies or cross-tabulations by group. Discuss how patterns answer the research question, relate to previous studies, or point to practical implications. Provide enough excerpted data for readers to judge the fit between evidence and interpretation, but avoid overwhelming them with lengthy blocks of text.
Benefits and challenges of qualitative content analysis
Qualitative content analysis offers a range of benefits that make it a powerful method for exploring complex research questions. One of its key strengths is the ability to uncover deep, nuanced insights from textual and visual data. Unlike quantitative approaches, qualitative analysis allows researchers to explore context, meaning, and subjective experiences—making it especially valuable in social and cultural studies.
Another major advantage is flexibility. Qualitative content analysis can be applied across diverse data sources such as interviews, focus groups, open-ended survey responses, policy documents, and social media content. This adaptability allows researchers to tailor their analysis to fit the specific goals and design of their study.
However, the method does come with challenges. Coding and categorizing data can be highly subjective, increasing the risk of inconsistency or bias. Tools like NVivo and ATLAS.ti address this by offering structured coding environments, visualizations, and audit trails that help maintain consistency and transparency throughout the research process. Both platforms support team-based coding, allowing multiple researchers to work collaboratively while tracking inter-coder agreement.
Researcher bias is another concern, particularly because qualitative content analysis relies heavily on interpretation. To mitigate this, reflexive practices like memoing and annotation are crucial. NVivo and ATLAS.ti support these practices through built-in features for memo writing, linking insights, and documenting analytical decisions, helping researchers stay aware of their assumptions and maintain rigor in their interpretations.
The time-intensive nature of qualitative content analysis—especially with large datasets—is also a common hurdle. Manually sorting, coding, and analyzing can be demanding. Here, NVivo and ATLAS.ti provide automation tools such as word frequency counts, text searches, sentiment analysis, and autocoding by theme (identifying noun phrases) that speed up initial coding and reduce manual workload without sacrificing depth.
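For a rough idea of what a word-frequency pass involves, the sketch below counts tokens across two invented comments after removing a minimal, hypothetical stopword list; dedicated software automates and scales this kind of step.

```python
# Sketch of a basic word-frequency pass over hypothetical comments.
import re
from collections import Counter

documents = [
    "The waiting room was crowded and the wait was long.",
    "Nurses explained the wait and kept checking in.",
]
stopwords = {"the", "and", "was", "in", "a", "of", "to"}

tokens = [
    word
    for doc in documents
    for word in re.findall(r"[a-z']+", doc.lower())
    if word not in stopwords
]
print(Counter(tokens).most_common(5))
```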
Despite the challenges, qualitative content analysis remains a valuable approach for generating rich, detailed findings. When paired with tools like Lumivero’s research software, researchers can streamline their workflows, enhance reliability, and deepen their insights.
Qualitative content analysis examples
The following fictional cases show how researchers could apply qualitative content analysis across different data sources and analytic logics. While not drawn from actual studies, each example walks through key decisions on units of analysis, coding frames, and trustworthiness checks, illustrating how the method adapts to varying research aims.
Example 1: Inductive analysis of patient experience social posts
Research aim and dataset. A qualitative health research team wanted to understand how patients talk about outpatient chemotherapy on social media. They scraped 12,000 public social posts during a six-month window, then sampled 2,000 English language posts that mentioned infusion clinics. Each post became the unit of analysis.
Coding workflow. Analysts imported the sample into their CAQDAS project and began open coding. Short phrases such as “long wait,” “nurse support,” and “billing shock” captured post content without forcing it into a predefined structure. After two rounds, the team grouped codes into broader categories—logistics, interpersonal care, physical side effects, and financial strain—checking that posts fitted one category at the lowest level.
Trustworthiness. Two coders double-coded 10% of the sample, negotiated disagreements, and refined definitions where overlap caused confusion. Memos recorded decisions to split “side effects” into “acute” and “delayed” subcategories when new patterns emerged. A peer reviewer outside the project read random excerpts linked to each category and agreed that the labels matched the content.
Key insight. The inductive approach surfaced an unanticipated category: “crowdsourcing tips,” where patients exchanged practical advice about managing nausea and scheduling. This category pointed the clinical team toward possible patient-generated resources worth formal evaluation.
Example 2: Deductive analysis of university policy documents
Research aim and dataset. An education policy study examined how U.S. universities embed equity language in academic integrity policies. Researchers collected 150 publicly available policy documents from institutional websites. The unit of analysis was a sentence, chosen for its balance between context and specificity.
Coding frame construction. Drawing on existing equity frameworks, the team built a category matrix before coding: (1) inclusive language, (2) restorative practices, (3) punitive focus, and (4) student support resources. Each category carried inclusion and exclusion rules plus example sentences.
Analysis and reporting. A cross-tab query tallied category counts by university type (public flagship, regional public, private not-for-profit). Private institutions showed higher proportions of “inclusive language” and “student support,” while punitive language dominated regional publics. The report presented these proportions in bar charts and quoted illustrative sentences to show how equity references varied by institutional mission.
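The underlying cross-tabulation logic can also be reproduced outside the QDA package, for example with pandas on a hypothetical export of coded sentences, as sketched below.

```python
# Sketch of the cross-tabulation logic on a hypothetical export (one row per coded sentence).
import pandas as pd

coded = pd.DataFrame({
    "institution_type": ["public flagship", "regional public", "private",
                         "regional public", "private", "public flagship"],
    "category": ["inclusive language", "punitive focus", "student support",
                 "punitive focus", "inclusive language", "restorative practices"],
})

counts = pd.crosstab(coded["institution_type"], coded["category"])
proportions = pd.crosstab(coded["institution_type"], coded["category"], normalize="index")
print(counts)
print(proportions.round(2))
```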
Practical outcome. Findings informed a policy briefing that highlighted gap areas and suggested revision templates grounded in the higher-frequency restorative statements from peer institutions.
These examples demonstrate how inductive and deductive logics guide frame development, how unit choices shape comparability, and how trustworthiness checks align with the study design rather than a one-size-fits-all protocol.
Qualitative data analysis software
While qualitative content analysis can be done with Microsoft Word files and Microsoft Excel spreadsheets, dedicated research software manages repetitive tasks, tracks decisions, and organizes evidence in ways that manual systems struggle to match.
Lumivero’s two qualitative data analysis packages described below meet most researchers’ needs for importing data, applying codes, querying patterns, and exporting results – allowing you to streamline your qualitative content analysis. Both run on Windows and macOS, support mixed media projects, and allow teams to share the same project file through a server or cloud add-on. Choosing between them comes down to interface preferences, collaboration workflow, and the depth of analytic tools required.
NVivo
NVivo groups everything—sources, codes, memos, and classifications—inside a single project file. Text, audio, video, images, PDFs, and web captures import with a few clicks, preserving original formatting.
Coding is handled through a familiar highlight-and-assign action that works much like comment functions in common word processors. Analysts can tag the same segment to multiple codes when overlap is theoretically justified and drag-and-drop lets them merge or reorganize nodes on the fly.
Query functions in NVivo provide strong support for content analysis. A coding query retrieves all segments across sources linked to selected codes, useful for checking how consistently a concept appears in interviews and policy texts. A matrix coding query crosses two sets of codes—say, error type by hospital department—to create a contingency table that shows count and textual data extracts for each cell. Users can export these matrices to Excel for further statistics or copy them directly into reports.
NVivo’s automated insights include word frequency lists, word clouds, and topic summaries powered by large language models. These tools give a rapid sense of dominant language without replacing manual interpretation. Classification sheets store demographic or organizational metadata that feed directly into crosstab queries, making it easier to compare patterns by participant group or document type.
Teamwork is supported through a merge function for independent projects, but larger teams often choose the Collaboration Cloud service. It manages checkout and check-in so users never overwrite each other’s work. An audit log records project changes, which is valuable during trustworthiness checks. NVivo’s capacity limit is high—thousands of sources and codes—though very large video datasets will require adequate hardware for peak performance.
ATLAS.ti
ATLAS.ti treats a project as a network of linked objects rather than a single container. Documents, codes, memos, quotations, and groups appear in side panels that can be tiled or floated across multiple monitors. This layout suits analysts who prefer to see relationships while working, as links are created with a drag from one object to another. The software supports all common data types and embeds a transcription editor for synchronizing audio or video with timestamps.
The coding process uses a markup model similar to XML: each quotation receives one or more code labels that display in the margin. The code list can be filtered by groups, color, or creation date. Code-by-code comments sit directly under each label, making it easy to store operational definitions alongside examples. ATLAS.ti’s network view allows users to draw link lines among codes, memos, and documents, producing a visual map that doubles as an analytic memo.
For content analysis, the Code Co-occurrence Table and Code-Document Table offer quick counts of how often codes overlap or where they appear. Analysts can set proximity rules—same quotation, same sentence, or within “x” words—to refine these counts. The table exports to Excel or SPSS for additional analysis. A sentiment analysis tool scores positive and negative tone across the dataset, which can guide further category refinement when studying affective language.
Teams benefit from multi-user projects stored on ATLAS.ti Cloud or a network drive, with user-based access control to prevent accidental edits. The comment log lists every change, showing who moved a code or edited a memo. Intercoder agreement is checked through a built-in function that calculates Krippendorff’s α and visualizes agreement by code, helping teams resolve disagreements early.
ATLAS.ti integrates with external reference managers and survey platforms. A QDPX (.qdpx) export ensures interoperability with other CAQDAS programs, so researchers can move projects if institutional licenses change. The pricing model is subscription-based for individuals and perpetual for institutions, offering flexibility for short-term projects. Users often cite its network diagrams and flexible linking as reasons to choose it when relational analysis is central to the study design.
Ready to streamline your qualitative analysis?
Whether you’re coding social media posts, policy docs, or interview transcripts, the right tools make all the difference. Lumivero’s qualitative research software helps you manage, code, and visualize your data without losing the nuance that makes qualitative work so powerful.
Want to see it in action?
Request a free demo with our sales team and find the right solution for your research goals.