Pros and cons of using a spreadsheet as a literature tracker
The literature review for your thesis or dissertation requires keeping track of sources, their important points and their links to each other – for hundreds of articles, books, and other references. So, it’s no surprise grad students frequently post questions like the following on Twitter:
Many university guides on the subject recommend creating a synthesis matrix for keeping track of sources, ideas, and quotations. Traditionally, this matrix was often created as a Word document, and you’ll still find many templates available online. However, more and more academics now seem to be using spreadsheets instead. The spreadsheet method is popular, too: one recent Twitter post on an Excel setup garnered over 1,500 likes and 272 retweets and comments from other academics.
Why all this enthusiasm for a decidedly old-school piece of software that most people associate with accounting?
This blog post will help answer that question by taking a look at the advantages of using Excel and some specific variations (including the popular one mentioned above) that you can use to create your own system. It will then address some of the disadvantages and consider how the reference management program Citavi could be a better alternative in some cases.
Advantages of using Excel
Proponents of the Excel approach are quick to tout the many benefits. First, there’s no need to pay for a new piece of software, since if you already have Microsoft Office installed on your computer, you also already have Excel. Otherwise, you can also use Google Sheets which has all the options you might need.
Then, there’s the simplicity and flexibility of using a spreadsheet. Set up time is pretty low. You simply create a few columns and can get started using your literature tracking system in a matter of minutes. Especially if you’ve learned how to use Excel in the past, it will also feel more comfortable than opening up a new program.
Another benefit is how easily customizable the solution is – you can make the categories be exactly what you want rather than having to fit them into the structures imposed by a software program. Need a column to track the location of a study or a specific intervention? You just need to add it. Even though Excel can get complicated if you set up formulas or other customizations, for a literature review spreadsheet you usually can just use it as a simple table.
So far, the advantages listed apply to Word as well, but Excel has one crucial advantage: it lets you search, sort, and filter. Have a vague recollection of a note you wrote but only remember one term you used in it? Use Excel’s “Find” feature. Want to sort all your notes by year of publication of your source? Nothing could be easier than sorting your “year” column in ascending order. Want to find clinical trials with female participants with a statistically significant intervention? If you set up your Excel sheet as described below under “Version 2” such combinations of queries are possible.
So, with all these advantages, how does the Excel method work in practice?
Version 1: Simple but effective
When you search for “Excel literature review”, Dr. Elaine Gregersen’s 2016 blog post “How I use Excel to manage my Literature Review” about her personal literature tracking system is one of the first results to pop up. It’s an approach that’s still often praised in discussion threads about Excel literature tracking methods. In her own words, it’s a simple approach, but that’s what makes it work. She uses one sheet only and includes columns for basic citation information, keywords, objectives, methods, and conclusions. In addition, she adds in four personalized categories: happy thoughts, unhappy thoughts, her own ethical concerns, and the author’s ethical concerns. These last two columns perfectly align with her field of Autoethnography. The happy thoughts column is for notes, such as how findings relate to her own work, while the unhappy thoughts column is for times when she disagrees with an author, among other uses.
Dr. Raul Pacheco uses a similar one-sheet method, which he calls the Conceptual Synthesis Excel Dump (CSED) technique since he tosses in any literature he might be using for analysis. His setup overlaps in some ways with Gregersen’s but has a few differences; he has columns for the concept (i.e. theme), citation, main idea, three columns for notes (which function similarly to Gregersen’s happy and unhappy thoughts), cross-references, quotes, and page numbers. An example of how this looks in practice can be found here.
Personally, I like the dedicated column for quotations to help separate out the authors’ exact words from one’s analysis of them or the article as a whole. This can help you inadvertently misrepresent an author’s ideas as your own when you’re later writing your literature review.
Taking the models laid out by Gregersen and Pacheco as a jumping off point, it’s easy to make some tweaks for even better usability for your own projects. Obviously, you’ll want to create columns that fit your needs. Instead of a column “main theme” you might have several “key takeaways” columns. Or a highly-personal column for how each article relates to your own work. For example, you might include only the author names and year of publication for an article rather than the full citation (in which case we’d highly recommend saving the full details in a reference management program!). Some people might want to copy the abstract the authors provide, while some will choose to write their own summaries. You can add “notes” columns or distinguish between paraphrases, comments, and direct quotations. Beyond that there are a lot of other small things you can do to make your spreadsheet work better for you, such as linking from a citation to the actual PDF, adding comments to cells, or adding drop-down lists to make data entry easier.
Version 2: Advanced setup options
If you want to take your basic Excel spreadsheet up a notch, you can do so in several ways. For one, you can make use of multiple sheets in the same workbook. Dr. Kathleen Clarke describes her method which involves a major spreadsheet for tracking all the high-level information about a source along with minor spreadsheets which are more granular. She describes her method as a mix between Gregersen’s and Pacheco’s, but she also includes additional sheets on different but related topics and for studies she wants to read later on. One other notable addition is the use of a numbering system for her sources which corresponds to the article file names on her computer.
The other way you can improve upon the basic models is to increase your Excel workbook’s usability. If you want to turn your spreadsheet into a visually appealing filtering machine, I suggest watching this recorded webinar by PhD student Stephen McQuilliam. The example below is based on his method and was created in about 15 minutes using information from the Citavi example project:
Basically, you use the Excel slicer feature to set up a row of filter buttons above your entries that is color coded to match the headers in your table below. Including the year as a banner has the added advantage of letting you see the range of years of publication you currently have. When you want to filter, you just click a button – no complicated Excel formulas needed. To combine filters, for example “year” and “keyword” you can simply hold down the “Ctrl” key to select an additional option
Some general tips
While there’s a lot of freedom in how you set up your Excel files, there are still some best practices you’ll likely want to follow. First, you should set up your table so that headers are marked as such. This way they won’t be sorted along with the other cells if you sort the column by A-Z, for example. Also, you’ll want to apply word wrap formatting to cells to keep content from spilling over into neighboring empty cells. This just keeps everything looking a lot tidier and makes it easier to skim through. Another handy option recommended by McQuilliam is to set up endless scrolling which keeps your column headers visible, even when you’re adding entries at the bottom of your list.
The columns you include are more or less up to you, but you’ll need a column for source information for sure to avoid inadvertent plagiarism or having to hunt down sources later on. In addition, a year column is invaluable for sorting your literature chronologically in preparation for writing your lit review. To keep track of how authors build upon and discuss each other’s work, a cross-references column can also be helpful. I would also suggest making it very clear which analysis and thoughts are your own and which are those of your author. For this reason, I like Pacheco’s “quotes” column being separated from his three “notes” columns.
If you’re planning on using filter features later on to search by study type, keyword, or some other criteria you’ll need to use controlled vocabulary, i.e. each concept should be referred to by a single term rather than using a bunch of different synonyms. You can define this at the start in a key on a separate sheet of your Excel workbook so that you can easily refer to it as needed. Each time you decide to add new terms, just add them to your key.
Disadvantages of the Excel approach
It’s hard to argue with the advantages of ease, simplicity, and flexibility that the Excel method gives you. But, there are still some big downsides to consider.
First, you have to set everything up yourself – it’s not already set up for you in a way that should fit most workflows. If you try something and later decide to take a different approach, you may need to go back and add in additional information for many sources you already examined.
Although search, filtering, and sorting options in Excel are much better than they would be in a Word table, the program is still a spreadsheet at heart which means that it’s “flatter” than a database. In other words, it’s less relational which makes it difficult to create complex search strings to get a subset of items that fit multiple criteria or that use more complicated search techniques such as Boolean logic or wildcards.
Another drawback is that the Excel approach involves a lot of manual entry. While some amount of manual work will always be necessary, for example, when you type up your comments or key takeaways, you won’t be able to directly extract information from PDFs (such as direct quotes or images) without using an additional PDF reader. Moreover, there are no time-saving automation options for adding source information that you might be accustomed to from your reference manager.
Speaking of reference managers, in many of the Twitter discussions around the Excel note-taking approach, there will always be a few comments asking why the person didn’t consider using their referencing software for their notes. Many proponents of the Excel approach stress that they do indeed use a reference management program to keep track of their source information but that they prefer to keep their notes and analysis in a separate Excel file. One of the reasons is that even though many reference management programs let you group references into folders and tag them with specific terms, they don’t let you easily keep track of and categorize notes on a particular source. You basically get a single notes field and that’s it. No way to categorize, group, or tag the note itself, just the source as a whole.
Of course, there is one reference management program that has been designed with this type of knowledge organization in mind and that may, in many cases, be a better alternative to the Excel method: Citavi.
Citavi lets you automatically add source information for most journal articles. Then, you can read PDFs and save notes directly in the program. The notes – which can be summaries, indirect quotations, direct quotations and comments – are always linked directly to their location in the PDF, so if you ever have to look up the context for one of your own comments or a direct quotation again, one click takes you directly to where you need to go. Page numbers are saved automatically, as long as the PDF metadata includes that information. Otherwise, you just need to enter a page number for an article with the first “knowledge item” you save for it. Citavi will then add all the rest automatically.
Although the knowledge item types are pre-defined, the many options will fit most needs, and you can also always use either the keywords, categories, or the core statement field to designate the type of note you are adding if you want more customization. Any terms you use can later be searched or used as filters (more on that below). In addition, for the reference as a whole you also have pre-defined fields for keywords, groups, evaluations, abstracts, notes, and cross-references. This lets you classify at both the reference and note level, so, if you want, you can assign different categories or keywords for a source as a whole and for a statement you find in it. If you need additional source fields, there are 9 custom fields which you can rename and format with drop-down options.
Where Citavi really shines against Excel are in its search features and integration with Word. You can create and save complex searches that combine or exclude certain terms, keywords, categories, note type, year, etc. You can make use of advanced search syntax, such as boolean operators, wildcards, and regular expressions. You can rate sources and filter by rating. And, you have full-text search across all of your PDFs. You can also view project statistics at a glance or use an add-on to do an analysis by author or another criteria.
But the best part is that all of this information can be taken directly over to Word. You have all the analysis and quotes you’ve saved in a panel at the left and can just click to insert what you need. Citavi will insert the correct citation formatting and add an entry to your bibliography at the end. If you added your notes to an outline in Citavi, you can use the “Chapter” view to focus on what you need for a particular section. And, if you ever need to double-check the context for a direct quotation or your own paraphrase, you can click a link symbol to jump back to the exact spot in the PDF that you referred to.
If you do need to at some point export your reference information in table format for an appendix in your dissertation (for example, as documentation of the exclusion process for a systematic review), doing so just requires a few clicks. If you’ve previously worked with Excel and want to try out Citavi, importing is just as easy, and you can of course import all of your existing notes as knowledge items.
Last but certainly not least, if you use Citavi, you have the benefit of working with one tool instead of needing to juggle an Excel spreadsheet, a reference management program, and a PDF annotation tool or PDF reader.
We think it’s a no-brainer to use Citavi instead of Excel or Google Sheets to keep track of your reading for a literature review – but then again, we might be ever so slightly biased. What do you think?
We’d love to hear about your preferred method for keeping track of information for a literature review. Just leave a comment on our Facebook post for this blog article!
About Jennifer Schultz
Jennifer Schultz is the sole American team member at Citavi, but her colleagues don’t hold that against her (usually). Supporting research interests her so much that she got a degree in it, but she also likes learning difficult languages, being out in nature, and having her nose in a book.