Creating a Codebook - where to start?
by Angie Sibley-White, Senior Lecturer in Education, De Montfort University, and Gisela Oliveira, Senior Lecturer in Education, De Montfort University.
This post is part of a short series on the use of codebooks in qualitative research.
A codebook is typically defined as a guide for coding data on a particular qualitative research project. Yet, it can be so much more: it can be a tool to increase consistency in coding by a team of researchers, or a strategy to showcase rigour and process in a PhD project, or even a developmental tool for learning about coding (Oliveira, 2022).
With such a multitude of aims and approaches, it can be confusing to know where to start. So, here is what we suggest:
The first step in creating a codebook is a reflective one. As a researcher, you will need to consider what you want from your codebook. Is it a tool for yourself, or for a team of researchers? Are you interested in intercoder agreement measures, or do you want to keep an audit trail of your coding decisions? In practice, this will influence the sections in your codebook. For example, you might want to include a section where you track when a code was created and then adapted. The most basic codebook structure will include a list of codes, their definitions, and examples from coded data (DeCuir-Gunby, Marshall, and McCulloch, 2011).
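For readers who keep their codebook in a script or spreadsheet export, the basic structure described above (a code, its definition, examples from coded data, and a record of when the code was created and adapted) could be sketched as follows. This is only an illustration: the class name, field names, and the example code "peer support" are hypothetical and are not taken from any of the cited sources.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class CodeEntry:
    """One codebook row: a code, its definition, and example excerpts."""
    name: str
    definition: str
    examples: list[str] = field(default_factory=list)
    # Date the code was first added to the codebook.
    created_on: date = field(default_factory=date.today)
    # Audit trail: (date, note) pairs, one per adaptation of the code.
    revisions: list[tuple[date, str]] = field(default_factory=list)

    def revise(self, new_definition: str, note: str,
               on: Optional[date] = None) -> None:
        """Replace the definition and log when and why it changed."""
        self.revisions.append((on or date.today(), note))
        self.definition = new_definition


# Hypothetical example entry; the wording is invented for illustration.
codebook: dict[str, CodeEntry] = {}
entry = CodeEntry(
    name="peer support",
    definition="Participant describes help received from fellow students.",
    examples=["My course mates explained the assignment to me."],
)
codebook[entry.name] = entry

# Later adaptation of the code, recorded in the audit trail.
entry.revise(
    new_definition=("Participant describes help given to or received "
                    "from fellow students."),
    note="Broadened after the first transcript showed participants giving help.",
)
```

If you only need a tool for yourself, the revisions field can be dropped; if you want an audit trail, it is the part that records each coding decision.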
Once you have established the basic structure for your codebook, you can start thinking about the specific codes. Here, we suggest an approach that combines inductive and deductive coding, as this combination is the most common. However, codebooks are flexible, and you can adapt these steps to suit your project and data set. If you want to learn more about codes, we suggest Saldaña’s (2021) extensive handbook on coding. Returning to your codebook, start by creating a list of relevant deductive codes. These are codes generated from your conceptual framework and literature review, and are also called theory-driven codes. They are a good place to start: because your research questions were guided by the same literature that now informs your deductive codes, you can easily identify a few relevant codes that you expect to find in the data. The resulting list should be short. You only need a few codes to get started.
Then, it’s time to test the codes against your data. You will do this by applying those initial codes to one transcript/data item. While completing this initial coding, other bits of meaningful information will become clear, and sometimes your theory-driven codes will not be applicable. This is when you will find the need to create new, data-driven codes. These are codes that emerge from your data and reflect what your participants shared with you during data collection. If you are using the codebook as an audit trail tool, you can also make some notes on why you created these codes. Don’t restrict yourself at this step – you will revise later.
By this point, you will likely have a very long list of codes, some theory-driven and many data-driven. This is not the end of your codebook, but rather the beginning. Now, you will need to revise this long list into something more manageable. You will need to decide which codes to keep, which ones to adapt, and which ones to eliminate. For example, some codes might be very similar in scope, and you can merge them. At this point, you will start to look at your codes as individual entities that need a clear name, a definition, and some criteria that will help you decide whether a specific quote should be assigned this code. You are starting to create inclusion and exclusion criteria for each code, which should make coding less ambiguous. Do keep in mind that it is how well you complete this step, and how well you apply the criteria to your data, that will make your coding consistent. You will be coding different sources, over what is likely to be an extended period of time, so it is fundamental that your code names, definitions, and inclusion and exclusion criteria are clear and easy to apply.
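As a rough illustration of this revision step, the sketch below merges two overlapping codes into one and combines their inclusion and exclusion criteria without duplicates. The names Code and merge_codes, and the two example codes, are hypothetical; deciding which codes overlap, and what the merged definition should say, remains your judgement as a researcher.

```python
from dataclasses import dataclass, field


@dataclass
class Code:
    """A code with the criteria used to decide whether a quote belongs to it."""
    name: str
    definition: str
    include_when: list[str] = field(default_factory=list)
    exclude_when: list[str] = field(default_factory=list)


def merge_codes(a: Code, b: Code, name: str, definition: str) -> Code:
    """Combine two overlapping codes, keeping each criterion only once."""
    def union(first: list[str], second: list[str]) -> list[str]:
        # dict.fromkeys removes duplicates while preserving order.
        return list(dict.fromkeys(first + second))

    return Code(
        name=name,
        definition=definition,
        include_when=union(a.include_when, b.include_when),
        exclude_when=union(a.exclude_when, b.exclude_when),
    )


# Two invented codes that turned out to be very similar in scope.
lack_of_time = Code(
    name="lack of time",
    definition="Participant says there was too little time for a task.",
    include_when=["Mentions workload or deadlines as an obstacle"],
    exclude_when=["Mentions time without describing it as an obstacle"],
)
time_pressure = Code(
    name="time pressure",
    definition="Participant describes feeling rushed.",
    include_when=["Mentions workload or deadlines as an obstacle",
                  "Describes rushing a task"],
)

merged = merge_codes(
    lack_of_time,
    time_pressure,
    name="time constraints",
    definition="Participant describes limited time as an obstacle.",
)
print(merged.include_when)
```

Writing the criteria as short, testable statements like these is one way to keep them easy to apply across sources and over time.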
It is only when you have revised your codes, created a definition, and listed some inclusion and exclusion criteria that you will have your first codebook. Well done on getting this far! From now on, you should test the codebook against a new transcript and adjust the codebook as needed. With each revision, you might want to add a new code, or revise some that are already there. You might also want to invite a colleague or a team member to use your codebook to code one transcript and then compare notes. Continuously testing your codebook against the data is fundamental to ensure it is fit for purpose – this means that the codebook allows you to capture all that is meaningful in your data set, ready for analysis. However, beware: you can get stuck in an endless loop of revision and fine-tuning. Keep in mind that the codebook is a means to an end, and not an end in itself. So, create a codebook that allows you to extract what is meaningful from your data set in a consistent manner, and move on to writing up your findings.
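If you do ask a colleague to code the same transcript, one way to compare notes is to compute a simple agreement figure. The sketch below calculates percent agreement and Cohen's kappa, a widely used intercoder agreement measure that corrects for agreement expected by chance. This is our choice of measure for illustration only, and it assumes that both coders worked on the same segments in the same order and assigned exactly one code to each segment. The function names and the example codes are hypothetical.

```python
from collections import Counter


def percent_agreement(coder_a: list[str], coder_b: list[str]) -> float:
    """Share of segments to which both coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same non-empty set of segments.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a: list[str], coder_b: list[str]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability both coders pick the same code at random,
    # given how often each coder used each code.
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both coders used a single, identical code throughout.
    return (observed - expected) / (1 - expected)


# Invented codes for six segments of one transcript.
coder_a = ["barrier", "support", "support", "barrier", "support", "barrier"]
coder_b = ["barrier", "support", "barrier", "barrier", "support", "support"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"Percent agreement: {agreement:.2f}, Cohen's kappa: {kappa:.2f}")
```

A low figure is a prompt to revisit the definitions and criteria of the codes where you disagreed, rather than a reason to keep refining indefinitely.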
A final point is that this process is similar whether you are creating a codebook in Word, Excel, or coding software such as NVivo. The medium used is less relevant than the thinking and consistent approach taken to develop the codebook. At the end of this process, your codebook should be user-friendly and simple to follow, with coding guidelines that are easy to implement and that make your process more robust.
References:
DeCuir-Gunby, J., Marshall, P., & McCulloch, A. (2011). Developing and using a codebook for the analysis of interview data: An example from a professional development research project. Field Methods, 23(2), 136–155. https://doi.org/10.1177/1525822X10388468
Oliveira, G. (2022). Developing a codebook for qualitative data analysis: Insights from a study on learning transfer between university and the workplace. International Journal of Research and Method in Education. https://doi.org/10.1080/1743727X.2022.2128745
Saldaña, J. (2021). The coding manual for qualitative researchers (4th ed.). SAGE Publications, Inc.