Teaching data analysis? Practical resources for new researchers

By Janet Salmons, PhD, Research Community Manager for Methodspace


What does it all mean? Analysis and interpretation of data are difficult practices to learn. These open-access articles from SAGE journals offer practical how-to steps for new researchers.

Qualitative analysis

Fereday, J., & Muir-Cochrane, E. (2006). Demonstrating Rigor Using Thematic Analysis: A Hybrid Approach of Inductive and Deductive Coding and Theme Development. International Journal of Qualitative Methods, 5(1), 80–92. https://doi.org/10.1177/160940690600500107

Abstract. In this article, the authors describe how they used a hybrid process of inductive and deductive thematic analysis to interpret raw data in a doctoral study on the role of performance feedback in the self-assessment of nursing practice. The methodological approach integrated data-driven codes with theory-driven ones based on the tenets of social phenomenology. The authors present a detailed exemplar of the staged process of data coding and identification of themes. This process demonstrates how analysis of the raw data from interview transcripts and organizational documents progressed toward the identification of overarching themes that captured the phenomenon of performance feedback as described by participants in the study.
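
The staged coding the authors describe is interpretive work, but the mechanics of a first deductive pass can be illustrated in code. Below is a minimal sketch, not the authors' procedure: the codebook, code names, and transcript segments are invented, and real thematic coding relies on analyst judgment rather than keyword matching.

```python
# Minimal sketch of a hybrid coding pass: theory-driven (deductive)
# codes are applied first via a codebook; segments left uncoded are
# queued for inductive, data-driven coding. All data are invented.

codebook = {  # deductive codes (invented for illustration)
    "feedback_source": ["supervisor", "peer", "patient"],
    "self_assessment": ["reflect", "journal", "self-rate"],
}

segments = [
    "My supervisor gave me feedback after the shift.",
    "I keep a journal to reflect on my practice.",
    "The ward was short-staffed that week.",
]

coded, uncoded = [], []
for seg in segments:
    hits = [code for code, terms in codebook.items()
            if any(term in seg.lower() for term in terms)]
    (coded if hits else uncoded).append((seg, hits))

print("Deductively coded:", coded)
print("Queued for inductive coding:", uncoded)
```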

Miyaoka, A., Decker-Woodrow, L., Hartman, N., Booker, B., & Ottmar, E. (2023). Emergent Coding and Topic Modeling: A Comparison of Two Qualitative Analysis Methods on Teacher Focus Group Data. International Journal of Qualitative Methods, 22. https://doi.org/10.1177/16094069231165950

Abstract. More than ever in the past, researchers have access to broad, educationally relevant text data from sources such as literature databases (e.g., ERIC), open-ended responses from online courses/surveys, online discussion forums, digital essays, and social media. These advances in data availability can dramatically increase the possibilities for discovering new patterns in the data and testing new theories through processing texts with emerging analytic techniques. In our study, we extended the application of Topic Modeling (TM) to data collected from focus groups within the context of a larger study. Specifically, we compared the results of emergent qualitative coding and TM. We found a high level of agreement between TM and emergent qualitative coding, suggesting TM is a viable method for coding focus group data when augmenting and validating manual qualitative coding. We also found that TM was ineffective in capturing the more nuanced information that the qualitative coding was able to identify. This can be explained by two factors: (1) the word-level tokenization we used in the study, and (2) variations in the terminology teachers used to identify the different technologies. Recommendations include additional data cleaning steps researchers should take and specifications within the topic modeling code when using topic modeling to analyze focus group data.
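
For readers who want to try the TM side of this comparison, a minimal sketch using scikit-learn is below. The utterances, topic count, and vectorizer settings are assumptions for illustration, not the authors' pipeline; note that CountVectorizer performs the word-level tokenization whose limitations the abstract discusses.

```python
# Minimal topic-modeling sketch with word-level tokenization,
# in the spirit of the comparison above. Utterances are invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

utterances = [
    "The tablet app kept students engaged with the math problems.",
    "Students dragged the algebra terms around on the screen.",
    "Our projector failed, so we worked problems on the whiteboard.",
    "The whiteboard activities felt slower than the app sessions.",
]

# Word-level tokenization; extra cleaning (stop words, variant
# technology names) is where the abstract's caveats apply.
vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(utterances)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term)

terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top = [terms[j] for j in weights.argsort()[-5:][::-1]]
    print(f"Topic {i}: {top}")
```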

Nowell, L. S., Norris, J. M., White, D. E., & Moules, N. J. (2017). Thematic Analysis: Striving to Meet the Trustworthiness Criteria. International Journal of Qualitative Methods. https://doi.org/10.1177/1609406917733847

Abstract. As qualitative research becomes increasingly recognized and valued, it is imperative that it is conducted in a rigorous and methodical manner to yield meaningful and useful results. To be accepted as trustworthy, qualitative researchers must demonstrate that data analysis has been conducted in a precise, consistent, and exhaustive manner through recording, systematizing, and disclosing the methods of analysis with enough detail to enable the reader to determine whether the process is credible. Although there are numerous examples of how to conduct qualitative research, few sophisticated tools are available to researchers for conducting a rigorous and relevant thematic analysis. The purpose of this article is to guide researchers using thematic analysis as a research method. We offer personal insights and practical examples, while exploring issues of rigor and trustworthiness. The process of conducting a thematic analysis is illustrated through the presentation of an auditable decision trail, guiding interpreting and representing textual data. We detail our step-by-step approach to exploring the effectiveness of strategic clinical networks in Alberta, Canada, in our mixed methods case study. This article contributes a purposeful approach to thematic analysis in order to systematize and increase the traceability and verification of the analysis.

Rix, J., Carrizosa, H. G., Sheehy, K., Seale, J., & Hayhoe, S. (2020). Taking risks to enable participatory data analysis and dissemination: a research note. Qualitative Research. https://doi.org/10.1177/1468794120965356

Abstract. The involvement of all participants within all aspects of the research process is a well-established challenge for participatory research. This is particularly evident in relation to data analysis and dissemination. A novel way of understanding and approaching this challenge emerged through a large-scale international, 3-year participatory research project involving over 200 disabled people. This approach enabled people to be involved at all stages of the research in a manner that was collectively recognised to be participatory and also delivered high-quality findings. At the heart of this emergent approach to participatory research is an engagement with risk. This research note explores the types of risks involved in delivering research that seeks to be authentically participatory.

Saldaña, J. (2018). Researcher, Analyze Thyself. International Journal of Qualitative Methods. https://doi.org/10.1177/1609406918801717

Abstract. This article attempts to answer the phenomenological question, “What does it mean to be a qualitative researcher?” and an ancillary question, “What does ‘making meaning’ mean?” The author, in collaboration with selected participants at the 2018 The Qualitative Report and the International Institute for Qualitative Methodology’s Qualitative Research Methods conferences, proposes that research is devotion. Three major categories or components of devotion are purpose (personal and professional validation), belonging (communal grounding), and meaning (an enriched life). Ten subcategories or “elements of style” as qualitative researchers include meticulous vigilance of details, unyielding resiliency, visionary reinvention, social savvy, humble vulnerability, representational responsibility, finding your methodological tribes, emotional immersion, gifting your ideas, and knowing and understanding yourself.

Srivastava, P., & Hopwood, N. (2009). A Practical Iterative Framework for Qualitative Data Analysis. International Journal of Qualitative Methods, 8(1), 76–84. https://doi.org/10.1177/160940690900800107

Abstract. The role of iteration in qualitative data analysis, not as a repetitive mechanical task but as a reflexive process, is key to sparking insight and developing meaning. In this paper the authors present a simple framework for qualitative data analysis comprising three iterative questions. The authors developed it to analyze qualitative data and to engage with the process of continuous meaning-making and progressive focusing inherent to analysis processes. They briefly present the framework and locate it within a more general discussion on analytic reflexivity. They then highlight its usefulness, particularly for newer researchers, by showing practical applications of the framework in two very different studies.

Quantitative analysis

Connelly, R., Gayle, V., & Lambert, P. S. (2016). Statistical modelling of key variables in social survey data analysis. Methodological Innovations. https://doi.org/10.1177/2059799116638002

Abstract. The application of statistical modelling techniques has become a cornerstone of analyses of large-scale social survey data. Bringing this special section on key variables to a close, this final article discusses several important issues relating to the inclusion of key variables in statistical modelling analyses. We outline two, often neglected, issues that are relevant to a great many applications of statistical models based upon social survey data. The first is known as the reference category problem and is related to the interpretation of categorical explanatory variables. The second is the interpretation and comparison of the effects from models for non-linear outcomes. We then briefly discuss other common complexities in using statistical models for social science research; these include the non-linear transformation of variables, and considerations of intersectionality and interaction effects. We conclude by emphasising the importance of two, often overlooked, elements of the social survey data analysis process, sensitivity analysis and documentation for replication. We argue that more attention should routinely be devoted to these issues.
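
The reference category problem the authors highlight can be made concrete in a few lines. The sketch below uses Python's statsmodels on invented data; it illustrates the general issue, not the authors' analysis. Re-fitting with a different baseline leaves the model's fit unchanged but alters every coefficient a reader would interpret.

```python
# Sketch of the reference category problem: coefficients for a
# categorical predictor are contrasts against an omitted baseline,
# so the choice of baseline changes what each coefficient "says".
# Data are invented for illustration.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "income": [31, 35, 42, 45, 52, 58, 40, 44],
    "educ":   ["basic", "basic", "secondary", "secondary",
               "degree", "degree", "secondary", "secondary"],
})

# Default: the first category alphabetically ("basic") is the reference.
m1 = smf.ols("income ~ C(educ)", data=df).fit()
# Same model with "degree" as the reference; all contrasts shift.
m2 = smf.ols("income ~ C(educ, Treatment(reference='degree'))",
             data=df).fit()

print(m1.params)
print(m2.params)
```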

Koran, J., & Jaffari, F. (2020). Deletion statistic accuracy in confirmatory factor models. Methodological Innovations. https://doi.org/10.1177/2059799120918349

Abstract. Social science researchers now routinely use confirmatory factor models in scale development and validation studies. Methodologists have known for some time that the results of fitting a confirmatory factor model can be unduly influenced by one or a few cases in the data. However, there has been little development and use of case diagnostics for identifying influential cases with confirmatory factor models. A few case deletion statistics have been proposed to identify influential cases in confirmatory factor models. However, these statistics have not been systematically evaluated or compared for their accuracy. This study evaluated the accuracy of three case deletion statistics found in the R package influence.SEM. The accuracy of the case deletion statistics was also compared to Mahalanobis distance, which is commonly used to screen for unusual cases in multivariate applications. A statistical simulation was used to compare the accuracy of the statistics in identifying target cases generated from a model in which variables were uncorrelated. The results showed that Likelihood distance and generalized Cook’s distance detected the target cases more effectively than the Chi-square difference statistic. The accuracy of the Likelihood distance and generalized Cook’s distance statistics was unaffected by model misspecification. The results of this study suggest that Likelihood distance and generalized Cook’s distance are more accurate under more varied conditions in identifying target cases in confirmatory factor models.
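
influence.SEM is an R package, and its deletion statistics require refitting the factor model with each case removed, which is beyond a short sketch. The Mahalanobis distance benchmark used in the comparison, however, is straightforward to compute directly; the sketch below uses NumPy/SciPy on simulated data with one planted outlier (all values invented).

```python
# Mahalanobis-distance screening for unusual multivariate cases --
# the benchmark the study compares deletion statistics against.
# Data are simulated; this is not the influence.SEM computation.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0, 0], np.eye(3), size=200)
X[0] = [4.0, -4.0, 4.0]  # plant one unusual case

mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
diff = X - mu
d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)  # squared distances

# Flag cases beyond the chi-square 99.9th percentile (3 df).
cutoff = chi2.ppf(0.999, df=3)
print("Flagged cases:", np.where(d2 > cutoff)[0])
```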

Lee, F., & Björklund Larsen, L. (2019). How should we theorize algorithms? Five ideal types in analyzing algorithmic normativities. Big Data & Society. https://doi.org/10.1177/2053951719867349

Abstract. The power of algorithms has become a familiar topic in society, media, and the social sciences. It is increasingly common to argue that, for instance, algorithms automate inequality, that they are biased black boxes that reproduce racism, or that they control our money and information. Implicit in many of these discussions is that algorithms are permeated with normativities, and that these normativities shape society. The aim of this editorial is double: First, it contributes to a more nuanced discussion about algorithms by discussing how we, as social scientists, think about algorithms in relation to five theoretical ideal types. For instance, what does it mean to go under the hood of the algorithm and what does it mean to stay above it? Second, it introduces the contributions to this special theme by situating them in relation to these five ideal types. By doing this, the editorial aims to contribute to an increased analytical awareness of how algorithms are theorized in society and culture. The articles in the special theme deal with algorithms in different settings, ranging from farming, schools, and self-tracking to AIDS, nuclear power plants, and surveillance. The contributions thus explore, both theoretically and empirically, different settings where algorithms are intertwined with normativities.

Venturini, T., Jacomy, M., & Jensen, P. (2021). What do we see when we look at networks: Visual network analysis, relational ambiguity, and force-directed layouts. Big Data & Society. https://doi.org/10.1177/20539517211018488

Abstract. It is increasingly common in natural and social sciences to rely on network visualizations to explore relational datasets and illustrate findings. Such practices have been around long enough to prove that scholars find it useful to project networks in a two-dimensional space and to use their visual qualities as proxies for their topological features. Yet these practices remain based on intuition, and the foundations and limits of this type of exploration are still implicit. To fill this lack of formalization, this paper offers explicit documentation for the kind of visual network analysis encouraged by force-directed layouts. Using the example of a network of Jazz performers, bands, and record labels extracted from Wikipedia, the paper provides guidelines on how to make networks readable and how to interpret their visual features. It discusses how the inherent ambiguity of network visualizations can be exploited for exploratory data analysis. Acknowledging that vagueness is a feature of many relational datasets in the humanities and social sciences, the paper contends that visual ambiguity, if properly interpreted, can be an asset for the analysis. Finally, we propose two attempts to distinguish the ambiguity inherited from the represented phenomenon from the distortions coming from fitting a multidimensional object in a two-dimensional space. We discuss why these attempts are only partially successful, and we propose further steps towards a metric of spatialization quality.
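
Force-directed layouts of the kind the paper documents are built into standard network libraries. Below is a minimal sketch using networkx's spring (Fruchterman-Reingold) layout; the toy graph's nodes are invented and only loosely echo the paper's Jazz example, which is not reproduced here.

```python
# Minimal force-directed layout sketch: nodes that share many edges
# are pulled together, so spatial proximity becomes a visual proxy
# for topological closeness. The toy graph is invented.
import networkx as nx
import matplotlib.pyplot as plt

G = nx.Graph()
G.add_edges_from([
    ("Miles", "Coltrane"), ("Miles", "Evans"), ("Coltrane", "Evans"),
    ("Miles", "Blue Note"), ("Monk", "Blue Note"), ("Monk", "Coltrane"),
])

# Fruchterman-Reingold force-directed placement; the seed fixes the
# otherwise arbitrary rotation/reflection of the layout.
pos = nx.spring_layout(G, seed=42)

nx.draw_networkx(G, pos, node_color="lightsteelblue")
plt.axis("off")
plt.show()
```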

Qualitative and quantitative: Mixed methods

de Block, D., & Vis, B. (2019). Addressing the Challenges Related to Transforming Qualitative Into Quantitative Data in Qualitative Comparative Analysis. Journal of Mixed Methods Research, 13(4), 503–535. https://doi.org/10.1177/1558689818770061

Abstract. The use of qualitative data has so far received relatively little attention in methodological discussions on qualitative comparative analysis (QCA). This article addresses this lacuna by discussing the challenges researchers face when transforming qualitative data into quantitative data in QCA. By reviewing 29 empirical studies using qualitative data for QCA, we explore common practices related to data calibration, data presentation, and sensitivity testing. Based on these three issues, we provide considerations when using qualitative data for QCA, which are relevant both for QCA scholars working with qualitative data and the wider mixed methods research community involved in quantitizing.
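
The calibration challenge the authors review, converting qualitative judgments into set-membership scores, is commonly handled with the direct method, which maps raw values through three researcher-chosen anchors. The sketch below implements the standard log-odds version with invented anchors and scores; it illustrates the technique in general, not the procedures of the 29 reviewed studies.

```python
# Direct-method calibration sketch: raw scores are mapped to fuzzy-set
# membership (0..1) via three anchors -- full non-membership, the
# crossover point, and full membership -- using the common log-odds
# convention (anchors correspond to log-odds -3, 0, +3).
# Anchors and raw scores are invented.
import math

def calibrate(x, non_member, crossover, full_member):
    if x >= crossover:
        log_odds = 3.0 * (x - crossover) / (full_member - crossover)
    else:
        log_odds = 3.0 * (x - crossover) / (crossover - non_member)
    return 1.0 / (1.0 + math.exp(-log_odds))

raw_scores = [1.0, 3.5, 5.0, 6.5, 9.0]  # e.g. rated strength of a condition
for x in raw_scores:
    membership = calibrate(x, non_member=2.0, crossover=5.0, full_member=8.0)
    print(x, round(membership, 3))
```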

Hellström, J. (2011). Conditional Hypotheses in Comparative Social Science: Mixed-Method Approaches to Middle-Sized Data Analysis. Methodological Innovations Online, 6(2), 71–102. https://doi.org/10.4256/mio.2010.0036

Abstract. This paper discusses under which circumstances and how configurational comparative methods (i.e. QCA) and statistical methods can be combined to provide tests for the 'quasi'-sufficiency of any given set of combinations of causal conditions. When combined, QCA provides the ability to explore causal substitutability (i.e. multiple paths to a given outcome) and the ways in which multiple causes interact with one another to produce effects, while the statistical elements can provide robust indications of the probable validity of postulated hypotheses. The potential utility of the mixed-method approach for analyzing political phenomena is demonstrated by applying it to cross-national data regarding party positions on European integration and party-based Euroscepticism in Western Europe. The findings show that oppositional stances to European integration are partly associated with non-governmental ideological fringe parties on both the left and right. The empirical example presented in this paper demonstrates that configurational methods can be successfully combined with statistical methods and supplement the QCA framework by providing statistical tests of 'almost sufficient' claims. However, combining QCA with statistical methods can sometimes be problematic in middle-sized data analysis, especially as the latter usually cannot handle limited diversity (i.e. insufficient information) in the data and/or overly complex relationships (i.e. having a large number of conjunctural conditions or interacting variables).

Moseholm, E., & Fetters, M. D. (2017). Conceptual models to guide integration during analysis in convergent mixed methods studies. Methodological Innovations. https://doi.org/10.1177/2059799117703118

Abstract. Methodologists have offered general strategies for integration in mixed-methods studies through merging of quantitative and qualitative data. While these strategies provide researchers in the field general guidance on how to integrate data during mixed-methods analysis, a methodological typology detailing specific analytic frameworks has been lacking. The purpose of this article is to introduce a typology of analytical approaches for mixed-methods data integration in mixed-methods convergent studies. We distinguish three dimensions of data merging analytics: (1) the relational dimension, (2) the methodological dimension, and (3) the directional dimension. Five different frameworks for data merging relative to the methodological and directional dimension in convergent mixed-methods studies are described: (1) the explanatory unidirectional approach, (2) the exploratory unidirectional approach, (3) the simultaneous bidirectional approach, (4) the explanatory bidirectional approach, and (5) the exploratory bidirectional approach. Examples from empirical studies are used to illustrate each type. Researchers can use this typology to inform and articulate their analytical approach during the design, implementation, and reporting phases to convey clearly how an integrated approach to data merging occurred.


More Methodspace Posts about Data Analysis

Previous post: Teaching and Learning Excel for Data Analysis

Next post: From codes to theory-building