Heather Estop 9/12/23

The five pitfalls of coding and labeling - and how to avoid them

Whether you call it ‘content analysis’, ‘textual data labeling’, ‘hand-coding’, or ‘tagging’, many more researchers and data science teams are starting annotation projects these days. Learn how to avoid potential pitfalls.

Heather Estop 3/17/21

Emotion and reason in political language

In the day-to-day of political communication, politicians constantly decide how to amplify or constrain emotional expression, in service of signalling policy priorities or persuading colleagues and voters. We propose a new method for quantifying emotionality in politics using the transcribed text of politicians’ speeches. This new approach, described in more detail below, uses computational linguistics tools and can be validated against human judgments of emotionality.

Chris Burnage 9/4/20

The validity problem with automated content analysis

There’s a validity problem with automated content analysis. In this post, Dr. Chung-hong Chan introduces a new tool that provides a set of simple and standardized tests for frequently used text analytic tools and gives examples of validity tests you can apply to your research right away.

Chris Burnage 8/5/20

My journey into text mining

My journey into text mining started when the institute of Digital Humanities (DH) at the University of Leipzig invited students from other disciplines to take part in their introductory course. I was enrolled in a sociology degree at the time, and this component of data science was not part of the classic curriculum; however, I could explore other departments through course electives and the DH course sounded like the perfect fit.

Amy Sparrow 7/10/20

How to embrace text analysis as a computational social scientist

In this guest blog, Alix Dumoulin and Regina Catipon discuss how social scientists can embrace text analysis, describe the challenges that cleaning text corpora poses during preprocessing, and introduce our upcoming tool, Texti, that will save researchers time.

Chris Burnage 1/15/20

From preprocessing to text analysis: 80 tools for mining unstructured data

Text mining techniques have become critical for social scientists working with large-scale social data, be it Twitter collections to track polarization, party documents to understand opinions and ideology, or news corpora to study the spread of misinformation. In the infographic shown in this blog, we identify more than 80 different apps, software packages, and libraries for R, Python and MATLAB that social science researchers use at different stages of their text analysis projects. We focused almost entirely on statistical, quantitative and computational analysis of text, although some of these tools could also be used to explore texts for qualitative purposes.

Chris Burnage 11/28/19

What does it mean to anonymize text?

Text data are a resource that we are only beginning to understand. Many human interactions are moving to the digital world, and we are becoming increasingly sophisticated in documenting them. Face-to-face encounters are being replaced by written communication (e.g., WhatsApp, Twitter), and every crime incident or hospital visit is recorded. All of these interactions leave a trace in the form of text data.

Heather Estop 2/14/19

Roundup: #text2data - new ways of reading

‘From text to data - new ways of reading’ was a 2-day event organised by the National Library of Sweden, the National Archives and Swe-Clarin. The conference brought together librarians, digital collection curators, and scholars in digital humanities and computational social science to talk about the tools and challenges involved in large-scale text collection and analysis.

11/10/09

Using text analysis of unstructured data to provide real insights


Sage Research Methods Community