Unlocking crime data for research: An update from 2019's SAGE Concept Grant winners
Text Wash uses machine learning and natural language processing to unlock previously untapped crime data that has so far been inaccessible to researchers because the personally identifiable information it contains must first be anonymized.
There is a growing body of academic research looking at all aspects of emoji usage
If you have a mobile phone made in the last eight years, or if you've used social media, you're likely familiar with emoji. The colorful icons, first available in Japan in the 1990s, are ubiquitous and an increasingly common part of our online lives. They have all but replaced emoticons, their punctuation-based precursors, though kaomoji (more detailed emoticons, originating in Japan) such as ᕕ( ᐛ )ᕗ still enjoy popularity in some corners of the internet. Perhaps the most compelling example of emoji popularity was the "face with tears of joy" emoji 😂 being selected as the Oxford Dictionaries Word of the Year in 2015 - a fact you will find in the introduction of many academic papers on the topic.
Making sensitive text data accessible for computational social science
Text is everywhere, and everything is text. More textual data than ever before are available to computational social scientists, be it in the form of digitized books, communication traces on social media platforms, or digital scientific articles. Researchers in academia and industry increasingly use text data to understand human behavior and to measure patterns in language. Techniques from natural language processing have created a fertile soil to perform these tasks and to make inferences based on text data on a large scale.
No more tradeoffs: The era of big data content analysis has come
For centuries, being a scientist has meant learning to live with limited data. People only share so much on a survey form. Experiments don't account for all the conditions of real world situations. Field research and interviews can only be generalized so far. Network analyses don't tell us everything we want to know about the ties among people. And text/content/document analysis methods allow us to dive deep into a small set of documents, or they give us a shallow understanding of a larger archive. Never both. So far, the truly great scientists have had to apply many of these approaches to help us better see the world through their kaleidoscope of imperfect lenses.