Leveraging Data Science to Address Important Questions
This blog is part of a 3-year ongoing series “The Future of Computational Social Science is Black” about SICSS-Howard/ Mathematica, the first Summer Institute in Computational Social Science held at a Historically Black College or University. To learn more about SICSS-H/M’s inaugural start, read the 2021 blog “Welcome SICSS-Howard/ Mathematica 2021” or our first blog “Uncovering new keys to countering anti-Black racism and inequity using computational social science.” If you are interested in applying to participate in SICSS-H/M 2024, check out our website.
Acknowledging the Integrative Power of Data Science
A new generation of students is being equipped with applied data science techniques. How do we go about training them to answer important questions, thus leveraging these techniques to develop explanations and solutions for the social problems of our time? Programs in universities across America are dedicated to tilling the soil of the dynamic, multi-disciplinary field of data science. Every day, data are becoming more available and more integrable, expanding our capacity to address vital questions in our society.
Perhaps attending to the above query first requires examining our understanding of what is “important.” As a graduate student training in political science at the University of California, Berkeley, I’m grateful for exposure to the range of sub-fields in my discipline, as our study of global governance often occupies us with different levels of analysis.
Consider that researchers desiring to tackle big questions regarding democratic accountability to public opinion in American politics may find themselves making use of large-N surveys or administrative records to identify patterns and discrepancies in the priorities and responsiveness of legislatures. Alternatively, those interested in understanding the incentives of political elites may find themselves more occupied with detailing the development of the configurations of political institutions. Overall, both are convinced that their inquiries offer a bright future for generative questions and research agendas in the study of politics.
Clearly, specialization has its place. Without it, accountability and scientific rigor would suffer. Nonetheless, silos naturally arise, often slowing the dissemination of ideas and projects across sub-communities and limiting our collective capacity to translate generated knowledge to the public.
Perhaps that’s a core reason why the future of data science, and programs like SICSS-Howard/ Mathematica, may have more significant implications than we realize. As articulated by our phenomenal program leads this summer, the space of computational social science is cross-cutting and includes researchers from settings ranging from academia to non-profits. Moving toward the establishment of best practices helps to facilitate accountability across sub-fields and disciplines, thus overcoming the aforementioned silo problem. The capacity we have to build multi-disciplinary research agendas expands in parallel with the development of our techniques.
Training Up the Next Generation of Data Scientists
There are multiple axes of “importance” to address as the field of data science advances. One of them deals with the problems identified above, while another is more concerned with the gap that often exists between historically underrepresented communities and academic scholarship. For example, although it is broadly understood that a first-order function of states is the provision of security as a public good, persistently and comparatively high levels of urban violence in the American context have not been a primary concern for scholars of American politics, with a handful of notable exceptions. As a researcher in the space of urban violence and democratic accountability, I’ve observed that we need a diversity of experiences represented in the research spaces we develop to ask questions deemed “important” to a variety of people groups.
Observing this imbalance in existing scholarship is not unique to my experience. By far, one of the most rewarding parts of SICSS-Howard/ Mathematica this summer was the ability to engage with emerging scholars across various fields concerned with questions that are currently animating the national dialogue: the national housing crisis, rising American inequality, the need to reduce bias in artificial intelligence software, and the burgeoning mental health crisis, to name a few.
Climbing the ladder of expertise in computational social science while maintaining one’s convictions on the issues that are “important” can feel particularly isolating as a graduate student. A commonly shared sentiment among the SICSS community this summer was the concern that we may be forced to stand alone in our home departments or disciplines. There is a natural process of deliberation for graduate students as we choose where and how to focus our efforts.
This deliberation process offers much to contemplate about the research we pursue: Is this an important question to ask? Would this project be fruitful in the knowledge produced? Can it easily connect to threads of inquiry in existing scholarship? For those of us inclined towards academia, how will this project be perceived on the job market? Parsing through these questions within a community of other steadfast early scholars pooling their experiences and knowledge bases together encourages and sharpens us in that process. I’m forever grateful to the SICSS community for facilitating a space to engage in this formative contemplation with my peers.
Building On A Strong Foundation
Aside from our formal instruction in using applied data science to answer important questions, one of the most encouraging aspects of our SICSS-Howard/Mathematica experience was hearing the perspectives of current faculty members and organizational leads who have been dedicated to this endeavor. A highlight for me was our session with Dr. Latanya Sweeney, one of Harvard’s leading public intellectuals in the space of data privacy and discrimination in technology. Her scholarship can stand alone as a motivator to pursue questions for impact, yet it was her exhortation to stay the course and stand by one’s convictions that truly resonated with me.
Another highlight was our time with Dr. Desmond Patton, who presented some of his work on building and improving algorithms to sustainably identify potential victims of gun violence. He offered insight into how incorporating community members to inform how algorithms read social media and other public data may increase algorithmic efficiency and reduce discrimination. His work is a tangible example of operationalizing one’s convictions to develop meaningful applications of data science.
There need not be a tradeoff between utilizing rigorous applied data science and asking profound questions that could illuminate the true nature of our most pernicious social problems. We’re in the midst of a critical moment in this cross-cutting field. My experience in SICSS-Howard/Mathematica confirmed that our shared commitment to raising a generation of scholars with the boldness and persistence to ask difficult questions is needed, now more than ever.
For more information about SICSS-Howard/Mathematica, check out our website, follow us on Twitter, like us on Facebook, and join our email list. The application for SICSS-Howard/Mathematica 2024 is open! Apply now!
About the author
Rebekah Jones is a political science PhD candidate at the University of California, Berkeley. Her current research agenda examines the local political economy of public safety. Her work has been supported by the National Science Foundation and the American Political Science Association. Prior to Berkeley, Rebekah received a B.S. from Cornell University in Development Sociology (with distinction in research) and minors in Crime, Prisons, Education, and Justice (CPE+J), Public Policy, and Law and Society.