Understanding institutions in text
By Saba Siddiki, Doug Rice, Seth Frey, and Christopher Frantz
Institutions, the rules that govern behavior, are among the most important social artifacts of society. So it should come as a great shock that we still understand them so poorly. How are institutions designed? What makes institutions work? Is there a way to systematically compare the language of different institutions? One recent advance is bringing us closer to making these questions quantitatively approachable. The Institutional Grammar (IG) 2.0 is an analytical approach, drawn directly from classic work by Nobel Laureate Elinor Ostrom, that is providing the foundation for computational representations of institutions. IG 2.0 is a formalism for translating between human-language outputs (policies, rules, laws, decisions, and the like) and abstract structures defined precisely enough to be manipulated by computer. Recent work, supported by the National Science Foundation (RCN: Coordinating and Advancing Analytical Approaches for Policy Design & GCR: Collaborative Research: Jumpstarting Successful Open-Source Software Projects With Evidence-Based Rules and Structures) and leveraging recent advances in natural language processing highlighted on this blog, is vastly increasing the speed and quality of computational translations of written rules.
The Institutional Grammar
The Institutional Grammar operates on institutional statements, statements that may articulate rules, norms, or even suggestions. It creates a mapping between well-defined syntactic components and features of institutional statements that define features of social systems, such as what one can, cannot, or must do in different times and places. Defining the grammar of institutions in terms of the grammars of language makes it possible to discern patterns in institutional language across institutional statements. The Institutional Grammar 2.0 in particular is designed for syntactic coding of institutional statements found in policy documents.
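To make the idea of a mapping between syntactic components and institutional statements concrete, here is a minimal Python sketch of one coded statement. The component set (attribute, deontic, aim, object, context, or else) follows the commonly described IG syntax, and the strategy/norm/rule distinction follows Crawford and Ostrom's original scheme. The class name, field names, and example sentence are illustrative assumptions, not the IG 2.0 codebook or the authors' data format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstitutionalStatement:
    """One institutional statement split into grammar components.

    Field names are hypothetical and chosen for readability.
    """
    attribute: str                  # who the statement applies to
    aim: str                        # the action being regulated
    deontic: Optional[str] = None   # prescriptive operator such as "must" or "may"
    obj: Optional[str] = None       # what the action is directed at
    context: Optional[str] = None   # when, where, or how the statement applies
    or_else: Optional[str] = None   # consequence of non-compliance

    def statement_type(self) -> str:
        # No deontic: a shared strategy. Deontic without a sanction: a norm.
        # Deontic plus a sanction: a rule.
        if self.deontic is None:
            return "strategy"
        if self.or_else is None:
            return "norm"
        return "rule"


# Hypothetical example sentence, coded by hand:
# "Certified farmers must submit an organic system plan annually,
#  or else certification may be suspended."
example = InstitutionalStatement(
    attribute="certified farmers",
    deontic="must",
    aim="submit",
    obj="an organic system plan",
    context="annually",
    or_else="certification may be suspended",
)
print(example.statement_type())  # rule
```

Once statements are stored in a structure like this, comparing institutions becomes a matter of counting and cross-tabulating components, for instance how often a policy pairs a deontic with an explicit sanction.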
Automating annotation
Traditionally, the annotation of policy statements with the IG was done by hand, a time-consuming process that required specialized knowledge. For those reasons, the accessibility of the IG as a tool for the analysis of policy language was limited, and the promise of the approach was left largely unfulfilled. Between 1995, when Ostrom and her student Sue Crawford published the seminal “A Grammar of Institutions,” and 2015, only 14 papers had actually applied the method. However, a recent surge in both interest and tools is driving tremendous growth: as of 2020, the number of papers had increased to 43. In our new work, we provide an approach that opens up the systematic analysis of policies and their composite statements using the IG, with an automated tool that builds on recent advances in computational linguistics and natural language processing. Our tool uses an extensive array of features from policy statements (including word indicators, parts of speech, and dependency relations) to train a neural network classifier to automatically identify IG components. In early testing, our automated classifier approaches the levels of inter-coder reliability that would be expected from human coding of the texts.
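As a rough illustration of what word indicators, parts of speech, and dependency relations look like as classifier inputs, the sketch below turns each token of a hand-tagged sentence into a sparse feature dictionary. It covers only the feature step, not the neural network, and it is not the authors' pipeline. The `Token` type, the `token_features` helper, the neighbor features, and the example tags (written in Universal Dependencies style) are all assumptions made for illustration. In practice the tags would come from a parser rather than being typed in.

```python
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    text: str  # surface form of the word
    pos: str   # part-of-speech tag
    dep: str   # dependency relation to the token's head


def token_features(tokens: List[Token], i: int) -> Dict[str, float]:
    """Build indicator features for the token at position i."""
    tok = tokens[i]
    feats = {
        f"word={tok.text.lower()}": 1.0,  # word indicator
        f"pos={tok.pos}": 1.0,            # part of speech
        f"dep={tok.dep}": 1.0,            # dependency relation
    }
    # Neighboring tags give the classifier a little local context.
    if i > 0:
        feats[f"prev_pos={tokens[i - 1].pos}"] = 1.0
    if i < len(tokens) - 1:
        feats[f"next_pos={tokens[i + 1].pos}"] = 1.0
    return feats


# Hand-tagged hypothetical sentence: "Farmers must submit plans"
sentence = [
    Token("Farmers", "NOUN", "nsubj"),
    Token("must", "AUX", "aux"),
    Token("submit", "VERB", "root"),
    Token("plans", "NOUN", "obj"),
]

for i, tok in enumerate(sentence):
    print(tok.text, sorted(token_features(sentence, i)))
```

A classifier trained on features like these could learn, for example, that an auxiliary attached to the main verb is a likely deontic, while the verb's subject is a likely attribute.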
Even more promising, in recent testing we have used contextualized word embeddings as features for the classification. Throughout natural language processing work, contextualized embeddings have led to large improvements in task performance, and our setting is no exception: adding contextualized embeddings as features for our classifier improved out-of-sample classification performance from approximately 73% to 88%.
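The post does not say which metric the 73% and 88% figures refer to. As a hedged illustration of how classifier output is typically compared against human coding, the sketch below computes plain accuracy and Cohen's kappa, a common inter-coder reliability statistic that corrects for chance agreement. The function names and the toy labels are assumptions made for the example and are not results from the project.

```python
from collections import Counter
from typing import Sequence


def accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Share of items on which the two label sequences agree."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def cohens_kappa(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Agreement between two coders, corrected for chance agreement."""
    n = len(gold)
    observed = accuracy(gold, pred)
    gold_counts = Counter(gold)
    pred_counts = Counter(pred)
    # Chance agreement: probability both coders pick the same label at random,
    # given each coder's own label frequencies.
    expected = sum(gold_counts[label] * pred_counts[label] for label in gold_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy labels for six tokens (hypothetical, not project data).
human = ["Attribute", "Deontic", "Aim", "Context", "Attribute", "Aim"]
model = ["Attribute", "Deontic", "Aim", "Context", "Aim", "Aim"]

print(round(accuracy(human, model), 3))      # 0.833
print(round(cohens_kappa(human, model), 3))  # 0.769
```

Kappa is lower than raw accuracy here because some of the agreement would be expected by chance, which is why reliability statistics of this kind are usually reported alongside accuracy when comparing automated and human coding.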
Future directions
We are now developing a new R package to implement our approach and will announce its release to the SAGE Ocean community when it is complete. For now, we encourage scholars and others interested in systematic measurement and understanding of policies, the IG, and our automated approach to reach out. The co-authors are part of a growing, international research collaboration, the Institutional Grammar Research Initiative, that brings together a multi-disciplinary network of scholars interested in the policies that shape, and are shaped by, human behavior. If you are interested in using or contributing to the development of these tools, we welcome you to the community.
About
Prof. Saba Siddiki is a policy scholar in the Department of Public Administration and International Affairs at Syracuse University. She specializes in policy design, collaborative policymaking, and regulatory implementation and compliance.
Prof. Doug Rice is a political scientist and faculty affiliate of the Computational Social Science Institute at UMass Amherst specializing in judicial policymaking.
Prof. Seth Frey is a computational social scientist and cognitive scientist in the Department of Communication at the University of California Davis. He specializes in applications of data science for large-scale comparative institutional analyses.
Prof. Christopher Frantz is a computational social scientist in the Department of Computer Science at the Norwegian University of Science and Technology (NTNU). He specializes in social simulation techniques, with specific interest in agent-based institutional modeling and its application in policy analysis.