Research

Working Papers

with Anjali Adukia, Matthew Bonci, Paula Dastres Gallardo, Emileigh Harrison, and Teodora Tsasz

Abstract

Emotional intelligence is a key component of human capital, shaped in part by the educational materials children consume. These materials send messages about the emotional reactions that are and are not socially appropriate. In this study, we apply natural language processing and computer vision tools to examine the emotional representations conveyed in the text and images of influential educational materials: public school elementary textbooks and award-winning children’s literature. Our analysis reveals a stark mismatch between the emotions children read about and the ones they see in images. We find that textual context exposes children to a diverse emotional landscape, including happiness, sadness, anger, and calm, all with relatively balanced frequency. In contrast, pictured characters overwhelmingly display happiness and calm, while “negatively” valenced emotions rarely appear. This pattern persists across time, genre categories, and demographic subgroups in our corpus. Using individual-level book purchases and library inventory data, we provide evidence that the overrepresentation of happy and calm emotions in visual content reflects supply-side responses to consumer preferences.



Works in Progress

“Influencing Identities: Creator Identity and Character Representation in Children’s Literature”

with Anjali Adukia, Emileigh Harrison, and Celia Zhu

Abstract

Books convey messages about social values and norms to children, especially through characters that are personally relevant to them. We quantify representation in books that have won or received honorable mentions from the Newbery, Caldecott, or Coretta Scott King awards. We extend the idea of representation from whether people of diverse groups appear to how they are depicted in a book. Using modern techniques in natural language processing and computer vision, we develop metrics to quantify the presence and portrayal of people of diverse groups in children’s books. We apply our metrics to assess how creator identity and award selection criteria influence the supply of representation in children’s books. We find that, conditional on the award, books by Black authors depict darker-skinned characters than books by White authors. Additionally, independent of the award, females are increasingly present in female-authored books over time, whereas their relative lack of presence in male-authored books remains somewhat stable across decades. In terms of depiction, we see that females are portrayed in relation to men, whereas men are portrayed in relation to occupations. Surprisingly, this pattern is consistent across author genders. On the demand side, we find that, conditional on the award, Black purchasers are more likely to purchase books by Black authors than White authors.



“Pigeonholed: Category Embeddings Measure Limited Intersectional Portrayals in Real and Imagined Worlds”

with Anjali Adukia, Alex Eble, and Emileigh Harrison

Abstract

Media portrayals shape how people see themselves and others, particularly during childhood when stereotypes are first learned. This is especially true when an identity is underrepresented, making each portrayal more influential in shaping perceptions. However, existing applications of natural language processing (NLP) struggle to capture how underrepresented and intersectional identities are portrayed because such identities are mentioned infrequently or cannot be inferred from names alone. We introduce a new application of word embeddings that aggregates contexts across identifiable individuals to estimate a single “category” embedding for entire identity categories (e.g., Black heterosexual females). This allows us to measure the portrayal of underrepresented identities previously beyond the reach of traditional word embedding applications. Applying this approach to two influential corpora – 1,130 award-winning children’s books (which typically contain fiction or “imagined” worlds) and 250,000 articles from The New York Times and The Wall Street Journal (which describe “real” world news and events) – we show that White heterosexual males are depicted across the broadest range of societal roles, while historically marginalized identities are consistently pigeonholed into narrower domains such as sports, struggle, or the performing arts. These findings highlight how media consumed by both children and adults perpetuate dated patterns of representation, and they demonstrate the value of category embeddings for advancing the study of intersectionality across cultures, contexts, and time.