Social Chemistry

We present a new conceptual formalism to study people’s everyday social norms and moral judgments.


Our study centers around cultural rules-of-thumb. Each rule-of-thumb is inspired by a situation:


My roommate ran the blender at 5am


It's rude to make loud noises early in the morning

A rule-of-thumb has a simple structure: it is the judgment of an action.

judgment: It's rude
action: to make loud noises early in the morning

We use rules-of-thumb to capture cultural norms. These include moral, ethical, and social norms. We treat rules-of-thumb as explanations of everyday social expectations.
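
As an illustration, the judgment-plus-action structure can be sketched as a small Python record. The class name `RuleOfThumb` and its field names are assumptions made for this example; they are not taken from the released code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleOfThumb:
    """Hypothetical container: a rule-of-thumb is the judgment of an action."""

    judgment: str  # e.g. "It's rude"
    action: str    # e.g. "to make loud noises early in the morning"

    def text(self) -> str:
        # Join the two parts back into the full rule-of-thumb sentence.
        return f"{self.judgment} {self.action}"


rot = RuleOfThumb(
    judgment="It's rude",
    action="to make loud noises early in the morning",
)
print(rot.text())
```

Keeping the two parts separate is what would let the action be studied on its own, independent of the judgment attached to it.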

Multiple perspectives

We try to collect rules-of-thumb that capture competing perspectives. This helps us understand the different factors someone might be weighing in the situation.


Narrator: "Asking my boyfriend to stop being friends with his ex"

rule-of-thumb 1 for Narrator

It's okay to ask your significant other to stop doing something you're uncomfortable with

rule-of-thumb 2 for Narrator

It's not right to tell another person who to spend time with

Those two rules-of-thumb are both written for the narrator of the situation. But we can also write rules-of-thumb for other characters.

rule-of-thumb 3 for my boyfriend

You should make sure your significant other doesn't feel like a lower priority than your ex


We annotate each rule-of-thumb with discrete labels along several different dimensions. These new attributes let us understand rules-of-thumb at a coarse level, and tie the rule-of-thumb to its parent situation.

Moral Foundations

A popular social psychology theory, Moral Foundations (Haidt et al., 2013), defines fundamental axes of morality.

axis: example
Care ⇄ Harm: It's mean to tell someone they aren't as attractive as someone else.
Fairness ⇄ Cheating: It's wrong to copy someone else's answers in an exam.
Loyalty ⇄ Betrayal: It's rude for a cheerleader to boo their own team.
Authority ⇄ Subversion: It's rude to walk away from your boss while they are talking to you.
Sanctity ⇄ Degradation: It's disgusting to pee into a public pool full of people.

Anticipated agreement

How universally held is the rule-of-thumb as a belief? This question is captured by the anticipated agreement attribute. This label allows us to understand a broader distribution of cultural beliefs.

agreement: example
Universal (~99%): You're expected to wear clothes in public.
Common (~75% – 90%): Human beings evolved, like other animals.
Controversial (~50%): We should pass stronger gun control laws.
Uncommon (~5% – 25%): Believing that ghosts exist.
Almost no one (< 1%): It's good to murder others.
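
To make the five levels concrete, here is a minimal sketch that maps an estimated percentage of people who agree onto the nearest level. The representative values (for example 82.5 for the ~75% to 90% range) and the function name `nearest_agreement_level` are assumptions of this example; the page only gives the approximate ranges.

```python
# Level names come from the table above. The single representative
# percentage chosen for each level is an assumption of this sketch.
AGREEMENT_LEVELS = [
    ("Universal", 99.0),
    ("Common", 82.5),
    ("Controversial", 50.0),
    ("Uncommon", 15.0),
    ("Almost no one", 0.5),
]


def nearest_agreement_level(percent_agreeing: float) -> str:
    """Return the level whose representative value is closest to the input."""
    if not 0.0 <= percent_agreeing <= 100.0:
        raise ValueError("percent_agreeing must be between 0 and 100")
    name, _ = min(
        AGREEMENT_LEVELS,
        key=lambda level: abs(level[1] - percent_agreeing),
    )
    return name


print(nearest_agreement_level(80.0))
```
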
Relevant character

If the rule-of-thumb is advice, then who should follow it? The relevant character marks the person in the situation (including the narrator) to whom you would tell this rule-of-thumb. They may already be following its advice, they may be doing exactly the opposite, or we might not be able to tell from the situation.


Narrator: "I noticed my friend wasn't tipping the bartender"

rule-of-thumb for Narrator

It's all right to gently correct a friend who is being rude

rule-of-thumb for my friend

It's expected that you tip bartenders for each drink

rule-of-thumb for the bartender

It's professional to give good service even to people who don't tip


Each rule-of-thumb is the judgment of an action. We isolate the action so that we can study it in further detail with the attributes below.

rule-of-thumb: It's good to care for the elderly.
action: caring for the elderly

rule-of-thumb: It's bad to expose others to secondhand smoke.
action: exposing others to secondhand smoke

rule-of-thumb: It's okay to be angry if your friend talks to someone you used to date.
action: being angry when your friend talks to someone you used to date

Cultural pressure

Cultural pressure measures to what degree someone feels socially influenced to do (or avoid) an action. This pressure may come from one’s family, friends, community, culture, or society at large.

pressure Example
Strongly for
Wearing clothes in public
Being honest with people
Choosing to read before bed
Spending money on jewelry if you can't afford it
Strongly against
Intentionally harming an animal
Social judgment

A subjective moral judgment is captured by the social judgment of an action. This is an intuitive reaction of whether something is good or bad.

judgment Example
Very good
Buying groceries for a financially struggling neighbor
Driving a friend to the airport
Expected / OK
Wearing clothes in public
Saying something mean to a friend
Very bad
Slashing tires
Is character doing action?

Using the relevant character identified above, we might want to know: in the situation, are they doing the action given by the rule-of-thumb?


Narrator: "I noticed my friend wasn't tipping the bartender"


my friend

doing action? action
Having drinks at a bar
Paying for drinks
Going clubbing every day
Probably not
Enjoying the drinks
Explicitly not
Tipping the bartender


Try writing a situation and see what rules-of-thumb the model writes.

This model—which we call the Neural Norm Transformer—is a conditional language model based on the GPT2-XL (Radford et al., 2019) architecture. It was trained on a large collection of rules-of-thumb as well as label tokens representing the attributes described above. In advanced mode, you can also select and view these attributes.
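
One way to picture conditioning on label tokens is to serialize the situation and any chosen attributes into a single prompt string that a language model would then continue. The sketch below is only an illustration: the bracketed token format, the attribute names, and the function `build_prompt` are hypothetical and are not the format used by the Neural Norm Transformer.

```python
from typing import Mapping, Optional


def build_prompt(
    situation: str,
    attributes: Optional[Mapping[str, str]] = None,
) -> str:
    """Serialize a situation plus optional attribute labels into one string.

    The "[name] value" token layout is an assumption made for illustration.
    """
    parts = [f"[situation] {situation.strip()}"]
    # Each selected attribute becomes a label token followed by its value.
    for name, value in (attributes or {}).items():
        parts.append(f"[{name}] {value}")
    # A final marker signals where the generated rule-of-thumb would begin.
    parts.append("[rot]")
    return " ".join(parts)


prompt = build_prompt(
    "My roommate ran the blender at 5am",
    {"moral_foundation": "care-harm", "agreement": "common"},
)
print(prompt)
```

Leaving `attributes` empty corresponds to letting the model choose the labels itself, while passing values corresponds to constraining the generation.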



Moral Foundation axis


Care ⇄ Harm

Preventing or inflicting pain or suffering. For example, It's mean to tell someone they aren't as attractive as someone else.

Fairness ⇄ Cheating

Notions of equity, justice, and rights. For example, It's wrong to copy someone else's answers in an exam.

Loyalty ⇄ Betrayal

Obligations or concerns for group, family, and nation. For example, It's rude for a cheerleader to boo their own team.

Authority ⇄ Subversion

Submission and deference to traditions or legitimate authority. For example, It's rude to walk away from your boss while they are talking to you.

Sanctity ⇄ Degradation

Preference for purity, or abhorrence for disgusting things or actions. For example, It's disgusting to pee into a public pool full of people.

Anticipated Agreement


The demo offers three levels of anticipated agreement:

Controversial (~50%): People are divided on this rule-of-thumb. Roughly half of people will agree with it, and half will disagree.

Common (~75% – 90%): Most people generally believe this rule-of-thumb, but there is a minority who will disagree.

Universal (~99%): Nearly everyone believes this rule-of-thumb.


The model generated rules-of-thumb, and then broke each one apart into an underlying action along with six attributes.

Social-Chem-101 Dataset

We collect 292,000 rules-of-thumb based on 104,000 situations. Along with each rule-of-thumb, we provide a complete set of attribute labels, breaking it down along 12 dimensions. In total, we collect over 4.5 million free-text and labeled annotations. We release this as the Social-Chem-101 dataset.
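
For a rough sense of scale, the averages implied by these headline figures can be worked out directly. This is back-of-the-envelope arithmetic on the rounded numbers quoted above, not statistics computed from the released files; the variable names are chosen for this example.

```python
# Rounded figures quoted on this page.
rules_of_thumb = 292_000
situations = 104_000
annotations = 4_500_000  # "over 4.5 million", so this is a lower bound

# Average number of rules-of-thumb written per situation.
rots_per_situation = rules_of_thumb / situations

# Lower bound on the average number of annotations per rule-of-thumb.
annotations_per_rot = annotations / rules_of_thumb

print(f"about {rots_per_situation:.1f} rules-of-thumb per situation")
print(f"at least about {annotations_per_rot:.1f} annotations per rule-of-thumb")
```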

Figure: Statistics of the Social-Chem-101 dataset. All bars are drawn to scale. Several attributes are shown that aren't described on this page. Please see the paper for comprehensive details.

The discrete-valued attributes, annotated on top of rules-of-thumb and actions, present many possibilities for stratifying the data.
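
As a small sketch of what stratifying could look like, the snippet below groups a few actions by their social judgment label. The three records reuse examples from the social judgment section of this page; the field layout and the helper `stratify` are assumptions for illustration and do not reflect the dataset's actual file format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (action, social judgment) pairs taken from the examples on this page.
records: List[Tuple[str, str]] = [
    ("Buying groceries for a financially struggling neighbor", "Very good"),
    ("Wearing clothes in public", "Expected / OK"),
    ("Slashing tires", "Very bad"),
]


def stratify(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group actions into buckets keyed by their discrete label."""
    strata: Dict[str, List[str]] = defaultdict(list)
    for action, label in pairs:
        strata[label].append(action)
    return dict(strata)


strata = stratify(records)
for label, actions in strata.items():
    print(label, "->", actions)
```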

Figure: Rules-of-thumb in Social-Chem-101 plotted according to social judgment (x), agreement (y), and cultural pressure (color). Jitter is applied to each bucket to help show the distribution. Two observations: Discretionary actions (yellow) span a range of moral values (social judgment; x-axis). Also, fringe beliefs (bottom) often evoke strong negative cultural pressure (red), even when morally neutral (center).

Cultural Scope

We recognize that social norms are often culturally sensitive and judgments of morality and ethics concerning individuality, community and society do not always hold universally. While some situations (e.g., "punching someone") might have similar levels of acceptability across a number of cultures, others might have drastically varied levels depending on the culture of its participants (e.g., "kissing someone on the cheek as a greeting").

As a starting point, our study focuses on the socio-normative judgments of English-speaking cultures represented within North America. While we find some variation of judgments in our annotations (e.g., with respect to certain worker characteristics), extending this formalism to other countries and non-English-speaking cultures remains a compelling area of future research.


The rules-of-thumb, especially those output by the model, are intended for research purposes only. None of this work should be used for advice, or to aid in social understanding by humans.

find out more


Social Chemistry 101: Learning to Reason about Social and Moral Norms EMNLP 2020 read on arxiv


Social-Chem-101 Dataset 4.5M+ annotations 28 MB .zip download


Accompanies the paper Training, inference, analysis Python view on github
Authors Maxwell Forbes, Jena D. Hwang, Vered Shwartz, Maarten Sap, Yejin Choi
Built at University of Washington, Allen Institute for AI
Acknowledgments The authors would like to thank Nicholas Lourie for project inspiration and the Scruples dataset, Rowan Zellers for advice on grounding the attribute collection, Chandra Bhagavatula for discussions about modeling, and Sam Skjonsberg and Carissa Schoenick for help with the demo infrastructure. Thanks to the hundreds of workers on Mechanical Turk who spent hours building the dataset. Thanks also to OpenAI for significant influence in web design.
Funding This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE1256082, and in part by NSF (IIS-1714566), DARPA CwC through ARO (W911NF15-1-0543), DARPA MCS program through NIWC Pacific (N66001-19-2-4031), and the Allen Institute for AI.
Website Maxwell Forbes