Nov 07

Cognitive Symmetry and Trust

A chain of speculative scientific reasoning from our work into really big social/society questions:

  1. Skill learning is a thing. If we practice something we get better at it and the learning curve goes on for a long time, 10,000 hours or more.  Because we can keep getting better for so many hours, nobody can really be a top-notch expert at everything (there isn’t time).  This is, therefore, among the many reasons why in group-level social functioning it is much better to specialize and have multi-person projects done by teams of people specializing in component steps (for tasks that are repeated regularly).  The economic benefits of specialization are massive and straightforward.
  2. However, getting people to work well in teams is hard. In most areas requiring cooperation, there is the possibility of ‘defecting’ instead of cooperating on progress – to borrow terms from the Prisoner’s Dilemma formalism.  That powerful little bit of game theory points out that in almost every one-time 2-person interaction, it’s always better to ‘defect,’ that is, to act in your own self-interest and screw over the other player.
  3. Yet, people don’t. In general, people are more altruistic than overly simple game-theory math would predict.  Ideas for why that model is wrong include (a) extending the model to repeated interactions where we can track our history with other players and therefore cooperation is rewarded by building a reputation; (b) that humans are genetically prewired for altruism (e.g., perhaps by getting internal extra reward from cooperating/helping); or (c) that social groups function by incorporating ‘punishers’ who provide extra negative feedback for the non-cooperators to reduce non-cooperation.
  4. These three alternatives aren’t mutually exclusive, but further consideration of the (3a) theory raises some interesting questions about cognitive capacity. We interact a lot with a lot of different people in our daily lives.  Is it possible to track and remember everything about our interactions in order to make optimal cooperate/defect decisions?  Herb Simon argued (Science, 1990) that we can’t possibly do this, working along the same lines as his ‘bounded rationality’ reasoning that won him the Nobel Prize in Economics.  He concluded that (3b) was more likely and showed that if there were a gene for altruism (he called it ‘docility’), it would breed into the population pretty effectively.
  5. No such gene has yet been identified and I have spent some time thinking about alternate approaches based on potential cognitive mechanisms for dealing with the information overload of tracking everybody’s reputation. One really interesting heuristic I ran across is the Symmetry Hypothesis, which I have slightly recast for simplicity.  This idea is a hack to the PD where you can reason very simply as follows: If the person I am interacting with is just like me and reasons exactly as I do, then no matter what I decide, they are going to do the same.  In that case, I can safely cooperate because the other player will too.  And if I defect, they will also (potentially allowing group social gains through competition, which is a separate set of ideas).
  6. Symmetry would apply in cases where the people you often interact with are cognitively homogeneous, that is, where everybody thinks ‘we all think alike.’ Here, ‘we’ can be any social group (family, neighborhood, community, church, club, etc.).  If this is driving some decent fraction of altruistic behavior, you’d see strong tendencies for high levels of in-group trust (compared with out-group), and particularly in groups that push people towards thinking similarly.  You clearly do see those things, but their existence doesn’t actually test the hypothesis: there are many theories that predict in-group/out-group formation, that these affect trust, and that people who identify with a group start to think similarly.  Of note, though, this idea is a little pessimistic because it suggests that groupthink leads to better trust and that social groups should tend to treat novel, independent thinkers poorly.
  7. Testing the theory would require data examining how important ‘thinks like me’ is to altruistic behavior and/or how important cognitive homogeneity is to existing strong social groups/identity. This is a potential area of social science research a bit outside our expertise here in the lab.
  8. But if true, the learning-related question (back to our work) is whether a tendency to rely on symmetry can be learned from our environment. I suspect yes, that feedback from successful social interactions would quickly reinforce and strengthen dependency on this heuristic.  I think that this could cause social groups to become more cognitively homogeneous in order to be more effectively cohesive.  Cognitively homogeneous groups would have higher trust, cooperate better and be more productive than non-homogeneous groups, out-competing them.  This could very well create a kind of cultural learning that would persist and look a lot like a genetic factor.  But if it was learned (rather than prewired), that would suggest we could increase trust and altruism beyond what we currently see in the world by learning to allow more diverse cognitive approaches and/or learning to better trust out-groups.
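The difference between the standard PD reasoning in step 2 and the symmetry shortcut in step 5 can be sketched in a few lines. This is only a toy: the payoff numbers are illustrative assumptions (any values with the usual PD ordering would do), and the function names are made up for this note.

```python
# Toy Prisoner's Dilemma; the payoff numbers are illustrative assumptions.
# PAYOFF[(my_move, their_move)] -> my payoff (higher is better).
PAYOFF = {
    ("C", "C"): 3,  # mutual cooperation
    ("C", "D"): 0,  # I cooperate, they defect
    ("D", "C"): 5,  # I defect, they cooperate
    ("D", "D"): 1,  # mutual defection
}
MOVES = ("C", "D")


def best_reply(their_move):
    """Standard reasoning: treat the other player's move as fixed."""
    return max(MOVES, key=lambda mine: PAYOFF[(mine, their_move)])


def symmetric_choice():
    """Symmetry heuristic: assume the other player mirrors whatever I pick."""
    return max(MOVES, key=lambda mine: PAYOFF[(mine, mine)])


if __name__ == "__main__":
    for theirs in MOVES:
        print("if they play", theirs, "my best reply is", best_reply(theirs))
    print("under symmetry I choose", symmetric_choice())
```

With these assumed payoffs, defecting is the best reply to either move, while the symmetry assumption only compares the two mirrored outcomes and so lands on cooperation. Whether the mirroring assumption is ever justified is exactly the contested part.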


I was moved to re-iterate this chain of ideas because it came up yet again in a conversational drift into politics among people in the lab, although our internal debates usually center around how different groups treat their out-groups and why.  Yesterday, the discussion started with observing that people we didn’t agree with often seemed to be driven by fear/distrust/hate of those in their out-groups.  However, it was not clear whether, if you didn’t feel that way, you had managed to see all of humanity as your in-group or had instead found/constructed an in-group that avoided negative perception of the out-groups.  We did not come to a conclusion.

FWIW, this line of thinking depends heavily on the Symmetry idea, which I discovered roughly 10 years ago via Brad DeLong’s blog.  According to the discussion there, it is also described as the Symmetry Fallacy and not positively viewed among real decision scientists.  I have recast it slightly differently here and suspect that among the annoying elements is that I’m using an underspecified model of bounded rationality.  That is, for me to trust you because you think like me, I’m assuming both of us have slightly non-rational decision processes that for unspecified reasons come to the same conclusion that we are going to trust each other.  Maybe there’s a style issue where a cognitive psychologist can accept a ‘missing step’ like this in thinking (we deal with lots of missing steps in cognitive processing) whereas a more logic/math approach considers that anathema.

Aug 30

Evidence and conclusions

I think this should be the last note on this topic for a while, but since it’s topical: a new piece of data has popped up related to possible sources of gender outcome differences in STEM-related fields.


The new piece of data was reported in the NY Times Upshot section, titled “Evidence of a Toxic Environment for Women in Economics”


The core finding is that on an anonymous forum used by economists (grad students, post-docs, profs) there are a lot of negatively gendered associations for posts in which the author makes explicitly gendered reference.  In general, posts about men are more professional and posts about women are not (body-based, sex-related and generally sexist).  Note that “more” here is measured as a likelihood ratio, which is mathematically well defined, but the effect size is not trivially extracted from it. Because we always like to review primary sources, I dug up the source, which has a few curious features.

First, it’s an undergraduate thesis, not yet peer-reviewed, which is an unusual source.  However, I looked through it and it is a sophisticated analysis that looks to be done in a careful, appropriate and accurate way (mainly based on logistic regression).  I read through it a bit and the method looks strong, but is complex enough that there could be an error hiding in some of the key definitions and terms.

Paper link:

Of note, the paper seems to be a careful mathematical analysis of something “everybody knows,” which is that anonymous forums frequently include high rates of stereotype bias against women and minorities.  But be very careful with results that confirm what you already know; that’s how confirmation bias works.  In addition, economists may not be an effective proxy for all STEM fields.  I don’t know of a similar analysis for Psychology, for example.

But as an exercise, let’s consider the possibility that the analysis is done correctly and truly captures the fact that there are some people in economics who treat women significantly differently than they treat men, i.e., that their implicit bias affects the working environment.  So we have 3 data points to consider.

  1. There are fewer women in STEM fields than men
  2. There are biological differences between men and women (like height)
  3. There are environmental differences that may affect women’s ability to work in some STEM fields

The goal, as a pure-minded scientist, is to understand the cause of (1).  Why are there fewer women in STEM?  The far-too-frequent inference error is when people (like David Brooks) take (2) as evidence that (1) is caused by biological differences.  That’s simply an incorrect inference.

It turns out to be helpful to know about (3), but only because it should reduce your certainty that (2) implies (1).  It’s still critical to realize that we do not know that environment causes (1) either.  All we know is that we have multiple hypotheses consistent with the data and we don’t know which is right.

What we do know is that (3) is objectively bad socially.  Even if (2) meant there were either mean or distributional differences between men and women, the normal distribution means there are still women on the upper tail and if (3) keeps them out of STEM, that hurts everybody.
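The upper-tail point can be checked with a quick calculation. This is a minimal sketch with purely hypothetical numbers: the size of the gap between the group means and the position of the hiring bar are assumptions picked for illustration, not estimates of anything, and the function name is made up here.

```python
import math


def upper_tail(threshold, mean=0.0, sd=1.0):
    """P(X > threshold) for a normal distribution, via the complementary error function."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Purely hypothetical values, chosen only for illustration.
HIRING_BAR = 2.0  # selection threshold in standard-deviation units (assumed)
MEAN_GAP = 0.2    # assumed gap between the two group means (hypothetical)

if __name__ == "__main__":
    group_a = upper_tail(HIRING_BAR, mean=0.0)
    group_b = upper_tail(HIRING_BAR, mean=-MEAN_GAP)
    print(f"group A above the bar: {group_a:.4f}")
    print(f"group B above the bar: {group_b:.4f}")
```

Whatever gap you assume, the second fraction shrinks but never reaches zero, which is the whole point: an environment that pushes those people out discards talent that is really there.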

The googler’s memo assumes (2) and reinforces (3), which is clearly and objectively a fireable offense.


Aug 11

See the problem yet?

The entirely predictable backlash against Google for firing the sexist manifesto author has begun.  Among the notable contributors is the NY Times editorial page in the form of David Brooks.  In support of his position that the Google CEO should resign, he’s even gone so far as to dig up some evolutionary psych types to assert that men and women do, indeed, differ and therefore the author was on safe scientific ground.

The logical errors are consistent and depressing.  Nobody is arguing that men and women don’t differ on anything. The question is whether they differ on google-relevant work skills.  Consider the following 2 fact-based statements:

  1. Men are taller than women
  2. Women are more inclined towards interpersonal interactions

There is data to support both statements on aggregate and statistically across the two groups.  The first statement is clearly a core biological difference with a genetic basis (but irrelevant to work skills at Google).  However, the fact that (1) has a biological basis does not mean the second statement does.  The alternative hypothesis is that (2) has arisen from social and cultural conditions, not something about having XX or XY genes (or estrogen/androgen).  The question is between these two statements:

2a. Women are more inclined towards interpersonal interactions because of genetic differences

2b. Women are more inclined towards interpersonal interactions because they have learned to be

And while statement 2 is largely consistent with observations (e.g., survey data on preferences), we have no idea at all which of 2a or 2b is true (or even if the truth is a blend of both).  Just because an evo psych scientist can tell a story about how this could have evolved does not make it true either, it just means 2a is plausible (evo psych cannot be causal).  It’s unambiguous that 2b is also plausible. There simply isn’t data that clearly distinguishes one versus the other and anybody who tells you otherwise is simply exhibiting confirmation bias.

And anybody even considering that the firing was unjust needs to read the definitive take-down of the manifesto by a former Google engineer, Yonatan Zunger, who does not even need to consider the science, just the engineering:

His core point is that software engineering at Google is necessarily a team-based activity, which (a) means the social skills the manifesto author attributes to women are actually highly valuable and (b) the manifesto author now no longer has any prospects for being able to be part of a team at Google because he has asserted that many colleagues are inferior (and many others will be unhappy they’d even have to argue the point with him).  Given that any of these skills is likely present in a normal distribution across the population and that Google selectively hires from the high end for both men and women, even if his pseudoscience was correct, you’d still have to fire him to maintain good relationships with the women you’ve already hired from the very high end of the skill distribution.

It’s kind of amazing how bad so many people are at basic logic once a topic touches their implicit bias.  Also depressing.

Aug 08

Anti-diversity “science”

Somebody at Google wrote a memo/manifesto arguing against diversity (mainly gender), caused something of a ruckus and got himself fired.  The author was clearly either trying to get terminated (as a martyr) or simply not very bright.  A particularly articulate explanation of why it is necessary to fire somebody who did what he did is here (TL;DR the memo author doesn’t seem to understand very important things about engineering or being part of a company that has engineering teams):


There are a number of interesting things about the whole episode, but one that we’ve had pop up in the lab recently in discussion is when it is possible for science to be ‘dangerous.’  The memo provides a convenient example since an early section attempts to wrap assumptions of biologically-driven gender differences in a thin veneer of science.  It’s a particularly poorly done argument, which I think makes it easier to see the overall inference flaws.

The argument is something like:

Men and women differ on X due to innate, biological and immutable differences (feel free to throw “evolutionary” in there as well if you’d like).  Men and women also differ on Y, which therefore must also be due to innate and immutable biological causes.

That there exists some value of X that makes the first statement true (e.g., the number of X chromosomes) is not really worth arguing about. It should be obvious that you can’t assert the second statement regardless.  I usually frame it as a reminder to consider the alternate hypothesis, which we can state here as “Men and women differ on Z, which is due to cultural and environmental differences.”  Cross-cultural studies of gender differences make it unambiguously clear that there are values of Z that make this third statement true as well.

So what do we do about the middle statements, for values of Y for which we do not know if they are based mainly on nature or nurture?  Well, for one thing, we don’t make policy statements based on them.

For another, though, we’d like to do science that tackles difficult and thorny issues like nature vs nurture, individual differences, stability of personality measures, effects of education, culture and environment.  But how do we do science, which is often messy and even unstable on the cutting edge, when there are ideologically minded individuals waiting to seize on preliminary findings to drive a political agenda?

I don’t actually know. And that bothers me a fair amount.

If you doubt the danger inherent here, consider that you can make a pretty good case that the current president of the US is largely in place due to exactly this kind of bad, dangerous science.  The alt-right, which probably moved the needle enough to swing the very close election, is a big fan of genetic theories of IQ, especially ones that support the assumption that the privileged deserve all their advantages.  So they are highly invested in the discredited book The Bell Curve and the type of argument that got Larry Summers fired from Harvard (the ‘fat tails’ hypothesis of gender differences).  These ideas are generally focused through the lens of Ayn Rand’s Objectivism, which asserts the moral necessity of rule of the privileged over the masses, and which is, fwiw, pretty well reflected in the googler’s memo as well.


Aug 01

In Memoriam Howard Eichenbaum

Howard Eichenbaum was a great scientist in the field of memory.  He passed away unexpectedly last week at the age of 69.  His research was directly on the boundary between basic neuroscience and cognitive neuroscience, making the connections from neurobiological studies done with rats to how human memory works.  He was particularly well known as a teacher and mentor.  His passing was noted to the MDRS community by his good friend Neal Cohen, which elicited a remarkable outpouring of affection and kind words about Howard.  Neal also shared this obituary:

Howard B. Eichenbaum

Howard B. Eichenbaum, a William Fairfield Warren Distinguished Professor in the Department of Psychological and Brain Sciences at Boston University, and an internationally recognized figure in advancing our understanding of the fundamental nature and brain mechanisms of memory, died in Boston on July 21, 2017 following recent spine surgery at age 69.

Eichenbaum’s contributions to the field of memory research were profound, in helping us to better understand how memory works and how it is organized in the brain. His contributions come from his extensive empirical findings, including the discovery of “time cells” in the hippocampus; his integrative approach, committed to synthesizing results across species, across methods, and across levels of analysis; his important theoretical advances, concerning multiple memory systems of the brain; his creative long-term editorship of the journal Hippocampus, even while serving on the editorial boards of 10 other journals; his mentorship, guidance, and encouragement of scores of undergraduates, graduate students, postdoctoral fellows, and junior faculty who went on to have their own significant impact on the field; and his remarkable history of service and leadership.

Eichenbaum joined the Boston University faculty in 1996 after obtaining a BS in cell biology and a PhD in psychology at the University of Michigan, then holding faculty positions at Wellesley College (1977-1991), University of North Carolina at Chapel Hill (1991-1993), and SUNY Stony Brook (1993-1996).  At the time of his death, Eichenbaum’s other roles at Boston University included serving as founding Director of the Center for Memory and Brain and of the Cognitive Neurobiology Laboratory, after having earlier founded both the Undergraduate Program for Neuroscience and the Graduate Program for Neuroscience.

His contributions have been formally recognized with multiple honors, including being named a Fellow of the American Association for the Advancement of Science, the American Academy of Arts and Sciences, and the Association for Psychological Science; appointment to the Council of the Society for Neuroscience and the NIMH National Advisory Mental Health Council; and election to Chair, Section on Neuroscience, American Association for the Advancement of Science.

Eichenbaum’s non-science pursuits included coaching his two sons’ Little League baseball teams for many years, taking his sons around the country on their “baseball-parks-of-America tour” – a quest to catch a game at every Major League Baseball Park in America that spanned 6 summers across 15 years, kayaking in the waters off Chatham, MA, and rooting passionately for his Boston Red Sox and University of Michigan teams. He is survived by his beloved wife of 35 years, Karen J. Shedlack; two sons, both pursuing graduate studies, Alexander E. Eichenbaum and Adam S. Eichenbaum; 100-year-old mother, Edith (Kahn) Eichenbaum; brother, Jerold Eichenbaum; sister, Miriam Eichenbaum Drop; nephews Michael Eichenbaum and Dylan Drop; and niece, Tali Eichenbaum.

Jun 26

Confirmation bias

Mistaking data consistent with your hypothesis for data establishing your hypothesis is a surprisingly common mistake, even for highly trained, experienced scientists.  The subjective experience is common: you develop and carry around a theory on some topic and over the course of your day, you run into evidence (anecdotes, or other scientific findings) that would be predicted by your theory.  So you think, “hey, that fits too, my theory must be right.”

That’s fine for informal theory development, but when you want to really test your hypothesis, it doesn’t work.  After you identify data consistent with your theory, you need to figure out what theory that data actually rules out.  The problem is that it is often the case that competing theories make similar predictions, so data fitting your theory doesn’t prove the alternate theory wrong.  This is why the process of doing science generally requires collecting new data that is carefully designed to discriminate between alternate theories.  This requires both (a) figuring out the alternate theories and (b) designing a good experiment — and so it can be hard.

An interesting example that has been popping up again in popular discussion is the question of the genetic contribution to IQ.  This is an area where everybody thinks they have a strong theory and it’s remarkable how poorly almost everybody does at considering alternates.

For example, you think genes contribute strongly to IQ and you notice that apparently IQ-related success seems to run in families, so you think “Aha! My theory is supported.”  Maybe you’ll even correctly spot the ‘null hypothesis’ that “IQ doesn’t run in families” as being disconfirmed by these data.  But the real alternate is “environmental factors determine IQ more than genes” and since families largely share environments, your data don’t discriminate.

Note that this doesn’t mean we know which theory is correct.  Both theories are consistent with the observed data and no strong conclusions can be drawn.  I imagine this is very frustrating to non-scientists because you’d rather have a clear answer than a perpetual state of uncertainty.  Scientists have to live with uncertainty all the time and it can make it tricky to talk about your work: you want to make simple statements, but you don’t want to overstate your confidence.
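To make the family example concrete, here is a minimal simulation sketch. It is a deliberately oversimplified toy, not a model of real heritability: each sibling’s score is just a shared family factor plus individual noise, and all the names and parameter values are assumptions made up for this note.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def sibling_correlation(rng, n_families=5000, shared_sd=1.0, own_sd=1.0):
    """Two siblings per family: score = shared family factor + individual noise."""
    first, second = [], []
    for _ in range(n_families):
        shared = rng.gauss(0.0, shared_sd)
        first.append(shared + rng.gauss(0.0, own_sd))
        second.append(shared + rng.gauss(0.0, own_sd))
    return pearson(first, second)


if __name__ == "__main__":
    # World 1: call the shared factor "genes". World 2: call it "home environment".
    # The observable sibling data are generated the same way in both worlds.
    r_genes = sibling_correlation(random.Random(1))
    r_env = sibling_correlation(random.Random(2))
    print(f"sibling correlation, genes-only world:       {r_genes:.2f}")
    print(f"sibling correlation, environment-only world: {r_env:.2f}")
```

The two worlds differ only in what we choose to call the shared factor, so the sibling correlations come out essentially the same. Seeing that scores run in families cannot tell you which world you are in.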

The topic has come up because Charles Murray is once again in the news, happily going around talking about his theory that (a) genes are a major/primary determinant of IQ and (b) these genes vary substantially across race.  If you know the actual science being done around this area, you know that neither of those statements is established to the point of ruling out plausible alternatives.  Even setting aside the question of “what is measured in an IQ test” we know for sure that genes have an effect, education has an effect, and there is increasing evidence of other non-education environmental effects (lead, stress, nutrition).  Nobody knows the relative importance of these effects, and really careful thinkers are also well aware that the relative values change across samples within the population (e.g., nutrition effects don’t count for much across a sample that is all well-nourished across the lifespan).

But if somebody like Murray presents the messy data, knowing that a lot of racist listeners are going to simply hear confirmation bias, and then make no effort to argue for non-racist intervention policy to improve the environment (that is entirely supported by the data he presents), then the rest of us can confidently rule out the hypothesis that Murray isn’t a racist.

If you are really interested in the science in this area, there are much better people to be paying attention to:


Mar 06

Explaining neuroscience

I ran across this link referenced by its title:

A neuroscientist explains a concept at five different levels

I was initially worried it would annoy me, but eventually decided to take a look at it anyway, figuring it would be interesting at the level of thinking about your audience when describing a complex scientific concept.  On one hand, parts of it are better than expected.  On the other, some of the interactions were a bit odd (the college & graduate student — but maybe it was hard to edit it for the time allowed).

Overall, it’s a good micro example of choosing your language to be appropriate to what you expect your audience to know.  I also noted that for the two youngest explainees, the scientist presented things to get a nice ‘wow’ response, which is probably a good memory aid (and generally good in teaching or explaining).  For the other audiences, he did a lot more listening, which seems natural since there are a lot of different possible backgrounds for undergraduates and graduate students.

It was advertised as similar in spirit to the Feynman descriptions of how everyday things work.  I don’t think it reaches that level, but those are pretty extraordinary, so it’s not really a fair standard.

Jul 27

Surgical skill

One potential application for our basic studies of skill learning is understanding the development of skill in performing surgery.  So I was intrigued when happening to stumble across the following report of factors predicting successful surgical outcomes:

Surgeon specialization and operative mortality in United States: retrospective analysis

BMJ 2016;354:i3571 (Published 21 July 2016)

The take home message was that ‘specialization’ may be as important as repetitions in successful skill performance.  The following clipped paragraph captures many of the things I find fascinating here:

At the same time, the degree to which a surgeon specializes in a specific procedure may be as important as the number of times that he or she performs it. A surgeon who specializes in one operation may have better outcomes owing to muscle memory built from repetition, higher attention and faster recall as a result of less switching between different procedures, and knowledge transfer of outcomes for the same procedure performed in different patients. If this specialization hypothesis holds true, a surgeon performing 20 procedures of which all 20 are valve replacements (denoting 100% specialization in the procedure) would have lower operative mortality rates than a surgeon who performs 100 operations of which 40 are valve replacements (denoting 40% specialization in the procedure). In contrast, the volume-outcomes hypothesis would suggest that selecting the surgeon who performs 40 valve replacements would lead to superior outcomes for patients. To the best of our knowledge, no study has described a statistical association between a surgeon’s degree of specialization in a specific procedure and patients’ mortality.


Of note, this is not what we’d expect from our lab work.  We find reps to be the most important.  Is there something we are missing in our work that captures some factor based on ‘specialization’?  Or is the study being influenced by other variables?
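The two hypotheses make opposite picks for the pair of surgeons in the quoted example, which is easy to spell out. A small sketch: the counts come from the quoted paragraph, while the class, field names and surgeon labels are placeholders invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Surgeon:
    name: str
    total_operations: int
    valve_replacements: int

    @property
    def specialization(self):
        """Share of this surgeon's operations that are the procedure of interest."""
        return self.valve_replacements / self.total_operations


# Counts taken from the quoted paragraph; the names are placeholders.
specialist = Surgeon("A", total_operations=20, valve_replacements=20)
generalist = Surgeon("B", total_operations=100, valve_replacements=40)
surgeons = [specialist, generalist]

pick_by_specialization = max(surgeons, key=lambda s: s.specialization)
pick_by_volume = max(surgeons, key=lambda s: s.valve_replacements)

if __name__ == "__main__":
    for s in surgeons:
        print(s.name, f"{s.specialization:.0%}", s.valve_replacements, "reps")
    print("specialization hypothesis picks", pick_by_specialization.name)
    print("volume (reps) hypothesis picks", pick_by_volume.name)
```

Surgeon A has half the reps of surgeon B on the procedure but is the one favored by the specialization hypothesis, so this is a case where a pure reps account and a specialization account genuinely disagree.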

Note the assumption in the text that specialization leads to better “muscle memory.”  We think that should derive from reps, but have often wondered about interference.  Interference among procedures would produce a specialization effect, but we’ve never observed interference among sequences learned together.  Another possibility is something like “rust” — perhaps less specialized surgeons have longer intervals between performances (we have seen something like that in our forgetting curves).

It’s also possible that these ‘unspecialized’ surgeons are simply performing too many surgeries and having fatigue or other state effects.  You wouldn’t be able to spot those effects without the analysis done here because practice would be improving performance and hiding any costs of doing too many.

Or maybe it is something else entirely…

Jul 15

Dark side of intuition — unconscious bias

The occasion is tragic, but I am happy to see some more public discussion of ‘unconscious bias’ in the context of recent events related to the police shootings of minority ‘suspects.’  I particularly like the title of this piece:

“A former officer explains why racist police violence occurs even when cops ‘aren’t racist’”

I think that is the right framing.  Telling people they are influenced by unconscious bias is important.  Telling people that makes them racist (or sexist, or perpetrating any other prejudice) does not seem to be helpful.

When I describe this to people interested in intuition, I like to make the connection between implicit learning and habits.  Your brain learns bad habits as easily as good habits.  So the neural processes that support skill learning and useful intuition can also easily pick up bias from the statistical structure of the environment and mislead your instincts and gut reactions.

I’d like to think our recent interest in research about the ‘meta-cognition’ of implicit knowledge will eventually be relevant here.  We have been exploring ideas for how to learn to be more sensitive to and to evaluate our intuitions (e.g., to detect when they are accurate and should drive decision making).  In theory, that line of research should eventually extend to how to detect when your bias is pushing you in the wrong direction as well.

May 10

10,000 hours

I ran into a few references/mentions recently of The Dan Plan, a guy who is dedicating a few years of his life to “testing the 10,000 hours hypothesis”.  Specifically, he quit his job and is playing golf full-time trying to reach a professional level of play from a starting point of never having played before.

It is an interesting project.  It moved me to write this note in an attempt to clarify what is meant by “testing the 10,000 hours hypothesis” because there are two sides to this.  The first is whether 10,000 hours is necessary.  I believe the first characterization of the 10,000 hour rule by K. Anders Ericsson was aimed at whether it was necessary, or whether a true ‘prodigy’ might achieve the level of being ‘internationally competitive’ based on pure, natural talent (I knew something about this idea from discussions at CMU, where I did my PhD a few years after he was a post-doc there, but I am working from memory here).

Evidence counter to the need for 10,000 hours would be spectacular achievement by very young, talented individuals.  However, I am not aware of good evidence of anybody reaching the level of ‘internationally competitive’ without putting in the hours, even if they are very young.

The other aspect is whether 10,000 hours is sufficient and about this there are definitely differing opinions.  The main alternate hypothesis is that to reach the top level, you need 10,000 hours *and* you need some innate talent.  This would imply you could put in 10,000 hours and still not be internationally competitive.  However, nobody runs this study because who wants to spend 10,000 hours and then still be mediocre.

My old friend Fernand Gobet (also from CMU) was a professional chess player for many years before leaving chess for Psychology and strongly asserts that there is too much evidence for talent being important to ignore.  He is pretty sure expertise comes from 10,000 hours + talent (and so would presumably be skeptical of the DanPlan).

In real life, people who put in 10,000 hours are probably responding to a lot of encouragement from trainers, teachers, and coaches who might also be picking up signs of some additional innate talent.  This is probably good for training but bad for science because if you only reach 10,000 hours if you already have talent, the variables are confounded and we can’t test the hypothesis.

In addition, Ericsson has also pointed out that it’s almost certainly not just any old 10,000 hours, but probably requires what he calls “deliberate practice” — which means 10,000 hours practicing the “right” things.  Sadly, we also don’t really know exactly what this is, even for golf.  So if the DanPlan doesn’t make pro level at 10k hours, he might simply not have gotten the right coaching (or he was missing the talent, we’ll never know).

My own guess is that 10,000 hours will get you pretty far on its own (I’m a learning researcher after all).  But I’m sympathetic to the idea that talent kicks in somewhere.  Maybe it distinguishes among those in the top 0.1%?  Or 1%?  Or is it even top 10%?
