Confirmation bias
Mistaking data consistent with your hypothesis for data establishing your hypothesis is a surprisingly common mistake, even for highly trained, experienced scientists. The subjective experience is familiar: you develop and carry around a theory on some topic, and over the course of your day you run into evidence (anecdotes, or other scientific findings) that would be predicted by your theory. So you think, “hey, that fits too, my theory must be right.”
That’s fine for informal theory development, but it doesn’t work when you want to really test your hypothesis. After you identify data consistent with your theory, you need to figure out what that data actually rules out. The problem is that competing theories often make similar predictions, so data fitting your theory doesn’t prove the alternative theory wrong. This is why the process of doing science generally requires collecting new data that is carefully designed to discriminate between alternative theories. That requires both (a) figuring out the alternative theories and (b) designing a good experiment — and so it can be hard.
An interesting example that has been popping up again in popular discussion is the question of the genetic contribution to IQ. This is an area where everybody thinks they have a strong theory, and it’s remarkable how poorly almost everybody does at considering alternatives.
For example, suppose you think genes contribute strongly to IQ and you notice that IQ-related success seems to run in families, so you think “Aha! My theory is supported.” Maybe you’ll even correctly spot the null hypothesis, “IQ doesn’t run in families,” as being disconfirmed by these data. But the real alternative is “environmental factors determine IQ more than genes,” and since families largely share environments, your data don’t discriminate.
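To make that concrete, here is a minimal simulation sketch (all numbers are invented and this is not a real behavioral-genetics model): a “genes only” toy model and a “shared environment only” toy model both produce a sizeable parent–child IQ correlation, so the observation that IQ runs in families cannot discriminate between them.

```python
# Toy illustration: family resemblance alone can't separate genes from shared environment.
# All parameters are invented for illustration; this is not a behavioral-genetics model.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def genes_only():
    parent_g = rng.normal(0, 1, n)                               # parent's genetic value
    child_g = 0.5 * parent_g + rng.normal(0, np.sqrt(0.75), n)   # child inherits half
    return 100 + 15 * parent_g, 100 + 15 * child_g

def shared_environment_only():
    family_env = rng.normal(0, 1, n)                             # environment shared within the family
    parent_iq = 100 + 15 * (0.7 * family_env + 0.7 * rng.normal(0, 1, n))
    child_iq = 100 + 15 * (0.7 * family_env + 0.7 * rng.normal(0, 1, n))
    return parent_iq, child_iq

for name, model in [("genes only", genes_only), ("shared environment only", shared_environment_only)]:
    p, c = model()
    r = np.corrcoef(p, c)[0, 1]
    print(f"{name}: parent-child IQ correlation = {r:.2f}")
# Both models produce a positive correlation, so "IQ runs in families"
# is consistent with either theory and rules out neither.
```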
Note that this doesn’t mean we know which theory is correct. Both theories are consistent with the observed data and no strong conclusions can be drawn. I imagine this is very frustrating to non-scientists because you’d rather have a clear answer than a perpetual state of uncertainty. Scientists have to live with uncertainty all the time, and it can make it tricky to talk about your work — you want to make simple statements, but you don’t want to overstate your confidence.
The topic has come up because Charles Murray is once again in the news, happily going around talking about his theory that (a) genes are a major/primary determinant of IQ and (b) these genes vary substantially across race. If you know the actual science being done in this area, you know that neither of those statements is established to the point of ruling out plausible alternatives. Even setting aside the question of “what is measured in an IQ test,” we know for sure that genes have an effect, education has an effect, and there is increasing evidence of other non-education environmental effects (lead, stress, nutrition). Nobody knows the relative importance of these effects — and really careful thinkers are also well aware that the relative contributions change across samples within the population (e.g., nutrition effects don’t count for much in a sample that is well-nourished across the lifespan).
But if somebody like Murray presents the messy data, knowing that a lot of racist listeners are simply going to hear confirmation of their bias, and then makes no effort to argue for non-racist intervention policies to improve the environment (policies entirely supported by the data he presents), then the rest of us can confidently rule out the hypothesis that Murray isn’t a racist.
If you are really interested in the science in this area, there are much better people to be paying attention to:
https://www.vox.com/the-big-idea/2017/6/15/15797120/race-black-white-iq-response-critics
Explaining neuroscience
I ran across this link referenced by its title:
A neuroscientist explains a concept at five different levels
http://kottke.org/17/03/a-neuroscientist-explains-a-concept-at-five-different-levels
I was initially worried it would annoy me, but eventually decided to take a look anyway, figuring it would be interesting at the level of thinking about your audience when describing a complex scientific concept. On one hand, parts of it are better than expected. On the other, some of the interactions were a bit odd (the exchanges with the college and graduate students — but maybe it was hard to edit them down to the time allowed).
Overall, it’s a good micro example of choosing your language to be appropriate to what you expect your audience to know. I also noted that for the two youngest explainees, the scientist presented things to get a nice ‘wow’ response, which is probably a good memory aid (and generally good in teaching or explaining). For the other audiences, he did a lot more listening, which seems natural since there are a lot of different possible backgrounds for undergraduates and graduate students.
It was advertised as similar in spirit to the Feynman descriptions of how everyday things work. I don’t think it reaches that level, but those are pretty extraordinary, so it’s not really a fair standard.
Surgical skill
One potential application of our basic studies of skill learning is understanding how skill develops in performing surgery. So I was intrigued when I happened to stumble across the following report of factors predicting successful surgical outcomes:
Surgeon specialization and operative mortality in United States: retrospective analysis
BMJ 2016;354:i3571. doi: http://dx.doi.org/10.1136/bmj.i3571 (Published 21 July 2016)
http://www.bmj.com/content/354/bmj.i3571
The take home message was that ‘specialization’ may be as important as repetitions in successful skill performance. The following clipped paragraph captures many of the things I find fascinating here:
At the same time, the degree to which a surgeon specializes in a specific procedure may be as important as the number of times that he or she performs it. A surgeon who specializes in one operation may have better outcomes owing to muscle memory built from repetition, higher attention and faster recall as a result of less switching between different procedures, and knowledge transfer of outcomes for the same procedure performed in different patients. If this specialization hypothesis holds true, a surgeon performing 20 procedures of which all 20 are valve replacements (denoting 100% specialization in the procedure) would have lower operative mortality rates than a surgeon who performs 100 operations of which 40 are valve replacements (denoting 40% specialization in the procedure). In contrast, the volume-outcomes hypothesis would suggest that selecting the surgeon who performs 40 valve replacements would lead to superior outcomes for patients. To the best of our knowledge, no study has described a statistical association between a surgeon’s degree of specialization in a specific procedure and patients’ mortality.
Of note, this is not what we’d expect from our lab work. We find reps to be the most important factor. Is there something missing in our work that would capture some factor based on ‘specialization’? Or is the study being influenced by other variables?
Note the assumption in the text that specialization leads to better “muscle memory.” We think that should derive from reps, but have often wondered about interference. Interference among procedures would produce a specialization effect, but we’ve never observed interference among sequences learned together. Another possibility is something like “rust” — perhaps less specialized surgeons have longer intervals between performances (we have seen something like that in our forgetting curves).
It’s also possible that these ‘unspecialized’ surgeons are simply performing too many surgeries and having fatigue or other state effects. You wouldn’t be able to spot those effects without the analysis done here because practice would be improving performance and hiding any costs of doing too many.
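Here is a small sketch of that masking problem, with an entirely invented data-generating process (this is not the BMJ analysis): if skill grows with cumulative repetitions while a fatigue cost grows with weekly case volume, raw performance still correlates positively with volume, and only a model that includes both terms recovers the hidden cost.

```python
# Toy sketch (not the BMJ analysis): practice gains can mask fatigue costs
# unless both are modeled. All functional forms and numbers are invented.
import numpy as np

rng = np.random.default_rng(1)
n_surgeons = 2000

weekly_volume = rng.integers(5, 60, n_surgeons)                   # cases per week
total_reps = weekly_volume * rng.integers(50, 200, n_surgeons)    # career repetitions

# Hypothetical data-generating process:
#   skill rises with log(reps), a fatigue cost rises with weekly volume.
performance = 2.0 * np.log(total_reps) - 0.03 * weekly_volume + rng.normal(0, 1, n_surgeons)

# Naive view: performance still correlates positively with volume,
# because high-volume surgeons also accumulate more reps.
print("corr(performance, weekly volume):",
      round(np.corrcoef(performance, weekly_volume)[0, 1], 2))

# Joint model: regress on log(reps) and volume together to separate the effects.
X = np.column_stack([np.ones(n_surgeons), np.log(total_reps), weekly_volume])
beta, *_ = np.linalg.lstsq(X, performance, rcond=None)
print("estimated effect of log(reps):   ", round(beta[1], 2))    # ~ +2.0
print("estimated effect of weekly volume:", round(beta[2], 3))   # ~ -0.03 (the hidden cost)
```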
Or maybe it is something else entirely…
Dark side of intuition — unconscious bias
The occasion is tragic, but I am happy to see some more public discussion of ‘unconscious bias’ in the context of recent events related to the police shootings of minority ‘suspects.’ I particularly like the title of this piece:
“A former officer explains why racist police violence occurs even when cops ‘aren’t racist’”
I think that is the right framing. Telling people they are influenced by unconscious bias is important. Telling people that makes them racist (or sexist, or perpetrating any other prejudice) does not seem to be helpful.
When I describe this to people interested in intuition, I like to make the connection between implicit learning and habits. Your brain learns bad habits as easily as good habits. So the neural processes that support skill learning and useful intuition can also easily pick up bias from the statistical structure of the environment and mislead your instincts and gut reactions.
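As a deliberately abstract sketch of that last point (made-up groups and rates, nothing drawn from real data): a simple frequency-tallying learner exposed to a skewed sample of the environment ends up with a biased association even though the true base rates are identical.

```python
# Abstract sketch: a simple associative learner trained on a skewed sample of the
# environment ends up with a biased association. All numbers are invented.
import numpy as np

rng = np.random.default_rng(5)
n = 100_000

# Environment: events involving group A or group B; a rare negative outcome occurs
# at the SAME base rate (5%) in both groups.
group = rng.choice(["A", "B"], size=n)
negative = rng.random(n) < 0.05

# Biased exposure: negative events involving group B are far more likely to be noticed.
noticed = rng.random(n) < np.where((group == "B") & negative, 0.8, 0.2)
seen_group, seen_negative = group[noticed], negative[noticed]

# A frequency-tallying "habit" learner just tracks P(negative | group) in what it sees.
for g in ["A", "B"]:
    rate = np.mean(seen_negative[seen_group == g])
    print(f"learned association for group {g}: {rate:.3f}")
# True rates are identical (0.05), but the learned associations differ sharply,
# so intuitions trained on this input will be systematically biased.
```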
I’d like to think our recent interest in research about the ‘meta-cognition’ of implicit knowledge will eventually be relevant here. We have been exploring ideas for how to learn to be more sensitive to and to evaluate our intuitions (e.g., to detect when they are accurate and should drive decision making). In theory, that line of research should eventually extend to how to detect when your bias is pushing you in the wrong direction as well.
10,000 hours
I ran into a few references/mentions recently of The Dan Plan, a project in which a guy is dedicating a few years of his life to “testing the 10,000 hours hypothesis.” Specifically, he quit his job and is playing golf full-time, trying to reach a professional level of play from a starting point of never having played before (http://thedanplan.com/about/).
It is an interesting project. It moved me to write this note in an attempt to clarify what is meant by “testing the 10,000 hours hypothesis,” because there are two sides to it. The first is whether 10,000 hours is necessary. I believe the first characterization of the 10,000 hour rule by K. Anders Ericsson was aimed at whether it was necessary — or whether a true ‘prodigy’ might achieve the level of being ‘internationally competitive’ based on pure, natural talent (I knew something about this idea from discussions at CMU, where I did my PhD a few years after he was a post-doc there, but I am working from memory here).
Evidence against the need for 10,000 hours would be spectacular achievement by very young, talented individuals. However, I am not aware of good evidence of anybody reaching the level of ‘internationally competitive’ without putting in the hours — even if they are very young.
The other aspect is whether 10,000 hours is sufficient, and about this there are definitely differing opinions. The main alternative hypothesis is that to reach the top level you need 10,000 hours *and* some innate talent. This would imply you could put in 10,000 hours and still not be internationally competitive. However, nobody runs this study, because who wants to spend 10,000 hours and then still be mediocre?
My old friend Fernand Gobet (also from CMU) was a professional chess player for many years before leaving chess for psychology, and he strongly asserts that there is too much evidence for talent being important. He is pretty sure expertise comes from 10,000 hours plus talent (and so is presumably skeptical of the Dan Plan).
In real life, people who put in 10,000 hours are probably responding to a lot of encouragement from trainers, teachers, and coaches who might also be picking up signs of some additional innate talent. This is probably good for training but bad for science, because if you only reach 10,000 hours when you already have talent, the variables are confounded and we can’t test the hypothesis.
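A minimal sketch of that confound, with purely invented numbers: if persistence to 10,000 hours itself tracks early signs of talent, then nearly everyone who completes the hours already had above-average talent, and the data cannot separate hours from talent.

```python
# Toy sketch of the hours/talent confound; all numbers are invented for illustration.
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

talent = rng.normal(0, 1, n)

# Assumption: coaches' encouragement tracks talent, so the probability of
# persisting to 10,000 hours rises steeply with talent.
p_persist = 1 / (1 + np.exp(-3 * (talent - 1)))
completed_10k = rng.random(n) < p_persist

high_talent = talent > 0   # above-average talent
print("share of 10,000-hour completers with above-average talent:",
      round(np.mean(high_talent[completed_10k]), 2))
print("share of the whole population with above-average talent:  ",
      round(np.mean(high_talent), 2))
# Nearly all completers already had above-average talent, so observational data
# can't tell us whether the hours or the talent produced their later success.
```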
In addition, Ericsson has also pointed out that it’s almost certainly not just any old 10,000 hours; it probably requires what he calls “deliberate practice,” which means 10,000 hours practicing the “right” things. Sadly, we also don’t really know exactly what that is, even for golf. So if the Dan Plan’s golfer doesn’t make pro level at 10,000 hours, he might simply not have gotten the right coaching (or he was missing the talent — we’ll never know).
My own guess is that 10,000 hours will get you pretty far on its own (I’m a learning researcher, after all). But I’m sympathetic to the idea that talent kicks in somewhere. Maybe it distinguishes among those in the top 0.1%? Or 1%? Or even the top 10%?
Replicability and Ego (depletion)?
I’ve written/stated in a few places that the main problem with replicability in psychology and social science is simply that we don’t replicate enough. Participants are a precious resource that are time-consuming (and therefore expensive) to recruit and test. Any decision to replicate a study reflects a huge opportunity cost — you spend resources on an old idea instead of developing a new one.
A key part of my perspective is that research areas with large volumes of research are stable, and that accuracy concerns typically apply to individual studies rather than whole areas.
However, a claim that a large research area is in imminent collapse was brought to my attention:
The claim is that ego depletion may not actually exist. If so, this is potentially an important counter-example to the idea that a large volume of research establishes a finding.
Having worked with the concept of ego depletion, I would say that we would not be all that surprised that there is something misunderstood in this area. We found an effect of ego depletion on implicit learning, but also found the effects to be somewhat small and “slippery.”
I wouldn’t abandon the core idea just yet for two reasons:
- The “collapse” of the area is simply one large-scale failure to replicate. I think that non-replication study shows that a key paradigm doesn’t work as intended, but there are a lot of other paradigms and probably something is going on there.
- When/if a major research area is invalidated, it is likely to be due to conceptual/definitional issues. That is, the theoretical description embeds a flaw or overlooks key hidden factors that we have yet to discover. That process is a normal part of research and isn’t so much a “replication failure” as how science advances.
Probably there is something importantly “wrong” about current theories of ego depletion. But there is also something right about them that we haven’t nailed down yet.
It is difficult to get a man to intuit p-values when his h-index depends upon his not intuiting them
More cleverness from John Holbo at Crooked Timber, especially the title. There’s a lot to like in this piece and a few things that could be quibbled with. Probably unsurprisingly, I’m not fond of the accusation that psychologists in general are ‘omnicausal,’ as in we are all too vulnerable to crazy ideas about the causal structure of the world. But I’m not even familiar with the examples given (mainly quotes from statistical social science skeptic Andrew Gelman), and I suppose there may be enough ‘wacky’ claims out there to be off-putting.
The whole point of science is to uncover hidden causes that help us understand the world around us. In psychology the thing we are trying to understand is us, which is peculiarly difficult given that we are us. Implicit in all of this is how implicit much of our cognitive processing is. The so-called “replicability crisis” mostly just tells us that science is hard (I think), and that we should always rely on volumes of research and not any one study at p<.05 or any other statistical criterion.
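A back-of-the-envelope sketch of why (the prior, power, and alpha values are assumptions, not estimates for any particular field): under a modest base rate of true hypotheses and typical power, a single significant study leaves the conclusion close to a coin flip, while several independent significant studies make it nearly certain.

```python
# Toy sketch of why one p<.05 study is weak evidence but a volume of
# consistent studies is strong. Prior, power, and alpha are invented numbers.
import numpy as np

rng = np.random.default_rng(3)
n_hypotheses = 1_000_000
prior_true = 0.10      # assume 10% of tested hypotheses are actually true
power = 0.50           # assumed power of an individual study
alpha = 0.05

is_true = rng.random(n_hypotheses) < prior_true

def run_study():
    """Each study comes out 'significant' with prob = power if H is true, alpha if false."""
    p_sig = np.where(is_true, power, alpha)
    return rng.random(n_hypotheses) < p_sig

s1, s2, s3 = run_study(), run_study(), run_study()

one_hit = s1
three_hits = s1 & s2 & s3

print("P(effect is real | one study at p<.05):   ",
      round(np.mean(is_true[one_hit]), 2))      # roughly 0.5 under these assumptions
print("P(effect is real | three studies at p<.05):",
      round(np.mean(is_true[three_hits]), 3))   # close to 1
```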
Implicit sequence learning in ring-tailed lemurs (Lemur catta)
I think I’m not even going to explain why this is interesting to me beyond the obvious title and the fact that the senior author, Liz Brannon, is a childhood friend and now a distinguished researcher and professor at Duke.
Abstract
Implicit learning involves picking up information from the environment without explicit instruction or conscious awareness of the learning process. In nonhuman animals, conscious awareness is impossible to assess, so we define implicit learning as occurring when animals acquire information beyond what is required for successful task performance. While implicit learning has been documented in some nonhuman species, it has not been explored in prosimian primates. Here we ask whether ring-tailed lemurs (Lemur catta) learn sequential information implicitly. We tested lemurs in a modified version of the serial reaction time task on a touch screen computer. Lemurs were required to respond to any picture within a 2 × 2 grid of pictures immediately after its surrounding border flickered. Over 20 training sessions, both the locations and the identities of the images remained constant and response times gradually decreased. Subsequently, the locations and/or the identities of the images were disrupted. Response times indicated that the lemurs had learned the physical location sequence required in original training but did not learn the identity of the images. Our results reveal that ring-tailed lemurs can implicitly learn spatial sequences, and raise questions about which scenarios and evolutionary pressures give rise to perceptual versus motor-implicit sequence learning.
I do wonder about their definition of “implicit” in lemurs, though…
Models
I ran across this link to Paul Krugman being insightful and thoughtful about the general questions of “What is a model?” and “What do we use models for in science?”
It’s about economics and specifically models of development economics, but the general questions of methodology apply to social sciences more broadly.
It is in a way unfortunate that for many of us the image of a successful field of scientific endeavor is basic physics. The objective of the most basic physics is a complete description of what happens. In principle and apparently in practice, quantum mechanics gives a complete account of what goes on inside, say, a hydrogen atom. But most things we want to analyze, even in physical science, cannot be dealt with at that level of completeness. The only exact model of the global weather system is that system itself. Any model of that system is therefore to some degree a falsification: it leaves out some (many) aspects of reality.
How, then, does the meteorological researcher decide what to put into his model? And how does he decide whether his model is a good one? The answer to the first question is that the choice of model represents a mixture of judgement and compromise. The model must be something you know how to make — that is, you are constrained by your modeling techniques. And the model must be something you can construct given your resources — time, money, and patience are not unlimited. There may be a wide variety of models possible given those constraints; which one or ones you choose actually to build depends on educated guessing.
And how do you know that the model is good? It will never be right in the way that quantum electrodynamics is right. At a certain point you may be good enough at predicting that your results can be put to repeated practical use, like the giant weather-forecasting models that run on today’s supercomputers; in that case predictive success can be measured in terms of dollars and cents, and the improvement of models becomes a quantifiable matter. In the early stages of a complex science, however, the criterion for a good model is more subjective: it is a good model if it succeeds in explaining or rationalizing some of what you see in the world in a way that you might not have expected.
There is also a nice description of a “Dishpan model” by David Fultz as an example of a hyper-simplified model that illustrated some emergent properties useful for meteorology.
What resonates with me about Krugman’s description is a common interest in building the simplest, descriptive models that we hope illuminate underlying principles in complex processes. In Economics, particularly Macro, the scientific goal is to understand systems of unmanageable complexity (interactions among all the people and institutions that produce economic activity). In Neuroscience and Psychology, we attempt to understand the human brain, also a system of unmanageable complexity.
I also prefer simple models with a small handful of parameters to illustrate concepts, while having a lot of admiration and respect for modelers who take on the complexity of building up from individual neurons (each themselves having nearly unmanageable complexity, fwiw). The simple models also cannot be “right” in the same sense Krugman describes above, but they can account for some useful fraction of the variance we aim to explain and hopefully expose some deeper principles that might even eventually direct neural-level modeling.
There’s a good question on the other end of the complexity spectrum as well, about why it is worth even building simple models with a few parameters over and above simply making theoretical statements like “changing x causes a change in y.” Such theoretical statements are the bread and butter of standard approaches to Psychological Science, especially experimental work, but I’ll leave the answer as an exercise, perhaps to be tackled in my graduate seminar next time I teach modeling (hints: quantification and prediction are important).
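As one possible illustration of the quantification-and-prediction hint (made-up data, and assuming scipy is available): fitting a three-parameter power law of practice to response times both quantifies the bare statement “practice improves performance” and yields a concrete prediction for unseen trials.

```python
# Minimal sketch: instead of the bare statement "practice improves performance,"
# fit a three-parameter power law of practice, RT = a + b * trial**(-c), to
# made-up response-time data. The fit quantifies the effect and predicts future trials.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(4)

def power_law(trial, a, b, c):
    return a + b * trial ** (-c)

trials = np.arange(1, 101)
true_rt = power_law(trials, a=300.0, b=700.0, c=0.4)       # invented "true" parameters
observed = true_rt + rng.normal(0, 20, trials.size)        # noisy observations

params, _ = curve_fit(power_law, trials, observed, p0=[200, 500, 0.5])
a, b, c = params
print(f"fitted parameters: a={a:.0f} ms, b={b:.0f} ms, c={c:.2f}")
print(f"predicted RT at trial 500: {power_law(500, a, b, c):.0f} ms")
```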
Questions from a middle schooler about videogames
I was asked to answer some questions from a middle school student doing a research project on video games. Since I am interested in the topic generally, I should probably figure out how to answer these kinds of questions at an age-appropriate level. My attempt:
Jose asks:
1. Do video games affect the human brain? Do video games affect the way of thinking? Do video games damage the thinking part of the brain?
Yes, video games can affect your brain, like anything else that you do a lot of. However, these changes can sometimes be for the better. There is recent evidence of improvements in “visuospatial attention” (how you see the world) following video game play. There may also be changes for the worse, like increasing aggression, but these are not yet well understood.
2. Can video games improve people’s knowledge? Can they help people’s grades get better in school? Or can the[y] get bad grades?
Video games probably won’t help you in school very much. They can cause problems in schoolwork when kids play too many games and don’t keep up with homework and assignments. If you are getting your homework done, playing games won’t hurt and may actually help a little bit.
3. Can video games make people lose time? With friends and family? Time outside?
If you spend too much time on games and do not make time for friends, family, proper exercise and sleep, then that will very likely cause problems.
4. Can video games make people sick? Gain weight? Headaches or a tumor?
Some people report dizziness and nausea (upset stomach) from games that give you first person perspective. This is very likely related to the kind of motion sickness you can get when riding in a car. In rare cases, some people may react badly to flashing lights/sounds in video games. In general, games won’t make you sick. If you eat in an unhealthy way when playing videogames, that can lead to weight gain and other health problems.
5. Can video games make people addicted to what their mainly about? How do they do this? Why do people get addicted?
Gaming addiction is not well understood. Games aren’t addictive the way other things are (like cigarettes). However, there are certainly some people who have problems like in (2) and (3) above. They seem to play so much that it messes up a lot of other things in their life. That looks a lot like being addicted. It also can look like a lot of other problems that teenagers often run into — mood swings, depression, difficulty in relating to others. I do not think it is well known whether games can cause those problems or whether kids having those kinds of problems for another reason sometimes like to play a lot of videogames.
Thank you very much for your help.
You are welcome, Jose.