This month Biostatistics published online an open access article I co-authored with Dr. Laura Black from Montana State University: “Learning From Our GWAS Mistakes: From Experimental Design To Scientific Method.” The paper version is expected to come out in the April 2012 issue.
I’m hoping that you will take the time to read it. And I’m hoping you will violently agree or disagree with the contents of the article and post your thoughts to this blog.
This post expands on some of the themes of the paper, and continues my ongoing effort to examine the systems and paradigms in which we work and how we can improve or even replace them. I’m going to try to be more provocative here than in the paper itself. This post may insult you, but I’ll make sure to insult myself as well.
I’m going to go out on a precarious limb and claim, without any formal data collection, that 90+% of people who call themselves scientists have a vague and even incorrect understanding of the scientific method. And further, of those who understand the scientific method, 90+% routinely do not follow it in their scientific profession. And of the vanishingly small percentage who apply it in their scientific profession, almost none apply it in their daily life outside of their profession.
I claim that the paradigm of science as a means for understanding reality or pursuing truth, that is carried out through an iterative process of conjecture and refutation, has not been adopted by humanity in the main, nor even by the majority of those who call themselves scientists, and I include myself in that category. And I include you in that category.
Do you feel insulted? If you can falsify my claims, I’d like to sit at your feet and become your student. Show me that the vast majority of us are not, in practice, following a process of conjecture and then finding evidence to back up our conjectures.
It seems likely that those who truly could apply the scientific method as their primary thought process would outpace the rest of us in productivity and discovery. Richard Feynman comes to mind as someone who may have operated there — at least professionally.
Here is the rub: humans may not be wired to be scientists. Or as I’ll discuss later — perhaps we have the wiring as children but lose it along the way. We want certainty in order to act, and science (at least as formulated by Karl Popper) gives us only falsification, not proof or confidence. Yet society en masse seems to view science otherwise — is it only me who screams “oxymoron” each time another TV commercial uses the phrase “scientifically proven” to sell products?
So how come science is so successful if we are not following the scientific method individually? I’m not saying we never follow it. In fact I think as a collective, we may be following it more than we do individually; that is, conjecture and refutation happen as scientists compete with each other. We don’t like to disprove our own ideas, because being wrong sucks. So how it works is, someone else falsifies my hypothesis, and I in turn falsify theirs. However, the OODA loop speed of having someone else falsify one’s hypotheses can be very slow, particularly with the barriers to reproducible research touched on in an earlier blog post of mine.
Article High Points
The genesis of the Biostatistics article was my blog post, “Stop Ignoring Experimental Design (or my head will explode)” written in September 2010. The editors of Biostatistics asked me to write an editorial along the lines of the blog post, but to write for a broader audience. Along the way, I enlisted the help of a co-author, Dr. Laura Black, whom I highly respect in part for her insight into systems dynamics. As we began looking deeper into why bad or non-existent experimental design was so prevalent in the field of GWAS, we began to see the phenomenon as a symptom of deeper problems:
- That the scientific method is not well understood by scientists, let alone the public.
- That association and causation continue to be confused.
- That pernicious multiple testing practices continue to find new ways to manifest.
- That the scientific method runs into real problems when dealing with aggregated complex systems. We run into huge degrees of unmeasurable uncertainty when making comparisons between objects that are similar but different. That is, we run into a wall: ceteris paribus (“all things being equal”) is not even approximately valid with respect to the questions we are asking, which makes generating universal hypotheses about human health, for instance, problematic.
- That the scientific method is in conflict with the implicit metrics of academia.
- That bad or non-existent experimental design and the hunt for correlations divorced from causal falsifiable hypotheses are emergent properties of a system that sees mistakes as bad rather than as learning experiences.
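To make the multiple testing point in the list above concrete, here is a minimal sketch (mine, not from the paper). It assumes every test is independent and that no marker is truly associated, so each p-value is uniform on [0, 1]. The function names, the figure of 100,000 markers, and the use of a Bonferroni threshold are all illustrative assumptions.

```python
import random


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent null tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def count_false_positives(n_tests: int, threshold: float, seed: int = 0) -> int:
    """Simulate n_tests null tests and count p-values below the threshold.

    Under a true null hypothesis a p-value is uniform on [0, 1],
    so a uniform random draw stands in for each test (an assumption).
    """
    rng = random.Random(seed)
    return sum(rng.random() < threshold for _ in range(n_tests))


if __name__ == "__main__":
    alpha = 0.05
    n_snps = 100_000  # hypothetical marker count, none truly associated

    naive = count_false_positives(n_snps, alpha)
    bonferroni = count_false_positives(n_snps, alpha / n_snps)

    print(f"Family-wise error rate for 20 tests: {familywise_error_rate(alpha, 20):.3f}")
    print(f"Null markers called 'significant' at p < {alpha}: {naive}")
    print(f"Null markers called 'significant' after Bonferroni: {bonferroni}")
```

With nothing real to find, the uncorrected threshold still flags roughly 5% of the markers, which is the kind of "discovery" that fails to replicate. Bonferroni is only one of several possible corrections, and it is conservative when tests are correlated.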
Ceteris Paribus, Beer, and Longitudinal Prevention
I really wish I could have touched more on the ceteris paribus concept in the paper. Science has us constructing causal hypotheses of various degrees of universality. When we test hypotheses on atoms, molecules, enzymes, etc., we can consistently and repeatably create configurations of these that are, for all intents and purposes, identical. Thus, we can truly repeat experiments that we or others have performed in the past, and make generalizations that are likely to hold more universally.
In aggregated complex systems, as is the case when looking at whole organisms, and people in particular, who can’t be cloned like fruit flies, we run into difficulties. Despite there being no uniform units of “people”, we classify highly non-identical people into the same category, and then label them by disease categories that may also have disparate and heterogeneous causes, and then run experiments comparing them and wonder why we find reproducibility so elusive. We want to be able to make universal statements about unequal objects and then be able to extrapolate to additional disparate objects and make predictions about them. Isn’t it amazing if and when it ever works at all?
For example, in aggregate we may be confident that a beer commercial will move more people to buy beer than if the commercial were not played. Yet we cannot say whether a given person will buy beer.
We can say that, in aggregate, a drug will help more people on average than if the population were not exposed to the drug, but we cannot say whether a given drug will be good for you in particular (though the drug company can be sued anyway). The dichotomy between the doctor and the population geneticist is that the doctor treats a particular patient (first do no harm), not a population. Someone who works on public policy looks to what is good on average for a population. Unfortunately, what is good on average may have no bearing on what is good for someone in particular.
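The gap between "good on average" and "good for you" can be sketched with a toy simulation. Everything here is an assumption made for illustration: individual responses to a hypothetical drug are drawn from a normal distribution with mean +1.0 and spread 2.0 in arbitrary units, and the function names are my own. No real drug or dataset is implied.

```python
import random
from statistics import NormalDist, mean


def simulate_individual_effects(n_people, mean_effect, spread, seed=0):
    """Draw one hypothetical treatment effect per person (assumed normal)."""
    rng = random.Random(seed)
    return [rng.gauss(mean_effect, spread) for _ in range(n_people)]


def summarize(effects):
    """Return the average effect and the fraction of people who are harmed."""
    average = mean(effects)
    fraction_harmed = sum(e < 0 for e in effects) / len(effects)
    return average, fraction_harmed


if __name__ == "__main__":
    effects = simulate_individual_effects(100_000, mean_effect=1.0, spread=2.0)
    average, fraction_harmed = summarize(effects)

    # Exact value under the same assumed distribution, for comparison.
    expected_harmed = NormalDist(mu=1.0, sigma=2.0).cdf(0.0)

    print(f"Average effect across the population: {average:+.2f}")
    print(f"Fraction of individuals harmed (simulated): {fraction_harmed:.1%}")
    print(f"Fraction of individuals harmed (exact): {expected_harmed:.1%}")
```

Under these assumed numbers the drug helps on average, yet roughly three in ten individuals are worse off for taking it, and a population study alone does not tell you which group you are in.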
I continue to come to the place where I think we need to move to minimally invasive probe/test approaches that work on an individual person. The concept of longitudinal disease detection and prevention that I’ve touched on in two other blog posts (Never Let the Important Become the Urgent and Influencing the Global Dialog on Healthcare) hinges on the idea that we will have to face the fact that people are unique complex systems. Further, we will have to have temporal individual data collection, probing and perturbing an individual over time by small low-risk increments to understand cause and effect and work outside the paradigm of population-based studies to make personalized recommendations for health.
The sixth insight listed above was the most profound for me, and hit like a ton of bricks. How we treat ourselves with respect to mistakes may be the biggest determiner of our ability to be scientists. The paradigm that a mistake is “bad” may be what most holds us back from applying the scientific method. We often and unfortunately punish our children for making mistakes, even the first time they make a given mistake. I suspect this is not only an emergent property of human behavior, but also stems from a fundamental survival adaptation that may be universal to all organisms.
Organisms continually face the dilemma of acting/changing versus not acting/not changing. In order to survive, we need security and growth. Acting and changing is required for growth in a dynamic environment, but can jeopardize our security if we act with a wrong causal model and end up squandering resources that we would have otherwise retained if we hadn’t acted. Not changing or acting keeps us where we are and avoids making an error of commission, but also has us risking making an error of omission — of not responding better to our dynamically changing environment. Not changing or acting actually blocks growth of better mental models. Only by making errors — taking actions that do not lead to expected results — do we update our mental models with better models of action so we are more fit for survival in the future.
Kelvyn Youngman gave a brilliant presentation at the TOCICO 2011 international conference (a for-fee video version of his talk can be seen here) where he presented a key insight that the structure of a well-formed conflict comes in hierarchical form, which is consistent with the hierarchical organization of levels of thought complexity as described in the Requisite Organization body of knowledge. Without going into all the details, a well-formed conflict obeys a superset (A), set (C), subset (B) relationship with regard to the goal and the two necessary conditions. In the conflict above, security is a proper subset of growth, which in turn is a proper subset of survival:
Security ⊂ Growth ⊂ Survival
The implication for human behavior is that security trumps growth — we will not make decisions for growth if security is not sufficiently covered. Thus we punish our children for making mistakes, despite that being exactly how learning occurs.
Thus in putting security first, our culture is more likely to give a hall pass on errors of omission, though they can often have larger consequences than our errors of commission. So we either avoid taking actions with uncertain outcomes, or we put barriers in place to our mistakes being detected and exposed. Even in science, where it should be the opposite, if we are honest with ourselves, we are all too often seeking to show that we are right rather than trying to disprove our hypothesis. So while we have all heard that mistakes are to be used as learning experiences, and we understand that the scientific method, conjecture and refutation, is all about posing a hypothesis and trying to show that it is mistaken, we for the most part don’t act accordingly. Be honest: maybe Richard Feynman did this, but you and I probably don’t. And if we do it, it is mainly in a narrow area of professional application and not applied throughout our life.
So how do we change our paradigm? How do we mitigate our need for certainty for action or at least raise the bar on our tolerance for mistakes?
What is the reward? Being a Richard Feynman in our area of passion.
What is the downside? Making a big hypothesis that turns out to be false and getting dog-piled by the scientific establishment and losing the credibility that is necessary for sustained research funding? Wallowing in freakish misery forever?
How do we change a paradigm that has been wired into our system since childhood? Isn’t it ironic that the child mind is actually closest to the ideal? That is, the child has few preconceptions he/she feels compelled to defend and thus is most open to learning and change until we “beat it out of them.”
Has anyone really succeeded at this and would like to tell me how they did it? T. C. Chamberlin’s method of multiple working hypotheses, published back in 1890, is hardly in use today, and yet it provides a mechanism for being less attached to our hypotheses (see Chamberlin, T. C. (1890) The method of multiple working hypotheses. Science (old series) 15:92–96; reprinted 1965, 148:754–759). I want to be a scientist in the fullest sense, but I don’t know how to overcome my wiring. Maybe I want to be right too much. Maybe it is too late for you and me. I have young children and wonder if it is already too late for them. Do you have any suggestions?
…And that’s my 2 SNPs.