What is True?

Written by:  Joseph Brean/Wendell Krossa/Herb Sorensen

How one man got away with mass fraud by saying ‘trust me, it’s science’

Joseph Brean  Dec 30, 2011 – 3:43 PM ET | Last Updated: Dec 30, 2011 6:18 PM ET

When news broke this year that Diederik Stapel, a prominent Dutch social psychologist, was faking his results on dozens of experiments, the fallout was swift, brutal and global.

Science and Nature, the world’s top chroniclers of science, were forced to retract papers that had received wide popular attention, including one that seemed to link messiness with racism, because “disordered contexts (such as litter or a broken-up sidewalk and an abandoned bicycle) indeed promote stereotyping and discrimination.”

As a result, some of Prof. Stapel’s junior colleagues lost their entire publication output; Tilburg University launched a criminal fraud case; Prof. Stapel himself returned his PhD and sought mental health care; and the entire field of social psychology — in which human behaviour is statistically analyzed — fell under a pall of suspicion.

One of the great unanswered questions about the Stapel affair, however, is how he got away with such blatant number-fudging, especially in a discipline that claims to be chock full of intellectual safeguards, from peer review to replication by competitive colleagues. How can proper science go so wrong?

The answer, according to a growing number of statistical skeptics, is that without release of raw data and methodology, this kind of research amounts to little more than “‘trust me’ science,” in which intentional fraud and unintentional bias remain hidden behind the numbers. Only the illusion of significance remains.

S. Stanley Young and Alan Karr of the U.S. National Institute of Statistical Sciences, for example, point to several shocking published claims that were not borne out by the data on which they were based, including coffee as a cause of pancreatic cancer, Type A personality as a cause of heart attacks, and eating breakfast cereal as increasing the odds that a woman will give birth to a boy.

“The more startling the claim, the better,” they wrote in a recent issue of the journal Significance. “These results are published in peer-reviewed journals, and frequently make news headlines as well. They seem solid. They are based on observation, on scientific method, and on statistics. But something is going wrong. There is now enough evidence to say what many have long thought: that any claim coming from an observational study is most likely to be wrong – wrong in the sense that it will not replicate if tested rigorously.”

Victor Ivrii, a University of Toronto math professor, described the problem similarly on his blog: “While Theoretical Statistics is (mainly) a decent albeit rather boring mathematical discipline (Probability Theory is much more exciting), so called Applied Statistics is in its big part a whore. Finding dependence (true or false) opens exciting financing opportunities and since the true dependence is a rare commodity many “scientists” investigate the false ones.”

“If jumping to wrong conclusions brings a scorn of colleagues and a shame, they will be cautious. But this does not happen these days,” Prof. Ivrii said in an email. “Finding that eating cereals does not affect your cardio [for example] brings neither fame nor money, but discovering that there is some connection allows you to apply for a grant to investigate this dependence.”

Science, at its most basic, is the effort to prove new ideas wrong. The more startling the idea, the stronger the urge to disprove it, as was illustrated last month when European physicists seemed to have seen particles travel faster than light, prompting a massive effort to replicate (or, more likely, debunk) such a shocking result.

Although science properly gets credit for discovery and progress, falsifiable hypotheses are its true currency, and when scientists fail to disprove a false hypothesis, they are left with a false positive.

Technically known as the incorrect rejection of the null hypothesis, a false positive is “perhaps the most costly error” a scientist can make, according to a trio of leading American researchers who this fall published “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant.”

At worst, correcting a false positive can cost lives and fortunes. At best, it is a distraction. In the paper, the authors argued that modern academic psychologists have so much flexibility with numbers that they can literally prove anything. False positivism, so to speak, has gone rogue.

By seeming to prove, through widely accepted statistical methods, an effect that could not possibly be real, the authors vividly illustrated the problem. In “many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not,” they wrote.
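The kind of undisclosed flexibility the paper describes can be illustrated with a small simulation of my own (a sketch, not the authors' actual procedure): a researcher who keeps checking a perfectly fair coin for "significance" and stops the moment the test crosses the 5% threshold will report an effect far more often than 5% of the time.

```python
import random

random.seed(0)

def looks_significant(heads, n, z_crit=1.96):
    # Two-sided z-test against the null hypothesis of a fair coin.
    sd = (n * 0.25) ** 0.5
    return abs(heads - n * 0.5) / sd > z_crit

def experiment_with_peeking(max_n=200, step=10):
    # The coin is fair, so any "significant" result is a false positive.
    # The researcher peeks after every `step` flips and stops early
    # as soon as the test crosses the threshold.
    heads = 0
    for n in range(step, max_n + 1, step):
        heads += sum(random.random() < 0.5 for _ in range(step))
        if looks_significant(heads, n):
            return True
    return False

trials = 2000
false_pos_rate = sum(experiment_with_peeking() for _ in range(trials)) / trials
print(false_pos_rate)  # well above the nominal 5% level
```

Each individual test is perfectly valid; it is the undisclosed freedom to stop whenever the numbers look good that quietly multiplies the error rate.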

Psychology has often been especially insecure as an objective science, far removed from its roots in philosophy. It strikes this pose with statistics, which seem objective, but can be interpreted either well or poorly. With nothing else but theory to fall back on, psychology is particularly vulnerable to the illusion of significance.

Critics point to the prevalence of data dredging, in which computers look for any effect in a massive pool of data, rather than testing a specific hypothesis. But another important factor is the role of the media in hyping counter-intuitive studies, coupled with the academic imperative of “publish or perish,” and the natural human bias toward positive findings — to show an effect rather than confirm its absence.
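A toy example of why dredging misleads, assuming nothing beyond a standard two-sided z-test: screen enough pure-noise variables at the 5% level and some will look significant by chance alone.

```python
import random

random.seed(42)

def looks_significant(heads, flips, z_crit=1.96):
    # Two-sided z-test against the null hypothesis of a fair coin.
    mean = flips * 0.5
    sd = (flips * 0.25) ** 0.5
    return abs(heads - mean) / sd > z_crit

# "Dredge" 100 variables that are pure noise: no real effect anywhere.
flips = 1000
hits = 0
for _ in range(100):
    heads = sum(random.random() < 0.5 for _ in range(flips))
    if looks_significant(heads, flips):
        hits += 1

print(hits)  # a handful of pure-noise variables clear the 5% bar by chance
```

Report only the hits, without mentioning the hundred variables screened, and random noise reads as discovery.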

Even in the Stapel case, his exposure as a fraud was covered less extensively than some of his bogus claims.

In a much cited article in the New Yorker last year, which highlighted problems in the scientific method, Jonah Lehrer wrote: “Just because an idea is true doesn’t mean it can be proved. And just because an idea can be proved doesn’t mean it’s true. When the experiments are done, we still have to choose what to believe.”

In their paper, Mr. Young and Mr. Karr shoot down this radical notion that science is a matter of personal choice, and proof a luxury. They point out that the example used in the magazine — the “decline effect,” seen in studies of paranormal extra-sensory perception, in which an initially high success rate drops off steeply, later explained by the statistical concept of regression to the mean — was simply “wrong and therefore should not be expected to replicate.”
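Regression to the mean is easy to demonstrate with a sketch of the ESP case (a hypothetical setup, not the original studies): select the subjects who scored best on a first round of pure-chance guessing, and their retest average falls back toward chance.

```python
import random

random.seed(1)

def score():
    # One "ESP" session: guess 100 fair coin flips; chance level is 50.
    return sum(random.random() < 0.5 for _ in range(100))

# Each subject is tested twice; both rounds are pure chance.
subjects = [(score(), score()) for _ in range(1000)]

# Select the apparent stars of round 1, then look at their retest.
stars = [(r1, r2) for r1, r2 in subjects if r1 >= 57]
round1_avg = sum(r1 for r1, r2 in stars) / len(stars)
round2_avg = sum(r2 for r1, r2 in stars) / len(stars)

print(round1_avg, round2_avg)
# Round 1 looks well above chance; round 2 falls back toward 50.
```

The "decline" requires no fading psychic power: the stars were selected precisely because luck favoured them once, and luck does not repeat on demand.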

It might not be the most exciting topic for the faculty lounge, and reporters might ignore it, but at the end of the day, a debunked false claim remains one of the highest achievements in science.

National Post

Comment by Wendell Krossa

With all the outrageous claims and counter-claims presented publicly, how do we go about discerning what is true? So much conventional wisdom is based on anecdotal evidence, or on statistics from who knows where, manipulated who knows how. Researchers like Julian Simon have pointed us to some valuable safeguards for getting a truer picture of the actual state of the world: for instance, look at the entire overall picture of a situation or issue, and look at the longest-term trends. Bjorn Lomborg added to this by urging people to go to the best sources of data; on forests and agriculture, for instance, the FAO is recognized as a world-leading source. But even these organizations are stacked with alarmists who ignore the data to push alarmist agendas.

In the end it comes down to one’s worldview and this is a very emotional thing. People look for what affirms their previously held ideas and feelings and ignore what disproves them. We look for what we want to focus on and ignore what undermines our positions. And we interpret data flexibly to prove or disprove what we choose. So the safeguards of science are important to follow to correct these very human tendencies.

I appreciated the emphasis above on the null hypothesis, which any truly honest person should engage with as part of their research program (the falsification of hypotheses).

The more startling claims made publicly get media exposure and, as Simon noted, run across the planet and scare everyone; it takes a lot of money and follow-up effort to disprove them, and by then most people are not listening anymore, anyway. Their minds are already made up, so don’t bother them with facts.

An example of anecdotal evidence is the very public pictures and stories of polar bears in trouble. Alarmists present that famous picture of a polar bear floating on a small ice floe and tell us we are in danger of losing this beautiful animal to melting ice. What they don’t tell you is that in the 1950s polar bear population estimates were about 5,000, while today the population is around 20,000 to 25,000, and that of the 11 subpopulations, 8 are thriving and growing. They don’t tell you that polar bears have survived past complete melts of Arctic ice and much warmer temperatures than any experienced today. They ignore the overall picture and the long term. But anecdote evokes emotion and wins over supporters and their bucks, even if it is outright lying.

Comments by Herb Sorensen

This is a subject that is very near and dear to my heart. I heartily endorse some of the author’s comments, and am less enthusiastic about others. Specifically, I am absolutely committed to observation as the FOUNDATION of science, and consider the hypothesis approach to be flawed, or at least inadequate, in very many cases. I wrote about this recently for something that I will be posting and distributing in the next couple of weeks. Briefly:

First, a brief excursion into the fundamentals of science, to ground our approach. The foundation of science is observation and measurement. As individuals, we are all subjective, but the world around us is objective, that is, us subjects can share in observing the objects about us. What any one can observe, all can observe. This is the foundation of scientific knowledge. For us to share our knowledge, as well as to expedite our own thinking, we must name objects, and organize things into categories. So we begin with naming five entities, and will proceed with the same approach in naming the component parts of the store.

Next, we are mindful of Lord Kelvin’s admonition that: “If you cannot express your knowledge in numbers, it is of a meager and unsatisfactory sort!” In addition to counting the entities, we will count both the money and the shoppers’ time involved in global retailing, as well as in the various areas of individual stores. For money it is $14 trillion annually; and for time it is something like a quadrillion seconds.

I have spent the past ten years bringing the full force of science to bear on retailing and shopping, a subject where very few real scientists have paid attention, and those who have, have largely looked at a huge subject down a very narrow pipe.  As a scientist, I have always preferred to study subjects that are largely “uninvestigated,” at least from the quantitative approach that I favor.  However, I believe before any hypothesis is formulated, the truth must first be intuited on the basis of massive, organized, unfocused observation.  That is, let reality sweep over you in wave after wave, marinating in “the truth,” as it were.

I realize that in certain branches of science this is more difficult than in others, but any time behavior is involved, this is ideal. I also have serious reservations about “laboratory” behavioral studies, considering them to be limited to proving only what happens in laboratory conditions. Rather, I favor “natural” research, studying phenomena in their native environment, where they REALLY happen.

Some of the worst and most outrageous frauds on science, even by strictly by-the-book researchers, are hypothesis-driven analyses of laboratory data. Another is massive analytical engines processing gargantuan data supplies, looking for truths. As has been noted, an explosion in a typesetting shop could produce the works of Shakespeare. But the resulting meaning would not derive from the explosion, but from the reader already knowing Shakespeare, and saying, “Aha, here’s a perfect rhymed couplet.” The findings from massive analytical engines too often fit into this category: fools looking for the future in the entrails of slaughtered animals, and finding a truth that is not there, but is some reflection of their own thought patterns.

I’ve outlined my own approach in that brief quote above. But to put a little more context to it, my research is ordinarily of the Jim Bridger or Lewis & Clark variety: first to survey a vast domain. This means that I owe some debt to those who will follow, to accurately record what I am seeing, and to first create the paradigm through which I see the world, to aid those who follow in seeing the same world. I create maps, and try to understand the relations among the animals, forests, streams, etc. With enough observation, I know that nature may have hidden some things from me, and I will produce “Picasso” images distorted by my own perceptions, but in the end, posterity will judge the usefulness of the maps and plans I have produced. I am not the most careful or accurate surveyor. But my work is “good enough” to serve as a reliable guide to future, more accurate and in-depth work by others to come.