Consider how creative you might be when you have a stack of economic variables as big as a phonebook. The once-famous "leading indicator" of economic performance, for instance, was the winner of the Super Bowl. From Super Bowl I in 1967 through Super Bowl XXXI in 1997, the stock market gained an average of 14% over the rest of the year when a team from the original National Football League (NFL) won the game. But it fell almost 10% when a team from the original American Football League (AFL) won instead.
Through 1997, this indicator had correctly "predicted" the direction of the stock market in 28 of 31 years... It was just a coincidence, of course. And eventually, the indicator began to perform badly.
From Chapter 6, "How to Drown in Three Feet of Water", in The Signal and the Noise, by Nate Silver.
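How surprising is a 28-of-31 streak, really? A quick back-of-the-envelope sketch (assuming, purely for illustration, that each year's market direction is a fair coin flip and any one candidate indicator is independent of it):

```python
from math import comb

n, hits = 31, 28
# P(a single coin-flip indicator matches the market's direction in >= 28 of 31 years)
p_single = sum(comb(n, k) for k in range(hits, n + 1)) / 2 ** n
print(f"p_single = {p_single:.2e}")

# But screen a phonebook's worth of candidate indicators and such streaks appear:
candidates = 1_000_000  # hypothetical number of series dredged through
print(f"expected matches = {candidates * p_single:.1f}")
```

For any one pre-specified indicator the streak would be astonishing; across a million candidate series, a couple of such "uncanny" indicators are expected by chance alone, which is exactly Silver's point about data dredging.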
Meanwhile, back in Paris, they have been talking about rats that are fed GM corn and whether patterns in hundreds of measurements mean anything (see the L'Affair Seralini tag for copious background discussion):
HAUT CONSEIL DES BIOTECHNOLOGIES, COMITE SCIENTIFIQUE
Paris, le 19 octobre 2012 AVIS en réponse à la saisine du 24 septembre 2012 relative à l’article de Séralini et al. (Food and Chemical Toxicology, 2012).
2. Unjustified selection of the results presented:
In a context where the results are not supported by statistical analysis, choosing to present isolated results from a multi-factorial experimental study gives those analyses a dimension that has no scientific value. Without justification, the authors chose to present the results of the four biochemical and two hormonal parameters showing the strongest variation compared to the control (Figure 5B). This choice was made retrospectively. One can obviously expect there to be differences among the 864 (*) comparisons made by the authors on the values of the 48 biochemical parameters at 15 months of the study. The presentation of selected results can therefore be misleading for a reader not familiar with multiple comparisons, who will wrongly conclude that the observed differences are representative of differences between the experimental and control groups.

(*) 864 is the number of comparisons made between the data obtained for the treated groups at the 15th month of the study (18 experimental batches x 48 biochemical parameters = 864) and the corresponding data from the control groups.
Original French:
2. Sélection injustifiée des résultats présentés :
Dans un contexte où les résultats ne sont pas appuyés par l’analyse statistique, choisir d’isoler des résultats dans la présentation d’un travail expérimental multi-factoriel conduit à donner une dimension à ces analyses qui n‘a aucune valeur scientifique. Sans justification, les auteurs ont choisi de présenter les résultats de quatre paramètres biochimiques et deux paramètres hormonaux qui présenteraient la plus forte variation par rapport au témoin (Figure 5B). Ce choix a été effectué a posteriori. On peut évidemment s'attendre à ce qu'il y ait des différences parmi les 864 (*) comparaisons effectuées par les auteurs pour les valeurs des 48 paramètres biochimiques au 15ème mois de l’étude. La présentation de résultats sélectionnés peut alors être trompeuse pour un lecteur non spécialiste des comparaisons multiples, qui conclura à tort que les différences observées sont représentatives de différences entre groupes expérimentaux et groupes témoin.

(*) 864 est le nombre de comparaisons effectuées entre les données obtenues pour les lots traités au 15ème mois de l’étude (18 lots expérimentaux x 48 paramètres biochimiques = 864) et les données correspondantes des groupes témoin.
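The HCB's arithmetic on false positives is easy to reproduce. A minimal sketch, assuming 864 independent comparisons at the conventional 5% significance level:

```python
m, alpha = 864, 0.05  # 864 null comparisons, 5% significance level per test

# Expected number of spurious "significant" differences under the null
expected_false_positives = m * alpha          # about 43 of the 864

# Probability of at least one false positive across all comparisons
p_at_least_one = 1 - (1 - alpha) ** m         # indistinguishable from 1.0

# A Bonferroni-style correction would instead demand p < alpha / m per test
bonferroni_threshold = alpha / m

print(expected_false_positives, p_at_least_one, f"{bonferroni_threshold:.1e}")
```

In other words, roughly forty "differences from control" are expected even if the treatment does nothing at all, so cherry-picking the six most extreme parameters after the fact is guaranteed to produce impressive-looking figures.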
Another example:
Appendix 4 of the above HCB French report (English translation)
English translation version by Kevin Bleakley, Inria Saclay
6. Biochemical parameters
In the study, 47 biochemical parameters were measured and one additional parameter was calculated for each of the 20 groups. For each sex and for each experimental condition, an “Orthogonal Partial Least Squares Discriminant Analysis” (OPLS-DA) method was implemented to discriminate between the control and experimental group.
The OPLS-DA method is frequently used in chemometrics and genomics for identifying a subset of variables which can best separate different groups. It is particularly pertinent when the number of explanatory variables is large with respect to the number of observations. Furthermore, the method allows construction of a predictive model which, for a given set of explanatory variables, can output the probability of belonging to each of the groups under consideration.
The choice of this method and its use by the authors in the present context deserves several comments:
1. This type of method is well known for “overfitting” the observed data when the number of explanatory variables is large with respect to the number of observations (which is the case here). Indeed, it is always possible to find a model defined by 48 parameters that perfectly separates 2 groups of 10 subjects, no matter what the groups are! To validate the obtained model (i.e., to ensure that it possesses good predictive properties), one can use:
- an independent test set, which helps to ensure that the model fitted to the training set retains good predictive properties on new data which has not been previously used to fit the model.
- cross-validation methods (essentially, different subsets of the data are successively used to play the role of training and test set).
Original French version
4. Paramètres biochimiques
47 paramètres biochimiques ont été mesurés et un paramètre supplémentaire a été calculé lors de l’étude auprès de chacun des 20 groupes. Pour chaque sexe, pour chaque condition expérimentale, une méthode de type « Orthogonal Partial Least Squares Discriminant Analysis » (OPLS-DA) est mise en oeuvre pour discriminer le groupe témoin et le groupe expérimental.
La méthode OPLS-DA est largement utilisée en chimiométrie ou en génomique pour identifier un sous-ensemble de variables qui différencient au mieux différents sous-groupes. Elle est tout particulièrement pertinente lorsque le nombre de variables explicatives est grand devant le nombre d’observations. De plus, la méthode OPLS-DA permet de construire un modèle prédictif, qui, pour un jeu de variables explicatives donné, fournit les probabilités d’appartenir à chacun des sous-groupes considérés.
Le choix de cette méthode et son utilisation par les auteurs dans ce contexte appellent plusieurs commentaires :
1. Ce type de méthode est réputé pour « sur-ajuster » les données observées lorsque le nombre de variables explicatives est grand devant le nombre d’observations (ce qui est le cas ici). En effet, on pourra toujours trouver un modèle défini par 48 paramètres qui séparera parfaitement 2 groupes de 10 sujets, et ce, quels que soient les groupes ! Pour valider le modèle obtenu (i.e. s’assurer qu’il possède de bonnes propriétés prédictives), on peut utiliser :
- un échantillon test afin de s’assurer que le modèle ajusté sur un échantillon d’apprentissage conserve de bonnes propriétés prédictives sur de nouvelles données qui n’ont précisément pas été utilisées pour la construction du modèle,
- des méthodes de validation croisées (ce qui revient à faire jouer aux différentes données alternativement le rôle d’échantillon d’apprentissage et d’échantillon test).
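The committee's claim that 48 parameters will always perfectly separate two groups of 10 can be demonstrated numerically. The sketch below uses a plain minimum-norm linear classifier as a stand-in for OPLS-DA (an assumption for illustration; the qualitative overfitting behaviour with more variables than observations is the same), and shows leave-one-out cross-validation exposing the overfit:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 48                        # 10 treated + 10 controls, 48 parameters
X = rng.normal(size=(n, p))          # pure noise: no real group difference
y = np.array([1.0] * 10 + [-1.0] * 10)

def fit(X, y):
    # minimum-norm least-squares classifier; with p > n it interpolates the labels
    return np.linalg.lstsq(X, y, rcond=None)[0]

w = fit(X, y)
train_acc = np.mean(np.sign(X @ w) == y)   # perfect in-sample separation (1.0)

# Leave-one-out cross-validation: refit without each sample, predict it
correct = 0
for i in range(n):
    mask = np.arange(n) != i
    w_i = fit(X[mask], y[mask])
    correct += np.sign(X[i] @ w_i) == y[i]
loo_acc = correct / n                       # hovers near chance level (~0.5)

print(train_acc, loo_acc)
```

The model "discriminates" the groups flawlessly on the data it was fitted to, yet predicts held-out animals no better than a coin flip — precisely the failure mode the HCB says must be ruled out with an independent test set or cross-validation.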
Nate Silver would be happy with this fine French illustration of just one of the ideas in his book.
For those considering reading the book (available through Amazon for US$9.99 on Kindle), here is a good review:
"Nate Silver's The Signal and the Noise is The Soul of a New Machine for the 21st century."
—Rachel Maddow, author of Drift
Nate Silver built an innovative system for predicting baseball performance, predicted the 2008 election within a hair’s breadth, and became a national sensation as a blogger—all by the time he was thirty. The New York Times now publishes FiveThirtyEight.com, where Silver is one of the nation’s most influential political forecasters.
Drawing on his own groundbreaking work, Silver examines the world of prediction, investigating how we can distinguish a true signal from a universe of noisy data. Most predictions fail, often at great cost to society, because most of us have a poor understanding of probability and uncertainty. Both experts and laypeople mistake more confident predictions for more accurate ones. But overconfidence is often the reason for failure. If our appreciation of uncertainty improves, our predictions can get better too. This is the “prediction paradox”: The more humility we have about our ability to make predictions, the more successful we can be in planning for the future.
In keeping with his own aim to seek truth from data, Silver visits the most successful forecasters in a range of areas, from hurricanes to baseball, from the poker table to the stock market, from Capitol Hill to the NBA. He explains and evaluates how these forecasters think and what bonds they share. What lies behind their success? Are they good—or just lucky? What patterns have they unraveled? And are their forecasts really right? He explores unanticipated commonalities and exposes unexpected juxtapositions. And sometimes, it is not so much how good a prediction is in an absolute sense that matters but how good it is relative to the competition. In other cases, prediction is still a very rudimentary—and dangerous—science.
Silver observes that the most accurate forecasters tend to have a superior command of probability, and they tend to be both humble and hardworking. They distinguish the predictable from the unpredictable, and they notice a thousand little details that lead them closer to the truth. Because of their appreciation of probability, they can distinguish the signal from the noise.
With everything from the health of the global economy to our ability to fight terrorism dependent on the quality of our predictions, Nate Silver’s insights are an essential read.
See also earlier GMO Pundit posts
- False-Positive Psychology
- GMO Statistics Part 17. Correlation is not causation, even for chocolate-eating Swedes.