Monday, September 25, 2017

manipulating Flint’s water fallout

Academic studies are difficult to publish, so finding ways to optimize results can give a paper the best chance of seeing print.  It sometimes fails: we once helped get an Oxford paper retracted and have been asked to officially critique others, and at other times the outcome is mostly shaming academic scientists or corporations into changing their preferred metrics.  Finding a spurious result has always been easy.  If one throws just a few handfuls of random explanatory variables against a data pattern, seeing a p-value of, say, 7% is not interesting.  An event with 7% odds is expected about once in every 15 tries, so it would generally turn up with just 15 variables, or with a random experiment conducted 15 times.  So the game is set to create p-values that are more enticing.  1% “errors” will do.  Like the Princeton professor who insanely claimed Hillary Clinton had a >99% probability of winning.  In case you are wondering, there is no waitlist to take his class.

I have reviewed academic papers myself on a number of occasions, including as one of the editors of a prestigious statistics journal.  What we have with the Flint lead water study seems to fall along the lines of the climate scandals (here, here) that have come to pass, where reasonable differences in the data, perhaps a trend, have been overly manufactured and falsified based on greed, thereby casting doubt on so much beyond that.  This site will showcase what happened with the Flint water data, and how easy it is to deceive the general public into believing that results are far more statistically alarming than they actually are.
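The arithmetic behind the 15-variable remark can be sketched in a few lines.  This is only an illustration and assumes the tests are independent and that there is no real effect, so each p-value is uniformly distributed; the function names are made up for this example.

```python
def prob_at_least_one(alpha: float, n_tests: int) -> float:
    """Chance that at least one of n independent null tests shows p <= alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_hits(alpha: float, n_tests: int) -> float:
    """Average number of null tests that show p <= alpha."""
    return alpha * n_tests


if __name__ == "__main__":
    for alpha in (0.07, 0.01):
        print(
            f"alpha={alpha:.2f}, 15 tests: "
            f"expected hits={expected_hits(alpha, 15):.2f}, "
            f"P(at least one)={prob_at_least_one(alpha, 15):.3f}"
        )
```

Under these assumptions, 15 tries at the 7% level produce about one "hit" on average, and at least one hit roughly two times out of three.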

First, enjoy this witless response from a Washington Post nut, received when we first noted that the results were hacked by altering the conception target date:

And that was in response to the summary illustration below.


The Washington Post journalist is a journalist for just this reason: he chimes in as a globally knowledgeable person, yet knows nothing.  He can’t read academic papers, nor think critically through advanced math topics.  He just sends not-so-cute tweets.

Issue 1: where is the unadjusted top?
Let’s keep an eye on the raw data and see what’s been done to manipulate it.  Per chart B1 on page 46 of the paper, the conception cut-off date was January 2014 (more on that below).  Meanwhile, the nearest peak within a few months on either side of that date comes earlier, in November 2013.

An unadjusted peak in November 2013 is awkward for the research hypothesis.  And if we expand the window beyond a few months in either direction, we get additional local peaks in June and July of 2013.

Here’s the game.  Within a 7-month window on either side of January 2014 (a date that is itself, again, a few months before Flint’s switch to leaded water), we get three peaks 2, 6, and 7 months prior (November, July, and June of 2013, respectively).  Now the adjustment sport begins: retarget a mathematical post-water date that instead lands several months earlier, say October 2013.

Issue 2: counting to less than 9
As noted above, the paper makes an arbitrary assignment of when to begin capturing the “post-water” data.  The water was switched in late April 2014.  Also, as noted in the page 11 footnote, countless babies are born in an earlier calendar month (or months) than would be expected assuming a full 9-month pregnancy.  We take all of this into account below:

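| Conception date | Months unexposed | Expected exposure (months) |
| --- | --- | --- |
| August 2013 | >8 | |
| September 2013 | >7 | 1 |
| October 2013 | >6 | 2 |
| November 2013 | >5 | 3 |
| December 2013 | >4 | 4 |
| January 2014 | >3 | 5 |
| February 2014 | >2 | 6 |
| March 2014 | >1 | 7 |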

Per the table above, if one subjectively requires at least 3 months of exposure, then what is the acceptable conception date?  It is December 2013, the earliest month where there is an adverse fertility impact.  But with just a flip of the switch, the paper overrules even this earlier December 2013 date and goes straight to November 2013 without explanation, collecting a larger sample size as it disingenuously moves a little closer to what we state is the favorite cut-off date: October 2013.

But then once more the paper plays tricks, and in other sections discusses the cut-off date in the lazy shorthand of “around October 2013”.  This black magic is unacceptable.  It slyly moves the goal post by 2 precious months.

Look again at the table above and see what they are suggesting so far: that with less than 2 of the final pregnancy months exposed to the water, we could create an entire fertility effect!
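The month counting behind the "months unexposed" figures can be sketched as follows.  This is only a rough illustration: it works at whole-month granularity, treats April 2014 as the switch month, and assumes a full 9-month term, so it gives an upper bound on exposure only.  Earlier births, as in the page 11 footnote, would lower the actual exposure.  The function and constant names are made up for this example.

```python
from datetime import date

SWITCH = date(2014, 4, 1)   # water switched in late April 2014 (month granularity)
FULL_TERM_MONTHS = 9        # assumption: full-term pregnancy


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def exposure_bounds(conception: date) -> tuple[int, int]:
    """Return (unexposed, bound): more than `unexposed` months passed before
    the switch, so fewer than `bound` months of a full term were exposed."""
    unexposed = months_between(conception, SWITCH)
    bound = max(FULL_TERM_MONTHS - unexposed, 0)
    return unexposed, bound


if __name__ == "__main__":
    conceptions = [date(2013, m, 1) for m in range(8, 13)]
    conceptions += [date(2014, m, 1) for m in range(1, 4)]
    for conception in conceptions:
        unexposed, bound = exposure_bounds(conception)
        print(f"{conception:%B %Y}: unexposed > {unexposed}, exposed < {bound}")
```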

Issue 3: the enchanted 13-month average
To capture more of these 3 peaks, it helps to have a window wide enough to smooth in as many of them as possible, and then to set that as the cut-off.  Here the flexibility of “around October 2013” lets the 13-month average chart (the chart above that has now gone viral) set its window to cut off the data precisely at the final November 2013 peak.  This deceptively maximizes the visual before-versus-after contrast around the false May 2013 vertical line shown (May 2013 is November 2013 minus the six-month half-window).

Still not done being covetous, they ignore May 2013 and show April 2013 on the chart instead, to juice it up.  Perhaps the Washington Post journalist didn’t read all that.  Also bear in mind that 13-month averaging gives additional weight to the outer months by design (the first and last months of the window being the only two of the 13 that represent the same calendar month).  And in this case, that month works out perfectly to just include the latest peak fertility month (November 2013)!
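The window arithmetic can be checked with a short sketch.  It assumes a centered 13-month average (six months on either side of the plotted month) and involves no fertility data, only calendar months; the helper names are made up for this example.

```python
from collections import Counter


def add_months(year: int, month: int, delta: int) -> tuple[int, int]:
    """Shift a (year, month) pair by delta months."""
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1


def centered_window(year: int, month: int, half_width: int = 6) -> list[tuple[int, int]]:
    """All (year, month) pairs in a centered window of 2 * half_width + 1 months."""
    return [add_months(year, month, d) for d in range(-half_width, half_width + 1)]


window = centered_window(2013, 5)                    # window centered on May 2013
calendar_counts = Counter(month for _, month in window)

if __name__ == "__main__":
    print("first month:", window[0], "last month:", window[-1])
    print("months in window:", len(window))
    print("calendar months counted twice:",
          [m for m, c in calendar_counts.items() if c == 2])
```

A window centered on May 2013 runs from November 2012 through November 2013, so November is the one calendar month counted twice, and the November 2013 peak sits at the very edge of the window.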


Flint's insignificant conclusions
Now let’s examine some of the impacts of these changes; you’ll note that they are quite striking.

| | Pre-water fertility | Post-water fertility | Difference |
| --- | --- | --- | --- |
| Reality: January 2014 cut-off | 66 (n=12) | 54 (n=12) | -12 |
| Fudged: November 2013 cut-off | 68 (n=12) | 56 (n=12) | -14 |

The bottom line is that a 12-month window before and after has its statistical significance halved (e.g., p-value doubled) through this paper’s layered and subjective mathematical modifications.  Another aspect of this, beyond the fact that other variables such as fetal deaths were not as lead-impacted, or that they removed the economically strong 2007 data as having “outlier low” conceived births, or continued to cherry-pick other Michigan cities that didn’t fit into their hypothesis, is that the sample of home pipes they chose for the study was associated with a rebound in fertility rates.

So fertility rates staged a strong resurgence on their own, even though the public was still unaware that lead was, and remained, in the water.  See the table below and the jump from a fertility rate of 53 to 59.  Notwithstanding the sample sizes we have, of course this information never made it into the Flint study, for another reason: it runs completely counter to the ideas the authors want to put forward.

| Period | Number of months | Fertility rate |
| --- | --- | --- |
| March 2013 - July 2013 | 5 | 65 |
| August 2013 - December 2013 | 5 | 68 |
| January 2014 - April 2014 | 4 | 54 |
| May 2014 - September 2014 | 5 | 53 |
| October 2014 - March 2015 | 6 | 59 |
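The figures in the table above can be pooled with a simple month-weighted average.  This is only a sketch: it splits the periods at the January 2014 cut-off, which gives 10 months before and 15 after rather than the 12-month windows used earlier, so it is not expected to reproduce those numbers exactly.  The variable and function names are made up for this example.

```python
# (period, number of months, fertility rate), copied from the table above
buckets = [
    ("March 2013 - July 2013", 5, 65),
    ("August 2013 - December 2013", 5, 68),
    ("January 2014 - April 2014", 4, 54),
    ("May 2014 - September 2014", 5, 53),
    ("October 2014 - March 2015", 6, 59),
]


def weighted_rate(rows):
    """Month-weighted average fertility rate across the given periods."""
    total_months = sum(months for _, months, _ in rows)
    return sum(months * rate for _, months, rate in rows) / total_months


before = weighted_rate(buckets[:2])            # periods before January 2014
after = weighted_rate(buckets[2:])             # periods from January 2014 onward
rebound = buckets[4][2] / buckets[3][2] - 1    # relative jump from 53 to 59

if __name__ == "__main__":
    print(f"before: {before:.1f}, after: {after:.1f}, rebound: {rebound:.1%}")
```

On these inputs, the jump from 53 to 59 works out to a rebound of a little over 11%.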
