How The New York Times Lost Nate Silver
Apparently, there was a culture clash at the Grey Lady.
The New York Times’ Public Editor places at least some of the blame for losing Nate Silver on the fact that the newspaper’s culture never really made him feel welcome:
* I don’t think Nate Silver ever really fit into the Times culture and I think he was aware of that. He was, in a word, disruptive. Much like the Brad Pitt character in the movie “Moneyball” disrupted the old model of how to scout baseball players, Nate disrupted the traditional model of how to cover politics.
His entire probability-based way of looking at politics ran against the kind of political journalism that The Times specializes in: polling, the horse race, campaign coverage, analysis based on campaign-trail observation, and opinion writing, or “punditry,” as he put it, famously describing it as “fundamentally useless.” Of course, The Times is equally known for its in-depth and investigative reporting on politics.
His approach was to work against the narrative of politics – the “story” – and that made him always interesting to read. For me, both of these approaches have value and can live together just fine.
* A number of traditional and well-respected Times journalists disliked his work. The first time I wrote about him I suggested that print readers should have the same access to his writing that online readers were getting. I was surprised to quickly hear by e-mail from three high-profile Times political journalists, criticizing him and his work. They were also tough on me for seeming to endorse what he wrote, since I was suggesting that it get more visibility.
Many others, of course, in The Times’s newsroom did appreciate his work and the innovation (not to mention the traffic) that he brought, and liked his humility.
* The Times tried very hard to give him a lot of editorial help and a great platform. It bent over backward to do so, and this, too, disturbed some staff members. It was about to devote a significant number of staff positions to beefing up his presence into its own mini-department.
Well, the fact that the Times was being helpful to Silver is understandable. After all, by the time they had hired him after the 2008 election he had already built a name for himself with his uncannily accurate statistical projections, something that was entirely new at the time. The fact that it was not just new but seemingly so accurate made Silver an overnight star and the decision to move his entire “FiveThirtyEight” operation under the rubric of The New York Times was a big deal, not just in the newspaper business but also in the still burgeoning world of online political punditry and election forecasting. By the time the 2012 election rolled around, Silver’s site was accounting for the largest part of the traffic to the Times’ website. Given all of that, it made sense that Times management would be interested in making him happy. It’s also understandable why the more traditional political reporters at the paper might not be so thrilled about the sudden laurels being given to this “kid” who does math.
Sunday’s Politico Playbook provides some more behind the scenes information about Silver’s departure:
Silver had told The Times that he wanted to expand to weather, economics and anyplace else at The Times that had statistics and numbers he could bring to life. He had already begun doing that, with “Claims on I.R.S. Are Challenged By Probability,” which ran in the paper, as did an examination of Chief Justice John Roberts’s use of statistics, along with “Health Care Drives Increase in Government Spending” and “Congressional Proposal Could Create ‘Tax Bubble.'” In December, Silver had his first front-page story in the print paper.
Early this year, The Times laid out a plan that would give Silver a staff of six to 12 bloggers to focus on a variety of topics, modeled on Ezra Klein’s Wonkblog at The Washington Post. The plan was so specific that it named Megan Liberman, an up-and-coming deputy news editor at The Times, as Silver’s editor. As recently as last month, some executives at The Times were confident Silver would stay, mainly because they had given him everything he had asked for. Silver is very interested in prestige, and the prestige of The Times was a huge deal to him. But Silver, who first made his name with forecasts for Major League Baseball players, still loves sports. At times, he felt unwelcome in the Times Sports section, and seemed to struggle to fit into its culture. The section is among the most innovative at the paper, but not in the areas that are Silver’s wheelhouse.
There’s much more at the link about the negotiations with ESPN/Disney that led to the deal that Silver ended up accepting. What’s interesting about it all is that it makes one wonder if Silver hasn’t possibly bitten off more than he can chew. This is a guy who earned his chops with Sabermetrics in baseball and then with a statistical model for analyzing poll results in the political world that proved to be uncannily accurate. He’s obviously a very smart guy who understands more math than I’ve forgotten at this point. However, numbers and statistical analysis can only get you so far. As several people have pointed out, Silver’s Oscar predictions, which apparently are something he’ll be doing on a regular basis for ABC now as part of his contract given that ABC has the broadcast rights for those awards at least through 2020, haven’t really been any more accurate than those made by anyone else. And I’m not really sure how you can apply statistics to an analysis of the weather without having an in-depth understanding of the science of meteorology. So, it’s possible that Silver is biting off more than he can chew here. Of course, people have underestimated the guy before so I’m not going to say that I’m absolutely sure about that.
The one thing I am sure about, though, is that he’s probably going to need a lot of patience to work with Keith Olbermann on a regular basis. Good luck with that one, Nate.
Seems you are wrong.
Eh, I think Disney/ESPN/ABC could simply offer him more – not just more money, but more ways to dabble in this or that. Silver seems to like to dabble, because he gets bored if he focuses on one thing for too long.
Did Nate Silver say that because he actually meant it, or just to avoid burning bridges?
This (incorrect) attitude is why Silver is so “disruptive”. His entire career has been entrenched interests telling him this and then Silver showing them that, yes, the numbers and statistical analysis can get you that far.
Too many journalists practically wallow in their innumeracy, and Nate Silver was demonstrating how this was hurting their output. Which made him unpopular, because rather than accepting they have to learn to do their jobs better, it was easier for the old guard to attack him.
Tough saying…not knowing.
I do think posting a piece about someone without bothering with their take on the situation is pretty weak tea.
Especially from someone with a BOTH SIDES fetish.
Silver’s analysis is fascinating. But it leaves out the human factor, as his inability to accurately predict the winner of the Super Bowl or all but one of the Final Four in 2013 demonstrates quite well I think.
And you do not need contacts or control of the message to do what Nate Silver is doing. An enormous amount of what the major media outlets have been doing is about maintaining contacts and control of the message.
@Stormy Dragon: “This (incorrect) attitude is why Silver is so “disruptive”. His entire career has been entrenched interests telling him this and then Silver showing them that, yes, the numbers and statistical analysis can get you that far.”
Excellent point! And assuming that Nate is willing to work with experts in each field (note – ‘expert’, not just any pundit or columnist) he should be able to go far.
There’s also the driving force pushing him *away* from the NYT. If you read Sullivan’s post on Nate’s departure, note that she never mentions piddling little things like accuracy, truth, who got it right, and other such stuff that political coverage in the MSM doesn’t like. That’s a rather toxic environment.
Well, no, with this you demonstrate your own innumeracy. Silver won’t always predict the winner, but he will accurately predict the expected probability of a result. If he predicts that, say, the Patriots have a 70% chance of winning and they then lose, that doesn’t mean he’s wrong, it just means that the 30% chance has happened.
If, for example, I predict that a coin tossed 100 times will come up heads 50% of the time, and the first two tosses are tails, that doesn’t mean that my prediction is wrong, just that the sample set has not yet grown large enough.
Once again, and I can’t stress this enough: statistical analysis does not predict the result, it predicts the statistical probability of a result.
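The coin example above can be checked with a quick simulation. What follows is only a minimal sketch, not anything Silver is known to use; the function name `heads_frequency` and the fixed seed are assumptions made for illustration.

```python
import random

def heads_frequency(n_tosses, seed=0):
    """Fraction of heads in n_tosses simulated fair-coin tosses."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so runs repeat
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# Small samples wander; large samples settle near the predicted 50%.
for n in (2, 100, 10_000, 100_000):
    print(n, heads_frequency(n))
```

With only two tosses the observed frequency can easily be 0% or 100%; the 50% prediction only becomes visible as the number of tosses grows.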
So, essentially, by saying he’s only predicting probabilities you are saying he can never be disproven. That tends to make the analysis far less interesting to the average reader, I’d submit.
Like I said, Silver’s work is fascinating but the idea that he’s a guru who’s always going to be right strikes me as a form of epistemic closure.
@Rafer Janders: My last 4 dentists didn’t want me to chew gum, therefore my new dentist will recommend Dentyne.
Um, no. Again, I think the issue is that you simply do not seem to understand exactly how statistical analysis and probability works.
Beat that strawman, Doug! Beat ‘im good! By gum, you got ‘im on the ropes!
I think you really don’t understand.
He’s not a guru. That’s sort of the whole point.
Nobody is claiming perfect accuracy. Nobody is claiming there is zero “human element” (random chance/luck/whatever).
He’s disruptive because a bunch of political pundits (and sportswriters before them) conjure narratives with little or no basis in fact, and spew this stuff. They get paid. Along comes Silver, and shows that their narratives are built on bullshit. Not by being perfectly right (wasn’t Sam Wang’s system slightly more accurate?), but by being grounded in data.
An individual tournament, or one team’s entire season, is the sort of thing that no one can ever predict perfectly. It’s impossible. You can run projections in March and decide that the Yankees look to be a roughly 88-win team (plus or minus 5 wins), with downside risk due to age/injury. That doesn’t mean you’re saying “the Yankees will win 88 games.”
I think Josh Marshall has a good take on Nate’s methods:
“I don’t mean to diminish his feat. I just think people are focused on the wrong part. The fact that Silver’s numbers were so good at the very end is not that big a thing. Others came up with pretty much the same stuff. But as Silver would say himself, as his models converge on election day they give greater and greater weight to the actual polls and less and less to economic data, historical data and whatever else he figures into his system. So the fact his model pretty much called it on election day isn’t that big a thing to me; the fact that he pretty much called it six months or 9 months before, based on a system factoring in lots beside polls, is a much bigger one.”
Is it really at all surprising that the most pundit-y (in the negative sense) of the OTB front pagers is taking umbrage at his lack of understanding of statistical analysis?
1-) The Times should exempt at least some blogs from the paywall. Many opinion writers like to have a large audience, and I think that counted for Silver.
2-) The Times could not compete with the offers coming from corporations like Disney and Comcast. It’s as simple as that.
Which also demonstrates another aspect of his work that disturbs people. Humanity, particularly as you start considering larger and larger groups, is a lot more predictable than people like to admit. We like to think of ourselves as unique individuals, but the fact is that for each one of us there are probably hundreds, if not thousands, of virtually indistinguishable people out there.
When analyzing small groups like the 20 players in the final four, or the 22 players in a superbowl, individual differences can make a great difference. When you’re analyzing the millions of voters in a presidential election, we do behave in a disturbingly predictable manner.
I understand statistics quite well, people. You’re just taking one off hand comment regarding well-founded doubts about Silver’s ability to apply his model to areas outside sports and politics and blowing it way out of proportion.
“Is it really at all surprising that the most pundit-y (in the negative sense) of the OTB front pagers is taking umbrage at his lack of understanding of statistical analysis?”
And the one whose election posts focus extensively on marginal changes to the horse race, to the exclusion of actual discussion of a candidate’s actual positions, strengths and weaknesses.
The point is not that Nate is really good. The point is that most pundits are bad, and even if they are intelligent people many of them are simply going to say what their audience wants to hear. Besides that, polling is being improved and there is more information to deal with bad polls. Many analysts inaccurately thought that Kerry would win Florida in 2004, but the polling at the time was far less accurate than in 2012.
@Doug Mataconis: So, essentially, by saying he’s only predicting probabilities you are saying he can never be disproven. That tends to make the analysis far less interesting to the average reader, I’d submit.
But nothing is going to meet that metric; you’re asking for certainties in a field that’s about what’s probable. I agree that “most people” just want to know if they need to bring a dang umbrella with them, and telling them “there’s a 60% chance of precipitation” doesn’t really answer that question. But that is the closest you’ll get to a factual answer.
Doug, you’ve printed the New York Times’ Public Editor’s quote twice.
@Doug Mataconis: Stop digging.
For Doug and for all, just finished Silver’s The Signal and the Noise. I expected a boring memoir of his sports and political work. Far more to it than that. I’d almost say it should be shelved in Philosophy. I see a lot of references lately to Bayesian statistics. (Even to Bayesian quantum mechanics, how’s that for obscure?) Now, thanks to Nate, I sort of understand what “Bayesian” means. Actually a very good read, I highly recommend it.
I suspect there’s more to this story than has been published. Or maybe less. Maybe ESPN just offered more money.
@C. Clavin: I think that quote acknowledges there was a culture clash. He’s just saying that’s not why he left (which may be true or false, or somewhere in between).
William of Ockham has something to say about this.
(Hint: it’s all about the money)
You need to read some works on odds. “Against the Gods” by Peter L. Bernstein is a good one.
Basically the odds can be 10:1 against. You can be correct to take the 10 side. You can lose because the 1 came up.
That doesn’t make you wrong, it just means you lost.
Of course odds in complex situations can never be known.
That is one thing pundits don’t get. That’s why they trip over being “right” or “wrong” about elections. The only place odds can be known are in constrained environments, coin tosses, drawing colored balls from a bag, a roulette spin.
Actual horse races are less constrained, and political horse races are off the map.
(Stock market analysts who claim to be “right” because they “won” don’t fully grasp their domain either. The smart ones distrust their own past decisions, even the ones which paid off.)
and Doug’s first sentence says:
which seem somewhat at odds with each other.
It seems the very least a writer could do is include the subject’s point of view.
Doug, no one who “understands statistics quite well” would write something like “but it leaves out the human factor, as his inability to accurately predict the winner of the Super Bowl or all but one of the Final Four in 2013 demonstrates quite well I think.” As already noted, Silver’s analysis does not attempt to predict a winner with certainty; rather, it only lays out the statistical probability of a result happening or not. If you actually understood statistics or what Silver does then you wouldn’t have made that bizarre complaint.
Complaining that it leaves out “the human factor” is nonsensical, it’s like complaining that algebraic equations leave out the human factor. Of course it does, since the human factor has nothing whatsoever to do with it.
I agree to large extent, but there is a “volition factor” that goes beyond any “human factor.”
Any system containing agents which may make decisions is inherently more fuzzy than a rigid system driven by physics. A horse can be in a bad mood. That makes a horse race different than a coin toss in a fundamental way.
(In terms of sports and politics, we presume principal agents will keep trying to win, and that lesser agents won’t suffer great spontaneous whim. Apparently this is true for major campaigns. Mitt kept on trying to win, but for better or worse opinion about him could not change dramatically day to day.)
Note that the volition factor is the same one that ruins long term economic models.
“Animal spirits” matter and no one knows animal spirits, five years out, let alone twenty.
side note: I would expect the predicted results of a MLB or NBA season to be more accurate than the Final Four because the elimination rounds are a series of datapoints, rather than a single ‘winner take all’ event. In other words, a best 4 out of 7 or best 3 out of 5 can be predicted with a higher confidence level than a single Kansas vs Tennessee Elite 8 game.
Similarly, a presidential race has an enormous amount of polling data – data that simply does not exist for an Oscar race. So the prediction must be made at a much lower confidence level, which gives it less meaning.
Doug, your comments don’t indicate that you understand statistical inference or modeling at all. Your post certainly shows that you have no grounding in election forecasting or the 30 years of solid political science work that has predated Silver’s work.
Your comments smack of “John McCain is aware of all internet traditions.”
@Doug Mataconis: “I understand statistics quite well, people. You’re just taking one off hand comment regarding well-founded doubts about Silver’s ability to apply his model to areas outside sports and politics and blowing it way out of proportion. ”
Um, you’re now calling it off-hand.
@Doug Mataconis: “So, essentially, by saying he’s only predicting probabilities you are saying he can never be disproven. That tends to make the analysis far less interesting to the average reader, I’d submit.”
If you understood statistics, you’d know that your first statement is incorrect.
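One way to see why probabilistic forecasts can be disproven is to check a long run of them against outcomes. The sketch below is an illustration under stated assumptions, not a description of how anyone has actually graded Silver: the record of 100 forecasts is hypothetical, and `binom_tail_at_most` is a helper name invented here.

```python
from math import comb

def binom_tail_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): chance of k or fewer hits."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Hypothetical record: 100 independent forecasts, each naming a 70% favorite,
# in which only 50 of the favorites actually won.
p_value = binom_tail_at_most(50, 100, 0.7)
print(f"{p_value:.6f}")
```

If the stated 70% were accurate, a record that poor would be extremely unlikely, so a forecaster who kept producing it would have been shown to be miscalibrated. A single upset proves nothing; a pattern of them does.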
In many cases I wouldn’t expect the results to be better in any appreciable way, depending on how evenly matched the teams are. It’s been a long time since I studied statistics, but trying to determine which is the better team from a series of games is essentially the same problem as determining the bias of a coin with a series of tosses. It requires a lot of trials to determine which side is more likely to “win”, and this number gets extremely big extremely fast as the actual odds get closer to 50-50.
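The growth in required trials can be roughed out with a normal approximation. This is a back-of-the-envelope sketch, not an exact power calculation; the function name `tosses_needed` and the choice of z = 1.96 (roughly 95% confidence) are assumptions made for illustration.

```python
from math import ceil

def tosses_needed(p, z=1.96):
    """Rough number of tosses before a coin with heads probability p
    looks distinguishable from a fair coin at roughly 95% confidence.

    Uses the normal approximation: the standard error sqrt(p*(1-p)/n)
    must shrink below (p - 0.5) / z.
    """
    edge = abs(p - 0.5)
    if edge == 0:
        raise ValueError("a fair coin cannot be told apart from a fair coin")
    return ceil(z**2 * p * (1 - p) / edge**2)

for p in (0.60, 0.55, 0.51):
    print(p, tosses_needed(p))
```

Because the edge is squared in the denominator, halving the gap from 50-50 roughly quadruples the number of games needed, which is far more than any playoff series provides for closely matched teams.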
“trying to determine which is the better team from a series of games is essentially the same problem as determining the bias of a coin with a series of tosses.”
And if all the sabermetricians were doing is determining afterwards which is the best team, this would be true. They analyze statistics to predict which is the better team before they start playing.
@Ken: No, no. If you are predicting how two teams will match up, you’ll come up with a prediction like “The likelihood of Tennessee beating Kansas is 0.85 at a 95% confidence level.” Usually, that will be based on a Monte Carlo simulation, where all of the variables are fed in, and then the algorithm is run 10,000 times.
In other words, you are 95% confident that any given single occurrence (in this case, a single game) is going to be one of the 8,500 scenarios in which Tennessee wins. However, if, say, their starting center blows out his knee 10 minutes into the game, suddenly you’re in one of the 1,500 scenarios where Kansas wins. It doesn’t make the prediction wrong – it just means that in this one observable game, one of the 1,500 scenarios played out instead of one of the 8,500 scenarios.
The 4 of 7 series is more likely to match the predicted outcome because instead of your prediction being all or nothing, you can have one of those 1,500 scenarios where Kansas wins show up 3 times and still allow Tennessee to win their series.
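The single-game versus series comparison can be worked out directly. This is a minimal sketch that assumes every game is independent and that the favorite wins each one with the same probability (0.85, the figure used above); real series violate both assumptions to some degree. The function name `series_win_prob` is invented for illustration.

```python
from math import comb

def series_win_prob(p, wins_needed=4):
    """Probability of winning a first-to-`wins_needed` series when each
    game is won independently with probability p."""
    q = 1 - p
    # The series winner takes the final game; before it they have
    # wins_needed - 1 wins and k losses, in any order.
    return sum(comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
               for k in range(wins_needed))

print(round(series_win_prob(0.85, 1), 4))  # single game
print(round(series_win_prob(0.85), 4))     # best of 7
```

Under those assumptions an 85% single-game favorite wins a best-of-seven close to 99% of the time, which is why the series outcome tracks the prediction more reliably than any one game does.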
Whether the blown knee was “in” one of the scenarios really depends on whether you have a physical model of the system (as in the ballistic model for a cannon ball) or whether you have a statistical model (a mean and standard deviation for the last 100 shots).
In the statistical model, a blown cannon is not “in there.”
This all relates to my points above about physical and human systems. A weather model based on atmospheric physics is fundamentally different than a Social Security model proposing future employment and unemployment rates.
(A blown knee/cannon is what killed Long-Term Capital Management.)
A particular horse may be unexpectedly in a bad mood. If you have a million horses, you’re going to expect a roughly constant number of them to be in a bad mood. If an unusually large or unusually small number of horses are in bad moods, there is pretty much always a reason for this, which can be captured in the model.
Do you have a million horses?
(This was LTCM’s fundamental problem, they assumed that a certain number of years of market data were equivalent to a million market years. You make the same mistake.)
Let’s review, there are three types of models used to make predictions:
1) pure physical models, as in ballistics
2) pseudo physical models, with assumed values, as in Social Security
3) statistical models, as in sports or LTCM.
Of those types 2 and 3 are subject to black swans.
Put differently, you can never know by examining within your data set if your set is complete for events outside your data set. The words “past performance does not guarantee future results” are often spoken but often not fully internalized.
(Or, see also “horse betting systems”)
I read Sullivan’s original post and have been following the blogosphere’s reactions and – man, you pundits need to get out more. A new department is created in a mature organization and budget and head count is taken from other departments, and people within the organization take sides and fight for their share of the pie? And this is a sign of a crippling culture clash? It seems quite the opposite to me. Despite the entrenched opposition to big changes found in any organization, the NY Times created his group and fought hard to keep him even when a big-pocketed suitor showed up and whispered sweet-nothings in his ear.
Also, significantly, running 10 million Monte Carlo simulations based on 10 years of past market data does not make that 10 years of market data any better, or any more characteristic of future events.
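A small resampling experiment illustrates the point. This is only a sketch of the naive case, drawing with replacement from the observed history; the ten yearly returns are made-up numbers rather than real market data, and `resample_extremes` is a name invented here.

```python
import random

def resample_extremes(history, n_draws, seed=0):
    """Resample with replacement from `history` and return the
    (min, max) seen across all draws."""
    rng = random.Random(seed)  # fixed seed, an arbitrary choice
    draws = [rng.choice(history) for _ in range(n_draws)]
    return min(draws), max(draws)

# Ten hypothetical yearly returns (illustrative values only).
history = [0.08, 0.12, -0.03, 0.15, 0.05, 0.10, -0.01, 0.07, 0.20, 0.04]

low, high = resample_extremes(history, 100_000)
print(low, high)  # never outside the range of `history`
```

However many draws are taken, nothing worse than the worst observed year ever appears. A parametric simulation can produce values beyond the sample, but its parameters are still fitted to the same ten years, so the limitation moves rather than disappears.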
But physical models are often statistical models themselves. When modelling ballistics, to use your example, we don’t model every individual molecule of the atmosphere and how their collisions with the bullet affect its flight. We assume they can be replaced with a statistically determined “drag force”.
We should stop at the why of that. The reason gases can be treated as uniform is that to vanishing levels of detail one N2 molecule is like another N2 molecule. The sameness makes math easy, and makes long term sampling moot.
In contrast (pdf):
When you reach a once in 14,756 year event in the short term, something is probably deeply wrong with your assumptions about the mapping of collected data to future events.
People who give this 5 up-votes only get a B- on the test.
And actually uniform gases do not make a physical model into a statistical one, no. Fail.
The fact that a specific statistical model was flawed doesn’t mean all statistical models are necessarily flawed.
@john personna: “(A blown knee/cannon is what killed Long-Term Capital Management.) ”
From what I saw in the press, what killed them was that they were running highly leveraged bets, using data and models which assumed no really bad things would happen (their data set seemed to not include a recession!). When a crisis hit, the same leveraging put them deeply into the red.
@MarkedMan: “A new department is created in a mature organization and budget and head count is taken from other departments, and people within the organization take sides and fight for their share of the pie? And this is a sign of a crippling culture clash? ”
Yes, because the NYT clearly didn’t value ‘being right’ over the horserace. Read Ms. Sullivan’s column on Nate’s departure, and see if she *ever* mentions the raw fact that the NYT’s ‘experts’ blew it, compared to Nate.
You aren’t really grasping it. The problem, investigated mathematically by Mandelbrot and now Taleb, is not with the model, it is with the nature of the data.
The Misbehavior of Markets: A Fractal View of Financial Turbulence, by Benoit Mandelbrot
I can only speak to the simulations folks run for MLB (and by “folks” I mean bloggers I follow, not me). I don’t think they build in injury risk. I think a while back some of them did, but the simpler models often did better.
What some folks do is they set playing time expectations before feeding the roster into the Monte Carlo sim. So, taking the Yankees again, pre-season the expectation would be that several key players might only play partial seasons. If you were to run the sim with them all playing full time, you’d probably get a 95-win +/- 5 result. But when you use your knowledge of injury and age and you therefore assume that Mark Teixeira, Alex Rodriguez and Derek Jeter were all likely to miss time, you end up with 88 wins +/- 5. [in reality, the injuries have been far worse than anticipated, but until recently unexpectedly strong pitching had partially compensated. The bottom is just now in the process of falling out]
Silver does stuff like this too, as I understand it. His model didn’t simply aggregate poll data. It included some economic info and some state-specific special sauce.
@Rob in CT:
Gah, unclear. Let me explain. They build in injury risk not by giving a given player a % chance of injury, but by guesstimating playing time. Some do this in a “dumb” way: they look at the average # of games played by the player over the past 3 seasons and go with that, maybe deducting a few games for an age penalty. Others try to be “smart” and adjust for other known factors (so-and-so had a fluke injury at the age of 25 that cost him a whole year, from which he made a full recovery. Two years on, there is no reason to dock him for that. Versus: so-and-so is 35 and had off-season surgery we know from past experience is risky. Hmm, put him in for half a season).
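The “dumb” playing-time estimate described above might look something like this. It is a rough sketch only: the age cutoff, the per-year penalty and the function name `projected_games` are all assumptions invented for illustration, not values any projection system is known to use.

```python
def projected_games(games_last_3, age, age_cutoff=33,
                    penalty_per_year=5, season_length=162):
    """'Dumb' playing-time estimate: three-year average of games played,
    minus a flat penalty for each year of age past a cutoff.
    The cutoff and penalty are illustrative assumptions."""
    baseline = sum(games_last_3) / len(games_last_3)
    penalty = max(0, age - age_cutoff) * penalty_per_year
    # Clamp to a valid range: no negative games, no more than a full season.
    return max(0, min(season_length, round(baseline - penalty)))

# Hypothetical player: 150, 140 and 130 games over three seasons, now 35.
print(projected_games([150, 140, 130], age=35))
```

The “smart” version would replace the flat penalty with case-by-case adjustments, such as ignoring a fluke injury from years ago or cutting a player to half a season after risky surgery.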
@Barry: I get your point but I still don’t see the connection. The NYTimes fought to keep him. They lost the bidding war. How does that translate into them not getting it?