Wednesday, October 26, 2016

Bad stats -- "good enough"

I'm reading (or really, listening to the audiobook during my commute) "Traffic: Why We Drive the Way We Do" by Tom Vanderbilt.  It is a pretty good book: well researched and surprisingly interesting, given how dry the material is.

This morning, he was discussing what is a "safe road."  In particular, he was talking about small, curvy roads and whether they are made safer when the turns are labeled with reflective posts and recommended speed signs.

Proper signage is optional
Research has shown that the average speed on unlabeled roads is lower than on labeled roads.  The proposed reason is that the unlabeled road is observably dangerous, so drivers are more careful and drive more slowly.  The labeled road lulls drivers into a false sense of security and therefore they drive faster.  His conclusion is that the labeled roads are therefore more dangerous, as evidenced by the higher average speed.

Here's the thing though... average speed doesn't matter!

What matters is the occurrence of a driver entering a turn with a speed that exceeds the combination of his driving ability, the character of the turn (radius, traction, etc.) and the performance of his car.  If the "fly off the road threshold" (or put another way, "bad enough") isn't exceeded, the driver keeps right on going, no matter how close he might have gotten.  The conclusion drawn by the author is therefore baseless.

Further, if we imagine that the presence of the signage increases the average speed but decreases the standard deviation (or more specifically tightens up the distribution in one way or another), then we're probably less likely to hit the bad-enough threshold and have an accident, even though average speeds are higher.  This seems believable, since better information means that people don't have to guess how fast to go through the curve, and they should converge on something similar to the posted, recommended speed.

made with @Risk by Palisade
In the little simulation I did here, we have two distributions of the speed at which drivers enter the turn.  The blue represents the "no signs" scenario, where the driver has no idea what they're in for and tends to drive slower.  The red represents the "good signs" scenario, where everybody knows what's coming.  I set the "bad enough" threshold at 40 mph.  You can see that 0% of the good-signs people crash, but 1.3% of the no-signs people crash.
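For anyone without @Risk, here is a minimal Python sketch of the same idea.  The post doesn't state the distribution parameters, so the normal distributions below (means of 30 and 34 mph, standard deviations of 4.5 and 1.5 mph) are my own assumptions, picked only so the tail fractions land near the quoted 1.3% and 0%.  Only the 40 mph threshold comes from the text above.

```python
import random
from statistics import NormalDist

THRESHOLD_MPH = 40.0  # the "bad enough" entry speed from the post

# Hypothetical parameters (NOT from the original simulation):
# no signs   -> slower on average, but widely spread
# good signs -> faster on average, but tightly clustered
NO_SIGNS = NormalDist(mu=30.0, sigma=4.5)
GOOD_SIGNS = NormalDist(mu=34.0, sigma=1.5)


def crash_probability(speeds, threshold=THRESHOLD_MPH):
    """Exact share of the distribution that lies above the threshold."""
    return 1.0 - speeds.cdf(threshold)


def simulate_crash_rate(speeds, n_drivers=100_000,
                        threshold=THRESHOLD_MPH, seed=0):
    """Monte Carlo estimate: draw entry speeds and count the ones over the threshold."""
    rng = random.Random(seed)
    crashes = sum(
        rng.gauss(speeds.mean, speeds.stdev) > threshold
        for _ in range(n_drivers)
    )
    return crashes / n_drivers


if __name__ == "__main__":
    for label, dist in (("no signs", NO_SIGNS), ("good signs", GOOD_SIGNS)):
        print(f"{label}: mean {dist.mean:.0f} mph, "
              f"exact crash share {crash_probability(dist):.4%}, "
              f"simulated {simulate_crash_rate(dist):.4%}")
```

The point is that the group with the higher average speed can still have the smaller tail beyond the threshold, because what drives the crash share is how many standard deviations the threshold sits above the mean, not the mean itself.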

There is an analogous concept in marketing, called the "good enough" line.  The idea is that people have some threshold for purchase that depends on certain quality vectors.  In a car, for example, a driver might have a minimum requirement for horsepower, seating capacity, gas mileage, etc.  If your car doesn't meet that threshold, that person won't buy it.  If you exceed that threshold, they won't pay any more for the car and you don't monetize the value.  It's all or nothing.  This concept applies to lots of things.  While the car crashes when it exceeds the threshold, the customer buys something when their desire for the product exceeds the threshold.

So here's the problem.  Statistical process control (SPC) is all about removing variability.  Mass marketing and homogeneity of product offering is the way of the world.  What are the actual differences between different Android phones, for example?  As product offerings move toward the red distribution (tightly clustered around the average), fewer and fewer outliers will exist in the tail, and fewer and fewer people will reach the purchase threshold for your product.
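The same tail arithmetic can be sketched for the purchase threshold.  Everything numeric here is a made-up illustration, not data: I assume each customer's desire for a product is a normally distributed "appeal score" with the same average of 50 for both products, and that a purchase happens only above a score of 70.

```python
from statistics import NormalDist

PURCHASE_THRESHOLD = 70.0  # hypothetical "good enough" score needed to buy

# Hypothetical appeal distributions with the SAME average appeal:
MIDDLE_PATH = NormalDist(mu=50.0, sigma=5.0)    # homogeneous, inoffensive product
POLARIZING = NormalDist(mu=50.0, sigma=15.0)    # divisive, "really new" product


def buyer_share(appeal, threshold=PURCHASE_THRESHOLD):
    """Share of customers whose desire for the product exceeds the threshold."""
    return 1.0 - appeal.cdf(threshold)


if __name__ == "__main__":
    print(f"middle path: {buyer_share(MIDDLE_PATH):.4%} of customers buy")
    print(f"polarizing:  {buyer_share(POLARIZING):.4%} of customers buy")
```

Under these assumptions the polarizing product is also disliked by many more people (its lower tail is just as fat), but only the upper tail produces sales, which is why the wider spread wins.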

So the takeaway is that new products should desperately avoid the middle path.  If you're going to put out something new, put out something really new.  Expect most people to dislike it.  What you need is a few people who will like it, and you won't get there if you don't produce the wildly unpredictable product.

Tuesday, October 18, 2016

Growth and ungrowth - forcing the vote

Over the past 30 years the US economy has grown.

from http://data.worldbank.org/
At the same time, middle class incomes have stayed flat.
from: http://www.advisorperspectives.com/


I found a really great breakdown and analysis of earnings trends at this website.  It's interesting to note that the sources here are not political sources or think tanks.  This is the US census and a couple of investment advisers that have aggregated and charted the census data.

At the same time, there has been considerable growth in the income of middle class women, and some minorities.

http://www.bls.gov/opub/ted/2013/ted_20131104.htm

The majority of middle class men have not been realizing these gains.  Additionally, many recent college grads are not experiencing these gains.  Link to wikipedia on income

There is a significant body of research on happiness that points to the concept that an individual's sense of happiness is a function of how they feel they compare to their self-described set of peers.  When a person looks around at their friends they do a natural, preconscious comparison that establishes happiness on a sliding scale.  In the case of the economy, the rule is similar.  White men are still better off than most demographic groups in the US, but when they measure themselves based on growth, they feel left behind (and in terms of growth, they are left behind).

Lastly, I'll leave this article from the NYT on the slide of the American middle class.

Now I will make a logically unsubstantiated leap here and posit that a vast majority of Donald Trump's supporters are accurately described by the above economic picture.  I believe that Trump's base is built on the frustrations of white men (and those who are close to them and care about them) who have legitimate economic grievances regarding their individual situation.

It is really a shame that Trump is their mouthpiece, however, because his root cause analysis is deeply flawed.  I don't think there are any credible economists out there who would blame illegal immigration for the trends I identified above.  Certainly there are no economists who would blame ISIS or gay rights or a woman's right to freedom from sexual assault or any of Trump's other defining positions.

The shame here is threefold:  a) he provides an unproductive distraction to the people who are getting screwed, and they're now getting screwed by him, too;  b) his angry rhetoric is producing a dangerous death spiral in the form of a reactionary populist echo chamber;  c) as a result, everybody who has been experiencing the gains in the economy assumes that those who are left behind are just racist crackpots.  The whole argument becomes illegitimate.

Ironically, the result is very similar to what you see with Black Lives Matter.  White men, who are not the subject of negative implicit bias and therefore are not systematically and consistently exposed to negative (and often lethal) encounters with the police, can't wrap their heads around the problem.  They look at the movement, find the worst in it, see some looters or rioters and decide the whole thing is bullshit.  Same story, different disenfranchised and marginalized set of experiences.

Tuesday, October 4, 2016

Metacognition, forecasts and estimating ability

tl;dr: Odds are pretty good that your opinions on everything are worse and more ill-informed than you think they are. Suspend judgment about the ability of others (positive or negative) because it is probably just an echo effect (they're brilliant if they agree with you and stupid if they don't). This is all especially true if you aren't an expert in the field under consideration. Don't apply this logic to others, apply it to yourself first.

Metacognition refers to the act of thinking about thinking. This is the same sort of concept as Freud's superego, or what phenomenologist philosophers call our consciousness. Kahneman might refer to it as "slow thinking".

In general, our brain has two methods for making a decision. By "decision" I mean just about any sort of conclusion about something. There is a fast method that runs ahead and draws its conclusions based on instinct, training, experience and mental shortcuts. The fast method is great for things that we are evolutionarily prepared for and for things where training is effective. Generally training is effective for things that require quick reflexes and that provide immediate, salient feedback. Fast thinking is gut feel and reflexes and is not something we are aware of consciously because it doesn't formulate in language -- it forms in action.

The slow method is what we think of as "thinking". It is logical, it handles new ideas and concepts. It is what we observe as "thought". It presents itself in our language and is shaped by the way our language works. Slow thinking spends a lot of its time confabulating a logical reason why the fast method came up with what it did. A lot of our thought is spent thinking about what we just did and justifying it to ourselves.

This is where our ability to forecast comes in. Most of the time, when asked about the future, people immediately have a gut feel. That gut feel is hugely biased (too many forms of bias to go into now). As long as we don't get immediate and obvious feedback that the forecast was wrong, our brain has mechanisms in place to pat itself on the back for another great forecast. Even if the forecast is terrible, if the feedback is subtle or slow, our brain still thinks it did a great job and our metacognition formulates a story to back it up.

Unfortunately, most of the forecasting situations we face in the modern world don't provide immediate and obvious feedback.

This is the reason that we're all bad at estimating our ability. We are best at identifying problems when we have expertise with a subject. The better we are at something, the more capable we are of identifying when we screw it up. Conversely, the worse we are at something, the worse we are at identifying how bad we are. We blithely go along, thinking we're pretty awesome at the stuff we're worst at. Those who are the best at things are also the most aware of their shortcomings.

There are lots of examples of this in the world. I guarantee that Tiger Woods is far more critical of his golf game than a typical weekend duffer. Proof-reading the English language is another great example--if you don't know the rules of grammar very well, then you won't be aware of your errors. Driving is another example where people are notoriously bad at identifying their own skill (except race-car drivers, who are typically very slow drivers on city streets).

So what?

Well, this whole thing applies to business, sure. I think what's really interesting is how much it applies to politics. Look at how many people who know nothing about economics are judging the economic policy pronouncements of the Presidential candidates. Look at how many people who know nothing about police training are hypothesizing about police officers' motives. Look at how many people who know nothing about life experience as a black American are denigrating that perspective. Science would tell us all those people are probably wrong, and atrabiliously wrong at that, because they are ignorant even of their own ignorance.