Everybody Lies by Seth Stephens-Davidowitz

This book was a bestseller and reviewed/talked about a lot -- so there's a very good chance you're not hearing about it for the first time from me. In fact, this paperback edition is already a year old itself.

More importantly, Stephens-Davidowitz's central point -- that there are now large datasets, mostly around Internet usage, which can be used by social scientists and other researchers to get closer to the truth about what people really think and feel about taboo or contentious subjects -- might be news in a lot of circles, but not to anyone who's been paying attention for the last decade or so.

(Admittedly, a lot of people don't pay attention. People are the worst, as we can also learn from this book.)

So. We have the usual punchy, expansive title: Everybody Lies. And the equally usual descriptive subtitle that claims even more territory: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are. So far, so much like a million other non-fiction bestsellers and would-be bestsellers since bookselling became a regular racket. We expect a quick, punchy read that makes big claims in a lot of areas, backs up at least some of them at least some of the time, and gives us a few facts which we can use to sound smart at a cocktail party or on the Internet.

Everybody Lies is a bit better than that, actually, but it follows that model pretty closely. Again, if you're in any data-driven field, it won't be particularly shocking. (In Chapter 6, about two-thirds of the way through the book, Stephens-Davidowitz spends several pages explaining what A/B tests are -- I, and I hope every other marketing person currently in existence, have been doing A/B tests for probably a decade now. Not as often or as rigorously as I might like, true, but it's not a new concept for that many people, I hope.)

Stephens-Davidowitz (can I call him SSD from here, for short?) starts off with sex, because he is not at all stupid. He doesn't really note that one of the great precursors of this book is the occasional series of posts by the data scientists of (of all places) PornHub, delving into questions like whether porn viewing dips on Super Bowl Sunday and what the most popular kinds of entertainment are in different nations. But who ever wants to emphasize that other people have been doing the same thing, in more depth and sometimes better?

SSD was a data scientist for Google, and it seems that the best data he has to work with is still mostly from Google, so that informs what he's been looking at and researching. (Admittedly, I expect Google's would be the best Internet data anyone could have to work with in most cases, given the company's size and ubiquity.) I do wonder what a similar book by a Facebook expert would say -- SSD is mostly looking at individual behavior and attitudes, as seen by searches, and a Facebook-centric (or even just social-media-focused) project would be much more about social maps, how ideas spread, which ideas spread, and the contagiousness of various things. [1]

Everybody Lies starts out with sex and racism -- it is a book by an American, for Americans, after all -- and then moves on to less immediately juicy topics and then to general issues raised by the existence of these tools and research techniques, as it tries to cover everything a general reader might want to know about Big Data and its uses.

I don't want to be flippant, because SSD has a fairly rigorous academic background, and he's clearly brought that to his data-science work and the original research that underpins a lot of this book. A lot of what he's doing here is simplifying complex data-analysis concepts to explain them to a mass audience -- but that's what a mass audience needs. Everybody Lies does a good job both of summarizing what we can know about (mostly American) mass culture and attitudes from Internet data and of examining some particular examples of that data.

I personally would like a book with even more charts and detail, but that's me. This is probably more chart-heavy than the average reader wants to begin with.

[1] That book might exist -- let me know if anyone out there has read or seen it.

