The Business Rusch: Lies, Damned Lies, and Statistics
Kristine Kathryn Rusch
The quote in my title comes from Mark Twain’s autobiography. Twain said:
“Figures often beguile me, particularly when I have the arranging of them myself; in which case the remark attributed to Disraeli would often apply with justice and force: ‘There are three kinds of lies: lies, damned lies and statistics.’”
The problem with Twain’s attribution, however, is that no scholar can find anything in Disraeli’s papers that even resembles it. (Yes, scholars have that kind of time on their hands.) The website twainquotes.com cites an 1895 article by Leonard H. Courtney in which the quote first appeared—or so everyone thinks.
I find it hilarious that the source of this quote about statistics is almost impossible to track down. I also find it funny that Twain’s preface to the quote has gotten lost in the pithiness of the “lies, damned lies, and statistics.”
“Figures often beguile me,” he wrote, “particularly when I have the arranging of them myself.”
And thus, Mark Twain, who died in 1910, has poked at the heart of modern publishing. We all love statistics – or figures, as he calls them – but they prove nothing. In fact, this year, statistical analysis is harder than ever.
You’d think it would be easier. We have computers, after all. We have incredible processing speeds and more information at our fingertips than ever before. We can “crunch” the numbers quickly and easily.
The problem is in which numbers we crunch.
Let’s take, for example, the number of e-book sales versus the number of print book sales. We’re seeing a lot of statistics about the percentage of e-books in the marketplace. And those statistics come from reputable organizations.
I felt uncomfortable about those statistics at the end of 2011, and I feel even more uncomfortable about them now. These statistics purport to examine all books sold, and I know that’s not true. I also know that there are equations that supposedly take a statistical sample and apply it to information not yet gathered (or information that’s impossible to gather). And even though I know the mathematical model is accepted, I’m still uncomfortable.
You see the mathematical model in polling all the time. Pollsters contact 1,000 or 10,000 or 100,000 sufficiently diverse people, poll them, and then use them as a statistical sample that supposedly represents the entire population. This same technique takes place in medical studies. Studies gather information from 50 to 500 to 5,000 people, gauge their reactions to, say, a medication over a period of time, and then use those as a basis for the result.
People who watch medical studies, for example, generally ignore the ones with fewer than 100 participants, and really believe the ones with tens of thousands of participants. And if those tens of thousands were studied over years, then the medical study is considered even more accurate than the one that follows someone’s reaction to a treatment or a medication over a few hours.
See why Mark Twain insisted that he liked figures if he arranged them himself? Or to put it in 2012 language: he liked statistics if he manipulated the information himself.
One of the first things I learned as a journalist, back in high school of all places, was how to look for statistical manipulation. “Four out of five dentists surveyed” might mean that five dentists were surveyed, and four of them (the ones who worked for the company) liked the product. Or it might mean that four out of five dentists in a survey that contacted 10,000 dentists (none of whom worked for the company) liked the product.
Both statements would be true. Four out of five dentists liked the product. But only one statement might be information that a consumer might benefit from.
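To put rough numbers on the difference, here is a minimal sketch that computes a 95% Wilson score interval for each version of the dentist claim. The figure of 8,000 out of 10,000 is my own stand-in for "four out of five" in the large survey, and the function name is made up for illustration. Note that this only measures uncertainty from sample size; it says nothing about the other problem, which is that the five dentists might work for the company.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    z=1.96 is the usual value for 95% confidence. The function name
    and signature are illustrative, not from any library.
    """
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width

# Four out of five dentists, when only five were asked.
small_survey = wilson_interval(4, 5)

# Four out of five dentists, when 10,000 were asked
# (8,000 is an assumed count standing in for "four out of five").
large_survey = wilson_interval(8000, 10000)

print(f"5 dentists:      {small_survey[0]:.1%} to {small_survey[1]:.1%}")
print(f"10,000 dentists: {large_survey[0]:.1%} to {large_survey[1]:.1%}")
```

With five dentists, the plausible range stretches from well under half to nearly everyone. With 10,000, it narrows to within about a point of 80%. Same headline, very different information.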
As the past year has continued, it has become clear to me that e-book sales are rising. Anyone who watches numbers knows that. Every day there’s a new tablet hitting the market, or some new version of an e-reader. Just this week, Apple unveiled the iPad 3. At the same time that Apple announced the New HD iPad (which is what they’re calling it), Google announced Google Play, which it claims will rival iTunes. We’ll see.
I spent some time as I wrote this trying to find exact numbers of tablets and e-readers sold, and I can’t. Part of that is Amazon’s unwillingness to impart information on its sales in this area, and part of it is the rapidity of growth.
At the moment, there aren’t even enough statistics out there to manipulate. At the moment, even the folks in the know admit that they’re guessing.
The point here is that the ways to buy and read electronic books are multiplying daily. The low prices over the holiday season made it possible to give e-readers and tablets to people who wouldn’t buy the devices on their own. Most of those folks are using those devices.
We’re already seeing an impact on pricing. I will deal with that in a future post. But we’re also seeing the impact on e-book sales. Every indie publisher I’ve talked with has seen a serious spike in sales. Traditional publishers also discuss the increased sales.
But keep this in mind: As readers order a new e-book, they don’t notice who published it. In fact, over the past few months, Amazon has changed their algorithm, so you actively have to look for a publisher on many titles—including traditionally published books.
It’s a smart move on Amazon’s part, because it ensures that indie writers will sell based on the book itself rather than who published the book. But study after study after study for decades has shown that readers rarely pay attention to the publisher of a book. Readers pay attention to the author name instead.
Some of that is branding. Readers will buy an entire book line, like a Harlequin category novel, because the books provide a consistent and predictable reading experience. Not that the books are the same, mind you, but they’re of a type. Just like an author’s books are of a type. The “voice” of the book line is consistent. DAW Books did that in its early years, and branded all of its books with a yellow spine. DAW published everything from fantasy to science fiction during those years, but the books had a consistent look, and they had an editorial quality control rare in traditional publishing.
To pull something like that off is hard. Most traditional publishers can’t do it, which makes them reliant on their authors (whom they’re afraid of losing right now—also a future blog post).
So the point is that if I publish a book myself, the readers don’t care as long as I make sure someone else edits the book, and a different someone else copyedits it. Readers only care about publishers if the publishers get in the way of the reading experience.
In the past, publishers have gotten in the way by no longer publishing an author or canceling an on-going series. Then readers might notice that the publisher has done this to a favorite author. But usually the reader writes to the author asking why she discontinued a series, and the author then points to her publisher. The reader might or might not write to the publisher to complain, but even if the reader does, the publisher doesn’t care. The publisher made a business decision long ago, and really doesn’t plan on revisiting that decision no matter how many readers say they will buy the books.
Keep this fact in mind: Readers don’t notice publishers. Readers only notice writers. Readers will buy a writer’s next book no matter who publishes it.
But most of the statistical measures of book sales track book sales by publisher.
Twenty years ago, when Dean Wesley Smith and I ran Pulphouse Publishing, a small press by New York standards (although we sold as many sf/f books as many imprints of major publishers), no major statistical analysis of books sold counted our books. Those measurements rarely counted any regional book sales or any university press book sales, even if the university press published trade (commercial or non-educational) books.
When these analysts looked at books in print, they would be able to find our books because our books, like the regional press books and the university and small press books, had an ISBN. The ISBN (which is short for the International Standard Book Number) gets sold around the world through different companies. (Not every book in print gets one, by the way, and never did. The ISBN has existed since 1966, and came into wide use in the late 1970s.) R.R. Bowker, which sells the ISBNs in the United States, keeps careful track of those numbers and can give books in print statistics to anyone willing to pay for them.
But books in print is a different measure than book sales. (Just because a book exists doesn’t mean it has sold a copy.) And the introduction of e-books has changed the system dramatically. Many (dare I say most?) e-books don’t have an ISBN. Amazon has its own e-book tracking system, and so do some of the other e-bookstores. That ISBN system has broken down as well.
Which means that right now, no one knows how many different titles exist in e-book form. We now have even less information about e-books and e-book sales than we did about paper books back in the day, which we only knew imperfectly anyway.
In the past, if someone really wanted to, that someone could make an educated guess about the number of books for sale in the United States, based on the books in print, the size of book stores, the number of readers, and a bunch of other factors. That guess had the possibility of being reasonably accurate.
That guess couldn’t take into account sales velocity (meaning how fast a book sold once it was on the stands), but it might be able to track certain kinds of sales, given the information available through publishing company sales departments, bookstores, advances given to authors, and the number of books going through web presses, etc. It could be calculated if—like those Disraeli scholars above—you really wanted to lose half your life to tracking down arcane information.
Book sales can’t be tracked now, because e-books can be produced in someone’s bedroom and uploaded on a home computer (no web press), and sold through a website (no bookstore or sales department).
Even existing bookstores aren’t much help. Amazon doesn’t give out sales figures on anything. Apple keeps a lot of information about the iBookstore close to the chest. Barnes & Noble, which works on the old book publishing method, does release more information, but still not enough. (And with B&N thinking about spinning off its Nook/e-book business, that behavior might change.)
So add this as your second piece of information: No one can accurately track all e-book sales. There isn’t enough information for anyone to piece those sales together from disparate bits of data, like you could once do with print books.
Finally, add this to the equation: traditional publishing got into e-books late. They didn’t convert most of their backlist (and still haven’t), so there is no one-to-one measure of paper books in print to e-books in print. From traditional publishers only.
Many—dare I say most?—of the e-books published right now are through indie presses or individual writers. I can’t say that with complete certainty because, as I said, the statistics don’t exist.
But Amazon keeps track of the indie titles published through its KDP program, and B&N keeps track of the indie titles published through its PubIt program. Those book titles aren’t always the same, either. Some authors publish exclusively through Amazon. Others publish exclusively through B&N. Many authors avoid all the big publishers and go it alone, on their websites or developing their own stores.
So add this third piece of information: traditional publishers have not published their entire backlists as e-books. Meaning that all the books traditional publishers could have published have not been published.
And now, let’s look at last week’s news.
The Association of American Publishers (AAP) reported that e-book sales rose 117% in 2011. The report stated that net book sales went down 3.5%. What that means is that even though e-book sales have risen, they haven’t risen enough to compensate for the decline in paper book sales. For example, mass market paperback sales were down 35.9% with adult mass markets dropping off 40.9%. E-books rose 72.1% over December of 2010.
Why are mass market sales down? Mostly because of the decline in slots to sell the books. Safeway, Albertsons, and other grocery stores reduced their book sections dramatically. Borders is gone. And traditional publishers have given up on the mass market form, trying to get readers to go to more profitable e-books, trade paperbacks or hardcovers. Even if you want a mass market, good luck finding it in a brick-and-mortar store. If the mass market version even exists.
That’s a sidebar to my real point, however. The AAP is the only organization that tracks book sales with any type of accuracy. When you see a statistic like this one—E-books are now 18.6% of the market—that statistic usually comes from the AAP.
The AAP is an organization with 300 members. Those members come from all aspects of publishing, not just trade publishing (which is the category most books you find in a brick-and-mortar store like B&N fall into). Only 77 publishers reported sales to the AAP for its 2011 report, which is up, by the way, from its 2010 report, which was based on 12-15 publishers.
Those publishers are commercial trade book publishers active here in the United States. They are self-selected and their numbers are not verified.
Got that? Self-selected and not verified.
Let’s assume, though, that the numbers are accurate. Let’s also assume that these numbers can be extrapolated to include all traditional trade publishers here in the United States.
What’s wrong with these statistics?
- Books sell because of the author. These statistics only show traditional publisher sales, not sales by author. More authors are publishing their own backlist than ever before. Barbara Freethy alone has sold a million copies of her backlist books as e-books. She published those books herself.
- No one can accurately track e-book sales. There are too many places to sell the books to get any accurate count. Also, no one knows how many e-book titles are actually in print, so we can’t even make an educated guess as to the sales. Plus the biggest bookstores selling e-books don’t give out sales information.
- Traditional publishers still have a small percentage of their available books in e-book form. So if you’re reading a backlist e-book title, chances are it was published by someone other than a traditional publisher.
What does this all mean? It means that e-book sales are most likely greater than the statistics show.
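Here is a small sketch of why a count that leaves out indie sales understates the e-book share. Every unit figure below is invented purely to show the arithmetic; none of it comes from the AAP or any retailer, and the function name is hypothetical.

```python
def ebook_share(print_units, ebook_units):
    """Return e-books as a fraction of all units counted."""
    return ebook_units / (print_units + ebook_units)

# Hypothetical units from publishers that report their sales.
reported_print = 800
reported_ebook = 200

# Hypothetical indie e-book units that no survey captures.
unreported_indie_ebook = 300

# What the statistics show: only reporting publishers are counted.
reported_share = ebook_share(reported_print, reported_ebook)

# What actually happened once the uncounted indie sales are included.
actual_share = ebook_share(
    reported_print, reported_ebook + unreported_indie_ebook
)

print(f"Reported e-book share: {reported_share:.1%}")
print(f"Actual e-book share:   {actual_share:.1%}")
```

With these made-up inputs, the reported share is 20% while the actual share is roughly 38%. The real gap could be larger or smaller. The point is only that any uncounted e-book sales push the true share above the published one.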
Traditional publishing saw the decline in book sales revenue in 2011 as a bad thing. Yet all of the evidence coming from studies done of people who have a new e-reader finds that readers increase their book-buying after they get an e-reader. (Because I searched for this fact, I could only find the 2010 Wall Street Journal article on one study. There have been more since.)
Traditional publishers assume that if their overall book sales went down, all book sales have gone down. But I think that extremely unlikely. Since readers only care about who wrote the book and not who published it, and since readers have a finite number of dollars, there is a good chance that those readers bought books from their favorite authors somewhere else. The book sales that have disappeared from traditional publishers’ ledgers have actually shown up on the ledgers of indie-published writers. Plus some.
In other words, folks, it’s my personal opinion that the traditional publishing statistics on e-books are completely meaningless. Or to put it another way, the news is much better than these statistics lead us to believe.
At some point, we need to figure out a way to count the e-books being sold through all venues and from every kind of publisher, from traditional to indie. Then, perhaps, we’ll have an accurate picture of what’s really going on in the book business.
Right now, we get only a snapshot of what’s happening among a self-selected group of traditional publishers who give out unverified information. We make all kinds of proclamations based on statistics derived from that faulty information.
Good old Mark Twain. Who knew that he could peer more than a century into the future and see what would happen with publishing? He loved to arrange figures to suit his point. Traditional publishers are doing the same at the moment—intentionally or not.
Last week’s post brought a bunch of strangers to this blog to disagree with me viciously and vehemently. The point was (I think) to get me to change my mind about what I wrote. That didn’t work.
What that tidal wave of negativity did do, however, was make me even more appreciative of those of you who have come to the blog over the past three years. You haven’t always agreed with me, and that’s a good thing, because I learn from you. But you’ve been polite and respectful. You’ve sent me links and helped me find information. You’ve made great comments, and sent wonderful, informative e-mails.
You’ve also donated your hard-earned cash to keep this blog alive.
I want to thank you for all of that. You folks are the reason I write this blog every week, even though it takes time away from my fiction. You’re a spectacular group and it’s a pleasure to interact with you every week.
“The Business Rusch: Lies, Damned Lies, and Statistics” copyright © 2012 by Kristine Kathryn Rusch.