Debunking Author Earnings Report
Above is a graph of e-book market share from the Author Earnings Report (AER). The graph was taken from an article by PublishDrive (which I love, btw, great company). The same graph was redesigned and displayed in multiple articles on Kindlepreneur, and similar info appears on Idealog, Lulu.com, Geekwire, Janefriedman, QZ, Observer, ElectricLiterature, and virtually every website that writes about the self-publishing industry. Before I get into just how inaccurate a lot of his data must be, keep in mind that Amazon releases NO ebook sales data. None. Zilch. All sales data, including Bookstat/AER’s, comes from websites trawling Amazon, collecting sales ranks, and making assumptions. Ditto for other websites, most of which don’t release hard or detailed data. It could be that Paul himself is not being careless; it could simply be that piecing together secondary scraps of information from so many secretive data sources will, by definition, be wildly inaccurate.
I would also like to remind people that most websites where “data is money” skew numbers intentionally. Reddit stopped displaying downvotes and started fudging the exact upvote counts, both so that anyone who wants accurate data must pay for it, and to retain a degree of control by promoting or punishing certain behaviors with a higher or lower ranking. If you think Amazon does not do this, you are naive. Amazon has incredibly complex algorithms, and I certainly do not understand them. Any author who has been publishing with good results over a long period will tell you that the same number of sales can give you a wildly different sales rank, and I’m not talking about subcategory rank. Jeff Bezos essentially pioneered scraping data from his competitors and undercutting them, and the sales rank is an unofficial figure that Amazon assigns to you for its own ranking purposes, so I think it’s safe to assume that your rank is not a simple function of sales per day. As such, any data analysis that derives “sales” information purely from sales rank is doomed to fail. Sales rank is great for making a rough estimate of how much a single book is selling, but it’s obviously insufficient for estimating total sales; otherwise Amazon would stop protecting its e-book sales data like it’s the 11 herbs and spices.
Anyway, on to the bad data, in no particular order.
I will start with a silly one: he’s written JK Rowling twice into the list of top-selling audiobook authors. This list looks almost official enough to base my entire writing strategy on, right?
Then he says (picture after next) that sales in Mystery, Thriller & Suspense were 215,519,384 e-books sold for $1,101,587,355 over the 18-month period from April 2017 – September 2018.
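To put those two numbers in perspective, here is a quick back-of-the-envelope sketch in Python. It uses only the unit and dollar figures quoted above; the variable names are my own, and it assumes sales were spread evenly over the 18 months, which they almost certainly were not.

```python
# Figures as quoted from the report for Mystery, Thriller & Suspense.
units_sold = 215_519_384        # e-books sold over the period
revenue_usd = 1_101_587_355     # dollars over the same period
months = 18                     # April 2017 through September 2018

# Simple averages. Assumption: sales are spread evenly across the period.
avg_price = revenue_usd / units_sold
monthly_units = units_sold / months
monthly_revenue = revenue_usd / months

print(f"Average price per e-book: ${avg_price:.2f}")
print(f"Average units per month:  {monthly_units:,.0f}")
print(f"Average revenue per month: ${monthly_revenue:,.0f}")
```

That works out to a little over $5 per e-book and roughly $61 million a month for this one category, which is the kind of sanity check any reader can run on a published figure.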
So, to recap: he is saying the 2016 data is less accurate, and is putting 2017 e-book sales at 3 billion (where did he get this figure from? It’s not in the report). The fact that he clearly listed 9 months of 2017 e-book sales here at a total value of 1.3 billion? Ignored completely.
So. The last 9 months of 2017’s e-book sales were 1.3 billion, and that covers over 90% of the market, but all of 2017’s e-book sales were 3 billion, which leaves 1.7 billion being made in the first 3 months. However, he only gives the 3 billion figure in the comments, with absolutely no explanation of how 1 + 1 = 5.
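The mismatch is easier to see as a monthly rate. Below is a minimal sketch of that arithmetic in Python, using only the 3 billion and 1.3 billion figures quoted above. The function name is my own invention, and it assumes both figures measure the same thing (dollars of e-book sales), which is exactly what is unclear.

```python
# Figures quoted in the post, in billions.
FULL_YEAR_2017 = 3.0       # given only in the comments
LAST_NINE_MONTHS = 1.3     # listed in the report for the last 9 months of 2017


def implied_monthly_rates(full_year, last_nine):
    """Return (first-3-months total, early monthly rate, late monthly rate)."""
    first_three = full_year - last_nine
    return first_three, first_three / 3, last_nine / 9


first_three, early_rate, late_rate = implied_monthly_rates(
    FULL_YEAR_2017, LAST_NINE_MONTHS
)

print(f"Implied first 3 months: {first_three:.1f} billion")
print(f"Implied monthly rate, Jan-Mar: {early_rate:.3f} billion")
print(f"Reported monthly rate, Apr-Dec: {late_rate:.3f} billion")
print(f"Ratio: {early_rate / late_rate:.1f}x")
```

For both numbers to be true, the first quarter would have had to sell at nearly four times the monthly rate of the rest of the year.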
His explanation, that he was capturing less of the market towards the beginning of the year, is weak. He never mentioned that when he released the report, only in a reply to a comment many months later. He also, according to many forum posts and comments, went back and added 50-100 million to various figures, which is an absurd amount to simply add without explanation.
Some of what he says could make sense if it had been said upfront. Editing, adding to and subtracting from your figures in real time to keep up with criticism as it comes in is shady and sloppy at best. Throwing out large numbers without explanation is also absurd. Not to mention that, without fail, every year’s report contradicts the prior year’s report in such extreme ways as to make it clear that one of them must be wrong. The actual analysis of the figures seems sloppy, and the underlying data would be better described as the underlying assumptions. Throw in the fact that even if he had a perfect data collector, he could never directly collect sales data detailed enough to make it useful for targeting, and you’re left with the notion that letting this guy inform us is a mistake. People make real, life-changing decisions as authors and publishers based on this information. What if the commonly cited figure of Amazon holding 85% of the e-book market isn’t true? Amazon doesn’t claim it. The implications for decisions such as whether to go exclusive or wide, or which category to publish in, are massive.
I suppose the obvious next step is to ask: how can we get accurate data on the market?
To be honest, I’m unsure. I feel that a combination of the little hard data we have, such as from the Association of American Publishers and NPD Pubtrack, plus dollar figures reported by the companies selling the books themselves, along with niche-level inference from a rough analysis of sales ranks, would give a relatively decent picture, good enough for any small publisher to rely upon. Who knows, maybe if enough people are interested I will do my best to paint a decent picture of the current e-book market. Until then, I hope this helped some people understand just how sad the state of “common knowledge” e-book market analysis is. Stay aware!