Business Musings: Confidential Business Information

By the time you read this, I hope the tempest has stopped rollicking the teapot. Because I only want to use the incident as a spark to discuss something most writers don’t understand.

Confidential Business Information.

When the big indie publishing movement started almost a decade ago, we were all startled by it. From the ease of publishing ebooks directly to readers to the way that fortunes could be made, seemingly overnight, it all felt too good to be true.

And that created a culture of Prove It To Me. Part of that culture was absolutely necessary. Most people did not believe that self-publishing could make them money. Those who did had either done it before, or had such wild success that they were as startled as the rest of us.

Joe Konrath and others started posting their indie sales numbers pretty early on.  I’m linking randomly to Joe’s blog posts from 2010, just because I remember him being so open about the numbers.

I also liked the tone of his blogs—surprise that the numbers were working out the way that they were; pleased that the numbers were working out the way that they were; and a small bit of worry that the numbers might not continue working out that way.

I’m assuming that they have for Joe, although I don’t know. He stopped reporting his sales numbers a long time ago. But he’s still going at it, still publishing his books, still analyzing what’s going on (mostly in his own career), and occasionally—very occasionally—blogging for writers.

In those days, most self-published writers posted their numbers, sometimes from all the sources (all two or three of them), and usually did so monthly. It was part of what became the indie community—this openness about how well you were doing, mixed with isn’t this great? and a lot of cheerleading (If I can do it, you can too.)

The cheerleading is mostly gone. Now it’s more business focused, and many of the people who are posting their numbers are trying to sell their method for garnering those sales.

But back in the day, Dean and I were unusual. We did not post our numbers anywhere, and got rafts of shit for it. When we wrote our business blogs, new indie writers looked at our Amazon sales ranks on one or two of our books (out of the hundreds we published through our company WMG), and decided that we weren’t making money on our writing. At all. That made it easier for those writers to dismiss our advice, even when—on that level, at least—our advice remained consistent. Go wide, sell subsidiary rights, have multiple income streams, and so on. (On other things, our advice did not remain consistent; it changed as the marketplace changed.)

Dean and I have owned businesses for decades, and not just writing and publishing businesses. We’ve had retail businesses (still do), online businesses, and a tree farm. (Yeah, don’t ask.) We know our way around business, and if we don’t understand it, we can figure it out.

One thing we learned early was to keep as much information as we can about the business completely private. It’s not that we’re privacy freaks, mind you, but the internal workings of the business—of any business—are tools of the business. The more you control the tools of your business, the more you control the business.

As I prepped for this article, I found a series of lawsuits that differentiated (or tried to differentiate) between confidential information, proprietary information, and secure information in regard to any business. Quickly, those suits devolved into the weeds—was this particular method of doing x confidential, proprietary, or open to anyone? And who got to decide?

Some of the issues are even more complex. When one business mines data or uses secret shoppers or sends a mole into another company, is the information taken from company two confidential or protected in any way? Was the information “in plain sight,” as the criminal law would say? Was it readily available to anyone who looked? Or was it something that the first company stole?

These questions aren’t always easy to answer, and often go to court. Once these things end up in court, then the “winner” isn’t always the company with the law on their side; it’s the company with the deepest pockets or the most determination to prevail. (A lot of these cases get settled out of court…to prevent the confidential/protected/proprietary information from getting into public as part of a lawsuit.)

Usually the entity that got to decide whether the information was confidential was the business itself. The lawsuits were often about someone revealing information they shouldn’t have or using some sort of method to share that information—often with a competitor—in a way that either state or federal law prohibited. Sometimes the prohibitions were in employment law, but sometimes they were simply about what another entity could reveal about your business to the public.

Again, this sounds like the stuff of control freaks.

But once you understand how business works, you learn that the more you keep to yourself, the better off your business is.

Time for examples.

Let’s move outside of the realm of publishing for a moment and into just general business negotiation.

A dear friend of mine with many years of experience in a particular industry retired from a rare senior position in that industry to start his own business. His new business is to advise other companies in that industry on how to work best within the industry. Believe me when I tell you that this man is someone whose knowledge is gold. There aren’t many people like him in that industry, particularly on the consulting side of that industry.

When he decided to become a consultant, he did almost everything right. He set up his business really well, but he had never negotiated before. He hadn’t run a business this way—he was always employed by someone—so he had no idea how to attract clients. He was thinking of setting up a website, listing his rates, and letting people’s fingers do the walking.

I talked him out of listing his rates. (So did another friend of ours, with 40 years of experience running businesses.) In fact, I told him to hold off on offering his services to just anyone walking through the door. He had already set up pitch sessions with his most likely clients. I told him to wait until he had talked to them before going wide. (Note this is the opposite of the advice I give to writers.)

He went into those pitch sessions, armed with negotiation tools that included not listing his fees, letting the potential clients know that he was offering them a one-time chance to get his services before anyone else, and so on and so forth.

If they asked how much he charged, he told them he preferred to discuss fees after they settled on what he would do for their companies. If they needed to have an idea of the cost of his service, then they could tell him what they normally paid for the service, and he would determine if they were in the ballpark.

You see, I learned the hard way that you should let the other guy set the opening bid. Often the other guy sees your skill set as waaaaay more valuable than you do.

Sure enough, one of the possible clients mentioned price to my friend, and the price was double what he would have charged from the get-go. And this is an industry in which negotiation on fees (in business) is common, so that “high” price (to my friend) was the client’s lowball offer. It only went up from there.

If my friend had posted his fees from the start, he would have gotten the money he asked for and maybe not the clients he wanted. They might have seen him as too cheap.

If other clients know how much their competitors are paying for his services, they can ask to pay the same price or the competitor might ask him to lower the fee. For example, one of the clients he talked to represented a multinational conglomerate. Ideally, my friend should charge more for that client than he would for the mom-and-pop business that he also talked to a day or so later. Everyone knows that’s how business is done, but if my friend had advertised his fees or let them leak into the public, he couldn’t maximize his profits.

Although I’m talking about financial and negotiation matters here, control of business information isn’t just about finances. It’s also about procedures and methods, ways of doing things, ways of handling a certain crisis (or not handling it), decisions that are not logical to outsiders but which make sense to the business itself. It’s also about certain business practices, which could be patented, and could then become IP.

I can’t list everything a business might want to keep private, because it varies from industry to industry, and business to business. Many businesses judiciously release certain bits of information, so that they control the perception of their business. Others keep everything private. And some businesses make it look like all of their material is in the public.

If that business has been around for a while, and has a real business structure, I can guarantee that what you see with the business is only the tip of the iceberg.

I started as a business reporter. I’m very good at taking disparate bits of information and piecing together an entire fabric about the business, based on what that business has made public. I had those skills even in the analog era.

There were ways back then to discover, for example, how much ready cash a business had in its bank accounts or how much real property it had. You couldn’t do it at home on your computer, but with a telephone and a bank account number, you could get a lot more information than you’d want to know.

Plus there are records that each municipality makes public, from property records to legal disputes. Many corporations hide a lot of information by creating smaller corporations within the corporate umbrella. Maybe you thought that XYZ corporation owned the empty lot across the street from you, but when you look into the title records, you see that the title is held by 123 corporation. If you still think that XYZ and 123 are related, then you have to look up the incorporation papers for 123—and in some states those papers are readily available, and in other states, they aren’t.

There’s nothing sinister about keeping information private. (Although there can be, at times.) Most businesses just prefer to work that way. They want to control the reveal of information.

I’ve often been in conflict with that reveal, because I’ve been a reporter off and on throughout my life. There are rules and ethics behind how reputable reporters get information, many of those adjudicated in courts of law.

In doing this blog, I often know much more information than I can comfortably (or legally) reveal to you folks. It’s part of the business.

So let’s come back to publishing. Those of you who are on Patreon will see this post in the week it was written, so you probably already know the precipitating event for the post.

The latest Author Earnings report came out on January 22, after nearly a year of silence. Data Guy has been capturing information on what he says are the million top-selling titles every single day, and then using his algorithm to turn that information into actual sales numbers. He’s tweaking Amazon’s algorithms so that they work for him.

Data Guy has done this, keeping his name and identity private, for four years now. He has done a lot of work to gather this information. He says he only uses information that’s publicly available…to someone with his skills. Then he uses his own secret sauce to scrape and interpret the data.

I’ve always had a slightly queasy reaction to Data Guy’s methodology, because I can’t verify it. At worst, I figured, he was a white hat hacker. At best, he was right about how he got the information. Early on, I was usually one of the last people to report his findings, expecting Amazon to shut him down.

After years, and his appearance at various conferences, I figured Amazon knew about him. If they didn’t like what he was doing, or saw it as a threat to their private algorithms, they would shut him down. Apparently, in this early stage, anyway, they saw him as another data capture service.

They might also have had no qualms about what he was doing because he was not making money from it.

Anyway, this report’s findings are a nice confirmation of what we’ve all been saying—that the market has matured, and ebooks continue to grow. That would have made a nice two-day story, something that would have helped the publishing community along.

But hand-in-glove with the new report was the announcement that Data Guy has started a new company. It was only a matter of time before Data Guy did this. Either he would quit offering the information for free and go on to doing other things, or he would monetize his work. (I had thought, early on, that Hugh Howey paid him. I’m not sure what their arrangement is.)

In response to a question I asked on the site, Data Guy said the cost of providing this information has become so extreme he either had to monetize it or quit. That makes sense to me.

The method he chose to monetize this product, however, angered a ton of indie writers. The anger is only growing as I write this on January 26.

Here’s the pitch from his Bookstat company website on what he’s selling:

Bookstat’s lightning-fast, responsive dashboard lets you search by publisher, genre, author, title, BISAC, ISBN, or ASIN. Discover the top-earning publishers, authors, and titles in each genre right now. See their total ebook, audiobook, and online print sales for last quarter, last week, or even yesterday. Drill down into thousands of subgenres. Analyze sales by price point. By publisher type. By online discount offered. Slice the data any way you want.

From the largest Big Five trade publishers down to the scrappiest garage micropresses, to sales from Amazon’s in-house publishing imprints and format-dominating Audible Studios to J.K. Rowling’s Pottermore — data that you’ll find nowhere else — even the sales of individual self-published authors: it’s all right there, live at your fingertips, ready for you to ask it the questions that drive your business.

And Bookstat’s dashboard works just as well on your mobile device:

That hot new title, up-and-coming author, or exploding genre that someone just mentioned?

Have a phone in your pocket?

You can check their online sales right now.


This is not what Data Guy promised indie writers when he started setting up this company. First, he spoke at the annual conferences for Romance Writers of America and Novelists, Inc, promising writers that if they shared their personal data with him, he would then create an algorithm that would allow them (for a fee) to see their own numbers as scraped by his algorithm.

A lot of writers—many big names—shared their data with him, so he could tweak his product. When he announced the new business, however, it was clear that his business model revealed all of this information…to companies that have revenues of ten million or more. I did a piece just a few weeks ago on traditional publishers. Ten million is the minimum threshold for what I was calling “small traditional publishers”—if you look at publicly available financial data on those companies.

How Data Guy will be able to verify this information is beyond me. I now consider him untrustworthy when it comes to handling information. He gathered information from writers under false pretenses, and is now giving it to their competitors. (I write this on January 26, so this may have changed.)

The writers themselves cannot access their own personal information, to even see if it’s accurate. (I can guarantee that it does not check a writer’s complete sales or income. Writers earn from many, many more sources than Amazon, even online.)

When the free report came out, there was a list of the top-selling indie writers—by name—with the promise that some big corporation could get their sales numbers if the big corporation paid for it. Just the list of the top 50 indie writers had some big reveals that the writers themselves did not want public.

To make matters worse, none of those writers gave him permission to use their names. I suspect (although I do not know) that some of those writers gave him information to help him tweak his algorithm, thinking they would be getting personal data from him, not data on other companies. He betrayed them.

Writer after writer wrote to him, demanding their names be removed from that list. Finally, he took the list down, posting this in its place:

ETA: We had initially shared a ranked Top 50 Indie Ebook Sellers list here (with units and dollars blurred out, of course). [Kris’s note…but available to anyone who could pay for it] But then some of the authors on it started emailing us and asking for their names to be blurred out, too. As a courtesy to those authors, upon request we did so, but after the first few it became too much of a hassle. And besides, with a quarter of the list greyed out, it no longer effectively illustrated the point we were making, anyway. So we yanked the whole list, and will just simply state what we observed on it.

Data Guy pissed off the very people who helped him build the business he was trying to market.

All of that—I’m sorry to say—is neither here nor there. By the time those of you who read this on my website see this post, the Author Earnings report might be gone (crushed by Amazon or Disney, also on these lists without permission) or everything might have settled down or there might be a new teapot tempest brewing about other things.

If Amazon does let him continue, it will be because they believe that he’s not doing anything wrong, or that his numbers are stunningly incorrect and irrelevant to Amazon’s business.

But not to the writers whose information is being scraped.

I said above that I would be using this latest crisis to examine something else entirely. Because this tempest revealed something to me about new-to-business indie writers. (Not new indie writers. Writers who are new to being business owners.)

They do not understand why the long-term business professionals are upset by Data Guy’s actions. His breach of implied ethics probably shouldn’t surprise any of us, especially if he was a white-hat hacker. We enabled his information scraping (which might or might not have been legal, let alone ethical) from the beginning.

But it was—and is—the loss of control of our own business information, information about our businesses that is being represented to others and that we cannot access, that has savvy business writers angry.

Many new-to-business writers went on various sites that were discussing this tempest and compared what Data Guy was doing to the Forbes List or to BookScan or to movie box office.

Some writers stunningly said that everyone else had the right to other writers’ information because those writers were “famous.” (I think these writers need to learn about the definition of famous under the law, but that’s another point.)

The writers who were upset were upset because they did not know, and could not find out, what kind of information about their businesses Data Guy was providing to their competition. These writers—these businesses—had lost control of the information about their businesses. This is the kind of thing that businesses sue over—and win.

Passive Guy did an interesting analysis of Amazon’s interest in this, finally concluding that the data Data Guy had scraped belonged to Amazon.  The fascinating thing about Amazon is that they are known worldwide for keeping their business information private. Sometimes they sue, as they did in March of 2016 when one of their logistics experts took a job at Target.

Sometimes Amazon changes its algorithms the moment it learns that people have been gaming them or using them for profit (which is what Data Guy is doing). (See the entire history of Kindle Select for that.)

Amazon also pointedly refuses to divulge information that most businesses divulge as a matter of course. Article after article about retail or online businesses or major corporations complains about Amazon’s reticence in sharing its data with outside companies.

Big companies like Amazon work hard to keep their information private.

But the new-to-business writers cited other things. The Forbes List, movie box office, things like that which “reveal” information about individuals or the entertainment industry.

It’s not an apples-to-apples comparison. It’s not even an apples-to-oranges comparison. It just shows a lack of understanding of the various numbers floating around our entertainment news sites. (Those numbers are explained on these sites, if you just look at the actual articles and not the headlines.)

For example, the movie box office numbers you hear have nothing to do with individual movie companies or even with the real earnings of an actual movie.

Box office numbers are based on ticket sales at reporting theaters in the United States (I’m not sure about worldwide). Not all theaters report. Most of the reporting theaters are the large chains. Then the ticket sales are multiplied by an average of ticket prices, to get the weekend box office revenues.
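To make that arithmetic concrete, here is a minimal sketch of the calculation as I just described it. The theater names, ticket counts, and average price are all made up for illustration, and the function name is my own invention, not anything the industry trackers publish.

```python
# Hypothetical illustration only: every name and number here is invented.

def estimate_weekend_box_office(tickets_by_theater, average_ticket_price):
    """Estimate weekend revenue as total tickets sold at reporting
    theaters multiplied by an average ticket price."""
    total_tickets = sum(tickets_by_theater.values())
    return total_tickets * average_ticket_price


# Only theaters that report are counted; non-reporting theaters are invisible.
reporting_theaters = {
    "Chain A": 120_000,
    "Chain B": 80_000,
}
AVERAGE_TICKET_PRICE = 9.00  # assumed average, not what anyone actually paid

estimate = estimate_weekend_box_office(reporting_theaters, AVERAGE_TICKET_PRICE)
print(f"Estimated weekend box office: ${estimate:,.2f}")
```

Notice what never enters the calculation: the theaters that don’t report, the prices people really paid, the costs of the film, and the concession sales.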

After that, it’s a guess as to whether or not a film has actually made money from its release. There are the actual costs of the film, the projected costs of the film, the carried-over costs of the film, the hidden costs of the film, and the expected future earnings of the film. Those are counted against that weekend number so that “industry experts” can tell you if the film flopped or not.

But flopped for whom? The company that made the film? The distributors of the film? The theaters that showed the film? We don’t know what popcorn sales were for The Last Jedi the week of its release, particularly compared to popcorn sales for Three Billboards Outside Ebbing, Missouri, but I can tell you that Three Billboards probably made more popcorn profit for its theaters than The Last Jedi did, because it cost less for the theater to run Three Billboards than it cost to run The Last Jedi.

And why am I discussing popcorn sales? Because theaters make most of their money from concession sales, not from the actual ticket sales.

Box office information looks straightforward, until you begin to understand that once again, the public information isn’t the kind of information that matters to the businesses involved in the industry.

This is why Winston Groom thought he had an ironclad case when he sued Paramount Pictures for failing to pay him his share of the profits of Forrest Gump, the movie made from his book of the same name. After all, the movie had been one of the biggest films of the 1990s, with, at the time of the suit, box office revenue of $300 million.

Groom had signed a contract entitling him to a percentage of net profits, not gross revenue (and not even a percentage of the movie’s final budget). Paramount claimed a loss on that picture of $60 million. Forensic accountants searched for the truth, lawyers made a lot of money, and eventually Groom got paid more money than the $350,000 he initially took for licensing the film rights as a “token” against future earnings.

Box office numbers do not mean what entertainment junkies believe they mean.

Any more than the Forbes List of top earners in various entertainment categories means those people are actually the top earners. Forbes itself admits that.

Here’s what Forbes says in its list of the highest-paid authors in 2017:

All earnings are for June 1, 2016 through May 31, 2017 before taxes and other fees. All book-sales data are sourced from NPD BookScan, which tracks 85% of the domestic print market. Estimates are compiled by examining print, e-book and audiobook sales, considering TV and movie earnings, and talking to authors, agents, publishers, lawyers and other experts.

In other words, Forbes uses BookScan, which retailers report to directly (like box office numbers), but that doesn’t give a real examination of revenue. Because what price did those print books sell at? (I saw some Grisham books in the week of release in Book Warehouse, a book discounter. That means Grisham received no money, or only pennies, for that book in that store. Were those books counted on BookScan? I don’t know. I doubt it, but if they were, you can’t count that as a full-price earning.)

So what Forbes is saying above is that it uses publicly available information, and it “talks” to authors, publishers, lawyers, and other experts. Then it “estimates.” (Says so, right in the methodology copy.) No verification of these numbers exists.

In fact, the 2017 list I took that methodology quote from puts J.K. Rowling at the top, but most of her income is hidden. Pottermore is notoriously quiet about its revenues. And there’s no way to know what she’s actually earning from all the subsidiary rights and licenses (some of which she had licensed to Scholastic and still might not get the proper percentage on).

I have known authors who made the Forbes list, and know that their income was incorrectly reported. (Over-reported in one instance, significantly under-reported in another.) I know several authors who are excellent business people and who keep all of their publishing information quiet. They’ve never been on the list, although one of them should be on the list regularly. He’s not one, not two, but several New York Times bestsellers. He writes under a lot of pen names, licenses subsidiary rights on all of them, and no one outside a close circle of writer-friends knows that he is all of those people. (Notably, he does not have an agent. He and his lawyer handle all of the deals.)

These lists are interesting to the consumer. They’re sometimes great publicity for entertainers and writers, but the lists are rarely accurate. The businesses behind the list don’t want the information to be accurate. They see these lists as some kind of weird advertising and nothing more.

So…in this world of celebrity and the internet and data at our fingertips, should we even try to keep our business information confidential? The big companies do so. Smaller companies do as well.

Smart business-oriented writers do.

Will someone find out everything about your business? I suppose they can try. They probably won’t.

Losing control of aspects of your information, though, can harm your business in significant ways. Ironically, even Data Guy seems to know that. In an exchange in his comment section when he was asked what software he used to crunch all the data, he said this:

Some of the calibration & formulae were initially developed in R, as you surmise.

But at the scale it runs at now, the core engine is all custom code, leveraging both columnar SQL RDBMS & clustered NoSQL in the architecture.

Unfortunately I’m going to have to stop there, because some parts of the software architecture as implemented are non-obvious/innovative enough that it might be protectable IP.

Data Guy understands that parts of a business should remain protected or the business will be harmed. But he hasn’t bothered (at least at this writing) to take that concept out of the realm of software into the realm of business numbers for individual writers and companies.

Again, I am aware that sharing information was how indie publishing got off the ground. I think with the publication of the January Author Earnings Report, those days have officially come to an end.

That’s probably a good thing. Maybe the expectation that writers should share every bit of data about their businesses will go by the wayside now.

It helped a lot of writers enter the indie publishing realm, but those days are past. We are indeed a mature market, where writers no longer bootstrap each other and are slowly learning that not everyone is deserving of their trust.

Keep the information on your business proprietary, folks. And learn how to control it to benefit your business.

Remember, the person who cares about your business the most is you. Some others might want to help you, but they might also want to use what you do to their advantage.

You can’t always tell the difference up front. But the moment someone reveals their true colors, respect that. Because as Maya Angelou says, “When someone shows you who they are, believe them the first time.”

It’ll save you a lot of grief.


I have long had a rule about my writing: It makes me money or I don’t do it. That’s because I make a living at my writing. I’m not a hobbyist.

Not even when it comes to the blog. Yes, I’m always learning as I write these blogs, but they wouldn’t continue if I didn’t receive some money for writing them.

I set up two ways that you can support the blog financially if you’re so inclined. If you liked this post, and want to show your one-time appreciation, the place to do that is PayPal. If you go that route, please include your email address in the notes section, so I can say thank you.

I have a Patreon page, so if you feel like supporting the blog on an on-going basis, then please head there. It has taken a little more than a year for me to come close to the $500 per month that I said I would need to keep the blog going week to week…on Patreon. But the donations I receive from PayPal made up for the shortfall early on.

If someone were tracking my nonfiction earnings only from Patreon, they would wonder why I was continuing. That’s because the revenue from the non-fiction is no more straightforward than any of the other revenue in my writing business.

Okay, time to get off the soapbox on this topic, and to say thank you to everyone who comes to the blog. I greatly appreciate the discussions, the insights, the information, and yes, the donations. You folks are wonderful.

Thank you!

“Business Musings: Confidential Business Information,” copyright © 2018 by Kristine Kathryn Rusch. Image at the top of the blog copyright © Can Stock Photo / AndreyPopov.


19 thoughts on “Business Musings: Confidential Business Information”

  1. In light of what Data Guy is up to regarding his collection and profiting from spidering Amazon, something happened this weekend that really bothers me.

    Like many, many Indie authors, I got on the bandwagon with Book Report from the day I heard about it in 2015. I got the applet and b/c my income passed his benchmark, I’ve paid the $100.00/yearly fee. Back then the applet didn’t collect data, just translated my Amazon pages into sales numbers on my computer. Great stuff, right?

    Here’s the thing- last year Liam changed the privacy policy when he added some features. No… I didn’t read it, I barely noticed. My cost was the same, and I didn’t really use the new and improved features.

    Now this weekend, Liam (the guy behind Book Report) announced a doubling of the annual fee. Which made me look at BR. What I found, in light of this blog post, took my breath away.

    Book Report’s current privacy policy is that he stores all my Amazon data. All of it. He states that he won’t ever ever yadda yadda my data…
    It was the moment I realized that he has E V E R Y T H I N G on his hard drive about my business. I’m all in w/ KU, so he has at his disposal information on my biz that the IRS would need a court order (I think) to get at.


    Yeah, I’m going to quietly get out of BR as soon as I find an app that will do what his used to do: i.e. grab my sales data from Amazon and coalesce it into an easily readable daily report. The upcoming $200/year fee isn’t a huge deal… an enormous price jump yes, but I got the money…

    It’s just that I didn’t realize that he had ALL my data stored until now.
    Pretty disquieting, no?

  2. Thanks for surfacing this issue, Kris. I was the first to ask to have my name removed from that top 50 list, and I was glad that many others did the same. I’ve never given my sales info TO ANYONE or publicly shared more than “general info” about my sales. So it was violating, to say the least, to see my name and personal info made public by someone who refuses to put his own name behind what he’s doing. Bob Mayer is right–the indie community was very collegial at first when we were all figuring out the way forward and trying to decide if the boom was going to last (it did–for those of us who’ve continued to stay focused on our books). But lately, crap like this and other things that’ve been happening have me feeling less positive about our community. I would ask Data Guy to keep his nose out of my business. My sales and revenue info is not for sale at any price.

  3. Interesting. I remember the ‘golden age’ of indies when a lot of writers were posting their sales figures and income. Haven’t seen anyone do it in years. I remember one very successful romance author posting a few years back, after the golden age, saying that while she was drawing 10 times the number of people to a book signing, her sales were down to 10% of what they had been just a year earlier.

    I feel like some people who were constantly at the vanguard of pushing indie publishing have disappeared off the face of the Earth. I think a lot of people who were feeling pretty good then have had to go get other jobs. After almost 3 decades in publishing, one of my mantras based on experience is: the moment an author thinks they have it made, their career is over.

    I never filled out those requests for my data because I didn’t see the point. My data and sales and income are unique to me. So are every other author’s. We all have different circumstances. We have multiple revenue streams (for example my asteroid mining company). A sure way for an author to lose their mind is comparing themselves. The one thing I did publicly is post when I hit a bestseller list, particularly a top 10 on Amazon. But now, even that’s pretty worthless given the volatility of the list and the lack of a long tail. Frankly, no one cares any more.

    This is also a holdover from traditional publishing where sales in the Marketplace are “reported” by agents. I know one big name agent who flat out lies with his reports. I’ve seen the contracts he’s reporting and they are very different. Let’s not get started on the NY Times list and reporting stores.

    Let’s face it. A lot of that early boasting was part of the indie-trad war which is long over. The only people who won were the ones who kept writing and running their business.

    One thing I would throw out is that I’ve watched authors get in uproars about things that they don’t control. While there are things that are unfair, or flat out wrong, if I can’t change it (sometimes because I’m not Jeff Bezos), I change my business model. I view every rejection as an opportunity to do something different; every failure as a learning curve.

    That’s the only thing I control. I do remember Bezos saying in an interview: Complaining is not a business model.

    What I like about your blog is focusing on the fact that being a successful author requires more than writing good books; it requires running a good business. Too often authors jump on the latest bandwagon and don’t take the long view.

  4. I missed the whole thing after the part about the big indie names being upset. I’m in a support group run by one of them, and she couldn’t figure out how he was attributing 30-ish more books to her than she actually had out. Then I went back into my editor-requested WIP.

    I’ve never fully trusted Data Guy’s conclusions anyway, because I couldn’t figure out how he accounted for the accumulation of sales on rank. I stay in top 200 free on my genre lists with as little as one download a day, partly because less popular lists and partly because accumulation. I was able to rocket myself to the top of my lists within a week of going free, and it made all the difference in continued visibility.

    I attended his presentation at RWA in Orlando, and walked away with some good genre info. I can’t share any specifics about what he talked about, because it was a PAN-only session, but it was interesting and somewhat helpful. But he’s the last place I’d turn to for specific sales numbers on someone, because like you I don’t trust his methodology.

    Not to mention a bunch of his romance data is likely completely inaccurate due to KU scammers.

  5. I am the one who asked Data Guy the question about software. Given the volume of data he is handling there are only two choices for churning out stats. AFAIK. SAS is the expensive — and arcane — choice. R is — last I knew — free software.

    Does Amazon have an action against Data Guy? I do not think so. Data Guy is trawling through Amazon’s book pages, pulling rank, publisher, price, and author from each page. Amazon makes that data public. Data Guy is not taking proprietary information from Amazon. He is taking information that Amazon gives everybody, albeit in greater quantities than Amazon foresaw. I do not think a suit by Amazon alleging that these data are proprietary has legs; that is, it would not survive a Motion for Summary Judgment.

    Note that Data Guy does not have unit sale numbers. He is using rank as a proxy for unit sales. Kris, your analogy to movie box office is a good one. The week following an opening, news organizations rank the movie and publish the box office take. Is it accurate? No, it is a best guess. The companies that make those guesses work hard to refine their guessing algorithms when they later have hard data, if they can get it.

    Data Guy got willing help from a number of useful idiots to refine his proxy. But it is still a proxy. Will he be able to refine his proxy in the future? That will require help from more useful idiots, ’cause I think many bridges have been burned and he is now persona non grata with many in the indie community.

    BTW that talk about ‘core engine’ and SQL RDBMS and NoSQL and protectable IP . . . Data Guy is not talking about statistical analysis software. He is talking about software to collect and collate the data. In my experience, collection and collation were 80% of the job (5% to figure out what data to collect, 80% to collect and collate them (‘collate’ means to package the data in a form that is easy to read and manipulate; for example, an expenses ledger), 5% to analyze the data (but it takes years of study to know which analyses to perform), and 10% to report the data). This is a good ROT for calculating your expenses in a statistical job. Data Guy has plowed a lot of effort into the collection and collation software. He now wants to get paid for that effort.

    But here is the kicker.

    You get 5% of your payoff from knowing what data to collect, 10% from collection and collation, 5% from analyses, and 80% from reporting. All that effort that you put in for collection and collation is minor compared to reporting. The trick is translating statistics to meaningful information presented to decision makers who do not know statistics and do not want to learn statistics. When you can do that, you are not just golden; you walk on water.
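[Editor's illustration] The two breakdowns above can be put side by side with a little arithmetic. This is only a sketch: the percentages are the commenter's own rule of thumb, while the stage labels and the "payoff per unit of effort" ratio are assumptions introduced here to make the comparison concrete.

```python
# Shares quoted in the comment above, as percentages of the whole job.
# The stage labels are shorthand chosen for this sketch.
effort = {
    "deciding what to collect": 5,
    "collection and collation": 80,
    "analysis": 5,
    "reporting": 10,
}
payoff = {
    "deciding what to collect": 5,
    "collection and collation": 10,
    "analysis": 5,
    "reporting": 80,
}


def payoff_per_effort(effort_shares, payoff_shares):
    """Return payoff share divided by effort share for each stage.

    This ratio is a hypothetical yardstick, not something the commenter
    defined: 1.0 means a stage returns exactly what it costs.
    """
    return {
        stage: payoff_shares[stage] / effort_shares[stage]
        for stage in effort_shares
    }


ratios = payoff_per_effort(effort, payoff)
for stage, ratio in ratios.items():
    print(f"{stage}: {ratio:.3f}")
```

On these figures, reporting returns eight times its share of the effort, while collection and collation return one eighth of theirs.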

    Back when I did statistics for the US Air Force, I collected, collated, and analyzed data for a few weeks and produced a chart. The chart was boring. Two bars of equal height. As soon as I saw the chart, I knew what I was looking at and what it meant. My job was to present that chart and the information it contained to officers to make decisions about future acquisitions. I broke every rule about briefings when I presented the chart. It was the only chart in my presentation. Boring. Then I told them what it meant. Their eyes got wide and they leaned forward. They got excited. Result: they accelerated system acquisition (money they were going to spend anyway) and saved $50 million a year per system in operation expenses.

    I only had one customer. (I could only have one customer — the data were classified.) My greatest assets were not my degree in mathematics and computer science but my credibility and presentation skills.

    Data Guy has many potential customers. Maybe I am wrong, but I do not think that the big houses will show an interest in his information. He is a new source of information, and he comes from the wrong side of the tracks. My dealings with publishers showed me that they knew their sales numbers — eventually. This is what they track and make decisions on. Maybe I am wrong, but my feeling is that Data Guy just cut his own throat with his customer base. YMMV.

    BTW from my reading of Amazon’s TOS, my understanding is that the people who shared their sales numbers with Data Guy violated their contract with Amazon.

    For all the indies who are upset, I say, “Chill.” Best thing to say is ‘No comment’. If you cannot say that, smile and say, “Data Guy is using proxies, not hard data. That means he’s guessing. And you think his guesses are worth money? Why, bless your heart.”

    Oh! BTW, the first time Data Guy did a report way back in the Pliocene Epoch, I got him to send me a text file of his data with comma separators. It was small enough that I was able to import it to Excel and play games with it. I used his data for my analyses. Nothing I did invalidated his results. IMO his stats then were solid. I have not verified any of his analyses since then (I got other things to do with my time, okay?) but I have no reason to question his statistical abilities. The one — the only — source of weakness in his conclusions that I see is the fact that he uses a proxy. But that sword cuts two ways.

    1. Thank you for the analysis, Antares. Excellent points all.

      However, you miss my point: DG’s statistical prowess and interpretive skills are considered excellent. So when he says that XXX Indie writer has sales of YYY and is the top of her game, then his analysis will have a lot more impact than, say, my stupid guesses.

      It is good business for XXX Indie writer to know what DG has reported to a publishing house, a Hollywood studio, and so on, so that XXX can counter with–um, no, that’s incorrect data and here’s why, or something. The problem is that if DG’s business is successful, a lot of indies will lose business without even knowing why. Some studio will buy a report on XXX Indie writer’s sales, decide they’re not high enough or that XXX Indie writer doesn’t appeal to the right demographic, and the chance of an offer will disappear.

      Like you, I doubt DG will have the client base he needs. Even he acknowledges that the big publishers have their own internal data guys now, so he’s going for the mid-level publishers and anyone else whose company valuation fits his criteria. I doubt this is the threat I just made it out to be in the previous paragraph.

      But writers need to know that there’s the possibility for another DG to arise, one with some actual street smarts, and that sharing information with someone like that will only bite you in the butt. If you discover that someone like DG exists and is selling an interpretation of your data, the best thing you can do is buy that report for yourself so that you know what information is out there that you’ll have to counter one day. My irritation with DG? He didn’t make this data available to anyone who can afford a flat fee (which could be ridiculously high). Instead, he set yet another nosy bar, gathering even more information he has no right to.

      Ah, well. I was right about one thing. The community has moved away from this tempest in a teapot to other concerns.

      1. Kris, The accuracy of Data Guy’s proxies (ranks) to unit sales depends on 1) the number of datapoints in his sample and 2) time. Maybe time can be safely ignored in a bounded study. IMO the most important chart in his latest presentation was the one that showed e-book sales to be constant each month over the span of the collection. Even so those data will — like milk — go sour after a while. If I used the results from 2006-2009 to predict the likelihood that Space X would successfully land their first stage booster, I would predict failure. The early data on Space X are no longer predictive.

        The number of datapoints is key. To get robust stats, you want 1,500 datapoints in your sample. With a sample size that large, you can confidently deliver reliable information. As that number drops, reliability suffers. Under 500, strange outliers will pop up. Under 150, you are flirting with catastrophe and reliability is a crap shoot.
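[Editor's illustration] One way to see why reliability falls off as the sample shrinks is the textbook margin of error for an estimated proportion. This is a sketch under stated assumptions, not the commenter's method or Data Guy's: it assumes a simple random sample, a 95% confidence level (z = 1.96), and the worst case p = 0.5. A rank-to-sales calibration is not a simple proportion, so the numbers only indicate the general trend.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion estimated from n points.

    Uses the normal approximation z * sqrt(p * (1 - p) / n).
    p = 0.5 is the worst case; z = 1.96 corresponds to 95% confidence.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p * (1 - p) / n)


# Sample sizes mentioned in the comment above.
for n in (1500, 500, 150):
    print(f"n = {n:>4}: +/- {margin_of_error(n) * 100:.1f} percentage points")
```

Under these assumptions the margin is roughly 2.5 percentage points at 1,500 datapoints, about 4.4 at 500, and about 8 at 150, which is consistent with the direction of the thresholds quoted above.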

        Data Guy’s latest report? Give it a year and no customer will care. But all the people he pissed off will remember. It’s like my father said: If you please one, flip a coin. If it comes up heads, he’ll tell one more in the coming month. But if you make one unhappy, he will tell ten before he goes to bed.

        1. You’re still missing my point, Antares. I’m not worried about the quality of the data. I understand how data works, and that it spoils. However, I also know how business works. If people believe the data is accurate (whether it is or not), they will act on that data. If a writer does not know what someone else is reporting about her sales, she can’t deal with the preconceptions that the person she’s negotiating with brings to the table. I’ve experienced that problem in negotiation many, many times. It’s much better to know what the perception is than it is to know the actual truth, especially in negotiation. So not having access to DG’s data automatically puts the writer at a disadvantage on the perception front.

  6. As you said, what is unfair is suddenly giving companies access, for a fee, to data the authors previously shared openly, without the authors being able to check that data if they don’t pay. It’s abusive and immoral.

    Like Joe Konrath, I used to share my numbers on my blog, but have stopped, because it involves work and time.

    But is secrecy in business in general a good thing? That’s quite a debate. Governments (well, maybe not all governments) are leveraging their power to lift banking secrecy in tax havens. Because the average taxpayer is transparent with his income, while businesses are not.

    In publishing, secrecy was and is used to protect and encourage an elitist class of bestselling authors (usually white old males).

    What Data Guy has done is wrong, but I wouldn’t rejoice at the return of secrecy in the publishing business.

  7. There are so many skeevy aspects to all of this. Nice to see people who realize the impact from a business perspective talking about it. Before I read all the blogs and commentaries, one (of many) of the big things that really made me disgusted was that the information could only be bought by companies with $10 million in annual sales.



    Without the Indies, Data Guy wouldn’t have been able to do what he did. Couldn’t have verified his algorithms. Couldn’t put out semi-accurate numbers. And then he turned around and raised his middle finger at all of them. What a traitorous move. Does he even realize that this is what it means? That he turned traitor to an entire community of business people?

    Smooth move.

    (And yes, I know what I said was an emotional response. There are so many aspects of that one move that could be talked about. For instance: Amazon is not the major selling platform for me as I talked about in this post: But that was my first response. A skeevy duplicitous traitor to the people who helped him. And no, I was not one of them who contributed information to help him verify his programming.)

  8. It is alarming how someone as aware of the need for privacy in his own line of work can strip clean the privacy rights of another industry. To say these numbers were there for the taking belies the number of hours of coding and ‘real world’ fact checking he did to get an algorithm he believes is true. If his investment in this algorithm outstripped his ability to obtain ethical compensation then Data Guy should have stopped pursuing this long ago. A need for compensation doesn’t justify villainy.

    1. “A need for compensation doesn’t justify villainy.”


      I guess the thing that struck me the most was that authors who wanted to know their own data, or even to see how they stood against other authors, can’t buy that information unless they’re making $10 million or more. How can that be justified?

      I personally feel that since so many authors were “outed” without their permission — whether the data is accurate or not doesn’t matter — then Data Guy should also be revealed. He wants to profit from this, let it be known who he is so people can decide if they want to associate with him or not.

  9. I was following this story when it first broke. First on Passive Voice, then on Dean’s blog. I think I understand the whole thing about privacy and business confidentiality when it comes to one’s business. Thanks for the details on this.

  10. Interesting business discussion, Kris. I actually post my rates online but my primary industry is pretty well commoditized (deliberately by the primary referral sources). So, as a point of differentiation, I make sure I have the highest prices in my marketplace.

    There is a ton of stuff that I keep to myself, though, especially work processes. When observant people ask questions, I simply state that I have remarkably efficient systems without going into details.

    One thing is absolute; I don’t violate confidences. I’m surprised Data Guy did. And, because we live in a creepy age, I’m clawing back as much privacy as I can.

  11. With the last AE Report and DG’s announcement, part of me was appalled on behalf of people I know being outed in a way they had not anticipated. The other part was laughing hysterically at trying to extrapolate the entire industry from skewed numbers, inaccurate names, and assumed business formations. And I say that as someone who considers herself indie, but I made nearly half my income for 2017 from trad publishing in books that do not bear my name.

    Also, the way DG dealt with the whole fiasco is a perfect example of how NOT to handle your own PR. Picard-Riker double facepalm

    I’m slinking back into my writing cave, and I think I’ll stay there until 2019.

  12. At the very least, businesses should have the choice of what information they want to share, how, and where. More than one author who used to publish their numbers has admitted to stopping due to getting harassed by jealous folks, where they noticed a definite connection between publishing numbers and getting spam and a flurry of review sabotage. It’s usually harder to sabotage things that are kept private and hidden.


    1. Just an add: Something that bothers me is how folks treat things as “Oh, just opt out if you don’t want in.”

      For things like this, where it’s a 3rd party using your data, you should have to opt in to be included, not opt out to be excluded. Assumed consent is unacceptable in most contexts, so why is it accepted on things like this?
