More Book stats from Amazon

http://userslib.com/2007/10/30/amazon-adds-word-stats/

 

http://www.stevenberlinjohnson.com/2007/10/this-may-be-old.html

 

Literary Style By The Numbers

This may be old news to some of you, but I just noticed the other day that Amazon has added a whole panel of "text stats" for many of its books. I noticed it because my last book The Ghost Map just came out in paperback (go read it, people -- it's a lot more fun than this post will turn out to be) and so I'm back into the swing of checking Amazon a few times a day. Text Stats is a pretty wonky page -- everything from some of the "readability" indices, to overall word count, to what Amazon calls "Fun stats" like "Words per dollar." (Quotes you never hear at Barnes and Noble: "This copy of Infinite Jest is such a bargain at only 39,574 words per dollar!")

But the two stats that I found totally fascinating were "Average Words Per Sentence" and "% Complex Words," the latter defined as words with three or more syllables -- words like "ameliorate", "protoplasm" or "motherf***er." I've always thought that sentence length is a hugely determining factor in a reader's perception of a given work's complexity, and I spent quite a bit of time in my twenties actively teaching myself to write shorter sentences. So this kind of material is fascinating to me, partially because it lets me see something statistically that I've thought a great deal about intuitively as a writer, and partially because I can compare my own stats to other writers' and see how I fare. (Perhaps there's a literary Rotisserie league lurking somewhere on those Text Stats pages.)
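The two stats described above are simple enough to approximate by hand. What follows is a minimal sketch, not Amazon's actual method: the function names `count_syllables` and `text_stats` are hypothetical, sentences are split naively on `.`, `!` and `?`, and syllables are estimated by counting runs of vowels, so the numbers will only roughly track what the Text Stats page reports.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count vowel runs, discounting a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a final 'e' as silent ("rate"), but not a final 'le' ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> dict:
    """Return word count, sentence count, words per sentence and % complex words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text contains no sentences or no words")
    # "Complex" follows the definition in the post: three or more syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return {
        "words": len(words),
        "sentences": len(sentences),
        "words_per_sentence": len(words) / len(sentences),
        "pct_complex_words": 100 * len(complex_words) / len(words),
    }
```

Calling `text_stats` on the full text of a chapter gives one point for a scatter chart like the one described below; a vowel-run heuristic will miscount some words, so treat the complex-word percentage as an estimate.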

So I spent a few hours last week plugging in the numbers for my books, as well as a few other authors that I assembled in an entirely unscientific fashion: Malcolm Gladwell, Steven Pinker, Seth Godin, Christopher Hitchens -- and then, just to see how far I'd come, I threw in my intellectual (and, sadly, stylistic) heroes from my early twenties, the post-structuralist legends Michel Foucault and Fredric Jameson. I compiled stats for 3-4 books for each author, except Gladwell, who has written two, and then plotted them on a scatter chart, with the y axis representing % complex words and the x axis representing words per sentence. The results were pretty fascinating:

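[Chart: scatter plot of each author's books, with % complex words on the y axis and words per sentence on the x axis]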

Some observations:

1. There's a clear cluster of Hitchens/Johnson/Pinker in the center. (From eyeballing some other Amazon pages, I think Dawkins, Michael Pollan, E. O. Wilson would have been in that general area as well.) But what I thought was so striking was that even in that cluster, each author's books are closer to his other books than they are to the other two authors' books. In other words, each of us has a certain sweet spot of complexity that we come back to book after book. My first and last books, Interface Culture and Ghost Map, had the exact same words per sentence, down to the decimal point: 24.6. (My longest sentences turned out to be in Emergence, followed closely by Everything Bad at 25.8 and 25.7.) Pinker tends to be just slightly less complex syntactically (with the one outlier, Blank Slate, which is more complex than anything I've written). And Hitchens tends to write longer sentences by a couple of words.

2. Gladwell's sentences are fully 25% shorter than mine. I'm not sure if the average reader would notice the difference between the Johnson/Hitchens/Pinker cluster, but a 25% drop in sentence length has to alter the reading experience dramatically. Clearly, the only things separating me from selling ten million copies of my books are those extra 6.5 words per sentence.

3. Check out Foucault and Jameson. They are literally on another planet. The top spot goes to Jameson's "Postmodernism" book, which I read like scripture my first year of grad school: 53 words per sentence! Interestingly, most of the variation shows up in sentence length, not in word complexity -- you often hear people complain about the impenetrable jargon of critical theory, but it looks here like the sentence length is at least as much of a culprit.

4. I would love to see some stats on dynamic range here: not just average sentence length, but how much the sentence lengths vary over the course of each book. One of the things I learned when I started writing in a less academic style (largely when I was doing FEED) is the importance of throwing in a very short sentence for emphasis at regular intervals. (Come to think of it, I may have learned this from reading Gladwell's early pieces in the New Yorker.)

5. Is there a Literature grad school version of the Lazy Web? If so, I would love to see a study that cross-referenced sales and syntactical complexity across thousands of books and determined who had the highest sales-to-complexity ratio of all time.

6. After looking at the Jameson number, I went back to one of my papers from junior year at Brown to see how awful my prose was. I pulled up the scariest sentence in the first paragraph and did a quick word count: 75 words. 75! And no semi-colons either. I bet Fred Jameson's pretty psyched I never finished that PhD...
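The "dynamic range" wished for in observation 4 is not something the post says Amazon reports, but it can be estimated from the text itself. A minimal sketch under stated assumptions: the names `sentence_lengths` and `sentence_length_spread` are hypothetical, sentences are split naively on `.`, `!` and `?`, and the population standard deviation stands in for "how much the sentence lengths vary".

```python
import re
import statistics


def sentence_lengths(text: str) -> list:
    """Return the word count of each sentence, using a naive punctuation split."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]


def sentence_length_spread(text: str) -> dict:
    """Summarize how much sentence length varies across a text."""
    lengths = sentence_lengths(text)
    if not lengths:
        raise ValueError("text contains no sentences")
    return {
        "mean": statistics.mean(lengths),
        # Population standard deviation: larger means more varied sentence lengths.
        "stdev": statistics.pstdev(lengths),
        "shortest": min(lengths),
        "longest": max(lengths),
    }
```

Two books with the same average words per sentence could then be told apart by `stdev`: a writer who drops in very short sentences for emphasis should show a wider spread than one who writes at a steady length.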


Books with a lot of pages bound in one volume

http://vkb.110mb.com/cdrom/textbooks/Harrison%27s%20Principles%20of%20Internal%20Medicine,%2017th%202008.htm

4192 pages:

http://www.amazon.com/dp/0323019854/ref=nosim?tag=dnssesecurthe-20&link_code=as3&creativeASIN=0323019854&creative=373489&camp=211189