Using Text Analytics In Your Writing

CaptureA while back, I used a wordcloud on the text of my own book to tune and get insights into my writing style. Here, see what I mean. The idea is to try and take an objective look at your work in progress, a book you’ve got cooking and that you probably think is awesome (of course!) using text analytics rather than your brother or your dad.  Go ahead and blast me as a nerd, tell me machine learning will never take the place of emotional impact, and blah blah blah. I zoned out on you. What I’m suggesting is a supplemental tool to get another set of eyes on what you’re creating to go deep into it, and make it the best it can be.

You can wiki ‘text analytics’ with the link above; but the short version is to break down the raw text with software into just its individual words, ignoring common words like ‘and’. Typically, you’d also ignore capitalization and punctuation. Once it’s just a bunch of the remaining words, you can count their frequencies and display them in a way that highlights that. Wordclouds like the one in the header here just make them bigger if they repeat more than others. You could as easily chart them. I used the R programming language with ‘tm’ and ‘wordcloud’ packages for mine – here’s the code if you want it:


I won’t go into the deets on what that cloud tells me – I covered that sort of thing in the other article. Quickly though, I do see I’m consistent with similes and the mindset in my narrative style of trying to think like a movie camera. This book I’m writing now is a much more intimate book, less bombastic grandiose scale than my first one. Since it’s more visceral and up-close with the characters, I’m seeing even support character names showing up in the cloud. That surprised me. I actually went back to earlier chapters and shifted who was acting and speaking to even out the balance when I saw one of the main characters dominating.

I am experimenting with something called ‘sentiment analysis’, which also sounds useful. I couldn’t make it work, or I’d be writing about that right now. Sentiment analysis relies on lexicons (there are three of them out there that I know of) which contain a crap-ton of English words with assigned scores for positive/negative sentiment or emotional content. For example, the NRC lexicon scores ‘yes’ or ‘no’ for a word for positive, negative, anger, anticipation, disgust, fear, joy, sadness, surprise, and trust. That means you can get a rough measure on the emotional content of the storyline, even chart the course of the book and see when things started getting more hopeful, sadder, or whether you overdid the foreshadowing. If I can get that into something meaningful, I’ll share it with you here and tell you how I did it. The book I’m referencing is called Text Mining With R.

Anyway, that’s the thought. Maybe the real advice for me isn’t that I need another way of getting feedback on what I’m writing, but rather just to stop getting distracted by shiny things and sit down and write it?! Yes, I don’t need software to tell me that. I’m two and a half years in on this one, though I had dreams of finishing in a year. I feel good about it; and it’s going to be a sharper, fresher work because of all the time I’m spending on it for sure.

Still, I’d like to see it on my bookshelf!

Merry Christmas, guys!




Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s