Wednesday, November 25, 2009

Editing by Machine

Our friend Julie in the comments emailed me to ask my opinion of a particular editing software package. Because my opinion is less than glowing, I won't name the software. It operates much like all the editing software I've seen, and so my comments could apply to several platforms.

First, I must disclose that I'm biased against the notion that content editing can be performed by mechanical analysis. How can any software analyze for strength and consistency of character? For relative emotional impact of premise and climax? For a reader's potential ability to bond with a character? Can a computer identify theme, motif, and symbol? If it works by analyzing text, how can it analyze subtext?

I would never rely upon software for content editing.

But just for kicks, and just to be sure I wasn't prejudiced against something that might work, I downloaded the software and ran one of my manuscripts through the machine. I used a manuscript under contract with my company, one I know inside out and upside down. I chose this because I wanted a clear understanding of the story and narrative so that the sample analysis would be meaningful. (And no, I won't tell you which manuscript it was. The only relevant point is that I am thoroughly acquainted with it.)

Here's what I learned:

1. How many words per sentence.

I did a spot check of its counting, and found one error. The error came in where a number was used in the text. This is not a big deal, I think, but worth mentioning.

Because it presented the counts in the same order that the sentences appear in the manuscript, this function might be useful for showing where the text might be rhythmically monotonous. Where there was a string of four sentences all with six words each, I checked the manuscript again. The sentences were fine -- two standard SVO constructions, one fragment, and one with an introductory prepositional phrase. Rhythm depends upon more than mere word counts, but still I can see where this tool might be useful.
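For anyone who wants to try this kind of rhythm check without buying anything, here is a minimal Python sketch. It is not how the unnamed software works; the function names, the naive sentence splitter, and the four-sentence threshold are all my own assumptions for illustration. It counts words per sentence in manuscript order and reports stretches where several sentences in a row have the same count.

```python
import re

def sentence_word_counts(text):
    """Return a list of (sentence, word_count) pairs in document order."""
    # Naive split (an assumption): a sentence ends at ., !, or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Count whitespace-separated tokens, so a numeral such as "1,200" counts as one word.
    return [(s, len(s.split())) for s in sentences]

def monotonous_runs(counts, min_run=4):
    """Return (start_index, run_length, word_count) for each run of equal counts."""
    runs = []
    start = 0
    for i in range(1, len(counts) + 1):
        # A run ends at the end of the list or when the count changes.
        if i == len(counts) or counts[i] != counts[start]:
            if i - start >= min_run:
                runs.append((start, i - start, counts[start]))
            start = i
    return runs

sample = ("The cat sat on the mat. A dog ran to the door. "
          "Birds sang in the old tree. Rain fell on the tin roof. Then it stopped.")
pairs = sentence_word_counts(sample)
runs = monotonous_runs([count for _, count in pairs])
```

A flagged run is only a prompt to go back and read the passage. As noted above, equal word counts do not by themselves mean the rhythm is monotonous.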

2. Flagging single word repetitions.

Again, in theory this could be a useful tool. In practice, it has its limits. It flagged the character names as overused, and even reported the exact number of usages to eliminate in order to overcome this objection. Strangely, it did not object to pronouns, so I ran a search in the original manuscript and found that pronouns outnumbered character names by a factor of more than ten. I don't know what to make of that. Perhaps the machine doesn't like proper nouns? (Worth noting: I had to use Word to count the pronouns. The fancy editing software doesn't let you select which terms are count-worthy.)
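If you would rather choose your own count-worthy terms, a few lines of Python can do it. This is only a sketch under my own assumptions: the function name `count_terms` and the sample sentence are made up, and it treats a "word" as any run of letters and apostrophes, which is cruder than a real tokenizer.

```python
import re
from collections import Counter

def count_terms(text, terms):
    """Count case-insensitive whole-word occurrences of each term in `terms`."""
    # Lowercase everything, then pull out runs of letters and apostrophes as words.
    words = re.findall(r"[A-Za-z']+", text.lower())
    tally = Counter(words)
    # A Counter returns 0 for words it never saw, so absent terms are safe to look up.
    return {term: tally[term.lower()] for term in terms}

# Hypothetical sample text; substitute the contents of your own manuscript.
sample = "Anna said she would go. She saw him, and he saw Anna."
result = count_terms(sample, ["Anna", "she", "he", "him"])
```

Comparing the totals for names against the totals for pronouns gives the same kind of ratio I had to work out by hand.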

The software didn't let me choose its parameters, so it generated some useless data such as the number of contractions in the manuscript. It was unable to pick out legitimate usages of the past progressive tense, and instead lumped all the "was walking" and "were kissing" moments in with all of the simple past conjugations of to be. Likewise, it counted the number of words ending in -ing without distinguishing between progressive tense participles and present participial phrases. None of that is of much value.

3. Identifying overused phrases.

The software claimed to be able to identify overused phrases. This part of the report, however, did little more than flag certain adverbs like when and where and then and after. The author earned a hearty "Good!" in this part of the report, but I'm not sure why. I suppose it doesn't like adverb phrases.

The other part of this section of the report listed several nouns that it claimed were overused. Seemed odd to include these in the part that was supposed to identify repetitive phrases -- phrases do have more than one word in them, after all. In any case, the nouns it flagged as "repetitive phrases" were appropriately used. This part of the report seems to have little value.

4. Dialogue tags.

The machine had no difficulty scanning the document for the word said. It picked up on some synonyms such as muttered, asked, blurted, and shouted, but missed hissed and snarled. (Alicia, make of that what you will!) It did not identify beats. I routinely strip tags during line edits, sometimes converting them to beats and sometimes eliminating them. This function might be useful if you were doing a search and destroy on tags, but there's a built-in option in Word that can do that already.

5. Misused words.

I expected this part of the report to identify misused words. Silly me! What it did was flag every word that has a homonym. To/too/two, pearl/purl, and so on. It left it to me to determine if the correct homonym had been chosen.

6. Spelling errors.

This baffled me. We had no spelling errors according to this report. The manuscript is set in an alternate world, complete with made-up nouns, and none of these triggered a spelling error. Word's built-in spellchecker went cuckoo over this same manuscript, but the special editing software blew right past everything. Makes me wonder if I somehow accidentally turned this function off.


And that was it. No pretense at literary analysis, just a simple word-counting program that counts what it thinks is important. You could potentially use this software to locate certain words you want to change, but I don't see why you would spend money for it. Word (and, I suspect, WordPerfect and other word processors) already lets you do this easily.

Here's how. This varies a little bit depending on which version of the software you're running, but here's the basic process.

Open up the dialogue box for the "replace" function.
In the "find" field, type something you want to flag. (Said, ing, ly, etc.)
In the "replace" field, type the exact same word.
While the cursor is still in the "replace" field, click the "More" button.
Click "Format" and select "font" from the menu.
Select a nice bright font color like red.
Click on "okay," and then "replace all."

Boom. You just flagged your word. Repeat as needed for every word you want to flag. And then, when you're going through your manuscript for that pesky manual content editing, you won't be able to avoid all your present participles or saids or -ly adverbs.
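The same trick works outside a word processor. Here is a rough Python sketch for plain-text manuscripts, offered as an illustration rather than a finished tool. Since plain text has no font colors, it wraps each hit in visible markers instead; the `>>` and `<<` markers, the function name, and the three patterns are my own assumptions.

```python
import re

def flag_matches(text, pattern, left=">>", right="<<"):
    """Wrap every regex match in visible markers, leaving the matched text unchanged."""
    return re.sub(
        pattern,
        lambda m: f"{left}{m.group(0)}{right}",
        text,
        flags=re.IGNORECASE,
    )

# Hypothetical patterns for the three examples above.
SAID = r"\bsaid\b"          # the whole word "said"
ING_WORDS = r"\b\w+ing\b"   # any word ending in -ing
LY_WORDS = r"\b\w+ly\b"     # any word ending in -ly

line = "He said he was walking slowly."
flagged_said = flag_matches(line, SAID)
flagged_ing = flag_matches(line, ING_WORDS)
flagged_ly = flag_matches(line, LY_WORDS)
```

Note one difference from the Word steps: typing "ing" into the find field matches those letters anywhere in a word, while these patterns match only at the end of a word. Either way you will catch innocent words like "ring" and "family," so the flags still need a human eye.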

Theresa

7 comments:

Leona said...

Thank you! I've wondered if I should get another program as well. I've downloaded a couple of writing programs, and I will try to learn them at a later date. However, I think if I were to use the Word program to its full advantage, I wouldn't need many other programs.

Maree Anderson said...

OMG, thank you, Theresa! I routinely use Word's search or search/replace options, but I never thought of simply changing the font color so you then have an instant visual flag right throughout your ms. Legend!!!

Linda Adams said...

These programs can be bad for a beginner writer looking for rules. There was one where it counted the number of words and told people to eliminate words like "was" as being passive. So this one writer went through his first chapter and took out every was, revising sentences so they didn't use it. It was also pretty obvious from the convoluted sentences that this was what he did.

Leona said...

I had someone from fan story comment on one of my stories. She told me I needed a creative writing class (not a bad thing) but it was for using the word was.

She said that according to her class, all usage of "was" was passive voice, and she seriously ripped me a new one over it, ending her comment with "WAS WAS WAS" you need help! It really messed me up! I use Word's help by turning on the passive voice markers and reread the definition frequently to help me remember it.

JewelTones said...

Thanks, Theresa. :) You see so many software programs floating out there for writers promising to correct so many things and make writing this 1-2-3, connect the dots kind of plug and play thing that just bugs the heck out of me. And I don't know... Is it just me, or does it seem like some people just want shortcuts to writing without having to *learn* the craft at all? It gets very frustrating when you're trying to talk about writing as a craft and the arguments... Oy!

Oh, and Leona, I feel your pain. I think passive voice is one of the big things to flag in writing and yet people try to slap rules like the never use "was" one as a cure all when sometimes? Sometimes you're just better off using the dang "was" in the sentence instead of trying to get all convoluted about it. LOL. (normal writing disclaimer about passive voice and 'was' inserted here).

My favorite is the head-hopping argument. It's not head hopping! It's multiple perspectives! or So-and-so writer does it so I can too! {sigh} Though I did find a series of articles on POV that talked about how POV and passive voice can overlap into each other to cause problems in writing to be really interesting.

Okay, way off topic there. LOL. Sorry about that.

Thanks, Theresa!

JT

Adrian said...

I think this is a very interesting topic. I'm a computer programmer as well as an aspiring writer. I've often dabbled in making programs to help me with copy-editing and surfacing some types of writing problems that are hard for me to spot in my own writing.

It's true that nothing today will come close to what a real editor can do. But note that Theresa tried out the unnamed software on a manuscript under contract. This narrative has already risen to a high level of quality. Thus it's not surprising that the software is limited in how much more it can help. It would be interesting to see if someone who isn't yet selling could improve his or her manuscript with such a tool (or at least learn what it would take to get to the next level).

There is no silver bullet software solution, but that doesn't mean software can't be helpful.

I'm old enough to remember when the idea of putting a dictionary--scratch that, a word list--on a PC was ludicrous. Today you probably have at least half a dozen individual spelling checkers--each with their own dictionary--on your laptop.

Grammar checkers used to be an utter joke. Some people still regard them as such, even though they're vastly improved. Unfortunately, most of them are designed and tuned to help with business correspondence rather than prose, so their reputation among writers isn't consistent with their capabilities.

Technology in textual analysis is improving at an accelerating pace. Today's best spelling checkers (like the ones on the web search box of your favorite search engine) use context to help recommend the right spelling with astounding accuracy--so much so, that we're starting to think of them as spelling correctors instead of spelling checkers.

This advancement is possible due to the explosion of text available electronically. Statistical models can be built from corpora of tens of millions of words by a hobbyist with a network connection.

Sure, understanding story structure is still a few levels away. But it is coming. Machine translation from one language to another has made huge advancements in the past couple years. Doing good translation requires a level of semantic understanding beyond grammar and syntax. So these text-analysis programs are beginning to achieve a rudimentary level of comprehension that already surpasses many relatively recent predictions.

Proofreading tools have been useful for a while now, and they continue to improve. Style tools will soon be useful to all but the best writers and editors. After that, programs that understand genre conventions and story structure will emerge. And with live access to everything written, they will automatically keep abreast of the trends in the conventions as they evolve.

Falen said...

Changing the font color is genius! I don't know why this has never occurred to me.

Question - can you recommend any editing/revision books? I have a friend who is looking for a few and this post seemed pretty timely.