Will Google Chrome’s ubiquitous spellcheck/suggest improve the state of Internet grammar?

Well-known 19th-century pangram: “The quick brown fox jumps over the lazy dog” (Photo credit: Wikipedia)

Google Chrome is a popular browser.  There is some debate about how popular, because these things are hard to measure.  Being advertised on Google properties helps (watch out for the FTC) but so does being secure and fast.

One of the features of Chrome that doesn’t get talked about much is that it has a spell checker built into every field you can type in.

Who cares, right?

But think about it – you no longer need to rely on whatever spell checking tool a site owner happened to install, if any.  Now you have a spell checker regardless of what site you’re on, and regardless of whether the site owner thought you needed one.  When would you not need it?

If spell checking is everywhere then you can even forget about having it and you’ll only be reminded when you misspell something.  It’s a pleasant little surprise – “good catch, spell checker – you’ve got my back!”  Once you get used to having it, using a browser without this feature is painful.

So what happens when you combine a popular browser with ubiquitous spell checking?  A lot more spell checked emails, spell checked blog posts, spell checked status updates…

In theory, this should improve the spelling of the whole Internet.  Millions of people spelling a bit better means millions of new pieces of content that millions of other people are reading.  As they read they’re being reminded of how things should be spelled.  It has to be better, right?

Maybe.  It depends on whether people care about the little red underline.

In November 2011, the Chrome team announced that they were going to change Chrome to use Google Suggest as its spell checker.  Previously it was using Hunspell, an open-source spell checking library that many other applications use.  The problem with Hunspell (and all similar dictionaries) is that it only has a limited list of the most popular words, so regardless of who you are, there’s likely to be some word you use that is spelled correctly but that Hunspell flags as misspelled simply because it isn’t in its dictionary.
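Here is a rough sketch of why a fixed word list produces false alarms. It is not Hunspell's real code or API: the tiny `WORD_LIST` and the `flag_unknown_words` helper are made up for illustration, and a real Hunspell dictionary is much larger and also handles prefixes and suffixes.

```python
# Toy stand-in for a fixed dictionary (hypothetical; not Hunspell's data or API).
WORD_LIST = {"the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"}


def flag_unknown_words(text, word_list=WORD_LIST):
    """Return every word in `text` that is missing from `word_list`."""
    return [word for word in text.lower().split() if word not in word_list]


# "googled" is spelled correctly, but it is not in the list, so it gets flagged.
print(flag_unknown_words("the quick brown fox googled the lazy dog"))
```

The checker has no way to tell a typo from a word it has simply never heard of. Both get the same red underline.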

Now who has a great big list of (pretty much) every word ever written?  Google does…  And what service already uses this data to predict words that usually go together?  Google Suggest does…  Sounds like a match made in silicon heaven.

To use the data as a spell checker, all Google Suggest has to do is compare how often the word you typed is used with how often the word it would suggest in its place is used in the same context.  For example, if you typed “cnat dance”, Google Suggest doesn’t actually know that “cnat” isn’t a valid word – it doesn’t have a list of valid words like Hunspell does – but it does know that it is very rarely (if ever) used in reputable documents (like UN transcripts).  Then, to determine the correct spelling, it checks which similar word is commonly used in the same context and suggests “can’t”.
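The idea can be sketched in a few lines. This is a toy, not how Google Suggest is actually built: the phrase counts are invented, "similar" is approximated with Python's `difflib`, and the `suggest` function and its thresholds are my own assumptions.

```python
from difflib import SequenceMatcher

# Invented counts of "word + next word" pairs, standing in for a huge web corpus.
PHRASE_COUNTS = {
    ("can't", "dance"): 9_000,
    ("could", "dance"): 3_000,
    ("cant", "dance"): 40,
    ("cnat", "dance"): 1,
}


def similar(a, b, cutoff=0.6):
    """Crude similarity test; the 0.6 cutoff is an arbitrary assumption."""
    return SequenceMatcher(None, a, b).ratio() >= cutoff


def suggest(word, next_word, counts=PHRASE_COUNTS, min_ratio=10):
    """Suggest a similar word that is far more common before `next_word`."""
    typed_count = counts.get((word, next_word), 0)
    candidates = [
        (count, other)
        for (other, following), count in counts.items()
        if following == next_word and other != word and similar(word, other)
    ]
    if not candidates:
        return None
    best_count, best_word = max(candidates)
    # Only speak up when the alternative is overwhelmingly more frequent.
    if best_count >= min_ratio * max(typed_count, 1):
        return best_word
    return None


print(suggest("cnat", "dance"))   # can't
print(suggest("can't", "dance"))  # None
```

Notice there is no list of valid words anywhere. "cnat" loses only because almost nobody writes it in front of "dance".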

In addition to misspellings, Google Suggest offers other possible corrections, such as fixes for grammar errors.  Since these aren’t as certain and maybe not as important as spelling errors, they’re underlined in grey instead of red.  For example, if you typed “cant dance”, even though “cant” is a valid word (that Hunspell would let go flying past), Google Suggest would point out that “can’t dance” is used a LOT more frequently.  So “cant” is probably not what you meant, and you get a grey underline.
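One way to picture the red versus grey decision, purely as a guess at the logic: red when the word itself is almost never seen, grey when the word is common but the alternative phrase is far more common. All counts, cutoffs, and the `underline` function below are hypothetical.

```python
# Invented counts for illustration only.
WORD_COUNTS = {"cant": 50_000, "can't": 5_000_000, "cnat": 3}
PHRASE_COUNTS = {
    ("cant", "dance"): 40,
    ("can't", "dance"): 9_000,
    ("cnat", "dance"): 1,
}

RARE_WORD_CUTOFF = 100  # assumption: below this, the word itself looks suspect
PHRASE_RATIO = 10       # assumption: how much more common the alternative must be


def underline(word, next_word, alternative):
    """Return "red", "grey", or None for `word` typed before `next_word`."""
    typed = PHRASE_COUNTS.get((word, next_word), 0)
    alt = PHRASE_COUNTS.get((alternative, next_word), 0)
    if alt < PHRASE_RATIO * max(typed, 1):
        return None    # the alternative is not clearly better, so stay quiet
    if WORD_COUNTS.get(word, 0) < RARE_WORD_CUTOFF:
        return "red"   # the word is hardly ever seen: likely a misspelling
    return "grey"      # a real word, but unusual in this context


print(underline("cnat", "dance", "can't"))  # red
print(underline("cant", "dance", "can't"))  # grey
```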

The weakness of this approach is that it uses the content of the Internet to determine what’s right.  So if eventually enough people spell “can’t” as “cnat” or “cant”, Google Suggest will start thinking that’s what’s right.
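To see how that drift would play out, here is a small sketch with made-up numbers. The `still_corrected` function and the "ten times more common" rule are assumptions for illustration, not anything Google has published.

```python
def still_corrected(correct_count, misspelled_count, ratio=10):
    """True while the correct form is at least `ratio` times more common."""
    return correct_count >= ratio * max(misspelled_count, 1)


correct = 9_000  # invented count for "can't dance"
for misspelled in (1, 100, 1_000, 10_000):
    # As the misspelling spreads, the checker eventually stops flagging it.
    print(misspelled, still_corrected(correct, misspelled))
```

With these numbers the correction holds at 1 and 100 uses of the misspelling, and disappears by 1,000.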

So do your part – spell something correctly on the Internet!

EDIT: Apparently I jumped the gun, and not all of these features had rolled out yet.
