Optical Character Recognition (OCR) in Google Docs

7 July, 2010

Did you know that you can convert scanned text into Google Docs format? This is particularly handy if you have hard copies of old records or documents that need to be in text format. Rather than spending hours typing them out, you can scan them to a JPEG, GIF, PNG or PDF file and import that into Google Docs. Google’s OCR technology will do the rest.

The original image is included in the doc to make it easier for you to edit and correct mistakes.
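
If you have a whole folder of scans, it can help to check which files are already in one of the formats mentioned above before you start uploading. Here is a minimal Python sketch of that check. The function names and the exact extension spellings are my own assumptions for illustration, not anything Google provides, and the script does not talk to Google Docs at all.

```python
from pathlib import Path

# Formats named in the post (JPEG, GIF, PNG, PDF).
# The extension spellings below are an assumption for illustration.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png", ".pdf"}


def is_ocr_candidate(filename: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_scans(filenames):
    """Split filenames into (ready, needs_conversion), preserving order."""
    ready = [f for f in filenames if is_ocr_candidate(f)]
    needs_conversion = [f for f in filenames if not is_ocr_candidate(f)]
    return ready, needs_conversion
```

For example, `split_scans(["a.pdf", "b.bmp", "c.png"])` puts `a.pdf` and `c.png` in the first list and `b.bmp` in the second, so you know which scans to re-save before importing.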

Interestingly, the feature came about as part of Google’s 20% time. Software engineer Jaron Schaeffer was presented with a problem: a colleague’s wife had found a stack of ancient family chronicles in the attic and wanted to continue writing them.

Here’s the tech bit: http://googledataapis.blogspot.com/2009/09/import-scans-or-go-multilingual.html

More info: http://googledocs.blogspot.com/2010/06/optical-character-recognition-ocr-in.html

4 Responses to Optical Character Recognition (OCR) in Google Docs

  1. SallyF says:

    Oh my goodness I had no idea it did that! Be interesting to input things that aren’t text deliberately to see how it reads other things – like if it could create writing out of a picture or a pattern…

  2. Jenny Hudson says:

    Absolutely. It could yield some interesting results. And it would be great for genealogy.

  3. Pingback: Twitter Trackbacks for Vanilla Storm: Blog » Blog Archive » Optical Character Recognition (OCR) in Google Docs [vanillastorm.com] on Topsy.com

  4. Lilly says:

    Thanks a lot for this article. I personally find a lot of information about OCR technology on http://www.ocrworld.com. They also have a forum and you can post your questions there.
