Automation

Friday, 12 September 2008 09:20 pm
da: (bit)
[personal profile] da
As Dan was making dinner, I told him how I'd just spent a frustrating hour fighting with Apple Automator, trying to convert a ten-page PDF into ten JPEGs, which I could then run through the text-recognition program that came with my scanner, to ultimately have a plain-text document to email.

First, my command-line tool of choice, "convert", made crappy low-res output.
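One likely cause, though I never confirmed it here, is that ImageMagick rasterizes PDFs at 72 dpi unless told otherwise with `-density`. Below is a minimal sketch of building a higher-resolution `convert` call from Python. The file names, the 200 dpi default (borrowed from the resolution of the original scan) and the helper name `build_convert_command` are all illustrative assumptions, not what I actually ran.

```python
import shlex


def build_convert_command(pdf_path, out_pattern="page-%02d.jpg", dpi=200, quality=90):
    """Return the argument list for an ImageMagick `convert` call.

    Hypothetical helper: the defaults are assumptions for illustration.
    """
    return [
        "convert",
        "-density", str(dpi),      # resolution used to rasterize the PDF; given before the input file
        pdf_path,
        "-quality", str(quality),  # JPEG compression quality
        out_pattern,               # %02d is filled in with the page index by ImageMagick
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_convert_command("scan.pdf")))
```

To actually run it, the list can be handed to `subprocess.run(cmd, check=True)`, assuming ImageMagick (and the Ghostscript it relies on for PDFs) is installed.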

Then Automator wouldn't do anything at all with it: I found an Action that was supposed to turn a PDF into images, but it seemed to produce no output. How frustrating! Dan's response was, "You only need to get this to half a dozen people, right? Why don't you just make copies of the document and snail-mail them to people?"

*sigh*

Maybe he's right.

But in the meantime, I've learned:

Why Automator wasn't working: it left the output in /private/tmp instead of my working directory. (aaand this is documented... where?)
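If the images land somewhere unexpected, a small script can sweep them into the working directory. This is only a sketch: the function name `collect_outputs` and the list of suffixes are my own assumptions, and the exact location under /private/tmp where Automator writes is not something I pinned down, so the source directory is left as a parameter.

```python
import shutil
from pathlib import Path


def collect_outputs(src_dir, dest_dir, suffixes=(".jpg", ".jpeg", ".png", ".tiff")):
    """Move image files from src_dir into dest_dir and return the new paths.

    Hypothetical helper: only looks at the top level of src_dir.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(src.iterdir()):
        # Skip subdirectories and anything that is not an image by extension.
        if path.is_file() and path.suffix.lower() in suffixes:
            target = dest / path.name
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```

For example, `collect_outputs("/private/tmp", ".")` would pull any top-level images from /private/tmp into the current directory; check what else lives in that folder before running it.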

How to turn an Automator workflow into a right-click menu item that applies to any file or directory (Save As Plug-in, then choose "Finder" from the dropdown).

How to make OmniPage SE turn a set of JPEGs into a nicely formatted .txt file of the text (drag the images from the Finder onto OmniPage; it will sort them in alpha order; click "OCR" and cursor through the words it has questions about). I'm impressed with the accuracy; this time around, it was much faster than retyping. And since I got the original scan by dropping the 10-page document into the auto-feeder of the copier at work and emailing the .pdf to myself, this sort of feels like a win. (Next time I'll send a multipage .tiff, and skip the intermediate .jpg step.)
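Because the pages get sorted alphabetically, file names matter once there are ten or more of them. Assuming "alpha order" means a plain character-by-character sort (I did not verify how OmniPage compares names), unpadded numbers put page 10 ahead of page 2. The sketch below, with the made-up helper `padded_names`, shows the problem and the zero-padded fix.

```python
def padded_names(count, prefix="page-", ext=".jpg"):
    """Return file names numbered 1..count, zero-padded to a common width."""
    width = len(str(count))
    return [f"{prefix}{i:0{width}d}{ext}" for i in range(1, count + 1)]


if __name__ == "__main__":
    unpadded = ["page-1.jpg", "page-2.jpg", "page-10.jpg"]
    print(sorted(unpadded))          # page-10.jpg sorts before page-2.jpg
    print(sorted(padded_names(10)))  # stays in page order
```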

And finally, Newseum's Front Pages: today's front page from 600 of the world's newspapers (including our own), each of which you can download as a PDF. This was a tangential link from a Google search for help with Automator and PDFs; someone wrote a script to make you a set of the front pages of your favourite world papers on demand.

Date: Saturday, 13 September 2008 02:10 pm (UTC)
From: [identity profile] da-lj.livejournal.com
Nope, the original was 200 dpi. I made 200 dpi JPEGs, which are fine for OCR. I think you start having JPEG artifact problems at <100 dpi or so.
