
Thursday, November 04, 2010

Extracting data from a text in PDF

Converting data from a PDF to a spreadsheet can be done, but a PDF is made for reading, not machine processing, so even once converted, the raw result is almost useless. It takes two steps:
  1. convert it to text
  2. convert the text to a database or spreadsheet. PDF2XL works (video, 1 min).

I LIKE THIS ONE: 
HIGHLIGHT AND EXTRACT METHOD: 14-day trial, 129 USD. Lets you select table areas of a PDF and convert them instantly. Nice job.
Download Cogniview PDF2XL


I DON'T LIKE THESE:
HIGHLIGHT AND EXTRACT METHOD:
Download Able2Extract Professional Free Trial & Full Version (7-day trial, 3 pages at a time)
It works (video, 1 min).


COMMAND LINE METHOD: finds a text string in every .txt file in the current folder and copies each line where it occurs to new.txt.


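@echo off
rem Start with an empty new.txt, then append every matching line from each .txt file.
type nul > new.txt
for %%T in (*.txt) do if /i not "%%T"=="new.txt" find "MACHINING CYCLE TIME" < "%%T" >> new.txt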
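
For anyone not on a Windows command prompt, the same idea can be sketched in Python. This is only a rough equivalent of the batch file above, not a replacement for it: the function name collect_matching_lines and its parameters are made up for illustration, and it assumes the text files are readable as UTF-8.

```python
from pathlib import Path


def collect_matching_lines(folder, needle, out_name="new.txt"):
    """Write every line containing `needle` from the .txt files in `folder` to `out_name`.

    Hypothetical helper for illustration; returns the matching lines as a list.
    """
    folder = Path(folder)
    matches = []
    for path in sorted(folder.glob("*.txt")):
        # Skip the output file so old results are never searched again.
        if path.name.lower() == out_name.lower():
            continue
        # Assumption: files are UTF-8; undecodable bytes are replaced, not fatal.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if needle in line:
                    matches.append(line.rstrip("\n"))
    # Overwrite the output file with the fresh results, one match per line.
    (folder / out_name).write_text(
        "".join(match + "\n" for match in matches), encoding="utf-8"
    )
    return matches
```

Calling collect_matching_lines(".", "MACHINING CYCLE TIME") from the folder that holds the text files should leave the matching lines in new.txt, much like the batch version.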


SEARCH AND REPLACE METHOD: use regular expressions (regex) to insert column separators where required, i.e. parse the text:
Extract data from a text file (and gain an understanding of regex, csv & data importing in the process)
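
To give a feel for the search and replace idea, here is a small Python sketch. It is not taken from the linked article. It assumes that columns in the extracted text are separated by a tab or by two or more spaces, which will not hold for every PDF, and the names COLUMN_GAP and text_to_csv are made up for illustration.

```python
import csv
import io
import re

# Assumption: a column boundary is a tab, or a run of two or more spaces/tabs.
COLUMN_GAP = re.compile(r"[ \t]{2,}|\t")


def text_to_csv(text):
    """Turn space-aligned text into CSV, one row per non-blank line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        line = line.strip()
        if line:
            # Split at each column gap; csv.writer quotes fields that contain commas.
            writer.writerow(COLUMN_GAP.split(line))
    return out.getvalue()
```

A single space is left alone, so a value like "MACHINING CYCLE TIME" stays in one column. The result can be saved with a .csv extension and opened in a spreadsheet.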


CODE METHOD 1:
http://www.left-brain.com/tabId/65/itemId/1697/pageId/2/Automating-Windows-Administration.aspx


TEMPLATE PARSER METHOD:
Text Template Parser - Text Convertor, Extract Data from Text Files, Web pages and Emails


CODE METHOD 2:
Extracting data from text file [Archive] - CodeGuru Forums