Data Import

Viewing 3 posts - 1 through 3 (of 3 total)
  • #343 Score: 0
    Joachim Engel
    Participant

    The Drag-and-Drop Data import tool is really very nice and helpful.

    Yet it does not seem to work with tables from PDF files.

    Much of the data I am interested in on the web is in PDF files. Is there a clever way to import this data?

    #344
    Jonathan Sandoe
    Keymaster

    Hi Joachim,

    I am able to drag and drop tables from PDF documents without difficulty using the CODAP drag and drop tool.

    The answer lies not with CODAP but with the software you are using to display the PDF. When you select text in a document and then copy it or begin to drag it, most modern software creates an HTML version of the selection and puts it in its copy or drag buffer. In the process, it looks for anything that resembles tabular data and converts it to an HTML table. If the software is not sufficiently modern, or if for some reason it cannot recognize the data as tabular, it will not do this. When you paste or drop something into the CODAP drag and drop tool, the tool simply looks for tables in the HTML data stream.
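To make the "looks for tables in the HTML data stream" step concrete, here is a minimal sketch in Python. It is not CODAP's actual code; the class and function names are made up for illustration, and it assumes simple, non-nested tables. It only shows why a selection that the PDF viewer exported as an HTML table can be picked up, while the same text exported as a plain paragraph yields nothing.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect each <table> in an HTML fragment as a list of rows of cell text.

    Hypothetical helper for illustration; nested tables are not handled.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> found
        self._row = None   # row currently being filled, if any
        self._cell = None  # text pieces of the cell currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a table cell is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def find_tables(html):
    """Return every table found in the HTML string (empty list if none)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables
```

Called on a fragment such as `<table><tr><th>Year</th><th>Value</th></tr></table>`, `find_tables` returns one table with one row; called on the same words wrapped in a `<p>` element, it returns an empty list, which corresponds to the case where the drop appears to do nothing.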

    So, here are my suggestions:

    1. Try opening the PDF file in a different tool. On Macs there are the Adobe PDF reader and “Preview”. On PCs there are similar tools. Most browsers can also display PDFs.
    2. Check whether the tool you are using to display PDFs is up to date. If not, update it.
    3. Try other PDFs with tables. Maybe some property of the PDF you are working with is problematic.
    4. Try different selection areas. Try areas that encompass the whole table. Try selecting the whole document.

    If these don’t work, I am at a loss to suggest what you may do.

    #346
    Bill Finzer
    Keymaster

    Hi Joachim,

    I was able to follow Jonathan’s suggestion using the DropHTMLTable plugin on the CODAP Data Interactives Plugins page (under utilities). I selected the table text in the PDF and pasted it into the plugin. For some reason I wasn’t able to directly drag into CODAP.

    I also discovered a free utility called Tabula that appears to address this need, allowing you to extract tables from PDFs and turn them into a CSV file. I tried it in a simple case and it worked fine. Let us know if you have any luck with it.
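If you go the CSV route, it can help to check the extracted file before importing it, since table extraction from PDFs sometimes splits or merges cells. The following is a rough sketch using only Python's standard `csv` module; it assumes the export is ordinary comma-separated text with a header row, and the function name is hypothetical (it is not part of Tabula or CODAP).

```python
import csv
import io


def check_csv_table(csv_text):
    """Split CSV text into header and body rows and flag ragged rows.

    Returns (header, body, ragged), where ragged lists the positions of body
    rows whose cell count differs from the header's. Positions count
    non-empty rows only, with the header as row 1.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        return [], [], []
    header, body = rows[0], rows[1:]
    ragged = [
        position
        for position, row in enumerate(body, start=2)
        if len(row) != len(header)
    ]
    return header, body, ragged
```

An empty `ragged` list suggests every row lines up with the header; any positions it reports are worth inspecting by hand before dropping the file into CODAP.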

    Bill
