
PDF / Document files as a Data Source in Modeller

Hi Avi / Pyramid Dev team,

Information that is useful to a Pyramid client is often stored in PDF or other document formats - for example, where the client has many suppliers who do not exchange data with it through any kind of direct data interchange.

I deal with many customers who would be good examples of this use case - smaller retailers who purchase goods from numerous wholesale suppliers and receive invoices in a similar format to those attached here - PDFs, Word docs etc.

As shown in the image, perhaps a way of tagging certain label names and extracting both the label name and its associated value - by name, relative position or similar - would allow Pyramid to pull the relevant fields and values from Word or PDF documents, to be used alongside additional customer data for greater overall insight.

I understand this functionality is being included in some other BI toolsets as new functionality.

I look forward to your thoughts on this.

Thanks and regards,
Richard.

    • Ian Macdonald (Admin)
    • Team Lead - Product Management
    • Ian_Macdonald
    • 2 mths ago

    Hi Richard,

    You can do this already through the Python Source Block in Model.

    Generally, we would be looking for tables of data in the PDF to load into a DB for analysis. This can be done with a single line of Python code. This article gives a good explanation and guide on how to do this, as well as more complex examples where one needs to deal with multiple tables.
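
As a rough illustration of the table case: the linked article is not reproduced in this thread, so the library here is an assumption. This sketch uses tabula-py, a third-party package that needs a Java runtime, whose `read_pdf` call returns the tables it detects as pandas DataFrames. The function names `read_pdf_tables` and `select_tables` are hypothetical helpers, not part of Pyramid or tabula-py.

```python
def read_pdf_tables(pdf_path, pages="all"):
    """Return the tables tabula-py detects in the PDF as a list of DataFrames."""
    # Assumption: tabula-py is installed (pip install tabula-py) and Java is available.
    # Imported lazily so this module still loads where tabula-py is missing.
    import tabula

    # This is the "single line" that does the extraction.
    return tabula.read_pdf(pdf_path, pages=pages, multiple_tables=True)


def select_tables(tables, required_columns):
    """Keep only the tables whose header contains every required column name."""
    wanted = {name.strip().lower() for name in required_columns}
    return [
        table
        for table in tables
        if wanted <= {str(col).strip().lower() for col in table.columns}
    ]
```

With multiple tables per invoice, something like `select_tables(read_pdf_tables("invoice.pdf"), ["Item", "Qty"])` would keep only the line-item tables; the file name and column names are placeholders.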

    There may also be a need to retrieve individual elements of the document, such as the invoice number or the name and address that make up the "header" for the table(s), and/or values from specific PDF 'fields', for example those used in online PDF forms.

    You can also do this via Python, with a good explanation available on the same site as the previous one.
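
For header elements such as an invoice number, one possible approach is to match label names against the document text, much as the original post suggests. This is a minimal sketch using only the standard library. It assumes the text has already been extracted from the PDF by some other library, and the function name, label names and sample text are all made up for illustration.

```python
import re


def extract_labelled_fields(text, labels):
    """Return {label: value} for each label found at the start of a line."""
    fields = {}
    for label in labels:
        # Label at the start of a line, an optional ':' or '#',
        # then the rest of that line as the value.
        pattern = re.compile(
            r"^\s*" + re.escape(label) + r"\s*[:#]?\s*(.+?)\s*$",
            re.IGNORECASE | re.MULTILINE,
        )
        match = pattern.search(text)
        if match:
            fields[label] = match.group(1)
    return fields


# Hypothetical text as it might come out of an invoice PDF.
sample_text = """ACME Wholesale Ltd
Invoice Number: INV-1042
Invoice Date: 2023-03-14
Total Due: 1,250.00
"""

print(extract_labelled_fields(
    sample_text, ["Invoice Number", "Invoice Date", "Total Due", "PO Number"]
))
```

Labels that do not appear in the text, like "PO Number" here, are simply left out of the result, so the caller can decide how to handle missing fields.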

    In the meantime, we'll monitor the votes for this idea.

    Hope that helps.

    Ian

    • Ian Macdonald - thanks for the response on this Ian - I'll certainly give this a try, much appreciated.

      Best wishes,
      Rich
