0

How to import multiple Excel files in Data Modeler?

Is there a way to loop through multiple Excel spreadsheets with the same format and load the data into the in memory database?

1 reply

null
    • Jason_Picker
    • 3 yrs ago
    • Reported - view

    There is a Multi Files object in the list of Sources but unfortunately, it does not work with Excel files; only CSV, TXT, JSON and XML files.  However, it is easy to add your own Python script to do the same with Excel files.  Here is some simple code to import Excel files:

    import xlrd
    import glob2
    import numpy as np
    import pandas as pd
    
    all_files = glob2.glob('C:\Data\*.xls')
    
    df = pd.concat((pd.read_excel(f) for f in all_files))

    In the above code, the xlrd and glob2 libraries provide the magic for making this work.  The glob function reads the list of files from a specified directory into a list or array.  You can then loop through that array and import the data into a pandas data frame using the read_excel function.

    So now the question is, how do I use this code in Pyramid?  You can follow these steps to make it work:

    1. Add a Python node to your Data Flow by grabbing it from the bottom of the Sources list and dragging it onto the screen.

    2. Drop the code into the Python Script property.  *Please consult the documentation on how to setup Python environments for supporting different libraries and packages (Augmented Server Scripting Environments).

    3. Click on the Auto Detect button, enter the data frame name (df) and press the Auto Detect button.

      The auto detect will detect all the column from the files and add them to the structure but you can also manually add the columns yourself by clicking on the plus sign (+) button.

    4. Once you have the structure configured, you can click on the Add Select Nodes button to add the data to a Select node. 
    5. You can then add any additional features you want to include in your data flow and then set the target to In Memory or whichever target you want to use.  

Content aside

  • Status Answered
  • 3 yrs agoLast active
  • 1Replies
  • 40Views
  • 1 Following