
Trimming and Cleaning Whitespace with Python Node in Model

Hello,

 

I know how to trim and use regex to replace whitespace in Model, but I thought I'd challenge myself and get familiar with the script node.

 

I can run this block of code in my dev environment to get my desired results:

import pandas as pd
import re

df = pd.read_csv("List table - whitespacemiddleofwords-test.csv")

def whitespace_remover(dataframe):
    # iterate over the columns
    for i in dataframe.columns:
        # check the datatype of each column
        if dataframe[i].dtype == 'object':
            # collapse runs of whitespace to a single space, then trim both ends
            # (regex=True is explicit because pandas 2.0 changed the default to False)
            dataframe[i] = dataframe[i].str.replace(r'\s+', ' ', regex=True).str.strip()

whitespace_remover(df)
df.to_csv('no whitespace.csv', index=False)

 

When I try to adapt this for the Python node, this block gives me a NULL output:

import pandas as pd
import re

outputDF = pd.DataFrame()

def whitespace_remover(dataframe):
    for i in dataframe.columns:
        if dataframe[i].dtype == 'object':
            dataframe[i] = dataframe[i].str.replace(r'\s+', ' ').str.strip()
        else:
            pass

df = whitespace_remover(inputDF['Param1'].to_frame())

outputDF['Column1'] = df

I've tried other variations without the function, and the output is either the same or an exact copy of the original column, with no whitespace removed or trimmed.

 

I'm trying to create something like the below output:

 

Any help will be greatly appreciated!!
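
One likely cause of the NULL output: the node version of whitespace_remover modifies the frame it receives but has no return statement, so it returns None, and that None is what ends up in outputDF['Column1']. Below is a minimal sketch of a version that returns the cleaned column. It assumes the Python node supplies inputDF with a 'Param1' column, as in the post; the sample inputDF built here is only a hypothetical stand-in so the snippet runs on its own. Passing regex=True explicitly avoids depending on the pandas default, which changed to False in pandas 2.0.

```python
import pandas as pd

# Hypothetical stand-in for the frame the Python node provides.
# Inside the node, inputDF is assumed to exist already, so drop this line there.
inputDF = pd.DataFrame({"Param1": ["  hello   world ", "foo\t\tbar", None]})

def whitespace_remover(series):
    """Collapse runs of whitespace to one space and trim both ends."""
    # Return the cleaned Series so the caller has something to assign.
    return series.str.replace(r"\s+", " ", regex=True).str.strip()

outputDF = pd.DataFrame()
outputDF["Column1"] = whitespace_remover(inputDF["Param1"])
```

Working on the Series directly also removes the need for to_frame() and the dtype check, since only the one text column is passed in.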
