New Member

Hot Folder Script – Remove Manifest Rows With Duplicate Values

Is there a way to write a script, run before a workflow, that will remove rows from a CSV manifest that share specific duplicate values? The manifest in question needs the duplicate rows for another workflow, but we’d like to use this same data for something else without having to manually edit the original manifest daily.

The original CSV data would look as follows:

 

"1","05","GM","XXX","0032609","","0620","10KN95","2"

"1","05","GM","XXX","0032609","","0620","50KN95","1"

"1","05","GM","XXX","0032609","","0620","DISPOFM","3"

"2","05","GM","XXX","0032433","","0620","DISPOFM ","1"

 

What I want to do is get rid of any rows that share the same order number (field 1). So rows 2 and 3 would be deleted, and the resulting CSV would end up looking like this:

 

"1","05","GM","XXX","0032609","","0620","10KN95","2"

"2","05","GM","XXX","0032433","","0620","DISPOFM ","1"

 

I’ll be the first to admit that I’m not really familiar with the scripting Core uses, so possibly this question is beyond the scope of what hot folder scripts can do?

Any input would be appreciated.

Thank you,

RM

2 Replies

Re: Hot Folder Script – Remove Manifest Rows With Duplicate Values

Hello,

Maybe you should read this old post first.

Then, as they wrote, as soon as you are able to develop a script (in whatever language you want) and manage to run it "manually" (outside FF Core), your FF Core hot folder will be able to automate it. The most important thing, as you will read, is that your script writes its resulting file into the specified output folder.
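To make the contract concrete: a hot folder script receives the incoming file as its first argument and the output folder as its second, and must leave its result in that output folder. A minimal pass-through sketch (file names and processing are illustrative):

```python
import os
import shutil
import sys

def process(in_path, out_folder):
    # Real processing (e.g. filtering manifest rows) would go here;
    # this sketch just copies the incoming file through unchanged.
    shutil.copy(in_path, os.path.join(out_folder, os.path.basename(in_path)))

if __name__ == '__main__' and len(sys.argv) == 3:
    process(sys.argv[1], sys.argv[2])  # %1 = incoming file, %2 = output folder
```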

west-digital.fr

 

FreeFlow Production Workflow Moderator

Re: Hot Folder Script – Remove Manifest Rows With Duplicate Values

Here is one way to do this. I used Python with pandas, which you will need to install.

First, I added a sample PDF path to your sample data like this:

"1","05","GM","XXX","0032609","","0620","10KN95","2","C:\tmp\test1.pdf"
"1","05","GM","XXX","0032609","","0620","50KN95","1","C:\tmp\test1.pdf"
"1","05","GM","XXX","0032609","","0620","DISPOFM","3","C:\tmp\test1.pdf"
"2","05","GM","XXX","0032433","","0620","DISPOFM ","1","C:\tmp\test2.pdf"

This Python script will add a header to the data and remove any duplicates in column A, keeping the first occurrence:

import pandas as pd
import sys
import ntpath
import shutil

dtype_dic = {'A': str, 'B': str, 'C': str, 'D': str, 'E': str, 'F': str, 'G': str, 'H': str, 'I': str, 'J': str}  # new header; read every column as a string so leading zeros survive
df = pd.read_csv(sys.argv[1], names=['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'], dtype=dtype_dic)  # read incoming csv ($FFin$)
new_df = df.drop_duplicates('A', keep='first')  # drop duplicates in column A, keeping the first row
new_df.to_csv('C:/tmp/' + ntpath.basename(sys.argv[1]), index=False)  # write new csv to C:\tmp\
shutil.move('C:/tmp/' + ntpath.basename(sys.argv[1]), sys.argv[2][0:-1])  # move temp csv to the script output folder ($FFout$, trailing backslash stripped)

The result is this:

A,B,C,D,E,F,G,H,I,J
1,05,GM,XXX,0032609,,0620,10KN95,2,C:\tmp\test1.pdf
2,05,GM,XXX,0032433,,0620,DISPOFM ,1,C:\tmp\test2.pdf

You can run the Python script using a batch file like this:

"C:\Program Files\Python38\python.exe" "C:\Xerox\FreeFlow\Core\00000000-0000-0000-0000-000000000000\Data\Scripts\removeduplicatelines.py" %1 %2

which you can configure as the script on the hot folder:

(screenshot: hot folder script configuration, Capture.PNG)

Both scripts should be placed in your FreeFlow Core scripts folder:

<drive letter>:\Xerox\FreeFlow\Core\00000000-0000-0000-0000-000000000000\Data\Scripts

This assumes there is also a C:\tmp\ folder.
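If you would rather not assume C:\tmp\ exists, the Python script could create its temp folder at startup; a small sketch (the path is whatever temp folder you choose):

```python
import os

def ensure_tmp(path=r'C:\tmp'):
    # Create the temp folder if it does not exist; harmless if it already does.
    os.makedirs(path, exist_ok=True)
    return path
```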

Scripts attached. If your scripts folder is not located on C:, you need to change the path to the Python script in the batch file to use the correct drive letter.
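As an aside, if installing pandas is not an option, the same de-duplication can be sketched with only the standard library, following the same two-argument contract (input CSV, then output folder; function name is illustrative):

```python
import csv
import os
import sys

def drop_duplicate_orders(in_path, out_folder):
    """Keep only the first row for each order number (field 1)."""
    seen = set()
    out_path = os.path.join(out_folder, os.path.basename(in_path))
    with open(in_path, newline='') as src, open(out_path, 'w', newline='') as dst:
        writer = csv.writer(dst, quoting=csv.QUOTE_ALL)  # preserve the quoted style
        for row in csv.reader(src):
            if row and row[0] not in seen:
                seen.add(row[0])        # remember this order number
                writer.writerow(row)    # keep first occurrence only

if __name__ == '__main__' and len(sys.argv) == 3:
    drop_duplicate_orders(sys.argv[1], sys.argv[2])
```

Unlike the pandas version, this writes no header row and keeps the original all-quoted formatting of the manifest.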

/Stefan
