Filter or clean CSV data while loading to PostgreSQL


I am loading CSV files into PostgreSQL tables using the bulk load method, the COPY command. Some fields have bad characters in them (like "|", """, ";", and so on). I keep getting different errors while loading. I have tried tab-delimited, comma-delimited, and other options too, but no luck.

Is there a way I can clean the CSV data before loading it into PostgreSQL with the COPY command, or is there COPY command syntax that can replace the bad characters by default?

These are some of the syntaxes I have tried:

copy tblsf from '/filelocation/test.csv' csv header delimiter ',' null '?';
copy tblsf from '/filelocation/test.csv' csv header delimiter '|' null '?';
copy tblsf from '/filelocation/test.csv' csv header delimiter e'\t' null '?';
copy tblsf from '/filelocation/test.csv' csv header delimiter '<>' null '?';

Thanks in advance.

Sometimes the file is not cleanly encoded as UTF-8. Try this, which drops any bytes that are not valid UTF-8:

iconv -f utf-8 -t utf-8 -c /filelocation/test.csv > /filelocation/test_clean.csv 

Then try the PostgreSQL COPY (the command below assumes the fields are separated by commas):

copy tblsf from '/filelocation/test_clean.csv' csv header delimiter ',';
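
If iconv is not available, roughly the same scrubbing can be sketched in Python. This is a minimal sketch, not an exact reimplementation of iconv -c: the function names drop_invalid_utf8 and clean_file are made up for illustration, and it reads the whole file into memory, which assumes the file is small enough for that.

```python
def drop_invalid_utf8(raw: bytes) -> bytes:
    """Return raw with every byte sequence that is not valid UTF-8 removed."""
    # errors="ignore" silently skips undecodable bytes instead of raising.
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


def clean_file(src_path: str, dst_path: str) -> None:
    """Write a scrubbed copy of src_path to dst_path (hypothetical helper)."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.write(drop_invalid_utf8(src.read()))
```

For the paths used above, the call would be clean_file('/filelocation/test.csv', '/filelocation/test_clean.csv').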

If you have a malformed file, for example:

company,owner
john's pizza, llc,john smith
burger co,jones, mike

You need to resave the data in a corrected format. For example:

"company","owner"
"john's pizza, llc","john smith"
"burger co","jones, mike"

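One way to find the offending lines before fixing them is to compare each row's field count with the header's. The sketch below uses Python's csv module; find_bad_rows and write_quoted are hypothetical helper names, and it assumes the first line is a header with the correct number of columns. It only reports the bad rows. Deciding where the stray commas really belong still has to be done by hand or with knowledge of the data.

```python
import csv
import io


def find_bad_rows(text, delimiter=","):
    """Return (line_number, fields) for rows whose field count differs from the header."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    header = next(reader, None)
    if header is None:  # empty input, nothing to check
        return []
    expected = len(header)
    return [(reader.line_num, row) for row in reader if len(row) != expected]


def write_quoted(rows):
    """Serialize rows with every field wrapped in double quotes."""
    out = io.StringIO(newline="")
    csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

Run on the malformed example above, find_bad_rows flags both data lines, because the unquoted commas split each of them into three fields instead of two. Once the rows are corrected, write_quoted produces the fully quoted form shown.
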
Once you have a clean file, you can edit it and resave it using a different delimiter (for example in Excel, or using the csv module in Python). Before saving with the new delimiter, you will want to scrub that delimiter out of the file, for example, in the case of pipes |:

sed -i 's/|//g' test_clean.csv 
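
The scrub and the resave can also be done in one pass with Python's csv module. This is a sketch under assumptions: redelimit is a made-up name, the input is already well-formed CSV, and removing the new delimiter from the field values is acceptable for your data.

```python
import csv
import io


def redelimit(text, old=",", new="|"):
    """Re-write CSV text with a new delimiter, stripping that delimiter from field values."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=old)
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=new, lineterminator="\n")
    for row in reader:
        # Remove any occurrence of the new delimiter inside a field first.
        writer.writerow([field.replace(new, "") for field in row])
    return out.getvalue()
```

Because the reader parses the quoting first, commas inside quoted fields survive, and only literal pipes inside the values are removed. The result could then be loaded with delimiter '|' in the COPY command.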
