Word Frequency from a CSV Column in Python


I have a .csv file with a column of messages I have collected, and I wish to get a word frequency list of every word in that column. Here is what I have so far. I am not sure where I have made a mistake, so any help would be appreciated. Edit: the expected output is to write the entire list of words and their counts (without duplicates) out to another .csv file.

import csv
from collections import Counter
from collections import defaultdict

output_file = 'comments_word_freqency.csv'
input_stream = open('comments.csv')
reader = csv.reader(input_stream, delimiter=',')
reader.next()  # skip header
csvrow = [row[3] for row in reader]  # get the fourth column

with open(output_file, 'rb') as csvfile:
    for row in reader:
        freq_dict = defaultdict(int)  # the "int" part means the values of the dictionary are integers
        for line in csvrow:
            words = line.split(" ")
            for word in words:
                word = word.lower()  # ignores case type
                freq_dict[word] += 1

        writer = csv.writer(open(output_file, "wb+"))  # this lets us write the csv file
        for key, value in freq_dict.items():
            # this iterates through the dictionary and writes each pair on its own line
            writer.writerow([key, value])

The code you uploaded is all over the place, but I think this is what you're getting at. It returns a list of each word and the number of times it appeared in the original file.

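# Python 2, like the question (reader.next() and binary file modes)
import csv

words = []
with open('comments.csv', 'rb') as csvfile:
    reader = csv.reader(csvfile)
    reader.next()  # skip the header row
    for row in reader:
        csv_words = row[3].split(" ")  # fourth column
        for i in csv_words:
            words.append(i)

words_counted = []
for i in words:
    x = words.count(i)
    words_counted.append((i, x))

# write this to a csv file; set() drops the repeated (word, count) pairs
with open('output.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerows(set(words_counted))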

Then, to get rid of the duplicates in the list, call set() on it:

set(words_counted) 

Your output will look like this:

'this', 2
'is', 1
'your', 3
'output', 5
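The question already imports Counter from collections, so the same result can probably be reached more directly with it. The following is only a sketch under a few assumptions. It targets Python 3, so it uses next(reader) and text-mode files with newline='' instead of reader.next() and 'rb'/'wb'. The function names count_words and write_word_frequencies are made up for illustration. It lower-cases words as the question does, and it splits on any whitespace with split() rather than split(" "), which avoids counting empty strings when a message contains double spaces.

```python
import csv
from collections import Counter


def count_words(rows, column=3):
    """Count lower-cased, whitespace-separated words in one column of each row."""
    counts = Counter()
    for row in rows:
        if len(row) > column:  # skip blank or short rows
            counts.update(row[column].lower().split())
    return counts


def write_word_frequencies(input_path, output_path, column=3):
    """Read one CSV column, count its words, and write word,count rows."""
    with open(input_path, newline='') as src:
        reader = csv.reader(src)
        next(reader, None)  # skip the header row, if there is one
        counts = count_words(reader, column)

    with open(output_path, 'w', newline='') as dst:
        writer = csv.writer(dst)
        # most_common() yields each word exactly once, most frequent first,
        # so no separate de-duplication step is needed
        writer.writerows(counts.most_common())
    return counts
```

With the file names from the question, the call would be write_word_frequencies('comments.csv', 'comments_word_freqency.csv'). Because a Counter keeps one entry per word, this also avoids calling words.count(i) once for every word in the list, which gets slow on a large column.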
