python - Custom format ID mapping
I have two databases (txt files). The first is a two-column, tab-delimited file that holds names and IDs.
name1 \t id1
name1 \t id2
name2 \t id9
name2 \t id40
name3 \t id3
The other database uses the same IDs. Its first column holds a single ID, while the second column lists IDs of the same kind delimited by commas (these are the children of the one in the first column; the second database is hierarchical).
id1 \t id1,id2,id3
id2 \t id2, id9
What I'd like is a third database in the same format as the second, but in the second column I'd swap out the children IDs for the names from the first database. For example:
id1 \t name1,name2,name3
id2 \t name1,name2
Is there a way to do this? I'm quite a beginner. When I had to map IDs before I used web services, but this custom format is needed for further analysis and I'm not sure where to start.
thanks in advance!
    import csv

    # Reading the first db is simple since there's a fixed delimiter.
    # Use the csv module to split the lines and create a dictionary
    # that maps each id to its name.
    id_dictionary = {}
    with open('db_1.txt', 'r') as infile:
        reader = csv.reader(infile, delimiter='\t')
        for line in reader:
            id_dictionary[line[1]] = line[0]

    # We can again split on the tab. The second column comes back as a
    # single string such as 'id1,id2,id3', which we call split() on later.
    row_data = []
    with open('db_2.txt', 'r') as infile:
        reader = csv.reader(infile, delimiter='\t')
        for line in reader:
            # The id remains unchanged, so keep the first value
            row = [line[0]]
            # Split the string into individual elements in a list
            id_codes = line[1].split(',')
            # List comprehension to look up each id in the dictionary
            # and return the name stored against it
            translated = [id_dictionary.get(item) for item in id_codes]
            # Add the translated list to the list we are using to represent a row
            row.extend(translated)
            # Append the row to our collection of rows
            row_data.append(row)

    with open('db_3.txt', 'w') as outfile:
        for row in row_data:
            outfile.write(row[0])
            outfile.write('\t')
            outfile.write(','.join(map(str, row[1:])))  # join the values with a comma
            outfile.write('\n')
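One thing to watch for: the sample line `id2 \t id2, id9` has a space after the comma, so a plain split(',') yields ' id9', which is not a key in the dictionary and ends up written as None. Below is a minimal sketch of the same approach that strips whitespace around each id and keeps any unknown id as-is. The function names load_id_to_name and translate_rows are made up for illustration, keeping unknown ids unchanged is an assumption about what you'd want, and the sample data is held in memory with io.StringIO so the snippet runs without any files.

```python
import csv
import io


def load_id_to_name(lines):
    """Build an id -> name dict from tab-delimited 'name<TAB>id' rows."""
    mapping = {}
    for row in csv.reader(lines, delimiter='\t'):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        name, id_code = row[0].strip(), row[1].strip()
        mapping[id_code] = name
    return mapping


def translate_rows(lines, id_to_name):
    """Yield (parent_id, [names]) for tab-delimited 'id<TAB>id,id,...' rows."""
    for row in csv.reader(lines, delimiter='\t'):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        parent = row[0].strip()
        # Strip each child id so 'id2, id9' is handled like 'id2,id9'
        children = [item.strip() for item in row[1].split(',')]
        # Assumption: an id with no known name is kept as-is, not dropped
        names = [id_to_name.get(child, child) for child in children]
        yield parent, names


# Sample data from the question, held in memory instead of files
db_1 = io.StringIO("name1\tid1\nname1\tid2\nname2\tid9\nname2\tid40\nname3\tid3\n")
db_2 = io.StringIO("id1\tid1,id2,id3\nid2\tid2, id9\n")

id_to_name = load_id_to_name(db_1)
result = list(translate_rows(db_2, id_to_name))
for parent, names in result:
    print(parent + '\t' + ','.join(names))
```

To use it on real files, pass open file objects instead of the StringIO objects, e.g. `with open('db_1.txt', newline='') as f: id_to_name = load_id_to_name(f)`. Note that with the sample data the first row comes out as name1,name1,name3, because id1 and id2 both belong to name1 in the first database.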