python - Repeating strings in pandas DF -- want to return list of unique strings

June 15, 2014

I have a bunch of rows of data in a pandas DataFrame that contain inconsistently alternating strings. For each game ID (another column), two strings are unique to that game ID, but they do not switch off in a predictable pattern. Regardless, I'm trying to write a helper function that takes each unique game ID and gets the two team names associated with it.

For example:

    index    game_id    team
    0        400827888  atl
    1        400827888  det
    2        400827888  atl
    3        400827888  det
    4        400827888  atl
    ...
    555622   400829117  por
    555623   400829117  den
    555624   400829117  por
    555625   400829117  por

Here is my woeful attempt, which is not working.

    def get_teams(df):
        for i in df['gameid']:
            both_teams = [df['team'].astype(str)]
            return(both_teams)

I'd like it to return ['atl', 'det'] for game ID 400827888 and ['por', 'den'] for game ID 400829117. Instead, it is returning the team name associated with each index.

You can use SeriesGroupBy.unique:

    print (df.groupby('game_id')['team'].unique())
    game_id
    400827888    [atl, det]
    400829117    [por, den]
    Name: team, dtype: object
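As a minimal sketch of the helper function the question asks for, the same groupby can be wrapped so that it returns plain Python lists keyed by game ID. The small frame below is only a stand-in built from the rows shown in the question, the name get_teams is reused from the question's attempt, and the column names game_id and team are assumed to match the real data.

```python
import pandas as pd

# Stand-in for the real data, using rows shown in the question.
df = pd.DataFrame({
    'game_id': [400827888] * 5 + [400829117] * 4,
    'team': ['atl', 'det', 'atl', 'det', 'atl', 'por', 'den', 'por', 'por'],
})


def get_teams(df):
    """Return a dict mapping each game_id to a list of its unique team names."""
    unique_teams = df.groupby('game_id')['team'].unique()
    # unique() keeps order of first appearance; convert each array to a list.
    return {game_id: teams.tolist() for game_id, teams in unique_teams.items()}


teams_by_game = get_teams(df)
print(teams_by_game[400827888])
print(teams_by_game[400829117])
```

A dict is used here so that a single game can be looked up by its ID; if only the printed table is needed, the one-line groupby above is enough.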

For looping, use iterrows:

    for i, g in df.groupby('game_id')['team'].unique().reset_index().iterrows():
        print (g.game_id)
        print (g.team)
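Since the grouped result is already a Series indexed by game_id, one possible alternative is to loop over it directly with Series.items(), which skips the reset_index() and iterrows() steps. This is only a sketch on a small stand-in frame, and the pairs list is a hypothetical name used to collect the results.

```python
import pandas as pd

# Stand-in for the real data, using a few rows from the question.
df = pd.DataFrame({
    'game_id': [400827888, 400827888, 400827888, 400829117, 400829117],
    'team': ['atl', 'det', 'atl', 'por', 'den'],
})

pairs = []
# items() yields (index label, value) pairs: here (game_id, array of teams).
for game_id, teams in df.groupby('game_id')['team'].unique().items():
    pairs.append((game_id, list(teams)))
    print(game_id, list(teams))
```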

Edit:

If you need to find the game_id for a given string (e.g. det), use boolean indexing:

    s = df.groupby('game_id')['team'].unique()

    print (s[s.apply(lambda x: 'det' in x)].index.tolist())
    [400827888]
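If the grouped Series is not needed for anything else, the same lookup might also be done by filtering the original rows first and then taking the unique game IDs. The sketch below assumes the same game_id and team columns; games_with_team is a hypothetical helper name, not something from the original answer.

```python
import pandas as pd

# Stand-in for the real data, using rows shown in the question.
df = pd.DataFrame({
    'game_id': [400827888] * 5 + [400829117] * 4,
    'team': ['atl', 'det', 'atl', 'det', 'atl', 'por', 'den', 'por', 'por'],
})


def games_with_team(df, team):
    """Return the game_ids in which `team` appears at least once."""
    mask = df['team'] == team  # boolean mask over the original rows
    return df.loc[mask, 'game_id'].unique().tolist()


print(games_with_team(df, 'det'))
```

This compares strings row by row instead of calling a Python lambda per group, which may matter on a frame with hundreds of thousands of rows.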
