python - Collapsing list to unique IDs with a range of dates


I have a large list of IDs that repeat with different ranges of dates. I need to create a unique list of IDs, each with one range of dates that includes the earliest start date and the latest end date from the uncollapsed list.

This is an example of what I have:

    id  start_date  end_date
    1   9/25/2015   10/12/2015
    1   9/16/2015   11/1/2015
    1   8/25/2015   9/21/2015
    2   9/2/2015    10/29/2015
    3   9/18/2015   10/15/2015
    3   9/19/2015   9/30/2015
    4   8/27/2015   9/15/2015

And this is what I need:

    id  start_date  end_date
    1   8/25/2015   11/1/2015
    2   9/2/2015    10/29/2015
    3   9/18/2015   10/15/2015
    4   8/27/2015   9/15/2015

I'm trying to do this in Python, but I'm not having any luck. Thanks!

Use groupby/aggregate:

    In [12]: df.groupby('id').agg({'start_date':min, 'end_date':max})
    Out[12]:
       start_date   end_date
    id
    1  2015-08-25 2015-11-01
    2  2015-09-02 2015-10-29
    3  2015-09-18 2015-10-15
    4  2015-08-27 2015-09-15

Note that it is important that start_date and end_date be parsed as dates, so that min and max return the minimum and maximum dates for each id. If the values are merely string representations of dates, then min and max would give the string min or max, which depends on the strings' lexicographic order. If the date-strings were in zero-padded YYYY/MM/DD format, then lexicographic order would correspond to parsed-date order, but date-strings in MM/DD/YYYY format do not have this property.
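To make the lexicographic pitfall concrete, here is a small sketch using only the standard library. It takes the three end dates for id 1 from the question; the variable names are made up for illustration.

```python
from datetime import datetime

# End dates for id 1, as MM/DD/YYYY strings taken from the question's sample.
ends = ["10/12/2015", "11/1/2015", "9/21/2015"]

# Plain string comparison goes character by character, so "9..." beats "1...".
string_max = max(ends)

# Comparing the parsed dates gives the chronologically latest value.
parsed_max = max(ends, key=lambda s: datetime.strptime(s, "%m/%d/%Y"))

print(string_max)  # 9/21/2015  (wrong as a date)
print(parsed_max)  # 11/1/2015  (the actual latest end date)
```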

If start_date and end_date have string values, then

    for col in ['start_date', 'end_date']:
        df[col] = pd.to_datetime(df[col])

would convert the strings to dates.
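Putting the conversion and the groupby together, a self-contained sketch might look like the following. It assumes the data is built by hand from the question's sample rather than loaded from a file, passes an explicit format to pd.to_datetime, and uses the string names "min" and "max" in place of the builtins (a stylistic choice, not part of the original answer). The name collapsed is hypothetical.

```python
import pandas as pd

# Sample rows from the question, with dates still stored as strings.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 3, 3, 4],
    "start_date": ["9/25/2015", "9/16/2015", "8/25/2015", "9/2/2015",
                   "9/18/2015", "9/19/2015", "8/27/2015"],
    "end_date": ["10/12/2015", "11/1/2015", "9/21/2015", "10/29/2015",
                 "10/15/2015", "9/30/2015", "9/15/2015"],
})

# Parse the MM/DD/YYYY strings so min/max compare dates, not text.
for col in ["start_date", "end_date"]:
    df[col] = pd.to_datetime(df[col], format="%m/%d/%Y")

# One row per id: earliest start date and latest end date.
collapsed = df.groupby("id").agg({"start_date": "min", "end_date": "max"})
print(collapsed)
```

If id should come back as an ordinary column instead of the index, calling collapsed.reset_index() afterwards is one option.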

If you are loading the DataFrame from a file using pd.read_table (or pd.read_csv), then

    df = pd.read_table(filename, ..., parse_dates=[1, 2])

would parse the strings in the second and third columns of the file as dates. [1, 2] corresponds to the second and third columns since Python uses 0-based indexing.
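As a rough illustration of parse_dates, the sketch below reads a few of the question's rows from an in-memory buffer instead of a real file. The comma-separated layout and the name raw are assumptions made for the example; the actual file may use a different delimiter.

```python
import io

import pandas as pd

# Stand-in for a file on disk; the comma-separated layout is assumed.
raw = io.StringIO(
    "id,start_date,end_date\n"
    "1,9/25/2015,10/12/2015\n"
    "1,9/16/2015,11/1/2015\n"
    "1,8/25/2015,9/21/2015\n"
    "2,9/2/2015,10/29/2015\n"
)

# Columns 1 and 2 (0-based) are start_date and end_date.
df = pd.read_csv(raw, parse_dates=[1, 2])

print(df.dtypes)
```

Checking df.dtypes after loading is a cheap way to confirm that both columns really were parsed as dates before aggregating.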

