pandas - copy values from one dataframe to another dataframe(different length) by comparing row values in python -


i new python , working dataframes. have 2 dataframes, 1 data months , data days in months. want data monthly dataframe column in daily dataframe, repeated number of days in month. thanks, have tried provide illustration below.

monthly df     val  date        year   month 0   0.00 2016-01-31  2016   1 1   0.10 2016-02-29  2016   2 2   0.07 2016-03-31  2016   3 3   0.01 2016-04-30  2016   4 4   0.28 2016-05-31  2016   5  dailydf          date  year  month  val 0  2016-01-01  2016      1    0 1  2016-01-02  2016      1    0 2  2016-01-03  2016      1    0 3  2016-01-04  2016      1    0 4  2016-01-05  2016      1    0 5  2016-01-06  2016      1    0 6  2016-01-07  2016      1    0 7  2016-01-08  2016      1    0 8  2016-01-09  2016      1    0 9  2016-01-10  2016      1    0 10 2016-01-11  2016      1    0 11 2016-01-12  2016      1    0 12 2016-01-13  2016      1    0 13 2016-01-14  2016      1    0 14 2016-01-15  2016      1    0 15 2016-01-16  2016      1    0 16 2016-01-17  2016      1    0 17 2016-01-18  2016      1    0 18 2016-01-19  2016      1    0 19 2016-01-20  2016      1    0 20 2016-01-21  2016      1    0 21 2016-01-22  2016      1    0 22 2016-01-23  2016      1    0 23 2016-01-24  2016      1    0 24 2016-01-25  2016      1    0 25 2016-01-26  2016      1    0 26 2016-01-27  2016      1    0 27 2016-01-28  2016      1    0 28 2016-01-29  2016      1    0 29 2016-01-30  2016      1    0 ..        ...   ...    ...  ... 31 2016-02-01  2016      2    0 32 2016-02-02  2016      2    0 33 2016-02-03  2016      2    0 34 2016-02-04  2016      2    0 35 2016-02-05  2016      2    0 36 2016-02-06  2016      2    0 37 2016-02-07  2016      2    0 38 2016-02-08  2016      2    0 39 2016-02-09  2016      2    0 40 2016-02-10  2016      2    0 41 2016-02-11  2016      2    0 42 2016-02-12  2016      2    0 43 2016-02-13  2016      2    0 44 2016-02-14  2016      2    0 45 2016-02-15  2016      2    0 46 2016-02-16  2016      2    0 47 2016-02-17  2016      2    0 48 2016-02-18  2016      2    0 49 2016-02-19  2016      2    0 50 2016-02-20  2016      2    0 51 2016-02-21  2016      2    0 52 2016-02-22  2016      2    0 53 2016-02-23  2016      2    0 54 2016-02-24  2016      2    0 55 2016-02-25  2016      2    0 56 2016-02-26  2016      2    0 57 2016-02-27  2016      2    0 58 2016-02-28  2016      2    0 59 2016-02-29  2016      2    0 60 2016-03-01  2016      3    0 

so in 'val' column of daily dataframe want "val" monthly dataframe repeated number of days in month.

expected output          date  year  month  val 0  2016-01-01  2016      1    0 1  2016-01-02  2016      1    0 2  2016-01-03  2016      1    0 3  2016-01-04  2016      1    0 4  2016-01-05  2016      1    0 5  2016-01-06  2016      1    0 6  2016-01-07  2016      1    0 7  2016-01-08  2016      1    0 8  2016-01-09  2016      1    0 ..        ...   ...    ...  ...  10 2016-01-11  2016      1    0 11 2016-01-12  2016      1    0 12 2016-01-13  2016      1    0 13 2016-01-14  2016      1    0 14 2016-01-15  2016      1    0 15 2016-01-16  2016      1    0 16 2016-01-17  2016      1    0 17 2016-01-18  2016      1    0 18 2016-01-19  2016      1    0 19 2016-01-20  2016      1    0 20 2016-01-21  2016      1    0 21 2016-01-22  2016      1    0 22 2016-01-23  2016      1    0 23 2016-01-24  2016      1    0 24 2016-01-25  2016      1    0 25 2016-01-26  2016      1    0 26 2016-01-27  2016      1    0 27 2016-01-28  2016      1    0 28 2016-01-29  2016      1    0 29 2016-01-30  2016      1    0 ..        ...   ...    ...  ...  41 2016-02-11  2016      2    0.10 42 2016-02-12  2016      2    0.10 43 2016-02-13  2016      2    0.10 44 2016-02-14  2016      2    0.10 45 2016-02-15  2016      2    0.10 46 2016-02-16  2016      2    0.10 47 2016-02-17  2016      2    0.10 ..        ...   ...    ...  ... 49 2016-03-19  2016      3    0.07 50 2016-03-20  2016      3    0.07 51 2016-03-21  2016      3    0.07 52 2016-03-22  2016      3    0.07 53 2016-03-23  2016      3    0.07 54 2016-03-24  2016      3    0.07 

as @merlin has mentioned joining (using pd.merge() method) should pretty straightforward:

in [126]: pd.merge(daily.drop('val', 1), monthly.drop('date', 1), on=['year','month']) out[126]:           date  year  month  val   val 0   2016-01-01  2016      1    0  0.00 1   2016-01-02  2016      1    0  0.00 2   2016-01-03  2016      1    0  0.00 3   2016-01-04  2016      1    0  0.00 4   2016-01-05  2016      1    0  0.00 5   2016-01-06  2016      1    0  0.00 6   2016-01-07  2016      1    0  0.00 7   2016-01-08  2016      1    0  0.00 8   2016-01-09  2016      1    0  0.00 9   2016-01-10  2016      1    0  0.00 10  2016-01-11  2016      1    0  0.00 11  2016-01-12  2016      1    0  0.00 12  2016-01-13  2016      1    0  0.00 13  2016-01-14  2016      1    0  0.00 14  2016-01-15  2016      1    0  0.00 ..         ...   ...    ...  ...   ... 137 2016-05-17  2016      5    0  0.28 138 2016-05-18  2016      5    0  0.28 139 2016-05-19  2016      5    0  0.28 140 2016-05-20  2016      5    0  0.28 141 2016-05-21  2016      5    0  0.28 142 2016-05-22  2016      5    0  0.28 143 2016-05-23  2016      5    0  0.28 144 2016-05-24  2016      5    0  0.28 145 2016-05-25  2016      5    0  0.28 146 2016-05-26  2016      5    0  0.28 147 2016-05-27  2016      5    0  0.28 148 2016-05-28  2016      5    0  0.28 149 2016-05-29  2016      5    0  0.28 150 2016-05-30  2016      5    0  0.28 151 2016-05-31  2016      5    0  0.28  [152 rows x 5 columns] 

i want offer bit more challenging task - generate desired df monthlydf:

in [108]: df out[108]:     val       date  year  month 0  0.00 2016-01-31  2016      1 1  0.10 2016-02-28  2016      2 2  0.07 2016-03-31  2016      3 3  0.01 2016-04-30  2016      4 4  0.28 2016-05-31  2016      5  in [117]: df.set_index('date').resample('ms').mean().append(x.iloc[[-1]]).resample('d').pad().reset_index() out[117]:           date   val  year  month 0   2016-01-01  0.00  2016      1 1   2016-01-02  0.00  2016      1 2   2016-01-03  0.00  2016      1 3   2016-01-04  0.00  2016      1 4   2016-01-05  0.00  2016      1 5   2016-01-06  0.00  2016      1 6   2016-01-07  0.00  2016      1 7   2016-01-08  0.00  2016      1 8   2016-01-09  0.00  2016      1 9   2016-01-10  0.00  2016      1 10  2016-01-11  0.00  2016      1 11  2016-01-12  0.00  2016      1 12  2016-01-13  0.00  2016      1 13  2016-01-14  0.00  2016      1 14  2016-01-15  0.00  2016      1 ..         ...   ...   ...    ... 137 2016-05-17  0.28  2016      5 138 2016-05-18  0.28  2016      5 139 2016-05-19  0.28  2016      5 140 2016-05-20  0.28  2016      5 141 2016-05-21  0.28  2016      5 142 2016-05-22  0.28  2016      5 143 2016-05-23  0.28  2016      5 144 2016-05-24  0.28  2016      5 145 2016-05-25  0.28  2016      5 146 2016-05-26  0.28  2016      5 147 2016-05-27  0.28  2016      5 148 2016-05-28  0.28  2016      5 149 2016-05-29  0.28  2016      5 150 2016-05-30  0.28  2016      5 151 2016-05-31  0.28  2016      5  [152 rows x 4 columns] 

explanation:

resample monthlydf begin-of-month

in [112]: df.set_index('date').resample('ms').mean() out[112]:              val  year  month date 2016-01-01  0.00  2016      1 2016-02-01  0.10  2016      2 2016-03-01  0.07  2016      3 2016-04-01  0.01  2016      4 2016-05-01  0.28  2016      5 

add last row original monthlydf:

in [113]: df.set_index('date').resample('ms').mean().append(x.iloc[[-1]]) out[113]:              val  year  month date 2016-01-01  0.00  2016      1 2016-02-01  0.10  2016      2 2016-03-01  0.07  2016      3 2016-04-01  0.01  2016      4 2016-05-01  0.28  2016      5 2016-05-31  0.28  2016      5 

after can resample using daily rule: d


Comments

Popular posts from this blog

jOOQ update returning clause with Oracle -

java - Warning equals/hashCode on @Data annotation lombok with inheritance -

java - BasicPathUsageException: Cannot join to attribute of basic type -