pandas - copy values from one dataframe to another dataframe(different length) by comparing row values in python -
i new python , working dataframes. have 2 dataframes, 1 data months , data days in months. want data monthly dataframe column in daily dataframe, repeated number of days in month. thanks, have tried provide illustration below.
monthly df val date year month 0 0.00 2016-01-31 2016 1 1 0.10 2016-02-29 2016 2 2 0.07 2016-03-31 2016 3 3 0.01 2016-04-30 2016 4 4 0.28 2016-05-31 2016 5 dailydf date year month val 0 2016-01-01 2016 1 0 1 2016-01-02 2016 1 0 2 2016-01-03 2016 1 0 3 2016-01-04 2016 1 0 4 2016-01-05 2016 1 0 5 2016-01-06 2016 1 0 6 2016-01-07 2016 1 0 7 2016-01-08 2016 1 0 8 2016-01-09 2016 1 0 9 2016-01-10 2016 1 0 10 2016-01-11 2016 1 0 11 2016-01-12 2016 1 0 12 2016-01-13 2016 1 0 13 2016-01-14 2016 1 0 14 2016-01-15 2016 1 0 15 2016-01-16 2016 1 0 16 2016-01-17 2016 1 0 17 2016-01-18 2016 1 0 18 2016-01-19 2016 1 0 19 2016-01-20 2016 1 0 20 2016-01-21 2016 1 0 21 2016-01-22 2016 1 0 22 2016-01-23 2016 1 0 23 2016-01-24 2016 1 0 24 2016-01-25 2016 1 0 25 2016-01-26 2016 1 0 26 2016-01-27 2016 1 0 27 2016-01-28 2016 1 0 28 2016-01-29 2016 1 0 29 2016-01-30 2016 1 0 .. ... ... ... ... 31 2016-02-01 2016 2 0 32 2016-02-02 2016 2 0 33 2016-02-03 2016 2 0 34 2016-02-04 2016 2 0 35 2016-02-05 2016 2 0 36 2016-02-06 2016 2 0 37 2016-02-07 2016 2 0 38 2016-02-08 2016 2 0 39 2016-02-09 2016 2 0 40 2016-02-10 2016 2 0 41 2016-02-11 2016 2 0 42 2016-02-12 2016 2 0 43 2016-02-13 2016 2 0 44 2016-02-14 2016 2 0 45 2016-02-15 2016 2 0 46 2016-02-16 2016 2 0 47 2016-02-17 2016 2 0 48 2016-02-18 2016 2 0 49 2016-02-19 2016 2 0 50 2016-02-20 2016 2 0 51 2016-02-21 2016 2 0 52 2016-02-22 2016 2 0 53 2016-02-23 2016 2 0 54 2016-02-24 2016 2 0 55 2016-02-25 2016 2 0 56 2016-02-26 2016 2 0 57 2016-02-27 2016 2 0 58 2016-02-28 2016 2 0 59 2016-02-29 2016 2 0 60 2016-03-01 2016 3 0
so in 'val' column of daily dataframe want "val" monthly dataframe repeated number of days in month.
expected output date year month val 0 2016-01-01 2016 1 0 1 2016-01-02 2016 1 0 2 2016-01-03 2016 1 0 3 2016-01-04 2016 1 0 4 2016-01-05 2016 1 0 5 2016-01-06 2016 1 0 6 2016-01-07 2016 1 0 7 2016-01-08 2016 1 0 8 2016-01-09 2016 1 0 .. ... ... ... ... 10 2016-01-11 2016 1 0 11 2016-01-12 2016 1 0 12 2016-01-13 2016 1 0 13 2016-01-14 2016 1 0 14 2016-01-15 2016 1 0 15 2016-01-16 2016 1 0 16 2016-01-17 2016 1 0 17 2016-01-18 2016 1 0 18 2016-01-19 2016 1 0 19 2016-01-20 2016 1 0 20 2016-01-21 2016 1 0 21 2016-01-22 2016 1 0 22 2016-01-23 2016 1 0 23 2016-01-24 2016 1 0 24 2016-01-25 2016 1 0 25 2016-01-26 2016 1 0 26 2016-01-27 2016 1 0 27 2016-01-28 2016 1 0 28 2016-01-29 2016 1 0 29 2016-01-30 2016 1 0 .. ... ... ... ... 41 2016-02-11 2016 2 0.10 42 2016-02-12 2016 2 0.10 43 2016-02-13 2016 2 0.10 44 2016-02-14 2016 2 0.10 45 2016-02-15 2016 2 0.10 46 2016-02-16 2016 2 0.10 47 2016-02-17 2016 2 0.10 .. ... ... ... ... 49 2016-03-19 2016 3 0.07 50 2016-03-20 2016 3 0.07 51 2016-03-21 2016 3 0.07 52 2016-03-22 2016 3 0.07 53 2016-03-23 2016 3 0.07 54 2016-03-24 2016 3 0.07
as @merlin has mentioned joining (using pd.merge() method) should pretty straightforward:
in [126]: pd.merge(daily.drop('val', 1), monthly.drop('date', 1), on=['year','month']) out[126]: date year month val val 0 2016-01-01 2016 1 0 0.00 1 2016-01-02 2016 1 0 0.00 2 2016-01-03 2016 1 0 0.00 3 2016-01-04 2016 1 0 0.00 4 2016-01-05 2016 1 0 0.00 5 2016-01-06 2016 1 0 0.00 6 2016-01-07 2016 1 0 0.00 7 2016-01-08 2016 1 0 0.00 8 2016-01-09 2016 1 0 0.00 9 2016-01-10 2016 1 0 0.00 10 2016-01-11 2016 1 0 0.00 11 2016-01-12 2016 1 0 0.00 12 2016-01-13 2016 1 0 0.00 13 2016-01-14 2016 1 0 0.00 14 2016-01-15 2016 1 0 0.00 .. ... ... ... ... ... 137 2016-05-17 2016 5 0 0.28 138 2016-05-18 2016 5 0 0.28 139 2016-05-19 2016 5 0 0.28 140 2016-05-20 2016 5 0 0.28 141 2016-05-21 2016 5 0 0.28 142 2016-05-22 2016 5 0 0.28 143 2016-05-23 2016 5 0 0.28 144 2016-05-24 2016 5 0 0.28 145 2016-05-25 2016 5 0 0.28 146 2016-05-26 2016 5 0 0.28 147 2016-05-27 2016 5 0 0.28 148 2016-05-28 2016 5 0 0.28 149 2016-05-29 2016 5 0 0.28 150 2016-05-30 2016 5 0 0.28 151 2016-05-31 2016 5 0 0.28 [152 rows x 5 columns]
i want offer bit more challenging task - generate desired df monthlydf:
in [108]: df out[108]: val date year month 0 0.00 2016-01-31 2016 1 1 0.10 2016-02-28 2016 2 2 0.07 2016-03-31 2016 3 3 0.01 2016-04-30 2016 4 4 0.28 2016-05-31 2016 5 in [117]: df.set_index('date').resample('ms').mean().append(x.iloc[[-1]]).resample('d').pad().reset_index() out[117]: date val year month 0 2016-01-01 0.00 2016 1 1 2016-01-02 0.00 2016 1 2 2016-01-03 0.00 2016 1 3 2016-01-04 0.00 2016 1 4 2016-01-05 0.00 2016 1 5 2016-01-06 0.00 2016 1 6 2016-01-07 0.00 2016 1 7 2016-01-08 0.00 2016 1 8 2016-01-09 0.00 2016 1 9 2016-01-10 0.00 2016 1 10 2016-01-11 0.00 2016 1 11 2016-01-12 0.00 2016 1 12 2016-01-13 0.00 2016 1 13 2016-01-14 0.00 2016 1 14 2016-01-15 0.00 2016 1 .. ... ... ... ... 137 2016-05-17 0.28 2016 5 138 2016-05-18 0.28 2016 5 139 2016-05-19 0.28 2016 5 140 2016-05-20 0.28 2016 5 141 2016-05-21 0.28 2016 5 142 2016-05-22 0.28 2016 5 143 2016-05-23 0.28 2016 5 144 2016-05-24 0.28 2016 5 145 2016-05-25 0.28 2016 5 146 2016-05-26 0.28 2016 5 147 2016-05-27 0.28 2016 5 148 2016-05-28 0.28 2016 5 149 2016-05-29 0.28 2016 5 150 2016-05-30 0.28 2016 5 151 2016-05-31 0.28 2016 5 [152 rows x 4 columns]
explanation:
resample monthlydf
begin-of-month
in [112]: df.set_index('date').resample('ms').mean() out[112]: val year month date 2016-01-01 0.00 2016 1 2016-02-01 0.10 2016 2 2016-03-01 0.07 2016 3 2016-04-01 0.01 2016 4 2016-05-01 0.28 2016 5
add last row original monthlydf
:
in [113]: df.set_index('date').resample('ms').mean().append(x.iloc[[-1]]) out[113]: val year month date 2016-01-01 0.00 2016 1 2016-02-01 0.10 2016 2 2016-03-01 0.07 2016 3 2016-04-01 0.01 2016 4 2016-05-01 0.28 2016 5 2016-05-31 0.28 2016 5
after can resample using daily
rule: d
Comments
Post a Comment