pandas - Multiple Linear Regression in Python (PatsyError: model is missing required outcome variables)
I am running the following code for a regression in Python and get this error: PatsyError: model is missing required outcome variables. How do I fix it? Thanks.
y = spikers['grade']
x = spikers[['num_pageview', 'num_video_play_resume', 'eng_proficiency', 'english']]
model = smf.ols(y, x).fit()
model.summary()
I had a similar problem trying to run sm.logit on an outcome variable 'y' that is binary (0s or 1s). Let the data be in a pandas data frame called 'data':
import statsmodels.formula.api as sm

x = ['age', 'sex', 'x1', 'x2', 'x3', 'x4']
logit = sm.logit(data['y'], data[x])
result = logit.fit()
print(result.summary())

Traceback (most recent call last):
  File "<ipython-input-xxx>", line 1, in <module>
    logit = sm.logit(data['y'],data[x])
  File "c:\...\statsmodels\base\model.py", line 147, in from_formula
    missing=missing)
  File "c:\...\statsmodels\formula\formulatools.py", line 68, in handle_formula_data
    na_action=na_action)
  File "c:\...\patsy\highlevel.py", line 312, in dmatrices
    raise PatsyError("model is missing required outcome variables")
PatsyError: model is missing required outcome variables
I was getting the error message displayed above. I managed to fix it and pull out sensible results by using this notation instead:
f1 = 'y ~ age + sex + x1 + x2 + x3 + x4'
logit = sm.logit(formula=f1, data=data)
result = logit.fit()
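As far as I can tell, this formula notation is the preferred way to use statsmodels.formula.api: its functions take a formula string plus a data frame, not separate y and x objects.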