machine learning - Vowpal Wabbit not predicting binary values, maybe overtraining?
I am trying to use Vowpal Wabbit for binary classification, i.e. given feature values, VW should classify the example as either 1 or 0. This is how I have the training data formatted:
1 'name | feature1:0 feature2:1 feature3:48 feature4:4881 ...
-1 'name2 | feature1:1 feature2:0 feature3:5 feature4:2565 ...
etc.
I have 30,000 data points labeled 1 and 3,000 data points labeled 0. I also have 100 examples of each label that I'm using to test on after I create the model. These test data points are given a default label of 1. Here is how I format the prediction set:
1 'name | feature1:0 feature2:1 feature3:48 feature4:4881 ...
From my understanding of the VW documentation, I need to use either the logistic or hinge loss_function for binary classification. Here is how I've been creating the model:
vw -d ../training_set.txt --loss_function logistic/hinge -f model
And here is how I try to make predictions:
vw -d ../test_set.txt --loss_function logistic/hinge -i model -t -p /dev/stdout
However, I'm running into problems. If I use the hinge loss function, all of the predictions are -1. When I use the logistic loss function, I get arbitrary values between 5 and 11. There is a general trend for data points that should be 0 to have lower values, 5-7, and for data points that should be 1 to fall in the 6-11 range. What am I doing wrong? I've looked around the documentation and checked a bunch of articles about VW to see if I can identify what my problem is, but I can't figure it out. Ideally I would get back a 0 or 1 value, or a value between 0 and 1 corresponding to how strongly VW believes the result is a 1. Any help would be appreciated!
Independently of the tool and/or the specific algorithm, you can use "learning curves" and a train/cross-validation/test split to diagnose the algorithm and determine what the problem is (a rough VW sketch follows the list below). After diagnosing the problem you can apply adjustments to the algorithm; for example, if you find you have over-fitting you can apply actions like:
- add regularization (a VW example is sketched at the end of this answer)
- get more training data
- reduce the complexity of the model
- eliminate redundant features.
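For the learning-curve diagnosis itself, one rough way to do it with VW is to hold out part of the training file, train on growing prefixes of the remainder, and compare the average loss VW reports on the training subset with the loss on the held-out set. This is only a sketch under assumptions: shuffled.txt, heldout.txt, train.txt, subset.txt and curve.model are hypothetical file names, and the ~10% holdout size is arbitrary:

# Sketch only: shuffle the data, hold out ~10% for validation,
# then train on growing prefixes and compare train vs. held-out loss.
shuf ../training_set.txt > shuffled.txt
head -n 3300 shuffled.txt > heldout.txt      # ~10% of the 33,000 examples
tail -n +3301 shuffled.txt > train.txt

for n in 1000 5000 10000 20000; do
  head -n "$n" train.txt > subset.txt
  vw -d subset.txt --loss_function logistic -f curve.model --quiet
  echo "n=$n, loss on training subset:"
  vw -d subset.txt  -i curve.model -t --loss_function logistic 2>&1 | grep 'average loss'
  echo "n=$n, loss on held-out set:"
  vw -d heldout.txt -i curve.model -t --loss_function logistic 2>&1 | grep 'average loss'
done

If the loss on the training subset stays much lower than the loss on the held-out set as n grows, that points to over-fitting; if both losses are high and close together, the model is under-fitting and more regularization will not help.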
You can refer to Andrew Ng's "Advice for Applying Machine Learning" videos on YouTube for more details on the subject.
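Coming back to the first item on the list above: if the diagnosis does point to over-fitting, VW has built-in L1/L2 regularization through its --l1 and --l2 options. A minimal sketch based on the training command from the question; the lambda values below are placeholders, not recommendations, and would need to be tuned against the held-out set:

# The question's training command with L1/L2 regularization added.
# The regularization strengths (1e-6) are placeholder values.
vw -d ../training_set.txt --loss_function logistic --l1 1e-6 --l2 1e-6 -f model

Larger values shrink the weights more aggressively; sweeping a few orders of magnitude and keeping the setting with the lowest held-out loss is the usual approach.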