LinearSVC vs. SVC(kernel='linear'): Conflicting arguments?


From my research, I found three conflicting results:

  1. SVC(kernel="linear") is better
  2. LinearSVC is better
  3. It doesn't matter

Can someone explain when to use LinearSVC vs. SVC(kernel="linear")?

It seems that LinearSVC is marginally better than SVC, but more finicky. If scikit-learn decided to spend time implementing a specific case for linear classification, why wouldn't LinearSVC outperform SVC?

Mathematically, optimizing an SVM is a convex optimization problem with a unique minimizer. That means there is only one solution to this mathematical optimization problem.
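For concreteness, here is the problem being referred to, written as the usual soft-margin primal (standard textbook notation, not taken from the post):

```latex
\min_{w,\,b}\;\; \frac{1}{2}\,\lVert w \rVert^2 \;+\; C \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i\,(w^\top x_i + b)\bigr)
```

The hinge term is convex and the L2 penalty makes the objective strictly convex in w, which is where the "unique minimizer" claim comes from.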

The differences in results come from several aspects: SVC and LinearSVC are supposed to optimize the same problem, but in fact all liblinear estimators penalize the intercept, whereas libsvm ones don't (IIRC). This leads to a different mathematical optimization problem and thus different results. There may also be other subtle differences, such as scaling and the default loss function (edit: make sure you set loss='hinge' in LinearSVC). Next, in multiclass classification, liblinear does one-vs-rest by default, whereas libsvm does one-vs-one.
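A minimal sketch of what "matching the settings" looks like in practice; the dataset and the values of C, intercept_scaling and max_iter below are just illustrative choices, not something prescribed by the answer:

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

# Illustrative data; any binary classification problem will do
X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)

# libsvm solver: hinge loss, intercept is NOT penalized, one-vs-one for multiclass
svc = SVC(kernel="linear", C=1.0).fit(X, y)

# liblinear solver: default loss is squared hinge, so set loss="hinge" to match SVC;
# the intercept is still regularized (a larger intercept_scaling only mitigates this),
# and multiclass is handled one-vs-rest by default
linear_svc = LinearSVC(loss="hinge", C=1.0, intercept_scaling=10.0,
                       dual=True, max_iter=100_000).fit(X, y)

print(svc.coef_[0, :5])
print(linear_svc.coef_[0, :5])
```

Even with matched settings, the two coefficient vectors typically come out close but not identical, which is exactly the point above.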

SGDClassifier(loss='hinge') is different from the other two in the sense that it uses stochastic gradient descent rather than exact gradient descent, so it may not converge to the same solution. However, the solution it obtains may generalize better.
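For comparison, a quick sketch of that variant (again with made-up data; the alpha, max_iter and tol values are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)

# Same hinge loss as a linear SVM, but optimized with stochastic gradient descent,
# so the result is approximate and depends on the learning-rate schedule and seed
sgd = SGDClassifier(loss="hinge", alpha=1e-4, max_iter=1000, tol=1e-3,
                    random_state=0).fit(X, y)
print(sgd.coef_[0, :5])
```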

Between SVC and LinearSVC, one important decision criterion is that LinearSVC tends to converge faster the larger the number of samples is. This is due to the fact that the linear kernel is a special case, which is optimized for in liblinear but not in libsvm.
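If you want to see the speed difference for yourself, a rough timing loop along these lines (sample sizes picked arbitrarily, results will vary by machine) should show LinearSVC pulling ahead as n_samples grows:

```python
import time

from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

for n_samples in (1_000, 5_000, 10_000):
    X, y = make_classification(n_samples=n_samples, n_features=20, random_state=0)
    for name, clf in [("SVC(kernel='linear')", SVC(kernel="linear")),
                      # dual=False is recommended when n_samples > n_features
                      ("LinearSVC", LinearSVC(dual=False))]:
        start = time.perf_counter()
        clf.fit(X, y)
        print(f"{name:<22} n={n_samples:>6}: {time.perf_counter() - start:.2f} s")
```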
