Supervised Learning
Model: how to make the prediction $\hat{y}_i$ given the input $x_i$ (e.g., a linear model $\hat{y}_i=\sum_j{w_jx_{ij}}$)
Parameters: the things we need to learn from data (e.g., the weights $w_j$)
Objective Function
Objective Function: combines how well the model fits the data with how complex the model is
$$Obj(\Theta)=\underbrace{L(\Theta)}_{\text{Training Loss}}+\underbrace{\Omega(\Theta)}_{\text{Regularization}}$$
| | Training Loss | Regularization |
|---|---|---|
| measures | how well the model fits the training data | the complexity of the model |
| optimizing it encourages | predictive models[1] | simple models[2] |
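To make the decomposition concrete, here is a minimal Python/NumPy sketch that evaluates $Obj(\Theta)$ for a linear model with square loss and an $L_2$ penalty; the model choice, the `lam` weight, and the toy data are illustrative assumptions, not anything fixed by these notes:

```python
import numpy as np

def objective(w, X, y, lam=0.1):
    """Obj(w) = L(w) + Omega(w): square loss plus an L2 penalty.
    All concrete choices here (linear model, lam=0.1) are illustrative."""
    y_hat = X @ w                          # model: linear prediction
    train_loss = np.sum((y - y_hat) ** 2)  # L(w): sum of square losses
    reg = lam * np.sum(w ** 2)             # Omega(w): lambda * ||w||_2^2
    return train_loss + reg

# toy data: 3 samples, 2 features
X = np.array([[1.0, 2.0], [0.5, 1.0], [2.0, 0.0]])
y = np.array([3.0, 1.5, 2.0])
print(objective(np.array([1.0, 1.0]), X, y))  # fits exactly, so Obj = 0.2 (just the penalty)
```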
Training Loss
Training Loss: measures how well the model fits the training data
$$L(\Theta)=\sum_{i=1}^nl(y_i,\hat{y}_i)$$
- Square Loss
$$l(y_i,\hat{y}_i)=(y_i-\hat{y}_i)^2$$
- Log-Loss
$$l(y_i,\hat{y}_i)=y_i\ln{(1+e^{-\hat{y}_i})}+(1-y_i)\ln{(1+e^{\hat{y}_i})}$$
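Both losses are easy to check numerically. A minimal sketch, assuming NumPy; using `np.logaddexp(0, x)` to compute $\ln(1+e^{x})$ without overflow is my formulation, not one from the notes:

```python
import numpy as np

def square_loss(y, y_hat):
    """l(y, y_hat) = (y - y_hat)^2, for regression."""
    return (y - y_hat) ** 2

def log_loss(y, y_hat):
    """l(y, y_hat) = y*ln(1+e^{-y_hat}) + (1-y)*ln(1+e^{y_hat}),
    for binary labels y with raw scores (logits) y_hat."""
    return y * np.logaddexp(0.0, -y_hat) + (1 - y) * np.logaddexp(0.0, y_hat)

y = np.array([1.0, 0.0, 1.0])       # labels
y_hat = np.array([2.0, -1.0, 0.5])  # predictions / raw scores
print(square_loss(y, y_hat).sum())  # L(Theta): sum over the n examples
print(log_loss(y, y_hat).sum())
```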
Regularization
Regularization: measures the complexity of the model
$$\Omega(\Theta)$$
- $L_1$-norm
$$\Omega(w)=\lambda||w||_1$$
- $L_2$-norm
$$\Omega(w)=\lambda||w||_2^2$$
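A minimal sketch of the two penalties, assuming NumPy; the `lam` value is an arbitrary stand-in for the tuning weight $\lambda$:

```python
import numpy as np

def l1_penalty(w, lam=0.1):
    """Omega(w) = lambda * ||w||_1; tends to push weights to exactly zero."""
    return lam * np.sum(np.abs(w))

def l2_penalty(w, lam=0.1):
    """Omega(w) = lambda * ||w||_2^2; tends to shrink all weights smoothly."""
    return lam * np.sum(w ** 2)

w = np.array([0.5, -2.0, 0.0])
print(l1_penalty(w))  # 0.1 * (0.5 + 2.0 + 0.0) = 0.25
print(l2_penalty(w))  # 0.1 * (0.25 + 4.0 + 0.0) = 0.425
```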
References
1. Fitting well on the training data at least gets you close to the training data, which is hopefully close to the underlying distribution. (from Tianqi Chen, Introduction to Boosted Trees)
2. Simpler models tend to have smaller variance in future predictions, making predictions stable. (from Tianqi Chen, Introduction to Boosted Trees)