Supervised Learning
Model: how to make the prediction $\hat{y}_i$ given the input $x_i$ (e.g., a linear model $\hat{y}_i=\sum_j{w_jx_{ij}}$)
Parameters: the things we need to learn from data (e.g., the weights $w_j$)
Objective Function
Objective Function: combines how well the model fits the data with how complex the model is
$$Obj(\Theta)=\underbrace{L(\Theta)}_{\text{Training Loss}}+\underbrace{\Omega(\Theta)}_{\text{Regularization}}$$
| | Training Loss | Regularization |
|---|---|---|
| measures | how well the model fits the training data | the complexity of the model |
| optimizing it encourages | predictive models[1] | simple models[2] |
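To make the decomposition concrete, here is a minimal Python/NumPy sketch that evaluates $Obj(\Theta)$ for a linear model with square loss and an $L_2$ penalty; the model choice, the `lam` weight, and the toy data are illustrative assumptions, not anything fixed by these notes:

```python
import numpy as np

def objective(w, X, y, lam=0.1):
    """Obj(w) = L(w) + Omega(w): square loss plus an L2 penalty.
    All concrete choices here (linear model, lam=0.1) are illustrative."""
    y_hat = X @ w                          # model: linear prediction
    train_loss = np.sum((y - y_hat) ** 2)  # L(w): sum of square losses
    reg = lam * np.sum(w ** 2)             # Omega(w): lambda * ||w||_2^2
    return train_loss + reg

# toy data: 3 samples, 2 features
X = np.array([[1.0, 2.0], [0.5, 1.0], [2.0, 0.0]])
y = np.array([3.0, 1.5, 2.0])
print(objective(np.array([1.0, 1.0]), X, y))  # fits exactly, so Obj = 0.2 (just the penalty)
```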
Training Loss
Training Loss: measures how well the model fits the training data
$$L(\Theta)=\sum_{i=1}^nl(y_i,\hat{y}_i)$$
- Square Loss
$$l(y_i,\hat{y}_i)=(y_i-\hat{y}_i)^2$$
- Log-Loss
$$l(y_i,\hat{y}_i)=y_i\ln{(1+e^{-\hat{y}_i})}+(1-y_i)\ln{(1+e^{\hat{y}_i})}$$
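Both losses are easy to check numerically. A minimal sketch, assuming NumPy; using `np.logaddexp(0, x)` to compute $\ln(1+e^{x})$ without overflow is my formulation, not one from the notes:

```python
import numpy as np

def square_loss(y, y_hat):
    """l(y, y_hat) = (y - y_hat)^2, for regression."""
    return (y - y_hat) ** 2

def log_loss(y, y_hat):
    """l(y, y_hat) = y*ln(1+e^{-y_hat}) + (1-y)*ln(1+e^{y_hat}),
    for binary labels y with raw scores (logits) y_hat."""
    return y * np.logaddexp(0.0, -y_hat) + (1 - y) * np.logaddexp(0.0, y_hat)

y = np.array([1.0, 0.0, 1.0])       # labels
y_hat = np.array([2.0, -1.0, 0.5])  # predictions / raw scores
print(square_loss(y, y_hat).sum())  # L(Theta): sum over the n examples
print(log_loss(y, y_hat).sum())
```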
Regularization
Regularization: measures the complexity of the model
$$\Omega(\Theta)$$
- $L_1$-norm
$$\Omega(w)=\lambda||w||_1$$
- $L_2$-norm
$$\Omega(w)=\lambda||w||_2^2$$
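A minimal sketch of the two penalties, assuming NumPy; the `lam` value is an arbitrary stand-in for the tuning weight $\lambda$:

```python
import numpy as np

def l1_penalty(w, lam=0.1):
    """Omega(w) = lambda * ||w||_1; tends to push weights to exactly zero."""
    return lam * np.sum(np.abs(w))

def l2_penalty(w, lam=0.1):
    """Omega(w) = lambda * ||w||_2^2; tends to shrink all weights smoothly."""
    return lam * np.sum(w ** 2)

w = np.array([0.5, -2.0, 0.0])
print(l1_penalty(w))  # 0.1 * (0.5 + 2.0 + 0.0) = 0.25
print(l2_penalty(w))  # 0.1 * (0.25 + 4.0 + 0.0) = 0.425
```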
References
1. Fitting well on the training data at least gets you close to the training data, which is hopefully close to the underlying distribution. (from Tianqi Chen, Introduction to Boosted Trees)
2. Simpler models tend to have smaller variance in future predictions, making predictions stable. (from Tianqi Chen, Introduction to Boosted Trees)