SVM 简单分析

原始式

$\min_{\omega, b}\frac{1}{2}||w||^2 \\ s.t. (y_i(\omega^Tx_i+b) - 1) \ge 0\\ \alpha_i \ge 0 \\ 拉格朗日最值充分条件 => 对偶问题 \\ L() = \frac{1}{2}||w||^2 + \sum_{i = 1}^{m}\alpha_i(y_i(\omega^Tx_i+b) - 1) \\ \omega = \sum_{i=1}^{m}\alpha_iy_ix_i \\ 0 = \sum_{i = 1}^{m}\alpha_iy_i \\ 整理得 \\ \max_{\alpha} \sum_{i = 1}^{m}\alpha_i - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i\alpha_jy_iy_jx_i^Tx_j \\ s.t. \alpha_i \ge 0 \\ 0 = \sum_{i = 1}^{m}\alpha_iy_i \\$

核函数. 核矩阵, 再生核希尔伯特空间(RKHS), 线性组合

$x_i^Tx_j => \phi(x_i)^T\phi(x_j) => \kappa(x_i, x_j)$

软间隔, 0/1损失函数

$z = (y_i(\omega^Tx_i+b) - 1) \ge 0 不一定成立\\ \ell_{0/1}(z) \ge 0 一定成立\\ \min_{\omega, b}\frac{1}{2}||w||^2 + C\sum_{i = 1}^{m}\ell_{0/1}(z) 直观解释为允许样本违规, 违规后果参与优化\\ s.t. (y_i(\omega^Tx_i+b) - 1) \ge 0 or\lt 0\\ \alpha_i \ge 0 \\$

松弛变量

$z = (y_i(\omega^Tx_i+b) - 1) \ge 0 不一定成立\\ \ell_{exp}(z), \ell_{logistic}(z), \ell_{hinge}(z) \gt -z \\ \xi_i <=> \ell(z) \\ \min_{\omega, b}\frac{1}{2}||w||^2 + C\sum_{i = 1}^{m}\xi_i\\ s.t. -z \lt \xi_i <=> (y_i(\omega^Tx_i+b)) \ge 1 - \xi_i \\ \xi_i \ge 0 \\ 最值充分条件 => 对偶问题\\ L(\omega, b, \xi) = \frac{1}{2}||w||^2 + C\sum_{i = 1}^{m}\xi_i +\\ \sum_{i = 1}^{m}\alpha_i( 1 - \xi_i - y_i(\omega^Tx_i+b))-\sum_{i=1}^{m}\mu_i\xi_i\\ 对偶问题与原始式一样, 约束条件变为 0 \le \alpha_i \le C$

KKT条件分析

\left\{ \begin{aligned} lagrange乘子 => \alpha_i \ge 0, \mu_i \ge 0 \\ less equal0 unequal => 1 - \xi_i - y_i(\omega^Tx_i+b) \le 0, -\xi_i \le 0 \\ slack equal => \alpha_i( 1 - \xi_i - y_i(\omega^Tx_i+b)) = 0, -\mu_i\xi_i = 0\\ \end{aligned} \right. \\ \begin{cases} \alpha_i = 0 对\omega的优化, b的优化无影响\\ \alpha_i \gt 0 => 1 - \xi_i - y_i(\omega^Tx_i+b) = 0 样本就是软间隔, 支持向量 \\ \alpha_i = C => \mu_i = 0 \begin{cases} \xi_i \le 1 标签, 预测同号, 分类正确, 位于软间隔之间 \\ \xi_i \gt 1 预测错误 \end{cases}\\ \alpha_i \lt C => \mu_i \gt 0 => \xi_i = 0 样本是支持向量, 且对大间隔2/||w||\\ \end{cases}