RecursiveFrog 发表于 2022-1-25 10:38

白话机器学习-最优化方法-牛顿法

简介

牛顿法,英文名称BFGS,是求解非线性优化问题的最有效的方法之一。
特点


[*]收敛速度快;
方式


[*]牛顿法是迭代算法,每一步需要求解目标函数的海塞矩阵的逆矩阵,计算比较复杂(后续会讲解拟牛顿法,拟牛顿法通过正定矩阵近似海塞矩阵的逆矩阵或海塞矩阵,简化了这个过程。
分析

考虑无约束最优化问题

https://www.zhihu.com/equation?tex=%5Cmin_%7Bx+%5Cin+R%7D+f%28x%29++%5C%5C
其中https://www.zhihu.com/equation?tex=x%5E%2A为目标函数的极小点。 假设f(x)具有二阶连续偏导数,若第k次迭代值为,则可将f(x)在附近进行二阶泰勒展开:

https://www.zhihu.com/equation?tex=f%28x%29+%3D+f%28x%5E%7Bk%7D%29+%2B+g_%7Bk%7D%5E%7BT%7D%28x+-+x%5E%7Bk%7D%29+%2B+1%2F2%28x-x%5E%7Bk%7D%29%5ETH%28x%5E%7Bk%7D%29%28x+-+x%5E%7Bk%7D%29++%5C%5C

[*]https://www.zhihu.com/equation?tex=g_k+%3D+g%28x%5E%7Bk%7D%29%3D+%5Cnabla%28f%28x%5E%7Bk%7D%29%29+是f(x)的梯度向量在的值。
[*]是f(x)的海塞矩阵https://www.zhihu.com/equation?tex=+%5B%5Cfrac+%7B%5Cpartial+f%5E2%7D+%7B%5Cpartial+x_i+%5Cpartial+y_j%7D%5D_%7Bnxn%7D在的值。
这里详解下泰勒展开式的里面的海塞矩阵,暂时讲解下二元函数的泰勒展开式



接着我们继续进行,函数f(x)有极值的必要条件是在极值点处的一阶导数为0,即梯度向量为0。特别是当是正定矩阵的时候,函数f(x)的极值为极小值,所以:

https://www.zhihu.com/equation?tex=%5Cnabla%28f%28x%29%29+%3D+0++%5C%5C
对f(x)求导,则

https://www.zhihu.com/equation?tex=%5Cnabla%28f%28x%29+%3D+f%28x%5E%7Bk%7D%29+%2B+g_%7Bk%7D%5E%7BT%7D%28x+-+x%5E%7Bk%7D%29+%2B+1%2F2%28x-x%5E%7Bk%7D%29%5ETH%28x%5E%7Bk%7D%28x+-+x%5E%7Bk%7D%29%29%29++%5C%5Chttps://www.zhihu.com/equation?tex=%3D++g_k+%2B+H%28x%5E%7Bk%7D%29%28x+-+x%5E%7Bk%7D%29+%5C%5C


https://www.zhihu.com/equation?tex=g_k+%2B+H%28x%5E%7Bk%7D%29%28x%5E%7Bk%2B1%7D+-+x%5E%7Bk%7D%29+%3D+0+%5C%5Chttps://www.zhihu.com/equation?tex=x%5E%7Bk%2B1%7D+-+x%5E%7Bk%7D%3D++-H%28x%5Ek%29%5E%7B-1%7Dg_k++%5C%5C
或者

https://www.zhihu.com/equation?tex=x%5E%7Bk%2B1%7D+%3D++x%5E%7Bk%7D+%2B+p_k+%5C%5C
其中


到此公式推导完毕
算法

输入:目标函数f(x),梯度https://www.zhihu.com/equation?tex=+g%28x%29+%3D+%5Cnabla+f%28x%29,海塞矩阵H(x),精度要求ε; 输出:f(x)的极小点x^*;

[*]取初始值点https://www.zhihu.com/equation?tex=x%5E%7B%280%29%7D,k=0;
[*]计算https://www.zhihu.com/equation?tex=g_k+%3D+g%28x%5E%7B%28k%29%7D%29
[*]若https://www.zhihu.com/equation?tex=%7C%7Cg_k%7C%7C+%3C+%CE%B5,则停止计算,得到解https://www.zhihu.com/equation?tex=x%5E%2A+%3D+x%5E%7B%28k%29%7D
[*]计算https://www.zhihu.com/equation?tex=H_k+%3D+H%28x%5E%7B%28k%29%7D%29,并且求解https://www.zhihu.com/equation?tex=p_k



[*]进行迭代,https://www.zhihu.com/equation?tex=x%5E%7Bk%2B1%7D+%3D++x%5E%7Bk%7D+%2B+p_k,请求k++,转到第2步;
页: [1]
查看完整版本: 白话机器学习-最优化方法-牛顿法