"Learning to Optimize" (L2O) refers to using deep learning to automatically construct or accelerate optimization algorithms, and falls under the umbrella of meta-optimization. It differs slightly from the previous section, where RL directly decides the algorithm's steps: L2O instead emphasizes parameterizing the optimization iteration itself with a neural network and training it to converge faster, or to better solutions, on a given class of problems. A classic example is the LSTM optimizer of Andrychowicz et al., which learns to mimic gradient descent and thereby accelerates neural network training (an instance of L2O for continuous optimization).

In linear programming, and in convex optimization more generally, L2O can take the form of learning part of an iterative solver's computation. For example, each step of an interior point method (IPM) requires solving a sequence of linear systems; Gao et al. propose approximating these solves with an LSTM to cut the per-step cost. Embedding the LSTM in the interior-point loop yields a learning-assisted interior point algorithm, IPM-LSTM. In tests on quadratic programs and related problems, the method **reduces the iteration count by about 60% and the overall solve time by about 70%** compared with a standard IPM. In other words, by learning the implicit structure of these linear systems, the neural network replaces general-purpose linear algebra routines and greatly accelerates the optimization. The two sketches below illustrate, in turn, the learned-optimizer pattern and the learned linear-system solver at the heart of IPM-LSTM.
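The first sketch is a minimal, hedged illustration of the learned-optimizer pattern of Andrychowicz et al.: a small coordinate-wise LSTM maps the optimizee's gradients to additive updates, and is itself trained by backpropagating through an unrolled optimization trajectory. All names here (`LearnedOptimizer`, `meta_step`, `hidden_size=20`), and the choice to backpropagate through second-order terms, are illustrative assumptions, not the paper's exact code.

```python
import torch
import torch.nn as nn

class LearnedOptimizer(nn.Module):
    """Coordinate-wise LSTM optimizer: each parameter coordinate is treated
    as a batch element, so one small network handles optimizees of any size."""

    def __init__(self, hidden_size=20):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, grad, state=None):
        g = grad.view(1, -1, 1)                # (seq=1, batch=num_params, feat=1)
        out, state = self.lstm(g, state)
        return self.head(out).view(-1), state  # additive update per coordinate

def meta_step(opt_net, loss_fn, theta, unroll=20):
    """Unroll `unroll` optimizee steps; the accumulated loss is the meta-loss
    that trains the optimizer's own weights."""
    state, meta_loss = None, 0.0
    for _ in range(unroll):
        loss = loss_fn(theta)
        (grad,) = torch.autograd.grad(loss, theta, create_graph=True)
        update, state = opt_net(grad, state)
        theta = theta + update                 # learned rule replaces -lr * grad
        meta_loss = meta_loss + loss
    return meta_loss
```

Calling `meta_step(opt_net, lambda t: (t ** 2).sum(), torch.randn(10, requires_grad=True)).backward()` and stepping an outer optimizer on `opt_net.parameters()` trains the optimizer on a toy quadratic; the paper instead samples training optimizees from the target problem class.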
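The second sketch targets the IPM-LSTM idea: inside the interior-point loop, the Newton/KKT system K y = r arising at each iteration is handed to an LSTM that iteratively refines an approximate solution, driven by the current residual, instead of being solved by direct factorization. This is a sketch under stated assumptions; the module name, sizes, and unroll length are hypothetical, and the actual architecture of Gao et al. differs in detail.

```python
import torch
import torch.nn as nn

class NewtonSystemLSTM(nn.Module):
    """Residual-driven approximate solver for the Newton system K y = r."""

    def __init__(self, n, hidden=64):
        super().__init__()
        self.cell = nn.LSTMCell(input_size=n, hidden_size=hidden)
        self.out = nn.Linear(hidden, n)

    def forward(self, K, r, steps=10):
        # K: (n, n) Newton/KKT matrix, r: (n,) right-hand side
        y = torch.zeros_like(r)
        h = r.new_zeros(1, self.cell.hidden_size)
        c = r.new_zeros(1, self.cell.hidden_size)
        for _ in range(steps):
            residual = (K @ y - r).unsqueeze(0)  # how far K y is from r
            h, c = self.cell(residual, (h, c))
            y = y + self.out(h).squeeze(0)       # refine the candidate solution
        return y

def residual_loss(model, K, r):
    """Training signal: fit on Newton systems sampled from one problem class
    by minimizing the residual norm, so the network learns their shared
    structure instead of factorizing each K from scratch."""
    return torch.linalg.vector_norm(K @ model(K, r) - r)
```

In the surrounding interior point method, the approximate direction returned by such a module replaces the factorize-and-solve step, with the usual step-length safeguards keeping iterates strictly feasible.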
Vlastelica, Marin, et al. "Differentiation of Blackbox Combinatorial Solvers." ICLR, 2020.
Amos, Brandon, and J. Zico Kolter. "OptNet: Differentiable Optimization as a Layer in Neural Networks." ICML, 2017, pp. 136–145.
Meng, Zihang, et al. "Physarum Powered Differentiable Linear Programming Layers and Applications." AAAI, vol. 35, no. 10, 2021, pp. 8939–8949.
Zhao, Guantao, et al. "RL Simplex: Bringing Computational Efficiency in Linear Programming via Reinforcement Learning." ICLR 2024 (under review).
Kotary, James, et al. "End-to-End Constrained Optimization Learning: A Survey." IJCAI, Survey Track, 2021, pp. 4475–4482.
Mandi, Jayanta, and Tias Guns. "Interior Point Solving for LP-based Prediction+Optimisation." NeurIPS, 2020.
Lu, Haihao, and David Applegate. "Scaling up Linear Programming with PDLP." Google AI Blog, 20 Sept. 2024.
Gao, Xi, et al. "IPM-LSTM: A Learning-Based Interior Point Method for Solving Nonlinear Programs." NeurIPS, 2024 (to appear).
Kafakthong, Natdanai, and Krung Sinapiromsaran. "Primal-Optimal-Binding LPNet: Deep Learning Architecture to Predict Optimal Binding Constraints of a Linear Programming Problem." IJACSA, vol. 14, no. 5, 2023, pp. 1055–1062.