Cosine Annealing Schedule: A Classic Strategy for Smoothly Decaying the Learning Rate
FreeGuideOnline
2026-06-21
```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import CosineAnnealingLR
import matplotlib.pyplot as plt

model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1)
# T_max is the half-period: the rate reaches eta_min after 50 epochs, then climbs back.
scheduler = CosineAnnealingLR(optimizer, T_max=50, eta_min=0.001)

lrs = []
for epoch in range(100):
    optimizer.step()  # simulate a training step
    scheduler.step()
    lrs.append(optimizer.param_groups[0]['lr'])

# Plot the learning rate curve
plt.plot(lrs)
plt.xlabel('Epoch')
plt.ylabel('Learning Rate')
plt.title('Cosine Annealing Learning Rate Schedule')
plt.show()
```
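To make the shape of the curve concrete, here is a minimal plain-Python sketch of the closed-form cosine annealing formula, using the same values as the example above (`eta_max=0.1`, `eta_min=0.001`, `T_max=50`). The function name `cosine_annealing_lr` is made up for illustration and is not part of PyTorch; the sketch only evaluates the formula and does not reproduce the library's internal update.

```python
import math

def cosine_annealing_lr(epoch, eta_max=0.1, eta_min=0.001, t_max=50):
    """Closed form: eta_min + (eta_max - eta_min) * (1 + cos(pi * epoch / t_max)) / 2."""
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

# Evaluate the formula for epochs 0..100 (two half-periods of length t_max).
schedule = [cosine_annealing_lr(e) for e in range(101)]

print(f"epoch   0: {schedule[0]:.4f}")    # starts at eta_max
print(f"epoch  25: {schedule[25]:.4f}")   # halfway between eta_max and eta_min
print(f"epoch  50: {schedule[50]:.4f}")   # bottoms out at eta_min
print(f"epoch 100: {schedule[100]:.4f}")  # back at eta_max after a full period
```

Because the cosine is periodic, running past `t_max` sends the rate back up. If only the decaying half is wanted, one option is to set `T_max` equal to the total number of epochs.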
For the warm-restart variant:
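```python
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

# Reuses the optimizer defined earlier.
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=20, T_mult=2, eta_min=0)
```

- `T_0`: length of the first cycle.
- `T_mult`: factor by which each subsequent cycle length is multiplied (for example, with `T_mult=2` the cycle lengths are 20, 40, 80, ...).
- `eta_min`: the minimum learning rate.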
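The effect of `T_0` and `T_mult` on the restart points can be checked with a small plain-Python sketch. The helper `warm_restart_lr` is a hypothetical name used only for illustration. It assumes a peak rate of 0.1 (the `lr` used earlier) and applies the cosine formula within each cycle; it is not PyTorch's own implementation.

```python
import math

def warm_restart_lr(epoch, eta_max=0.1, eta_min=0.0, t_0=20, t_mult=2):
    """Cosine annealing with restarts: each cycle is t_mult times longer than the last."""
    t_i, t_cur = t_0, epoch
    # Walk through completed cycles to find the position inside the current one.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))

# Epochs at which the rate jumps back to eta_max (the restarts).
restarts = [e for e in range(1, 150) if math.isclose(warm_restart_lr(e), 0.1)]
print(restarts)
```

With cycle lengths of 20, 40 and 80, the restarts fall at the cumulative sums 20, 60 and 140.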
Using TensorFlow/Keras
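```python
import tensorflow as tf

total_epochs = 100
# Note: CosineDecay counts optimizer steps (batches), not epochs. When training
# on batches, decay_steps should be total_epochs * (steps per epoch).
lr_schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=0.1,
    decay_steps=total_epochs,
    alpha=0.001  # eta_min / eta_max, so the floor here is 0.1 * 0.001 = 1e-4
)
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule)
```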
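To see what `alpha` and `decay_steps` do, the sketch below evaluates the decay formula given in the Keras `CosineDecay` documentation (without warmup) in plain Python. The function name `cosine_decay_value` is an assumption for illustration, and the defaults mirror the values used above.

```python
import math

def cosine_decay_value(step, initial_learning_rate=0.1, decay_steps=100, alpha=0.001):
    """Learning rate after `step` optimizer steps, following the documented formula."""
    step = min(step, decay_steps)  # the schedule is held constant past decay_steps
    cosine = 0.5 * (1 + math.cos(math.pi * step / decay_steps))
    decayed = (1 - alpha) * cosine + alpha
    return initial_learning_rate * decayed

for step in (0, 50, 100, 500):
    print(step, cosine_decay_value(step))
```

Unlike the periodic PyTorch schedule, the rate here stays at `alpha * initial_learning_rate` once `decay_steps` is reached. To get a floor of 0.001 with an initial rate of 0.1, `alpha` would need to be 0.01.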