Clustering Time Series: Pattern Mining to Discover Similar Behavior

FreeGuideOnline Latest 2026-06-24

```python
import numpy as np
import matplotlib.pyplot as plt
from tslearn.preprocessing import TimeSeriesScalerMeanVariance
from tslearn.clustering import TimeSeriesKMeans
from tslearn.datasets import CachedDatasets

# Load example data, here the "Trace" dataset
X_train, y_train, X_test, y_test = CachedDatasets().load_dataset("Trace")

# Normalize each series to zero mean and unit variance
scaler = TimeSeriesScalerMeanVariance()
X_scaled = scaler.fit_transform(X_train)

# K-means with the DTW metric, with the number of clusters set to 3
n_clusters = 3
km_dtw = TimeSeriesKMeans(n_clusters=n_clusters, metric="dtw", verbose=True, random_state=0)
labels = km_dtw.fit_predict(X_scaled)

# Plot the average shape (cluster center) of each cluster
plt.figure(figsize=(10, 6))
for yi in range(n_clusters):
    plt.subplot(1, n_clusters, yi + 1)
    # Draw the first 10 member series of the cluster in translucent black
    for xx in X_scaled[labels == yi][:10]:
        plt.plot(xx.ravel(), "k-", alpha=0.2)
    # Overlay the cluster center in red
    plt.plot(km_dtw.cluster_centers_[yi].ravel(), "r-", linewidth=2)
    plt.title(f"Cluster {yi + 1}")
plt.tight_layout()
plt.show()
```
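To give some intuition for what `metric="dtw"` changes compared with plain Euclidean K-means, here is a minimal NumPy sketch of a DTW distance. It is an illustration only: the function name `dtw_distance` and the two toy series are assumptions made for this example, and it uses a common textbook formulation (squared pointwise cost, square root of the accumulated cost) rather than tslearn's optimized implementation, which may differ in details.

```python
import numpy as np

def dtw_distance(a, b):
    """Illustrative DTW distance between two 1-D series (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor alignments
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(np.sqrt(cost[n, m]))

# Two toy series with the same bump, shifted by one time step
a = [0, 0, 1, 2, 1, 0, 0]
b = [0, 1, 2, 1, 0, 0, 0]

euclidean = float(np.linalg.norm(np.array(a) - np.array(b)))
dtw = dtw_distance(a, b)
print(f"Euclidean: {euclidean:.3f}, DTW: {dtw:.3f}")
```

Because DTW is allowed to stretch the time axis, the shifted bump aligns with itself and the DTW distance is much smaller than the Euclidean one. That is why series with the same shape but different timing tend to land in the same cluster when DTW is used.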