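Cross-Validation (CV)

To get a more reliable estimate of model performance, use K-fold cross-validation (K-fold CV). It randomly splits the training set into K subsets (folds), then trains and evaluates the model K times, each time holding out a different fold for evaluation and training on the remaining K-1 folds. The K evaluation scores are then averaged.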
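The procedure above can be sketched with scikit-learn's `cross_val_score`. This is a minimal illustration, not a prescribed setup: the iris dataset, logistic regression, and K=5 are all assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Assumed example data and model, chosen only for illustration
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# shuffle=True matters here because iris is stored sorted by class
cv = KFold(n_splits=5, shuffle=True, random_state=42)

# One accuracy score per fold: train on 4 folds, evaluate on the held-out fold
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print("Fold scores:", scores)
print("Mean accuracy:", scores.mean())
print("Std across folds:", scores.std())
```

Reporting the standard deviation next to the mean gives a rough sense of how much the estimate varies from fold to fold.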

Bias/Variance Trade-off

A model's generalization error can be decomposed into three parts: bias (squared), variance, and irreducible error.
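For squared-error loss, the decomposition can be written out explicitly. The notation is an assumption introduced here: the data are generated as $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, and $\hat{f}$ is the model fitted on a random training set, so the expectation is taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

Only the first two terms depend on the model. The third is noise in the data itself, which no model can remove.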

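High Bias (Underfitting)

The model is too simple to capture the patterns in the data, so its error is high even on the training set.

High Variance (Overfitting)

The model is too complex and overly sensitive to small variations in the training data, so it fits the training set well but generalizes poorly.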
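One way to see both failure modes is to fit models of increasing complexity to the same data and compare training error with validation error. The sketch below assumes synthetic data (a quadratic signal plus Gaussian noise) and polynomial regression of degree 1, 2, and 10. These choices are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data (assumption): quadratic signal plus Gaussian noise
rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(60, 1))
y = 0.5 * X[:, 0] ** 2 + X[:, 0] + 2 + rng.normal(0, 1, size=60)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=42
)

results = {}
for degree in (1, 2, 10):
    # Polynomial features -> scaling -> ordinary least squares
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        StandardScaler(),
        LinearRegression(),
    )
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    val_mse = mean_squared_error(y_val, model.predict(X_val))
    results[degree] = (train_mse, val_mse)
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  validation MSE={val_mse:.3f}")
```

Training error can only go down as the degree grows, because each model contains the previous one as a special case. A degree that is too low usually shows high error on both sets (high bias). A degree that is too high tends to show a widening gap between training and validation error (high variance), though how large the gap is depends on the data and the random split.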

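Key Performance Metrics

Task type	Core metric	Notes
Regression	Root Mean Squared Error (RMSE)	The most common metric. It indicates how much error the system typically makes in its predictions and gives higher weight to large errors.
	Mean Absolute Error (MAE)	Less sensitive to outliers; can be used as an alternative to RMSE.
Classification	Confusion Matrix	Counts how often instances of class A are classified as class B; more informative than accuracy alone.
	Precision and Recall	Precision: the accuracy of the positive predictions (TP/(TP+FP)). Recall: the fraction of positive instances that the classifier correctly detects (TP/(TP+FN)). Raising one tends to lower the other (the precision/recall trade-off).
	F1 Score	The harmonic mean of precision and recall. The F1 score is high only when both precision and recall are high.
	ROC Curve and AUC (Area Under the Curve)	The ROC curve plots the true positive rate (recall) against the false positive rate (FPR). AUC summarizes the classifier's performance across all decision thresholds; the closer it is to 1, the better (0.5 corresponds to random guessing).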
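The precision, recall, and F1 formulas in the table, and the trade-off between precision and recall, can be checked by hand on a small example. The labels and scores below are made up for illustration, and the three thresholds are arbitrary choices.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical ground truth and classifier scores (illustration only)
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.6, 0.65, 0.9, 0.3, 0.7, 0.55, 0.2])

results = {}
for threshold in (0.25, 0.5, 0.75):
    # Predict the positive class when the score reaches the threshold
    y_pred = (y_score >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

    precision = tp / (tp + fp)   # TP / (TP + FP)
    recall = tp / (tp + fn)      # TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

    results[threshold] = (precision, recall, f1)
    print(f"threshold={threshold:.2f}  precision={precision:.3f}  "
          f"recall={recall:.3f}  F1={f1:.3f}")
```

On this toy data, raising the threshold moves precision from 0.625 to 0.8 to 1.0 while recall falls from 1.0 to 0.8 to 0.2. F1 peaks at the middle threshold, where the two are balanced.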

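from sklearn.metrics import mean_squared_error, mean_absolute_error
import numpy as np

# Small hand-made example: three errors of 10 and one error of 50
y_true = np.array([100, 200, 300, 400])
y_pred = np.array([110, 190, 290, 450])

# mean_squared_error returns the MSE; take the square root to get the RMSE
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
mae = mean_absolute_error(y_true, y_pred)

print("RMSE:", rmse)
print("MAE:", mae)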
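To see why RMSE weights large errors more heavily than MAE, compare two sets of predictions with the same total absolute error. The numbers are invented for the example, and `rmse` is a small helper defined here, not a scikit-learn function.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

def rmse(y_true, y_pred):
    """Root mean squared error (helper defined for this example)."""
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))

y_true = np.array([100.0, 200.0, 300.0, 400.0])
y_pred_even = np.array([120.0, 220.0, 320.0, 420.0])     # four errors of 20
y_pred_outlier = np.array([100.0, 200.0, 300.0, 480.0])  # one error of 80

mae_even = mean_absolute_error(y_true, y_pred_even)
mae_outlier = mean_absolute_error(y_true, y_pred_outlier)
rmse_even = rmse(y_true, y_pred_even)
rmse_outlier = rmse(y_true, y_pred_outlier)

print("Even errors:   MAE =", mae_even, " RMSE =", rmse_even)
print("One outlier:   MAE =", mae_outlier, " RMSE =", rmse_outlier)
```

Both cases have an MAE of 20, but concentrating the error in a single prediction doubles the RMSE from 20 to 40.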

from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
X = pd.DataFrame(iris.data)
y = pd.Series(iris.target)

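# Build an imbalanced dataset: keep all 50 samples of class 0,
# but only 5 randomly chosen samples each of classes 1 and 2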
idx_0 = y[y == 0].index
idx_1 = y[y == 1].sample(5, random_state=42).index
idx_2 = y[y == 2].sample(5, random_state=42).index

idx = idx_0.union(idx_1).union(idx_2)
X_imb = X.loc[idx]
y_imb = y.loc[idx]

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_imb,
    y_imb,
    test_size=0.3,
    stratify=y_imb,
    random_state=42
)

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

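# Rows are true classes, columns are predicted classes
print("Confusion matrix:")
print(confusion_matrix(y_test, y_pred))

# Per-class precision, recall, and F1
print("\nClassification report:")
print(classification_report(y_test, y_pred))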

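from sklearn.metrics import roc_auc_score

# Predicted class probabilities, shape (n_samples, 3)
y_prob = model.predict_proba(X_test)

# One-vs-rest, macro-averaged AUC. roc_auc_score accepts the multiclass
# labels directly, so manual binarization with label_binarize is not needed
# (with a binarized y_true the multi_class argument is not used).
auc = roc_auc_score(y_test, y_prob, multi_class="ovr")
print("ROC-AUC:", auc)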
