Case Study Series: Titanic Survivor Prediction with TensorFlow Decision Forests

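Table of contents

  • 1. Import dependencies
  • 2. Load the dataset
  • 3. Prepare the dataset
  • 4. Convert the Pandas dataset to a TensorFlow dataset
  • 5. Train the model with default parameters
  • 6. Train the model with improved default parameters
  • 7. Make predictions
  • 8. Train the model with hyperparameter tuning
  • 9. Build an ensemble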

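TensorFlow Decision Forests (TF-DF) performs well on tabular data. This notebook walks you through training a baseline Gradient Boosted Trees model with TensorFlow Decision Forests and submitting it to the Titanic competition.

This notebook shows:

  1. How to do some basic preprocessing. For example, passenger names are tokenized and ticket names are split into parts.
  2. How to train Gradient Boosted Trees (GBT) with default parameters.
  3. How to train a GBT with improved default parameters.
  4. How to tune the parameters of GBTs.
  5. How to train and ensemble many GBTs.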

1. Import dependencies

# Import the required libraries
import numpy as np
import pandas as pd
import os

import tensorflow as tf
import tensorflow_decision_forests as tfdf

# Print the TensorFlow Decision Forests version
print(f"Found TF-DF {tfdf.__version__}")

Found TF-DF 1.2.0

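2. Load the dataset

# Import pandas for data processing and analysis
import pandas as pd

# Read the training dataset and the test (serving) dataset
train_df = pd.read_csv("/kaggle/input/titanic/train.csv")
serving_df = pd.read_csv("/kaggle/input/titanic/test.csv")

# Show the first 10 rows of the training dataset
train_df.head(10)

|   | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S |
| 5 | 6 | 0 | 3 | Moran, Mr. James | male | NaN | 0 | 0 | 330877 | 8.4583 | NaN | Q |
| 6 | 7 | 0 | 1 | McCarthy, Mr. Timothy J | male | 54.0 | 0 | 0 | 17463 | 51.8625 | E46 | S |
| 7 | 8 | 0 | 3 | Palsson, Master. Gosta Leonard | male | 2.0 | 3 | 1 | 349909 | 21.0750 | NaN | S |
| 8 | 9 | 1 | 3 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | female | 27.0 | 0 | 2 | 347742 | 11.1333 | NaN | S |
| 9 | 10 | 1 | 2 | Nasser, Mrs. Nicholas (Adele Achem) | female | 14.0 | 1 | 0 | 237736 | 30.0708 | NaN | C |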

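3. Prepare the dataset

We apply the following transformations to the dataset.

  1. Tokenize the names. For example, "Braund, Mr. Owen Harris" becomes ["Braund", "Mr", "Owen", "Harris"] once punctuation is stripped and the name is split into tokens.
  2. Extract any prefix from the ticket. For example, the ticket "STON/O2. 3101282" becomes "STON/O2." and 3101282.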
def preprocess(df):
    # Copy the input DataFrame so the original data is not modified
    df = df.copy()

    def normalize_name(x):
        # Strip punctuation from each word and rejoin the words with spaces
        return " ".join([v.strip(",()[].\"'") for v in x.split(" ")])

    def ticket_number(x):
        # Split the ticket on spaces and return the last part
        return x.split(" ")[-1]

    def ticket_item(x):
        # Split the ticket on spaces
        items = x.split(" ")
        # A ticket with a single part has no prefix
        if len(items) == 1:
            return "NONE"
        # Otherwise, join everything except the last part with underscores
        return "_".join(items[0:-1])

    # Normalize the names and derive the two ticket features
    df["Name"] = df["Name"].apply(normalize_name)
    df["Ticket_number"] = df["Ticket"].apply(ticket_number)
    df["Ticket_item"] = df["Ticket"].apply(ticket_item)
    return df

# Preprocess the training dataset
preprocessed_train_df = preprocess(train_df)
# Preprocess the serving dataset
preprocessed_serving_df = preprocess(serving_df)

# Show the first 5 rows of the preprocessed training dataset
preprocessed_train_df.head(5)

|   | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | Ticket_number | Ticket_item |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund Mr Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S | 21171 | A/5 |
| 1 | 2 | 1 | 1 | Cumings Mrs John Bradley Florence Briggs Thayer | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C | 17599 | PC |
| 2 | 3 | 1 | 3 | Heikkinen Miss Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S | 3101282 | STON/O2. |
| 3 | 4 | 1 | 1 | Futrelle Mrs Jacques Heath Lily May Peel | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S | 113803 | NONE |
| 4 | 5 | 0 | 3 | Allen Mr William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S | 373450 | NONE |
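
To make the two transformations concrete, the following stand-alone sketch applies the same string logic to a few raw values from the tables above, without pandas. The function `split_ticket` is a hypothetical helper introduced here for illustration; it combines what `ticket_item` and `ticket_number` do inside `preprocess`.

```python
def normalize_name(name):
    # Strip punctuation from each space-separated word, as preprocess() does.
    return " ".join(word.strip(",()[].\"'") for word in name.split(" "))


def split_ticket(ticket):
    # Hypothetical helper: returns (prefix, number) using the same rules as
    # ticket_item() and ticket_number() in preprocess().
    items = ticket.split(" ")
    prefix = "NONE" if len(items) == 1 else "_".join(items[:-1])
    return prefix, items[-1]


# Raw (Name, Ticket) pairs taken from the training table.
examples = [
    ("Braund, Mr. Owen Harris", "A/5 21171"),
    ("Heikkinen, Miss. Laina", "STON/O2. 3101282"),
    ("Allen, Mr. William Henry", "373450"),
]

results = [(normalize_name(n), split_ticket(t)) for n, t in examples]
for name, (prefix, number) in results:
    print(f"{name!r}: prefix={prefix!r}, number={number!r}")
```

A ticket with more than one prefix part would have those parts joined with underscores, and a ticket with no prefix gets the placeholder "NONE".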

Let's list the input features of the model. Notably, we don't want to train our model on the "PassengerId" and "Ticket" features. The label column, "Survived", is removed from the list as well.

# Collect all column names of the preprocessed training dataset
input_features = list(preprocessed_train_df.columns)
# Remove the columns that must not be used as inputs
input_features.remove("Ticket")
input_features.remove("PassengerId")
input_features.remove("Survived")

# Print the remaining feature columns
print(f"Input features: {input_features}")

Input features: ['Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Cabin', 'Embarked', 'Ticket_number', 'Ticket_item']
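
The same list can also be built in one pass with a list comprehension and an exclusion set, which avoids the repeated `remove` calls. This is only an alternative sketch: the column list is typed out by hand so the snippet runs without the dataset, and `excluded` is a name introduced here for illustration.

```python
# Column names of the preprocessed training frame, as shown in the table above.
columns = [
    "PassengerId", "Survived", "Pclass", "Name", "Sex", "Age", "SibSp",
    "Parch", "Ticket", "Fare", "Cabin", "Embarked", "Ticket_number",
    "Ticket_item",
]

# The identifier, the raw ticket string, and the label are not model inputs.
excluded = {"PassengerId", "Ticket", "Survived"}

# Keep the original column order while dropping the excluded names.
input_features = [c for c in columns if c not in excluded]
print(f"Input features: {input_features}")
```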

4. Convert the Pandas dataset to a TensorFlow dataset

def tokenize_names(features, labels=None):
    """Split the names into tokens. TF-DF can consume text tokens natively."""
    features["Name"] = tf.strings.split(features["Name"])
    return features, labels

# Convert the preprocessed training data to a TF dataset with "Survived" as the label, then tokenize the names
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(preprocessed_train_df, label="Survived").map(tokenize_names)
# Convert the preprocessed serving data to a TF dataset and tokenize the names
serving_ds = tfdf.keras.pd_dataframe_to_tf_dataset(preprocessed_serving_df).map(tokenize_names)
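
As a rough picture of what tokenization produces, the sketch below uses Python's `str.split()` as a stand-in for `tf.strings.split` (both split on whitespace by default, but this is not the TensorFlow op itself). It also shows the kind of membership test a tree could apply to a set of tokens. The helper `contains_any` is hypothetical and only illustrates the idea; it is not TF-DF's internal implementation.

```python
def tokenize(name):
    # Stand-in for tf.strings.split: split on runs of whitespace.
    return name.split()


def contains_any(tokens, vocabulary):
    # Hypothetical set-membership test: true if any token is in `vocabulary`.
    return not set(tokens).isdisjoint(vocabulary)


# Names as they look after preprocess() has stripped the punctuation.
names = ["Braund Mr Owen Harris", "Heikkinen Miss Laina"]

tokenized = [tokenize(n) for n in names]
flags = [contains_any(t, {"Mrs", "Miss"}) for t in tokenized]

for n, t, f in zip(names, tokenized, flags):
    print(n, t, f)
```

Treating the name as a set of tokens lets a split pick up on titles such as "Mr" or "Miss" without any hand-written title extraction.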

5. Train the model with default parameters

First, we train a GradientBoostedTreesModel with the default parameters.

# Create a gradient boosted trees model
model = tfdf.keras.GradientBoostedTreesModel(
    verbose=0,  # Very few logs
    features=[tfdf.keras.FeatureUsage(name=n) for n in input_features],  # Features the model uses
    exclude_non_specified_features=True,  # Only use the features listed in "features"
    random_seed=1234,  # Random seed
)
# Train the model on the training dataset
model.fit(train_ds)

# Get the model's self-evaluation and print its accuracy and loss
self_evaluation = model.make_inspector().evaluation()
print(f"Accuracy: {self_evaluation.accuracy} Loss:{self_evaluation.loss}")

[INFO 2023-05-18T10:31:05.469776904+00:00 kernel.cc:1214] Loading model from path /tmp/tmpxl2c60xw/model/ with prefix f38ff16f536e4497
[INFO 2023-05-18T10:31:05.47954519+00:00 abstract_model.cc:1311] Engine "GradientBoostedTreesQuickScorerExtended" built
[INFO 2023-05-18T10:31:05.479865457+00:00 kernel.cc:1046] Use fast generic engine
WARNING: AutoGraph could not transform <function simple_ml_inference_op_with_handle at 0x78705a4f94d0> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
Accuracy: 0.8260869383811951 Loss:0.8608942627906799

6. Train the model with improved default parameters

Now you'll set some specific parameters when creating the GBT model.

# Create the model
model = tfdf.keras.GradientBoostedTreesModel(
    verbose=0,  # Very few logs
    features=[tfdf.keras.FeatureUsage(name=n) for n in input_features],  # Features the model uses
    exclude_non_specified_features=True,  # Only use the features listed in "features"
    min_examples=1,  # Minimum number of examples per node
    categorical_algorithm="RANDOM",  # Algorithm for splits on categorical features
    shrinkage=0.05,  # Learning rate
    split_axis="SPARSE_OBLIQUE",  # Type of split axis
    sparse_oblique_normalization="MIN_MAX",  # Normalization for sparse oblique splits
    sparse_oblique_num_projections_exponent=2.0,  # Exponent for the number of sparse oblique projections
    num_trees=2000,  # Maximum number of trees
    random_seed=1234,  # Random seed
)
# Train the model
model.fit(train_ds)

# Evaluate the model
self_evaluation = model.make_inspector().evaluation()
print(f"Accuracy: {self_evaluation.accuracy} Loss:{self_evaluation.loss}")

[INFO 2023-05-18T10:31:10.217810247+00:00 kernel.cc:1214] Loading model from path /tmp/tmp73d7qv4h/model/ with prefix ce08288098554ec5
[INFO 2023-05-18T10:31:10.227982178+00:00 decision_forest.cc:661] Model loaded with 33 root(s), 1823 node(s), and 10 input feature(s).
[INFO 2023-05-18T10:31:10.228265252+00:00 kernel.cc:1046] Use fast generic engine
Accuracy: 0.760869562625885 Loss:1.0154211521148682
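
The `shrinkage` argument is the learning rate: it scales how much each new tree contributes. The toy sketch below illustrates that effect only. It is not TF-DF's algorithm: the "tree" at each step is just a constant (the mean of the current residuals), and the targets are made-up numbers. The function name `boost_constant` is hypothetical.

```python
def boost_constant(targets, shrinkage, num_iterations):
    # Toy boosting loop: the weak learner at each step is a single constant,
    # the mean of the current residuals, scaled by the shrinkage.
    prediction = 0.0
    history = []
    for _ in range(num_iterations):
        mean_residual = sum(t - prediction for t in targets) / len(targets)
        prediction += shrinkage * mean_residual
        history.append(prediction)
    return history


targets = [1.0, 1.0, 0.0, 1.0]  # made-up values with mean 0.75

# Same number of steps, two different learning rates.
slow = boost_constant(targets, shrinkage=0.05, num_iterations=33)
fast = boost_constant(targets, shrinkage=0.5, num_iterations=33)

print(f"shrinkage=0.05 after 33 steps: {slow[-1]:.4f}")
print(f"shrinkage=0.5  after 33 steps: {fast[-1]:.4f}")
```

With a small shrinkage each step moves the prediction only a little, which is why a low learning rate is usually paired with a large tree budget, as in the configuration above (0.05 and 2000).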

Let's take a look at the model. You can also see information about the variable importances the model found.

# Print a summary of the model
model.summary()

Model: "gradient_boosted_trees_model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
=================================================================
Total params: 1
Trainable params: 0
Non-trainable params: 1
_________________________________________________________________
Type: "GRADIENT_BOOSTED_TREES"
Task: CLASSIFICATION
Label: "__LABEL"

Input Features (11):
    Age
    Cabin
    Embarked
    Fare
    Name
    Parch
    Pclass
    Sex
    SibSp
    Ticket_item
    Ticket_number

No weights

Variable Importance: INV_MEAN_MIN_DEPTH:
    1.           "Sex"  0.576632 ################
    2.           "Age"  0.364297 #######
    3.          "Fare"  0.278839 ####
    4.          "Name"  0.208548 #
    5. "Ticket_number"  0.180792
    6.        "Pclass"  0.176962
    7.         "Parch"  0.176659
    8.   "Ticket_item"  0.175540
    9.      "Embarked"  0.172339
   10.         "SibSp"  0.170442

Variable Importance: NUM_AS_ROOT:
    1.  "Sex" 28.000000 ################
    2. "Name"  5.000000

Variable Importance: NUM_NODES:
    1.           "Age" 406.000000 ################
    2.          "Fare" 290.000000 ###########
    3.          "Name" 44.000000 #
    4.   "Ticket_item" 42.000000 #
    5.           "Sex" 31.000000 #
    6.         "Parch" 28.000000
    7. "Ticket_number" 22.000000
    8.        "Pclass" 15.000000
    9.      "Embarked" 12.000000
   10.         "SibSp"  5.000000

Variable Importance: SUM_SCORE:
    1.           "Sex" 460.497828 ################
    2.           "Age" 355.963333 ############
    3.          "Fare" 292.870316 ##########
    4.          "Name" 108.548952 ###
    5.        "Pclass" 28.132254
    6.   "Ticket_item" 23.818676
    7. "Ticket_number" 23.772288
    8.         "Parch" 19.303155
    9.      "Embarked"  8.155722
   10.         "SibSp"  0.015225

Loss: BINOMIAL_LOG_LIKELIHOOD
Validation loss value: 1.01542
Number of trees per iteration: 1
Node format: NOT_SET
Number of trees: 33
Total number of nodes: 1823

Number of nodes by tree:
Count: 33 Average: 55.2424 StdDev: 5.13473
Min: 39 Max: 63 Ignored: 0
----------------------------------------------
[ 39, 40) 1   3.03%   3.03% #
[ 40, 41) 0   0.00%   3.03%
[ 41, 42) 0   0.00%   3.03%
[ 42, 44) 0   0.00%   3.03%
[ 44, 45) 0   0.00%   3.03%
[ 45, 46) 0   0.00%   3.03%
[ 46, 47) 0   0.00%   3.03%
[ 47, 49) 2   6.06%   9.09% ###
[ 49, 50) 2   6.06%  15.15% ###
[ 50, 51) 0   0.00%  15.15%
[ 51, 52) 2   6.06%  21.21% ###
[ 52, 54) 5  15.15%  36.36% #######
[ 54, 55) 0   0.00%  36.36%
[ 55, 56) 5  15.15%  51.52% #######
[ 56, 57) 0   0.00%  51.52%
[ 57, 59) 4  12.12%  63.64% ######
[ 59, 60) 7  21.21%  84.85% ##########
[ 60, 61) 0   0.00%  84.85%
[ 61, 62) 3   9.09%  93.94% ####
[ 62, 63] 2   6.06% 100.00% ###

Depth by leafs:
Count: 928 Average: 4.8847 StdDev: 0.380934
Min: 2 Max: 5 Ignored: 0
----------------------------------------------
[ 2, 3)   1   0.11%   0.11%
[ 3, 4)  17   1.83%   1.94%
[ 4, 5)  70   7.54%   9.48% #
[ 5, 5] 840  90.52% 100.00% ##########

Number of training obs by leaf:
Count: 928 Average: 28.4127 StdDev: 70.8313
Min: 1 Max: 438 Ignored: 0
----------------------------------------------
[   1,  22) 731  78.77%  78.77% ##########
[  22,  44)  74   7.97%  86.75% #
[  44,  66)  37   3.99%  90.73% #
[  66,  88)   3   0.32%  91.06%
[  88, 110)   9   0.97%  92.03%
[ 110, 132)   8   0.86%  92.89%
[ 132, 154)  18   1.94%  94.83%
[ 154, 176)   8   0.86%  95.69%
[ 176, 198)   6   0.65%  96.34%
[ 198, 220)   2   0.22%  96.55%
[ 220, 241)   2   0.22%  96.77%
[ 241, 263)   1   0.11%  96.88%
[ 263, 285)   2   0.22%  97.09%
[ 285, 307)   5   0.54%  97.63%
[ 307, 329)   1   0.11%  97.74%
[ 329, 351)   2   0.22%  97.95%
[ 351, 373)   6   0.65%  98.60%
[ 373, 395)   6   0.65%  99.25%
[ 395, 417)   2   0.22%  99.46%
[ 417, 438]   5   0.54% 100.00%

Attribute in nodes:
    406 : Age [NUMERICAL]
    290 : Fare [NUMERICAL]
    44 : Name [CATEGORICAL_SET]
    42 : Ticket_item [CATEGORICAL]
    31 : Sex [CATEGORICAL]
    28 : Parch [NUMERICAL]
    22 : Ticket_number [CATEGORICAL]
    15 : Pclass [NUMERICAL]
    12 : Embarked [CATEGORICAL]
    5 : SibSp [NUMERICAL]

Attribute in nodes with depth <= 0:
    28 : Sex [CATEGORICAL]
    5 : Name [CATEGORICAL_SET]

Attribute in nodes with depth <= 1:
    39 : Age [NUMERICAL]
    28 : Sex [CATEGORICAL]
    21 : Fare [NUMERICAL]
    5 : Name [CATEGORICAL_SET]
    3 : Pclass [NUMERICAL]
    2 : Ticket_number [CATEGORICAL]
    1 : Parch [NUMERICAL]

Attribute in nodes with depth <= 2:
    102 : Age [NUMERICAL]
    65 : Fare [NUMERICAL]
    28 : Sex [CATEGORICAL]
    15 : Name [CATEGORICAL_SET]
    7 : Ticket_number [CATEGORICAL]
    5 : Pclass [NUMERICAL]
    4 : Parch [NUMERICAL]
    2 : Ticket_item [CATEGORICAL]
    2 : Embarked [CATEGORICAL]

Attribute in nodes with depth <= 3:
    206 : Age [NUMERICAL]
    156 : Fare [NUMERICAL]
    33 : Name [CATEGORICAL_SET]
    29 : Sex [CATEGORICAL]
    19 : Ticket_number [CATEGORICAL]
    11 : Ticket_item [CATEGORICAL]
    11 : Parch [NUMERICAL]
    7 : Pclass [NUMERICAL]
    3 : Embarked [CATEGORICAL]

Attribute in nodes with depth <= 5:
    406 : Age [NUMERICAL]
    290 : Fare [NUMERICAL]
    44 : Name [CATEGORICAL_SET]
    42 : Ticket_item [CATEGORICAL]
    31 : Sex [CATEGORICAL]
    28 : Parch [NUMERICAL]
    22 : Ticket_number [CATEGORICAL]
    15 : Pclass [NUMERICAL]
    12 : Embarked [CATEGORICAL]
    5 : SibSp [NUMERICAL]

Condition type in nodes:
    744 : ObliqueCondition
    122 : ContainsBitmapCondition
    29 : ContainsCondition
Condition type in nodes with depth <= 0:
    31 : ContainsBitmapCondition
    2 : ContainsCondition
Condition type in nodes with depth <= 1:
    64 : ObliqueCondition
    33 : ContainsBitmapCondition
    2 : ContainsCondition
Condition type in nodes with depth <= 2:
    176 : ObliqueCondition
    51 : ContainsBitmapCondition
    3 : ContainsCondition
Condition type in nodes with depth <= 3:
    380 : ObliqueCondition
    77 : ContainsBitmapCondition
    18 : ContainsCondition
Condition type in nodes with depth <= 5:
    744 : ObliqueCondition
    122 : ContainsBitmapCondition
    29 : ContainsCondition

Training logs:
Number of iteration to final model: 33
    Iter:1 train-loss:1.266350 valid-loss:1.360049  train-accuracy:0.624531 valid-accuracy:0.543478
    Iter:2 train-loss:1.213702 valid-loss:1.321897  train-accuracy:0.624531 valid-accuracy:0.543478
    Iter:3 train-loss:1.165783 valid-loss:1.286817  train-accuracy:0.624531 valid-accuracy:0.543478
    Iter:4 train-loss:1.122469 valid-loss:1.256133  train-accuracy:0.624531 valid-accuracy:0.543478
    Iter:5 train-loss:1.081461 valid-loss:1.229342  train-accuracy:0.808511 valid-accuracy:0.771739
    Iter:6 train-loss:1.045305 valid-loss:1.204601  train-accuracy:0.826033 valid-accuracy:0.728261
    Iter:16 train-loss:0.794952 valid-loss:1.058568  train-accuracy:0.914894 valid-accuracy:0.771739
    Iter:26 train-loss:0.646146 valid-loss:1.021539  train-accuracy:0.926158 valid-accuracy:0.793478
    Iter:36 train-loss:0.558627 valid-loss:1.023663  train-accuracy:0.929912 valid-accuracy:0.771739
    Iter:46 train-loss:0.493899 valid-loss:1.025164  train-accuracy:0.931164 valid-accuracy:0.760870
    Iter:56 train-loss:0.451528 valid-loss:1.032880  train-accuracy:0.938673 valid-accuracy:0.771739
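
Although `num_trees=2000` was requested, the summary reports a final model of 33 trees with a validation loss of 1.01542. This is consistent with the model being cut back to the iteration with the best validation loss, which is an interpretation of the log rather than something the original text states. The sketch below parses a few of the printed log entries and finds the lowest validation loss among them. `parse_log` is a hypothetical helper, and because the log only prints every tenth iteration after the sixth, iteration 33 itself does not appear.

```python
import re

# A subset of the training log entries printed by model.summary() above.
log_text = """
Iter:6 train-loss:1.045305 valid-loss:1.204601  train-accuracy:0.826033 valid-accuracy:0.728261
Iter:16 train-loss:0.794952 valid-loss:1.058568  train-accuracy:0.914894 valid-accuracy:0.771739
Iter:26 train-loss:0.646146 valid-loss:1.021539  train-accuracy:0.926158 valid-accuracy:0.793478
Iter:36 train-loss:0.558627 valid-loss:1.023663  train-accuracy:0.929912 valid-accuracy:0.771739
Iter:46 train-loss:0.493899 valid-loss:1.025164  train-accuracy:0.931164 valid-accuracy:0.760870
Iter:56 train-loss:0.451528 valid-loss:1.032880  train-accuracy:0.938673 valid-accuracy:0.771739
"""

PATTERN = re.compile(r"Iter:(\d+) train-loss:([\d.]+) valid-loss:([\d.]+)")


def parse_log(text):
    # Hypothetical helper: returns (iteration, train_loss, valid_loss) tuples.
    return [(int(i), float(t), float(v)) for i, t, v in PATTERN.findall(text)]


entries = parse_log(log_text)

# Pick the printed entry with the smallest validation loss.
best_iter, _, best_valid = min(entries, key=lambda e: e[2])
print(f"Lowest printed validation loss: {best_valid} at iteration {best_iter}")
```

The training loss keeps falling through iteration 56 while the validation loss bottoms out somewhere between iterations 26 and 36, the usual sign that further trees would only overfit.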

7. Make predictions

# Convert the model's predictions to the Kaggle submission format
# model: the trained model
# threshold: probability cutoff, 0.5 by default
def prediction_to_kaggle_format(model, threshold=0.5):
    # Predict the survival probability for every row of serving_ds
    proba_survive = model.predict(serving_ds, verbose=0)[:, 0]
    # Build a DataFrame with PassengerId (taken from serving_df) and
    # Survived (1 if the probability reaches the threshold, otherwise 0)
    return pd.DataFrame({
        "PassengerId": serving_df["PassengerId"],
        "Survived": (proba_survive >= threshold).astype(int)
    })

# Write the Kaggle predictions to the submission file
# kaggle_predictions: DataFrame with the predictions in Kaggle format
def make_submission(kaggle_predictions):
    path = "/kaggle/working/submission.csv"
    # Save as CSV without the index column
    kaggle_predictions.to_csv(path, index=False)
    print(f"Submission exported to {path}")

# Convert the model's predictions and export the submission file
kaggle_predictions = prediction_to_kaggle_format(model)
make_submission(kaggle_predictions)

# Show the first lines of the submission file with the shell command head
!head /kaggle/working/submission.csv

Submission exported to /kaggle/working/submission.csv
PassengerId,Survived
892,0
893,0
894,0
895,0
896,0
897,0
898,0
899,0
900,1
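
The `threshold` argument controls how the predicted survival probability is turned into the 0/1 label Kaggle expects. A stand-alone sketch with made-up probabilities (not model output) shows the effect of moving it. `to_labels` is a hypothetical helper that mirrors the comparison in `prediction_to_kaggle_format`.

```python
def to_labels(probabilities, threshold=0.5):
    # Same rule as in prediction_to_kaggle_format: 1 if p >= threshold.
    return [int(p >= threshold) for p in probabilities]


# Made-up survival probabilities for five passengers.
made_up_probabilities = [0.08, 0.41, 0.50, 0.63, 0.92]

default_labels = to_labels(made_up_probabilities)
strict_labels = to_labels(made_up_probabilities, threshold=0.7)

print(default_labels)
print(strict_labels)
```

Because the comparison is `>=`, a probability of exactly 0.5 counts as survived under the default threshold.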

8. Train the model with hyperparameter tuning

Hyperparameter tuning is enabled by specifying the model's `tuner` constructor argument. The tuner object holds the entire tuner configuration (search space, optimizer, trials, and objective).

# Create a random-search tuner that runs 1000 trials
tuner = tfdf.tuner.RandomSearch(num_trials=1000)
# Candidate values for "min_examples" and "categorical_algorithm"
tuner.choice("min_examples", [2, 5, 7, 10])
tuner.choice("categorical_algorithm", ["CART", "RANDOM"])

# "max_depth" is only searched when "growing_strategy" is "LOCAL"
local_search_space = tuner.choice("growing_strategy", ["LOCAL"])
local_search_space.choice("max_depth", [3, 4, 5, 6, 8])

# merge=True adds "BEST_FIRST_GLOBAL" to the "growing_strategy" values declared above;
# "max_num_nodes" is only searched under this strategy
global_search_space = tuner.choice("growing_strategy", ["BEST_FIRST_GLOBAL"], merge=True)
global_search_space.choice("max_num_nodes", [16, 32, 64, 128, 256])

# tuner.choice("use_hessian_gain", [True, False])
# Candidate values for "shrinkage" and "num_candidate_attributes_ratio"
tuner.choice("shrinkage", [0.02, 0.05, 0.10, 0.15])
tuner.choice("num_candidate_attributes_ratio", [0.2, 0.5, 0.9, 1.0])

# The sparse oblique options are only searched when "split_axis" is "SPARSE_OBLIQUE"
oblique_space = tuner.choice("split_axis", ["SPARSE_OBLIQUE"], merge=True)
oblique_space.choice("sparse_oblique_normalization", ["NONE", "STANDARD_DEVIATION", "MIN_MAX"])
oblique_space.choice("sparse_oblique_weights", ["BINARY", "CONTINUOUS"])
oblique_space.choice("sparse_oblique_num_projections_exponent", [1.0, 1.5])

# Create a gradient boosted trees model that uses the tuner
tuned_model = tfdf.keras.GradientBoostedTreesModel(tuner=tuner)
# Train the tuned model; verbose=0 hides the training logs
tuned_model.fit(train_ds, verbose=0)

# Get the tuned model's self-evaluation and print its accuracy and loss
tuned_self_evaluation = tuned_model.make_inspector().evaluation()
print(f"Accuracy: {tuned_self_evaluation.accuracy} Loss:{tuned_self_evaluation.loss}")

Use /tmp/tmpf3gqf8yh as temporary training directory
[INFO 2023-05-18T10:33:20.758894639+00:00 kernel.cc:1214] Loading model from path /tmp/tmpf3gqf8yh/model/ with prefix 1800e47d98cd4401
[INFO 2023-05-18T10:33:20.773899277+00:00 decision_forest.cc:661] Model loaded with 19 root(s), 589 node(s), and 12 input feature(s).
[INFO 2023-05-18T10:33:20.773949099+00:00 abstract_model.cc:1311] Engine "GradientBoostedTreesGeneric" built
[INFO 2023-05-18T10:33:20.773977709+00:00 kernel.cc:1046] Use fast generic engine
Accuracy: 0.9178082346916199 Loss:0.6503586769104004

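The last line of the cell output above shows that the accuracy is higher than with the default parameters and with the manually set parameters.

That is the main idea behind hyperparameter tuning.

For more information, see the tutorial: Automated hyper-parameter tuning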
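
To give a feel for what the tuner samples from, here is a plain-Python sketch of random search over the same conditional search space. It does not use or reproduce `tfdf.tuner`: `SEARCH_SPACE`, `CONDITIONAL_SPACE`, `sample_trial`, and `count_combinations` are hypothetical names, and the structure (child options that only apply under a given parent value) is an assumption modeled on the tuner code in this section.

```python
import random

# Unconditional hyperparameters and their candidate values.
SEARCH_SPACE = {
    "min_examples": [2, 5, 7, 10],
    "categorical_algorithm": ["CART", "RANDOM"],
    "shrinkage": [0.02, 0.05, 0.10, 0.15],
    "num_candidate_attributes_ratio": [0.2, 0.5, 0.9, 1.0],
}

# Conditional hyperparameters: parent -> {parent value -> child options}.
CONDITIONAL_SPACE = {
    "growing_strategy": {
        "LOCAL": {"max_depth": [3, 4, 5, 6, 8]},
        "BEST_FIRST_GLOBAL": {"max_num_nodes": [16, 32, 64, 128, 256]},
    },
    "split_axis": {
        "SPARSE_OBLIQUE": {
            "sparse_oblique_normalization": ["NONE", "STANDARD_DEVIATION", "MIN_MAX"],
            "sparse_oblique_weights": ["BINARY", "CONTINUOUS"],
            "sparse_oblique_num_projections_exponent": [1.0, 1.5],
        },
    },
}


def sample_trial(rng):
    # Draw one configuration; child options depend on the parent's value.
    trial = {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}
    for parent, branches in CONDITIONAL_SPACE.items():
        value = rng.choice(sorted(branches))
        trial[parent] = value
        for child, options in branches[value].items():
            trial[child] = rng.choice(options)
    return trial


def count_combinations():
    # Number of distinct configurations in the space defined above.
    total = 1
    for values in SEARCH_SPACE.values():
        total *= len(values)
    for branches in CONDITIONAL_SPACE.values():
        branch_total = 0
        for children in branches.values():
            size = 1
            for options in children.values():
                size *= len(options)
            branch_total += size
        total *= branch_total
    return total


rng = random.Random(0)
trials = [sample_trial(rng) for _ in range(5)]
num_combinations = count_combinations()
print(f"Distinct configurations: {num_combinations}")
```

Under these assumptions the space holds 15,360 distinct configurations, so 1000 random trials explore only a fraction of it.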

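9. Build an ensemble

Here you'll create 100 models with different seeds and combine their results.

This approach removes some of the randomness involved in creating ML models.

The `honest` parameter is used when creating the GBTs. With it, different training examples are used to infer the tree structure and the leaf values. This regularization technique trades examples for bias estimates.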
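
The combining step in this section is a plain average of the per-model survival probabilities. The sketch below shows that step in isolation, with made-up numbers standing in for the outputs of three models trained with different seeds. `average_predictions` is a hypothetical helper, not part of TF-DF.

```python
def average_predictions(per_model_probabilities):
    # Element-wise mean across models; every model scores the same passengers.
    num_models = len(per_model_probabilities)
    return [sum(column) / num_models for column in zip(*per_model_probabilities)]


# Made-up probabilities for four passengers from three hypothetical models.
per_model = [
    [0.20, 0.55, 0.90, 0.45],
    [0.10, 0.40, 0.80, 0.60],
    [0.30, 0.70, 0.85, 0.48],
]

averaged = average_predictions(per_model)

# Apply the same 0.5 cutoff used for the Kaggle submission.
labels = [int(p >= 0.5) for p in averaged]
print(labels)
```

The fourth passenger shows why averaging differs from a majority vote: two of the three models put the probability below 0.5, yet the average lands at about 0.51 and the passenger is labeled as a survivor.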

predictions = None  # Running sum of the predictions
num_predictions = 0  # Number of models added so far

for i in range(100):
    print(f"i:{i}")
    # Possible models: GradientBoostedTreesModel or RandomForestModel
    model = tfdf.keras.GradientBoostedTreesModel(
        verbose=0,  # Very few logs
        features=[tfdf.keras.FeatureUsage(name=n) for n in input_features],  # Features the model uses
        exclude_non_specified_features=True,  # Only use the features listed in "features"
        random_seed=i,  # A different seed for each model
        honest=True,  # Use honest trees
    )
    model.fit(train_ds)

    # Predict on the serving dataset and keep the first column
    sub_predictions = model.predict(serving_ds, verbose=0)[:, 0]
    if predictions is None:
        predictions = sub_predictions
    else:
        predictions += sub_predictions
    num_predictions += 1

# Divide the sum by the number of models to get the average prediction
predictions /= num_predictions

kaggle_predictions = pd.DataFrame({
    "PassengerId": serving_df["PassengerId"],
    "Survived": (predictions >= 0.5).astype(int)  # 1 if the average probability is at least 0.5
})

# Export the submission file
make_submission(kaggle_predictions)

i:0
[INFO 2023-05-18T10:33:21.948337712+00:00 kernel.cc:1214] Loading model from path /tmp/tmplm3k4_lm/model/ with prefix c4f440bf7ff942e4
[INFO 2023-05-18T10:35:02.500648398+00:00 kernel.cc:1046] Use fast generic enginei:65[INFO 2023-05-18T10:35:03.746983159+00:00 kernel.cc:1214] Loading model from path /tmp/tmpuad51ad_/model/ with prefix 62ad43498a8945ee
[INFO 2023-05-18T10:35:03.752355273+00:00 kernel.cc:1046] Use fast generic enginei:66[INFO 2023-05-18T10:35:05.002186741+00:00 kernel.cc:1214] Loading model from path /tmp/tmpmr6hhyib/model/ with prefix e13e4dafafe240a9
[INFO 2023-05-18T10:35:05.009253052+00:00 kernel.cc:1046] Use fast generic enginei:67[INFO 2023-05-18T10:35:06.639802377+00:00 kernel.cc:1214] Loading model from path /tmp/tmp48hjyy5y/model/ with prefix b57f917606ba4450
[INFO 2023-05-18T10:35:06.659970275+00:00 kernel.cc:1046] Use fast generic enginei:68[INFO 2023-05-18T10:35:08.082001856+00:00 kernel.cc:1214] Loading model from path /tmp/tmpgpkk86q4/model/ with prefix 0585d52463174c5c
[INFO 2023-05-18T10:35:08.095272974+00:00 kernel.cc:1046] Use fast generic enginei:69[INFO 2023-05-18T10:35:09.331136571+00:00 kernel.cc:1214] Loading model from path /tmp/tmph_frny4j/model/ with prefix fa5a095fe4904682
[INFO 2023-05-18T10:35:09.338199435+00:00 abstract_model.cc:1311] Engine "GradientBoostedTreesQuickScorerExtended" built
[INFO 2023-05-18T10:35:09.338254677+00:00 kernel.cc:1046] Use fast generic enginei:70[INFO 2023-05-18T10:35:10.640887616+00:00 kernel.cc:1214] Loading model from path /tmp/tmp4wjgf2r2/model/ with prefix 80c4c80589414e13
[INFO 2023-05-18T10:35:10.650636036+00:00 kernel.cc:1046] Use fast generic enginei:71[INFO 2023-05-18T10:35:12.430099993+00:00 kernel.cc:1214] Loading model from path /tmp/tmptjlatbqq/model/ with prefix 26007776bcc648d7
[INFO 2023-05-18T10:35:12.438293634+00:00 kernel.cc:1046] Use fast generic enginei:72[INFO 2023-05-18T10:35:14.019537623+00:00 kernel.cc:1214] Loading model from path /tmp/tmpu4egs0bv/model/ with prefix 4a67d5be0d72468d
[INFO 2023-05-18T10:35:14.036505305+00:00 kernel.cc:1046] Use fast generic enginei:73[INFO 2023-05-18T10:35:15.512613873+00:00 kernel.cc:1214] Loading model from path /tmp/tmpfqjqzbub/model/ with prefix aca012bed5e74739
[INFO 2023-05-18T10:35:15.520282978+00:00 kernel.cc:1046] Use fast generic enginei:74[INFO 2023-05-18T10:35:16.861640206+00:00 kernel.cc:1214] Loading model from path /tmp/tmpj9r8iw0a/model/ with prefix 74b2a1783b9e46a6
[INFO 2023-05-18T10:35:16.874599194+00:00 kernel.cc:1046] Use fast generic enginei:75[INFO 2023-05-18T10:35:18.122098866+00:00 kernel.cc:1214] Loading model from path /tmp/tmpruig1t4u/model/ with prefix d7ab0b72252a4c10
[INFO 2023-05-18T10:35:18.130775546+00:00 kernel.cc:1046] Use fast generic enginei:76[INFO 2023-05-18T10:35:19.822243439+00:00 kernel.cc:1214] Loading model from path /tmp/tmpoqsf9fbn/model/ with prefix eb5803ca471a4f5a
[INFO 2023-05-18T10:35:19.826743805+00:00 abstract_model.cc:1311] Engine "GradientBoostedTreesQuickScorerExtended" built
[INFO 2023-05-18T10:35:19.82681821+00:00 kernel.cc:1046] Use fast generic enginei:77[INFO 2023-05-18T10:35:20.988911754+00:00 kernel.cc:1214] Loading model from path /tmp/tmpqi5x3v5z/model/ with prefix 92ba1de56c4b4bea
[INFO 2023-05-18T10:35:20.994464531+00:00 kernel.cc:1046] Use fast generic enginei:78[INFO 2023-05-18T10:35:22.200580278+00:00 kernel.cc:1214] Loading model from path /tmp/tmpcun3o_n6/model/ with prefix 6b72ab70b8af48d5
[INFO 2023-05-18T10:35:22.207684575+00:00 kernel.cc:1046] Use fast generic enginei:79[INFO 2023-05-18T10:35:23.408222+00:00 kernel.cc:1214] Loading model from path /tmp/tmpl172n50i/model/ with prefix aacd198ef20c48ab
[INFO 2023-05-18T10:35:23.416126968+00:00 kernel.cc:1046] Use fast generic enginei:80[INFO 2023-05-18T10:35:24.750680123+00:00 kernel.cc:1214] Loading model from path /tmp/tmp65w8y5ov/model/ with prefix 11dca72a6a674b19
[INFO 2023-05-18T10:35:24.761420982+00:00 kernel.cc:1046] Use fast generic enginei:81[INFO 2023-05-18T10:35:26.60253295+00:00 kernel.cc:1214] Loading model from path /tmp/tmpk2brm2qt/model/ with prefix ef8df47ffbb94864
[INFO 2023-05-18T10:35:26.614219574+00:00 kernel.cc:1046] Use fast generic enginei:82[INFO 2023-05-18T10:35:28.331782151+00:00 kernel.cc:1214] Loading model from path /tmp/tmposmdbjqj/model/ with prefix b5fa31b36b9346c6
[INFO 2023-05-18T10:35:28.342305552+00:00 kernel.cc:1046] Use fast generic enginei:83[INFO 2023-05-18T10:35:30.126093361+00:00 kernel.cc:1214] Loading model from path /tmp/tmpjp8omn7j/model/ with prefix 1911d6a9dc5245b5
[INFO 2023-05-18T10:35:30.135542215+00:00 abstract_model.cc:1311] Engine "GradientBoostedTreesQuickScorerExtended" built
[INFO 2023-05-18T10:35:30.135591968+00:00 kernel.cc:1046] Use fast generic enginei:84[INFO 2023-05-18T10:35:32.510333746+00:00 kernel.cc:1214] Loading model from path /tmp/tmp1tja5r_e/model/ with prefix aa3f4c78bb394574
[INFO 2023-05-18T10:35:32.531350922+00:00 kernel.cc:1046] Use fast generic enginei:85[INFO 2023-05-18T10:35:33.844042681+00:00 kernel.cc:1214] Loading model from path /tmp/tmp0r236t3e/model/ with prefix 19ff9f95ddc2438e
[INFO 2023-05-18T10:35:33.851596791+00:00 kernel.cc:1046] Use fast generic enginei:86[INFO 2023-05-18T10:35:35.407022201+00:00 kernel.cc:1214] Loading model from path /tmp/tmpeuc6bj_3/model/ with prefix 6dd65111acdd4630
[INFO 2023-05-18T10:35:35.424928648+00:00 kernel.cc:1046] Use fast generic enginei:87[INFO 2023-05-18T10:35:37.040148321+00:00 kernel.cc:1214] Loading model from path /tmp/tmp358yev21/model/ with prefix da78df557f754986
[INFO 2023-05-18T10:35:37.060583254+00:00 kernel.cc:1046] Use fast generic enginei:88[INFO 2023-05-18T10:35:38.427411651+00:00 kernel.cc:1214] Loading model from path /tmp/tmpzakqb0xp/model/ with prefix ad35a17e9b86465b
[INFO 2023-05-18T10:35:38.440438018+00:00 kernel.cc:1046] Use fast generic enginei:89[INFO 2023-05-18T10:35:39.638133755+00:00 kernel.cc:1214] Loading model from path /tmp/tmpildtvtld/model/ with prefix 0ac7f95660244655
[INFO 2023-05-18T10:35:39.644023175+00:00 kernel.cc:1046] Use fast generic enginei:90[INFO 2023-05-18T10:35:41.008617146+00:00 kernel.cc:1214] Loading model from path /tmp/tmpkrivrsps/model/ with prefix 40280f0ab1094407
[INFO 2023-05-18T10:35:41.019743959+00:00 abstract_model.cc:1311] Engine "GradientBoostedTreesQuickScorerExtended" built
[INFO 2023-05-18T10:35:41.019791788+00:00 kernel.cc:1046] Use fast generic enginei:91[INFO 2023-05-18T10:35:42.252408477+00:00 kernel.cc:1214] Loading model from path /tmp/tmptqwp8y1g/model/ with prefix b8cf07a7fb3c4ca6
[INFO 2023-05-18T10:35:42.259792887+00:00 kernel.cc:1046] Use fast generic enginei:92[INFO 2023-05-18T10:35:43.792889728+00:00 kernel.cc:1214] Loading model from path /tmp/tmpj_4udme_/model/ with prefix d28fe4cbc6944242
[INFO 2023-05-18T10:35:43.811495786+00:00 kernel.cc:1046] Use fast generic enginei:93[INFO 2023-05-18T10:35:45.144455819+00:00 kernel.cc:1214] Loading model from path /tmp/tmpgvv3j4l_/model/ with prefix cd5cbe19609841d7
[INFO 2023-05-18T10:35:45.155137979+00:00 kernel.cc:1046] Use fast generic enginei:94[INFO 2023-05-18T10:35:46.376542268+00:00 kernel.cc:1214] Loading model from path /tmp/tmpa6bn46wq/model/ with prefix 7453510caac74087
[INFO 2023-05-18T10:35:46.383601011+00:00 kernel.cc:1046] Use fast generic enginei:95[INFO 2023-05-18T10:35:47.676212833+00:00 kernel.cc:1214] Loading model from path /tmp/tmp7zbxz1bs/model/ with prefix c927c5dbf31844b1
[INFO 2023-05-18T10:35:47.685251373+00:00 kernel.cc:1046] Use fast generic enginei:96[INFO 2023-05-18T10:35:48.987414626+00:00 kernel.cc:1214] Loading model from path /tmp/tmplvx1w0aj/model/ with prefix 73702a762b25465f
[INFO 2023-05-18T10:35:48.998273203+00:00 kernel.cc:1046] Use fast generic enginei:97[INFO 2023-05-18T10:35:50.151145813+00:00 kernel.cc:1214] Loading model from path /tmp/tmp_j62smlb/model/ with prefix 2d637fb0572e4544
[INFO 2023-05-18T10:35:50.156863027+00:00 kernel.cc:1046] Use fast generic enginei:98[INFO 2023-05-18T10:35:51.415486908+00:00 kernel.cc:1214] Loading model from path /tmp/tmp7aug1mjr/model/ with prefix a825629f8cc849b0
[INFO 2023-05-18T10:35:51.423800281+00:00 abstract_model.cc:1311] Engine "GradientBoostedTreesQuickScorerExtended" built
[INFO 2023-05-18T10:35:51.423847083+00:00 kernel.cc:1046] Use fast generic enginei:99[INFO 2023-05-18T10:35:52.90711379+00:00 kernel.cc:1214] Loading model from path /tmp/tmpf19kt4x7/model/ with prefix b150b8c0efe248fa
[INFO 2023-05-18T10:35:52.922135177+00:00 kernel.cc:1046] Use fast generic engineSubmission exported to /kaggle/working/submission.csv
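
The run ends by exporting the submission to /kaggle/working/submission.csv. Before uploading, it can be worth sanity-checking that file. The following is a minimal sketch, not part of the original notebook: the helper name check_submission is hypothetical, and it assumes the usual Titanic competition format (a PassengerId,Survived header, 418 test rows, and 0/1 labels).

```python
import csv
import io


def check_submission(lines, expected_rows=418):
    """Return a list of problems found in a Titanic-style submission CSV.

    `lines` is any iterable of text lines, for example an open file.
    The expected header and row count are assumptions about the
    competition format, not values taken from the notebook.
    """
    reader = csv.DictReader(lines)
    if reader.fieldnames != ["PassengerId", "Survived"]:
        return [f"unexpected header: {reader.fieldnames}"]

    rows = list(reader)
    problems = []
    if len(rows) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(rows)}")

    ids = [row["PassengerId"] for row in rows]
    if len(set(ids)) != len(ids):
        problems.append("duplicate PassengerId values")

    if any(row["Survived"] not in ("0", "1") for row in rows):
        problems.append("Survived must be 0 or 1")
    return problems


# Small in-memory demonstration with two rows instead of the real file.
sample = io.StringIO("PassengerId,Survived\n892,0\n893,1\n")
print(check_submission(sample, expected_rows=2))  # prints []
```

To check the real export, open the file with open("/kaggle/working/submission.csv", newline="") and pass the file object to check_submission; an empty list means none of these checks failed.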


http://www.chinasem.cn/article/530836
