Applying Decision Trees, SVM, and Random Forests to Credit Rating

2024-01-10 21:50


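The dataset used here consists of a set of applicant features plus a credit-rating label (RESPONSE), so this is a binary classification problem.

This article compares how a decision tree, an SVM, and a random forest perform on this dataset.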


Import the data and check for missing values

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
data = pd.read_excel('./GermanCredit.xls', sheet_name='Data')  # read the 'Data' sheet of the xls file
data.head()
num_features = ['DURATION','AMOUNT','INSTALL_RATE','AGE','NUM_CREDITS','NUM_DEPENDENTS']
cat_features = data.columns.drop(num_features + ['OBS#'])
data.isnull().sum()
# No column has missing values
OBS#                0
CHK_ACCT            0
DURATION            0
HISTORY             0
NEW_CAR             0
USED_CAR            0
FURNITURE           0
RADIO/TV            0
EDUCATION           0
RETRAINING          0
AMOUNT              0
SAV_ACCT            0
EMPLOYMENT          0
INSTALL_RATE        0
MALE_DIV            0
MALE_SINGLE         0
MALE_MAR_or_WID     0
CO-APPLICANT        0
GUARANTOR           0
PRESENT_RESIDENT    0
REAL_ESTATE         0
PROP_UNKN_NONE      0
AGE                 0
OTHER_INSTALL       0
RENT                0
OWN_RES             0
NUM_CREDITS         0
JOB                 0
NUM_DEPENDENTS      0
TELEPHONE           0
FOREIGN             0
RESPONSE            0
dtype: int64

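Discretizing the continuous features

DURATION is the loan term in months and ranges from 4 to 72. Its distribution looks right-skewed (most loans are short, with a long tail toward longer terms); a histogram makes this clearer.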

plt.hist(data['DURATION'])


(array([171., 262., 337.,  57.,  86.,  17.,  54.,   2.,  13.,   1.]),array([ 4. , 10.8, 17.6, 24.4, 31.2, 38. , 44.8, 51.6, 58.4, 65.2, 72. ]),<a list of 10 Patch objects>)
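
The direction of the skew can also be checked numerically instead of by eye. The following is a minimal sketch using pandas' sample skewness; the short series is a made-up stand-in for data['DURATION'], not the real column.

```python
import pandas as pd

# Hypothetical stand-in for data['DURATION']: many short loans, a few long ones.
duration = pd.Series([6, 12, 12, 12, 18, 18, 24, 24, 36, 48, 60, 72])

skewness = duration.skew()  # sample skewness; a value > 0 means a longer right tail
direction = "right-skewed" if skewness > 0 else "left-skewed or symmetric"
print(f"skewness = {skewness:.2f} ({direction})")
```

On the real data the same call would be data['DURATION'].skew().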

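DURATION is binned with fixed cut points (20, 40, and 60 months, rather than quantiles) so that it can be treated as a categorical feature:

x <= 20: dua_rank = 1
20 < x <= 40: dua_rank = 2
40 < x < 60: dua_rank = 3
x >= 60: dua_rank = 4
The result is a new feature, dua_rank, which is later added to new_data. sklearn.preprocessing.KBinsDiscretizer could also be used for the binning.
dua_rank = []
duration = data['DURATION']
for i in duration:
    if i <= 20:
        dua_rank.append(1)
    elif i <= 40:
        dua_rank.append(2)
    elif i < 60:
        dua_rank.append(3)
    else:
        dua_rank.append(4)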
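
The same fixed-edge binning can be written without an explicit loop. This is a sketch using pandas.cut, assuming DURATION holds whole months (so that "< 60" and "<= 59" are the same test); the sample values are made up to exercise every bin edge.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data['DURATION'] (whole months).
duration = pd.Series([4, 20, 21, 40, 41, 59, 60, 72])

# Right-closed bins: (-inf, 20], (20, 40], (40, 59], (59, inf).
# For whole-month values, "<= 59" is the same test as "< 60" in the loop version.
dua_rank = pd.cut(
    duration,
    bins=[-np.inf, 20, 40, 59, np.inf],
    labels=[1, 2, 3, 4],
).astype(int)

print(dua_rank.tolist())
```

If DURATION could take fractional values, the third edge would need different handling, because pandas.cut closes every bin on the same side.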

Most durations fall into ranks 1 and 2:

plt.hist(dua_rank,bins = 4)
(array([554., 365.,  67.,  14.]),array([1.  , 1.75, 2.5 , 3.25, 4.  ]),<a list of 4 Patch objects>)


new_data = data.copy()
new_data['dua_rank'] = dua_rank
new_data.head()
   OBS#  CHK_ACCT  DURATION  HISTORY  NEW_CAR  USED_CAR  FURNITURE  RADIO/TV  EDUCATION  RETRAINING  ...  OTHER_INSTALL  RENT  OWN_RES  NUM_CREDITS  JOB  NUM_DEPENDENTS  TELEPHONE  FOREIGN  RESPONSE  dua_rank
0     1         0         6        4        0         0          0         1          0           0  ...              0     0        1            2    2               1          1        0         1         1
1     2         1        48        2        0         0          0         1          0           0  ...              0     0        1            1    2               1          0        0         0         3
2     3         3        12        4        0         0          0         0          1           0  ...              0     0        1            1    1               2          0        0         1         1
3     4         0        42        2        0         0          1         0          0           0  ...              0     0        0            1    2               2          0        0         1         3
4     5         0        24        3        1         0          0         0          0           0  ...              0     0        0            2    2               2          0        0         0         2

5 rows × 33 columns

plt.hist(data['AMOUNT'])
(array([445., 293.,  97.,  80.,  38.,  19.,  14.,   8.,   5.,   1.]),array([  250. ,  2067.4,  3884.8,  5702.2,  7519.6,  9337. , 11154.4,12971.8, 14789.2, 16606.6, 18424. ]),<a list of 10 Patch objects>)


AMOUNT is likewise split into ranks 1 to 10, using its deciles as the cut points.

sklearn.preprocessing.KBinsDiscretizer could again be used for this binning.

percent = np.percentile(data['AMOUNT'], [i * 10 for i in range(1, 10)])  # the nine decile cut points
# Rank = 1 + number of cut points <= value, which is what an if/elif chain of
# strict "<" comparisons against percent[0] ... percent[8] produces.
amount_rank = (np.searchsorted(percent, data['AMOUNT'], side='right') + 1).tolist()
new_data['amount_rank'] = amount_rank
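
For comparison, here is a sketch of the KBinsDiscretizer alternative mentioned above. The log-normal sample is a made-up stand-in for data['AMOUNT'], and the exact bin edges sklearn picks may differ slightly from np.percentile depending on the sklearn version.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# Hypothetical stand-in for data['AMOUNT']: 1000 right-skewed positive values.
rng = np.random.default_rng(0)
amount = rng.lognormal(mean=8.0, sigma=0.8, size=1000).reshape(-1, 1)  # sklearn expects a 2-D array

# strategy='quantile' puts roughly the same number of rows in each bin;
# encode='ordinal' returns bin indices 0..9, so add 1 to get ranks 1..10.
binner = KBinsDiscretizer(n_bins=10, encode='ordinal', strategy='quantile')
amount_rank = binner.fit_transform(amount).ravel().astype(int) + 1

counts = np.bincount(amount_rank)[1:]  # number of rows per rank
print(counts)
```

An advantage of the fitted binner is that the same edges can be reapplied to new rows with binner.transform.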
data['INSTALL_RATE'].value_counts()
4    476
2    231
3    157
1    136
Name: INSTALL_RATE, dtype: int64
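INSTALL_RATE, the installment rate as a percentage of disposable income, takes only four values and can be treated as a discrete variable directly, so it is left unprocessed.
AGE is discretized in the same way as the features above, here with its quartiles as cut points.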
data['AGE'].describe()
count    1000.000000
mean       35.546000
std        11.375469
min        19.000000
25%        27.000000
50%        33.000000
75%        42.000000
max        75.000000
Name: AGE, dtype: float64
percent = np.percentile(data['AGE'], [25, 50, 75])  # quartile cut points
age_rank = []
for i in data['AGE']:
    if i <= percent[0]:
        age_rank.append(1)
    elif i <= percent[1]:
        age_rank.append(2)
    elif i <= percent[2]:
        age_rank.append(3)
    else:
        age_rank.append(4)
new_data['age_rank'] = age_rank
new_data.head()
   OBS#  CHK_ACCT  DURATION  HISTORY  NEW_CAR  USED_CAR  FURNITURE  RADIO/TV  EDUCATION  RETRAINING  ...  OWN_RES  NUM_CREDITS  JOB  NUM_DEPENDENTS  TELEPHONE  FOREIGN  RESPONSE  dua_rank  amount_rank  age_rank
0     1         0         6        4        0         0          0         1          0           0  ...        1            2    2               1          1        0         1         1            2         4
1     2         1        48        2        0         0          0         1          0           0  ...        1            1    2               1          0        0         0         3            9         1
2     3         3        12        4        0         0          0         0          1           0  ...        1            1    1               2          0        0         1         1            5         4
3     4         0        42        2        0         0          1         0          0           0  ...        0            1    2               2          0        0         1         3           10         4
4     5         0        24        3        1         0          0         0          0           0  ...        0            2    2               2          0        0         0         2            9         4

5 rows × 35 columns

data['NUM_CREDITS'].describe()
count    1000.000000
mean        1.407000
std         0.577654
min         1.000000
25%         1.000000
50%         1.000000
75%         2.000000
max         4.000000
Name: NUM_CREDITS, dtype: float64
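NUM_CREDITS is the number of existing credits the applicant holds at this bank. It only ranges from 1 to 4, so it can also be left unprocessed.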
plt.hist(data['NUM_CREDITS'],bins = 4)
(array([633., 333.,  28.,   6.]),array([1.  , 1.75, 2.5 , 3.25, 4.  ]),<a list of 4 Patch objects>)


data['NUM_DEPENDENTS'].describe()
count    1000.000000
mean        1.155000
std         0.362086
min         1.000000
25%         1.000000
50%         1.000000
75%         1.000000
max         2.000000
Name: NUM_DEPENDENTS, dtype: float64
NUM_DEPENDENTS takes only two values, so it needs no processing either.
plt.hist(data['NUM_DEPENDENTS'],bins = 2)
(array([845., 155.]), array([1. , 1.5, 2. ]), <a list of 2 Patch objects>)


Drop the numeric features that were just binned; the rank features replace them.
new_data.drop(['AGE','AMOUNT','DURATION'], axis = 1,inplace=True)
Features whose codes carry no ordering are one-hot encoded, so that the numeric codes are not read as magnitudes.

HISTORY and JOB are therefore one-hot encoded.

history = data['HISTORY']
new_history = pd.get_dummies(history,prefix='histor')
job = data['JOB']
new_job = pd.get_dummies(job, prefix= 'job')
new_data = pd.concat([new_data, new_history, new_job], axis = 1)
new_data.drop(['HISTORY', 'JOB'], axis = 1,inplace=True)
new_data
     OBS#  CHK_ACCT  NEW_CAR  USED_CAR  FURNITURE  RADIO/TV  EDUCATION  RETRAINING  SAV_ACCT  EMPLOYMENT  ...  age_rank  histor_0  histor_1  histor_2  histor_3  histor_4  job_0  job_1  job_2  job_3
  0     1         0        0         0          0         1          0           0         4           4  ...         4         0         0         0         0         1      0      0      1      0
  1     2         1        0         0          0         1          0           0         0           2  ...         1         0         0         1         0         0      0      0      1      0
  2     3         3        0         0          0         0          1           0         0           3  ...         4         0         0         0         0         1      0      1      0      0
  3     4         0        0         0          1         0          0           0         0           3  ...         4         0         0         1         0         0      0      0      1      0
  4     5         0        1         0          0         0          0           0         0           2  ...         4         0         0         0         1         0      0      0      1      0
 ..   ...       ...      ...       ...        ...       ...        ...         ...       ...         ...  ...       ...       ...       ...       ...       ...       ...    ...    ...    ...    ...
995   996         3        0         0          1         0          0           0         0           3  ...         2         0         0         1         0         0      0      1      0      0
996   997         0        0         1          0         0          0           0         0           2  ...         3         0         0         1         0         0      0      0      0      1
997   998         3        0         0          0         1          0           0         0           4  ...         3         0         0         1         0         0      0      0      1      0
998   999         0        0         0          0         1          0           0         1           2  ...         1         0         0         1         0         0      0      0      1      0
999  1000         1        0         1          0         0          0           0         1           0  ...         1         0         0         0         0         1      0      0      1      0

1000 rows × 39 columns
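
The encode, concat, and drop steps above can also be collapsed into a single pandas.get_dummies call. This is a sketch on a made-up four-row frame, not the real data; the prefixes reuse the ones from the code above.

```python
import pandas as pd

# Hypothetical toy frame; the codes mimic unordered categories such as HISTORY and JOB.
toy = pd.DataFrame({
    'HISTORY': [4, 2, 4, 3],
    'JOB': [2, 2, 1, 3],
    'RESPONSE': [1, 0, 1, 1],
})

# columns=... encodes the listed columns and drops the originals in one call.
encoded = pd.get_dummies(toy, columns=['HISTORY', 'JOB'], prefix=['histor', 'job'], dtype=int)
print(encoded.columns.tolist())
```

Note that get_dummies only creates a column for codes that actually occur, so the toy frame has no histor_0 or job_0 column.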

# Note: the original cell evaluated `new_data.info` without parentheses, which only echoes the
# bound method together with a second copy of the 1000 rows x 39 columns frame shown above.
# Calling the method prints the per-column summary instead.
new_data.info()
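from sklearn import tree
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
import graphviz
from sklearn.svm import SVC
# Features and label from the preprocessed frame. Note that x still contains the row id OBS#.
x = new_data.drop(['RESPONSE'], axis = 1)
y = new_data.loc[:,['RESPONSE']]  # a one-column DataFrame; later code relies on y.iloc[:, -1]
# Hold out 20% of the rows. The grid searches below cross-validate on the full data and do not use this split.
train_x, test_x, train_y, test_y = train_test_split(x, y, test_size = 0.2, random_state = 777)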

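Decision tree classifier

GridSearchCV sweeps the candidate parameters and picks the best model. The decision tree, SVM, and random forest below are all tuned this way to select the best parameters and the best estimator.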

model = DecisionTreeClassifier(criterion='gini', max_depth=4, min_samples_split=4, max_features=6)
params = {
    'criterion': ['gini', 'entropy'],
    'max_depth': range(1, 30),
    'min_samples_split': range(2, 10),
    'min_samples_leaf': range(1, 6),
}
cv = GridSearchCV(model, param_grid=params, n_jobs=-1, verbose=1, scoring='accuracy', cv=5)
# First fit: the raw (unprocessed) data, with the last column RESPONSE as the label
cv.fit(data.iloc[:,:-1], data.iloc[:,-1])
Fitting 5 folds for each of 2320 candidates, totalling 11600 fits
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done 1348 tasks      | elapsed:    2.4s
[Parallel(n_jobs=-1)]: Done 8248 tasks      | elapsed:   15.7s
[Parallel(n_jobs=-1)]: Done 11600 out of 11600 | elapsed:   22.5s finished
GridSearchCV(cv=5, error_score='raise-deprecating',
             estimator=DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=4,
                                              max_features=6, max_leaf_nodes=None,
                                              min_impurity_decrease=0.0, min_impurity_split=None,
                                              min_samples_leaf=1, min_samples_split=4,
                                              min_weight_fraction_leaf=0.0, presort=False,
                                              random_state=None, splitter='best'),
             iid='warn', n_jobs=-1,
             param_grid={'criterion': ['gini', 'entropy'], 'max_depth': range(1, 30),
                         'min_samples_leaf': range(1, 6), 'min_samples_split': range(2, 10)},
             pre_dispatch='2*n_jobs', refit=True, return_train_score=False,
             scoring='accuracy', verbose=1)
model1 = cv.best_estimator_
cv.best_params_,cv.best_score_
({'criterion': 'gini','max_depth': 5,'min_samples_leaf': 2,'min_samples_split': 5},0.734)
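
The grid search above cross-validates on all 1000 rows, so the train/test split created earlier never comes into play. The following is a sketch of how the held-out rows could be used, on synthetic data from make_classification as a stand-in for the credit data; the small grid and the stratify option are choices made for this example, not part of the original run.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the credit data (1000 rows, binary label, roughly 30/70 classes).
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           weights=[0.3, 0.7], random_state=777)
train_x, test_x, train_y, test_y = train_test_split(
    X, y, test_size=0.2, random_state=777, stratify=y)

# A deliberately small grid so the sketch runs quickly.
params = {'criterion': ['gini', 'entropy'], 'max_depth': [2, 4, 6], 'min_samples_leaf': [1, 5]}
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid=params,
                      scoring='accuracy', cv=5)
search.fit(train_x, train_y)  # tuning only sees the training portion

# The refit best model is scored once on rows it never saw during tuning.
test_accuracy = search.score(test_x, test_y)
print(search.best_params_, round(search.best_score_, 3), round(test_accuracy, 3))
```

The cross-validated best_score_ is still useful for choosing parameters, while the held-out accuracy gives a less optimistic estimate of how the chosen model generalizes.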

Visualizing the feature importances behind the decision tree's decisions

The most important features turn out to be CHK_ACCT, DURATION, AGE, and HISTORY.
plt.figure(figsize= (9,6))
plt.bar(data.iloc[:,:-1].columns, model1.feature_importances_)
plt.xticks(rotation = 90)
plt.show()
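
With about thirty bars, the ranking is easier to read from a sorted table than from the chart. This sketch pairs importances with column names on synthetic data; the names f0 to f7 are placeholders, not the credit features.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in; the column names are placeholders.
X, y = make_classification(n_samples=500, n_features=8, n_informative=3, random_state=0)
columns = [f'f{i}' for i in range(8)]
model = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X, y)

# Pair each importance with its column name and sort, largest first.
importances = pd.Series(model.feature_importances_, index=columns).sort_values(ascending=False)
print(importances.head(4))
```

Calling importances.plot.bar() on the sorted series would give the same kind of chart as above, ordered by importance.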


cv.fit(x, y)
model2 = cv.best_estimator_
cv.best_params_,cv.best_score_
Fitting 5 folds for each of 2320 candidates, totalling 11600 fits
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done 1328 tasks      | elapsed:    3.0s
[Parallel(n_jobs=-1)]: Done 7928 tasks      | elapsed:   18.5s
[Parallel(n_jobs=-1)]: Done 11600 out of 11600 | elapsed:   27.5s finished
({'criterion': 'gini','max_depth': 5,'min_samples_leaf': 2,'min_samples_split': 2},0.728)
model2
DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=5,max_features=6, max_leaf_nodes=None,min_impurity_decrease=0.0, min_impurity_split=None,min_samples_leaf=2, min_samples_split=2,min_weight_fraction_leaf=0.0, presort=False,random_state=None, splitter='best')

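Running the same search on the preprocessed features again ranks CHK_ACCT as the most important feature, but the other features barely register, and the score (0.728) is slightly below the 0.734 obtained on the unprocessed data.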

plt.figure(figsize= (9,6))
plt.bar(x.columns, model2.feature_importances_)
plt.xticks(rotation = 90)
plt.show()


Visualizing the decision tree

graph_data = tree.export_graphviz(model2,out_file = None,feature_names=x.columns,filled= True, rounded= True)
graph = graphviz.Source(graph_data,)
graph
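
When Graphviz is not installed, the fitted tree can still be inspected as plain text. This is a sketch using sklearn.tree.export_text on a small synthetic tree; the feature names are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data with four placeholder feature names.
X, y = make_classification(n_samples=300, n_features=4, n_informative=2,
                           n_redundant=0, random_state=0)
feature_names = ['f0', 'f1', 'f2', 'f3']
model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# One line per split or leaf, indented by depth.
rules = export_text(model, feature_names=feature_names)
print(rules)
```

For the tree above, the equivalent call would pass model2 and list(x.columns).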


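SVM classifier

SVM does not do well on this dataset, where most variables are discrete, and it trails the decision tree: accuracy is only 0.700 on the unprocessed data and 0.653 on the discretized data.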
model = SVC()
params = {'C':range(1,10)}
cv = GridSearchCV(model,param_grid=params, verbose = 1,cv = 5,scoring='accuracy',n_jobs=-1)
cv.fit(x, y)
model1 = cv.best_estimator_
cv.best_score_
Fitting 5 folds for each of 9 candidates, totalling 45 fits
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  38 out of  45 | elapsed:    1.3s remaining:    0.2s
[Parallel(n_jobs=-1)]: Done  45 out of  45 | elapsed:    1.5s finished
F:\Anaconda3\lib\site-packages\sklearn\utils\validation.py:724: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
F:\Anaconda3\lib\site-packages\sklearn\svm\base.py:193: FutureWarning: The default value of gamma will change from 'auto' to 'scale' in version 0.22 to account better for unscaled features. Set gamma explicitly to 'auto' or 'scale' to avoid this warning.
  "avoid this warning.", FutureWarning)
0.653
cv.fit(data.iloc[:,:-1], data.iloc[:,-1])
model2 = cv.best_estimator_
cv.best_score_
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
Fitting 5 folds for each of 9 candidates, totalling 45 fits
[Parallel(n_jobs=-1)]: Done  38 out of  45 | elapsed:    1.6s remaining:    0.2s
[Parallel(n_jobs=-1)]: Done  45 out of  45 | elapsed:    1.8s finished
F:\Anaconda3\lib\site-packages\sklearn\svm\base.py:193: FutureWarning: The default value of gamma will change from 'auto' to 'scale' in version 0.22 to account better for unscaled features. Set gamma explicitly to 'auto' or 'scale' to avoid this warning.
  "avoid this warning.", FutureWarning)
0.7
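
The FutureWarning in the log points at unscaled features, and the SVC here was fitted on raw columns whose ranges differ widely. One thing that might help is standardizing the features inside a pipeline. This is only a sketch on synthetic data, with one column inflated to mimic a large-valued feature such as AMOUNT; it was not run on the credit data, so no improvement is claimed.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in; one column is blown up to mimic a large-valued feature.
X, y = make_classification(n_samples=600, n_features=10, n_informative=4, random_state=0)
X[:, 0] *= 1000.0

# Scaling lives inside the pipeline, so each CV fold is scaled from its own training rows only.
pipeline = make_pipeline(StandardScaler(), SVC(gamma='scale'))
params = {'svc__C': [0.1, 1, 10]}  # 'svc' is the step name make_pipeline derives from the class
search = GridSearchCV(pipeline, param_grid=params, scoring='accuracy', cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Passing y as a 1-D array (for example y.values.ravel() for a one-column DataFrame) would also avoid the DataConversionWarning seen in the log.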

Random forest classifier

The ensemble classifier does better: its accuracy is 0.759 on the preprocessed dataset and 0.773 on the unprocessed one.
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=500, random_state=2)
params = {'n_estimators': range(1, 1000)}
cv = GridSearchCV(model, param_grid=params ,verbose = 1,n_jobs=-1, scoring='accuracy')
cv.fit(x,y)
rfc1= cv.best_estimator_
F:\Anaconda3\lib\site-packages\sklearn\model_selection\_split.py:1978: FutureWarning: The default value of cv will change from 3 to 5 in version 0.22. Specify it explicitly to silence this warning.
  warnings.warn(CV_WARNING, FutureWarning)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
Fitting 3 folds for each of 999 candidates, totalling 2997 fits
[Parallel(n_jobs=-1)]: Done  75 tasks      | elapsed:    3.7s
[Parallel(n_jobs=-1)]: Done 526 tasks      | elapsed:   30.7s
[Parallel(n_jobs=-1)]: Done 776 tasks      | elapsed:  1.1min
[Parallel(n_jobs=-1)]: Done 1126 tasks      | elapsed:  2.3min
[Parallel(n_jobs=-1)]: Done 1576 tasks      | elapsed:  4.5min
[Parallel(n_jobs=-1)]: Done 2126 tasks      | elapsed:  8.2min
[Parallel(n_jobs=-1)]: Done 2776 tasks      | elapsed: 14.1min
[Parallel(n_jobs=-1)]: Done 2997 out of 2997 | elapsed: 16.4min finished
F:\Anaconda3\lib\site-packages\sklearn\model_selection\_search.py:715: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().
  self.best_estimator_.fit(X, y, **fit_params)
rfc = cv.best_estimator_
rfc
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',max_depth=None, max_features='auto', max_leaf_nodes=None,min_impurity_decrease=0.0, min_impurity_split=None,min_samples_leaf=1, min_samples_split=2,min_weight_fraction_leaf=0.0, n_estimators=673,n_jobs=None, oob_score=False, random_state=2, verbose=0,warm_start=False)
cv.best_score_
0.759
cv.fit(data.iloc[:,:-1],y.iloc[:,-1])
rfc2= cv.best_estimator_
F:\Anaconda3\lib\site-packages\sklearn\model_selection\_split.py:1978: FutureWarning: The default value of cv will change from 3 to 5 in version 0.22. Specify it explicitly to silence this warning.
  warnings.warn(CV_WARNING, FutureWarning)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
Fitting 3 folds for each of 999 candidates, totalling 2997 fits
[Parallel(n_jobs=-1)]: Done  75 tasks      | elapsed:    4.0s
[Parallel(n_jobs=-1)]: Done 423 tasks      | elapsed:   21.4s
[Parallel(n_jobs=-1)]: Done 673 tasks      | elapsed:   49.5s
[Parallel(n_jobs=-1)]: Done 1023 tasks      | elapsed:  1.9min
[Parallel(n_jobs=-1)]: Done 1473 tasks      | elapsed:  4.0min
[Parallel(n_jobs=-1)]: Done 2023 tasks      | elapsed:  7.5min
[Parallel(n_jobs=-1)]: Done 2673 tasks      | elapsed: 13.2min
[Parallel(n_jobs=-1)]: Done 2997 out of 2997 | elapsed: 16.5min finished
cv.best_score_
0.773
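
Each search above tried every n_estimators value from 1 to 999 and took about 16 minutes. A cheaper variant might search a handful of spaced values and set cv explicitly, so that the fold count matches the other models. This is a sketch on synthetic data; the three candidate values are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the credit data.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=2)

# A few spaced values instead of every integer from 1 to 999.
params = {'n_estimators': [50, 100, 200]}
search = GridSearchCV(RandomForestClassifier(random_state=2), param_grid=params,
                      scoring='accuracy', cv=5)  # cv set explicitly, matching the other models
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Since adding trees mostly reduces variance rather than changing the model, a coarse grid over n_estimators usually loses little compared with an exhaustive one.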

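Summary

1. For the decision tree, the random forest, and the SVM alike, the preprocessed data scored lower than the original data (0.728 vs 0.734, 0.759 vs 0.773, and 0.653 vs 0.700). A possible cause is that the binning erased some of the information in the original values, leaving the models underfitted.
2. SVM performs worse than the tree models on a dataset in which most features are 0/1 indicators.
3. Ensemble models such as random forest outperform the single models, but they need more compute and take longer to tune (about 16 minutes per grid search here).
4. The most important feature for the lending decision is CHK_ACCT, the status of the checking account.
5. Tree-based models such as decision trees and random forests are easier to interpret in practical applications.

Overall performance is not very good. Next I would like to try boosted tree ensembles such as xgboost and LightGBM.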
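
One caveat when reading these numbers: according to the logs, the random forest searches used 3 folds while the decision tree and SVM searches used 5, so the scores are not strictly comparable. The sketch below shows how the three model families could be scored on identical folds, next to a majority-class baseline for context. It runs on synthetic data with fixed, untuned settings, so its numbers say nothing about the credit data.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in with roughly a 30/70 class split.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           weights=[0.3, 0.7], random_state=0)

folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # same folds for every model
models = {
    'baseline (majority class)': DummyClassifier(strategy='most_frequent'),
    'decision tree': DecisionTreeClassifier(max_depth=5, random_state=0),
    'SVM': SVC(gamma='scale'),
    'random forest': RandomForestClassifier(n_estimators=100, random_state=0),
}
scores = {name: cross_val_score(model, X, y, cv=folds, scoring='accuracy').mean()
          for name, model in models.items()}
for name, score in scores.items():
    print(f'{name}: {score:.3f}')
```

A model is only adding value to the extent that it beats the baseline row, which is why it is worth printing alongside the others.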
