1. Project background: a bank wants to predict whether a credit customer will default. The raw dataset is previewed by df.head() under "Load the data" below.

2. Analysis steps:
(1) Data Cleaning
(2) Exploratory Visualization
(3) Feature Engineering
(4) Basic Modeling Evaluation

3. Source code and dataset download: 易一网络科技 - paid article, www.intumu.com

Load the data

    import pandas as pd
    df = pd.read_excel('LRGWFB.xls')
    df.head()

       年龄  教育  工龄  地址  收入   负债率  信用卡负债     其他负债    违约
    0  41   3    17   12   176   9.3    11.359392   5.008608  1
    1  27   1    10   6    31    17.3   1.362202    4.000798  0
    2  40   1    15   14   55    5.5    0.856075    2.168925  0
    3  41   1    15   14   120   2.9    2.658720    0.821280  0
    4  24   2    2    0    28    17.3   1.787436    3.056564  1

Column names: 年龄 (age), 教育 (education), 工龄 (years employed), 地址 (address), 收入 (income), 负债率 (debt ratio), 信用卡负债 (credit card debt), 其他负债 (other debt), 违约 (default).

Check for null values

    df.isnull().any()

    年龄       False
    教育       False
    工龄       False
    地址       False
    收入       False
    负债率     False
    信用卡负债  False
    其他负债    False
    违约       False
    dtype: bool

Target classes

    df['违约'].unique()

    array([1, 0], dtype=int64)

Split into features and target

    X, y = df.iloc[:, 1:-1], df.iloc[:, -1]

The slice 1:-1 starts at the second column, so 年龄 (age) is not among the features; the last column, 违约, is the target.

Feature correlation

    classes = X.columns.tolist()
    classes

    ['教育', '工龄', '地址', '收入', '负债率', '信用卡负债', '其他负债']

    from yellowbrick.features import Rank2D

    # Pearson correlation between the 7 feature columns
    visualizer = Rank2D(algorithm='pearson', size=(800, 600), title='7特征向量的皮尔森相关系数')
    visualizer.fit(X, y)
    visualizer.transform(X)
    visualizer.poof()  # poof() is named show() in yellowbrick >= 1.0

    E:\Anaconda3\lib\site-packages\yellowbrick\features\rankd.py:262: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
      X = X.as_matrix()

Feature importance

    from sklearn.ensemble import RandomForestClassifier
    from yellowbrick.features.importances import FeatureImportances

    model = RandomForestClassifier(n_estimators=10)
    # Bar chart of the random forest's feature importance scores
    viz = FeatureImportances(model, size=(800, 600), title='随机森林算法分类训练特征重要性', xlabel='重要性评分')
    viz.fit(X, y)
    viz.poof()

Classification report

Split into training and test sets

    from sklearn.model_selection import train_test_split as tts

    # Hold out 20% of the rows for testing
    X_train, X_test, y_train, y_test = tts(X, y, test_size=0.2, random_state=10)

Classification result report

    from sklearn.ensemble import RandomForestClassifier
    from yellowbrick.classifier import ClassificationReport

    model = RandomForestClassifier(n_estimators=10)
    visualizer = ClassificationReport(model, support=True, size=(800, 600), title='随机森林算法分类报告')
    visualizer.fit(X_train.values, y_train)
    print('得分', visualizer.score(X_test.values, y_test))  # 得分 = score
    visualizer.poof()

    得分 0.7714285714285715

Persisting the model

    from sklearn.ensemble import RandomForestClassifier

    model = RandomForestClassifier(n_estimators=10)
    model.fit(X_train.values, y_train)

    RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
                           max_depth=None, max_features='auto', max_leaf_nodes=None,
                           min_impurity_decrease=0.0, min_impurity_split=None,
                           min_samples_leaf=1, min_samples_split=2,
                           min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=None,
                           oob_score=False, random_state=None, verbose=0,
                           warm_start=False)

    # scikit-learn 0.23 and later no longer ship sklearn.externals.joblib; use "import joblib" there
    from sklearn.externals import joblib
    joblib.dump(model, 'model.pickle')  # save

    ['model.pickle']

Loading the trained model

    model = joblib.load('model.pickle')  # load
    model.predict(X_test)  # predicted label for each test row

    array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1,
           0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0,
           1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
           0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           1, 0, 1, 1, 0, 0, 0, 0], dtype=int64)

    model.predict_proba(X_test)  # 2-D array: entry (i, j) is the probability that test row i belongs to label j

    array([[1. , 0. ], [0.9, 0.1], [0.8, 0.2], [1. , 0. ], [0.9, 0.1],
           [1. , 0. ], [0.5, 0.5], [0.8, 0.2], [0.9, 0.1], [1. , 0. ],
           [0.4, 0.6], [1. , 0. ], [0.6, 0.4], [0.3, 0.7], [1. , 0. ],
           [0.6, 0.4], [0.9, 0.1], [0.7, 0.3], [1. , 0. ], [0.9, 0.1],
           [0.4, 0.6], [0.4, 0.6], [0.5, 0.5], [1. , 0. ], [0.8, 0.2],
           [1. , 0. ], [0.9, 0.1], [0.5, 0.5], [0.1, 0.9], [0.9, 0.1],
           [0.8, 0.2], [0.6, 0.4], [0.8, 0.2], [0.9, 0.1], [0.7, 0.3],
           [1. , 0. ], [0.2, 0.8], [0.9, 0.1], [1. , 0. ], [1. , 0. ],
           [1. , 0. ], [0.9, 0.1], [0.4, 0.6], [0.7, 0.3], [0.4, 0.6],
           [0.9, 0.1], [0.5, 0.5], [0.1, 0.9], [1. , 0. ], [1. , 0. ],
           [0.8, 0.2], [0.7, 0.3], [1. , 0. ], [0.5, 0.5], [0.8, 0.2],
           [0.7, 0.3], [0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.9, 0.1],
           [1. , 0. ], [0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.8, 0.2],
           [0.9, 0.1], [1. , 0. ], [0.9, 0.1], [0.4, 0.6], [0.5, 0.5],
           [0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.8, 0.2], [1. , 0. ],
           [1. , 0. ], [0.8, 0.2], [1. , 0. ], [0.9, 0.1], [0.6, 0.4],
           [1. , 0. ], [1. , 0. ], [0.7, 0.3], [1. , 0. ], [0.8, 0.2],
           [1. , 0. ], [0.3, 0.7], [0.9, 0.1], [0.7, 0.3], [0.5, 0.5],
           [0.4, 0.6], [1. , 0. ], [0.9, 0.1], [0.8, 0.2], [0.8, 0.2],
           [0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.7, 0.3], [0.7, 0.3],
           [0.4, 0.6], [0.6, 0.4], [0.7, 0.3], [0.8, 0.2], [1. , 0. ],
           [0.5, 0.5], [0.8, 0.2], [1. , 0. ], [0.9, 0.1], [0.5, 0.5],
           [0.8, 0.2], [0.6, 0.4], [0.8, 0.2], [0.9, 0.1], [0.9, 0.1],
           [0.6, 0.4], [0.8, 0.2], [0.9, 0.1], [0.1, 0.9], [1. , 0. ],
           [1. , 0. ], [1. , 0. ], [0.9, 0.1], [0.6, 0.4], [1. , 0. ],
           [0.8, 0.2], [0.8, 0.2], [0.7, 0.3], [0.9, 0.1], [0.9, 0.1],
           [0.5, 0.5], [1. , 0. ], [0.2, 0.8], [0.9, 0.1], [0.4, 0.6],
           [0.2, 0.8], [0.8, 0.2], [1. , 0. ], [0.8, 0.2], [0.8, 0.2]])

Newcomers can browse the index of earlier posts: yeayee, Python数据分析及可视化实例目录 (Python data analysis and visualization examples index), zhuanlan.zhihu.com
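The feature/target split used in this post is easy to misread, so here is a minimal sketch that reruns the null check and the iloc slicing on just the five rows printed by df.head(). The five-row DataFrame is rebuilt by hand from that preview (the real LRGWFB.xls file is not used), and the helper names has_nulls, feature_names and target_classes are illustrative, not from the original.

```python
import pandas as pd

# The five preview rows from df.head(), typed in by hand (not the full dataset).
df = pd.DataFrame(
    {
        "年龄": [41, 27, 40, 41, 24],
        "教育": [3, 1, 1, 1, 2],
        "工龄": [17, 10, 15, 15, 2],
        "地址": [12, 6, 14, 14, 0],
        "收入": [176, 31, 55, 120, 28],
        "负债率": [9.3, 17.3, 5.5, 2.9, 17.3],
        "信用卡负债": [11.359392, 1.362202, 0.856075, 2.658720, 1.787436],
        "其他负债": [5.008608, 4.000798, 2.168925, 0.821280, 3.056564],
        "违约": [1, 0, 0, 0, 1],
    }
)

# True if any cell in any column is missing.
has_nulls = bool(df.isnull().any().any())

# Same slicing as the post: columns 1..-2 are features, the last column is the target.
X, y = df.iloc[:, 1:-1], df.iloc[:, -1]
feature_names = X.columns.tolist()
target_classes = sorted(y.unique().tolist())

print(has_nulls)
print(feature_names)
print(target_classes)
```

Because the slice starts at position 1, the first column (年龄) is left out, which is why the post works with seven features rather than eight. Using df.iloc[:, :-1] instead would keep it; whether that helps the model is not something the post evaluates.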
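The persistence step can also be checked end to end. The sketch below assumes a recent scikit-learn, where joblib is imported as a standalone package rather than from sklearn.externals, and it substitutes synthetic data from make_classification for the loan dataset (700 rows and 7 features are assumptions chosen to mirror the shapes seen in this post). A fixed random_state is added only to make the run repeatable.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the loan data (assumption: 700 rows, 7 features).
X, y = make_classification(n_samples=700, n_features=7, random_state=10)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=10
)

model = RandomForestClassifier(n_estimators=10, random_state=10)
model.fit(X_train, y_train)

# Save to a temporary directory, then load the model back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pickle")
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored model should behave exactly like the one that was saved.
same_labels = np.array_equal(model.predict(X_test), restored.predict(X_test))
same_probas = np.allclose(
    model.predict_proba(X_test), restored.predict_proba(X_test)
)
print(same_labels, same_probas)
```

A file written by joblib.dump is a pickle, so it should only be loaded from a trusted source and, ideally, with the same scikit-learn version that produced it.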
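The two outputs above are linked: for a random forest, predict returns the class whose column in predict_proba is largest, and a tie such as [0.5, 0.5] goes to the first class, which matches the 0 labels printed for those rows. The following sketch demonstrates this on synthetic data (an assumption, since the original dataset is not available here) and also shows one possible way to apply a custom decision threshold; the value 0.3 is purely illustrative and not something the post recommends.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary classification data standing in for the loan dataset.
X, y = make_classification(n_samples=200, n_features=7, random_state=0)
model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

proba = model.predict_proba(X)

# Each row is a probability distribution over model.classes_.
rows_sum_to_one = np.allclose(proba.sum(axis=1), 1.0)

# Picking the most probable class per row reproduces predict().
labels_from_proba = model.classes_[np.argmax(proba, axis=1)]
matches_predict = np.array_equal(labels_from_proba, model.predict(X))

# Hypothetical custom threshold: flag a row as class 1 at probability >= 0.3.
threshold = 0.3
positive_col = list(model.classes_).index(1)
flagged = (proba[:, positive_col] >= threshold).astype(int)

print(rows_sum_to_one, matches_predict)
print(int(flagged.sum()), int(model.predict(X).sum()))
```

Lowering the threshold below 0.5 can only flag the same rows or more rows as class 1, which trades precision for recall on the default class.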