How to Build Machine Learning Models with Python

Posted by 架构师大咖 • 2 years ago • 280 views

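Author: Anello
Translator: Sambodhi
Planning: 凌敏
Source: AI前线 (ID: ai-front)
In this article, we use Python packages to build several machine learning models.
Templates for building machine learning models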

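This notebook contains the code templates needed to create the main machine learning algorithms. scikit-learn already provides ready-made implementations of these algorithms: we only need to set the parameters, feed in the data, train to produce a model, and finally make predictions.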

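1. Linear regression

For linear regression, we import linear_model from the sklearn library. We prepare the training and test data, then instantiate the predictive model as an object of the LinearRegression algorithm, which is a class of the linear_model package. Next we train the algorithm with the fit function and evaluate the model with score. Finally, we print the coefficients and use the model to make new predictions.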

# Import modules
from sklearn import linear_model

# Create training and test subsets
# (the names on the right are placeholders for your own data)
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test = test_dataset_predictor_variables

# Create linear regression object
linear = linear_model.LinearRegression()

# Train the model with training data and check the score (R^2 on the training data)
linear.fit(x_train, y_train)
linear.score(x_train, y_train)

# Collect coefficients
print('Coefficient: \n', linear.coef_)
print('Intercept: \n', linear.intercept_)

# Make predictions
predicted_values = linear.predict(x_test)
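The template above uses placeholder names, so it cannot run as written. Below is a minimal runnable sketch of the same steps. The synthetic data, the true coefficients (3, -2, 5) and the use of train_test_split are assumptions made for illustration and are not part of the original notebook.

```python
import numpy as np
from sklearn import linear_model
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative assumption): y = 3*x0 - 2*x1 + 5 + small noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(scale=0.1, size=200)

# Hold out 25% of the rows as a test set
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Create and train the linear regression object
linear = linear_model.LinearRegression()
linear.fit(x_train, y_train)

# score() returns R^2; scoring on held-out data as well gives a fairer
# picture than the training score alone
train_r2 = linear.score(x_train, y_train)
test_r2 = linear.score(x_test, y_test)
print('Train R^2:', train_r2, 'Test R^2:', test_r2)

# Collect coefficients
print('Coefficient: \n', linear.coef_)
print('Intercept: \n', linear.intercept_)

# Make predictions
predicted_values = linear.predict(x_test)
```

Because the noise is small, the fitted coefficients should land close to the values used to generate the data.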
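2. Logistic regression

In this example, the only thing that changes from linear regression to logistic regression is the algorithm we use: we replace LinearRegression with LogisticRegression.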

# Import modules
from sklearn.linear_model import LogisticRegression

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test = test_dataset_predictor_variables

# Create logistic regression object
model = LogisticRegression()

# Train the model with training data and checking the score
model.fit(x_train, y_train)
model.score(x_train, y_train)

# Collect coefficients
print('Coefficient: \n', model.coef_)
print('Intercept: \n', model.intercept_)

# Make predictions
predicted_values = model.predict(x_test)
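As a hedged illustration of the logistic regression template, the sketch below runs it on synthetic data. The data set generated with make_classification and the call to predict_proba are additions for illustration and do not come from the original notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (illustrative assumption)
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=3, n_redundant=1,
    n_clusters_per_class=1, class_sep=2.0, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Create and train the logistic regression object
model = LogisticRegression()
model.fit(x_train, y_train)

# For a classifier, score() returns the mean accuracy
test_accuracy = model.score(x_test, y_test)
print('Test accuracy:', test_accuracy)

# predict() returns class labels; predict_proba() returns one probability
# per class for each row
predicted_values = model.predict(x_test)
predicted_probabilities = model.predict_proba(x_test)
```

The probabilities are useful when a decision threshold other than 0.5 is needed.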
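3. Decision tree

Once again we change the algorithm, this time to DecisionTreeRegressor (or to DecisionTreeClassifier for a classification problem):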

# Import modules
from sklearn import tree

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

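x_test = test_dataset_predictor_variables

# Create Decision Tree Regressor object (for a continuous target)
model = tree.DecisionTreeRegressor()

# Or create a Decision Tree Classifier object (for a categorical target);
# use only one of the two, since a second assignment would overwrite the first
# model = tree.DecisionTreeClassifier()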

# Train the model with training data and checking the score
model.fit(x_train, y_train)
model.score(x_train, y_train)

# Make predictions
predicted_values = model.predict(x_test)
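The choice between the regressor and the classifier depends on the type of target. The sketch below fits each on a target of the matching type. The synthetic data and the max_depth value are assumptions chosen for illustration.

```python
import numpy as np
from sklearn import tree

# One synthetic feature with two different targets (illustrative assumption)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y_continuous = np.sin(X[:, 0])          # numeric target -> regression
y_labels = (X[:, 0] > 5).astype(int)    # class target -> classification

# Regressor: score() returns R^2
regressor = tree.DecisionTreeRegressor(max_depth=4, random_state=0)
regressor.fit(X, y_continuous)
regressor_r2 = regressor.score(X, y_continuous)

# Classifier: score() returns accuracy
classifier = tree.DecisionTreeClassifier(max_depth=4, random_state=0)
classifier.fit(X, y_labels)
classifier_accuracy = classifier.score(X, y_labels)

print('Regressor R^2:', regressor_r2)
print('Classifier accuracy:', classifier_accuracy)
print('Predicted classes for x=1 and x=9:', classifier.predict([[1.0], [9.0]]))
```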
4. Naive Bayes

Once again we change the algorithm, this time to GaussianNB:

# Import modules
from sklearn.naive_bayes import GaussianNB

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test = test_dataset_predictor_variables

# Create GaussianNB object
model = GaussianNB()

# Train the model with training data
model.fit(x_train, y_train)

# Make predictions
predicted_values = model.predict(x_test)
5. Support vector machine

In this example, we use the SVC class from the svm module. Its counterpart for regression is SVR:

# Import modules
from sklearn import svm

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test = test_dataset_predictor_variables

# Create SVM Classifier object
model = svm.SVC()

# Train the model with training data and checking the score
model.fit(x_train, y_train)
model.score(x_train, y_train)

# Make predictions
predicted_values = model.predict(x_test)
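To make the SVC/SVR distinction concrete, here is a small sketch that fits both on synthetic data. The data, the targets and the linear kernel are assumptions for illustration only.

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))

# Classification target: which side of the line x0 + x1 = 0 a point falls on
y_class = (X[:, 0] + X[:, 1] > 0).astype(int)
classifier = svm.SVC(kernel='linear')
classifier.fit(X, y_class)
classifier_accuracy = classifier.score(X, y_class)   # accuracy

# Regression target: a continuous value
y_value = 2.0 * X[:, 0] - X[:, 1]
regressor = svm.SVR(kernel='linear')
regressor.fit(X, y_value)
regressor_r2 = regressor.score(X, y_value)            # R^2

print('SVC accuracy:', classifier_accuracy)
print('SVR R^2:', regressor_r2)
```

Both classes share the fit, score and predict interface; only the type of target and the meaning of the score differ.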
6. K-nearest neighbors

The KNeighborsClassifier algorithm has a hyperparameter called n_neighbors, which is the setting we tune for this algorithm.

# Import modules
from sklearn.neighbors import KNeighborsClassifier

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test = test_dataset_predictor_variables

# Create KNeighbors Classifier object
model = KNeighborsClassifier(n_neighbors = 6) # default value = 5

# Train the model with training data
model.fit(x_train, y_train)

# Make predictions
predicted_values = model.predict(x_test)
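One possible way to tune n_neighbors is to compare cross-validated accuracy for a few candidate values. The sketch below does this on the iris data set bundled with scikit-learn. The data set, the candidate values and the 5-fold setting are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Mean 5-fold cross-validated accuracy for each candidate n_neighbors
scores = {}
for k in (1, 3, 5, 7, 9):
    model = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(model, X, y, cv=5).mean()

# Pick the candidate with the highest mean accuracy
best_k = max(scores, key=scores.get)
print(scores)
print('Best n_neighbors:', best_k)
```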
7. K-means
# Import modules
from sklearn.cluster import KMeans

# Create training and test subsets
# (K-means is unsupervised, so y_train is not used for fitting)
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test = test_dataset_predictor_variables

# Create KMeans object
k_means = KMeans(n_clusters = 3, random_state = 0)

# Train the model with training data
k_means.fit(x_train)

# Assign each test point to its nearest cluster
predicted_values = k_means.predict(x_test)
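Since K-means has no target variable, what it "predicts" is a cluster index. The following sketch shows this on three synthetic blobs. The blob centers, the n_init value and the new points are assumptions for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Three well-separated synthetic blobs (illustrative assumption)
X, _ = make_blobs(
    n_samples=300, centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0, random_state=0
)

# Create and fit the KMeans object; no y is passed
k_means = KMeans(n_clusters=3, n_init=10, random_state=0)
k_means.fit(X)

# Learned cluster centers and the cluster index of each training point
print(k_means.cluster_centers_)
print(k_means.labels_[:10])

# predict() returns the index of the nearest learned center
new_points = [[0.5, -0.5], [9.0, 11.0]]
predicted_values = k_means.predict(new_points)
print(predicted_values)
```

The cluster indices are arbitrary labels: rerunning with a different random_state may number the same clusters differently.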
8. Random forest
# Import modules
from sklearn.ensemble import RandomForestClassifier

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test = test_dataset_predictor_variables

# Create Random Forest Classifier object
model = RandomForestClassifier()

# Train the model with training data
model.fit(x_train, y_train)

# Make predictions
predicted_values = model.predict(x_test)
9. Dimensionality reduction
# Import modules
from sklearn import decomposition

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test = test_dataset_predictor_variables

# Create PCA decomposition object
# (k is a placeholder for the number of components to keep)
pca = decomposition.PCA(n_components = k)

# Create Factor Analysis decomposition object (an alternative to PCA)
fa = decomposition.FactorAnalysis()

# Reduce the dimensionality of the training set using PCA
reduced_train = pca.fit_transform(x_train)

# Reduce the dimensionality of the test set using the PCA fitted on the training set
reduced_test = pca.transform(x_test)
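A quick way to judge how many components to keep is to inspect explained_variance_ratio_ after fitting. The sketch below does so on synthetic data. The data (five features mixed from two latent factors) and n_components=2 are assumptions for illustration.

```python
import numpy as np
from sklearn import decomposition

# Five observed features that are noisy linear mixtures of two latent factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + rng.normal(scale=0.05, size=(300, 5))
x_train, x_test = X[:200], X[200:]

# Fit PCA on the training set only, then apply the same projection to the test set
pca = decomposition.PCA(n_components=2)
reduced_train = pca.fit_transform(x_train)
reduced_test = pca.transform(x_test)

# Share of the total variance captured by each kept component
explained = pca.explained_variance_ratio_
print(explained, explained.sum())
print(reduced_train.shape, reduced_test.shape)
```

Fitting on the training set alone keeps information from the test set out of the projection.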
10. Gradient boosting and AdaBoost
# Import modules
from sklearn.ensemble import GradientBoostingClassifier

# Create training and test subsets
x_train = train_dataset_predictor_variables
y_train = train_dataset_predicted_variable

x_test = test_dataset_predictor_variables

# Create Gradient Boosting Classifier object
model = GradientBoostingClassifier(n_estimators = 100, learning_rate = 1.0, max_depth = 1, random_state = 0)

# Train the model with training data
model.fit(x_train, y_train)

# Make predictions
predicted_values = model.predict(x_test)
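The template above only instantiates gradient boosting. As a sketch of how AdaBoost could be slotted into the same pattern, the example below trains both with scikit-learn's AdaBoostClassifier and GradientBoostingClassifier. The synthetic data and the AdaBoost settings are assumptions and do not come from the original notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (illustrative assumption)
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, n_redundant=0,
    n_clusters_per_class=1, class_sep=2.0, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Both ensembles expose the same fit / score / predict interface
models = {
    'gradient boosting': GradientBoostingClassifier(
        n_estimators=100, learning_rate=1.0, max_depth=1, random_state=0
    ),
    'AdaBoost': AdaBoostClassifier(n_estimators=100, random_state=0),
}

# Train each model and record its accuracy on the held-out test set
test_accuracy = {}
for name, model in models.items():
    model.fit(x_train, y_train)
    test_accuracy[name] = model.score(x_test, y_test)

print(test_accuracy)
```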

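Our job is to turn each of these algorithm blocks into a project: define a business problem, preprocess the data, train the algorithm, tune the hyperparameters, and obtain verifiable results, iterating through this process until we reach a satisfactory accuracy and make the desired predictions.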
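The preprocess, train, tune and validate loop described above can be sketched with a Pipeline and GridSearchCV. This is only one possible arrangement: the wine data set, the StandardScaler step, the SVC model and the grid of C values are all assumptions chosen for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Data set bundled with scikit-learn (stands in for the business problem)
X, y = load_wine(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing and model chained together, so the scaler is fitted
# only on the training folds during cross-validation
pipeline = Pipeline([
    ('scale', StandardScaler()),
    ('svc', SVC()),
])

# Tune the hyperparameter C with 5-fold cross-validation
search = GridSearchCV(pipeline, param_grid={'svc__C': [0.1, 1, 10]}, cv=5)
search.fit(x_train, y_train)

# Verify the tuned model on data it has never seen
test_accuracy = search.score(x_test, y_test)
print('Best parameters:', search.best_params_)
print('Cross-validated accuracy:', search.best_score_)
print('Test accuracy:', test_accuracy)
```

If the test accuracy is not satisfactory, the next iteration might widen the grid, change the preprocessing, or try another of the algorithms above.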

Original article:

https://levelup.gitconnected.com/10-templates-for-building-machine-learning-models-with-notebook-282c4eb0987f
