Bivariate linear regression in Python

Hugo Assis Brandao • 5 years ago • 1775 views

I am writing code to analyze the relationship between two variables. I am using a DataFrame to store the variables in two columns, like this:

column A = 132.54672, 201.3845717, 323.2654551  
column B = 51.54671995,  96.38457166, 131.2654551

I tried statsmodels, but it said I do not have enough samples.

Can anyone help me? I need to determine the coefficient and the intercept so that I can compute other values:

y = coefficient * x + intercept
Article URL: http://www.python88.com/topic/43689
 
T_T • Reply 1 • 6 years ago

Use scipy.stats:

import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt


column_A= [132.54672, 201.3845717, 323.2654551]
column_B= [51.54671995, 96.38457166, 131.2654551]
df = pd.DataFrame({'A': column_A, 'B': column_B})

reg = stats.linregress(df.A, df.B)

plt.plot(df.A, df.B, 'bo', label='Data')
plt.plot(df.A, reg.intercept + reg.slope * df.A, 'k-', label='Linear Regression')
plt.xlabel('A')
plt.ylabel('B')
plt.legend()
plt.show()


You can also list the useful attributes of the result with dir(reg). They include:

.intercept .pvalue .rvalue .slope .stderr
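
As a minimal sketch of how those attributes answer the original question, the snippet below reads the slope (the coefficient) and the intercept off the result and evaluates the line at a new x. The helper name `predict` and the value `x_new = 250.0` are illustrative assumptions, not part of the answer above.

```python
from scipy import stats

column_A = [132.54672, 201.3845717, 323.2654551]
column_B = [51.54671995, 96.38457166, 131.2654551]

reg = stats.linregress(column_A, column_B)

coefficient = reg.slope      # the "coefficient" in y = coefficient * x + intercept
intercept = reg.intercept


def predict(x):
    """Hypothetical helper: evaluate the fitted line at x."""
    return coefficient * x + intercept


x_new = 250.0  # illustrative value, not from the original data
print("coefficient:", coefficient)
print("intercept:", intercept)
print("r-squared:", reg.rvalue ** 2)
print("prediction at x_new:", predict(x_new))
```

With only three data points the fit has a single degree of freedom, so the p-value and standard error should be read with caution.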


James Phillips • Reply 2 • 6 years ago

In addition to the excellent answers above, here is a graphical fitter that produces a 3D scatter plot, a 3D surface plot, and a contour plot.

import numpy, scipy, scipy.optimize
import matplotlib
from mpl_toolkits.mplot3d import  Axes3D
from matplotlib import cm # to colormap 3D surfaces from blue to red
import matplotlib.pyplot as plt

graphWidth = 800 # units are pixels
graphHeight = 600 # units are pixels

# 3D contour plot lines
numberOfContourLines = 16


def SurfacePlot(func, data, fittedParameters):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)

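    # create the 3D axes through the figure; recent matplotlib releases
    # no longer attach Axes3D(f) to the figure automatically
    axes = f.add_subplot(111, projection='3d')
    axes.grid(True)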

    x_data = data[0]
    y_data = data[1]
    z_data = data[2]

    xModel = numpy.linspace(min(x_data), max(x_data), 20)
    yModel = numpy.linspace(min(y_data), max(y_data), 20)
    X, Y = numpy.meshgrid(xModel, yModel)

    Z = func(numpy.array([X, Y]), *fittedParameters)

    axes.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm, linewidth=1, antialiased=True)

    axes.scatter(x_data, y_data, z_data) # show data along with plotted surface

    axes.set_title('Surface Plot (click-drag with mouse)') # add a title for surface plot
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label
    axes.set_zlabel('Z Data') # Z axis data label

    plt.show()
    plt.close('all') # clean up after using pyplot or else there can be memory and process problems


def ContourPlot(func, data, fittedParameters):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)

    x_data = data[0]
    y_data = data[1]
    z_data = data[2]

    xModel = numpy.linspace(min(x_data), max(x_data), 20)
    yModel = numpy.linspace(min(y_data), max(y_data), 20)
    X, Y = numpy.meshgrid(xModel, yModel)

    Z = func(numpy.array([X, Y]), *fittedParameters)

    axes.plot(x_data, y_data, 'o')

    axes.set_title('Contour Plot') # add a title for contour plot
    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label

    CS = matplotlib.pyplot.contour(X, Y, Z, numberOfContourLines, colors='k')
    matplotlib.pyplot.clabel(CS, inline=1, fontsize=10) # labels for contours

    plt.show()
    plt.close('all') # clean up after using pyplot or else there can be memory and process problems


def ScatterPlot(data):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)

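    # create the 3D axes through the figure; recent matplotlib releases
    # no longer attach Axes3D(f) to the figure automatically
    axes = f.add_subplot(111, projection='3d')
    axes.grid(True)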
    x_data = data[0]
    y_data = data[1]
    z_data = data[2]

    axes.scatter(x_data, y_data, z_data)

    axes.set_title('Scatter Plot (click-drag with mouse)')
    axes.set_xlabel('X Data')
    axes.set_ylabel('Y Data')
    axes.set_zlabel('Z Data')

    plt.show()
    plt.close('all') # clean up after using pyplot or else there can be memory and process problems


def func(data, a, alpha, beta):
    t = data[0]
    p_p = data[1]
    return a * (t**alpha) * (p_p**beta)


if __name__ == "__main__":
    xData = numpy.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
    yData = numpy.array([11.0, 12.1, 13.0, 14.1, 15.0, 16.1, 17.0, 18.1, 90.0])
    zData = numpy.array([1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.0, 9.9])

    data = [xData, yData, zData]

    initialParameters = [1.0, 1.0, 1.0] # these are the same as scipy default values in this example

    # here a non-linear surface fit is made with scipy's curve_fit()
    fittedParameters, pcov = scipy.optimize.curve_fit(func, [xData, yData], zData, p0 = initialParameters)

    ScatterPlot(data)
    SurfacePlot(func, data, fittedParameters)
    ContourPlot(func, data, fittedParameters)

    print('fitted parameters', fittedParameters)

    modelPredictions = func(data, *fittedParameters) 

    absError = modelPredictions - zData

    SE = numpy.square(absError) # squared errors
    MSE = numpy.mean(SE) # mean squared errors
    RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
    Rsquared = 1.0 - (numpy.var(absError) / numpy.var(zData))
    print('RMSE:', RMSE)
    print('R-squared:', Rsquared)
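
The error metrics at the end of that script are not specific to surface fits. As a sketch, assuming a straight-line model fitted with numpy.polyfit (a substitution for the curve_fit call above, chosen only to keep the example short), the same RMSE and R-squared calculation applied to the two columns from the question might look like this:

```python
import numpy

xData = numpy.array([132.54672, 201.3845717, 323.2654551])
yData = numpy.array([51.54671995, 96.38457166, 131.2654551])

# straight-line least-squares fit; for degree 1 polyfit returns [slope, intercept]
slope, intercept = numpy.polyfit(xData, yData, 1)

modelPredictions = slope * xData + intercept
absError = modelPredictions - yData

RMSE = numpy.sqrt(numpy.mean(numpy.square(absError)))  # Root Mean Squared Error
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))

print('RMSE:', RMSE)
print('R-squared:', Rsquared)
```
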
DavidG • Reply 3 • 6 years ago

You can do this with curve_fit:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

x = np.array([132.54672, 201.3845717, 323.2654551])
y = np.array([51.54671995, 96.38457166, 131.2654551])

linear = lambda x, a, b: a * x + b

popt, pcov = curve_fit(linear, x, y, p0=[1, 1])
plt.plot(x, y, "rx")
plt.plot(x, linear(x, *popt), "b-")
plt.title("f(x)=a*x+b, a={:.2f}, b={:.2f}".format(*popt))
plt.show()

[Plot: the data points (red crosses) and the fitted line f(x)=a*x+b (blue)]
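
The second return value of curve_fit, pcov, is the estimated covariance matrix of the fitted parameters. As a hedged sketch, the square roots of its diagonal give one-standard-deviation uncertainties for a and b. The name `perr` is an illustrative choice, and the lambda is written as a named function only for readability.

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.array([132.54672, 201.3845717, 323.2654551])
y = np.array([51.54671995, 96.38457166, 131.2654551])


def linear(x, a, b):
    return a * x + b


popt, pcov = curve_fit(linear, x, y, p0=[1, 1])

# one-standard-deviation uncertainties of a (slope) and b (intercept)
perr = np.sqrt(np.diag(pcov))

print("a = {:.4f} +/- {:.4f}".format(popt[0], perr[0]))
print("b = {:.4f} +/- {:.4f}".format(popt[1], perr[1]))
```

With three points and two parameters there is only one degree of freedom, so these uncertainties are rough estimates at best.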

Sheldore David Zwicker • Reply 4 • 6 years ago

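Okay, here is a solution using a DataFrame. I am skipping the import statements and showing only the relevant part. If you do not know what they are, leave me a comment.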

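I am using NumPy's polyfit with degree 1 for the linear regression. You can print the fit (fit) to see the slope and the intercept. Because fit is a poly1d object, it is indexed by the power of x: fit[0] is the intercept and fit[1] is the slope (or the coefficient, as you call it).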

column_A= [132.54672, 201.3845717, 323.2654551]
column_B= [51.54671995, 96.38457166, 131.2654551]
df = pd.DataFrame({'A': column_A, 'B': column_B})

fit = np.poly1d(np.polyfit(df['A'], df['B'], 1))

A_mesh = np.linspace(min(df['A']), max(df['A']), 100)

plt.plot(df['A'], df['B'], 'bx', label='Data', ms=10)
plt.plot(A_mesh, fit(A_mesh), '-b', label='Linear fit')

print(fit)
# 0.4028 x + 4.833
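
One detail is easy to trip over here: the raw array returned by np.polyfit is ordered from the highest power down, while a poly1d object is indexed by power. The sketch below shows both side by side, using plain lists instead of the DataFrame. The names `coeffs`, `slope`, and `intercept` are illustrative.

```python
import numpy as np

column_A = [132.54672, 201.3845717, 323.2654551]
column_B = [51.54671995, 96.38457166, 131.2654551]

coeffs = np.polyfit(column_A, column_B, 1)  # ordered highest power first
slope, intercept = coeffs                   # [slope, intercept] for degree 1

fit = np.poly1d(coeffs)

# poly1d is indexed by power of x: fit[1] multiplies x**1, fit[0] is the constant
print("slope:    ", slope, "==", fit[1])
print("intercept:", intercept, "==", fit[0])
```
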
