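Author: Chuck Huber
Compiled by: Ding Chen (丁晨, Xiamen University)
Email: 3049378404@qq.com

Acknowledgment: This article is adapted from the following post, with thanks.
Source: Chuck Huber, 2020, Stata/Python integration part 9: Using the Stata Function Interface to copy data from Python to Stata, -Link-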
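The Stata/Python integration series is based on nine blog posts published on the Stata website by Dr. Chuck Huber, Director of Statistical Outreach at StataCorp. The series gives a fairly systematic introduction to using Stata and Python together: how to configure the software, how to pass datasets between Stata and Python, how to call Python packages, how to run machine learning analyses, and more.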
Part 1: Setting up Stata to use Python -Link-
Part 2: Three ways to use Python in Stata -Link-
Part 3: How to install Python packages -Link-
Part 4: How to use Python packages -Link-
Part 5: Three-dimensional surface plots of marginal predictions -Link-
Part 6: Working with APIs and JSON data -Link-
Part 7: Machine learning with support vector machines -Link-
Part 8: Using the Stata Function Interface to copy data from Stata to Python -Link-
Part 9: Using the Stata Function Interface to copy data from Python to Stata -Link-
1. Introduction
2. Worked example
2.1 Downloading and processing the data
2.2 Copying the data to Stata
2.3 Plotting
3. References
4. Related posts
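1. Introduction

This post shows how to use the SFI module to copy data from Python into Stata. The original post uses Python's yfinance module to download the Dow Jones Industrial Average (DJIA) from Yahoo! Finance. Because connections to yfinance from mainland China are not very stable, this version uses pandas_datareader to fetch the DJIA instead, and then uses Stata to graph the closing price and trading volume.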
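2. Worked example

2.1 Downloading and processing the data

First, we use pandas_datareader.data to download daily DJIA data from Stooq for 2010-01-01 through 2019-12-31: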
    . python
    ----------------------------------------------- python (type end to exit) -----------------------
    >>> import pandas_datareader.data as web
    >>> dowjones=web.DataReader('^DJI','stooq','2010-01-01','2019-12-31')
    >>> dowjones
                    Open      High       Low     Close     Volume
    Date
    2019-12-31  28414.64  28547.35  28376.49  28538.44  193336533
    2019-12-30  28654.76  28664.69  28428.98  28462.14  181507192
    2019-12-27  28675.34  28701.66  28608.98  28645.26  182181663
    2019-12-26  28539.46  28624.10  28535.15  28621.39  156025977
    2019-12-24  28572.57  28576.80  28503.21  28515.45   86151979
    ...              ...       ...       ...       ...        ...
    2010-01-08  10606.40  10619.40  10554.30  10618.20  172637555
    2010-01-07  10571.10  10612.40  10505.20  10606.90  217441286
    2010-01-06  10564.70  10595.00  10546.50  10573.70  186108764
    2010-01-05  10584.60  10584.60  10522.50  10572.00  188599202
    2010-01-04  10430.70  10605.00  10430.70  10584.00  179768845

    [2516 rows x 5 columns]
    >>> end
    -------------------------------------------------------------------------------------------------
The dates are held in the DataFrame's index as datetime64 values. We convert them to strings and save them in a new column, dowdate, so that they can later be stored in a Stata string variable:

    . python
    ----------------------------------------------- python (type end to exit) -----------------------
    >>> dowjones.index
    DatetimeIndex(['2019-12-31', '2019-12-30', '2019-12-27', '2019-12-26',
                   '2019-12-24', '2019-12-23', '2019-12-20', '2019-12-19',
                   '2019-12-18', '2019-12-17',
                   ...
                   '2010-01-15', '2010-01-14', '2010-01-13', '2010-01-12',
                   '2010-01-11', '2010-01-08', '2010-01-07', '2010-01-06',
                   '2010-01-05', '2010-01-04'],
                  dtype='datetime64[ns]', name='Date', length=2516, freq=None)
    >>> dowjones['dowdate']=dowjones.index.astype(str)
    >>> dowjones['dowdate']
    Date
    2019-12-31    2019-12-31
    2019-12-30    2019-12-30
    2019-12-27    2019-12-27
    2019-12-26    2019-12-26
    2019-12-24    2019-12-24
                     ...
    2010-01-08    2010-01-08
    2010-01-07    2010-01-07
    2010-01-06    2010-01-06
    2010-01-05    2010-01-05
    2010-01-04    2010-01-04
    Name: dowdate, Length: 2516, dtype: object
    >>> end
    -------------------------------------------------------------------------------------------------
2.2 Copying the data to Stata

We use Python's SFI module to copy the data into Stata:
    . python
    ----------------------------------------------- python (type end to exit) -----------------------
    >>> from sfi import Data
    >>> Data.setObsTotal(len(dowjones))
    >>> Data.addVarStr("dowdate",10)
    >>> Data.addVarDouble("dowclose")
    >>> Data.addVarInt("dowvolume")
    >>>
    >>> Data.store("dowdate",None,dowjones['dowdate'],None)
    >>> Data.store("dowclose",None,dowjones['Close'],None)
    >>> Data.store("dowvolume",None,dowjones['Volume'],None)
    >>> end
    -------------------------------------------------------------------------------------------------
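The calls above follow a fixed pattern: set the number of observations, create one Stata variable per column, then store each column. The sketch below wraps that pattern in a small helper. The name copy_columns and its data_api parameter are assumptions made for this illustration and are not part of sfi; inside Stata you would pass sfi.Data itself. The only sfi calls it relies on are ones already used above (setObsTotal, addVarStr, addVarDouble, store). Passing the API object as an argument is a design choice that lets the logic be tried outside Stata with a stand-in object.

```python
def copy_columns(data_api, columns):
    """Copy equal-length columns into the current Stata dataset.

    data_api : object exposing setObsTotal, addVarStr, addVarDouble and
               store (hypothetical parameter; inside Stata, pass sfi.Data).
    columns  : dict mapping a Stata variable name to a sequence of values.
               A column becomes a string variable if every value is a str,
               otherwise a double.
    """
    lengths = {len(values) for values in columns.values()}
    if len(lengths) != 1:
        raise ValueError("all columns must have the same length")
    data_api.setObsTotal(lengths.pop())

    for name, values in columns.items():
        values = list(values)
        if values and all(isinstance(v, str) for v in values):
            # Width of the string variable: the longest string, at least 1.
            width = max(1, max(len(v) for v in values))
            data_api.addVarStr(name, width)
        else:
            data_api.addVarDouble(name)
        # obs=None and selectvar=None store into all observations,
        # mirroring the Data.store calls shown above.
        data_api.store(name, None, values, None)


# Inside a Stata python block, assuming the dowjones DataFrame from above:
#   from sfi import Data
#   copy_columns(Data, {"dowdate": dowjones["dowdate"],
#                       "dowclose": dowjones["Close"]})
```

For the dowdate column the computed width is 10, the same value passed to Data.addVarStr in the session above.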
    . list in 1/5, abbreviate(9)

         +-----------------------------------+
         |    dowdate   dowclose   dowvolume |
         |-----------------------------------|
      1. | 2019-12-31   28538.44     1.9e+08 |
      2. | 2019-12-30   28462.14     1.8e+08 |
      3. | 2019-12-27   28645.26     1.8e+08 |
      4. | 2019-12-26   28621.39     1.6e+08 |
      5. | 2019-12-24   28515.45     8.6e+07 |
         +-----------------------------------+
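The volumes in this listing are on the order of 10^8, which is far outside the range of Stata's int storage type (-32,767 to 32,740) but within the range of long. In the session above the values were stored and listed without complaint, so the following is only a sketch of how one might choose the integer type explicitly before creating the variable, for example calling Data.addVarLong rather than Data.addVarInt. The function name narrowest_int_type is hypothetical and not part of sfi; the ranges are Stata's documented limits for byte, int, and long.

```python
# Ranges of Stata's integer storage types. The top codes of each type are
# reserved for missing values, so the ranges are not symmetric.
STATA_INT_RANGES = [
    ("byte", -127, 100),
    ("int", -32_767, 32_740),
    ("long", -2_147_483_647, 2_147_483_620),
]


def narrowest_int_type(values):
    """Return the narrowest Stata integer type that can hold every value,
    or "double" if the values do not fit in a long."""
    lo, hi = min(values), max(values)
    for name, type_min, type_max in STATA_INT_RANGES:
        if type_min <= lo and hi <= type_max:
            return name
    return "double"


# Two of the daily volumes shown in the listing above.
print(narrowest_int_type([193336533, 86151979]))  # long
```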
Next, we convert dowdate from a string to a Stata date stored in a new variable, date, and rescale the volume to millions of shares:

    . generate date = date(dowdate,"YMD")
    . format %tdCCYY-NN-DD date
    . replace dowvolume = dowvolume/1000000
    (2,516 real changes made)
    . format %10.2fc dowvolume
    . label variable dowvolume "DJIA Volume (Millions of Shares)"
    . list in 1/5, abbreviate(9)

         +------------------------------------------------+
         |    dowdate   dowclose   dowvolume         date |
         |------------------------------------------------|
      1. | 2019-12-31   28538.44      193.34   2019-12-31 |
      2. | 2019-12-30   28462.14      181.51   2019-12-30 |
      3. | 2019-12-27   28645.26      182.18   2019-12-27 |
      4. | 2019-12-26   28621.39      156.03   2019-12-26 |
      5. | 2019-12-24   28515.45       86.15   2019-12-24 |
         +------------------------------------------------+
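Stata's date() function returns a daily date, which is the number of days since 01jan1960; the %td format only changes how that number is displayed. As a rough cross-check outside Stata, the same number can be computed with Python's standard library. The function name stata_daily_date is made up for this illustration.

```python
from datetime import date

# Stata daily dates count days from 01jan1960 (day 0).
STATA_EPOCH = date(1960, 1, 1)


def stata_daily_date(text):
    """Convert a 'YYYY-MM-DD' string to the equivalent Stata daily date number."""
    return (date.fromisoformat(text) - STATA_EPOCH).days


print(stata_daily_date("2019-12-31"))  # 21914
```

Under this convention the first observation's date variable holds 21914, which the format %tdCCYY-NN-DD displays as 2019-12-31.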
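2.3 Plotting

With date on the horizontal axis, we draw dowclose as a line plot against the left-hand y axis and dowvolume as a bar chart against the second (right-hand) y axis. For basic Stata commands, see the lianxh (连享会) open course Stata 33 讲.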
    twoway (line dowclose date, lcolor(green) lwidth(medium))         ///
           (bar dowvolume date, fcolor(blue) lcolor(blue) yaxis(2)),  ///
           title("Dow Jones Industrial Average (2010 - 2019)")        ///
           xtitle("") ytitle("") ytitle("", axis(2))                  ///
           xlabel(, labsize(small) angle(horizontal))                 ///
           ylabel(5000(5000)30000,                                    ///
                  labsize(small) labcolor(green)                      ///
                  angle(horizontal) format(%9.0fc))                   ///
           ylabel(0(500)3000,                                         ///
                  labsize(small) labcolor(blue)                       ///
                  angle(horizontal) axis(2))                          ///
           legend(order(1 "Closing Price" 2 "Volume (millions)")      ///
                  cols(1) position(10) ring(0))
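3. References

Stata/Python integration part 9: Using the Stata Function Interface to copy data from Python to Stata
【Python搞量化】pandas_datareader 经济和金融数据读取API介绍 (Quantitative finance with Python: an introduction to the pandas_datareader API for reading economic and financial data)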
