python:
API = 'https://api.fda.gov/drug/event.json?search='
date = 'receivedate:[20180101+TO+20180105]'
URL = API + date
URL
end

. python:
------------------ python (type end to exit) -----
>>> API = 'https://api.fda.gov/drug/event.json?search='
>>> date = 'receivedate:[20180101+TO+20180105]'
>>> URL = API + date
>>> URL
'https://api.fda.gov/drug/event.json?search=receivedate:[20180101+TO+20180105]'
>>> end
--------------------------------------------------
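Next, we restrict the query to the United States. In the example below, the string country limits the search to events that occurred in the US. We then concatenate the strings API, date, and country and store the result in the string URL. Note that "+AND+" must be placed between date and country.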
python:
API = 'https://api.fda.gov/drug/event.json?search='
date = 'receivedate:[20180101+TO+20180105]'
country = 'occurcountry:"US"'
URL = API + date + "+AND+" + country
URL
end
Although the API call has become more complex, the code remains easy to read.
. python:
------------------ python (type end to exit) -----
>>> API = 'https://api.fda.gov/drug/event.json?search='
>>> date = 'receivedate:[20180101+TO+20180105]'
>>> country = 'occurcountry:"US"'
>>> URL = API + date + "+AND+" + country
>>> URL
'https://api.fda.gov/drug/event.json?search=receivedate:[20180101+TO+20180105]+
> AND+occurcountry:"US"'
>>> end
--------------------------------------------------
In the same way, adding the string drug further restricts the adverse events to those involving the drug Fentanyl.
python:
API = 'https://api.fda.gov/drug/event.json?search='
date = 'receivedate:[20180101+TO+20180105]'
country = 'occurcountry:"US"'
drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
URL = API + date + "+AND+" + country + "+AND+" + drug
URL
end
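Finally, we ask the API to return the number of adverse event reports received on each day. Note that the string data is joined with & rather than +AND+, because count is a separate query parameter and not part of the search expression.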
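The two joining rules (+AND+ between search terms, & before the count parameter) can be captured in a small helper. This is only a minimal sketch: the function name build_openfda_url and its arguments are hypothetical and are not part of openFDA or of any Python package.

```python
API = 'https://api.fda.gov/drug/event.json?search='


def build_openfda_url(search_terms, count=None, api=API):
    """Join search terms with +AND+ and optionally append a count parameter.

    Hypothetical helper for illustration only.
    """
    if not search_terms:
        raise ValueError('at least one search term is required')
    # Search terms belong to the same search expression, so they use +AND+.
    url = api + '+AND+'.join(search_terms)
    # count is a separate query parameter, so it is attached with &.
    if count is not None:
        url += '&count=' + count
    return url


terms = [
    'receivedate:[20180101+TO+20180105]',
    'occurcountry:"US"',
    'patient.drug.openfda.brand_name:"Fentanyl"',
]
URL = build_openfda_url(terms, count='receivedate')
print(URL)
```

The tutorial itself builds the same URL with plain string concatenation, which keeps every step visible.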
python:
API = 'https://api.fda.gov/drug/event.json?search='
date = 'receivedate:[20180101+TO+20180105]'
country = 'occurcountry:"US"'
drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
data = 'count=receivedate'
URL = API + date + "+AND+" + country + "+AND+" + drug + "&" + data
URL
end

. python:
------------------ python (type end to exit) -----
>>> API = 'https://api.fda.gov/drug/event.json?search='
>>> date = 'receivedate:[20180101+TO+20180105]'
>>> country = 'occurcountry:"US"'
>>> drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
>>> data = 'count=receivedate'
>>> URL = API + date + "+AND+" + country + "+AND+" + drug + "&" + data
>>> URL
'https://api.fda.gov/drug/event.json?search=receivedate:[20180101+TO+20180105]+
> AND+occurcountry:"US"+AND+patient.drug.openfda.brand_name:"Fentanyl"&count=re
> ceivedate'
>>> end
--------------------------------------------------
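4. Retrieving data with the API call
We are now ready to send the API call to the openFDA data server. We first import the requests package, then send the URL of the API call with requests.get(), and finally store the returned JSON data in data.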
python:
import requests
API = 'https://api.fda.gov/drug/event.json?search='
date = 'receivedate:[20180101+TO+20180105]'
country = 'occurcountry:"US"'
drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
data = 'count=receivedate'
URL = API + date + "+AND+" + country + "+AND+" + drug + "&" + data
data = requests.get(URL).json()
data
end
We can type data to view its contents, but the output displayed on screen is hard to read, as shown below:
. python:
------------------ python (type end to exit) -----
>>> import requests
>>> API = 'https://api.fda.gov/drug/event.json?search='
>>> date = 'receivedate:[20180101+TO+20180105]'
>>> country = 'occurcountry:"US"'
>>> drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
>>> data = 'count=receivedate'
>>> URL = API + date + "+AND+" + country + "+AND+" + drug + "&" + data
>>> data = requests.get(URL).json()
>>> data
{'meta': {'disclaimer': 'Do not rely on openFDA to make decisions regarding med
> ical care. While we make every effort to ensure that data is accurate, you sh
> ould assume all results are unvalidated. We may limit or otherwise restrict y
> our access to the API in line with our Terms of Service.', 'terms': 'https://
> open.fda.gov/terms/', 'license': 'https://open.fda.gov/license/', 'last_updat
> ed': '2020-10-24'}, 'results': [{'time': '20180101', 'count': 1}, {'time': '2
> 0180102', 'count': 16}, {'time': '20180103', 'count': 20}, {'time': '20180104
> ', 'count': 25}, {'time': '20180105', 'count': 24}]}
>>> end
--------------------------------------------------
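The request above assumes that the server answers successfully. A slightly more defensive variant might set a timeout and check the HTTP status before decoding the body. The sketch below is an illustration, not the tutorial's own code: the function name fetch_json and its http_get hook are hypothetical, and the stand-in response class exists only so that the example runs without network access.

```python
def fetch_json(url, http_get=None, timeout=30):
    """Send a GET request and return the decoded JSON body.

    http_get is a hypothetical hook for testing; it defaults to requests.get.
    """
    if http_get is None:
        import requests  # imported lazily so the sketch loads without it
        http_get = requests.get
    response = http_get(url, timeout=timeout)
    response.raise_for_status()  # with requests, raises on 4xx/5xx replies
    return response.json()


class FakeResponse:
    """Stand-in that mimics the two response methods used above."""

    def __init__(self, payload, ok=True):
        self.payload = payload
        self.ok = ok

    def raise_for_status(self):
        if not self.ok:
            raise RuntimeError('HTTP error')

    def json(self):
        return self.payload


def fake_get(url, timeout=None):
    # Returns a canned payload instead of contacting openFDA.
    return FakeResponse({'results': [{'time': '20180101', 'count': 1}]})


URL = ('https://api.fda.gov/drug/event.json?search='
       'receivedate:[20180101+TO+20180105]&count=receivedate')
data = fetch_json(URL, http_get=fake_get)
print(data)
```

Inside a Stata python block, calling fetch_json(URL) without the http_get argument would send the real request through requests.get().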
To make the output easier to read, we can import the json package and print data with json.dumps(), using the options indent=4 and sort_keys=True:

python:
import requests
import json
API = 'https://api.fda.gov/drug/event.json?search='
date = 'receivedate:[20180101+TO+20180105]'
country = 'occurcountry:"US"'
drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
data = 'count=receivedate'
URL = API + date + "+AND+" + country + "+AND+" + drug + "&" + data
data = requests.get(URL).json()
print(json.dumps(data, indent=4, sort_keys=True))
end

. python:
------------------ python (type end to exit) -----
>>> import requests
>>> import json
>>> API = 'https://api.fda.gov/drug/event.json?search='
>>> date = 'receivedate:[20180101+TO+20180105]'
>>> country = 'occurcountry:"US"'
>>> drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
>>> data = 'count=receivedate'
>>> URL = API + date + "+AND+" + country + "+AND+" + drug + "&" + data
>>> data = requests.get(URL).json()
>>> print(json.dumps(data, indent=4, sort_keys=True))
{
    "meta": {
        "disclaimer": "Do not rely on openFDA to make decisions regarding medic
> al care. While we make every effort to ensure that data is accurate, you shou
> ld assume all results are unvalidated. We may limit or otherwise restrict you
> r access to the API in line with our Terms of Service.",
        "last_updated": "2020-10-24",
        "license": "https://open.fda.gov/license/",
        "terms": "https://open.fda.gov/terms/"
    },
    "results": [
        {
            "count": 1,
            "time": "20180101"
        },
        {
            "count": 16,
            "time": "20180102"
        },
        {
            "count": 20,
            "time": "20180103"
        },
        {
            "count": 25,
            "time": "20180104"
        },
        {
            "count": 24,
            "time": "20180105"
        }
    ]
}
>>> end
--------------------------------------------------
The records we need are stored under the key 'results'. We extract them with data.get('results', []) and store them in fdadata:

python:
import requests
import json
API = 'https://api.fda.gov/drug/event.json?search='
date = 'receivedate:[20180101+TO+20180105]'
country = 'occurcountry:"US"'
drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
data = 'count=receivedate'
URL = API + date + "+AND+" + country + "+AND+" + drug + "&" + data
data = requests.get(URL).json()
fdadata = data.get('results', [])
print(json.dumps(fdadata, indent=4, sort_keys=True))
end

. python:
------------------ python (type end to exit) -----
>>> import requests
>>> import json
>>> API = 'https://api.fda.gov/drug/event.json?search='
>>> date = 'receivedate:[20180101+TO+20180105]'
>>> country = 'occurcountry:"US"'
>>> drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
>>> data = 'count=receivedate'
>>> URL = API + date + "+AND+" + country + "+AND+" + drug + "&" + data
>>> data = requests.get(URL).json()
>>> fdadata = data.get('results', [])
>>> print(json.dumps(fdadata, indent=4, sort_keys=True))
[
    {
        "count": 1,
        "time": "20180101"
    },
    {
        "count": 16,
        "time": "20180102"
    },
    {
        "count": 20,
        "time": "20180103"
    },
    {
        "count": 25,
        "time": "20180104"
    },
    {
        "count": 24,
        "time": "20180105"
    }
]
>>> end
--------------------------------------------------
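Because data.get('results', []) falls back to an empty list, a failed query would pass through silently. The sketch below shows one way to surface such failures. It assumes that openFDA reports problems through a top-level 'error' object containing 'code' and 'message' fields; that shape is an assumption here, and the function name extract_results is hypothetical. The sample payload reuses the counts from the output shown above.

```python
def extract_results(payload):
    """Return the 'results' list, raising if the payload reports an error.

    Assumes error replies look like {'error': {'code': ..., 'message': ...}}.
    """
    if 'error' in payload:
        err = payload['error']
        raise ValueError(f"openFDA error {err.get('code')}: {err.get('message')}")
    return payload.get('results', [])


# Payload with the same 'results' content as the tutorial output.
payload = {
    'meta': {'last_updated': '2020-10-24'},
    'results': [
        {'time': '20180101', 'count': 1},
        {'time': '20180102', 'count': 16},
        {'time': '20180103', 'count': 20},
        {'time': '20180104', 'count': 25},
        {'time': '20180105', 'count': 24},
    ],
}

fdadata = extract_results(payload)
total = sum(row['count'] for row in fdadata)
print(len(fdadata), total)  # 5 86
```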
Next, we import pandas and convert fdadata into a data frame named fda_df with pd.read_json():

python:
import requests
import json
import pandas as pd
API = 'https://api.fda.gov/drug/event.json?search='
date = 'receivedate:[20180101+TO+20180105]'
country = 'occurcountry:"US"'
drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
data = 'count=receivedate'
URL = API + date + "+AND+" + country + "+AND+" + drug + "&" + data
data = requests.get(URL).json()
fdadata = data.get('results', [])
fda_df = pd.read_json(json.dumps(fdadata))
fda_df
end

. python:
------------------ python (type end to exit) -----
>>> import requests
>>> import json
>>> import pandas as pd
>>> API = 'https://api.fda.gov/drug/event.json?search='
>>> date = 'receivedate:[20180101+TO+20180105]'
>>> country = 'occurcountry:"US"'
>>> drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
>>> data = 'count=receivedate'
>>> URL = API + date + "+AND+" + country + "+AND+" + drug + "&" + data
>>> data = requests.get(URL).json()
>>> fdadata = data.get('results', [])
>>> fda_df = pd.read_json(json.dumps(fdadata))
>>> fda_df
       time  count
0  20180101      1
1  20180102     16
2  20180103     20
3  20180104     25
4  20180105     24
>>> end
--------------------------------------------------
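Recent pandas releases (2.1 and later) emit a FutureWarning when read_json() is handed a literal JSON string. Two alternatives are sketched below, using the five records from the output above as sample input. The first builds the data frame directly from the list of dictionaries and converts time to a number; the second keeps read_json() but wraps the string in io.StringIO. Neither is the tutorial's own method.

```python
import io
import json

import pandas as pd

# Sample records copied from the tutorial output.
fdadata = [
    {'time': '20180101', 'count': 1},
    {'time': '20180102', 'count': 16},
    {'time': '20180103', 'count': 20},
    {'time': '20180104', 'count': 25},
    {'time': '20180105', 'count': 24},
]

# Option 1: build the frame from the list of dicts, skipping the JSON round trip.
fda_df = pd.DataFrame(fdadata)
# The API returns time as a string; convert it to a number so that later
# numeric handling of the variable in Stata still applies.
fda_df['time'] = pd.to_numeric(fda_df['time'])

# Option 2: keep read_json() but pass a file-like object instead of a string.
fda_df2 = pd.read_json(io.StringIO(json.dumps(fdadata)))

print(fda_df)
```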
Finally, we use to_stata() to save fda_df as "fentanyl.dta". The option version=118 writes the file in .dta format 118, which Stata 14 and later, including Stata 16, can read.
python:
import requests
import json
import pandas as pd
API = 'https://api.fda.gov/drug/event.json?search='
date = 'receivedate:[20180101+TO+20180105]'
country = 'occurcountry:"US"'
drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
data = 'count=receivedate'
URL = API + date + "+AND+" + country + "+AND+" + drug + "&" + data
data = requests.get(URL).json()
fdadata = data.get('results', [])
fda_df = pd.read_json(json.dumps(fdadata))
fda_df.to_stata('fentanyl.dta', version=118)
end
python:
import requests
import json
import pandas as pd

# Construct the URL for the API call
API = 'https://api.fda.gov/drug/event.json?search='
date = 'receivedate:[20100101+TO+20200101]'
country = 'occurcountry:"US"'
drug = 'patient.drug.openfda.brand_name:"Fentanyl"'
data = 'count=receivedate'
URL = API + date + "+AND+" + country + "+AND+" + drug + "&" + data

# Submit the API data request
data = requests.get(URL).json()

# Extract the 'results' part of the JSON data
fdadata = data.get('results', [])

# Convert the JSON data to a pandas data frame
fda_df = pd.read_json(json.dumps(fdadata))

# Use pandas to write the data frame to a Stata dataset (.dta format 118)
fda_df.to_stata('fentanyl.dta', version=118)

end
use fentanyl.dta, clear
drop index
generate date = mofd(date(string(time, "%8.0f"),"YMD"))
format date %tm
collapse (sum) count, by(date)
tsset date, monthly

twoway (line count date, lcolor(blue) lwidth(medthick)),    ///
       ytitle("Adverse Events Reported to the FDA")          ///
       ylabel(0(2000)8000, angle(horizontal) grid)           ///
       xtitle("")                                            ///
       title("Fentanyl Adverse Events Reported to the FDA")  ///
       caption(Data Source: openFDA, size(small))            ///
       scheme(s1color)
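The date conversion and collapse step in the Stata code above (mofd() followed by collapse (sum) count, by(date)) can be cross-checked in Python with the standard library alone. In this sketch the function name monthly_totals is hypothetical, the January records come from the tutorial output, and the two February records are invented purely to show a second month.

```python
from collections import defaultdict
from datetime import datetime


def monthly_totals(records):
    """Sum daily counts into (year, month) totals, like collapse (sum) by month."""
    totals = defaultdict(int)
    for row in records:
        # time may arrive as a string or a number, so normalise it first.
        day = datetime.strptime(str(row['time']), '%Y%m%d')
        totals[(day.year, day.month)] += row['count']
    return dict(sorted(totals.items()))


records = [
    {'time': '20180101', 'count': 1},
    {'time': '20180102', 'count': 16},
    {'time': '20180103', 'count': 20},
    {'time': '20180104', 'count': 25},
    {'time': '20180105', 'count': 24},
    # Hypothetical values, not real openFDA data:
    {'time': 20180201, 'count': 3},
    {'time': 20180202, 'count': 4},
]

print(monthly_totals(records))  # {(2018, 1): 86, (2018, 2): 7}
```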
6. Conclusion
We have now sent an API call to openFDA, received JSON data in return, and converted it into a Stata dataset.