https://nakedsecurity.sophos.com/
. This is the sample code I am using:
from newspaper import Article
from urllib.parse import urlparse, parse_qs
url = "https://nakedsecurity.sophos.com/2020/01/28/5-ways-to-be-a-bit-safer-this-data-privacy-day/"
toi_article = Article(url, language="en")
# Download the article
toi_article.download()
# Parse the article
toi_article.parse()
# Perform natural language processing (NLP)
toi_article.nlp()
print("Article's Text:")
print(toi_article.text)
I also tried this article's URL in this demo application:
http://newspaper-demo.herokuapp.com/
It says "Error converting HTML to string".
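One way to narrow this down is to fetch the HTML separately and hand it to newspaper, so that the download step and the parse step can be checked independently. The sketch below assumes, without confirmation, that the site may respond differently to non-browser clients. The `BROWSER_UA` string and the helpers `build_request`, `fetch_html`, `parse_with_newspaper`, and `main` are hypothetical names introduced here for illustration; `Article.download(input_html=...)` is the newspaper3k parameter for supplying already-downloaded HTML.

```python
import urllib.request

# Assumption: a generic desktop browser User-Agent, chosen for illustration only.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Build a GET request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=10):
    """Fetch the page with the standard library and return its HTML as text."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def parse_with_newspaper(url, html):
    """Give already-downloaded HTML to newspaper instead of letting it fetch."""
    # Imported lazily so the helpers above work without newspaper installed.
    from newspaper import Article

    article = Article(url, language="en")
    article.download(input_html=html)  # skips newspaper's own HTTP request
    article.parse()
    return article


def main():
    url = (
        "https://nakedsecurity.sophos.com/2020/01/28/"
        "5-ways-to-be-a-bit-safer-this-data-privacy-day/"
    )
    html = fetch_html(url)
    print("Downloaded characters:", len(html))
    print(parse_with_newspaper(url, html).text)
```

Calling `main()` shows whether the page can be fetched at all and, if it can, whether parsing still fails on the same HTML. If the fetch itself fails, the problem lies in the request rather than in newspaper's parser.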