I think this is some kind of bot detection. Although requests_html can render JS, it is not a real browser and cannot fully bypass bot protection.
You can use a library that controls a real browser instead, such as playwright, selenium, or puppeteer.
Here is an example using playwright:
from playwright.sync_api import sync_playwright

URL = 'https://www.otcmarkets.com'

with sync_playwright() as p:
    # Webkit is fastest to start and hardest to detect
    browser = p.webkit.launch(headless=True)
    page = browser.new_page()
    page.goto(URL)
    # Grab the HTML after the browser has rendered the page
    html = page.content()
    print(html)
    browser.close()
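
If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch, not a guaranteed way around the protection: the function name fetch_rendered_html, the SUPPORTED_BROWSERS tuple, and the default timeout are my own choices, and waiting for "networkidle" is an assumption about when the page has finished loading its JS content.

```python
# Hypothetical helper; the names and defaults here are illustrative assumptions.
SUPPORTED_BROWSERS = ("webkit", "firefox", "chromium")


def fetch_rendered_html(url, browser_name="webkit", timeout_ms=30_000):
    """Open url in a real headless browser and return the rendered HTML."""
    if browser_name not in SUPPORTED_BROWSERS:
        raise ValueError(
            f"browser_name must be one of {SUPPORTED_BROWSERS}, got {browser_name!r}"
        )

    # Imported lazily so the argument check above works even without playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # p.webkit / p.firefox / p.chromium are the browser types playwright exposes.
        browser = getattr(p, browser_name).launch(headless=True)
        try:
            page = browser.new_page()
            # Assumption: "networkidle" is a good enough signal that JS rendering is done.
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            return page.content()
        finally:
            # Close the browser even if navigation times out or fails.
            browser.close()
```

You would call it as html = fetch_rendered_html('https://www.otcmarkets.com'). If one engine still gets blocked, passing browser_name="firefox" or "chromium" lets you try another without changing anything else.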