社区所有版块导航
Python
python开源   Django   Python   DjangoApp   pycharm  
DATA
docker   Elasticsearch  
aigc
aigc   chatgpt  
WEB开发
linux   MongoDB   Redis   DATABASE   NGINX   其他Web框架   web工具   zookeeper   tornado   NoSql   Bootstrap   js   peewee   Git   bottle   IE   MQ   Jquery  
机器学习
机器学习算法  
Python88.com
反馈   公告   社区推广  
产品
短视频  
印度
印度  
私信  •  关注

MendelG

MendelG 最近创建的主题
MendelG 最近回复了
4 年前
回复了 MendelG 创建的主题 » Python-沃尔玛的类别名称Web抓取

要打印所有产品(16),您可以尝试使用CSS选择器进行搜索: .collapsible-content > ul a, .sometimes-shown a

在你的例子中:

from selenium import webdriver

driver = webdriver.Chrome()
url = (
    "https://www.walmart.com/browse/snacks-cookies-chips/cookies/976759_976787_1001391"
)
driver.get(url)

search = driver.find_element_by_xpath("//*[@id='Departments']/div/div/ul").text
driver.find_element_by_xpath("//*[@id='Departments']/div/div/button/span").click()

all_departments = [
    link.get_attribute("href")
    for link in driver.find_elements_by_css_selector(
        ".collapsible-content > ul a, .sometimes-shown a"
    )
]

print(len(all_departments))
print(all_departments)

输出:

16
['https://www.walmart.com/browse/food/chocolate-cookies/976759_976787_1001391_4007138', 'https://www.walmart.com/browse/food/cookies/976759_976787_1001391_8331066', 'https://www.walmart.com/browse/food/butter-cookies/976759_976787_1001391_7803640', 'https://www.walmart.com/browse/food/shortbread-cookies/976759_976787_1001391_8026949', 'https://www.walmart.com/browse/food/coconut-cookies/976759_976787_1001391_6970757', 'https://www.walmart.com/browse/food/healthy-cookies/976759_976787_1001391_7466302', 'https://www.walmart.com/browse/food/keebler-cookies/976759_976787_1001391_3596825', 'https://www.walmart.com/browse/food/biscotti/976759_976787_1001391_2224095', 'https://www.walmart.com/browse/food/gluten-free-cookies/976759_976787_1001391_4362193', 'https://www.walmart.com/browse/food/molasses-cookies/976759_976787_1001391_3338971', 'https://www.walmart.com/browse/food/peanut-butter-cookies/976759_976787_1001391_6460174', 'https://www.walmart.com/browse/food/pepperidge-farm-cookies/976759_976787_1001391_2410932', 'https://www.walmart.com/browse/food/snickerdoodle-cookies/976759_976787_1001391_8926167', 'https://www.walmart.com/browse/food/sugar-free-cookies/976759_976787_1001391_5314659', 'https://www.walmart.com/browse/food/tate-s-cookies/976759_976787_1001391_9480535', 'https://www.walmart.com/browse/food/vegan-cookies/976759_976787_1001391_8007359']
4 年前
回复了 MendelG 创建的主题 » 如何在Python中使用if检查bs4元素的类型?

您还必须导入 Tag 连同 BeautifulSoup :

from bs4 import BeautifulSoup, Tag

检查你的元素是否正确 type() 标签 :

if type(bs4_element_var) == Tag:
    print("bs4_element_var's type is bs4.element.Tag")

if isinstance(bs4_element_var, Tag):
    print("bs4_element_var's type is bs4.element.Tag")