How to click "show more" on Myntra using Selenium in Python

Dhruv Darda • 3 years ago • 1175 views

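I want to extract the specifications and the "Complete the look" text from a Myntra product page. Both are visible only after clicking "show more". I wrote the following code for this: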

url = 'https://www.myntra.com/kurtas/jompers/jompers-men-yellow-printed-straight-kurta/11226756/buy'

df = pd.DataFrame(columns=['name','title','price','description','Size & fit','Material & care', 'Complete the look'])
metadata = dict.fromkeys(['name','title','price','description','Size & fit','Material & care', 'Complete the look'])
from selenium.common.exceptions import NoSuchElementException

driver = webdriver.Chrome('chromedriver')
specs = dict()
for i in range(1): #len(links)
    driver.get(url)
    try:
        metadata['title'] = driver.find_element_by_class_name('pdp-title').get_attribute("innerHTML")
        metadata['name'] = driver.find_element_by_class_name('pdp-name').get_attribute("innerHTML")
        metadata['price'] = driver.find_element_by_class_name('pdp-price').find_element_by_xpath('./strong').get_attribute("innerHTML")
        metadata['description'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[8]/div/div[1]/p').text
        #metadata['Specifications'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[1]/div[1]/div[1]').text
        if driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[2]'):
            print('yes')
            element = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[2]')
            element.click()
        for i in range(1,20):
            try:
                specs[driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[1]/div[{}]/div[1]'.format(i)).text] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[1]/div[{}]/div[2]'.format(i)).text
            except:
                break
        metadata['Complete the look'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[8]/div/div[4]/div[2]/div/p/p').text
    except NoSuchElementException:  
        pass
    df = df.append(metadata, ignore_index=True)

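I get a "yes" in the output, which I assume means the "show more" option was clicked, but the "Complete the look" column of my DataFrame contains None. How can I get the details hidden behind "show more"? The section has the following markup: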

    <div class="index-sizeFitDesc">
<h4 class="index-sizeFitDescTitle index-product-description-title" style="padding-bottom: 12px;">Specifications</h4>
<div class="index-tableContainer">
<div class="index-row">
<div class="index-rowKey">Sleeve Length</div>
<div class="index-rowValue">Long Sleeves</div>
</div><div class="index-row">
<div class="index-rowKey">Shape</div>
<div class="index-rowValue">Straight</div>
</div><div class="index-row">
<div class="index-rowKey">Neck</div>
<div class="index-rowValue">Mandarin Collar</div>
</div><div class="index-row">
<div class="index-rowKey">Print or Pattern Type</div>
<div class="index-rowValue">Geometric</div>
</div><div class="index-row">
<div class="index-rowKey">Design Styling</div>
<div class="index-rowValue">Regular</div></div>
<div class="index-row">
<div class="index-rowKey">Slit Detail</div>
<div class="index-rowValue">Side Slits</div>
</div><div class="index-row">
<div class="index-rowKey">Length</div>
<div class="index-rowValue">Above Knee</div>
</div><div class="index-row">
<div class="index-rowKey">Hemline</div>
<div class="index-rowValue">Curved</div></div></div>
<div class="index-showMoreText">See More</div></div>

Replies [ 2 ] | Latest reply 3 years ago

• Reply 1

pmadhu 3 years ago

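The Specifications block inside Product Details is made up of several sections.

It is better to extract the details from each of these sections one at a time.

It is also better to use relative XPaths for the elements, since absolute paths break as soon as the page layout shifts.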

url = 'https://www.myntra.com/kurtas/jompers/jompers-men-yellow-printed-straight-kurta/11226756/buy'

# df = pd.DataFrame(columns=['name','title','price','description','Size & fit','Material & care', 'Complete the look'])
metadata = dict.fromkeys(['name','title','price','description','Size & fit','Material & care','Specifications', 'Complete the look'])
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

driver = webdriver.Chrome('chromedriver')
specs = dict()
specfication = []
for i in range(1): #len(links)
    driver.get(url)
    try:
        metadata['title'] = driver.find_element_by_class_name('pdp-title').get_attribute("innerHTML")
        metadata['name'] = driver.find_element_by_class_name('pdp-name').get_attribute("innerHTML")
        metadata['price'] = driver.find_element_by_class_name('pdp-price').find_element_by_xpath('./strong').get_attribute("innerHTML")

        # Details were extracted even without scrolling, but it would be better to scroll down.
        driver.execute_script("arguments[0].scrollIntoView(true);",driver.find_element_by_xpath("//div[@class='pdp-productDescriptorsContainer']"))
        metadata['description'] = driver.find_element_by_xpath("//p[@class='pdp-product-description-content']").text
        #metadata['Specifications'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[1]/div[1]/div[1]').text
        if driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[2]'):
            print('yes')
            element = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[7]/div/div[4]/div[2]')
            element.click()
        metadata['Size & fit'] = driver.find_element_by_xpath("//h4[contains(text(),'Size')]/following-sibling::p").text
        metadata['Material & care']=driver.find_element_by_xpath("//h4[contains(text(),'Material')]/following-sibling::p").text

        # from Sleeve Length to Hemline
        specn1 = driver.find_elements_by_xpath("//div[@class='index-sizeFitDesc']/div[1]/div")
        for spec in specn1:
            key = spec.find_element_by_xpath("./div[@class='index-rowKey']").text
            value = spec.find_element_by_xpath("./div[@class='index-rowValue']").text
            specfication.append([key,value])

        #from Colour Family to Occasion
        specn2 = driver.find_elements_by_xpath("//div[@class='index-sizeFitDesc']/div[2]/div[1]/div")
        for spec in specn2:
            key = spec.find_element_by_xpath("./div[@class='index-rowKey']").text
            value = spec.find_element_by_xpath("./div[@class='index-rowValue']").text
            specfication.append([key, value])

        metadata['Specifications'] = specfication
        metadata['Complete the look'] = driver.find_element_by_xpath("//h4[contains(text(),'Complete')]/following-sibling::p").text
        # metadata['Complete the look'] = driver.find_element_by_xpath('//*[@id="mountRoot"]/div/div/div/main/div[2]/div[2]/div[8]/div/div[4]/div[2]/div/p/p').text
    except Exception as e:
        print(e)
        pass
for key,value in metadata.items():
    print(f"{key} : {value}")
    # df = df.append(metadata, ignore_index=True)

yes
name : Men Yellow Printed Straight Kurta
title : Jompers
price : Rs. 892
description : Yellow printed straight kurta, has a mandarin collar, long sleeves, straight hem, and side slits
Size & fit : The model (height 6') is wearing a size M
Material & care : Material: Cotton
Hand Wash
Specifications : [['Sleeve Length', 'Long Sleeves'], ['Shape', 'Straight'], ['Neck', 'Mandarin Collar'], ['Print or Pattern Type', 'Solid'], ['Design Styling', 'Regular'], ['Slit Detail', 'Side Slits'], ['Length', 'Knee Length'], ['Hemline', 'Straight'], ['Colour Family', 'Bright'], ['Weave Pattern', 'Regular'], ['Weave Type', 'Machine Weave'], ['Occasion', 'Daily']]
Complete the look : Sport this classic kurta from Jompers this season. Achieve a comfortably chic look for your next dinner party or family outing when you team this yellow piece with slim trousers and minimal flair.
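
The relative-XPath idea used in this reply can be tried offline, without a browser. The following is only a minimal sketch: it runs the standard library's `xml.etree.ElementTree` (which supports a subset of XPath) against a trimmed copy of the markup from the question. `SNIPPET` and `parse_specs` are hypothetical names introduced for illustration, and ElementTree is a stand-in here, not what Selenium uses internally.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the markup posted in the question (two rows only).
SNIPPET = """
<div class="index-sizeFitDesc">
  <h4 class="index-sizeFitDescTitle">Specifications</h4>
  <div class="index-tableContainer">
    <div class="index-row">
      <div class="index-rowKey">Sleeve Length</div>
      <div class="index-rowValue">Long Sleeves</div>
    </div>
    <div class="index-row">
      <div class="index-rowKey">Shape</div>
      <div class="index-rowValue">Straight</div>
    </div>
  </div>
  <div class="index-showMoreText">See More</div>
</div>
"""


def parse_specs(markup):
    """Return {key: value} for every index-row, located by class rather than by position."""
    root = ET.fromstring(markup.strip())
    specs = {}
    # Relative path: any index-row below the root, wherever it sits in the tree.
    for row in root.findall(".//div[@class='index-row']"):
        key = row.find("div[@class='index-rowKey']")
        value = row.find("div[@class='index-rowValue']")
        if key is not None and value is not None:
            specs[key.text.strip()] = value.text.strip()
    return specs


specs = parse_specs(SNIPPET)
print(specs)
```

Because the rows are matched by class name, the lookup does not depend on how many `div` wrappers sit above the table, which is the weakness of the long `//*[@id="mountRoot"]/div/div/...` paths in the question.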

• Reply 2

cruisepandey 3 years ago

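I have not read through all the code you wrote, but to click "show more" I tried the code below. You can probably work it into your existing code.

We have to scroll to that particular element so that Selenium knows its exact location.
I used a JS .click() to click "show more".

Sample code: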

driver = webdriver.Chrome(driver_path)
driver.maximize_window()
#driver.implicitly_wait(50)
wait = WebDriverWait(driver, 20)
driver.get("https://www.myntra.com/kurtas/jompers/jompers-men-yellow-printed-straight-kurta/11226756/buy")
ele = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CSS_SELECTOR, "div.index-showMoreText")))
driver.execute_script("arguments[0].scrollIntoView(true);", ele)
ActionChains(driver).move_to_element(ele).perform()
driver.execute_script("arguments[0].click();", ele)
Complete_The_Look = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p.index-product-description-content"))).text
print(Complete_The_Look)

Imports:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains

Output:

Sport this classic kurta from Jompers this season. Achieve a comfortably chic look for your next dinner party or family outing when you team this yellow piece with slim trousers and minimal flair.
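
This reply reads the text through an explicit wait instead of immediately after the click, which is likely why it succeeds where the code in the question got None. As a rough illustration of the idea only (this is not Selenium's implementation), a polling wait can be sketched in plain Python. `wait_until`, `text_after_click`, and `attempts` are hypothetical names, and the fake condition stands in for a page whose text appears a moment after the click.

```python
import time


def wait_until(condition, timeout=5.0, poll=0.1):
    """Call `condition` repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Stand-in for a page whose text is only available on the third look.
attempts = {"count": 0}


def text_after_click():
    attempts["count"] += 1
    return "Sport this classic kurta" if attempts["count"] >= 3 else ""


found = wait_until(text_after_click, timeout=2.0, poll=0.01)
print(found, attempts["count"])
```

A single read straight after `element.click()` corresponds to calling the condition once. The wait keeps asking until the content is actually there or the timeout expires.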
