The code below scrapes data from a web page. I need to combine its output with the output of another list.
import re

# contents and li are defined earlier in the script (not shown here).
list2 = []
# Collect the text of every span tag found in contents.
for content in contents:
    for span in content.findAll('span'):
        # Keep letters only; a span with no letters becomes an empty string.
        alphachar = re.sub('[^a-zA-Z]+', '', span.text)
        if alphachar:  # skip empty results
            list2.append(alphachar)

# Print each entry of li next to the matching company name.
for a, b in zip(li, list2):
    print(a, b)
The output of the above code is:
AMD AdvancedMicroDevicesInc
BAC BankofAmericaCorp
GE GeneralElectricCo
F FordMotorCo
M MacysInc
PFE PfizerInc
FCX FreeportMcMoRanInc
BMY BristolMyersSquibbCo
T ATTInc
JWN NordstromInc
JWN NordstromInc
M MacysInc
LB LBrandsInc
GPS GapInc
SJM JMSmuckerCo
CPRI CapriHoldingsLtd
RL RalphLaurenCorp
BIIB BiogenInc
FCX FreeportMcMoRanInc
ADS AllianceDataSystemsCorp
Now I have another list called name:
name = allbody.findAll('h3')
Its output is:
Most actives,Gainers
Now I want the output to be:
- Most actives
AMD AdvancedMicroDevicesInc
BAC BankofAmericaCorp
GE GeneralElectricCo
F FordMotorCo
M MacysInc
PFE PfizerInc
FCX FreeportMcMoRanInc
BMY BristolMyersSquibbCo
T ATTInc
JWN NordstromInc
- Gainers
JWN NordstromInc
M MacysInc
LB LBrandsInc
GPS GapInc
SJM JMSmuckerCo
CPRI CapriHoldingsLtd
RL RalphLaurenCorp
BIIB BiogenInc
FCX FreeportMcMoRanInc
ADS AllianceDataSystemsCorp
I tried using nested for loops over name together with the zip function, but it did not work. Can anyone help?
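One possible way to get that grouping is sketched below. It assumes that every heading covers the same number of rows (ten per heading in the output above), which may not hold for other pages. The helper `group_rows` and the shortened sample data are hypothetical and only for illustration; in the real script the headings would presumably come from something like `[h.text for h in name]`, and the rows from `zip(li, list2)`.

```python
from itertools import islice


def group_rows(headings, rows, size):
    """Pair each heading with the next `size` rows, in order.

    Hypothetical helper: assumes all sections have the same length.
    """
    it = iter(rows)
    return [(heading, list(islice(it, size))) for heading in headings]


# Shortened sample data based on the output above (three rows per section).
headings = ["Most actives", "Gainers"]
li = ["AMD", "BAC", "GE", "JWN", "M", "LB"]
list2 = [
    "AdvancedMicroDevicesInc", "BankofAmericaCorp", "GeneralElectricCo",
    "NordstromInc", "MacysInc", "LBrandsInc",
]

rows = list(zip(li, list2))
size = len(rows) // len(headings)  # assumption: equal-sized sections
grouped = group_rows(headings, rows, size)

for heading, chunk in grouped:
    print("-", heading)
    for ticker, company in chunk:
        print(ticker, company)
```

A single shared iterator is consumed chunk by chunk, so each heading picks up where the previous one stopped. If the sections can differ in length, the grouping would have to follow the page structure instead, for example by collecting the spans that belong to each h3 separately.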