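This is a simple approach: scan all the bookmarks to find the matching object, then scan each page until one matches that same object. It is probably not the most elegant way, but it should get the job done.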
from PyPDF2 import PdfFileReader

# Raw string so the backslashes in the Windows path are taken literally
reader = PdfFileReader(r'D:\Downloads\Sample.pdf')

# Scan the outlines for the bookmark titled 'KYC'
outlines = reader.outlines
print(outlines)
mypage = None
for bookmark in outlines:
    if isinstance(bookmark, list):
        continue  # nested list of child bookmarks; only top-level entries are checked here
    print(bookmark['/Title'])
    print(bookmark['/Page'])
    if bookmark['/Title'] == 'KYC':
        mypage = bookmark['/Page']

# Scan the pages looking for the one that matches the bookmark's page object
print(reader.getNumPages())
for x in range(reader.getNumPages()):
    apage = reader.getPage(x)
    print(apage)
    if apage == mypage:
        print('Eureka on page', x + 1)
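
If the bookmark you want sits under another bookmark, note that reader.outlines can contain nested lists (the children of the preceding entry), so a flat loop will not see it. Below is a minimal sketch of a recursive search. The helper name find_bookmark is hypothetical, and the sample data uses plain dicts as a stand-in for real PyPDF2 outline entries, so treat it as an illustration rather than a drop-in solution.

```python
def find_bookmark(outlines, title):
    """Return the first outline entry whose '/Title' equals title, or None."""
    for item in outlines:
        if isinstance(item, list):
            # A nested list holds the children of the preceding bookmark.
            found = find_bookmark(item, title)
            if found is not None:
                return found
        elif item['/Title'] == title:
            return item
    return None


# Stand-in data shaped like reader.outlines (plain dicts, not real PyPDF2 objects)
sample = [
    {'/Title': 'Intro', '/Page': 'page-0'},
    [{'/Title': 'KYC', '/Page': 'page-3'}],
]
match = find_bookmark(sample, 'KYC')
print(match['/Page'] if match else 'not found')
```

With a real file you would pass reader.outlines instead of sample and then compare the returned entry's '/Page' against each page as in the loop above. Older PyPDF2 releases also have reader.getDestinationPageNumber(bookmark), which may let you skip the page scan if your version provides it.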