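This post is a small web-scraping exercise: retrieving data that is hidden in a page's source code (not loaded via AJAX or JavaScript), using the air quality site aqistudy.cn as the example.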
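Note: peel the site back one layer at a time. Do not jump straight to the target page in a single request; start from the index page and click through to it.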
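from selenium import webdriver
from selenium.webdriver.common.by import By
import requests  # only needed for the commented-out direct request below
import time

# Configure a headless Chrome session
option = webdriver.ChromeOptions()
option.add_argument("disable-infobars")
option.add_argument("--headless")

# `options=` replaces the deprecated `chrome_options=` keyword
driver = webdriver.Chrome(options=option)

# Layer 1: open the index page first
driver.get("https://www.aqistudy.cn/historydata/")
driver.maximize_window()
time.sleep(2)

# Layer 2: click through to the monthly data page for 深圳 (Shenzhen)
# find_element(By.XPATH, ...) replaces the removed find_element_by_xpath
driver.find_element(By.XPATH, '//div[@class="bottom"]//a[@href="monthdata.php?city=深圳"]').click()
time.sleep(3)
content = driver.page_source
# print(content)

# Direct request to the target page (the URL encodes city=上海), left disabled:
# response = requests.get("https://www.aqistudy.cn/historydata/monthdata.php?city=%E4%B8%8A%E6%B5%B7")
# content = response.content.decode("utf-8")

# Save the rendered page source for inspection
with open("test.txt", "w", encoding="utf-8") as f:
    f.write(content)
# print(content)

driver.quit()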
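The script uses the city name in two forms: as raw Chinese text inside the XPath (`monthdata.php?city=深圳`) and percent-encoded inside the commented-out URL (`city=%E4%B8%8A%E6%B5%B7`, which decodes to 上海). The sketch below shows one way the two forms could be derived from a city name. The helper names `month_data_href`, `month_data_url`, and `city_link_xpath` are hypothetical and not part of the original script, and the sketch assumes the site keeps the link pattern used above.

```python
from urllib.parse import quote, unquote

# Index page used by the script above
BASE_URL = "https://www.aqistudy.cn/historydata/"


def month_data_href(city: str) -> str:
    """Relative href in the form the script's XPath matches (hypothetical helper)."""
    return f"monthdata.php?city={city}"


def month_data_url(city: str) -> str:
    """Absolute URL with the city name percent-encoded as UTF-8 (hypothetical helper)."""
    return f"{BASE_URL}monthdata.php?city={quote(city)}"


def city_link_xpath(city: str) -> str:
    """XPath for the city's link, in the same shape the script uses (hypothetical helper)."""
    return f'//div[@class="bottom"]//a[@href="{month_data_href(city)}"]'


if __name__ == "__main__":
    print(month_data_url("上海"))          # percent-encoded form for a direct request
    print(unquote("%E4%B8%8A%E6%B5%B7"))   # decodes the city in the commented-out URL
    print(city_link_xpath("深圳"))         # XPath to pass to driver.find_element
```

With helpers like these, switching cities would only require changing the argument, for example `driver.find_element(By.XPATH, city_link_xpath("上海")).click()`.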