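Python web scraping in practice (7): fetching the hot-list ranking from a certain IT news site. This article walks through a small scraper that requests the ranking page, parses it, and saves the result to an Excel file.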
1. Required libraries
import requests
from bs4 import BeautifulSoup
import pandas as pd
2. Request the ranking page
def fetch_ranking_data():
    url = "https://m.ithome.com/rankm/"
    response = requests.get(url)
    # Return the raw page bytes on success, otherwise report the status code
    if response.status_code == 200:
        return response.content
    else:
        print(f"Error fetching data. Status code: {response.status_code}")
        return None
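The request above has no timeout and does not catch network errors, so a stalled connection can hang the script. The following is a minimal sketch of a more defensive variant, not the article's own method. The function name `fetch_ranking_data_safe`, the `session` and `timeout` parameters, and the User-Agent string are all illustrative assumptions.

```python
import requests

RANKING_URL = "https://m.ithome.com/rankm/"


def fetch_ranking_data_safe(url=RANKING_URL, session=None, timeout=10):
    """Fetch the ranking page; return its bytes, or None on any request error.

    `session` is a hypothetical hook: pass a requests.Session to reuse
    connections, or any object with a compatible get() method.
    """
    http = session if session is not None else requests
    # Assumed header value, for illustration only
    headers = {"User-Agent": "Mozilla/5.0 (compatible; ranking-demo/0.1)"}
    try:
        response = http.get(url, headers=headers, timeout=timeout)
        # Raises requests.HTTPError for 4xx/5xx responses
        response.raise_for_status()
    except requests.RequestException as exc:
        print(f"Error fetching data: {exc}")
        return None
    return response.content
```

Accepting the session as a parameter also makes the function easy to exercise without touching the network, since a stand-in object can be passed in its place.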
3. Parse the response
def parse_html(html_content):
    soup = BeautifulSoup(html_content, 'html.parser')
    # Each ranked article sits in a div carrying these two classes
    rank_items = soup.find_all('div', class_='placeholder one-img-plc')
    data = []
    for rank_item in rank_items:
        rank_num = rank_item.select_one('.rank-num').text
        title = rank_item.select_one('.plc-title').text
        url = rank_item.select_one('a')['href']
        data.append({'Rank': rank_num, 'Title': title, 'URL': url})
    return data
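The parser above raises an AttributeError as soon as one entry lacks a `.rank-num` or `.plc-title` element, because `select_one` returns None when nothing matches. The sketch below skips incomplete entries instead. The name `parse_html_safe` is hypothetical, and `SAMPLE_HTML` is made-up markup shaped to fit the selectors used in this article, not a copy of the real page.

```python
from bs4 import BeautifulSoup


def parse_html_safe(html_content):
    """Extract rank, title and URL, skipping entries with missing parts."""
    soup = BeautifulSoup(html_content, "html.parser")
    data = []
    # The CSS selector matches both classes regardless of their order
    for item in soup.select("div.placeholder.one-img-plc"):
        rank = item.select_one(".rank-num")
        title = item.select_one(".plc-title")
        link = item.select_one("a[href]")
        if rank is None or title is None or link is None:
            continue  # incomplete entry, ignore it
        data.append({
            "Rank": rank.get_text(strip=True),
            "Title": title.get_text(strip=True),
            "URL": link["href"],
        })
    return data


# Invented sample markup (an assumption about the page structure)
SAMPLE_HTML = """
<div class="placeholder one-img-plc">
  <a href="https://example.com/a">
    <span class="rank-num">1</span>
    <p class="plc-title"> First headline </p>
  </a>
</div>
<div class="placeholder one-img-plc">
  <a href="https://example.com/b"><p class="plc-title">No rank here</p></a>
</div>
"""

rows = parse_html_safe(SAMPLE_HTML)
print(rows)
```

Only the first sample entry is returned, since the second has no rank element. Using `get_text(strip=True)` also trims the whitespace that `.text` would keep.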
4. Write the output file
def create_excel(data):
    df = pd.DataFrame(data)
    # index=False keeps the DataFrame row index out of the spreadsheet
    df.to_excel('ranking_data.xlsx', index=False)
    print("Excel file created successfully.")
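Writing `.xlsx` files with pandas relies on an optional Excel engine such as openpyxl, and `to_excel` raises ImportError when none is installed. As one possible workaround, the sketch below falls back to CSV in that case. The function name `save_ranking` and the `basename` parameter are assumptions for illustration, and the CSV fallback is a swapped-in alternative rather than part of the original script.

```python
import os
import tempfile

import pandas as pd


def save_ranking(data, basename="ranking_data"):
    """Write rows to <basename>.xlsx, or to <basename>.csv if no Excel engine exists."""
    df = pd.DataFrame(data, columns=["Rank", "Title", "URL"])
    try:
        path = f"{basename}.xlsx"
        df.to_excel(path, index=False)
    except ImportError:
        # No Excel engine available: write CSV instead.
        # utf-8-sig adds a BOM so Excel detects the encoding of non-ASCII titles.
        path = f"{basename}.csv"
        df.to_csv(path, index=False, encoding="utf-8-sig")
    return path


# Usage with a made-up row, written to a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    sample = [{"Rank": "1", "Title": "First headline", "URL": "https://example.com/a"}]
    out_path = save_ranking(sample, os.path.join(tmp, "ranking_data"))
    saved_ok = os.path.exists(out_path)
```

Returning the path lets the caller report which format was actually written.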
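5. Results
Calling fetch_ranking_data, parse_html, and create_excel in that order produces ranking_data.xlsx, with one row per ranked article and the columns Rank, Title, and URL.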