Python Web Scraping: Case Study 2: Ctrip Hotel Price Information

2024-08-27 05:48

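This case study may not be very clever. A friend told me his company had asked him to scrape hotel prices from Ctrip. When I looked at the site, I found it awkward to scrape: the city is a required search field and the hotel name is optional, and the URL the search redirects to carries a number after the city. I do not know the rule that maps each city to its number, so I could only target a single city or fall back on something like browser automation, which seemed like a lot of work. The hotel page itself also contains a lot of content that is painful to pick through. I told him this would be troublesome and that the analysis would take a long time. He then explained that his company enters the URL of each hotel's price detail page into a database by hand, and the price data is then read directly from that single page.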

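# coding=utf-8
# Written for Python 2 and a Selenium release that still ships the PhantomJS driver.
import sys
reload(sys)
sys.setdefaultencoding("utf-8")
from selenium import webdriver

# Assume a list of hotel detail-page URLs
urls = ['http://hotels.ctrip.com/hotel/848702.html#ctm_ref=hod_sr_lst_dl_n_2_1']

class Xc():
    def pc(self):
        results = []
        for url in urls:
            driver = webdriver.PhantomJS()
            try:
                driver.get(url)
                # The first line of the room block's text is the room type
                fangx_1 = driver.find_element_by_class_name('room_unfold').text.split('\n')[0]
                jiage_1 = driver.find_element_by_class_name('base_price').text
            finally:
                driver.quit()  # quit is a method, so it must be called
            # Room type and its price
            results.append(fangx_1 + '|' + jiage_1)
        return results  # return after the loop so every URL is processed

s = Xc()
for line in s.pc():
    print line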


Result:
单人房(无窗)|¥237

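The code above is only a simple example, and every room type and price has to be parsed out one element at a time, which is too tedious. I later noticed that near the bottom of the page source there is a block of JSON containing the room types, prices, and so on, so I changed the code: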

# coding=utf-8
# Written for Python 2 and a Selenium release that still ships the PhantomJS driver.
import sys
reload(sys)
sys.setdefaultencoding("utf-8")
from selenium import webdriver

urls = ['http://hotels.ctrip.com/hotel/848702.html#ctm_ref=hod_sr_lst_dl_n_2_1']

class Xc():
    def pc(self):
        results = []
        for url in urls:
            driver = webdriver.PhantomJS()
            try:
                driver.get(url)
                # Instead of looking up room_unfold / base_price for each room,
                # read the whole room list from this element's value attribute
                room_data = driver.find_element_by_xpath('//*[@id="htl_detail_htl_hotel"]').get_attribute('value')
            finally:
                driver.quit()  # quit is a method, so it must be called
            results.append(room_data)
        return results  # return after the loop so every URL is processed

s = Xc()
for room_data in s.pc():
    print room_data


Result:
pageid=102003;ht=848702;checkin=2016-05-09;checkout=2016-05-10;rmlist=[{"rm":"30665921","shadowid":"0","rpfq":"0.0","rpfh":"219","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T","isgift":"F","isgroup":"F"},{"rm":"30265080","shadowid":"0","rpfq":"0.0","rpfh":"263","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},{"rm":"24125027","shadowid":"0","rpfq":"0.0","rpfh":"294","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T","isgift":"F","isgroup":"F"},{"rm":"8684722","shadowid":"0","rpfq":"0.0","rpfh":"294","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"F","isgift":"F","isgroup":"F"},{"rm":"30265081","shadowid":"0","rpfq":"0.0","rpfh":"219","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},{"rm":"8684723","shadowid":"0","rpfq":"0.0","rpfh":"294","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"F","isgift":"F","isgroup":"F"},{"rm":"30265075","shadowid":"0","rpfq":"0.0","rpfh":"237","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"单人床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},{"rm":"24125024","shadowid":"0","rpfq":"0.0","rpfh":"265","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"单人床","breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T","isgift":"F","isgroup":"F"},{"rm":"2890470","shadowid":"0","rpfq":"0.0","rpfh":"265","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"单人床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"F","isgift":"F","isgroup":"F"},{"rm":"30265074","shadowid":"0","rpfq":"0.0","rpfh":"254","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},{"rm":"24125041","shadowid":"0","rpfq":"0.0","rpfh":"284","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T","isgift":"F","isgroup":"F"},{"rm":"2890480","shadowid":"0","rpfq":"0.0","rpfh":"284","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"F","isgift":"F","isgroup":"F"},{"rm":"30265072","shadowid":"0","rpfq":"0.0","rpfh":"280","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},{"rm":"24125016","shadowid":"0","rpfq":"0.0","rpfh":"313","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T","isgift":"F","isgroup":"F"},{"rm":"2525661","shadowid":"0","rpfq":"0.0","rpfh":"313","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"F","isgift":"F","isgroup":"F"},{"rm":"30265079","shadowid":"0","rpfq":"0.0","rpfh":"305","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},{"rm":"2525665","shadowid":"0","rpfq":"0.0","rpfh":"341","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"F","isgift":"F","isgroup":"F"},{"rm":"30265077","shadowid":"0","rpfq":"0.0","rpfh":"305","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"},{"rm":"24125021","shadowid":"0","rpfq":"0.0","rpfh":"341","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T","isgift":"F","isgroup":"F"},{"rm":"8684720","shadowid":"0","rpfq":"0.0","rpfh":"341","pt":"FG","mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"双床","breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T","isgift":"F","isgroup":"F"}]

rpfh is the price
bedtype is the room type
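
To get from that raw string to usable room types and prices, one option is to split off the leading key=value fields and decode the rmlist array as JSON. The following is only a minimal sketch in Python 3, not part of the original script: the helper names parse_detail_value and room_prices are hypothetical, and it assumes that rmlist is the last field, that it is a valid JSON array, and that rpfh always holds a whole number. The sample string reuses two records copied from the output above.

```python
import json


def parse_detail_value(value):
    """Split 'k=v;k=v;rmlist=[...]' into a dict (hypothetical helper).

    Assumes rmlist is the last field and holds a JSON array.
    """
    head, marker, tail = value.partition("rmlist=")
    fields = {}
    for part in head.split(";"):
        key, eq, val = part.partition("=")
        if eq:  # skip the empty piece left by the trailing ';'
            fields[key.strip()] = val.strip()
    rooms = []
    if marker:
        # raw_decode stops at the end of the array and tolerates trailing text
        rooms, _end = json.JSONDecoder().raw_decode(tail)
    fields["rmlist"] = rooms
    return fields


def room_prices(value):
    """Return (bedtype, price) pairs; assumes rpfh is an integer string."""
    rooms = parse_detail_value(value)["rmlist"]
    return [(room["bedtype"], int(room["rpfh"])) for room in rooms]


# Two records copied from the output above
sample = (
    'pageid=102003;ht=848702;checkin=2016-05-09;checkout=2016-05-10;rmlist=['
    '{"rm":"30665921","shadowid":"0","rpfq":"0.0","rpfh":"219","pt":"FG",'
    '"mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"大床",'
    '"breakfast":"0","policy":"免费取消","guaranteetype":"F","bk":"T",'
    '"isgift":"F","isgroup":"F"},'
    '{"rm":"30265075","shadowid":"0","rpfq":"0.0","rpfh":"237","pt":"FG",'
    '"mt":"0.0","pn":"0.0","promotiontype":"0","iscomfirm":"F","bedtype":"单人床",'
    '"breakfast":"0","policy":"不可取消","guaranteetype":"T","bk":"T",'
    '"isgift":"F","isgroup":"F"}]'
)

for bedtype, price in room_prices(sample):
    print(bedtype + "|" + str(price))
```

Each pair has the same room type|price shape as the first script's output, but it covers every record in rmlist without looking up page elements one by one.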




http://www.chinasem.cn/article/1110807