This article is a tutorial on scraping data from the unofficial API endpoints used by the Wind client.
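The official API often does not provide the data you need.
Take the risk-control data shown in the Wind client as an example:
The steps for capturing that data are as follows: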
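1. Open Fiddler and configure the Wind client to use it as a proxy so that its traffic can be captured
Open the risk-control page in the client, then look at the sessions recorded in Fiddler.
The wind.risk.platform/risknews/get_news endpoint is the one that returns the content shown on the risk-control page.
Copy the request parameters (headers and body) of that endpoint into Python code.
Testing shows that the wind.sessionid header carries the authenticated session.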
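One way to run that kind of test is to resend the captured request with one header removed at a time and see which removals make the server reject it. The sketch below is only an illustration of that idea: `find_required_headers` and `fake_server` are hypothetical names, and `fake_server` stands in for the real endpoint so the example runs offline. With the real client you would pass a function that sends the request and checks the response instead.

```python
from typing import Callable, Dict, List


def find_required_headers(
    headers: Dict[str, str],
    is_accepted: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Return the header names whose removal makes the request fail."""
    required = []
    for name in headers:
        # Resend with every header except the one under test
        trimmed = {k: v for k, v in headers.items() if k != name}
        if not is_accepted(trimmed):
            required.append(name)
    return required


def fake_server(sent: Dict[str, str]) -> bool:
    """Hypothetical stand-in: accepts a request only if the session matches."""
    return sent.get("wind.sessionid") == "abc123"


# A few headers in the style of the captured request (session value is made up)
captured = {"Accept": "*/*", "wind-language": "zh-CN", "wind.sessionid": "abc123"}
print(find_required_headers(captured, fake_server))
```

Against the stand-in server only `wind.sessionid` is reported, which mirrors the conclusion reached by manual testing.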
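2. Get the session
Open Cheat Engine (CE) and attach it to the Wind process.
Search for the wind.sessionid value seen earlier in Fiddler.
The entries at the bottom of the result list sit at a fixed offset inside a loaded module (the code below uses CSector.DLL+0x139088).
This means the session can be read directly from memory through that address.
The complete code is as follows: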
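The address arithmetic behind this step can be sketched on its own, without attaching to any process. The helper names `resolve_address` and `decode_session` are hypothetical, the module base `0x10000000` is a made-up value (the real base changes between runs and must be read from the process), and the sketch assumes the session is stored as plain text that may be padded with NUL bytes.

```python
def resolve_address(module_base: int, offset: int) -> int:
    """A static address is the module's load address plus a fixed offset."""
    return module_base + offset


def decode_session(raw: bytes) -> str:
    """Decode the bytes read from memory, dropping anything after the first NUL."""
    return raw.split(b"\x00", 1)[0].decode("utf-8")


# Hypothetical base address; 0x139088 is the offset used in this article
address = resolve_address(0x10000000, 0x139088)
print(hex(address))

# Simulated memory contents standing in for the bytes read from the process
print(decode_session(b"0123456789abcdef0123456789abcdef"))
```

Because only the offset is stable, the module base has to be looked up every time the client is restarted.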
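import pymem

Game = pymem.Pymem("wmain.exe")  # attach to the Wind process


def Get_moduladdr(dll):
    """Return the base address of the named DLL loaded in the Wind process."""
    for module in Game.list_modules():  # iterate over every module loaded by the exe
        if module.name.lower() == dll.lower():
            return module.lpBaseOfDll
    raise RuntimeError(f"module {dll} not found in wmain.exe")


Char_Modlue = Get_moduladdr("CSector.DLL")  # base address of the DLL
# The session is read as 32 bytes of text at a fixed offset from the module base
session = Game.read_bytes(Char_Modlue + 0x139088, 32).decode("utf8")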
import json

import requests

url = "https://114.80.154.45/wind.risk.platform/risknews/get_news"
# Headers copied from the request captured in Fiddler.
# Content-Length is left out: requests sets it from the body.
headers = {
    "Host": "114.80.154.45",
    "Connection": "keep-alive",
    "Accept": "*/*",
    "Content-Type": "application/json;charset=UTF-8",
    "Origin": "https://114.80.154.45",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36",
    "wind-language": "zh-CN",
    "wind.sessionid": session,  # session read from the Wind process memory
    "Referer": "https://114.80.154.45/wind.risk.platform/index.html?lan=cn",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-CN,en-US;q=0.9",
}
# Request body copied from the captured request
body = {
    "pageSize": 30,
    "tagCode": [],
    "areaCode": [],
    "industryCode": [],
    "emotionId": ["7012000001"],
    "companyNature": [],
    "companyCode": [],
    "keywords": [],
    "timeFrom": "2023-05-25T00:00:00Z",
    "timeTo": "2023-05-25T23:59:59Z",
    "sector": ["a001010c00000000"],
    "importanceId": [],
    "filterType": "1",
    "windcodeEnable": True,
    "pageNo": 1,
}
r = requests.post(url, headers=headers, data=json.dumps(body))
print(r.text)
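The captured body hard-codes a single day in `timeFrom` and `timeTo`. To query other days, those two fields can be generated from a date. The sketch below is a minimal illustration: `day_window` and `with_day` are hypothetical helpers, the timestamp format is copied from the captured request, and treating `pageNo` as a page index for pagination is an assumption based on its name, not something confirmed by the capture.

```python
from datetime import date


def day_window(day: date) -> dict:
    """Build timeFrom/timeTo covering one day, in the captured request's format."""
    stamp = day.isoformat()  # e.g. "2023-05-25"
    return {"timeFrom": f"{stamp}T00:00:00Z", "timeTo": f"{stamp}T23:59:59Z"}


def with_day(body: dict, day: date, page_no: int = 1) -> dict:
    """Return a copy of the request body for the given day and page."""
    updated = dict(body)  # shallow copy so the template stays unchanged
    updated.update(day_window(day))
    updated["pageNo"] = page_no  # assumed to select the result page
    return updated


# Only the fields relevant to this example; the other captured fields are unchanged
template = {"pageSize": 30, "timeFrom": "", "timeTo": "", "pageNo": 1}
print(with_day(template, date(2023, 5, 26), page_no=2))
```

The returned dictionary can be passed to `json.dumps` in place of the hard-coded body.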