Ajax in Practice: Weibo

Date: 2020-03-16 10:04:11

Reposted from: 静觅 » [Python3网络爬虫开发实战] 6.3-Ajax结果提取 (Python 3 Web Crawler Development in Practice, 6.3: Extracting Ajax Results)

Notes on a few well-written parts of the code from the source article:

import requests
from urllib.parse import urlencode

base_url = 'https://m.weibo.cn/api/container/getIndex?'

headers = {
    'Host': 'm.weibo.cn',
    'Referer': 'https://m.weibo.cn/u/2830678474',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36',
    'X-Requested-With': 'XMLHttpRequest',
}


def get_page(page):
    params = {
        'type': 'uid',
        'value': '2830678474',
        'containerid': '1076032830678474',
        'page': page
    }

    # The URL is split into a base path and its parameters; urlencode turns the dict into a query string
    url = base_url + urlencode(params)
    try:
        response = requests.get(url, headers=headers)
        # Only a 200 status code is handled; any other status falls through and the function returns None
        if response.status_code == 200:
            # The body is JSON, so call response.json() directly; json.loads(response.content) is not needed
            return response.json()
    except requests.ConnectionError as e:
        print('Error', e.args)
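
To make the URL-building step concrete, here is a small sketch of what `urlencode` produces for the parameters used in `get_page`. It only builds the string and sends no request. The helper name `build_url` and the page number `1` are chosen here for illustration and are not part of the original code.

```python
from urllib.parse import urlencode

base_url = 'https://m.weibo.cn/api/container/getIndex?'


def build_url(page):
    """Hypothetical helper: build the request URL for one page,
    using the same parameters as get_page."""
    params = {
        'type': 'uid',
        'value': '2830678474',
        'containerid': '1076032830678474',
        'page': page,
    }
    # urlencode joins the key/value pairs with '&' and percent-encodes
    # any characters that are not safe in a query string
    return base_url + urlencode(params)


print(build_url(1))
# https://m.weibo.cn/api/container/getIndex?type=uid&value=2830678474&containerid=1076032830678474&page=1
```

Keeping the parameters in a dict means only the `page` value changes between requests, and none of the query string has to be assembled by hand.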

My own code:

import requests

headers = {
    "Referer": "https://m.weibo.cn/u/2830678474?sudaref=cuiqingcai.com&display=0&retcode=6102",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest",
    "X-XSRF-TOKEN": "609539"
}

base_url = "https://m.weibo.cn/api/container/getIndex?sudaref=cuiqingcai.com&display=0&retcode=6102&type=uid&value=2830678474&containerid=1076032830678474"
url = base_url
while True:
    response = requests.get(url, headers=headers)
    try:
        # Parse the JSON body once and reuse it below
        data = response.json()["data"]
        since_id = data["cardlistInfo"]["since_id"]
    except (ValueError, KeyError, TypeError):
        # Stop when the body is not valid JSON or no since_id is present
        break
    # The next page is requested by appending since_id to the same URL
    url = base_url + "&since_id=" + str(since_id)
    for card in data.get("cards", []):
        # Cards without an "mblog" field are not posts; skip them
        mblog = card.get("mblog")
        if mblog and "text" in mblog:
            print(mblog["text"])
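
The `since_id` loop above can also be written as a generator, which separates paging from printing. The following is only a sketch under stated assumptions: `iter_post_texts`, the `fetch` callback, and the `max_pages` safety limit are hypothetical names introduced here, and the fake pages merely imitate the `data` / `cardlistInfo` / `cards` / `mblog` layout that the code above reads. Unlike the loop above, it yields the cards on a page before checking whether a `since_id` is present.

```python
from typing import Callable, Iterator, Optional


def iter_post_texts(fetch: Callable[[Optional[str]], dict],
                    max_pages: int = 50) -> Iterator[str]:
    """Yield post texts page by page, following since_id until it is missing.

    fetch(since_id) must return the decoded JSON of one page;
    it is called with None for the first page.
    """
    since_id = None
    for _ in range(max_pages):  # hard upper bound so the loop always ends
        payload = fetch(since_id)
        data = payload.get("data") or {}
        for card in data.get("cards", []):
            mblog = card.get("mblog")
            # Cards without an "mblog" entry are not posts
            if mblog and "text" in mblog:
                yield mblog["text"]
        since_id = (data.get("cardlistInfo") or {}).get("since_id")
        if not since_id:
            break


# Fake pages standing in for real responses (made-up data, no network access)
fake_pages = {
    None: {"data": {"cardlistInfo": {"since_id": 2},
                    "cards": [{"mblog": {"text": "first"}}, {"card_type": 11}]}},
    "2": {"data": {"cardlistInfo": {},
                   "cards": [{"mblog": {"text": "second"}}]}},
}


def fake_fetch(since_id):
    return fake_pages[None if since_id is None else str(since_id)]


print(list(iter_post_texts(fake_fetch)))
# ['first', 'second']
```

With real requests, `fetch` would wrap `requests.get(...)` and return `response.json()`; passing it in as a parameter keeps the paging logic testable without touching the network.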

Sample output:

每当我颓废的时候,看看这个视频,我就浑身充满了斗志!为了我和我老婆的小米之家!我可以!我能行!加油! <a data-url="http://t.cn/A6hrPmIS" href="https://m.weibo.cn/p/index?containerid=2304444475185156522026&url_type=39&object_type=video&pos=1&luicode=10000011&lfid=1076032830678474" data-hide=""><span class=url-icon><img style=width: 1rem;height: 1rem src=https://h5.sinaimg.cn/upload/2015/09/25/3/timeline_card_small_video_default.png></span><span class="surl-text">崔庆才丨静觅的微博视频</span></a> 
<span class="url-icon"><img alt=[doge] src="//h5.sinaimg.cn/m/emoticon/icon/others/d_doge-861403219c.png" style="width:1em; height:1em;" /></span> 
转发微博
今天我和我老婆都是健康饮食的好仔仔。<span class="url-icon"><img alt=[馋嘴] src="//h5.sinaimg.cn/m/emoticon/icon/default/d_chanzui-01ee2388fd.png" style="width:1em; height:1em;" /></span> 
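
The `text` field in the output above still contains HTML markup (links, emoticon images). One possible way to reduce it to plain text, sketched here with only the standard library, is to collect the text nodes and replace each emoticon `<img>` with its `alt` label. This is not part of the original code; the names `_TextExtractor` and `strip_html` are introduced here for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes; keep the alt label of <img> tags (e.g. [doge])."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are also routed here
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(text):
    """Return the text content of an HTML fragment, without the tags."""
    parser = _TextExtractor()
    parser.feed(text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts).strip()


sample = '<span class="url-icon"><img alt="[doge]" src="x.png" /></span> hello <a href="#">link</a>'
print(strip_html(sample))
# [doge] hello link
```

Applied to each `mblog["text"]` value before printing, this would leave only the readable part of a post.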

Original: https://www.cnblogs.com/waws1314/p/12501707.html
