
Simulating a Login with Python

Date: 2014-03-22 16:02:53

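People often use Python for data mining, but some websites only show their content after you log in. So how do you simulate a login with Python? There are plenty of write-ups on this online, but a while ago I ran into a site that requires a cookie before it will let you log in, and the site also had a few quirks of its own, so it took a lot of effort to get working. I'm posting the result here to share.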

The first version uses the urllib package from the Python 3 standard library, so there are not many details to worry about:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import http.cookiejar
import urllib.error
import urllib.parse
import urllib.request

StudentInfoURL = "http://210.x.x.1:90/student/index.jsp"
loginURL = "http://210.x.x.1:90/login.jsp"
loginCheckURL = "http://210.x.x.1:90/j_security_check"
# urlencode() returns a str; urllib.request needs the POST body as bytes.
post_data = urllib.parse.urlencode({"j_username": "xxxxxxx", "j_password": "xxxxxxx"}).encode("utf-8")
headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36",
}

cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
# The URL must be opened once here, otherwise no cookie is obtained.
opener.open(loginCheckURL)
urllib.request.install_opener(opener)

###################### Handle the exception here; logging in once more is enough ######################
request = urllib.request.Request(loginCheckURL, post_data, headers)
try:
    response = urllib.request.urlopen(request)
except urllib.error.URLError:
    # Retry once if the first attempt raises.
    response = urllib.request.urlopen(request)
print(response.read().decode("GBK"))

###################### Normal access can start now ######################
request = urllib.request.Request(StudentInfoURL, headers=headers)
fp = urllib.request.urlopen(request)
print(fp.read().decode("GBK"))
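The try/except in the script above simply repeats the request once. As a rough sketch of the same idea, the retry can be pulled into a small helper. The names `open_with_retry`, `attempts`, and `opener` are made up for this illustration and are not part of the original script; the helper assumes that retrying on `urllib.error.URLError` (which also covers `HTTPError`) is what the site needs.

```python
import urllib.error
import urllib.request


def open_with_retry(request, attempts=2, opener=urllib.request.urlopen):
    """Call opener(request), trying again when urllib raises URLError.

    Hypothetical helper: `attempts` is the total number of tries and
    `opener` is any callable with the same shape as urlopen().
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return opener(request)
        except urllib.error.URLError as exc:  # HTTPError is a subclass
            last_error = exc
    # Every attempt failed; surface the last error to the caller.
    raise last_error
```

With the installed opener from the script, `response = open_with_retry(request)` would stand in for the try/except block. Passing a different `opener` is mainly useful for trying the helper out without touching the network.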

Below is another version, built on the client module of the lower-level http package. I personally like this version a lot:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import http.client

###########################################################
HOST = "210.x.x.1:90"
UserName = "xxxxxxx"
PassWord = "xxxxxxx"
data = "j_username=%s&j_password=%s" % (UserName, PassWord)
Headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "User-Agent": "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729)",
}
###########################################################

# Connect to the server
conn = http.client.HTTPConnection(HOST, timeout=30)
conn.connect()

# GET the login page to obtain the cookie
conn.request("GET", "/j_security_check", None, Headers)
res = conn.getresponse()
# Keep only the leading name=value pair of the Set-Cookie header
m_cookie = res.getheader("Set-Cookie").split(";")[0]
res.read()

# POST to the login page to log in
Headers["Cookie"] = m_cookie
conn.request("POST", "/j_security_check", data, Headers)
res = conn.getresponse()
res.read()
if res.status == 400:
    # POST to the login page once more
    conn.request("POST", "/j_security_check", data, Headers)
    res = conn.getresponse()
    res.read()
conn.close()

###################### Normal access can start now ######################
conn2 = http.client.HTTPConnection(HOST)
conn2.request("GET", "/student/index.jsp", None, Headers)
fp = conn2.getresponse()
print(fp.status)
print(fp.read().decode("GBK"))
###########################################################
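The http.client version takes the cookie by splitting the `Set-Cookie` header on `;` and keeping the first piece, which is enough when the server sends a single cookie. If a server sends several, one possible alternative is to parse each header value with the standard library's `http.cookies.SimpleCookie`. This is only a sketch: the function name `cookie_header_from_set_cookie` and the sample cookie values are invented for illustration, and nothing here has been run against the site in the post.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_values):
    """Build a Cookie request header from a list of Set-Cookie values.

    Hypothetical helper: attributes such as Path or HttpOnly are
    dropped and only the name=value pairs are kept.
    """
    pairs = []
    for raw in set_cookie_values:
        parsed = SimpleCookie()
        parsed.load(raw)
        for name, morsel in parsed.items():
            # coded_value is the value as it appeared on the wire.
            pairs.append("%s=%s" % (name, morsel.coded_value))
    return "; ".join(pairs)
```

In the script this might be used as `Headers["Cookie"] = cookie_header_from_set_cookie(res.headers.get_all("Set-Cookie") or [])`, since `get_all` returns every `Set-Cookie` header as a separate list entry.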

Feedback and criticism are welcome.


Original post: http://www.cnblogs.com/oOXuOo/p/3617385.html
