Crawler Part 1: Scraping Butian Vendor Data

Date: 2016-01-29 20:32:17

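# coding: utf-8
# Scrape vendor entries from the Butian vendor list pages and append them to 360.txt.
# Written for Python 3 (urllib.request and print() replace the Python 2 forms).
import re
import urllib.request


def gethtml(url):
    # Download the page and decode it to text (UTF-8 is an assumption here).
    page = urllib.request.urlopen(url)
    html = page.read().decode('utf-8', 'ignore')
    return html


def getlink(html):
    # Capture the contents of each left-aligned table cell on the list page.
    link = re.findall(r'<td  align="left" style="padding-left:20px;">(.*?)</td>', html)
    return link


def save(links):
    # Append one entry per line; the with-block closes the file when done.
    with open('360.txt', 'a', encoding='utf-8') as f:
        for i in links:
            f.write(i + "\n")


for page in range(11, 200):
    url = "https://butian.360.cn/company/lists/page/" + str(page)
    html = gethtml(url)
    print(str(page) + "ye")  # "ye" is pinyin for "page"
    links = getlink(html)
    print(links)
    save(links)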
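
To check the extraction step without touching the network, the pattern from getlink can be run against a small hand-written HTML fragment. The sketch below is only an illustration: the sample markup and the vendor names in it are made up for this example, and the real page layout may differ or may have changed since the script was written.

```python
import re

# Same pattern as in getlink above (note the two spaces after "<td").
CELL_PATTERN = r'<td  align="left" style="padding-left:20px;">(.*?)</td>'


def getlink(html):
    # Return the text of every matching table cell, in document order.
    return re.findall(CELL_PATTERN, html)


# Hypothetical sample markup, written by hand for this example only.
SAMPLE_HTML = (
    '<tr><td  align="left" style="padding-left:20px;">Example Vendor A</td>'
    '<td>2016-01-01</td></tr>\n'
    '<tr><td  align="left" style="padding-left:20px;">Example Vendor B</td>'
    '<td>2016-01-02</td></tr>\n'
)

if __name__ == "__main__":
    # Only the cells carrying the exact attributes in the pattern are captured.
    print(getlink(SAMPLE_HTML))
```

Because the pattern is tied to the exact attribute text and spacing, any change in the page markup makes it return an empty list, which is a useful first thing to check when 360.txt stays empty.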


Original: http://www.cnblogs.com/dongchi/p/5169287.html
