[py analysis]

Date: 2014-01-15 23:57:10

pyQuery

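pyQuery is a Python implementation of jQuery. It lets you parse and manipulate HTML documents with jQuery-style syntax, which is very convenient. It has to be installed before use: run easy_install pyquery, or on Ubuntu: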

sudo apt-get install python-pyquery

Here is an example:

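from pyquery import PyQuery as pyq

# Fetch and parse the category page (Python 3 print syntax)
doc = pyq(url=r'http://list.taobao.com/browse/cat-0.htm')
cts = doc('.market-cat')

for i in cts:
    # Heading of the current category block
    print('====', pyq(i).find('h4').text(), '====')
    # Sub-categories, printed on one line separated by spaces
    for j in pyq(i).find('.sub'):
        print(pyq(j).text(), end=' ')
    print('\n')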

--------------- my code --------------------

for i in cts:
    print('-' * 10, pyq(i).find('h4').text())
    for j in pyq(i).find('.subtitle'):
        print(pyq(j).text())
    print('\n')
    for j in pyq(i).find('.sublist'):
        print('\t', pyq(j).text())
    print('\n')

------------------------------------------------

You can use the PyQuery class to load an XML document from a string, an lxml document, a file, or a URL:

>>> from pyquery import PyQuery as pq
>>> from lxml import etree
>>> from urllib.request import urlopen
>>> d = pq("<html></html>")
>>> d = pq(etree.fromstring("<html></html>"))
>>> d = pq(url=your_url)
>>> d = pq(url=your_url,
...        opener=lambda url, **kw: urlopen(url).read())
>>> d = pq(filename=path_to_html_file)

Traversing

Most jQuery traversing methods are supported. Here are some examples.

You can filter a selection with a string selector:

>>> d('p').filter('.hello')
[<p#hello.hello>]

You can also pick a single element by index with the eq method:

>>> d('p').eq(0)
[<p#hello.hello>]

You can also find nested elements:

>>> d('p').find('a')
[<a>, <a>]
>>> d('p').eq(1).find('a')
[<a>]

The end method steps back out of one level of traversal and returns the previous selection:

>>> d('p').find('a').end()
[<p#hello.hello>, <p#test>]
>>> d('p').eq(0).end()
[<p#hello.hello>, <p#test>]
>>> d('p').filter(lambda i: i == 1).end()
[<p#hello.hello>, <p#test>]

Original: http://www.cnblogs.com/lizunicon/p/3515983.html
