Processing an HTML page with the BeautifulSoup4 library

Date: 2020-05-13 20:37:52

Code:

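from bs4 import BeautifulSoup

# Sample HTML page. The Chinese text is left untranslated because the script prints it below.
r = '''<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>菜鸟教程（runoob.com)</title>
</head>
<body>
         <h1>我的第一个标题</h1>
         <p id="first">我的第一个段落。</p>
         <table border="1">
         <tr>
                  <td>row 1, cell 1</td>
                  <td>row 1, cell 2</td>
         </tr>
         <tr>
                  <td>row 2, cell 1</td>
                  <td>row 2, cell 2</td>
         </tr>
         </table>
</body>
</html>'''

# Name the parser explicitly; html.parser ships with Python and avoids the "no parser specified" warning.
soup = BeautifulSoup(r, "html.parser")
print("Contents of the head tag:")
print(soup.head)
print("My student ID: 3029")
print("Contents of the body tag:")
print(soup.body)
print("Tags with id 'first':")
print(soup.find_all(id="first"))  # find_all returns a list of matching tags
print("Chinese text in the HTML page:")
print(soup.title.string)
print(soup.h1.string)
print(soup.p.string)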

Result:

(Screenshot of the program output; the image was not preserved.)
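
The script above prints the Chinese text by reading `.string` from three tags it already knows about. As a more general alternative, the following is a minimal sketch that pulls every run of Chinese characters out of the page text with a regular expression. The shortened `html` sample, the `CJK_RUN` pattern, and the `chinese_runs` helper are illustrative assumptions, not part of the original post, and the pattern only covers the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), so punctuation such as "。" is skipped.

```python
import re
from bs4 import BeautifulSoup

# Shortened version of the sample page, used only for illustration.
html = """<html><head><title>菜鸟教程（runoob.com)</title></head>
<body><h1>我的第一个标题</h1><p id="first">我的第一个段落。</p>
<table><tr><td>row 1, cell 1</td></tr></table></body></html>"""

soup = BeautifulSoup(html, "html.parser")

# Runs of characters in the basic CJK Unified Ideographs block.
CJK_RUN = re.compile(r"[\u4e00-\u9fff]+")

def chinese_runs(soup):
    """Return every run of Chinese characters found in the page text."""
    # A newline separator keeps text from adjacent tags from merging into one run.
    text = soup.get_text(separator="\n")
    return CJK_RUN.findall(text)

runs = chinese_runs(soup)
print(runs)  # ['菜鸟教程', '我的第一个标题', '我的第一个段落']
```

Compared with reading individual tags, this approach does not depend on knowing which tags hold Chinese text, at the cost of losing the link between each string and its tag.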

Original post: https://www.cnblogs.com/jiana/p/12884619.html
