1. Installing the requests-html module
pip install requests-html
The installation failed. According to the error message, pip needed to be upgraded.
Check the installed pip version: pip --version
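The current version is 18.1.
Upgrade pip:
python -m pip install --upgrade pip
Check the version again to confirm that the upgrade succeeded.
pip now reports version 19.0.3.
Run pip install requests-html again.
This time the installation succeeds.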
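As a quick sanity check after the steps above, the following sketch tests whether a module can be found on the import path without actually importing it. The helper name is_installed is hypothetical (it is not part of pip or requests-html); it relies only on importlib.util.find_spec from the standard library and is meant for top-level module names.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module called `module_name` can be found."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # The package is installed as "requests-html" but imported as "requests_html".
    print("requests_html available:", is_installed("requests_html"))
```

Note the difference between the name used with pip (requests-html, with a hyphen) and the name used in import statements (requests_html, with an underscore).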
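2. Example
The first time the render() method is called, it automatically downloads Chromium and saves it under your home directory (for example ~/.pyppeteer/). The exact location depends on the platform and the pyppeteer version, as the log below shows for Windows. The download happens only once.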
C:\Users\Administrator\AppData\Local\Programs\Python\Python37-32\python.exe F:/python_study/练习/01.py
C:\Users\Administrator\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py:847: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning)
[W:pyppeteer.chromium_downloader] start chromium download.
Download may take a few minutes.
100%|██████████| 127496521/127496521 [17:41<00:00, 120055.09it/s]
[W:pyppeteer.chromium_downloader]
chromium download done.
[W:pyppeteer.chromium_downloader] chromium extracted to: C:\Users\Administrator\AppData\Local\pyppeteer\pyppeteer\local-chromium\575458
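The log above shows the one-time Chromium download. To find out in advance whether that download has already happened, something like the sketch below may help. It assumes the installed pyppeteer exposes check_chromium() in pyppeteer.chromium_downloader; the wrapper name chromium_already_downloaded is hypothetical, and it returns None when pyppeteer cannot be imported at all.

```python
def chromium_already_downloaded():
    """Return True/False if pyppeteer can tell, or None if pyppeteer is missing."""
    try:
        # Assumption: this helper exists in the installed pyppeteer version.
        from pyppeteer.chromium_downloader import check_chromium
    except ImportError:
        return None
    return check_chromium()


if __name__ == "__main__":
    print("Chromium already downloaded:", chromium_already_downloaded())
```

If this prints False, expect the first render() call to take several minutes, as it did in the log above.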
from requests_html import HTMLSession

session = HTMLSession()
r = session.get('https://www.jianshu.com/u/125ddf84ee1d')
r.html.render()  # on first use, this downloads Chromium automatically
# 1. Get the HTML of the current page
# print(r.html.html)
# 2. Get all links on the page as a set, kept in the form they have in the page (HTML tags already stripped)
print(r.html.links)
# 3. Get all links on the page as a set, converted to absolute URLs (HTML tags already stripped)
print(r.html.absolute_links)
# 4. Select a single Element object with a CSS selector
d = r.html.find(".info", first=True)
print(d.text)
Output:
{'/p/c6e83e1947ca#comments', '/users/125ddf84ee1d/subscriptions', '/nb/32804973', '/apps?utm_medium=desktop&utm_source=navbar-apps', '/p/27738d1796bd', '/p/28ab188851ea', '/p/eec971b18f7b', '/users/125ddf84ee1d/timeline', '/p/be081c9f318b#comments', '/sign_in', '/p/eec971b18f7b#comments', '/', '/p/c3a26b7c4a74#comments', '/u/125ddf84ee1d', '/users/125ddf84ee1d/following', '/u/125ddf84ee1d?order_by=shared_at', '/p/be081c9f318b', '/nb/32804827', '/nb/32803910', '/nb/4147375', '/p/4e049c40e89d#comments', '/p/c3a26b7c4a74', '/nb/31539290', '/sign_up', '/users/125ddf84ee1d/liked_notes', '/nb/32804844', '/p/27738d1796bd#comments', '/p/c1054b70c76f', '/p/a6d5b6a4c396', '/p/c6e83e1947ca', '/nb/32804082', '/p/a6d5b6a4c396#comments', '/writer#/', '/p/4e049c40e89d', '/p/c1054b70c76f#comments', '/users/125ddf84ee1d/followers', '/p/28ab188851ea#comments', '/notifications#/chats/new?mail_to=2014194', '/u/125ddf84ee1d?order_by=commented_at', '/nb/34697254', '/u/125ddf84ee1d?order_by=top'}
{'https://www.jianshu.com/p/eec971b18f7b', 'https://www.jianshu.com/p/c1054b70c76f', 'https://www.jianshu.com/nb/32804082', 'https://www.jianshu.com/users/125ddf84ee1d/followers', 'https://www.jianshu.com/p/4e049c40e89d#comments', 'https://www.jianshu.com/p/a6d5b6a4c396', 'https://www.jianshu.com/p/c1054b70c76f#comments', 'https://www.jianshu.com/p/28ab188851ea#comments', 'https://www.jianshu.com/u/125ddf84ee1d', 'https://www.jianshu.com/sign_in', 'https://www.jianshu.com/nb/32804844', 'https://www.jianshu.com/p/c6e83e1947ca#comments', 'https://www.jianshu.com/p/eec971b18f7b#comments', 'https://www.jianshu.com/nb/32803910', 'https://www.jianshu.com/p/a6d5b6a4c396#comments', 'https://www.jianshu.com/p/28ab188851ea', 'https://www.jianshu.com/p/c3a26b7c4a74#comments', 'https://www.jianshu.com/u/125ddf84ee1d?order_by=commented_at', 'https://www.jianshu.com/sign_up', 'https://www.jianshu.com/p/4e049c40e89d', 'https://www.jianshu.com/users/125ddf84ee1d/liked_notes', 'https://www.jianshu.com/p/c3a26b7c4a74', 'https://www.jianshu.com/users/125ddf84ee1d/following', 'https://www.jianshu.com/p/27738d1796bd', 'https://www.jianshu.com/users/125ddf84ee1d/timeline', 'https://www.jianshu.com/p/27738d1796bd#comments', 'https://www.jianshu.com/u/125ddf84ee1d?order_by=top', 'https://www.jianshu.com/p/be081c9f318b#comments', 'https://www.jianshu.com/writer#/', 'https://www.jianshu.com/nb/32804973', 'https://www.jianshu.com/p/c6e83e1947ca', 'https://www.jianshu.com/nb/4147375', 'https://www.jianshu.com/apps?utm_medium=desktop&utm_source=navbar-apps', 'https://www.jianshu.com/u/125ddf84ee1d?order_by=shared_at', 'https://www.jianshu.com/nb/31539290', 'https://www.jianshu.com/users/125ddf84ee1d/subscriptions', 'https://www.jianshu.com/', 'https://www.jianshu.com/nb/32804827', 'https://www.jianshu.com/p/be081c9f318b', 'https://www.jianshu.com/nb/34697254', 'https://www.jianshu.com/notifications#/chats/new?mail_to=2014194'}
0
关注
2
粉丝
36
文章
19011
字数
2
收获喜欢
19
简书钻
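The output above can be post-processed with a few lines of standard-library Python. This is a minimal sketch, not part of requests-html: to_absolute approximates the relationship between the relative set and the absolute set by resolving each link against the page URL with urllib.parse.urljoin, and parse_stats assumes the text of the .info element always alternates a number line and a label line, as it does in the output shown here. Both function names are hypothetical.

```python
from urllib.parse import urljoin


def to_absolute(base_url, links):
    """Resolve every link against base_url and return the results as a set."""
    return {urljoin(base_url, link) for link in links}


def parse_stats(text):
    """Turn alternating value/label lines into a {label: value} dict."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # Assumption: even positions hold numbers, odd positions hold their labels.
    values, labels = lines[0::2], lines[1::2]
    return {label: int(value) for value, label in zip(values, labels)}


if __name__ == "__main__":
    base = "https://www.jianshu.com/u/125ddf84ee1d"
    print(to_absolute(base, {"/sign_in", "/p/c1054b70c76f"}))
    # Labels are kept exactly as the site prints them (关注 = following, 粉丝 = followers).
    print(parse_stats("0\n关注\n2\n粉丝\n36\n文章"))
```

With the real response, the same helpers could be called as to_absolute(r.url, r.html.links) and parse_stats(d.text). If the page layout changes so that numbers and labels no longer alternate, parse_stats raises ValueError instead of silently returning wrong data.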