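Motivation
It has been quite a while since 《北京遇上西雅图2不二情书》 (Finding Mr. Right 2) came out, but I only recently found time to download it and watch it (forgive me, our shabby little town has no cinema). The lines in it turned out to be pretty good, so I wanted to pull them down and study them. Python happens to be well suited to this (PS: it is also the only language I really know).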
Environment
Windows, Python 2.x, requests, BeautifulSoup (bs4, with the lxml parser)
Code
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Fetch classic quotes
import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:48.0) Gecko/20100101 Firefox/48.0'}


def get_html(url):
    # Download the page and return its raw bytes
    r = requests.get(url, headers=headers)
    html = r.content
    return html


def get_juzi(html):
    # Print every quote on the page (the <a> tags with class "xlistju")
    soup = BeautifulSoup(html, "lxml")
    juzilist = soup.find_all('a', class_="xlistju")
    for x in juzilist:
        print x.get_text().encode('utf-8')
        print


def get_title(html):
    # Print the page title with the site-name suffix removed
    soup = BeautifulSoup(html, "lxml")
    print soup.title.get_text().encode('utf-8').replace('_句子迷', '')


if __name__ == '__main__':
    # URL pattern: http://www.juzimi.com/article/316132?page=0
    for item in range(8):  # page count set by hand ^_^
        url = 'http://www.juzimi.com/article/316132?page=%s' % item
        html = get_html(url)
        if item == 0:
            get_title(html)
        get_juzi(html)
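
The script hard-codes `range(8)` for the number of pages. Here is a minimal sketch of one way to avoid counting pages by hand. It assumes that a page number past the end either contains no quotes or repeats the previous page, which I have not verified against juzimi.com. The names `collect_pages`, `fetch_page`, `extract` and `max_pages` are hypothetical and not part of the original script.

```python
def collect_pages(fetch_page, extract, max_pages=50):
    """Gather items page by page until a page is empty or repeats.

    fetch_page(n) returns the HTML of page n; extract(html) returns a
    list of items. Both are hypothetical callables supplied by the caller.
    """
    collected = []
    previous = None
    for page in range(max_pages):  # max_pages is only a safety cap
        items = extract(fetch_page(page))
        # Stop on an empty page, or when the site serves the same page
        # again (assumed behaviour for out-of-range page numbers).
        if not items or items == previous:
            break
        collected.extend(items)
        previous = items
    return collected


if __name__ == '__main__':
    # Offline demo with fake pages, so no network access is needed.
    fake_site = {0: 'a|b', 1: 'c', 2: ''}
    quotes = collect_pages(lambda n: fake_site.get(n, ''),
                           lambda html: [q for q in html.split('|') if q])
    print(quotes)  # prints ['a', 'b', 'c']
```

To plug this into the script, `get_juzi` would have to return the quote texts as a list instead of printing them, and `fetch_page` could wrap `get_html` with the URL pattern shown in the code.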
Closing
If you liked this, feel free to follow and bookmark. Thanks!