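So far, downloading files with Selenium driving Chrome in headless mode has been a problem: you click the download link, but no file is ever saved. As far as I know, this bug has not been officially fixed yet (please correct me if it has), so you have to work out a download method yourself.
After some searching, I finally found a solution that lets you specify a download directory and actually download files in Chrome headless mode. I can no longer find the original source; the implementation is as follows: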
Define a DriverBuilder
import logging

logger = logging.getLogger(__name__)


class DriverBuilder:
    def enable_download_in_headless_chrome(self, driver, download_dir):
        """
        There is currently a "feature" in Chrome where headless mode does not
        allow file downloads: https://bugs.chromium.org/p/chromium/issues/detail?id=696481
        This method is a hacky workaround until chromedriver officially supports this.
        Requires Chrome version 62.0.3196.0 or above.
        """
        # Add the missing "send_command" endpoint to the Selenium webdriver.
        driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
        params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': download_dir}}
        command_result = driver.execute("send_command", params)
        # Log the browser's response through the module-level logger.
        logger.info("response from browser:")
        for key in command_result:
            logger.info("result:" + key + ":" + str(command_result[key]))
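Newer Selenium releases expose `execute_cdp_cmd` on Chromium-based drivers, which sends a DevTools command directly, so registering `send_command` by hand may not be needed there. The following is only a minimal sketch under that assumption; the helper names `build_download_params` and `allow_headless_downloads` are hypothetical and not part of the original solution.

```python
import os


def build_download_params(download_dir):
    """Build the arguments for the DevTools command Page.setDownloadBehavior."""
    # Use an absolute path so the target does not depend on the working directory.
    return {"behavior": "allow", "downloadPath": os.path.abspath(download_dir)}


def allow_headless_downloads(driver, download_dir):
    """Hypothetical helper: enable downloads through driver.execute_cdp_cmd.

    Assumes `driver` is a Chromium-based Selenium WebDriver whose Selenium
    version provides execute_cdp_cmd (older releases do not have it).
    """
    params = build_download_params(download_dir)
    return driver.execute_cdp_cmd("Page.setDownloadBehavior", params)
```

If your Selenium version lacks `execute_cdp_cmd`, the `send_command` approach above remains the fallback.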
Configure the download options
import os
from selenium import webdriver

# These lines use `self`, so they are meant to run inside a method of your own class.
self.options = webdriver.ChromeOptions()
self.store_path = 'your_download_file'
if not os.path.exists(self.store_path):
    os.makedirs(self.store_path)
self.prefs = {'download.default_directory': self.store_path, 'profile.default_content_settings.popups': 0}
self.options.add_experimental_option('prefs', self.prefs)
self.options.add_argument('--headless')
# `options=` replaces the deprecated `chrome_options=` keyword.
self.driver = webdriver.Chrome(options=self.options)
DriverBuilder().enable_download_in_headless_chrome(self.driver, self.store_path)
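After that, files can be downloaded to the specified directory. One last note: Chrome must be version 60 or later, otherwise headless mode is not supported, and according to the docstring above the download workaround itself requires 62.0.3196.0 or above.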
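Because headless Chrome gives no visual feedback, it can help to check the download directory before quitting the driver. The sketch below polls the directory until no `.crdownload` files (Chrome's temporary name for in-progress downloads) remain. The function name `wait_for_downloads` and its parameters are hypothetical, and it assumes the directory was empty before the download started.

```python
import os
import time


def wait_for_downloads(download_dir, timeout=30.0, poll_interval=0.5):
    """Return the sorted file names once no .crdownload files remain.

    Raises TimeoutError if the directory is still empty or still contains
    in-progress downloads after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(download_dir)
        in_progress = [n for n in names if n.endswith(".crdownload")]
        # Finished: at least one file exists and none is a partial download.
        if names and not in_progress:
            return sorted(names)
        if time.monotonic() >= deadline:
            raise TimeoutError("download did not finish within %s seconds" % timeout)
        time.sleep(poll_interval)
```

It would be called after triggering the download, for example `wait_for_downloads(self.store_path)`, and before `self.driver.quit()`.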