Python: a different proxy IP for each browser window

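I. Background / use case

A test scenario called for an IP proxy feature: several browser windows had to be open at once, each with its own designated IP (the IP is bound to the account and does not change).
After some hard thinking I got to work. The figure below shows the idea:

[Figure: mind map of the features to implement]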

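II. The end result first:

1. Three windows are started. Each browser has its own proxy IP, and the data each browser stores (such as cookies) is independent as well. This keeps the sessions from contaminating one another through shared browser storage.

[Figure: different IPs across multiple windows]

[Figure: different IPs across multiple windows, 2]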

III. The design process

3.1 Preliminary step: read the conf.txt configuration file (several later steps use it):

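chrome_v= 85.0.4183.83 #Local Chrome version (e.g. 81.0.4044.138). Check that an official driver exists for it; if not, use the closest version. Official drivers are listed at https://npm.taobao.org/mirrors/chromedriver
chrome_url=https://cdn.npm.taobao.org/dist/chromedriver #ChromeDriver download address, or use https://chromedriver.storage.googleapis.com/index.html
log_save=yes #Whether to save the log: yes or no
file_name=userid_data.xlsx  #Name of the Excel file holding the IPs; use the full name including .xlsx
testip1_url=http://httpbin.org/get?show_env=1  #Queried through requests for the first proxy IP check; the script exits if the IP does not match. Normally leave unchanged
testip2_url=https://www.ip138.com/ #IP lookup page opened in the browser for the second proxy IP check (for visual inspection). Normally leave unchanged
bocai_url=https://cn.xxxx.com/home/register  #Target site URL
user_id=uid  #Element locator for the account field
password=jpwd  #Element locator for the password field
captcha_text2=captcha_text2 #Element locator for the captcha field
time_sleep1=5   #Seconds to wait after opening testip2_url
time_sleep2=5   #Seconds to wait after opening bocai_url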

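def config_txt(file_name="config/conf.txt"):#read the key=value settings from the txt config file
    # Note: data_head=data_tail=list() would bind both names to the SAME list,
    # so the keys and values are collected directly into a dict instead
    txt_data = {}
    with open(file_name,encoding='utf-8-sig', errors='ignore') as conf_file:
        for line in conf_file:
            head, sep, tail = line.partition('=')
            if not sep:  # skip blank lines and lines without '='
                continue
            tail = tail.partition('#')[0]  # drop the trailing comment
            txt_data[head.strip()] = tail.strip()
    return txt_data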
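As a quick check of the parsing rule used here (split at the first `=`, discard everything after `#`), the following minimal sketch parses lines held in memory rather than a file. The helper name `parse_conf_lines` and the sample comments are assumptions for illustration and are not part of the original project.

```python
def parse_conf_lines(lines):
    """Parse 'key=value  #comment' lines into a dict (hypothetical helper)."""
    result = {}
    for line in lines:
        key, sep, value = line.partition('=')
        if not sep:                      # no '=' on this line: nothing to parse
            continue
        value = value.partition('#')[0]  # cut the trailing comment
        result[key.strip()] = value.strip()
    return result


sample = [
    "chrome_v= 85.0.4183.83 #local Chrome version\n",
    "log_save=yes #yes or no\n",
    "testip1_url=http://httpbin.org/get?show_env=1  #first IP check\n",
    "\n",
]
conf = parse_conf_lines(sample)
print(conf["chrome_v"])
print(conf["testip1_url"])
```

Because only the first `=` splits the line, a URL value that itself contains `=` (as in `testip1_url`) survives intact.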

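3.2 First step of the flow: use requests to check that the proxy IP works and that it is the expected IP
1) Step 1: fetch the IP details assigned to the account. Several proxy IPs are needed, so they are managed in one Excel file (see the figure below), with one IP permanently bound to each account.
Note: as for where the proxy IPs come from, free ones can be found online, but they are not very stable. I bought mine on Taobao, and they were very cheap.

[Figure: contents of the proxy IP Excel file]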

import pandas as pd

def excel(file_name,users_id):#read the account and proxy details from the Excel file
    try:
        df = pd.read_excel(io='config/'+file_name,sheet_name=0)  #read_excel already returns a DataFrame
        #Numeric account names are stored as integers in the sheet
        user_id=int(users_id) if isinstance(users_id,str) and users_id.isdecimal() else users_id
        rows=df[df.user_id==user_id]
        if len(rows) != 1:
            print("Account lookup in the Excel file failed: "+str(user_id)+" matched "+str(len(rows))+" rows")
            input('Press Enter to exit......')
            exit()
        #Turn the single matching row into a dict of strings
        txt_data = {key: str(value) for key, value in rows.iloc[0].to_dict().items()}
        print(txt_data)
        return txt_data["user_id"],txt_data['password'],txt_data['ip'],txt_data['port'],txt_data['city'],\
               txt_data['ip_user'],txt_data['ip_password'],txt_data['browser'],txt_data['test']
    except Exception:  #a bare except would also swallow the SystemExit raised by exit()
        print("Error: user "+str(users_id)+" was not found in the Excel file; check that it is configured")
        input('Press Enter to exit......')
        exit()  #without this the caller would fail while unpacking None

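Note: a run record (effectively a log) is added here so that every run is saved. It uses a third-party Logger module, which is easy to find with a web search if you need one.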

def testLogger(now_time,log_save):#build the log path and optionally redirect output to a log file
    time_a = time.strftime("%Y-%m-%d", time.localtime())
    time_log = 'log/'+time_a+"/"
    os.makedirs(time_log, exist_ok=True)  #one folder per day
    #Only replace stdout when logging is enabled; assigning None would silence every print
    if log_save=='yes':
        sys.stdout = Logger.Logger(time_log+now_time+'.txt', sys.stdout)
    return time_log
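The Logger module itself is not shown in this post. As a rough idea of what such a class might look like, here is a minimal sketch that writes every message to both the original stream and a file. `TeeLogger` is a hypothetical stand-in, not the module the author used, and the demo writes to a temporary folder rather than `log/`.

```python
import io
import os
import sys
import tempfile


class TeeLogger:
    """Write each message to the original stream and to a log file (hypothetical stand-in)."""

    def __init__(self, file_name, stream=sys.stdout):
        self.terminal = stream
        self.log = open(file_name, 'a', encoding='utf-8')

    def write(self, message):
        self.terminal.write(message)
        self.log.write(message)
        self.log.flush()  # keep the file current even if the script is killed

    def flush(self):
        self.terminal.flush()
        self.log.flush()


# Demo: an in-memory stream stands in for the console
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
fake_terminal = io.StringIO()
logger = TeeLogger(demo_path, fake_terminal)
print("ip check passed", file=logger)
logger.log.close()
```

In the real script the object would be assigned to `sys.stdout`, so that ordinary `print` calls end up in the log file as well.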

2) Step 2: call httpbin.org through the requests library to check that the IP works and is the expected one

def start_test(user_id):
    global browser
    print("Account for this login:",str(user_id))
    now_time,conf_txt=str(int(time.time()*1000)),config_txt()
    time_log = testLogger(now_time,conf_txt['log_save'])
    user_id,password,ip,port,city,ip_user,ip_password,browser,test=excel(conf_txt["file_name"],user_id)
    #Example of the assembled string: '649192***:o2m0n***@101.200.187.22:16819' (ip_affirm adds the http:// prefix)
    reque_ip=ip_user+":"+ip_password+"@"+ip+":"+port 
    affirm=ip_affirm(reque_ip,conf_txt["testip1_url"])
    #Check that the IP reported by httpbin matches the one in the Excel file
    if ip == affirm.split(",")[0]:
        print("IP check passed: "+ip)
    else:
        print("IP check failed\nConfigured IP: "+ip,"IP returned by the query: "+affirm)
        input('Press Enter to exit......')
        exit()

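def ip_affirm(proxies,testip1_url):#query the outgoing IP through the proxy with requests
    #Build the proxy mapping that requests expects
    proxies={ "http": "http://"+proxies }
    try:
        response_get=requests.get(url=testip1_url, proxies=proxies, timeout=30)
        response_get.raise_for_status()  #treat HTTP error statuses as failures instead of returning None
        print(response_get.text)
        return response_get.json()["origin"]
    except Exception:
        print ("#####Error: the IP query failed; check that the proxy account, password and address are correct and valid")
        input('Press Enter to exit......')
        exit()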
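The two pieces of logic in this step, assembling the authenticated proxy address and comparing it with the `origin` field returned by httpbin, can be sketched without any network access. The helper names `build_proxies` and `origin_matches` are assumptions for illustration, and the credentials are placeholders; the IP and port are the ones from the sample comment above.

```python
def build_proxies(ip, port, ip_user, ip_password):
    """Return a requests-style proxies dict for an authenticated HTTP proxy."""
    proxy_url = "http://{}:{}@{}:{}".format(ip_user, ip_password, ip, port)
    # Route both plain and TLS traffic through the same HTTP proxy
    return {"http": proxy_url, "https": proxy_url}


def origin_matches(expected_ip, origin):
    """The 'origin' value may hold several comma-separated addresses; compare the first."""
    return origin.split(",")[0].strip() == expected_ip


proxies = build_proxies("101.200.187.22", "16819", "user", "secret")
print(proxies["http"])
print(origin_matches("101.200.187.22", "101.200.187.22, 10.0.0.1"))
```

If the proxy password contains characters such as `@` or `:`, it would need URL-encoding (for example with `urllib.parse.quote`) before being placed in the address.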

3.3 Package a browser proxy extension so that each window gets its own IP, then check the IP once more at the UI level (using selenium)

    #Chrome configuration (desired_capabilities is the Selenium 3 style in use when this was written)
    capa=DesiredCapabilities.CHROME
    capa["pageLoadStrategy"] = "none"
    #The third-party proxyauth module does the work; just pass in the proxy details
    proxy_conf= proxyauth.create_proxy(host=ip,port=port,username=ip_user,password=ip_password)
    co = webdriver.ChromeOptions()
    co.add_argument("--start-maximized")
    co.add_extension(proxy_conf)
    browser = webdriver.Chrome(options=co,desired_capabilities=capa)
    wait = WebDriverWait(browser,20)
    #Open the third-party site to check the IP address in the UI
    browser.get(conf_txt["testip2_url"])
    time.sleep(int(conf_txt["time_sleep1"]))
    #Open the page under test and let selenium fill in the basic details
    browser.get(conf_txt["bocai_url"])
    wait.until(EC.presence_of_element_located((By.XPATH, "//*[@id='logincaptcha2']")))
    time.sleep(int(conf_txt["time_sleep2"]))
    browser.execute_script("window.stop();")
    #find_element(By.ID, ...) works in both Selenium 3 and 4; find_element_by_id was removed in Selenium 4
    browser.find_element(By.ID, conf_txt["user_id"]).send_keys(user_id)
    browser.find_element(By.ID, conf_txt["password"]).send_keys(password)

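#This part is borrowed from experts found through a Baidu search
# Package the Chrome proxy extension
import string
import zipfile

def create_proxy(host, port, username, password,scheme='http', plugin_path=None):
    if plugin_path is None:
        # Path where the extension zip is written
        plugin_path = 'config/vimm_chrome_proxyauth_plugin.zip'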
    manifest_json = """
        {
            "version": "1.0.0",
            "manifest_version": 2,
            "name": "Chrome Proxy",
            "permissions": [
                "proxy",
                "tabs",
                "unlimitedStorage",
                "storage",
                "<all_urls>",
                "webRequest",
                "webRequestBlocking"
            ],
            "background": {
                "scripts": ["background.js"]
            },
            "minimum_chrome_version":"22.0.0"
        }
        """
    background_js = string.Template(
        """
        var config = {
                mode: "fixed_servers",
                rules: {
                  singleProxy: {
                    scheme: "${scheme}",
                    host: "${host}",
                    port: parseInt(${port})
                  },
                  bypassList: ["foobar.com"]
                }
              };
        chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
        function callbackFn(details) {
            return {
                authCredentials: {
                    username: "${username}",
                    password: "${password}"
                }
            };
        }
        chrome.webRequest.onAuthRequired.addListener(
                    callbackFn,
                    {urls: ["<all_urls>"]},
                    ['blocking']
        );
        """
    ).substitute(host=host,port=port,username=username,password=password,scheme=scheme,)
    with zipfile.ZipFile(plugin_path, 'w') as zp:
        zp.writestr("manifest.json", manifest_json)
        zp.writestr("background.js", background_js)
    return plugin_path
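One thing to watch with `create_proxy`: the default `plugin_path` is shared, so windows launched at the same moment could overwrite each other's zip before Chrome has loaded it. A possible way around this, sketched here as an assumption rather than as part of the original project, is to derive one path per account and pass it in as `plugin_path`. The helper `plugin_path_for` and its file naming are hypothetical.

```python
import os


def plugin_path_for(user_id, folder="config"):
    """Return a per-account zip path inside `folder` (hypothetical naming scheme)."""
    # Keep only characters that are safe in file names
    safe_id = "".join(ch for ch in str(user_id) if ch.isalnum() or ch in "-_")
    return os.path.join(folder, "proxyauth_plugin_" + safe_id + ".zip")


print(plugin_path_for(1001))
```

The result would then be handed over as `create_proxy(..., plugin_path=plugin_path_for(user_id))`.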

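3.4 To make it easy to run on Windows, the launch command is wrapped in a bat file (Bat To Exe Converter can also turn it into an exe, which is a bit more convenient):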

@echo off
rem Take this file's own name (without extension) and pass it to Python as the account argument
set name=%~n0
start python model/test_bocai.py %name%
exit
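On the Python side, the account name passed by the bat file arrives as the first command-line argument. The receiving code is not shown in this post, so the following is only a sketch of how it might be read; `account_from_argv` is a hypothetical helper.

```python
import sys


def account_from_argv(argv):
    """Return the account name passed by the bat file, or None if it is missing."""
    # argv[0] is the script path; argv[1] is %name% from the bat file
    return argv[1] if len(argv) > 1 else None


if __name__ == "__main__":
    user_id = account_from_argv(sys.argv)
    if user_id is None:
        print("Usage: python model/test_bocai.py <account>")
    else:
        print("Account for this run:", user_id)
```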

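*After generating the exe, rename it to the account name; then a double-click is all it takes to run it.

[Figure: converted into an exe application that Windows recognizes]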

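3.5 Automatic detection and repair: Python dependencies and the ChromeDriver download
1) Step 1: the Python dependencies are handled in the simplest way possible. An import attempt shows whether each library is installed; if it is not, pip installs it (I use the Aliyun mirror, which is relatively stable and fast):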

#  -*-coding:utf-8 -*-
#  Detect and install the required Python libraries automatically
import importlib
import os

aliyun=' -i http://mirrors.aliyun.com/pypi/simple --trusted-host mirrors.aliyun.com'
print('Automatic detection and installation in progress, please do not exit...')
print("Python version and install location on this system:")
os.system('python -V')
os.system('where python')
# (module name to import, package name to pip install)
for module, package in [('pandas', 'pandas'), ('selenium', 'selenium'),
                        ('requests', 'requests'), ('aip', 'baidu-aip')]:
    try:
        importlib.import_module(module)
    except ImportError:
        print('Library <'+package+'> not found, installing it automatically:')
        os.system('pip install '+package+aliyun)
    print(package+" version:")
    os.system('pip list | findstr '+package)
print("Chrome will now be opened automatically to check the driver version")
# Imported only now: test_item itself imports requests and selenium,
# so importing it at the top would fail before they are installed
from test_item import Classdriver
Classdriver().driver()
input('Automatic detection and repair finished, press Enter to exit......')
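An alternative to attempting each import is to ask `importlib` whether a module can be located at all, without importing it. The sketch below shows how that could look; `missing_packages` is a hypothetical helper, and the module/package pairs mirror the four libraries checked above.

```python
import importlib.util


def missing_packages(requirements):
    """Return the pip names whose import module cannot be found.

    requirements: iterable of (module_name, pip_name) pairs.
    """
    return [pip_name for module_name, pip_name in requirements
            if importlib.util.find_spec(module_name) is None]


required = [("pandas", "pandas"), ("selenium", "selenium"),
            ("requests", "requests"), ("aip", "baidu-aip")]
print("Missing:", missing_packages(required))
```

The returned list could then be fed to a single `pip install` call instead of one call per library.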

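2) Step 2: the ChromeDriver logic (feature points 3 and 4), which downloads the matching driver from the official source automatically and places it on the Python path
The driver can also be downloaded manually from https://npm.taobao.org/mirrors/chromedriver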

import os
import zipfile
import requests
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

class Classdriver(object):
    def __init__(self):
        self.python_path_a=list(os.popen('where python'))
        self.python_path=str(self.python_path_a[0].strip("\n"))

    def pyhon_model(self,chrome_v,driver="chromedriver_win32.zip",url='https://cdn.npm.taobao.org/dist/chromedriver'):
        # Build the download link; alternative base: https://chromedriver.storage.googleapis.com/index.html
        # Both sources serve the file at <base>/<version>/<file>; index.html?path=... only returns the listing page
        base_url = url[:-len('/index.html')] if url.endswith('/index.html') else url.rstrip('/')
        download_url = base_url+'/'+chrome_v+'/'+driver
        file = requests.get(download_url)
        if file.status_code != 200:
            print('###Driver download failed: check whether an official driver exists for <chrome_v>; if none matches, enter the closest driver version...')
            print('<chrome_v> version entered for this run:',chrome_v)
            print('Check whether this version is listed at https://npm.taobao.org/mirrors/chromedriver')
            input('Press Enter to exit......')
            exit()
        unzip=self.python_path[:-10]+driver  # [:-10] strips "python.exe" from the path
        with open(unzip, 'wb') as zip_file:    # save the archive next to python.exe
            zip_file.write(file.content)
        print("unzip",unzip)
        # Extract the driver into the folder of every python.exe reported by `where python`
        with zipfile.ZipFile(unzip) as zip_file:
            for path in self.python_path_a:
                zip_file.extractall(path.strip("\n")[:-10])
    def driver(self):
        chrome_options = Options()
        chrome_options.add_argument('--no-sandbox')
        chrome_options.add_argument('--disable-dev-shm-usage')
        chrome_options.add_argument('--headless')
        chrome_options.add_experimental_option('excludeSwitches', ['enable-logging'])
        # Read the settings from the config file
        config=self.config_txt()
        if os.path.isfile(self.python_path[:-10]+'chromedriver.exe'):
            browser = webdriver.Chrome(options=chrome_options)
            rr=browser.capabilities['chrome']['chromedriverVersion']
            head, sep,tail = rr.partition(' ')
            browser.quit()
            # Compare the configured Chrome version with the installed driver version; download the official driver if they differ
            if head==config['chrome_v']:
                print('ChromeDriver is installed and the version is correct ===== version: '+head)
            else:
                self.pyhon_model(chrome_v=config['chrome_v'],url=config['chrome_url'])
        else:
            self.pyhon_model(chrome_v=config['chrome_v'],url=config['chrome_url'])
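    def config_txt(self,file_name="conf.txt"):#read the key=value settings from the txt config file
        # data_head=data_tail=list() would bind both names to the SAME list,
        # so the keys and values are collected directly into a dict instead
        txt_data = {}
        with open(file_name,encoding='utf-8-sig', errors='ignore') as conf_file:
            for line in conf_file:
                head, sep, tail = line.partition('=')
                if not sep:  # skip blank lines and lines without '='
                    continue
                tail = tail.partition('#')[0]  # drop the trailing comment
                txt_data[head.strip()] = tail.strip()
        return txt_data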
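The two small string operations in this class, building the download link and extracting the version number from the `chromedriverVersion` capability, can be tried in isolation. The sketch below assumes that both download sources serve files at `<base>/<version>/<file>`; the helper names `driver_download_url` and `driver_version` are hypothetical, and the capability value in the demo is a made-up sample.

```python
def driver_download_url(base_url, chrome_v, driver="chromedriver_win32.zip"):
    """Build <base>/<version>/<file>, dropping a trailing index.html if present."""
    base = base_url.rstrip("/")
    if base.endswith("/index.html"):
        base = base[:-len("/index.html")]
    return base + "/" + chrome_v + "/" + driver


def driver_version(capability_value):
    """Take the version number from a value such as '85.0.4183.87 (abc123)'."""
    return capability_value.partition(" ")[0]


print(driver_download_url("https://cdn.npm.taobao.org/dist/chromedriver", "85.0.4183.83"))
print(driver_download_url("https://chromedriver.storage.googleapis.com/index.html", "85.0.4183.83"))
print(driver_version("85.0.4183.87 (abc123)"))
```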

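IV. Closing remarks (mostly I got tired of writing, so this is a quick wrap-up 🥺):

One piece is left out: OCR recognition of the image captcha. That belongs to a different business scenario, so it is not covered here (see my other article if you are interested).
With that, the required features are essentially complete. I built this small project following my own beginner's approach, so there is surely plenty left to optimize. I will keep working at it, one step at a time.

Last edited: 2021.04.10 11:04:32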
