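Reference article: https://www.jianshu.com/p/3c790e98ea8d (four ways of sending requests in Python). This post uses three of them: XML, form, and JSON.
This post records how to set up Fiddler, capture web traffic, capture mobile traffic, and reproduce the captured requests in Python.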
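1. Fiddler setup for capturing web pages: set the IP of the target site under Filters. The detailed steps are easy to find with a search. One thing to note is that Fiddler cannot capture Firefox traffic out of the box (Firefox keeps its own proxy and certificate settings); search for the details.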
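2. Capturing mobile traffic
The phone and the computer must be on the same network segment (for example, a laptop and a router, with both the phone and the laptop connected to the same WiFi). In the phone's WiFi settings, choose a manual proxy and enter the computer's IP on that network together with the port Fiddler is listening on.
At this point HTTPS packets are not displayed properly because of the certificate. With Fiddler running, download the certificate from http://IP:port/FiddlerRoot.cer and install it on the phone.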
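The phone's manual proxy needs the computer's address on the shared WiFi network. A minimal sketch of one way to look it up from Python is shown here; `local_ip` is a hypothetical helper, the probe address is a documentation-only IP that is never actually contacted, and 8888 is assumed to be Fiddler's listening port (its default). On a machine with several network interfaces the result may not be the WiFi address, so it is worth checking against the system's network settings.

```python
import socket

FIDDLER_PORT = 8888  # assumption: Fiddler is listening on its default port


def local_ip(probe_host="192.0.2.1", probe_port=80):
    """Return the IPv4 address this machine would use for outbound traffic."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends nothing; it only picks a route,
        # which tells us the local address of the outgoing interface.
        s.connect((probe_host, probe_port))
        return s.getsockname()[0]
    except OSError:
        # No usable route (for example, no network at all)
        return "127.0.0.1"
    finally:
        s.close()


print("Phone proxy host:", local_ip(), "port:", FIDDLER_PORT)
```

The printed host and port are the two values to type into the phone's manual proxy settings.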
3. Getting to work:
Type 1: Python POST with XML
import requests
xml='''<batch>
<request type="json"><![CDATA[{"action":"load-data","dataProvider":"inventoryDlgDynamicQuery#doLoadData","supportsEntity":true,"parameter":{"__viewConfigName":"com.ccssoft.inventory.web.qrcode.view.DeviceRoomMgr","metaClassName":"ROOM","isCollection":"true","areaPermission":"true","ROOM_LOCATE_REG_REGION@NAME":"安福县"},"resultDataType":"v:com.ccssoft.inventory.web.qrcode.view.DeviceRoomMgr$[v:com.ccssoft.inventory.web.qrcode.view.DeviceRoomMgr$dataTypeEntity]","pageSize":100,"pageNo":1,"context":{},"loadedDataTypes":["dataTypeEntity","dataTypeCondition"]}]]></request>
</batch>'''
header={
'Content-Type' : 'text/xml'
}
# Other headers seen in the Fiddler capture, left disabled here:
# 'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
# 'Accept-Encoding':'gzip, deflate',
# 'Accept-Language':'zh-CN,zh;q=0.8',
# 'Cache-Control':'max-age=0',
# 'Cookie':'STUID=79',
# 'Connection':'keep-alive',
# 'Origin':'http://q',
# 'Referer':'http://q',
# 'Upgrade-Insecure-Requests':'1',
# 'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36',
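# The body contains non-ASCII text, and in Python 3 a str body with non-Latin-1 characters fails to send, so encode it as UTF-8 bytes first. "xxxxxxxxxxx" stands for the target URL.
r2=requests.post("xxxxxxxxxxx",data=xml.encode("utf-8"),headers=header)
print(r2)  # prints the response object with its status code; use r2.text for the body
The key point is: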
header={
'Content-Type' : 'text/xml'
}
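Writing the XML envelope by hand makes it easy to break the embedded JSON. As a rough sketch, the body can instead be generated from a Python dict; `build_batch_xml` is a hypothetical helper, and the payload below is a trimmed-down subset of the fields in the captured request, not the full set the server may require.

```python
import json


def build_batch_xml(payload):
    """Wrap a JSON payload in the <batch><request> envelope as UTF-8 bytes."""
    # ensure_ascii=False keeps Chinese text readable instead of \uXXXX escapes
    body = json.dumps(payload, ensure_ascii=False)
    if "]]>" in body:
        raise ValueError("payload contains ']]>', which would end the CDATA section early")
    xml = '<batch>\n<request type="json"><![CDATA[' + body + ']]></request>\n</batch>'
    return xml.encode("utf-8")


# Trimmed-down example payload (assumption: real requests need more fields)
payload = {
    "action": "load-data",
    "pageSize": 100,
    "pageNo": 1,
    "parameter": {"ROOM_LOCATE_REG_REGION@NAME": "安福县"},
}

body = build_batch_xml(payload)
headers = {"Content-Type": "text/xml"}
print(body.decode("utf-8"))
```

The resulting `body` and `headers` can be passed to `requests.post(url, data=body, headers=headers)`; changing the page number or region then only means editing the dict.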
Type 2: Python POST with JSON
# jsons='''{"action":"resolve-data","dataResolver":"codeLabelMgr#modifyPrintFlag","dataItems":[{"alias":"datasetEntity","data":{"$isWrapper":true,"$dataType":"v:com.ccssoft.inventory.web.qrcode.view.DeviceRoomMgr$[v:com.ccssoft.inventory.web.qrcode.view.DeviceRoomMgr$dataTypeEntity]","data":['''+json.dumps(dict_json["data"][5]) +''']},"refreshMode":"value","autoResetEntityState":true}],"parameter":{"typeflag":"3","reason":"123456","updateid":"'''+dict_json["data"][5]["ID"]+''',","printflag":"100382"},"context":{}}'''
#
# r2=requests.post("xxxxxxx",cookies = cookies ,data=jsons,headers=header)
# print(r2)
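The key point is (this example sends text/javascript; the standard media type for JSON is application/json):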
header={
'Content-Type' : 'text/javascript',
}
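Concatenating the JSON body from string pieces is error-prone, since one misplaced quote makes the whole body invalid. A hedged alternative is to build a dict and serialize it with `json.dumps`. In this sketch `row` is a hypothetical stand-in for `dict_json["data"][5]`, the long `$dataType` field is left out for brevity, and the trailing comma after the ID in the original string is dropped; add either back if the server expects it.

```python
import json

# Hypothetical stand-in for one row of the earlier load-data response
row = {"ID": "12345", "NAME": "room-A"}

payload = {
    "action": "resolve-data",
    "dataResolver": "codeLabelMgr#modifyPrintFlag",
    "dataItems": [
        {
            "alias": "datasetEntity",
            "data": {"$isWrapper": True, "data": [row]},
            "refreshMode": "value",
            "autoResetEntityState": True,
        }
    ],
    "parameter": {
        "typeflag": "3",
        "reason": "123456",
        "updateid": row["ID"],
        "printflag": "100382",
    },
    "context": {},
}

# Serialize once; Python True becomes JSON true, and quoting is handled for us
body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
headers = {"Content-Type": "text/javascript"}
print(body.decode("utf-8"))
```

`body` and `headers` can then be passed to `requests.post(url, cookies=cookies, data=body, headers=headers)` in place of the hand-built string.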
Type 3: Python POST with form data
data={
'METHOD' : '19',
'token' : 'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJUSU1FIjoxNTQzMjgwMDc3LCJTSEFSRElOR19JRCI6Ijc5NiIsIlVTRVJOQU1FIjoiY2FyMTM2MDA4In0.-hJTE-Bn2ZWX_fEqFAAm6TM2f_8KwsRb1WgA1Zn6KUU',
'ACTION' : '704',
'CODE' : 'JX_JA_GJ_100794202',
'SHARDING_ID' : '796',
'DATA' : '{"USER_LATITUDE":"27.289719","USERNAME":"xxxx","USER_ID":"361100000000000001224000","USER_LONGITUDE":"114.188924","ALTITUDE":"0.0","DIGITAL_CODE":"JX_JA_GJ_100794202","NAME":"xxx","PARENT_SEPC_ID":"1020000000","SHARDING_ID":"796"}'
}
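# requests form-encodes a dict passed as data. Reusing an earlier header dict (text/xml or text/javascript) would send the wrong Content-Type, so it is set explicitly here.
header={
'Content-Type' : 'application/x-www-form-urlencoded'
}
r2=requests.post("xxxxxxx",data = data ,headers=header)
print(r2)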
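To see what a form post looks like on the wire, the encoding can be reproduced with the standard library. This is only an illustration with a shortened set of fields; `inner` and `form` are example names, and the claim that requests produces an equivalent body for a dict passed as `data` reflects its documented behavior rather than anything measured in this post.

```python
import json
from urllib.parse import urlencode, parse_qs

# The DATA field is itself a JSON string nested inside the form
inner = {"USERNAME": "xxxx", "SHARDING_ID": "796", "ALTITUDE": "0.0"}

form = {
    "METHOD": "19",
    "ACTION": "704",
    "SHARDING_ID": "796",
    "DATA": json.dumps(inner, separators=(",", ":")),
}

# key=value pairs joined by '&', with special characters percent-encoded
encoded = urlencode(form)
print(encoded)

# Decoding on the receiving side recovers the original fields
decoded = parse_qs(encoded)
print(json.loads(decoded["DATA"][0]))
```

Comparing the printed string with the request body shown in Fiddler is a quick way to confirm that a captured request really is form-encoded.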
Key point: there are four commonly used Content-Type formats: application/x-www-form-urlencoded (the default), application/json, text/xml, and multipart/form-data.
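As a small sketch of how these four types map onto requests arguments, the snippet below prepares requests without sending them and reads back the Content-Type that would go out. The URL is a placeholder, `content_type` is a hypothetical helper, and the payloads are dummy values.

```python
import requests

URL = "http://example.invalid/api"  # placeholder; nothing is sent


def content_type(**kwargs):
    """Prepare (but do not send) a POST and return its Content-Type header."""
    prepared = requests.Request("POST", URL, **kwargs).prepare()
    return prepared.headers.get("Content-Type")


# dict passed as data -> form encoding (the default)
form_ct = content_type(data={"METHOD": "19"})

# json= argument -> JSON body with a matching header
json_ct = content_type(json={"action": "load-data"})

# raw bytes -> requests sets no type, so the header must be given explicitly
xml_ct = content_type(
    data="<batch></batch>".encode("utf-8"),
    headers={"Content-Type": "text/xml"},
)

# files= argument -> multipart body with a generated boundary
multipart_ct = content_type(files={"file": ("demo.txt", b"hello")})

for name, value in [("form", form_ct), ("json", json_ct),
                    ("xml", xml_ct), ("multipart", multipart_ct)]:
    print(name, "->", value)
```

Preparing a request this way is also handy for checking a body and its headers against a Fiddler capture before actually sending anything.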
Still missing are simulated uploads and simulated downloads via POST; I will add them once I actually need them.