Python 3.6 has been released, so I tried building a service on Python 3. Since Tornado is the framework I know best, I tested its common usage patterns under Python 3.
Business code usually has to call third-party services and databases, so the tests focus on asynchronous HTTP and database I/O.
Event loop
The standard library asyncio in Python 3.5+ provides an event loop for running coroutines and introduces the async/await keywords for defining them. Tornado implements coroutines with yield generators and ships with its own event loop (IOLoop). Since many third-party libraries are built on asyncio, and to make better use of the asynchronous I/O that Python 3's new features bring, I tested Tornado's performance on different event loops and how it works together with third-party libraries (motor, asyncpg, aiomysql).
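To make the asyncio side concrete, here is a minimal, Tornado-independent sketch of a native coroutine driven by the asyncio event loop (the function name is only for illustration):

import asyncio


async def say_hello():
    # A native coroutine defined with the async/await keywords (Python 3.5+).
    await asyncio.sleep(1)
    print("hello asyncio")


# The asyncio event loop schedules and runs the coroutine.
loop = asyncio.get_event_loop()
loop.run_until_complete(say_hello())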
Basic structure of a Tornado app
A minimal Tornado app looks like this:
import tornado.httpserver as httpserver
import tornado.ioloop as ioloop
import tornado.options as options
import tornado.web as web

options.parse_command_line()


class IndexHandler(web.RequestHandler):
    def get(self):
        self.finish("It works")


class App(web.Application):
    def __init__(self):
        settings = {
            'debug': True
        }
        super(App, self).__init__(
            handlers=[
                (r'/', IndexHandler)
            ],
            **settings)


if __name__ == '__main__':
    app = App()
    server = httpserver.HTTPServer(app, xheaders=True)
    server.listen(5010)
    ioloop.IOLoop.instance().start()
With Tornado's default event loop driving the app, IOLoop creates an event loop that responds to epoll events and dispatches each request to the corresponding handler.
Asynchronous HTTP client
Tornado provides an asynchronous HTTPClient for calling third-party APIs from inside a handler. Even if the current third-party call is blocked, Tornado can still serve other handlers.
from tornado import gen, httpclient


class GenHttpHandler(web.RequestHandler):
    @gen.coroutine
    def get(self):
        url = 'http://127.0.0.1:5000/'
        client = httpclient.AsyncHTTPClient()
        resp = yield client.fetch(url)
        print(resp.body)
        self.finish(resp.body)
gen is Tornado's coroutine module. On Python 3 the async/await syntax can be used as well:
class AsyncHttpHandler(web.RequestHandler):
    async def get(self):
        url = 'http://127.0.0.1:5000/'
        client = httpclient.AsyncHTTPClient()
        resp = await client.fetch(url)
        print(resp.body)
        self.finish(resp.body)
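Both handlers fetch http://127.0.0.1:5000/. In these tests that address is assumed to be a simple local stub standing in for the third-party API; any service will do, for example a hypothetical one-file Tornado app such as:

import tornado.ioloop
import tornado.web


class StubHandler(tornado.web.RequestHandler):
    def get(self):
        # Stands in for the third-party API called by the handlers above.
        self.finish("stub response")


if __name__ == '__main__':
    app = tornado.web.Application([(r'/', StubHandler)])
    app.listen(5000)
    tornado.ioloop.IOLoop.instance().start()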
asyncio event loop
Coroutines defined with async/await are mostly compatible with Tornado's coroutines, but not fully. For example, asyncio.sleep will not work on Tornado's default IOLoop.
import asyncio


class SleepHandler(web.RequestHandler):
    async def get(self):
        print("hello tornado")
        await asyncio.sleep(5)
        self.write('It works!')
For the asyncio.sleep above to work, the IOLoop has to be replaced with asyncio's event loop.
import asyncio

from tornado.platform import asyncio as tornado_asyncio


if __name__ == '__main__':
    tornado_asyncio.AsyncIOMainLoop().install()
    app = App()
    server = httpserver.HTTPServer(app, xheaders=True)
    server.listen(5020)
    asyncio.get_event_loop().run_forever()
tornado_asyncio.AsyncIOMainLoop() (from tornado.platform.asyncio) replaces the default IOLoop with one driven by asyncio.
uvloop event loop
Besides the standard library's asyncio event loop, the community has implemented another event loop, uvloop, in Cython as a drop-in replacement for the standard one; it claims to be the fastest asynchronous I/O loop for Python. It is used like this:
import asyncio

import uvloop
from tornado.platform import asyncio as tornado_asyncio


if __name__ == '__main__':
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    tornado_asyncio.AsyncIOMainLoop().install()
    app = App()
    server = httpserver.HTTPServer(app, xheaders=True)
    server.listen(5030)
    asyncio.get_event_loop().run_forever()
Since uvloop depends on Cython, Cython needs to be installed as well; both can be installed directly with pip.
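To confirm that uvloop is actually in effect after setting the policy, one simple check (a minimal sketch) is to look at the type of the event loop; uvloop exposes its loop class as uvloop.Loop:

import asyncio

import uvloop

asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
loop = asyncio.get_event_loop()
# Prints True when the uvloop policy supplies the event loop.
print(isinstance(loop, uvloop.Loop))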
Performance of the three event loops
Among the three event loops, the default IOLoop does not handle asyncio.sleep well, so the comparison mainly focuses on the latter two. Three kinds of endpoints were tested:
1. simply returning a short string
2. the asynchronous HTTP client
3. database reads and writes
Simply returning a string
IOLoop
Load test with ab, 100 concurrent connections and 10,000 requests in total:
ab -k -c100 -n10000 http://127.0.0.1:5010/
Server Software: TornadoServer/4.5.1
Server Hostname: 127.0.0.1
Server Port: 5010
Document Path: /
Document Length: 8 bytes
Concurrency Level: 100
Time taken for tests: 5.615 seconds
Complete requests: 10000
Failed requests: 0
Keep-Alive requests: 10000
Total transferred: 2260000 bytes
HTML transferred: 80000 bytes
Requests per second: 1780.84 [#/sec] (mean)
Time per request: 56.153 [ms] (mean)
Time per request: 0.562 [ms] (mean, across all concurrent requests)
Transfer rate: 393.04 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.2 0 3
Processing: 2 56 5.9 56 154
Waiting: 2 56 5.9 56 154
Total: 5 56 5.8 56 158
QPS: 1780.84
Results with wrk, 500 connections across 12 threads, running for one minute:
➜ ~ wrk -t12 -c500 -d60 http://127.0.0.1:5010/
Running 1m test @ http://127.0.0.1:5010/
12 threads and 500 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 284.66ms 57.85ms 422.16ms 85.62%
Req/Sec 139.33 94.69 696.00 64.84%
99270 requests in 1.00m, 19.12MB read
Socket errors: connect 0, read 582, write 0, timeout 0
Requests/sec: 1651.92
Transfer/sec: 325.87KB
Asyncio
Concurrency Level: 100
Time taken for tests: 5.616 seconds
Complete requests: 10000
Failed requests: 0
Keep-Alive requests: 10000
Total transferred: 2260000 bytes
HTML transferred: 80000 bytes
Requests per second: 1780.69 [#/sec] (mean)
Time per request: 56.158 [ms] (mean)
Time per request: 0.562 [ms] (mean, across all concurrent requests)
Transfer rate: 393.00 [Kbytes/sec] received
QPS: 1780.69
wrk results:
➜ ~ wrk -t12 -c500 -d60 http://127.0.0.1:5020/
Running 1m test @ http://127.0.0.1:5020/
12 threads and 500 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 265.34ms 32.16ms 453.76ms 83.32%
Req/Sec 157.85 104.58 696.00 63.36%
108364 requests in 1.00m, 20.88MB read
Socket errors: connect 0, read 458, write 2, timeout 0
Requests/sec: 1803.34
Transfer/sec: 355.74KB
uvloop
ab results for uvloop:
Concurrency Level: 100
Time taken for tests: 5.612 seconds
Complete requests: 10000
Failed requests: 0
Keep-Alive requests: 10000
Total transferred: 2260000 bytes
HTML transferred: 80000 bytes
Requests per second: 1781.98 [#/sec] (mean)
Time per request: 56.117 [ms] (mean)
Time per request: 0.561 [ms] (mean, across all concurrent requests)
Transfer rate: 393.29 [Kbytes/sec] received
wrk results:
➜ ~ wrk -t12 -c500 -d60 http://127.0.0.1:5030/
Running 1m test @ http://127.0.0.1:5030/
12 threads and 500 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 272.23ms 47.65ms 457.63ms 87.26%
Req/Sec 148.17 103.62 570.00 63.33%
104625 requests in 1.00m, 20.16MB read
Socket errors: connect 0, read 567, write 0, timeout 0
Requests/sec: 1740.76
Transfer/sec: 343.39KB
Asynchronous HTTP client performance
This test measures a handler that calls another API, i.e. a third-party request, from inside the handler. Approximate results:

| (req/s) | IOLoop | asyncio | uvloop |
| --- | --- | --- | --- |
| ab | 571.12 | 462.64 | 534.99 |
| wrk | 448.11 | 444.63 | 411.19 |
Conclusion
Across these load tests the three event loops perform in the same ballpark; none pulls clearly ahead, so purely on performance any of them will do. Considering that the third-party libraries target the standard asyncio interface, that other uvloop-driven frameworks such as sanic and japronto look quite good, and that Cython can be used for further speedups, the database-driver tests below use uvloop as the event loop.
Database tests
The most commonly used MySQL driver in Python is MySQLdb, but it does not support Python 3. On Python 3 the asynchronous MySQL driver is aiomysql, built on top of PyMySQL. PostgreSQL and MongoDB both provide drivers based on the asyncio event loop.
asyncpg
For PostgreSQL a good choice is asyncpg, which is more actively maintained and faster than aiopg. It is used like this:
import asyncio

import asyncpg
import uvloop
import tornado.httpserver as httpserver
import tornado.web as web
from tornado.platform import asyncio as tornado_asyncio


class DatabaseHandler(web.RequestHandler):
    async def get(self):
        # One connection per request: connect, query, close.
        conn = await asyncpg.connect('postgresql://postgres@localhost/test')
        # rows = await conn.fetchrow('select pg_sleep(5)')
        rows = await conn.fetchrow('select * from public.user')
        print(rows[0])
        await conn.close()
        self.finish("ok")


class PoolHandler(web.RequestHandler):
    async def get(self):
        pool = self.application.pool
        async with pool.acquire() as connection:
            # Open a transaction.
            async with connection.transaction():
                # Run the query passing the request argument.
                rows = await connection.fetch("SELECT * FROM public.user ")
                # rows = await connection.fetch("SELECT pg_sleep(1) ")
                print(rows)
        self.finish("ok")


class App(web.Application):
    def __init__(self, pool):
        settings = {
            'debug': True
        }
        self._pool = pool
        super(App, self).__init__(
            handlers=[
                (r'/', IndexHandler),  # IndexHandler as defined earlier
                (r'/db', DatabaseHandler),
                (r'/pool', PoolHandler),
            ],
            **settings)

    @property
    def pool(self):
        return self._pool


async def init_db_pool():
    return await asyncpg.create_pool(database='test',
                                     user='postgres')


def init_app(pool):
    app = App(pool)
    return app


if __name__ == '__main__':
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    tornado_asyncio.AsyncIOMainLoop().install()
    loop = asyncio.get_event_loop()
    pool = loop.run_until_complete(init_db_pool())
    app = init_app(pool=pool)
    server = httpserver.HTTPServer(app, xheaders=True)
    server.listen(5040)
    loop.run_forever()
The first handler uses short-lived connections: every request opens a database connection, runs the query, and closes it. The second uses a connection pool: once requests exceed the pool size the handler waits for a free connection, but the server as a whole is not blocked.
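How long a handler may wait is bounded by the pool size. A minimal sketch of capping it explicitly with asyncpg's create_pool (min_size and max_size are asyncpg parameters; the values here are only illustrative):

import asyncpg


async def init_db_pool():
    # At most 20 connections are opened; further requests wait in pool.acquire().
    return await asyncpg.create_pool(database='test',
                                     user='postgres',
                                     min_size=5,
                                     max_size=20)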
aiomysql
For MySQL, the equivalent pool-based setup with aiomysql looks like this:
import asyncio

import aiomysql
import uvloop
import tornado.httpserver as httpserver
import tornado.web as web
from tornado.platform import asyncio as tornado_asyncio


class PoolHandler(web.RequestHandler):
    async def get(self):
        pool = self.application.pool
        async with pool.acquire() as conn:
            async with conn.cursor() as cur:
                await cur.execute("SELECT * FROM users_account LIMIT 1")
                ret = await cur.fetchone()
                print(ret)
        self.finish("ok")


class App(web.Application):
    def __init__(self, pool):
        settings = {
            'debug': True
        }
        self._pool = pool
        super(App, self).__init__(
            handlers=[
                (r'/pool', PoolHandler),
            ],
            **settings)

    @property
    def pool(self):
        return self._pool


async def init_db_pool(loop):
    return await aiomysql.create_pool(host='127.0.0.1', port=3306,
                                      user='root', password='root',
                                      db='hydra', loop=loop)


def init_app(pool):
    app = App(pool)
    return app


if __name__ == '__main__':
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    tornado_asyncio.AsyncIOMainLoop().install()
    loop = asyncio.get_event_loop()
    pool = loop.run_until_complete(init_db_pool(loop=loop))
    app = init_app(pool=pool)
    server = httpserver.HTTPServer(app, xheaders=True)
    server.listen(5070)
    loop.run_forever()
motor
The MongoDB driver is motor, which also supports asyncio. It is used like this:
import asyncio

import uvloop
import tornado.httpserver as httpserver
import tornado.web as web
from motor import motor_asyncio
from tornado.platform import asyncio as tornado_asyncio


class MongodbHandler(web.RequestHandler):
    async def get(self):
        ret = await self.application.motor_client.hello.find_one()
        # ret = await self.application.motor_client.hello.insert({'hello': 'world'})
        print(ret)
        self.finish("It works !")


class App(web.Application):
    def __init__(self):
        settings = {
            'debug': True
        }
        super(App, self).__init__(
            handlers=[
                (r'/', IndexHandler),  # IndexHandler as defined earlier
                (r'/mongodb', MongodbHandler),
            ],
            **settings)

    @property
    def motor_client(self):
        # Note: a new client is created on every property access.
        client = motor_asyncio.AsyncIOMotorClient('mongodb://localhost:27017')
        return client['test']


if __name__ == '__main__':
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    tornado_asyncio.AsyncIOMainLoop().install()
    app = App()
    server = httpserver.HTTPServer(app, xheaders=True)
    server.listen(5060)
    asyncio.get_event_loop().run_forever()
Database read performance
Load tested with ab -c100 -n10000 and wrk -t12 -c100 -d60s; results in requests per second:
| (req/s) | asyncpg-db | asyncpg-pool | aiomysql | motor |
| --- | --- | --- | --- | --- |
| ab | 305.49 | 898.84 | 669.75 | 236.82 |
| wrk | 281.60 | 819.23 | 655.58 | 252.51 |
During the load tests, hitting the per-request-connection endpoint with 500 wrk connections produced connection errors (Too many connections), and the MongoDB endpoint produced "Can't assign requested address" errors.
Because the database reads are non-blocking, in the per-request-connection mode (the asyncpg /db handler and the motor handler) the number of open connections grows with the request rate, and once it hits the maximum an exception is raised. The pooled mode instead waits for a connection to be released before issuing the query, and it also performs best. aiomysql's pool behaves the same way as asyncpg's.
With synchronous MySQL drivers it is common to keep a single long-lived connection. Asynchronous drivers cannot work that way: while one coroutine is blocked on the connection, other coroutines still cannot use it, so the best approach is to manage connections with a pool.
Conclusion
Tornado's author has also pointed out that in his tests asyncio and Tornado's built-in epoll event loop perform roughly the same, and Tornado 5.0 plans to fully adopt asyncio. Until then, whether Tornado runs on its own event loop, on asyncio, or on uvloop, the performance difference is small. When asyncio-based database or HTTP libraries are needed, driving Tornado with uvloop gives the best compatibility.