nature-python/wsgi和tornado.md

## wsgi和tornado.md

      
    Raw
  

              wsgi和tornado.md
            
          
    #WSGI:(web server gateway interface)WEB服务器网关接口
WSGI是为python语言定义的web服务器和web应用程序或框架之间的一种简单而实用的借口。wsgi是一个web组件的接口规范，它将web组件分为三类：server，middleware，application
##wsgi server
wsgi server可以理解为一个符合wsgi规范的web server，接收request请求，封装一系列环境变量，按照wsgi规范调用注册的wsgi app，最后将response返回给客户端。文字很难解释清楚wsgi server到底是什么东西，以及做些什么事情，最直观的方式还是看wsgi server的实现代码。以python自带的wsgiref为例，wsgiref是按照wsgi规范实现的一个简单wsgi server。其工作流程如下：

服务器创建socket，监听端口，等待客户端连接。
当有请求来时，服务器解析客户端信息放到环境变量environ中，并调用绑定的handler来处理请求。
handler解析这个http请求，将请求信息例如method，path等放到environ中。
wsgi handler再将一些服务器端信息也放到environ中，最后服务器信息，客户端信息，本次请求信息全部都保存到了环境变量environ中。
wsgi handler 调用注册的wsgi app，并将environ和回调函数传给wsgi app
wsgi app 将reponse header/status/body 回传给wsgi handler
最终handler还是通过socket将response信息塞回给客户端。

##wsgi application
wsgi application就是一个普通的callable对象，当有请求到来时，wsgi server会调用这个wsgi app。这个对象接收两个参数，通常为environ,start_response。environ就像前面介绍的，可以理解为环境变量，跟一次请求相关的所有信息都保存在了这个环境变量中，包括服务器信息，客户端信息，请求信息。start_response是一个callback函数，wsgi application通过调用start_response，将response headers/status 返回给wsgi server。此外这个wsgi app会return 一个iterator对象 ，这个iterator就是response body。这么空讲感觉很虚，对着下面这个简单的例子看就明白很多了。下面这个例子是一个最简单的wsgi app，引自http://www.python.org/dev/peps/pep-3333/
def simple_app(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [u"This is hello wsgi app".encode('utf8')]
我们再用wsgiref 作为wsgi server ，然后调用这个wsgi app，就能直观看到一次request,response的效果，简单修改代码如下：
from wsgiref.simple_server import make_server

def simple_app(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [u"This is hello wsgi app".encode('utf8')]

httpd = make_server('', 8000, simple_app)
print "Serving on port 8000..."
httpd.serve_forever()
访问http://127.0.0.1:8000 就能看到效果了。
此外，上面讲到了wsgi app只要是一个callable对象就可以了，因此不一定要是函数，一个实现了__call__方法的实例也可以，示例代码如下:
from wsgiref.simple_server import make_server

class AppClass:

    def __call__(self,environ, start_response):
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        start_response(status, response_headers)
        return ["hello world!"]

app = AppClass()
httpd = make_server('', 8000, app)
print "Serving on port 8000..."
httpd.serve_forever()
##wsgi middleware
上面的application看起来没什么意思，感觉没有太大用，但加上一层层的middleware包装之后就不一样了。一堆文字解释可能还没有一个demo更容易说明白，我写了一个简单Dispatcher Middleware，用来实现URL 路由：
from wsgiref.simple_server import make_server

URL_PATTERNS= (
    ('hi/','say_hi'),
    ('hello/','say_hello'),
    )

class Dispatcher(object):

    def _match(self,path):
        path = path.split('/')[1]
        for url,app in URL_PATTERNS:
            if path in url:
                return app

    def __call__(self,environ, start_response):
        path = environ.get('PATH_INFO','/')
        app = self._match(path)
        if app :
            app = globals()[app]
            return app(environ, start_response)
        else:
            start_response("404 NOT FOUND",[('Content-type', 'text/plain')])
            return ["Page dose not exists!"]

def say_hi(environ, start_response):
    start_response("200 OK",[('Content-type', 'text/html')])
    return ["kenshin say hi to you!"]

def say_hello(environ, start_response):
    start_response("200 OK",[('Content-type', 'text/html')])
    return ["kenshin say hello to you!"]

app = Dispatcher()

httpd = make_server('', 8000, app)
print "Serving on port 8000..."
httpd.serve_forever()

上面的例子可以看出来，middleware 包装之后，一个简单wsgi app就有了URL dispatch功能。然后我还可以在这个app外面再加上其它的middleware来包装它，例如加一个权限认证的middleware：
class Auth(object):
    def __init__(self,app):
        self.app = app

    def __call__(self,environ, start_response):
        #TODO
        return self.app(environ, start_response)

app = Dispatcher()
auth_app = Auth(app)

httpd = make_server('', 8000, auth_app)
print "Serving on port 8000..."
httpd.serve_forever()

经过这些middleware的包装，已经有点框架的感觉了。
##Tornado与wsgi
WSGI支持tornado,tornado.wsgi通过以下两种方式来支持wsgi:

WSGIApplication，可用于在其他支持wsgi的HTTP server上运行tornado app,例如google app engine

class WSGIApplication(web.Application):
    """A WSGI equivalent of `tornado.web.Application`.

    `WSGIApplication` is very similar to `tornado.web.Application`,
    except no asynchronous methods are supported (since WSGI does not
    support non-blocking requests properly). If you call
    ``self.flush()`` or other asynchronous methods in your request
    handlers running in a `WSGIApplication`, we throw an exception.

    Example usage::

        import tornado.web
        import tornado.wsgi
        import wsgiref.simple_server

        class MainHandler(tornado.web.RequestHandler):
            def get(self):
                self.write("Hello, world")

        if __name__ == "__main__":
            application = tornado.wsgi.WSGIApplication([
                (r"/", MainHandler),
            ])
            server = wsgiref.simple_server.make_server('', 8888, application)
            server.serve_forever()

    See the `appengine demo
    <https://github.com/facebook/tornado/tree/master/demos/appengine>`_
    for an example of using this module to run a Tornado app on Google
    App Engine.

    WSGI applications use the same `.RequestHandler` class, but not
    ``@asynchronous`` methods or ``flush()``.  This means that it is
    not possible to use `.AsyncHTTPClient`, or the `tornado.auth` or
    `tornado.websocket` modules.
    """
    def __init__(self, handlers=None, default_host="", **settings):
        web.Application.__init__(self, handlers, default_host, transforms=[],
                                 wsgi=True, **settings)

    def __call__(self, environ, start_response):
        handler = web.Application.__call__(self, HTTPRequest(environ))
        assert handler._finished
        reason = handler._reason
        status = str(handler._status_code) + " " + reason
        headers = list(handler._headers.get_all())
        if hasattr(handler, "_new_cookie"):
            for cookie in handler._new_cookie.values():
                headers.append(("Set-Cookie", cookie.OutputString(None)))
        start_response(status,
                       [(native_str(k), native_str(v)) for (k, v) in headers])
        return handler._write_buffer

WSGIContainer，可在tornado http server上运行其他wsgi application

class WSGIContainer(object):
    r"""Makes a WSGI-compatible function runnable on Tornado's HTTP server.

    Wrap a WSGI function in a `WSGIContainer` and pass it to `.HTTPServer` to
    run it. For example::

        def simple_app(environ, start_response):
            status = "200 OK"
            response_headers = [("Content-type", "text/plain")]
            start_response(status, response_headers)
            return ["Hello world!\n"]

        container = tornado.wsgi.WSGIContainer(simple_app)
        http_server = tornado.httpserver.HTTPServer(container)
        http_server.listen(8888)
        tornado.ioloop.IOLoop.instance().start()

    This class is intended to let other frameworks (Django, web.py, etc)
    run on the Tornado HTTP server and I/O loop.

    The `tornado.web.FallbackHandler` class is often useful for mixing
    Tornado and WSGI apps in the same server.  See
    https://github.com/bdarnell/django-tornado-demo for a complete example.
    """
    def __init__(self, wsgi_application):
        self.wsgi_application = wsgi_application

    def __call__(self, request):
        data = {}
        response = []

        def start_response(status, response_headers, exc_info=None):
            data["status"] = status
            data["headers"] = response_headers
            return response.append
        app_response = self.wsgi_application(
            WSGIContainer.environ(request), start_response)
        try:
            response.extend(app_response)
            body = b"".join(response)
        finally:
            if hasattr(app_response, "close"):
                app_response.close()
        if not data:
            raise Exception("WSGI app did not call start_response")

        status_code = int(data["status"].split()[0])
        headers = data["headers"]
        header_set = set(k.lower() for (k, v) in headers)
        body = escape.utf8(body)
        if status_code != 304:
            if "content-length" not in header_set:
                headers.append(("Content-Length", str(len(body))))
            if "content-type" not in header_set:
                headers.append(("Content-Type", "text/html; charset=UTF-8"))
        if "server" not in header_set:
            headers.append(("Server", "TornadoServer/%s" % tornado.version))

        parts = [escape.utf8("HTTP/1.1 " + data["status"] + "\r\n")]
        for key, value in headers:
            parts.append(escape.utf8(key) + b": " + escape.utf8(value) + b"\r\n")
        parts.append(b"\r\n")
        parts.append(body)
        request.write(b"".join(parts))
        request.finish()
        self._log(status_code, request)

    @staticmethod
[docs]    def environ(request):
        """Converts a `tornado.httpserver.HTTPRequest` to a WSGI environment.
        """
        hostport = request.host.split(":")
        if len(hostport) == 2:
            host = hostport[0]
            port = int(hostport[1])
        else:
            host = request.host
            port = 443 if request.protocol == "https" else 80
        environ = {
            "REQUEST_METHOD": request.method,
            "SCRIPT_NAME": "",
            "PATH_INFO": to_wsgi_str(escape.url_unescape(
                request.path, encoding=None, plus=False)),
            "QUERY_STRING": request.query,
            "REMOTE_ADDR": request.remote_ip,
            "SERVER_NAME": host,
            "SERVER_PORT": str(port),
            "SERVER_PROTOCOL": request.version,
            "wsgi.version": (1, 0),
            "wsgi.url_scheme": request.protocol,
            "wsgi.input": BytesIO(escape.utf8(request.body)),
            "wsgi.errors": sys.stderr,
            "wsgi.multithread": False,
            "wsgi.multiprocess": True,
            "wsgi.run_once": False,
        }
        if "Content-Type" in request.headers:
            environ["CONTENT_TYPE"] = request.headers.pop("Content-Type")
        if "Content-Length" in request.headers:
            environ["CONTENT_LENGTH"] = request.headers.pop("Content-Length")
        for key, value in request.headers.items():
            environ["HTTP_" + key.replace("-", "_").upper()] = value
        return environ

    def _log(self, status_code, request):
        if status_code < 400:
            log_method = access_log.info
        elif status_code < 500:
            log_method = access_log.warning
        else:
            log_method = access_log.error
        request_time = 1000.0 * request.request_time()
        summary = request.method + " " + request.uri + " (" + \
            request.remote_ip + ")"
        log_method("%d %s %.2fms", status_code, summary, request_time)