WSGI Analysis

2024-01-29 16:08


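A brief analysis of the mod_wsgi request flow: an example of embedding Python

WSGI: a protocol that specifies the interface between a generic web server and a Python application.

WSGI app: a Python application that conforms to the WSGI specification.

mod_wsgi: an extension module for the Apache server that implements the WSGI protocol on Apache. With it, WSGI apps can run on Apache.

In short, in WSGIScriptAlias mode (embedded mode) the Python interpreter is embedded in the Apache process, and the request-handling code runs in Apache's worker child processes. With WSGIDaemonProcess (daemon mode) the Python interpreter runs in separate processes, isolated from the Apache processes.

How does mod_wsgi initialise Python? How does it relate to Apache? What roughly happens after a simple HTTP request comes in? The rest of this article briefly analyses WSGIScriptAlias mode.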

Apache configuration:

WSGIScriptAlias  /hello /var/www/hello.wsgi

This tells Apache that hello.wsgi is a mod_wsgi app and that every request under /hello/ is forwarded to it.
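The hello.wsgi script itself is not reproduced in this article. A minimal sketch of what such a script might look like is given here, assuming the callable is named `application` (the name used in the `def application(environ, start_response)` signature quoted later); the response text is made up for illustration.

```python
# hello.wsgi: a minimal WSGI application (illustrative, not the author's original file).
def application(environ, start_response):
    # environ is a dict of CGI-style request variables;
    # start_response is called once with the status line and header list.
    body = b"Hello, mod_wsgi!\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # The return value is an iterable of byte strings.
    return [body]
```

The server only needs the module-level callable; everything else in the file is ordinary Python.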

WSGI code (the mod_wsgi 3.3 source tree):

jaime@westeros:~/source/mod-wsgi-3.3$ ls
build-2.6  build-3.2     debian    Makefile.in  mod_wsgi.lo       posix-ap2X.mk.in   win32-ap22py31.mk
build-2.7  configure     LICENCE   mod_wsgi.c   mod_wsgi.slo      README
build-3.1  configure.ac  Makefile  mod_wsgi.la  posix-ap1X.mk.in  win32-ap22py26.mk

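mod_wsgi.c contains a lot of code for Apache 1.3, and many of those functions share names with the 2.0 code, which is misleading and makes the file hard to read. The unifdef tool can replace all the 1.3-specific code with blank lines, which keeps the line numbers intact while making the file much cleaner: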

jaime@westeros:~/source/mod-wsgi-3.3$ sudo apt-get install unifdef
jaime@westeros:~/source/mod-wsgi-3.3$ unifdef -DAP_SERVER_MAJORVERSION_NUMBER=2 -b mod_wsgi.c  > mod_wsgi-clean.c

The Apache module entry point, mod_wsgi.c +15085:

/* Dispatch list for API hooks */
module AP_MODULE_DECLARE_DATA wsgi_module = {
    STANDARD20_MODULE_STUFF,
    wsgi_create_dir_config,    /* create per-dir    config structures */
    wsgi_merge_dir_config,     /* merge  per-dir    config structures */
    wsgi_create_server_config, /* create per-server config structures */
    wsgi_merge_server_config,  /* merge  per-server config structures */
    wsgi_commands,             /* table of config file commands       */
    wsgi_register_hooks        /* register hooks                      */
};

The functions that handle the configuration directives, mod_wsgi.c +14982:

static const command_rec wsgi_commands[] =
{
    AP_INIT_RAW_ARGS("WSGIScriptAlias", wsgi_add_script_alias,
        NULL, RSRC_CONF, "Map location to target WSGI script file."),
    ...
#if defined(MOD_WSGI_WITH_DAEMONS)
    AP_INIT_RAW_ARGS("WSGIDaemonProcess", wsgi_add_daemon_process,
        NULL, RSRC_CONF, "Specify details of daemon processes to start."),
    ...
    AP_INIT_TAKE1("WSGILazyInitialization", wsgi_set_lazy_initialization,
        NULL, RSRC_CONF, "Enable/Disable lazy Python initialization."),
#endif
    ...
};

wsgi_add_script_alias mostly does some initialisation: it tells the Apache dispatcher to watch for URLs like XXX and to call mod_wsgi to handle them.

The interesting part is wsgi_register_hooks, mod_wsgi.c +14931:

static void wsgi_register_hooks(apr_pool_t *p)
{
    ...
    static const char * const p6[] = { "mod_python.c", NULL };

    ap_hook_post_config(wsgi_hook_init, p6, NULL, APR_HOOK_MIDDLE);
    ap_hook_child_init(wsgi_hook_child_init, p6, NULL, APR_HOOK_MIDDLE);
    ap_hook_translate_name(wsgi_hook_intercept, p1, n1, APR_HOOK_MIDDLE);
    ap_hook_handler(wsgi_hook_handler, NULL, NULL, APR_HOOK_MIDDLE);
    ...
}

Judging by the names, wsgi_hook_init and wsgi_hook_child_init perform initialisation. First, let us look at what wsgi_hook_handler does, mod_wsgi.c +8690:

static int wsgi_hook_handler(request_rec *r)
{
    ...
    /*
     * Only process requests for this module. First check for
     * where target is the actual WSGI script. Then need to
     * check for the case where handler name mapped to a handler
     * script definition.
     */

    // ... a pile of argument-checking code ...
    ...

    /* Build the sub process environment. */

    // WSGI-related environment variables are set here. They differ for
    // every request, so every request has to pass through this point.
    wsgi_build_environment(r);
    ...

    // Handling for WSGIDaemonProcess mode

    /*
     * Execute the target WSGI application script or proxy
     * request to one of the daemon processes as appropriate.
     */

#if defined(MOD_WSGI_WITH_DAEMONS)
    status = wsgi_execute_remote(r);

    if (status != DECLINED)
        return status;
#endif
    ...
    return wsgi_execute_script(r);
}
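The dispatch order in this handler can be modelled in a few lines of Python. This is only a sketch of the control flow: `hook_handler` and its callback parameters are hypothetical names, and the handler-name test is a simplification of the real argument checks that were elided above.

```python
DECLINED = -1  # Apache's "this module does not handle the request" status
OK = 0         # Apache's success status


def hook_handler(request, build_environment, execute_remote, execute_script):
    """Model of wsgi_hook_handler: daemon mode first, embedded mode as fallback."""
    # Simplified stand-in for the elided checks: only WSGI scripts are handled.
    if request.get("handler") != "wsgi-script":
        return DECLINED
    # Per-request environment is built before anything is executed.
    build_environment(request)
    # Daemon mode gets the first chance; DECLINED means "not a daemon request".
    status = execute_remote(request)
    if status != DECLINED:
        return status
    # Embedded mode: run the script inside this Apache child process.
    return execute_script(request)
```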

wsgi_hook_handler is the entry point for every request, and it ends by calling wsgi_execute_script, mod_wsgi.c +6404:

static int wsgi_execute_script(request_rec *r)
{
    ...
    /* Grab request configuration. */

    config = (WSGIRequestConfig *)ap_get_module_config(r->request_config,
                                                       &wsgi_module);

    /*
     * Acquire the desired python interpreter. Once this is done
     * it is safe to start manipulating python objects.
     */

    // Get the interpreter. A WSGI app can run in its own Python interpreter,
    // and several interpreters can run in one process at the same time.
    // application_group is set in the wsgi_application_group function. It
    // depends on the request's server name, port and script name, and it
    // decides which interpreter handles each request.
    interp = wsgi_acquire_interpreter(config->application_group);

    if (!interp) {
        ap_log_rerror(APLOG_MARK, WSGI_LOG_CRIT(0), r,
                      "mod_wsgi (pid=%d): Cannot acquire interpreter '%s'.",
                      getpid(), config->application_group);

        return HTTP_INTERNAL_SERVER_ERROR;
    }

    /* Calculate the Python module name to be used for script. */

    if (config->handler_script && *config->handler_script)
        script = config->handler_script;
    else
        script = r->filename;

    // Work out the Python module name for this app
    name = wsgi_module_name(r->pool, script);
    ...
    modules = PyImport_GetModuleDict();
    module = PyDict_GetItemString(modules, name);

    Py_XINCREF(module);

    if (module)
        exists = 1;

    /*
     * If script reloading is enabled and the module for it has
     * previously been loaded, see if it has been modified since
     * the last time it was accessed. For a handler script will
     * also see if it contains a custom function for determining
     * if a reload should be performed.
     */

    // Reload logic: detect whether the app code has been modified
    if (module && config->script_reloading) {
        if (wsgi_reload_required(r->pool, r, script, module, r->filename)) {
            ...
#if defined(MOD_WSGI_WITH_DAEMONS)
            if (*config->process_group) {
                /*
                 * Need to restart the daemon process. We bail
                 * out on the request process here, sending back
                 * a special response header indicating that
                 * process is being restarted and that remote
                 * end should abandon connection and attempt to
                 * reconnect again. We also need to signal this
                 * process so it will actually shutdown. The
                 * process supervisor code will ensure that it
                 * is restarted.
                 */

                Py_BEGIN_ALLOW_THREADS
                ap_log_rerror(APLOG_MARK, WSGI_LOG_INFO(0), r,
                              "mod_wsgi (pid=%d): Force restart of "
                              "process '%s'.", getpid(),
                              config->process_group);
                Py_END_ALLOW_THREADS
                ...
                wsgi_release_interpreter(interp);

                r->status = HTTP_INTERNAL_SERVER_ERROR;
                r->status_line = "0 Rejected";

                wsgi_daemon_shutdown++;
                // WSGIDaemonProcess mode: kill the current daemon process
                // so that it is restarted with fresh code
                kill(getpid(), SIGINT);

                return OK;
            }
            else {
                ...
                PyDict_DelItemString(modules, name);
            }
#else
            /*
             * Need to reload just the script module. Remove
             * the module from the modules dictionary before
             * reloading it again. If code is executing
             * within the module at the time, the callers
             * reference count on the module should ensure
             * it isn't actually destroyed until it is
             * finished.
             */

            // WSGIScriptAlias mode: delete the old module
            PyDict_DelItemString(modules, name);
#endif
        }
    }
    ...
    // On the first request the module has to be loaded
    /* Load module if not already loaded. */

    if (!module) {
        module = wsgi_load_source(r->pool, r, name, exists, script,
                                  config->process_group,
                                  config->application_group);
    }
    ...
    // The exciting moment: run the app code!
    status = HTTP_INTERNAL_SERVER_ERROR;

    /* Determine if script exists and execute it. */

    if (module) {
        PyObject *module_dict = NULL;
        PyObject *object = NULL;

        module_dict = PyModule_GetDict(module);
        object = PyDict_GetItemString(module_dict, config->callable_object);

        if (object) {
            AdapterObject *adapter = NULL;
            adapter = newAdapterObject(r);

            if (adapter) {
                PyObject *method = NULL;
                PyObject *args = NULL;

                Py_INCREF(object);
                status = Adapter_run(adapter, object); // here, right here
                Py_DECREF(object);
                ...
            }
        }
        else {
            Py_BEGIN_ALLOW_THREADS
            ap_log_rerror(APLOG_MARK, WSGI_LOG_ERR(0), r,
                          "mod_wsgi (pid=%d): Target WSGI script '%s' does "
                          "not contain WSGI application '%s'.",
                          getpid(), script, config->callable_object);
            Py_END_ALLOW_THREADS

            status = HTTP_NOT_FOUND;
        }
    }

    // Error handling
    /* Log any details of exceptions if execution failed. */

    if (PyErr_Occurred())
        wsgi_log_python_error(r, NULL, r->filename);

    /* Cleanup and release interpreter, */

    Py_XDECREF(module);

    wsgi_release_interpreter(interp);

    return status;
}
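The "look up the module, drop it if stale, load it if missing" sequence in embedded mode can be sketched in Python. The body of wsgi_reload_required is not shown in this article, so comparing the file's modification time against a `__mtime__` attribute stored on the module is an assumption here (wsgi_load_source does record `__mtime__`). `reload_required`, `get_module` and the `load_source` callback are hypothetical names.

```python
import os


def reload_required(module, filename):
    """Return True when the script file changed since the module was loaded."""
    recorded = getattr(module, "__mtime__", None)
    if recorded is None:
        # No timestamp recorded: be conservative and reload.
        return True
    return os.path.getmtime(filename) != recorded


def get_module(modules, name, filename, load_source, reloading=True):
    """Fetch a cached script module, reloading it if the file was modified."""
    module = modules.get(name)
    if module is not None and reloading and reload_required(module, filename):
        # Embedded mode: forget the old module so it is loaded again below.
        del modules[name]
        module = None
    if module is None:
        module = load_source(name, filename)
        modules[name] = module
    return module
```

In daemon mode the C code does not delete the module; it restarts the whole daemon process instead, which this sketch does not model.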

Adapter_run +3823:

static int Adapter_run(AdapterObject *self, PyObject *object)
{
    ...
    vars = Adapter_environ(self);

    // Get the start_response function
    start = PyObject_GetAttrString((PyObject *)self, "start_response");

    // Prepare the arguments. Remember def application(environ, start_response)?
    args = Py_BuildValue("(OO)", vars, start);

    // Run the app code
    self->sequence = PyEval_CallObject(object, args);

    if (self->sequence != NULL) {
        if (!Adapter_process_file_wrapper(self)) {
            int aborted = 0;

            iterator = PyObject_GetIter(self->sequence);

            if (iterator != NULL) {
                PyObject *item = NULL;

                // Walk the returned iterator and output each item
                while ((item = PyIter_Next(iterator))) {
                    ...
                    if (length && !Adapter_output(self, msg, length, 0)) {
                        if (!PyErr_Occurred())
                            aborted = 1;
                        Py_DECREF(item);
                        break;
                    }
                }
            }
            ...
        }

        // If the returned sequence has a close method, call it
        if (PyObject_HasAttrString(self->sequence, "close")) {
            PyObject *args = NULL;
            PyObject *data = NULL;

            close = PyObject_GetAttrString(self->sequence, "close");

            args = Py_BuildValue("()");
            data = PyEval_CallObject(close, args);

            Py_DECREF(args);
            Py_XDECREF(data);
            Py_DECREF(close);
        }
        ...
    }
    ...
}
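The same steps, written as plain Python, may make the shape of Adapter_run easier to see. This is a sketch, not a port: `run_application` and the `write` callback are made-up names, and the `finally` block is slightly stricter than the C code shown above because it also calls `close()` when iteration raises.

```python
def run_application(app, environ, write):
    """Call a WSGI app and stream its result, mirroring the steps in Adapter_run."""
    state = {}

    def start_response(status, headers, exc_info=None):
        # Record status and headers; return the write callable as WSGI requires.
        state["status"] = status
        state["headers"] = list(headers)
        return write

    # Equivalent of PyEval_CallObject(object, (environ, start_response)).
    result = app(environ, start_response)
    try:
        for chunk in result:
            # Empty chunks are skipped, like the `length &&` check in the C code.
            if chunk:
                write(chunk)
    finally:
        # If the returned iterable has a close method, call it.
        close = getattr(result, "close", None)
        if close is not None:
            close()
    return state.get("status")
```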

AdapterObject is a custom Python type used to run the WSGI application; it provides the start_response method:

typedef struct {
    PyObject_HEAD
    int result;
    request_rec *r;
#if defined(MOD_WSGI_WITH_BUCKETS)
    apr_bucket_brigade *bb;
#endif
    WSGIRequestConfig *config;
    InputObject *input;
    PyObject *log;
    int status;
    const char *status_line;
    PyObject *headers;
    PyObject *sequence;
    int content_length_set;
    apr_off_t content_length;
    apr_off_t output_length;
} AdapterObject;

static PyTypeObject Adapter_Type;
...
static PyMethodDef Adapter_methods[] = {
    { "start_response", (PyCFunction)Adapter_start_response, METH_VARARGS, 0 },
    { "write",          (PyCFunction)Adapter_write, METH_VARARGS, 0 },
    { "file_wrapper",   (PyCFunction)Adapter_file_wrapper, METH_VARARGS, 0 },
    { NULL, NULL}
};

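The Adapter_xxx family of functions is the concrete implementation of the WSGI protocol. Admittedly, the earlier statement that the WSGI variables are set in wsgi_build_environment was not quite right: most of them are set in Adapter_environ :)

Adapter_start_response is the C implementation of start_response.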
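The body of Adapter_start_response is not listed here. The sketch below shows the behaviour that the WSGI specification (PEP 3333) requires of any start_response, written for Python 3; it is a model of the rules, not a transcription of the C function, and the `Adapter` class with its attributes is hypothetical.

```python
class Adapter(object):
    """Toy stand-in for AdapterObject that follows the PEP 3333 rules."""

    def __init__(self):
        self.status = None
        self.headers = None
        self.headers_sent = False
        self.output = []

    def start_response(self, status, headers, exc_info=None):
        if exc_info is not None:
            try:
                if self.headers_sent:
                    # Too late to change the response: re-raise the app's error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid keeping a traceback reference cycle
        elif self.status is not None:
            # A second call without exc_info is an application bug.
            raise RuntimeError("headers already set")
        self.status = status
        self.headers = list(headers)
        return self.write

    def write(self, data):
        if self.status is None:
            raise RuntimeError("write() called before start_response()")
        # Headers go out with the first chunk of body data.
        self.headers_sent = True
        self.output.append(data)
```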

How is an interpreter acquired?

static InterpreterObject *wsgi_acquire_interpreter(const char *name)
{
    PyThreadState *tstate = NULL;
    PyInterpreterState *interp = NULL;
    InterpreterObject *handle = NULL;
    ...
    /*
     * Check if already have interpreter instance and
     * if not need to create one.
     */

    handle = (InterpreterObject *)PyDict_GetItemString(wsgi_interpreters,
                                                       name);

    if (!handle) {
        // No interpreter was found, so a new one is created here
        handle = newInterpreterObject(name);
        ...
        // Store it in wsgi_interpreters
        PyDict_SetItemString(wsgi_interpreters, name, (PyObject *)handle);
    }
    else
        Py_INCREF(handle);

    interp = handle->interp;

    /*
     * Create new thread state object. We should only be
     * getting called where no current active thread
     * state, so no need to remember the old one. When
     * working with the main Python interpreter always
     * use the simplified API for GIL locking so any
     * extension modules which use that will still work.
     */

    // thread-related code
    ...
    return handle;
}
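The lookup above is essentially a dictionary keyed by the application group name. A rough Python model follows. `InterpreterRegistry`, `factory` and `application_group` are names made up for this sketch, and the group-name format (server name, a `:port` suffix for non-standard ports, then `|` and the script name) is an assumption inferred from names such as `127.0.1.1|/hello` in the log output later in this article, not copied from the source.

```python
class InterpreterRegistry(object):
    """Create-on-first-use table of interpreters, keyed by group name."""

    def __init__(self, factory):
        self._factory = factory    # stands in for newInterpreterObject
        self._interpreters = {}    # plays the role of wsgi_interpreters

    def acquire(self, name):
        handle = self._interpreters.get(name)
        if handle is None:
            # First request for this group in this process: create and cache.
            handle = self._factory(name)
            self._interpreters[name] = handle
        return handle


def application_group(server_name, port, script_name):
    """Build a group name such as '127.0.1.1|/hello' (format is assumed)."""
    host = server_name
    if port not in (80, 443):
        host = "%s:%d" % (server_name, port)
    return "%s|%s" % (host, script_name)
```

Because the table lives in process memory, each Apache child builds its own copy, which is why a second child has to create the interpreter again on its first request.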

The app code is loaded in wsgi_load_source:

static PyObject *wsgi_load_source(apr_pool_t *pool, request_rec *r,
                                  const char *name, int exists,
                                  const char* filename,
                                  const char *process_group,
                                  const char *application_group)
{
    ...
    fp = fopen(filename, "r");
    n = PyParser_SimpleParseFile(fp, filename, Py_file_input);
    ...
    co = (PyObject *)PyNode_Compile(n, filename);
    PyNode_Free(n);

    // Load the module from its name and the compiled code object co
    if (co)
        m = PyImport_ExecCodeModuleEx((char *)name, co, (char *)filename);

    Py_XDECREF(co);

    if (m) {
        ...
        // Record the module's modification time
        PyModule_AddObject(m, "__mtime__", object);
    }
    else {
        Py_BEGIN_ALLOW_THREADS
        if (r) {
            ap_log_rerror(APLOG_MARK, WSGI_LOG_ERR(0), r,
                          "mod_wsgi (pid=%d): Target WSGI script '%s' cannot "
                          "be loaded as Python module.", getpid(), filename);
        }
        ...
        wsgi_log_python_error(r, NULL, filename);
    }

    return m;
}
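A pure-Python approximation of the same parse, compile and execute sequence is sketched below. `load_source` is a hypothetical name, and using the file's mtime as the value of `__mtime__` is an assumption, since the C code that builds `object` was elided above.

```python
import os
import sys
import types


def load_source(name, filename):
    """Compile a script file and execute it as a module called `name`."""
    with open(filename, "r") as fp:
        source = fp.read()
    # compile() covers both the parse and the compile step of the C code.
    code = compile(source, filename, "exec")
    module = types.ModuleType(name)
    module.__file__ = filename
    # Like PyImport_ExecCodeModuleEx, register the module before running it.
    sys.modules[name] = module
    try:
        exec(code, module.__dict__)
    except BaseException:
        # Do not leave a half-initialised module behind.
        sys.modules.pop(name, None)
        raise
    # Remember when the file was last modified, for later reload checks.
    module.__mtime__ = os.path.getmtime(filename)
    return module
```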

That is roughly how mod_wsgi handles a request in WSGIScriptAlias mode once Apache calls wsgi_hook_handler. One question remains: when exactly is the Python environment initialised? Let us go back and look.

wsgi_hook_init mod_wsgi.c +13031:

static int wsgi_hook_init(apr_pool_t *pconf, apr_pool_t *ptemp,
                          apr_pool_t *plog, server_rec *s)
{
    ...
    /* Retain reference to base server. */

    wsgi_server = s;

    /* Retain record of parent process ID. */

    wsgi_parent_pid = getpid();

    /* Determine whether multiprocess and/or multithread. */

    ap_mpm_query(AP_MPMQ_IS_THREADED, &wsgi_multithread);
    wsgi_multithread = (wsgi_multithread != AP_MPMQ_NOT_SUPPORTED);

    ap_mpm_query(AP_MPMQ_IS_FORKED, &wsgi_multiprocess);
    if (wsgi_multiprocess != AP_MPMQ_NOT_SUPPORTED) {
        ap_mpm_query(AP_MPMQ_MAX_DAEMONS, &wsgi_multiprocess);
        wsgi_multiprocess = (wsgi_multiprocess != 1);
    }

    /* Retain reference to main server config. */

    wsgi_server_config = ap_get_module_config(s->module_config, &wsgi_module);

    /*
     * Check that the version of Python found at
     * runtime is what was used at compilation.
     */

    wsgi_python_version();

    /*
     * Initialise Python if required to be done in
     * the parent process. Note that it will not be
     * initialised if mod_python loaded and it has
     * already been done.
     */

    if (wsgi_python_required == -1)
        wsgi_python_required = 1;

    // Where Python gets initialised depends on wsgi_python_after_fork,
    // i.e. the WSGILazyInitialization option: before the Apache
    // processes fork, or after?
    if (!wsgi_python_after_fork)
        wsgi_python_init(pconf);

    /* Startup separate named daemon processes. */

    // In WSGIDaemonProcess mode the daemon processes are started here.
    // This is the entry point for exploring daemon mode.
#if defined(MOD_WSGI_WITH_DAEMONS)
    status = wsgi_start_daemons(pconf);
#endif

    return status;
}

The initialisation function that runs after fork:

static void wsgi_hook_child_init(apr_pool_t *p, server_rec *s)
{
    ...
    // wsgi_python_required depends on the WSGIRestrictEmbedded option
    if (wsgi_python_required) {
        /*
         * Initialise Python if required to be done in
         * the child process. Note that it will not be
         * initialised if mod_python loaded and it has
         * already been done.
         */

        if (wsgi_python_after_fork)
            wsgi_python_init(p);

        /*
         * Now perform additional initialisation steps
         * always done in child process.
         */

        wsgi_python_child_init(p);
    }
}

These two are only the Apache-facing hooks that Apache calls. The real Python initialisation happens in two steps, wsgi_python_init and wsgi_python_child_init:

static void wsgi_python_init(apr_pool_t *p)
{
    static int initialized = 1;

    /* Perform initialisation if required. */

    if (!Py_IsInitialized() || !initialized) {
        ...
        /* Initialise Python. */

        ap_log_error(APLOG_MARK, WSGI_LOG_INFO(0), wsgi_server,
                     "mod_wsgi (pid=%d): Initializing Python.", getpid());

        initialized = 1;

        Py_Initialize(); // the mysterious and mighty Py_Initialize

        /* Initialise threading. */

        PyEval_InitThreads();

#if PY_MAJOR_VERSION == 3 && PY_MINOR_VERSION >= 2
        /*
         * We now want to release the GIL. Before we do that
         * though we remember what the current thread state is.
         * We will use that later to restore the main thread
         * state when we want to cleanup interpreters on
         * shutdown.
         */

        wsgi_main_tstate = PyThreadState_Get();
        PyEval_ReleaseThread(wsgi_main_tstate);
#else
        PyThreadState_Swap(NULL);
        PyEval_ReleaseLock();
#endif

        wsgi_python_initialized = 1;

        /*
         * Register cleanups to be performed on parent restart
         * or shutdown. This will destroy Python itself.
         */

        apr_pool_cleanup_register(p, NULL, wsgi_python_parent_cleanup,
                                  apr_pool_cleanup_null);
    }
}

static void wsgi_python_child_init(apr_pool_t *p)
{
    // Work done by the second initialisation step; the fork has already happened

    /*
     * Trigger any special Python stuff required after a fork.
     * Only do this though if we were responsible for the
     * initialisation of the Python interpreter in the first
     * place to avoid it being done multiple times. Also only
     * do it if Python was initialised in parent process.
     */

    /* Finalise any Python objects required by child process. */

    /* Initialise Python interpreter instance table and lock. */

    // The dictionary that holds all interpreters
    wsgi_interpreters = PyDict_New();

    /*
     * Initialise the key for data related to a thread. At
     * the moment we only record an integer thread ID to be
     * used in lookup table to thread states associated with
     * an interpreter.
     */

    /*
     * Cache a reference to the first Python interpreter
     * instance. This interpreter is special as some third party
     * Python modules will only work when used from within this
     * interpreter. This is generally when they use the Python
     * simplified GIL API or otherwise don't use threading API
     * properly. An empty string for name is used to identify
     * the first Python interpreter instance.
     */

    /* Loop through import scripts for this process and load them. */

    // Process wsgi_import_list
    if (wsgi_import_list) {
        ...
    }
}
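Which process ends up running wsgi_python_init depends on two flags, and the combinations are easy to lose track of. The small Python model below lists the calls made in the parent and in each child, following only the branches visible in the C snippets above; `init_sequence` is a hypothetical helper, not part of mod_wsgi.

```python
def init_sequence(python_after_fork, python_required=True):
    """Return (parent_calls, child_calls) in the order the hooks make them."""
    # Parent process: wsgi_hook_init initialises Python unless it is deferred.
    parent = ["wsgi_hook_init"]
    if not python_after_fork:
        parent.append("wsgi_python_init")
    # Child process: wsgi_hook_child_init finishes the job after the fork.
    child = ["wsgi_hook_child_init"]
    if python_required:
        if python_after_fork:
            child.append("wsgi_python_init")
        child.append("wsgi_python_child_init")
    return parent, child
```

With `python_after_fork=True` this yields the child-side order wsgi_hook_child_init, wsgi_python_init, wsgi_python_child_init.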

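Ha, nearly done. Now let us print some interesting output to see when, and in which process, these functions get called. Note that the patch below applies to the code that has not been run through unifdef: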

diff --git a/mod_wsgi.c b/mod_wsgi.c
index f0764b8..1781f7b 100644
--- a/mod_wsgi.c
+++ b/mod_wsgi.c
@@ -29,6 +29,8 @@
  *
  */
 
+#define INFO(fmt, args...) ap_log_error(APLOG_MARK, WSGI_LOG_ERR(0), wsgi_server, "[pid %d] %s:%s:%d "fmt, getpid(),__FILE__, __PRETTY_FUNCTION__, __LINE__,args)
+
 #define CORE_PRIVATE 1
 
 #include "httpd.h"
@@ -5722,10 +5724,14 @@ static void wsgi_python_init(apr_pool_t *p)
     static int initialized = 1;
 #endif
 
+    INFO("%s", "enter");
+
     /* Perform initialisation if required. */
 
     if (!Py_IsInitialized() || !initialized) {
 
+        INFO("%s", "init python");
+
         /* Enable Python 3.0 migration warnings. */
 
 #if PY_MAJOR_VERSION == 2 && PY_MINOR_VERSION >= 6
@@ -5859,6 +5865,8 @@ static PyObject *wsgi_interpreters = NULL;
 
 static InterpreterObject *wsgi_acquire_interpreter(const char *name)
 {
+    INFO("search interpreter %s", name);
+
     PyThreadState *tstate = NULL;
     PyInterpreterState *interp = NULL;
     InterpreterObject *handle = NULL;
@@ -5893,6 +5901,9 @@ static InterpreterObject *wsgi_acquire_interpreter(const char *name)
                                                        name);
 
     if (!handle) {
+
+        INFO("create interpreter %s", name);
+
         handle = newInterpreterObject(name);
 
         if (!handle) {
@@ -5916,6 +5927,8 @@ static InterpreterObject *wsgi_acquire_interpreter(const char *name)
     else
         Py_INCREF(handle);
 
+    INFO("found interpreter %s", name);
+
     interp = handle->interp;
 
     /*
@@ -6339,6 +6352,8 @@ static int wsgi_execute_script(request_rec *r)
      * it is safe to start manipulating python objects.
      */
 
+    INFO("%s", "enter");
+
     interp = wsgi_acquire_interpreter(config->application_group);
 
     if (!interp) {
@@ -6543,6 +6558,7 @@ static int wsgi_execute_script(request_rec *r)
                 PyObject *method = NULL;
                 PyObject *args = NULL;
 
+                INFO("%s", "app running");
                 Py_INCREF(object);
                 status = Adapter_run(adapter, object);
                 Py_DECREF(object);
@@ -6693,6 +6709,8 @@ static void wsgi_python_child_init(apr_pool_t *p)
     int thread_id = 0;
     int *thread_handle = NULL;
 
+    INFO("%s", "init python further");
+
     /* Working with Python, so must acquire GIL. */
 
     state = PyGILState_Ensure();
@@ -6778,6 +6796,9 @@ static void wsgi_python_child_init(apr_pool_t *p)
     /* Loop through import scripts for this process and load them. */
 
     if (wsgi_import_list) {
+
+        INFO("%s", "dealing with wsgi_import_list");
+
         apr_array_header_t *scripts = NULL;
         WSGIScriptFile *entries;
@@ -8115,6 +8136,7 @@ static void wsgi_log_script_error(request_rec *r, const char *e, const char *n)
 
 static void wsgi_build_environment(request_rec *r)
 {
+    INFO("%s", "enter");
     WSGIRequestConfig *config = NULL;
     const char *value = NULL;
@@ -8862,6 +8884,7 @@ static int wsgi_hook_handler(request_rec *r)
     if (!r->handler)
         return DECLINED;
 
+    INFO("handler %s, file %s", r->handler, r->filename);
     /*
      * Construct request configuration and cache it in the
      * request object against this module so can access it later
@@ -9082,6 +9105,7 @@ static int wsgi_hook_handler(request_rec *r)
 
 #if AP_SERVER_MAJORVERSION_NUMBER < 2
 
+
 /*
  * Apache 1.3 module initialisation functions.
  */
@@ -12909,6 +12933,9 @@ static int wsgi_hook_daemon_handler(conn_rec *c)
 static int wsgi_hook_init(apr_pool_t *pconf, apr_pool_t *ptemp,
                           apr_pool_t *plog, server_rec *s)
 {
+
+    INFO("%s", "enter");
+
     void *data = NULL;
     const char *userdata_key = "wsgi_init";
     char package[128];
@@ -13028,6 +13055,8 @@ static void wsgi_hook_child_init(apr_pool_t *p, server_rec *s)
     }
 #endif
 
+    INFO("%s", "enter");
+
     if (wsgi_python_required) {
         /*
          * Initialise Python if required to be done in
@@ -13500,6 +13529,7 @@ static authn_status wsgi_check_password(request_rec *r, const char *user,
      * the last time it was accessed.
      */
 
+    /* FIXME: Reloading */
     if (module && config->script_reloading) {
         if (wsgi_reload_required(r->pool, r, script, module, NULL)) {
             /*
@@ -14804,6 +14834,9 @@ static int wsgi_hook_logio(apr_pool_t *pconf, apr_pool_t *ptemp,
 
 static void wsgi_register_hooks(apr_pool_t *p)
 {
+
+    INFO("%s", "enter");
+
     static const char * const p1[] = { "mod_alias.c", NULL };
     static const char * const n1[]= { "mod_userdir.c",
                                       "mod_vhost_alias.c", NULL };

Log output, corresponding to the Apache configuration given above:

[Fri Sep 30 14:22:20 2011] [error] [pid 21372] mod_wsgi.c:wsgi_hook_init:12937 enter
[Fri Sep 30 14:22:20 2011] [error] [pid 21372] mod_wsgi.c:wsgi_register_hooks:14838 enter
[Fri Sep 30 14:22:20 2011] [error] [pid 21373] mod_wsgi.c:wsgi_hook_init:12937 enter
[Fri Sep 30 14:22:20 2011] [notice] Apache/2.2.17 (Ubuntu) mod_wsgi/3.3 Python/2.7.1+ configured -- resuming normal operations
[Fri Sep 30 14:22:20 2011] [error] [pid 21377] mod_wsgi.c:wsgi_hook_child_init:13058 enter
[Fri Sep 30 14:22:20 2011] [error] [pid 21377] mod_wsgi.c:wsgi_python_init:5727 enter
[Fri Sep 30 14:22:20 2011] [error] [pid 21377] mod_wsgi.c:wsgi_python_init:5733 init python
[Fri Sep 30 14:22:20 2011] [error] [pid 21378] mod_wsgi.c:wsgi_hook_child_init:13058 enter
[Fri Sep 30 14:22:20 2011] [error] [pid 21378] mod_wsgi.c:wsgi_python_init:5727 enter
[Fri Sep 30 14:22:20 2011] [error] [pid 21378] mod_wsgi.c:wsgi_python_init:5733 init python
[Fri Sep 30 14:22:20 2011] [error] [pid 21377] mod_wsgi.c:wsgi_python_child_init:6712 init python further
[Fri Sep 30 14:22:20 2011] [error] [pid 21378] mod_wsgi.c:wsgi_python_child_init:6712 init python further

jaime@westeros:/var/www$ ps aux | grep apache2
jaime    20827  0.0  0.0   3928   508 pts/2    S+   14:17   0:00 tail -f /var/log/apache2/error.log
root     21373  0.0  0.1  10224  3036 ?        Ss   14:22   0:00 /usr/sbin/apache2 -k start
www-data 21377  0.0  0.3 234368  6752 ?        Sl   14:22   0:00 /usr/sbin/apache2 -k start
www-data 21378  0.0  0.3 234392  6500 ?        Sl   14:22   0:00 /usr/sbin/apache2 -k start
jaime    23119  0.0  0.0   4156   856 pts/3    S+   16:37   0:00 grep --color=auto apache2

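After Apache starts, wsgi_hook_init and wsgi_register_hooks run in the main process 21372, and wsgi_hook_init also runs in another process, 21373. Two child processes are created, 21377 and 21378. Each of them runs wsgi_hook_child_init, wsgi_python_init and wsgi_python_child_init, in that order. At this point Apache has finished starting and Python is initialised, but no interpreter has been created yet.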
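Reading the log by eye works for a dozen lines, but grouping the calls by pid makes the per-process order obvious. The helper below is a convenience sketch; `calls_by_pid` and `SAMPLE_LOG` are hypothetical names, and the regular expression only assumes the `[pid N] mod_wsgi.c:function:line message` layout produced by the INFO macro. The sample lines are copied from the startup log above.

```python
import re
from collections import OrderedDict

# Matches the part of each line written by the INFO macro.
LINE = re.compile(r"\[pid (\d+)\] mod_wsgi\.c:(\w+):(\d+) (.*)")

SAMPLE_LOG = """\
[Fri Sep 30 14:22:20 2011] [error] [pid 21372] mod_wsgi.c:wsgi_hook_init:12937 enter
[Fri Sep 30 14:22:20 2011] [error] [pid 21372] mod_wsgi.c:wsgi_register_hooks:14838 enter
[Fri Sep 30 14:22:20 2011] [error] [pid 21373] mod_wsgi.c:wsgi_hook_init:12937 enter
[Fri Sep 30 14:22:20 2011] [error] [pid 21377] mod_wsgi.c:wsgi_hook_child_init:13058 enter
[Fri Sep 30 14:22:20 2011] [error] [pid 21377] mod_wsgi.c:wsgi_python_init:5727 enter
[Fri Sep 30 14:22:20 2011] [error] [pid 21377] mod_wsgi.c:wsgi_python_init:5733 init python
[Fri Sep 30 14:22:20 2011] [error] [pid 21377] mod_wsgi.c:wsgi_python_child_init:6712 init python further
"""


def calls_by_pid(log_text):
    """Map each pid to the functions it logged, in order, without repeats in a row."""
    result = OrderedDict()
    for line in log_text.splitlines():
        match = LINE.search(line)
        if not match:
            continue  # ordinary Apache log lines are ignored
        pid, func = int(match.group(1)), match.group(2)
        funcs = result.setdefault(pid, [])
        if not funcs or funcs[-1] != func:
            funcs.append(func)
    return result


if __name__ == "__main__":
    for pid, funcs in calls_by_pid(SAMPLE_LOG).items():
        print(pid, " -> ".join(funcs))
```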

The first request is handled by process 21377, which creates the interpreter and loads hello.wsgi:

[Fri Sep 30 14:22:29 2011] [error] [pid 21377] mod_wsgi.c:wsgi_hook_handler:8887 handler wsgi-script, file /var/www/hello.wsgi
[Fri Sep 30 14:22:29 2011] [error] [pid 21377] mod_wsgi.c:wsgi_build_environment:8139 enter
[Fri Sep 30 14:22:29 2011] [error] [pid 21377] mod_wsgi.c:wsgi_execute_script:6355 enter
[Fri Sep 30 14:22:29 2011] [error] [pid 21377] mod_wsgi.c:wsgi_acquire_interpreter:5868 search interpreter 127.0.1.1|/hello
[Fri Sep 30 14:22:29 2011] [error] [pid 21377] mod_wsgi.c:wsgi_acquire_interpreter:5905 create interpreter 127.0.1.1|/hello
[Fri Sep 30 14:22:29 2011] [error] [pid 21377] mod_wsgi.c:wsgi_acquire_interpreter:5930 found interpreter 127.0.1.1|/hello
[Fri Sep 30 14:22:29 2011] [info] [client 127.0.0.1] mod_wsgi (pid=21377, process='', application='127.0.1.1|/hello'): Loading WSGI script '/var/www/hello.wsgi'.
[Fri Sep 30 14:22:29 2011] [error] [pid 21377] mod_wsgi.c:wsgi_execute_script:6561 app running
[Fri Sep 30 14:22:29 2011] [error] [pid 21377] mod_wsgi.c:wsgi_hook_handler:8887 handler image/x-icon, file /var/www/favicon.ico
[Fri Sep 30 14:22:29 2011] [error] [client 127.0.0.1] File does not exist: /var/www/favicon.ico

The second request needs no extra work: the existing interpreter is reused and the code is already loaded. Cool:

[Fri Sep 30 14:22:36 2011] [error] [pid 21377] mod_wsgi.c:wsgi_hook_handler:8887 handler wsgi-script, file /var/www/hello.wsgi
[Fri Sep 30 14:22:36 2011] [error] [pid 21377] mod_wsgi.c:wsgi_build_environment:8139 enter
[Fri Sep 30 14:22:36 2011] [error] [pid 21377] mod_wsgi.c:wsgi_execute_script:6355 enter
[Fri Sep 30 14:22:36 2011] [error] [pid 21377] mod_wsgi.c:wsgi_acquire_interpreter:5868 search interpreter 127.0.1.1|/hello
[Fri Sep 30 14:22:36 2011] [error] [pid 21377] mod_wsgi.c:wsgi_acquire_interpreter:5930 found interpreter 127.0.1.1|/hello
[Fri Sep 30 14:22:36 2011] [error] [pid 21377] mod_wsgi.c:wsgi_execute_script:6561 app running
[Fri Sep 30 14:22:36 2011] [error] [pid 21377] mod_wsgi.c:wsgi_hook_handler:8887 handler image/x-icon, file /var/www/favicon.ico
[Fri Sep 30 14:22:36 2011] [error] [client 127.0.0.1] File does not exist: /var/www/favicon.ico

Before the third request hello.wsgi was modified, so the code has to be reloaded:

[Fri Sep 30 14:22:47 2011] [error] [pid 21377] mod_wsgi.c:wsgi_hook_handler:8887 handler wsgi-script, file /var/www/hello.wsgi
[Fri Sep 30 14:22:47 2011] [error] [pid 21377] mod_wsgi.c:wsgi_build_environment:8139 enter
[Fri Sep 30 14:22:47 2011] [error] [pid 21377] mod_wsgi.c:wsgi_execute_script:6355 enter
[Fri Sep 30 14:22:47 2011] [error] [pid 21377] mod_wsgi.c:wsgi_acquire_interpreter:5868 search interpreter 127.0.1.1|/hello
[Fri Sep 30 14:22:47 2011] [error] [pid 21377] mod_wsgi.c:wsgi_acquire_interpreter:5930 found interpreter 127.0.1.1|/hello
[Fri Sep 30 14:22:47 2011] [info] [client 127.0.0.1] mod_wsgi (pid=21377, process='', application='127.0.1.1|/hello'): Reloading WSGI script '/var/www/hello.wsgi'.
[Fri Sep 30 14:22:47 2011] [error] [pid 21377] mod_wsgi.c:wsgi_execute_script:6561 app running
[Fri Sep 30 14:22:47 2011] [error] [pid 21377] mod_wsgi.c:wsgi_hook_handler:8887 handler image/x-icon, file /var/www/favicon.ico
[Fri Sep 30 14:22:47 2011] [error] [client 127.0.0.1] File does not exist: /var/www/favicon.ico

Although the first three requests were all handled by 21377, 21378 was indeed observed handling a request as well:

[Fri Sep 30 14:41:37 2011] [error] [pid 21378] mod_wsgi.c:wsgi_hook_handler:8887 handler wsgi-script, file /var/www/hello.wsgi
[Fri Sep 30 14:41:37 2011] [error] [pid 21378] mod_wsgi.c:wsgi_build_environment:8139 enter
[Fri Sep 30 14:41:37 2011] [error] [pid 21378] mod_wsgi.c:wsgi_execute_script:6355 enter
[Fri Sep 30 14:41:37 2011] [error] [pid 21378] mod_wsgi.c:wsgi_acquire_interpreter:5868 search interpreter 127.0.1.1|/hello
[Fri Sep 30 14:41:37 2011] [error] [pid 21378] mod_wsgi.c:wsgi_acquire_interpreter:5905 create interpreter 127.0.1.1|/hello
[Fri Sep 30 14:41:37 2011] [error] [pid 21378] mod_wsgi.c:wsgi_acquire_interpreter:5930 found interpreter 127.0.1.1|/hello
[Fri Sep 30 14:41:37 2011] [info] [client 127.0.0.1] mod_wsgi (pid=21378, process='', application='127.0.1.1|/hello'): Loading WSGI script '/var/www/hello.wsgi'.
[Fri Sep 30 14:41:37 2011] [error] [pid 21378] mod_wsgi.c:wsgi_execute_script:6561 app running
[Fri Sep 30 14:41:37 2011] [error] [pid 21378] mod_wsgi.c:wsgi_hook_handler:8887 handler image/x-icon, file /var/www/favicon.ico
[Fri Sep 30 14:41:37 2011] [error] [client 127.0.0.1] File does not exist: /var/www/favicon.ico

Notes:

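  • Python C API code and Apache C code are mixed together here, but that really just means operating on variables that belong to different libraries; it is all C code. Once libpython and libapache are linked into the process, each has its own variables in the global namespace holding its state, and the rest of the code operates on those variables. This partly explains why mod_python and mod_wsgi conflict: both link against the same library, libpython, and without careful coordination things go wrong very easily. http://code.google.com/p/modwsgi/wiki/InstallationIssues#Incompatible_ModPython_Versions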

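Notes on daemon mode

wsgi_daemon_index holds a mapping from process_group to socket. Given the name of a process group, it yields the socket that the group's processes are listening on. This socket is the key to communicating with the daemon. It is created before the fork, so every child process can access it afterwards; a daemon has to close every socket fd that does not belong to its own process group.

wsgi_daemon_lists is the list of all daemon processes that have been started.

When Apache starts, wsgi_hook_init calls wsgi_start_daemons, which creates all the daemons. After that the number of daemons is fixed.

After wsgi_hook_init returns in pid 7838, Apache forks another child process, pid 7843, without root privileges, which calls wsgi_hook_child_init. This process dispatches all requests: it calls wsgi_hook_handler for each one and, in wsgi_execute_remote, talks to the actual daemon process over a socket. This Apache child can be called mod_wsgi's dispatcher. pid 7842 is a daemon process.

Whether in embedded mode or daemon mode, execution always ends up in wsgi_execute_script.

Request headers and the standard CGI variables are passed to the daemon process through r->subprocess_env; see wsgi_build_environment and wsgi_send_request. Note that the object r crosses a process boundary on its way from the dispatcher to the daemon, so it is no longer the original r.

If a daemon process finds that the code needs reloading, it sends a 0 Rejected message to the dispatcher and then kills itself. Apache catches the signal that the daemon child has died and starts a new daemon process, which listens on the same socket.

If the daemon finds that all is well and no reload is needed (always the case for a new daemon), it sends a 0 Continue message to the dispatcher, telling it to go on.

If the dispatcher receives 0 Rejected, it tries to reconnect until it receives 0 Continue or exceeds the retry limit. In effect, 0 Continue works as a synchronisation mechanism.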
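The handshake between dispatcher and daemon can be sketched as a small retry loop. The two status strings come from the notes above; `send_to_daemon`, the `connect` callback and the retry limit of 3 are assumptions made for this illustration, since the real limit is not given here.

```python
def send_to_daemon(connect, max_retries=3):
    """Reconnect until the daemon answers '0 Continue'; return the attempt count."""
    for attempt in range(1, max_retries + 1):
        # connect() stands for "open the socket, send the request, read the status line".
        reply = connect()
        if reply == "0 Continue":
            return attempt
        if reply != "0 Rejected":
            raise RuntimeError("unexpected reply: %r" % (reply,))
        # "0 Rejected": the daemon is restarting itself to reload code, so try again.
    raise RuntimeError("daemon still rejecting after %d attempts" % max_retries)
```

A freshly restarted daemon always answers 0 Continue, so in practice one retry is usually enough after a reload.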

[Sun Oct 30 13:00:17 2011] [error] [pid 7837] mod_wsgi.c:wsgi_hook_init:13658 enter
[Sun Oct 30 13:00:17 2011] [error] [pid 7837] mod_wsgi.c:wsgi_register_hooks:15564 enter
[Sun Oct 30 13:00:17 2011] [error] [pid 7838] mod_wsgi.c:wsgi_hook_init:13658 enter
[Sun Oct 30 13:00:17 2011] [error] [pid 7838] mod_wsgi.c:wsgi_python_init:5817 enter
[Sun Oct 30 13:00:17 2011] [error] [pid 7838] mod_wsgi.c:wsgi_python_init:5823 init python
[Sun Oct 30 13:00:17 2011] [info] mod_wsgi (pid=7838): Python home /usr/local/sae/python.
[Sun Oct 30 13:00:17 2011] [info] mod_wsgi (pid=7838): Initializing Python.
[Sun Oct 30 13:00:17 2011] [error] [pid 7838] mod_wsgi.c:wsgi_start_daemons:11955 enter
[Sun Oct 30 13:00:17 2011] [error] [pid 7838] mod_wsgi.c:wsgi_start_process:11540 enter
[Sun Oct 30 13:00:17 2011] [error] [pid 7838] mod_wsgi.c:wsgi_start_process:11944 ok, we're father
[Sun Oct 30 13:00:17 2011] [error] [pid 7838] mod_wsgi.c:wsgi_hook_init:13754 forking a new process to listen all connections, will call wsgi_hook_child_init
[Sun Oct 30 13:00:17 2011] [warn] pid file /var/run/apache2.pid overwritten -- Unclean shutdown of previous Apache run?
[Sun Oct 30 13:00:17 2011] [notice] Apache/2.2.17 (Ubuntu) mod_wsgi/3.3 Python/2.6.7 configured -- resuming normal operations
[Sun Oct 30 13:00:17 2011] [info] Server built: Sep  1 2011 09:25:26
[Sun Oct 30 13:00:17 2011] [error] [pid 7843] mod_wsgi.c:wsgi_hook_child_init:13784 enter
[Sun Oct 30 13:00:17 2011] [error] [pid 7843] mod_wsgi.c:wsgi_python_child_init:6883 init python further
[Sun Oct 30 13:00:17 2011] [info] mod_wsgi (pid=7843): Attach interpreter ''.
[Sun Oct 30 13:00:17 2011] [error] [pid 7842] mod_wsgi.c:wsgi_start_process:11558 ok in child, we're a new daemon process
[Sun Oct 30 13:00:17 2011] [info] mod_wsgi (pid=7842): Starting process 'wic' with uid=1000, gid=1000 and threads=1.
[Sun Oct 30 13:00:17 2011] [error] [pid 7842] mod_wsgi.c:wsgi_python_child_init:6883 init python further
[Sun Oct 30 13:00:17 2011] [info] mod_wsgi (pid=7842): Attach interpreter ''.
[Sun Oct 30 13:00:17 2011] [error] [pid 7842] mod_wsgi.c:wsgi_daemon_main:11276 enter
[Sun Oct 30 13:00:17 2011] [error] [pid 7842] mod_wsgi.c:wsgi_daemon_main:11428 creating thread 0
[Sun Oct 30 13:00:17 2011] [error] [pid 7842] mod_wsgi.c:wsgi_daemon_thread:11119 enter
[Sun Oct 30 13:00:17 2011] [error] [pid 7842] mod_wsgi.c:wsgi_daemon_worker:10887 enter
[Sun Oct 30 13:00:17 2011] [error] [pid 7842] mod_wsgi.c:wsgi_monitor_thread:11181 enter
[Sun Oct 30 13:00:17 2011] [error] [pid 7842] mod_wsgi.c:wsgi_monitor_thread:11203 check worker status
