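Processing a real-time log on Linux to generate another... - Removing the project name and port number from Tomcat URLs - nginx configuration data structures and merge process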

[1] Processing a real-time log on Linux to generate another real-time log

Source: Internet. Published: 2013-10-24

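I. Background

1. Key points

This post covers the following points:

using curl to fetch an HTTP response;

running a PHP file from a shell script;

running shell commands from PHP (via the exec function);

implementing tail -f in PHP;

passing an argument that contains spaces (wrap it in double quotes).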

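2. Business flow

The task is to read the real-time log "/data3/im-log/nginx.im.imp.current/nginx.im.imp.current_current" and generate from it the real-time log needed for a job fair.

The flow is as follows:
(1) Fetch the mapping between companies and IM client ids from http://bj.baidu.com/jobfairs/jobfairs_im_port.php?action=getIms.
The response has this format:
{"status":1,"ret":{"company_id":{"im_account":[im_id],"name":[company_name]}},"msg":""}
Sample data: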
{"status":1,"ret":{"2028107":{"im_account":["31669394","50000098"],"name":["baidu"]},"2028098":{"im_account":["50029298","50000098","31669376","31669394","50006271"],"name":["sogou"]}},"msg":""}

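The first problem I ran into: my development environment is not on the same network segment as http://bj.baidu.com. The service behind that URL runs on 10.3.255.201, so I need a hosts mapping. With it, a request to http://bj.baidu.com/jobfairs/jobfairs_im_port.php?action=getIms is effectively a request to http://10.3.255.201/jobfairs/jobfairs_im_port.php?action=getIms.
The obvious question is why not request http://10.3.255.201/jobfairs/jobfairs_im_port.php?action=getIms directly. The answer is that the service derives the user's city from the URL: the host name bj.baidu.com contains the city code bj.

The solution is to have curl request the IP address while sending the original host name in the Host header: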

curl -H "Host: bj.baidu.com" "http://10.3.255.201/jobfairs/jobfairs_im_port.php?action=getIms"

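Reference: the curl command usage described in http://blog.csdn.net/lianxiang_biancheng/article/details/7575370.

(2) Next, if the fromUserId or toUserId of a log record is one of a company's IM client ids, the message belongs to that company.

(3) Finally, write the log in the required format, with these fields:
time, company id, company name, company IM id, applicant IM id, sender (0: company, 1: applicant), message content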

II. Three implementations

1. First approach: the shell reads each record and passes it to PHP for matching and output

(1) start.sh is the launcher:

#!/bin/sh

# Kill any previous instances of this process before starting
pids=`ps aux | grep jobfairs | grep -v "grep" | awk '{print $2}'`      
if [ "$pids" != "" ];then 
    echo $pids
    kill -9 $pids
fi
sh jobfairs.sh >> /home/baidu/log/jobfairs.log

(2) jobfairs.sh fetches the HTTP content, follows the real-time log, and re-issues the request every 2 minutes:

#!/bin/sh

logfile="/data3/im-log/nginx.im.imp.current/nginx.im.imp.current_current"

hours=`date +%H`
start_time=`date +%s`

# Stop the program after 17:00
while [ $hours -lt 17 ]
do
    res=`curl -s -H "Host: bj.baidu.com" "http://10.3.255.201/jobfairs/jobfairs_im_port.php?action=getIms"`
    #echo $res
    
    len=${#res}
    if [ $len = 0 ]; then
        echo "Failed! Request error!"
        exit
    fi

    status=`echo $res | sed -e 's/.*status"://' -e 's/,.*//'`
    if [ $status != 1 ]; then 
        echo "Failed! Request status: $status"
        exit
    fi

    ret=`echo $res | sed -e 's/.*ret"://' -e 's/,"msg.*//'`
    #ret='{"2028097":{"im_account":["2875001357","197823104","3032631861","197305863"],"name":["8\u811a\u732b\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8\uff08\u4e60\u5927\u7237\u6dae\u8089\u5bf9\u976210000\u7c73\u7684\u79d1\u6280\u516c\u53f8\uff09"]},"2028098":{"im_account":["3658247660","192683241","197488883","108963206","197305001"],"name":["9\u811a\u732b\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8"]}}';

    tail -f $logfile | grep sendMsgOk | grep "spamReasons=\[\]" | awk -F"\t" '{
        printf("%s\t%s\t%s\t%s\n",$1,$3,$4,$11); 
    }' | while read line
    do
        /usr/local/webserver/php/bin/php jobfairs.php "$ret" "$line"
        
        # After 120s, stop generating the log and re-issue the HTTP request to refresh the company data
        end_time=`date +%s`
        if [ $(expr $end_time - $start_time) -ge 120 ]; then
            #echo `date +%T`" "`date +%D`
            #echo "120s is done!"
            break
        fi
    done
    
    start_time=`date +%s`
    hours=`date +%H`
done
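
The two sed calls in the script are easy to misread, so here is a minimal sketch that runs them against a sample response taken from the data above, without touching the network. It only illustrates how the substitutions cut the string. This kind of text slicing depends on the exact shape of the response and is not a general JSON parser.

```shell
#!/bin/sh
# Sample response copied from the article; no network access is needed.
res='{"status":1,"ret":{"2028107":{"im_account":["31669394","50000098"],"name":["baidu"]}},"msg":""}'

# Drop everything up to and including `status":`, then cut at the first comma.
status=$(printf '%s\n' "$res" | sed -e 's/.*status"://' -e 's/,.*//')

# Drop everything up to and including `ret":`, then drop the trailing `,"msg...` part.
ret=$(printf '%s\n' "$res" | sed -e 's/.*ret"://' -e 's/,"msg.*//')

echo "status=$status"
echo "ret=$ret"
```

If a company name ever contained the text `,"msg`, the second substitution would cut the value short, which is one reason to treat this as a quick hack rather than robust parsing.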

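One more point worth covering here is how to pass a string containing spaces as a single argument.
The scenario: the fields of a record are tab-separated, and one field, msgContent, holds the message text, which often contains spaces. The shell splits an unquoted $line on whitespace before PHP ever sees it, so msgContent would arrive as several separate arguments. The fix is to add double quotes (change $line to "$line") so that the whole record is passed as one string; PHP then splits it into fields with explode("\t", $line):
/usr/local/webserver/php/bin/php jobfairs.php "$ret" "$line"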
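
The effect of the quotes can be checked in isolation. The sketch below uses a hypothetical helper, count_args, that prints how many arguments it received, and a made-up record in the same tab-separated layout.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

# A made-up tab-separated record whose last field contains spaces.
line=$(printf 'time\tfromUserId=1\ttoUserId=2\tmsgContent=hello big world')

unquoted=$(count_args $line)    # the shell splits on spaces and tabs
quoted=$(count_args "$line")    # the record is passed as one argument

echo "unquoted: $unquoted"
echo "quoted: $quoted"
```

Unquoted, the record is split at every tab and every space; quoted, it reaches the program as a single argument, which is what the explode("\t", ...) call in PHP expects.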

(3) jobfairs.php matches each line of the real-time log and writes it out in the IM log format:

<?php
    $ret = $_SERVER["argv"][1];
    $arr = json_decode($ret, true); // decode the JSON string into an array
    foreach ($arr as $key => $value) {
        $name = $value["name"][0]; // company name
        foreach ($value["im_account"] as $v) { // Dingdong (IM) ids belonging to the company
            $userId[$v] = $key;
            $compName[$v] = $name;
            //echo $key ."\t" . $v ."\t" . $name ."\n";
        }
    }
  
    $line = $_SERVER["argv"][2]; // one log record
    $logArr = explode("\t", $line);
    //echo $line . "\n";

    // extract the fields
    $time = $logArr[0];
    $fromUserId = $logArr[1];
    $toUserId = $logArr[2];
    $msgContent = $logArr[3];
    
    $fuiArr = explode('=', $fromUserId);
    $tuiArr = explode('=', $toUserId);
    $fui = $fuiArr[1];
    $tui = $tuiArr[1];

    $output = $time . "\t";
    if(isset($userId[$fui])) { // fromUserId is a company's Dingdong id
        //echo $line . "\n";
        $output .= "companyId=$userId[$fui]\t";
        $output .= "companyName=$compName[$fui]\t";
        $output .= "companyDingdongId=$fui\t";
        $output .= "personalDingdongId=$tui\t";
        $output .= "whoSend=0\t";
        $output .= $msgContent;
        echo $output . "\n";
    } else if(isset($userId[$tui])) { // toUserId is a company's Dingdong id
        //echo $line . "\n";
        $output .= "companyId=$userId[$tui]\t";
        $output .= "companyName=$compName[$tui]\t";
        $output .= "companyDingdongId=$tui\t";
        $output .= "personalDingdongId=$fui\t";
        $output .= "whoSend=1\t";
        $output .= $msgContent;
        echo $output . "\n";
    }
?>
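
For comparison, the same matching rule can be sketched in awk instead of PHP. This is not the author's method. It assumes the company data has already been flattened into a hypothetical tab-separated table (im_id, company_id, company_name), and match_line is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical flattened mapping, one row per IM id:
#   im_id <TAB> company_id <TAB> company_name
MAP=$(printf '31669394\t2028107\tbaidu\n50029298\t2028098\tsogou')
export MAP

# match_line (hypothetical helper) reads log records on stdin and prints the
# reformatted record when either user id belongs to a company.
match_line() {
    awk -F'\t' '
    BEGIN {
        # Build im_id -> company id / company name lookup tables.
        n = split(ENVIRON["MAP"], rows, "\n")
        for (i = 1; i <= n; i++) {
            split(rows[i], f, "\t")
            company[f[1]] = f[2]
            name[f[1]] = f[3]
        }
    }
    {
        split($2, from, "=")    # fromUserId=<id>
        split($3, to, "=")      # toUserId=<id>
        if (from[2] in company) {
            id = from[2]; other = to[2]; who = 0    # sent by the company
        } else if (to[2] in company) {
            id = to[2]; other = from[2]; who = 1    # sent by the applicant
        } else {
            next                                    # not a job-fair message
        }
        printf "%s\tcompanyId=%s\tcompanyName=%s\tcompanyDingdongId=%s\tpersonalDingdongId=%s\twhoSend=%s\t%s\n", $1, company[id], name[id], id, other, who, $4
    }'
}

# Usage: a made-up record sent by an applicant to a company IM id.
printf '10:00:01\tfromUserId=777\ttoUserId=50029298\tmsgContent=hello world\n' | match_line
```

Because awk reads the whole stream itself, this variant would not need to start one PHP process per log line, at the cost of needing the mapping in a flat form first.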

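2. Second approach: PHP runs the shell command and matches its output

Note: this approach cannot produce a real-time log. tail -f follows the file and never exits, while exec only returns after the command has finished, so PHP never receives the output. This approach is therefore only suitable for reading and processing a fixed block of text.

Here exec runs the shell command tail -n 1000 to get the last 1000 lines, which are then processed. The script also uses PHP's curl extension to fetch the HTTP response. The file is named jobfairs2.php.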

<?php
    //error_reporting(E_ALL & ~E_NOTICE);

    $host = array("Host: bj.baidu.com");
    $data = 'user=xxx&qq=xxx&id=xxx&post=xxx';
    $url = 'http://10.3.255.201/jobfairs/jobfairs_im_port.php?action=getIms';
    $res = curl_post($host, $data, $url);
   
    $arr = json_decode($res, true);
    $status = $arr["status"];

    if ($status != 1) {
        echo "Request Failed!";
        exit;
    }
    
    // company information returned by the service
    $ret = $arr["ret"];
    foreach ($ret as $key => $value) {
        $name = $value["name"][0];
        // map each IM id to its company id
        foreach ($value["im_account"] as $v) {
            $userId[$v] = $key;
            $compName[$v] = $name;
        }
    }
    
    $logfile = "/data3/im-log/nginx.im.imp.current/nginx.im.imp.current_current";
    
    // tail -n 1000 fetches the last 1000 records; exec stores the output lines in $log
    $shell = "tail -n 1000 $logfile | grep sendMsgOk | grep 'spamReasons=\[\]' | ";
    $shell .= "awk -F'\t' '{print $1,$3,$4,$11;}'";
    exec($shell, $log);
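
The bounded pipeline that exec runs can be tried directly in the shell. The sketch below runs it against a temporary file holding two made-up records. The real log layout is not shown in the article, so the field positions of sendMsgOk and spamReasons here are assumptions. It also uses grep -F for the literal "spamReasons=[]" match and sets OFS so the output stays tab-separated, both of which differ from the command in the PHP script.

```shell
#!/bin/sh
logfile=$(mktemp)

# Two made-up records with 11 tab-separated fields; only the first has an
# empty spamReasons list, so only it should survive both filters.
printf 't1\tsendMsgOk\tfromUserId=1\ttoUserId=2\tf5\tf6\tf7\tf8\tf9\tspamReasons=[]\tmsgContent=hi\n' > "$logfile"
printf 't2\tsendMsgOk\tfromUserId=3\ttoUserId=4\tf5\tf6\tf7\tf8\tf9\tspamReasons=[ad]\tmsgContent=buy\n' >> "$logfile"

# tail -n exits once it has printed the last lines, so the pipeline terminates.
out=$(tail -n 1000 "$logfile" | grep sendMsgOk | grep -F 'spamReasons=[]' |
      awk -F'\t' -v OFS='\t' '{print $1, $3, $4, $11}')

echo "$out"
rm -f "$logfile"
```

Without OFS, awk joins the printed fields with single spaces, so a later split on tabs would no longer find separate fields.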

[2] Removing the project name and port number from Tomcat URLs

Source: Internet. Published: 2013-10-24

1. Removing the port number
Set the port to 80:
<Connector port="80" protocol="HTTP/1.1"
connectionTimeout="20000"
redirectPort="8443" URIEncoding="GBK"/>
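2. Removing the project name
Projects are deployed under the webapps directory by default. Move the project out into a directory alongside webapps and configure it as follows: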

<Host name="localhost" appBase="webapps"
unpackWARs="true" autoDeploy="true"
xmlValidation="false" xmlNamespaceAware="false">

<Context path="" docBase="C:/WebApp/WebRoot/Blog" debug="0"/>
</Host>
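For example, if the deployed project is named "Blog", it is now reachable at http://ip/, and the Tomcat manager page at http://ip/manager/html.

If path is given a value, for example path="/ABC", the access path becomes http://domain/ABC.

Note: to hide the IP and access the site directly by domain name,
find the hosts file under C:\Windows\System32\drivers\etc and configure it: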

# For example:
#       IP address             alias
#      102.54.94.97     rhino.acme.com          # source server
#       38.25.63.10     x.acme.com              # x client host

# localhost name resolution is handled within DNS itself.
# 127.0.0.1 localhost
# ::1 localhost
#127.0.0.1 activate.adobe.com

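Once configured, the project can be reached through the alias, e.g. http://rhino.acme.com/. Note that the sample entries above are commented out; an entry only takes effect without the leading #.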

Author: cenfei78325747, posted 2013-5-22 17:58:06


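[3] nginx configuration data structures and merge process

Source: Internet. Published: 2013-10-24

Creating the configuration data structures

Initializing cycle->conf_ctx

In ngx_init_cycle(), an array of pointers is created, sized by the number of nginx modules, ngx_max_module, as shown below. Each pointer corresponds to one module's configuration data structure.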

cycle->conf_ctx = ngx_pcalloc(pool, ngx_max_module * sizeof(void *));

Creating the configuration data structures of NGX_CORE_MODULE modules

    for (i = 0; ngx_modules[i]; i++) {
        if (ngx_modules[i]->type != NGX_CORE_MODULE) {
            continue;
        }

        module = ngx_modules[i]->ctx;

        if (module->create_conf) {
            rv = module->create_conf(cycle);
            if (rv == NULL) {
                ngx_destroy_pool(pool);
                return NULL;
            }
            cycle->conf_ctx[ngx_modules[i]->index] = rv;
        }
    }

The NGX_CORE_MODULE modules include ngx_core_module and ngx_http_module.

static ngx_core_module_t  ngx_core_module_ctx = {
    ngx_string("core"),
    ngx_core_module_create_conf,
    ngx_core_module_init_conf
};
ngx_module_t  ngx_core_module = {
    NGX_MODULE_V1,
    &ngx_core_module_ctx,                  /* module context */
    ngx_core_commands,                     /* module directives */
    NGX_CORE_MODULE,                       /* module type */
    NULL,                                  /* init master */
    NULL,                                  /* init module */
    NULL,                                  /* init process */
    NULL,                                  /* init thread */
    NULL,                                  /* exit thread */
    NULL,                                  /* exit process */
    NULL,                                  /* exit master */
    NGX_MODULE_V1_PADDING
};

ngx_core_module calls ngx_core_module_create_conf, which returns a pointer to an ngx_core_conf_t structure whose members include daemon, master, worker_processes, and so on.

static ngx_core_module_t  ngx_http_module_ctx = {
    ngx_string("http"),
    NULL,
    NULL
};
ngx_module_t  ngx_http_module = {
    NGX_MODULE_V1,
    &ngx_http_module_ctx,                  /* module context */
    ngx_http_commands,                     /* module directives */
    NGX_CORE_MODULE,                       /* module type */
    NULL,                                  /* init master */
    NULL,                                  /* init module */
    NULL,                                  /* init process */
    NULL,                                  /* init thread */
    NULL,                                  /* exit thread */
    NULL,                                  /* exit process */
    NULL,                                  /* exit master */
    NGX_MODULE_V1_PADDING
};

The ngx_http_module module has no create_conf callback.

Parsing the http directive creates the configuration data of the NGX_HTTP_MODULE modules

    ngx_http_conf_ctx_t         *ctx;

    ctx = ngx_pcalloc(cf->pool, sizeof(ngx_http_conf_ctx_t));
    *(ngx_http_conf_ctx_t **) conf = ctx;  /* conf points to the configuration data structure of ngx_http_module */

    ctx->main_conf = ngx_pcalloc(cf->pool, sizeof(void *) * ngx_http_max_module);

    ctx->srv_conf = ngx_pcalloc(cf->pool, sizeof(void *) * ngx_http_max_module);

    ctx->loc_conf = ngx_pcalloc(cf->pool, sizeof(void *) * ngx_http_max_module);

    for (m = 0; ngx_modules[m]; m++) {
        if (ngx_modules[m]->type != NGX_HTTP_MODULE) {
            continue;
        }

        module = ngx_modules[m]->ctx;
        mi = ngx_modules[m]->ctx_index;

        /* Call create_xxx_conf() to build each configuration structure and store it in the matching pointer array. */
        if (module->create_main_conf) {
            ctx->main_conf[mi] = module->create_main_conf(cf);
            if (ctx->main_conf[mi] == NULL) {
                return NGX_CONF_ERROR;
            }
        }

        if (module->create_srv_conf) {
            ctx->srv_conf[mi] = module->create_srv_conf(cf);
            if (ctx->srv_conf[mi] == NULL) {
                return NGX_CONF_ERROR;
            }
        }

        if (module->create_loc_conf) {
            ctx->loc_conf[mi] = module->create_loc_conf(cf);
            if (ctx->loc_conf[mi] == NULL) {
                return NGX_CONF_ERROR;
            }
        }
    }

The ngx_http_core_module module

static ngx_http_module_t  ngx_http_core_module_ctx = {
    ngx_http_core_preconfiguration,        /* preconfiguration */
    NULL,                                  /* postconfiguration */

    ngx_http_core_create_main_conf,        /* create main configuration */
    ngx_http_core_init_main_conf,          /* init main configuration */

    ngx_http_core_create_srv_conf,         /* create server configuration */
    ngx_http_core_merge_srv_conf,          /* merge server configuration */

    ngx_http_core_create_loc_conf,         /* create location configuration */
    ngx_http_core_merge_loc_conf           /* merge location configuration */
};

ngx_module_t  ngx_http_core_module = {
    NGX_MODULE_V1,
    &ngx_http_core_module_ctx,             /* module context */
    ngx_http_core_commands,                /* module directives */
    NGX_HTTP_MODULE,                       /* module type */
    NULL,                                  /* init master */
    NULL,                                  /* init module */
    NULL,                                  /* init process */
    NULL,                                  /* init thread */
    NULL,                                  /* exit thread */
    NULL,                                  /* exit process */
    NULL,                                  /* exit master */
    NGX_MODULE_V1_PADDING
};

This is an NGX_HTTP_MODULE module, so ngx_http_core_create_main_conf(), ngx_http_core_create_srv_conf(), and ngx_http_core_create_loc_conf() create three configuration structures, cmcf, cscf, and clcf, which are stored in the ctx->main_conf, ctx->srv_conf, and ctx->loc_conf pointer arrays respectively.

The server directive

The server directive is parsed by ngx_http_core_server() in ngx_http_core_module.c.

    ctx = ngx_pcalloc(cf->pool, sizeof(ngx_http_conf_ctx_t));

    http_ctx = cf->ctx;   /* cf->ctx is the ctx created for the http directive */
    ctx->main_conf = http_ctx->main_conf;

    ctx->srv_conf = ngx_pcalloc(cf->pool, sizeof(void *) * ngx_http_max_module);

    ctx->loc_conf = ngx_pcalloc(cf->pool, sizeof(void *) * ngx_http_max_module);

    for (i = 0; ngx_modules[i]; i++) {
        if (ngx_modules[i]->type != NGX_HTTP_MODULE) {
            continue;
        }

        module = ngx_modules[i]->ctx;

        if (module->create_srv_conf) {
            mconf = module->create_srv_conf(cf);
            if (mconf == NULL) {
                return NGX_CONF_ERROR;
            }

            ctx->srv_conf[ngx_modules[i]->ctx_index] = mconf;
        }

        if (module->create_loc_conf) {
            mconf = module->create_loc_conf(cf);
            if (mconf == NULL) {
                return NGX_CONF_ERROR;
            }

            ctx->loc_conf[ngx_modules[i]->ctx_index] = mconf;
        }
    }
    /* Add this server directive's cscf to the servers array of cmcf. */

    /* cscf->ctx holds srv_conf and loc_conf, which point to the server-level data */
    cscf = ctx->srv_conf[ngx_http_core_module.ctx_index];
    cscf->ctx = ctx;

    cmcf = ctx->main_conf[ngx_http_core_module.ctx_index];

    cscfp = ngx_array_push(&cmcf->servers);
    if (cscfp == NULL) {
        return NGX_CONF_ERROR;
    }

    *cscfp = cscf;

The location directive

The location directive is parsed by ngx_http_core_location() in ngx_http_core_module.c.

    ctx = ngx_pcalloc(cf->pool, sizeof(ngx_http_conf_ctx_t));

    pctx = cf->ctx;   /* cf->ctx here is cscf->ctx */
    ctx->main_conf = pctx->main_conf;
    ctx->srv_conf = pctx->srv_conf;

    ctx->loc_conf = ngx_pcalloc(cf->pool, sizeof(void *) * ngx_http_max_module);

    for (i = 0; ngx_modules[i]; i++) {
        if (ngx_modules[i]->type != NGX_HTTP_MODULE) {
            continue;
        }

        module = ngx_modules[i]->ctx;

        if (module->create_loc_conf) {
            ctx->loc_conf[ngx_modules[i]->ctx_index] =
                                                   module->create_loc_conf(cf);
            if (ctx->loc_conf[ngx_modules[i]->ctx_index] == NULL) {
                 return NGX_CONF_ERROR;
            }
        }
    }