What is Flume
- A distributed framework for collecting and aggregating streaming event data
- Typically used for log data
- Offers clear advantages over ad-hoc solutions:
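- Reliable, scalable, manageable, customizable, and high-performance
- Declarative configuration that can be updated dynamically
- Contextual routing
- Load balancing and failover support
- Feature-rich
- Fully extensible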
- Event
- Client
- Agent
- Sources、Channels、Sinks
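- Other components: Interceptors, Channel Selectors, Sink Processors
An Event is the basic unit of data transfer in Flume. Flume moves data from its origin to its final destination in the form of events. An Event consists of optional headers and a byte array that carries the payload.
- The payload is opaque to Flume
- Headers are an unordered collection of key-value string pairs; keys are unique within the collection.
- Headers can be used for contextual routing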
public interface Event {
    public Map<String, String> getHeaders();
    public void setHeaders(Map<String, String> headers);
    public byte[] getBody();
    public void setBody(byte[] body);
}
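To make the headers-plus-body shape concrete, here is a minimal sketch of an implementation. The interface is redeclared locally so the example compiles on its own; `BasicEvent`, `fromLogLine`, and the `host` header name are hypothetical names chosen for illustration, not part of Flume.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Local copy of the interface shown above, so the sketch compiles on its own.
interface Event {
    Map<String, String> getHeaders();
    void setHeaders(Map<String, String> headers);
    byte[] getBody();
    void setBody(byte[] body);
}

// Hypothetical minimal implementation (the class name is an assumption).
class BasicEvent implements Event {
    private Map<String, String> headers = new HashMap<>();
    private byte[] body = new byte[0];

    public Map<String, String> getHeaders() { return headers; }
    public void setHeaders(Map<String, String> headers) { this.headers = headers; }
    public byte[] getBody() { return body; }
    public void setBody(byte[] body) { this.body = body; }

    // Wraps one log line as the opaque payload and tags it with a "host" header.
    static BasicEvent fromLogLine(String line, String host) {
        BasicEvent e = new BasicEvent();
        e.setBody(line.getBytes(StandardCharsets.UTF_8));
        e.getHeaders().put("host", host);
        return e;
    }
}
```

The body stays an uninterpreted byte array, while the headers carry the metadata that routing decisions can use.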
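Core concept: Client
A Client is an entity that wraps raw logs into events and sends them to one or more agents.
- For example:
- Flume log4j Appender
- A custom Client can be built with the Client SDK (org.apache.flume.api)
- Its purpose is to decouple Flume from the systems where the data originates
- It is not required in a Flume topology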
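Core concept: Agent
An Agent contains Sources, Channels, Sinks, and other components, and uses them to move events from one node to another or to the final destination.
- The agent is the basic building block of a Flume flow.
- Flume provides configuration, lifecycle management, and monitoring support for these components.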
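Core concept: Source
A Source receives events, or produces them through a special mechanism, and puts them in batches into one or more Channels. There are two kinds of Source: event-driven and polling.
- Different types of Sources:
- Sources that integrate with well-known systems: Syslog, Netcat
- Sources that generate events automatically: Exec, SEQ
- IPC Sources for agent-to-agent communication: Avro
- A Source must be associated with at least one channel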
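A Channel sits between a Source and a Sink and buffers incoming events. Once the Sink has successfully delivered events to the next-hop channel or the final destination, they are removed from the Channel.
- Different Channels provide different levels of persistence:
- Memory Channel: volatile
- File Channel: backed by a WAL (write-ahead log)
- JDBC Channel: backed by an embedded database
- Channels are transactional
- They provide weak ordering guarantees
- They can work with any number of Sources and Sinks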
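The rule that events leave the Channel only after successful delivery can be sketched as a take/commit/rollback cycle. This is a simplified in-memory illustration under assumed names (`SketchChannel`, `takeBatch`), with events reduced to strings; it is not Flume's channel API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical in-memory channel; events are plain strings for brevity.
class SketchChannel {
    private final Deque<String> queue = new ArrayDeque<>();
    private final List<String> inFlight = new ArrayList<>();

    void put(String event) { queue.addLast(event); }

    // Takes up to max events; they stay "in flight" until commit or rollback.
    List<String> takeBatch(int max) {
        if (!inFlight.isEmpty()) {
            throw new IllegalStateException("a take transaction is already open");
        }
        while (inFlight.size() < max && !queue.isEmpty()) {
            inFlight.add(queue.pollFirst());
        }
        return new ArrayList<>(inFlight);
    }

    // Delivery succeeded: the in-flight events are removed for good.
    void commit() { inFlight.clear(); }

    // Delivery failed: return the events to the head of the queue, in order.
    void rollback() {
        for (int i = inFlight.size() - 1; i >= 0; i--) {
            queue.addFirst(inFlight.get(i));
        }
        inFlight.clear();
    }

    int size() { return queue.size(); }
}
```

A sink in this sketch would call `takeBatch`, attempt delivery, and then call `commit` on success or `rollback` on failure, so a failed hop never loses buffered events.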
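Core concept: Sink
A Sink delivers events to the next hop or the final destination and, on success, removes them from the channel.
- Different types of Sinks:
- Terminal Sinks that store events at their final destination, e.g. HDFS, HBase
- Sinks that simply consume events, e.g. Null Sink
- IPC Sinks for agent-to-agent communication: Avro
- A Sink must be attached to exactly one channel
- Reliability is based on:
- Transactional exchange between agents
- The persistence characteristics of the Channels in the flow
- Availability:
- Built-in load balancing support
- Built-in failover support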
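Core concept: Interceptor
A set of Interceptors attached to a Source decorates and filters events where needed, in a preconfigured order.
- Built-in Interceptors can add event headers such as timestamps, hostnames, and static markers
- Custom interceptors can inspect the event payload (read the raw log) and create specific headers where needed.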
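The ordered decorate-and-filter behavior can be sketched as a small chain. All names here (`LogEvent`, `EventInterceptor`, `InterceptorChain`) are hypothetical, and returning `null` to drop an event is a convention of this sketch rather than a statement about Flume's own interface.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical event type: headers plus a text body, for brevity.
class LogEvent {
    final Map<String, String> headers = new HashMap<>();
    final String body;
    LogEvent(String body) { this.body = body; }
}

// Hypothetical contract: return the (possibly decorated) event, or null to drop it.
interface EventInterceptor {
    LogEvent intercept(LogEvent event);
}

class InterceptorChain {
    private final List<EventInterceptor> interceptors = new ArrayList<>();

    InterceptorChain add(EventInterceptor interceptor) {
        interceptors.add(interceptor);
        return this;
    }

    // Applies interceptors in the configured order; stops once one drops the event.
    LogEvent apply(LogEvent event) {
        for (EventInterceptor interceptor : interceptors) {
            event = interceptor.intercept(event);
            if (event == null) {
                return null;
            }
        }
        return event;
    }
}
```

Because `EventInterceptor` has a single method, each stage can be a lambda: one stage might add a `host` header, the next drop lines containing `DEBUG`, and a third inspect the body to set a `severity` header.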
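A Channel Selector lets a Source choose one or more Channels from all of its Channels, based on preset criteria.
- Built-in Channel Selectors:
- Replicating: the event is copied to every associated channel
- Multiplexing: the event is routed to specific channels based on its headers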
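The two selection strategies differ only in how they pick target channels, which a short sketch can show. Channels are represented by their names, and `SelectorSketch` plus its method and parameter names are assumptions made for illustration; the fallback to default channels is also an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SelectorSketch {
    // Replicating: every associated channel receives a copy of the event.
    static List<String> replicate(List<String> allChannels) {
        return new ArrayList<>(allChannels);
    }

    // Multiplexing: look up one header value in a mapping of value -> channels.
    // Events without a matching value go to the default channels.
    static List<String> multiplex(Map<String, String> headers,
                                  String headerName,
                                  Map<String, List<String>> mapping,
                                  List<String> defaultChannels) {
        String value = headers.get(headerName);
        List<String> target = (value == null) ? null : mapping.get(value);
        return target != null ? target : defaultChannels;
    }
}
```

With a mapping such as `error -> [c1]` and `info -> [c2]`, an event whose `severity` header is `error` is routed to `c1` only, whereas replication would send it to every channel.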
- Flume implements load balancing and failover through Sink Processors
- Built-in Sink Processors:
- Load Balancing Sink Processor: uses RANDOM, ROUND_ROBIN, or a custom selection algorithm
- Failover Sink Processor
- Default Sink Processor (single Sink)
- All Sinks fetch events from the Channel by polling; polling is driven by the Sink Runner
- The Sink Processor acts as a proxy for the Sink
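How a processor chooses among its sinks can be sketched as two selection policies. Sinks are represented by their names; `SinkPickers`, the priority-ordered list, and the health-check predicate are assumptions of this sketch and do not mirror Flume's classes.

```java
import java.util.List;
import java.util.function.Predicate;

class SinkPickers {
    private int next = 0;

    // Round-robin load balancing: rotate through the sinks, one per call.
    String roundRobin(List<String> sinks) {
        String chosen = sinks.get(next % sinks.size());
        next = (next + 1) % sinks.size();
        return chosen;
    }

    // Failover: sinks are listed by descending priority; pick the first healthy one.
    static String failover(List<String> sinksByPriority, Predicate<String> isHealthy) {
        for (String sink : sinksByPriority) {
            if (isHealthy.test(sink)) {
                return sink;
            }
        }
        return null; // no sink is currently available
    }
}
```

Round-robin spreads successive batches across all sinks, while failover keeps using the highest-priority sink until the health check rejects it.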
Abstract
Many Internet applications, such as e-commerce systems, employ a 3-tier software architecture. Starting from an analysis of the architecture of an e-commerce development platform model based on the 3-tiered Web, and using queuing network theory, we propose a reusable e-commerce development platform based on the 3-tiered Web. Finally, we implement the MVA algorithm with the help of MATLAB and test the experimental data with the LoadRunner benchmark.
1. Introduction
The past few years have witnessed tremendous growth in Internet information and Web applications. E-commerce is rapidly becoming an everyday activity as consumers gain familiarity with shopping on the Internet. The most common consideration is performance: these application systems must ultimately provide cost-effective, high-availability services, so they have to be scaled to handle the expected load.
Performance measurements can serve as the basis for performance modeling and prediction. The common performance metrics are response time, throughput, and resource utilization. Our primary goal was to predict the response time of JSP Web applications based on a queuing model, because response time is the only performance metric that users experience directly. We also predict the throughput of the tiers of the Web applications.
2. Architecture and Design
As one of the popular distributed computing platforms, J2EE technology has become a core part of enterprise applications based on the 3-tiered Web. The e-commerce development platform architecture is shown in Figure 1.
The presentation layer is composed of JSP pages, page components (DataGrid, menu, ComboBox, Textbox, date, etc.), and Taglib (Struts tags and component tags); users operate on these elements, and they display the results. Powerful and rich components lead to better page display and user interaction. To implement the presentation layer, we use the Struts framework to provide the Controller and Dispatcher functions.
2.2. Business Logic Layer
The main task of the business logic layer is to implement the complex business operations. From the database perspective, these operations usually involve a number of database tables and complex nested transactions; from the object-oriented perspective, the business objects rely on a number of related business objects or data access objects and involve mappings among several tables. To implement the business logic, we use Spring's AOP technology to abstract the business logic at a higher level, so that business rules can be modified according to users' demands in special situations.
2.3. Database Access Layer
Data persistence is implemented by data access objects, whose main job is adding, modifying, and querying the basic data, etc.; they do not contain complex business logic. The key to the data access layer is to construct a reusable object layer using technologies that keep the data access layer independent. Various technologies are available for the data access layer; Hibernate and SQL Helper are the main platforms at present.
3. Model Design and Analysis
3.1. Model Description
The most commonly deployed infrastructure for Web services is the 3-tiered architecture shown in Figure 2. In this architecture, the front end of a representative Web site is the Web server, which acts as the presentation layer. The main function of this tier is to provide Internet information browsing services. All the business logic for a Web site resides in the 2nd tier (application server). This server receives requests from the 1st tier and processes the information held in the 3rd tier (database server). The database server is the warehouse of a Web site's information. Everything from user accounts and types to notifications and customer orders is stored in the database.
Figure 2 Queuing Network Model of 3-Tiered Web Services Architecture
During the measurements, the number of application tiers was constant (T=3). There were two classes of session operations. The number of sessions in the first class was fixed at 10, while that of the second class was varied. To determine Zn, we averaged the sleep times in the user scenario of each class. To determine St, we averaged the service times of each page and class associated with the given tier and class. The visit ratios can be estimated as Vt ≈ λt / λreq, where λt is the number of requests for each page and class residing in the given application tier and class, and λreq is the number of requests belonging to the given class across the whole application.
Computing Average Response Time Equation:
Computing Throughput Equation:
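The two equations themselves are not reproduced here. As a stand-in, the following is a minimal sketch of the textbook exact MVA recursion for a single-class closed queuing network, written in Java rather than the MATLAB used in the study. It is an assumption that this matches the form used in the paper, which works with two session classes. For n = 1..N users it computes the per-tier residence time V_t * S_t * (1 + Q_t(n-1)), the total response time R(n) as their sum, the throughput X(n) = n / (Z + R(n)), and the queue lengths Q_t(n) = X(n) times the residence time. `MvaSketch` and `solve` are hypothetical names.

```java
// Exact single-class Mean-Value Analysis for a closed queuing network.
// A simplified stand-in: the study itself uses two session classes.
class MvaSketch {
    final double responseTime; // R(N), summed over all tiers
    final double throughput;   // X(N), requests per unit time

    private MvaSketch(double responseTime, double throughput) {
        this.responseTime = responseTime;
        this.throughput = throughput;
    }

    // serviceTimes[t] = S_t, visitRatios[t] = V_t, thinkTime = Z, users = N.
    // Assumes positive service times so that Z + R(n) is never zero.
    static MvaSketch solve(double[] serviceTimes, double[] visitRatios,
                           double thinkTime, int users) {
        int tiers = serviceTimes.length;
        double[] queue = new double[tiers]; // Q_t(0) = 0 for every tier
        double r = 0.0;
        double x = 0.0;
        for (int n = 1; n <= users; n++) {
            double[] residence = new double[tiers];
            r = 0.0;
            for (int t = 0; t < tiers; t++) {
                // An arriving request sees the queue left by n-1 users.
                residence[t] = visitRatios[t] * serviceTimes[t] * (1.0 + queue[t]);
                r += residence[t];
            }
            x = n / (thinkTime + r);
            for (int t = 0; t < tiers; t++) {
                queue[t] = x * residence[t]; // Little's law per tier
            }
        }
        return new MvaSketch(r, x);
    }
}
```

For a three-tier system, `serviceTimes` and `visitRatios` would each hold three entries (Web, application, and database tier), with the think time set to the averaged sleep time.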
This section describes our experimental setup and the experimental validation of the model in a J2EE environment.
The Web server of our test Web application was an IBM server with an Eclipse 3.2 environment, one of the most popular among mid-sized enterprise platforms. The database management system was MySQL 5.0; the development environment was composed of Eclipse 3.2, Hibernate 3.0, and Spring 2.0. The server ran on a 3.0 GHz Intel Pentium 4 processor with 2 GB of system memory; its operating system was Windows Server 2003 with Service Pack 1. The client ran on a 1.8 GHz Intel(R) Core(TM) 2 Duo T5470 processor with 1 GB of system memory; its operating system was Windows XP Professional with Service Pack 2. The computers were connected by a 100 Mb/s network.
MATLAB (Matrix Laboratory) is a numerical computing environment and programming language. We use it to implement the MVA (Mean-Value Analysis) algorithm and to evaluate system performance measures with the model parameters as input. There were two classes of sessions: a database reader and a database writer. The number of concurrent browser connections of one class was fixed at 10, while that of the other class was varied from 40 to 100. The user think time per request was varied from 0.01 s to 0.10 s.
Figure 3 The Predicted Average Response Time
Figure 4 The Predicted Throughput
Figure 3 shows the trend of the average response time as the number of concurrent browser connections varies from 1 to 50. As the number of simultaneous browser connections increases, the read transaction average response time grows linearly, while the upward trend for write transactions is barely noticeable. In the overloaded state, the read transaction average response time becomes extremely high. Figure 4 shows the effect on throughput as the number of concurrent browser connections varies from 1 to 50. As the number of simultaneous browser connections increases, before saturation the write transaction throughput (transactions/second) drops sharply, while the read transaction throughput grows slowly. After saturation, the read transaction throughput remains close to constant.
LoadRunner is a tool, or more accurately a collection of tools, that facilitates load testing of a system. LoadRunner is used to create a load or stress on a system and can be used to measure the responsiveness of the system to that load. The key concept behind LoadRunner is the use of virtual users, or vusers.
Figure 5 The Validated Average Response Time
Figure 6 The Validated Transaction Throughput
We validated the average response time and throughput of JSP Web applications (Figure 5 and Figure 6). The validation results indicate that the MVA algorithm estimates the average response time and the throughput acceptably well.
4. Conclusion
We demonstrated and validated the queuing model of an e-commerce application based on 3-tiered Web services in a J2EE environment. In other words, the input model parameters were estimated from the measurements on a per-class basis, and the MVA evaluation algorithm was implemented with the help of MATLAB. In order to experimentally validate the model, LoadRunner was
Approach 4: interceptor configured with the tx tags
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:aop="http://www.springframework.org/schema/aop"
       xmlns:tx="http://www.springframework.org/schema/tx"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
           http://www.springframework.org/schema/context
           http://www.springframework.org/schema/context/spring-context-2.5.xsd
           http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-2.5.xsd
           http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-2.5.xsd">
    <context:annotation-config/>
    <context:component-scan base-package="com.bluesky"/>
    <bean id="sessionFactory"
          class="org.springframework.orm.hibernate3.LocalSessionFactoryBean">
        <property name="configLocation" value="classpath:hibernate.cfg.xml"/>
        <property name="configurationClass" value="org.hibernate.cfg.AnnotationConfiguration"/>
    </bean>
    <!-- Define the transaction manager (declarative transactions) -->
    <bean
          id="transactionManager"
          class="org.springframework.orm.hibernate3.HibernateTransactionManager">
        <property name="sessionFactory"
                  ref="sessionFactory"/>
    </bean>
    <tx:advice id="txAdvice" transaction-manager="transactionManager">
        <tx:attributes>
            <tx:method name="*" propagation="REQUIRED"/>
        </tx:attributes>
    </tx:advice>
    <aop:config>
        <aop:pointcut id="interceptorPointCuts"
                      expression="execution(* com.bluesky.spring.dao.*.*(..))"/>
        <aop:advisor advice-ref="txAdvice"
                     pointcut-ref="interceptorPointCuts"/>
    </aop:config>
</beans>
Approach 5: fully annotation-based
<?