1. Download apache-flume-1.4.0-bin.tar.gz and extract it
tar -xzvf apache-flume-1.4.0-bin.tar.gz
cd apache-flume-1.4.0-bin
2. Create a simple Flume configuration file with the following content:
vi a1.conf
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
3. Start flume-ng
./bin/flume-ng agent --conf conf --conf-file a1.conf --name a1 -Dflume.root.logger=INFO,console
4. In another terminal, connect with telnet
telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
hello world
OK
The messages received by Flume will appear in the console from step 3.
Press Ctrl+C in the step 3 console to stop the test.
5. Change a1's source to read data from a log file
The modified configuration:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/secure

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
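Note that the exec source uses tail -F rather than tail -f: -F follows the file *name*, so the source keeps reading /var/log/secure even after logrotate swaps the file out. A minimal stand-alone sketch of that behavior (assuming GNU coreutils tail; the file paths here are throwaway demo files, not part of the Flume config):

```shell
# Demonstrate why the exec source uses tail -F rather than tail -f:
# -F re-opens the file when the name reappears after rotation.
LOG=/tmp/flume_tail_demo.log
OUT=/tmp/flume_tail_demo.out
: > "$LOG"; : > "$OUT"
tail -F "$LOG" > "$OUT" 2>/dev/null &
TAIL_PID=$!
sleep 1
echo "before rotation" >> "$LOG"
sleep 1
mv "$LOG" "$LOG.1"               # simulate logrotate moving the file aside
echo "after rotation" > "$LOG"   # a new file appears under the original name
sleep 2
kill "$TAIL_PID"
cat "$OUT"                       # both lines were captured
```

With tail -f, the second line would be lost because tail would keep reading the moved-away file descriptor.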
6. Start flume-ng again
./bin/flume-ng agent --conf conf --conf-file a1.conf --name a1 -Dflume.root.logger=INFO,console
Open a new SSH client and log in to this server; Flume will emit events for the new log entries.
However, the logged events are incomplete.
The contents of /var/log/secure:
Mar 12 14:50:21 web5 sshd[9856]: Accepted password for root from 10.0.2.11 port 1135 ssh2
Mar 12 14:50:21 web5 sshd[9856]: pam_unix(sshd:session): session opened for user root by (uid=0)
The Flume console shows:
2014-03-12 06:50:21,571 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 4D 61 72 20 31 32 20 31 34 3A 35 30 3A 32 31 20 Mar 12 14:50:21 }
2014-03-12 06:50:25,575 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 4D 61 72 20 31 32 20 31 34 3A 35 30 3A 32 31 20 Mar 12 14:50:21 }
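The truncation comes from the logger sink, which by default prints only the first 16 bytes of each event body. Flume 1.4.0 has no setting to change this; later releases (1.6.0 and up) make the limit configurable, roughly like this (a hedged sketch for newer versions, not applicable to 1.4.0):

```
# Only works on Flume versions where maxBytesToLog exists (1.6.0+)
a1.sinks.k1.type = logger
a1.sinks.k1.maxBytesToLog = 256
```

On 1.4.0, use a different sink (such as the file_roll sink added in the next step) to see the full event bodies.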
7. Add a file sink to Flume
The modified a1.conf:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/secure

# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k2.type = file_roll
a1.sinks.k2.sink.directory = /tmp/flume

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1
After restarting Flume, data is written under /tmp/flume.
Comparing the files under /tmp/flume with the Flume console output shows that the log data has been split between the two: some entries appear on the console while the rest end up in the files under /tmp/flume. This happens because both sinks drain the same channel, so each event is delivered to exactly one of them.
The contents of /var/log/secure:
Mar 12 14:50:21 web5 sshd[9856]: Accepted password for root from 10.0.2.11 port 1135 ssh2
Mar 12 14:50:21 web5 sshd[9856]: pam_unix(sshd:session): session opened for user root by (uid=0)
Mar 12 14:59:36 web5 sshd[10350]: Accepted password for root from 10.0.2.11 port 1193 ssh2
Mar 12 14:59:36 web5 sshd[10350]: pam_unix(sshd:session): session opened for user root by (uid=0)
Mar 12 15:00:56 web5 sshd[10350]: Received disconnect from 10.0.2.11: 11: Disconnect requested by Windows SSH Client.
Mar 12 15:00:56 web5 sshd[10350]: pam_unix(sshd:session): session closed for user root
A new file appears under /tmp/flume roughly every 30 seconds.
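That ~30-second cadence is the file_roll sink's default sink.rollInterval of 30 seconds. It can be tuned if fewer, larger files are preferred (a config sketch; 600 is an arbitrary example value):

```
# Roll a new file every 10 minutes instead of every 30 seconds;
# setting this to 0 disables time-based rolling entirely.
a1.sinks.k2.sink.rollInterval = 600
```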
8. Add a separate channel for the file sink
The modified a1.conf:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/secure

# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k2.type = file_roll
a1.sinks.k2.sink.directory = /tmp/flume

# Use channels which buffer events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100

# Bind the source and sinks to the channels
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = replicating
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
After this change, the console output and the files under /tmp/flume both match the log exactly: the replicating selector copies every event into both channels, so each sink receives the full stream.