1. Installation: JAVA_HOME must be specified.
Set JAVA_HOME in conf/flume-env.sh.
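A minimal sketch of the entry (the JDK path below is the one that appears in the launch output in item 2; substitute your own):

export JAVA_HOME=/usr/local/java/jdk1.8.0_131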
2. Running with a file channel, the agent reports an OOM:
[hdfs]$ bin/flume-ng agent --conf conf/ --conf-file conf/kafka-hdfs-pv.conf --name tier1
Info: Sourcing environment configuration script /data2/apache-flume-1.8.0-bin/conf/flume-env.sh
Info: Including Hive libraries found via () for Hive access
+ exec /usr/local/java/jdk1.8.0_131/bin/java -Xmx20m -cp '/data2/apache-flume-1.8.0-bin/conf:/data2/apache-flume-1.8.0-bin/lib/*:/lib/*' -Djava.library.path= org.apache.flume.node.Application --conf-file conf/kafka-hdfs-pv.conf --name tier1
Exception in thread "PollableSourceRunner-KafkaSource-source1" java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.kafka.common.record.Record.computeChecksum(Record.java:166)
at org.apache.kafka.common.record.Record.computeChecksum(Record.java:204)
at org.apache.kafka.common.record.Record.isValid(Record.java:218)
at org.apache.kafka.common.record.Record.ensureValid(Record.java:225)
at org.apache.kafka.clients.consumer.internals.Fetcher.parseRecord(Fetcher.java:617)
at org.apache.kafka.clients.consumer.internals.Fetcher.handleFetchResponse(Fetcher.java:566)
at org.apache.kafka.clients.consumer.internals.Fetcher.access$000(Fetcher.java:69)
at org.apache.kafka.clients.consumer.internals.Fetcher$1.onSuccess(Fetcher.java:139)
at org.apache.kafka.clients.consumer.internals.Fetcher$1.onSuccess(Fetcher.java:136)
at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:380)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:274)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:320)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:213)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.quickPoll(ConsumerNetworkClient.java:202)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:864)
at org.apache.flume.source.kafka.KafkaSource.doProcess(KafkaSource.java:202)
at org.apache.flume.source.AbstractPollableSource.process(AbstractPollableSource.java:60)
at org.apache.flume.source.PollableSourceRunner$PollingRunner.run(PollableSourceRunner.java:133)
at java.lang.Thread.run(Thread.java:748)
^CAttempting to shutdown background worker.
Solution:
Flume's default JVM heap allocation is too small, only 20 MB (note the -Xmx20m in the exec line above). Increase the allocation:
vim conf/flume-env.sh
export JAVA_OPTS="-Xms50m -Xmx50m -Dcom.sun.management.jmxremote"
After raising the heap to 50 MB, the error no longer occurs.
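Re-running bin/flume-ng should now print an exec line with the new heap settings, roughly as follows (a sketch extrapolated from the launch output above; the remainder of the line is unchanged):

+ exec /usr/local/java/jdk1.8.0_131/bin/java -Xms50m -Xmx50m -Dcom.sun.management.jmxremote -cp '/data2/apache-flume-1.8.0-bin/conf:/data2/apache-flume-1.8.0-bin/lib/*:/lib/*' ...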
3. Flume installed on a machine outside the Hadoop cluster, collecting data from Kafka into HDFS; the write to HDFS fails with:
java.lang.NoClassDefFoundError: org/apache/hadoop/io/SequenceFile$CompressionType
at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:235)
at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:411)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.SequenceFile$CompressionType
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
Cause:
The Hadoop-related jars are missing. Copy them from a node in the
Hadoop cluster into flume/lib/ (see the copy sketch after the list below).
The missing jars are:
commons-configuration-1.6.jar
hadoop-auth-2.6.0-cdh5.15.0.jar
hadoop-common-2.6.0-cdh5.15.0.jar
hadoop-hdfs-2.6.0.jar
htrace-core-3.2.0-incubating.jar
htrace-core4-4.0.1-incubating.jar
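One way to pull the jars over is sketched below; the source host and jar directory are assumptions (on CDH 5.15 the jars typically sit under /opt/cloudera/parcels/CDH/jars on a cluster node), so locate them on your own cluster first:

# run on the Flume host; hadoop-node1 and SRC path are assumptions
SRC=hadoop-node1:/opt/cloudera/parcels/CDH/jars
scp $SRC/commons-configuration-1.6.jar \
    $SRC/hadoop-auth-2.6.0-cdh5.15.0.jar \
    $SRC/hadoop-common-2.6.0-cdh5.15.0.jar \
    $SRC/hadoop-hdfs-2.6.0.jar \
    $SRC/htrace-core-3.2.0-incubating.jar \
    $SRC/htrace-core4-4.0.1-incubating.jar \
    /data2/apache-flume-1.8.0-bin/lib/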