1 Install Sqoop and import table data from MySQL into a text file on HDFS; capture screenshots of the whole process
2 Install Flume or Chukwa and run a simple test
1.
(1) Download sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz and extract it, as sketched below.
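A minimal extraction sketch (the target directory /usr/sqoop-1.4.6 is an assumption; adjust to your layout):
# extract and move to an assumed install location
tar -xzf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz -C /usr/
mv /usr/sqoop-1.4.6.bin__hadoop-2.0.4-alpha /usr/sqoop-1.4.6
cd /usr/sqoop-1.4.6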
(2) Edit the Sqoop configuration file conf/sqoop-env.sh:
# Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/usr/hadoop-2.7.2
# Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=
# Set the path to where bin/hbase is available
export HBASE_HOME=/usr/hbase_123
# Set the path to where bin/hive is available
export HIVE_HOME=/usr/hive
# Set the path for where zookeeper config dir is
export ZOOCFGDIR=/usr/zookeeper_349/conf
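Sqoop does not ship the MySQL JDBC driver, so it must be copied into Sqoop's lib/ directory before the commands below can connect. A sketch, assuming connector version 5.1.40 and the install path above (both assumptions):
# jar name/version and sqoop path are assumptions; use whatever connector jar you downloaded
cp mysql-connector-java-5.1.40-bin.jar /usr/sqoop-1.4.6/lib/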
(3) Run Sqoop to verify the connection to MySQL:
./bin/sqoop list-databases --connect jdbc:mysql://192.168.31.247:3306/ --username hive --password zaq1XSW@
(4) Import the table into HDFS:
./bin/sqoop import --connect jdbc:mysql://192.168.31.247:3306/sqoop --username hive --password zaq1XSW@ --table test -m 1
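By default the import writes text files under the running user's HDFS home directory, named after the table. A quick check, assuming the job ran as root (adjust the path if --target-dir was used):
hdfs dfs -ls /user/root/test
hdfs dfs -cat /user/root/test/part-m-00000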
2.
(1) Download apache-flume-1.7.0-bin.tar.gz and extract it.
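As with Sqoop, a minimal extraction sketch (the target directory /usr/flume-1.7.0 is an assumption):
tar -xzf apache-flume-1.7.0-bin.tar.gz -C /usr/
mv /usr/apache-flume-1.7.0-bin /usr/flume-1.7.0
cd /usr/flume-1.7.0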
(2) Edit the configuration files.
flume-env.sh
# Environment variables can be set here.
export JAVA_HOME=/usr/jdk1.8.0_101
# Give Flume more memory and pre-allocate; enable remote monitoring via JMX
export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote"
flume-conf
# Name the components on this agent
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Source: listen for lines of text on a netcat-style TCP port
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sources.r1.channels = c1
# Sink: log each event to the console
a1.sinks.k1.type = logger
# Channel: buffer events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the sink to the channel
a1.sinks.k1.channel = c1
(3) Start the Flume agent:
./bin/flume-ng agent --conf ./conf/ --conf-file ./conf/flume-conf --name a1 -Dflume.root.logger=INFO,console
(4) Send test messages to the source; the agent should log each event to the console.
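From a second terminal, connect to the netcat source and type a line; the agent console should print the event through the logger sink:
telnet localhost 44444
hello flume
Each line typed into the telnet session should appear as a logged Event in the agent's console (nc localhost 44444 works as well if telnet is not installed).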