DataX mongodb导出数据到mysql

Python版本要为2
cmd乱码解决:输入CHCP 65001
数据库中的数据中文乱码解决:在json文件中jdbcUrl项加上:?characterEncoding=utf8

DataX介绍

安装DataX

DataX下载地址
下载完成解压至某个路径下即可

查看配置模板

python datax.py -r {YOUR_READER} -w {YOUR_WRITER}

例如mysql:

C:\DataX\bin>python datax.py -r mysqlreader -w mysqlwriter

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.


Please refer to the mysqlreader document:
     https://github.com/alibaba/DataX/blob/master/mysqlreader/doc/mysqlreader.md

Please refer to the mysqlwriter document:
     https://github.com/alibaba/DataX/blob/master/mysqlwriter/doc/mysqlwriter.md

Please save the following configuration as a json file and  use
     python {DATAX_HOME}/bin/datax.py {JSON_FILE_NAME}.json
to run the job.

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "column": [],
                        "connection": [
                            {
                                "jdbcUrl": [],
                                "table": []
                            }
                        ],
                        "password": "",
                        "username": "",
                        "where": ""
                    }
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "column": [],
                        "connection": [
                            {
                                "jdbcUrl": "",
                                "table": []
                            }
                        ],
                        "password": "",
                        "preSql": [],
                        "session": [],
                        "username": "",
                        "writeMode": ""
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": ""
            }
        }
    }
}

配置mongodb2mysql.json文件

{
    "job": {
        "setting": {
            "speed": {
                "channel": 1
            }
        },
        "content": [{
            "reader": {
                "name": "mongodbreader",
                "parameter": {
                    "address": ["*.*.*.*:27017"],
                    "userName": "root",
                    "userPassword": "123456",
                    "dbName": "weixin",
                    "collectionName": "fileids_wxpy",
                    "column": [{
                        "index":0,
                        "name": "_id",
                        "type": "string"
                    }, {
                        "index":1,
                        "name": "crawler_time",
                        "type": "string"
                    }, {
                        "index":2,
                        "name": "file_url",
                        "type": "string"
                    }, {
                        "index":3,
                        "name": "flag",
                        "type": "string"
                    }, {
                        "index":4,
                        "name": "logo_url",
                        "type": "string"
                    }, {
                        "index":5,
                        "name": "source",
                        "type": "string"
                    }, {
                        "index":6,
                        "name": "update_date",
                        "type": "string"
                    }, {
                        "index":7,
                        "name": "update_time",
                        "type": "long"
                    }, {
                        "index":8,
                        "name": "wx_id",
                        "type": "string"
                    }, {
                        "index":9,
                        "name": "wx_name",
                        "type": "string"
                    }]
                }
            },
             "writer": {
                    "name": "mysqlwriter", 
                    "parameter": {
                        "column": [
                        "id",
                        "crawler_time",
                        "file_url",
                        "flag",
                        "logo_url",
                        "source",
                        "update_date",
                        "update_time",
                        "wx_id",
                        "wx_name"
                        ], 
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:mysql://*.*.*.*:3306/weixin?characterEncoding=utf8", 
                                "table": ["fileids_wxpy"]
                            }
                        ], 
                        "password": "123456", 
                        "username": "root"
                    }
                }
        }]
    }

}

mysql新建数据库 、表

create DATABASE weixin;
use weixin;
DROP TABLE IF EXISTS `fileids_wxpy`;
CREATE TABLE `fileids_wxpy` (
  `id` bigint(20) unsigned NOT NULL,
  `crawler_time` int(10) unsigned NOT NULL,
  `file_url` varchar(255) NOT NULL,
  `flag` varchar(255) NOT NULL,
  `logo_url` varchar(255) NOT NULL,
  `source` varchar(255) NOT NULL,
  `update_date` int(10) unsigned NOT NULL,
  `update_time` int(10) unsigned NOT NULL,
  `wx_id` varchar(255) NOT NULL,
  `wx_name` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

启动

C:\DataX\bin>python datax.py mongodb2mysql.json

报错

Caused by: com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches ReadPreferenceServerSelector{readPreference=primary}. Client view of cluster state is {type=UNKNOWN, servers=[{address=*.*.*.*:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSecurityException: Exception authenticating MongoCredential{mechanism=null, userName='root', source='weixin', password=<hidden>, mechanismProperties={}}}, caused by {com.mongodb.MongoCommandException: Command failed with error 18: 'Authentication failed.' on server*.*.*.*:27017. The full response is { "ok" : 0.0, "errmsg" : "Authentication failed.", "code" : 18, "codeName" : "AuthenticationFailed" }}}]
        at com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
        at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
        at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
        at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
        at com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63)
        at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:201)
        at com.mongodb.operation.CountOperation.execute(CountOperation.java:206)
        at com.mongodb.operation.CountOperation.execute(CountOperation.java:53)
        at com.mongodb.Mongo.execute(Mongo.java:772)
        at com.mongodb.Mongo$2.execute(Mongo.java:759)
        at com.mongodb.MongoCollectionImpl.count(MongoCollectionImpl.java:185)
        at com.mongodb.MongoCollectionImpl.count(MongoCollectionImpl.java:165)
        at com.alibaba.datax.plugin.reader.mongodbreader.util.CollectionSplitUtil.doSplitInterval(CollectionSplitUtil.java:55)
        at com.alibaba.datax.plugin.reader.mongodbreader.util.CollectionSplitUtil.doSplit(CollectionSplitUtil.java:37)
        at com.alibaba.datax.plugin.reader.mongodbreader.MongoDBReader$Job.split(MongoDBReader.java:37)
        at com.alibaba.datax.core.job.JobContainer.doReaderSplit(JobContainer.java:732)
        at com.alibaba.datax.core.job.JobContainer.split(JobContainer.java:393)
        at com.alibaba.datax.core.job.JobContainer.start(JobContainer.java:117)
        ... 3 more

原因

MongoDB中每个数据库之间是相互独立的,都有独立的权限,正确的做法是使用root账号在【将要操作的数据库】中创建一个【子账号】,在用这个子账号连接mongo

解决办法

>use admin

switched to db admin

>db.auth("root","******")

1

>show dbs

admin

local

weixin

>use weixin

switched to db weixin

>db.createUser(
        {
            user:"DataXTest",
            pwd:"123456",
            roles:[{role:"dbOwner",db:"weixin"}]
        }
)

Successfully added user: {
    "user" : "DataXTest",
    "roles" : [
        {
            "role" : "dbOwner",
            "db" : "weixin"
        }
    ]

}

使用DataXTest来替换jsono配置文件中mongodb的账号root后,再次运行

C:\DataX\bin>python datax.py mongodb2mysql.json
2019-06-13 14:39:40.218 [job-0] INFO  JobContainer - PerfTrace not enable!
2019-06-13 14:39:40.219 [job-0] INFO  StandAloneJobContainerCommunicator - Total 50115 records, 17716504 bytes | Speed 36.04KB/s, 104 records/s | Error 12 records, 3513 bytes |  All Task WaitWriterTime 259.684s |  All Task WaitReaderTime 207.041s | Percentage 100.00%
2019-06-13 14:39:40.221 [job-0] INFO  JobContainer -
任务启动时刻                    : 2019-06-13 14:31:33
任务结束时刻                    : 2019-06-13 14:39:40
任务总计耗时                    :                487s
任务平均流量                    :           36.04KB/s
记录写入速度                    :            104rec/s
读出记录总数                    :               50115
读写失败总数                    :                  12

注: 此处错误的12条记录是由于id 长度超过19位

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 206,723评论 6 481
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 88,485评论 2 382
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 152,998评论 0 344
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 55,323评论 1 279
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 64,355评论 5 374
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 49,079评论 1 285
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 38,389评论 3 400
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 37,019评论 0 259
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 43,519评论 1 300
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 35,971评论 2 325
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 38,100评论 1 333
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 33,738评论 4 324
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 39,293评论 3 307
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 30,289评论 0 19
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 31,517评论 1 262
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 45,547评论 2 354
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 42,834评论 2 345

推荐阅读更多精彩内容