utf-8 - Issue with MySQL UTF-8 data import using Sqoop


I am importing UTF-8 data from MySQL into HDFS using Sqoop import. It works fine in general, but I am facing an issue when the data is UTF-8. The source MySQL table is UTF-8 compatible, but it looks like Sqoop is converting the data during import. For example, the source value л.с. is loaded as л.Ñ. in HDFS.

Currently, the MySQL (v5.6.10) character set and collation settings are as given below:

+--------------------------+-------------------+
| Variable_name            | Value             |
+--------------------------+-------------------+
| character_set_client     | latin1            |
| character_set_connection | latin1            |
| character_set_database   | latin1            |
| character_set_filesystem | binary            |
| character_set_results    | latin1            |
| character_set_server     | utf8              |
| character_set_system     | utf8              |
| collation_connection     | latin1_swedish_ci |
| collation_database       | latin1_swedish_ci |
| collation_server         | utf8_unicode_ci   |
+--------------------------+-------------------+

-- table structure
create table utf_test_cases_ms (
    test_case varchar(50) not null,
    english_lang varchar(250) not null,
    language_name varchar(50) not null,
    utf8_lang varchar(300) not null
) engine=myisam default charset=utf8;

-- mysql
select * from utf_test_cases_ms;
+--------------------+--------------+---------------+-----------+
| test_case          | english_lang | language_name | utf8_lang |
+--------------------+--------------+---------------+-----------+
| multiple character | hp           | russian       | л.с.     |
+--------------------+--------------+---------------+-----------+

-- sqoop import command
sqoop import --connect jdbc:mysql://<<ip_address_with_port>>/<<db_name>> \
    --table utfmb_test_cases_ms --username sqoop_user --password sqoop_pwd \
    --hive-import --hive-table utf_ms_db.utfmb_test_cases_ms \
    --create-hive-table --null-string '\\N' --null-non-string '\\N' \
    --fields-terminated-by '|' --lines-terminated-by '\n' -m 1

-- hive (hdfs)
select * from utfmb_test_cases_ms;
multiple character  hp  russian  л.Ñ.

Do I need to change the character set and collation in the MySQL config file? Or do I need to pass Unicode / UTF-8 parameters while importing the data via Sqoop?

Please provide a solution for this. Thanks in advance!

(from comment)

create table utf_test_cases_ms (
    test_case varchar(50) not null,
    english_lang varchar(250) not null,
    language_name varchar(50) not null,
    utf8_lang varchar(300) not null
) engine=myisam default charset=utf8;
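The corruption described in the question looks like the usual double-encoding pattern, where UTF-8 bytes are reinterpreted as latin1 somewhere along the path. The sketch below is only an illustration in Python, not part of the Sqoop setup. It assumes the intended text is the Cyrillic string л.с. (the question shows only garbled forms), and `misdecode` and `visible` are hypothetical helper names. Python's `latin-1` codec stands in for MySQL's latin1 here.

```python
def misdecode(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode those bytes as latin1."""
    return text.encode("utf-8").decode("latin-1")


def visible(text: str) -> str:
    """Drop C1 control characters (U+0080..U+009F), which terminals usually hide."""
    return "".join(ch for ch in text if not 0x80 <= ord(ch) <= 0x9F)


# Assumption: this Cyrillic string is what the column is meant to hold.
original = "\u043b.\u0441."

once = misdecode(original)   # one wrong latin1 interpretation
twice = misdecode(once)      # the same mistake applied a second time

print(visible(once))
print(visible(twice))

# A single wrong step is reversible as long as no bytes were lost.
recovered = once.encode("latin-1").decode("utf-8")
print(recovered == original)
```

Under that assumption, one bad round trip reproduces the value shown by the mysql client, and a second one reproduces the value seen in Hive. That would point at a latin1 step on the connection rather than at damaged data in the table.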

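Pass the character-set parameter in the Sqoop command. With --direct, the arguments after the bare -- are handed to the underlying mysqldump tool: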

sqoop import --connect jdbc:mysql://server.foo.com/db --table bar \
    --direct -- --default-character-set=latin1
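For imports that go through JDBC rather than --direct, a commonly suggested alternative is to request UTF-8 on the connection URL itself. The sketch below builds such a command line in Python. `useUnicode` and `characterEncoding` are MySQL Connector/J connection properties; `with_utf8_params`, `build_sqoop_import`, and the placeholder host, table, and user are hypothetical, and whether this resolves the case above is an assumption that has not been tested here.

```python
import shlex


def with_utf8_params(jdbc_url: str) -> str:
    """Append Connector/J Unicode properties to a JDBC URL."""
    separator = "&" if "?" in jdbc_url else "?"
    return f"{jdbc_url}{separator}useUnicode=true&characterEncoding=UTF-8"


def build_sqoop_import(jdbc_url: str, table: str, username: str) -> list[str]:
    """Return the argument list for a JDBC-based (non-direct) Sqoop import."""
    return [
        "sqoop", "import",
        "--connect", with_utf8_params(jdbc_url),
        "--table", table,
        "--username", username,
        "-P",  # prompt for the password instead of putting it on the command line
    ]


# Placeholder values, reused from the command above for illustration only.
args = build_sqoop_import("jdbc:mysql://server.foo.com/db", "bar", "sqoop_user")
print(shlex.join(args))  # shell-quoted, so the & in the URL is safe to paste
```

Building the argument list rather than a single string keeps the `&` in the URL from being interpreted by the shell, which is an easy mistake when the URL is typed unquoted.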
