Let's say your data is in Hebrew or another non-Latin language, and you want to process it in Spark and store some of the results in MySQL. Cool... so you set the table character set and collation to UTF-8, either when creating the table or with ALTER if it already exists:
CREATE DATABASE name DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;
CREATE TABLE table_name (column_name column_type CHARACTER SET utf8 DEFAULT NULL,...)
ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
But that's not enough. You also need to set the MySQL JDBC client connection parameters,
either by concatenating the following to the URL:
.... ?useUnicode=true&characterEncoding=UTF-8
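As a rough sketch of the URL approach, a small helper can append the two parameters and pick the right separator depending on whether the URL already has a query string. The object name `JdbcUrl`, the method `withUtf8Params`, and the host and database in the usage line are made up for illustration; only the two parameters themselves come from this post.

```scala
// Hypothetical helper (not part of Spark or the MySQL driver):
// appends the encoding parameters to a MySQL JDBC URL.
object JdbcUrl {
  private val Utf8Params = "useUnicode=true&characterEncoding=UTF-8"

  def withUtf8Params(url: String): String = {
    // Start a query string with '?', continue an existing one with '&',
    // and add nothing if the URL already ends with a separator.
    val separator =
      if (!url.contains("?")) "?"
      else if (url.endsWith("?") || url.endsWith("&")) ""
      else "&"
    url + separator + Utf8Params
  }
}
```

For example, `JdbcUrl.withUtf8Params("jdbc:mysql://localhost:3306/mydb")` returns `jdbc:mysql://localhost:3306/mydb?useUnicode=true&characterEncoding=UTF-8`, where `localhost:3306/mydb` is a placeholder for your own server and database.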
Or by setting connection parameters for the dataframe we are going to write:
...
connProps.setProperty("characterEncoding", "UTF-8")
connProps.setProperty("useUnicode", "true")
resultsetDf.write.mode(saveMode).jdbc(mysqljdbcurl, tableName, connProps)
Good luck