Spark SQL can connect to a database over standard JDBC and use it as a data source.
package com.xx;

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

/**
 * Spark SQL JDBC example.
 * @author Chenj
 */
public class SparkSql {
    private static final String appName = "spark sql test";
    private static final String master = "spark://192.168.1.21:7077";
    private static final String JDBCURL =
            "jdbc:mysql://192.168.1.55/lng?useUnicode=true&characterEncoding=utf-8&user=root&password=123456";

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName(appName)
                .setMaster(master)
                .setSparkHome(System.getenv("SPARK_HOME"))
                .setJars(new String[]{System.getenv("JARS")});

        SparkContext sparkContext = new SparkContext(conf);
        SQLContext sqlContext = new SQLContext(sparkContext);

        // Load the tsys_user table over JDBC as a DataFrame.
        DataFrame user = sqlContext.jdbc(JDBCURL, "tsys_user");
        user.show();
    }
}
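Note that SQLContext.jdbc(url, table) was deprecated in Spark 1.4 in favor of the DataFrameReader API. Below is a minimal sketch of the same load using the newer API, assuming Spark 1.4+; the class name SparkSqlReader is illustrative, and the URL, table, and credentials are carried over from the example above:

import java.util.Properties;

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class SparkSqlReader {
    public static DataFrame loadUsers(SQLContext sqlContext) {
        // Credentials go in a Properties object instead of the URL query string.
        Properties props = new Properties();
        props.put("user", "root");
        props.put("password", "123456");

        // read().jdbc(url, table, props) replaces the deprecated SQLContext.jdbc().
        return sqlContext.read().jdbc(
                "jdbc:mysql://192.168.1.55/lng?useUnicode=true&characterEncoding=utf-8",
                "tsys_user",
                props);
    }
}

Keeping the password out of the URL also means it no longer appears in logged connection strings.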
First, upload the MySQL JDBC driver jar to the cluster.
Then submit the job with:
./spark-submit --driver-class-path ../lib/mysql-connector-java-5.1.36.jar --class com.xx.SparkSql --master spark://ser21:7077 /usr/local/spark-1.0-SNAPSHOT.jar
Here --driver-class-path is the path to the JDBC driver jar, which puts it on the driver's classpath so the MySQL driver class can be loaded. After the DataFrame is loaded, it can also be queried with SQL by registering it as a temporary table, as in the sketch below.
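A minimal sketch continuing the main example, assuming the Spark 1.3-era API used above; the helper class SparkSqlQuery, the temp table name "users", and the query itself are illustrative:

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class SparkSqlQuery {
    // `user` is the DataFrame loaded over JDBC in the main example.
    public static void queryUsers(SQLContext sqlContext, DataFrame user) {
        // Register the JDBC-backed DataFrame so SQL statements can reference it.
        user.registerTempTable("users");

        // Run an ordinary SQL query against the registered table.
        DataFrame firstTen = sqlContext.sql("SELECT * FROM users LIMIT 10");
        firstTen.show();
    }
}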
Original post: http://my.oschina.net/u/160697/blog/516300