How to Deploy and Use SparkSQL - 创新互联
This article walks through deploying SparkSQL against an existing Hive metastore and running a few simple queries, first from spark-shell and then from spark-sql.
1. Runtime Environment
- JDK: 1.8.0_45 (64-bit)
- hadoop-2.6.0-cdh6.7.0
- Scala: 2.11.8
- spark-2.3.1-bin-2.6.0-cdh6.7.0 (you need to build this yourself)
- hive-1.1.0-cdh6.7.0
- MySQL 5.6
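Before going further it can help to confirm that the tools listed above are actually reachable from the shell. The following is a minimal sketch, not part of the original setup: the function name `missing_tools` is hypothetical, and the command names in the example call are assumptions about how each component is installed on your machine.

```shell
# Print every command from the argument list that is not found on PATH.
# Returns 0 when all commands are present, 1 when at least one is missing.
# (missing_tools is a hypothetical helper introduced for this sketch.)
missing_tools() {
  mt_status=0
  for mt_tool in "$@"; do
    if ! command -v "$mt_tool" >/dev/null 2>&1; then
      echo "$mt_tool"
      mt_status=1
    fi
  done
  return "$mt_status"
}

# Example call (assumed command names; adjust to your installation):
#   missing_tools java hadoop scala hive mysql spark-shell spark-sql
```

The function only checks that a command exists; it does not compare versions, so the exact versions in the list still need to be checked by hand.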
2. Preparing to Run SparkSQL
# The Hive metadata is stored in MySQL, so start MySQL first
[root@hadoop001 ~]# su mysqladmin
[mysqladmin@hadoop001 root]$ cd ~
[mysqladmin@hadoop001 ~]$ service mysql start
Starting MySQL [ OK ]
# Start HDFS
[hadoop@hadoop001 sbin]$ ./start-dfs.sh
# Give SparkSQL access to the Hive metastore by copying hive-site.xml into Spark's conf directory
[hadoop@hadoop001 ~]$ cp $HIVE_HOME/conf/hive-site.xml $SPARK_HOME/conf/
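The copy step above can be wrapped in a small check so that a missing file is reported instead of silently producing a Spark setup with no metastore configuration. This is a sketch under assumptions: `install_hive_site` is a hypothetical helper name, and it assumes the MySQL-backed metastore is configured through the standard `javax.jdo.option.*` properties (for example `javax.jdo.option.ConnectionURL`) in hive-site.xml.

```shell
# Copy hive-site.xml from a Hive conf directory into a Spark conf directory,
# then list the javax.jdo.option.* property names found in the copy.
# (install_hive_site is a hypothetical helper introduced for this sketch.)
install_hive_site() {
  ihs_hive_conf="$1"
  ihs_spark_conf="$2"
  if [ ! -f "$ihs_hive_conf/hive-site.xml" ]; then
    echo "hive-site.xml not found in $ihs_hive_conf" >&2
    return 1
  fi
  cp "$ihs_hive_conf/hive-site.xml" "$ihs_spark_conf/" || return 1
  # An empty listing suggests the file does not configure a JDBC metastore.
  grep -o 'javax\.jdo\.option\.[A-Za-z]*' "$ihs_spark_conf/hive-site.xml" | sort -u
}

# Example call, using the same directories as the cp command above:
#   install_hive_site "$HIVE_HOME/conf" "$SPARK_HOME/conf"
```

Only property names are printed, not their values, so the database password in hive-site.xml does not end up in the terminal history.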
3. Starting SparkSQL
# Start via spark-shell:
[hadoop@hadoop001 bin]$ ./spark-shell --master local[2] \
--jars ~/software/mysql-connector-java-5.1.34-bin.jar
scala> spark.sql("use hive_data2").show(false)
scala> spark.sql("select * from emp").show(false)
+-----+------+---------+----+----------+-------+------+------+
|empno|ename |job      |mgr |hiredate  |salary |comm  |deptno|
+-----+------+---------+----+----------+-------+------+------+
|7369 |SMITH |CLERK    |7902|1980-12-17|800.0  |null  |20    |
|7499 |ALLEN |SALESMAN |7698|1981-2-20 |1600.0 |300.0 |30    |
|7521 |WARD  |SALESMAN |7698|1981-2-22 |1250.0 |500.0 |30    |
|7566 |JONES |MANAGER  |7839|1981-4-2  |2975.0 |null  |20    |
|7654 |MARTIN|SALESMAN |7698|1981-9-28 |1250.0 |1400.0|30    |
|7698 |BLAKE |MANAGER  |7839|1981-5-1  |2850.0 |null  |30    |
|7782 |CLARK |MANAGER  |7839|1981-6-9  |2450.0 |null  |10    |
|7788 |SCOTT |ANALYST  |7566|1987-4-19 |3000.0 |null  |20    |
|7839 |KING  |PRESIDENT|null|1981-11-17|5000.0 |null  |10    |
|7844 |TURNER|SALESMAN |7698|1981-9-8  |1500.0 |0.0   |30    |
|7876 |ADAMS |CLERK    |7788|1987-5-23 |1100.0 |null  |20    |
|7900 |JAMES |CLERK    |7698|1981-12-3 |950.0  |null  |30    |
|7902 |FORD  |ANALYST  |7566|1981-12-3 |3000.0 |null  |20    |
|7934 |MILLER|CLERK    |7782|1982-1-23 |1300.0 |null  |10    |
|8888 |HIVE  |PROGRAM  |7839|1988-1-23 |10300.0|null  |null  |
+-----+------+---------+----+----------+-------+------+------+
# Start via spark-sql:
[hadoop@hadoop001 bin]$ ./spark-sql --master local[2] \
--driver-class-path ~/software/mysql-connector-java-5.1.34-bin.jar
# Switch to the database
spark-sql> use hive_data2;
18/08/30 20:36:52 INFO HiveMetaStore: 0: get_database: hive_data2
18/08/30 20:36:52 INFO audit: ugi=hadoop ip=unknown-ip-addr cmd=get_database: hive_data2
Time taken: 0.114 seconds
# Query the data
spark-sql> select * from emp;
18/08/30 20:37:05 INFO DAGScheduler: Job 0 finished: processCmd at CliDriver.java:376, took 1.292944 s
7369	SMITH	CLERK	7902	1980-12-17	800.0	NULL	20
7499	ALLEN	SALESMAN	7698	1981-2-20	1600.0	300.0	30
7521	WARD	SALESMAN	7698	1981-2-22	1250.0	500.0	30
7566	JONES	MANAGER	7839	1981-4-2	2975.0	NULL	20
7654	MARTIN	SALESMAN	7698	1981-9-28	1250.0	1400.0	30
7698	BLAKE	MANAGER	7839	1981-5-1	2850.0	NULL	30
7782	CLARK	MANAGER	7839	1981-6-9	2450.0	NULL	10
7788	SCOTT	ANALYST	7566	1987-4-19	3000.0	NULL	20
7839	KING	PRESIDENT	NULL	1981-11-17	5000.0	NULL	10
7844	TURNER	SALESMAN	7698	1981-9-8	1500.0	0.0	30
7876	ADAMS	CLERK	7788	1987-5-23	1100.0	NULL	20
7900	JAMES	CLERK	7698	1981-12-3	950.0	NULL	30
7902	FORD	ANALYST	7566	1981-12-3	3000.0	NULL	20
7934	MILLER	CLERK	7782	1982-1-23	1300.0	NULL	10
8888	HIVE	PROGRAM	7839	1988-1-23	10300.0	NULL	NULL
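The interactive session above can also be run as a one-off command, which is convenient for scripts. The sketch below relies on the `-e` option of the spark-sql CLI to pass a statement on the command line. The function name `run_spark_sql` and the variables `SPARK_SQL_BIN` and `MYSQL_JAR` are hypothetical, introduced only for this example; the default jar path is the one used in the session above.

```shell
# Run a single SQL statement through spark-sql without opening the prompt.
# SPARK_SQL_BIN and MYSQL_JAR are hypothetical override variables for this sketch.
run_spark_sql() {
  rss_query="$1"
  # local[2] is quoted so the shell never treats the brackets as a glob pattern.
  "${SPARK_SQL_BIN:-spark-sql}" \
    --master 'local[2]' \
    --driver-class-path "${MYSQL_JAR:-$HOME/software/mysql-connector-java-5.1.34-bin.jar}" \
    -e "$rss_query"
}

# Example call; the table is qualified with its database, so no separate
# "use hive_data2" statement is needed:
#   run_spark_sql "select * from hive_data2.emp"
```

Setting `SPARK_SQL_BIN=echo` prints the arguments instead of launching Spark, which is a cheap way to check the command before running it for real.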
That covers deploying SparkSQL and running simple queries through both spark-shell and spark-sql. Hopefully the steps above serve as a useful reference.