Installing and Using HUE - 大道至简(老徐) - 博客园

Installing and Using HUE
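1. Introduction
HUE is an open-source web UI for Apache Hadoop. It was originally developed by Cloudera and later contributed to the open-source community. It is built on Django, the Python web framework. With Hue you can operate a Hadoop cluster from a browser, for example to put and get files or to run MapReduce jobs.
2. Installation
2.1 Install the third-party packages Hue depends on
# Install the XML package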
$>sudo yum install -y libxml2-devel.x86_64
# Install the other packages
$>sudo yum install -y libxslt-devel.x86_64 python-devel openldap-devel asciidoc cyrus-sasl-gssapi
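3. Configure Hue
Hue can connect to Hadoop, that is, access files in HDFS, in one of two ways.
WebHDFS
Provides high-speed data transfer; the client communicates directly with the DataNodes.
HttpFS
A proxy service that makes integration easier for systems outside the cluster. Note: in HA mode this is the only option that can be used.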
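Both mechanisms expose the same REST path, /webhdfs/v1; what differs is the host and port that answer. The following is a minimal sketch, not a definitive recipe. It assumes the Hadoop 2.x default NameNode HTTP port 50070 for WebHDFS (an assumption, this post does not state it), port 14000 for HttpFS, and a hypothetical helper named hdfs_rest_url.

```shell
# Hypothetical helper: build a WebHDFS-style REST URL.
# Arguments: host, port, HDFS path, operation, user name.
hdfs_rest_url() {
  printf 'http://%s:%s/webhdfs/v1%s?op=%s&user.name=%s\n' "$1" "$2" "$3" "$4" "$5"
}

# WebHDFS: the request goes to a NameNode (50070 is an assumed default port).
hdfs_rest_url s101 50070 /tmp LISTSTATUS centos

# HttpFS: the same REST call, sent to the proxy service on port 14000.
hdfs_rest_url s101 14000 /tmp LISTSTATUS centos

# To actually issue a request, pass the URL to curl, for example:
# curl -s "$(hdfs_rest_url s101 14000 /tmp LISTSTATUS centos)"
```

Because only the host and port change, a client written against one endpoint can be pointed at the other by swapping the URL prefix.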
3.1 Configure the Hadoop proxy user for Hue
[/soft/hadoop/etc/hadoop/core-site.xml]
Note: Hadoop proxy users are configured with properties of the form hadoop.proxyuser.${superuser}.hosts; here my superuser is centos.
<property>
<name>hadoop.proxyuser.centos.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.centos.groups</name>
<value>*</value>
</property>
[/soft/hadoop/etc/hadoop/hdfs-site.xml]
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
[/soft/hadoop/etc/hadoop/httpfs-site.xml]
<property>
<name>httpfs.proxyuser.centos.hosts</name>
<value>*</value>
</property>
<property>
<name>httpfs.proxyuser.centos.groups</name>
<value>*</value>
</property>
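Before distributing the files, it can help to confirm that both the hosts and the groups property are present for the proxy user. A small sketch, assuming each name element sits on one line without inner whitespace (as in the snippets above); check_proxyuser is a hypothetical helper, not part of Hadoop.

```shell
# Hypothetical helper: succeed only if FILE defines both
# PREFIX.proxyuser.USER.hosts and PREFIX.proxyuser.USER.groups.
# Arguments: file, prefix (hadoop or httpfs), user.
check_proxyuser() {
  for suffix in hosts groups; do
    # -F treats the pattern as a fixed string, so the dots are literal.
    if ! grep -qF "<name>$2.proxyuser.$3.$suffix</name>" "$1"; then
      echo "missing $2.proxyuser.$3.$suffix in $1" >&2
      return 1
    fi
  done
}

# Example calls, using the paths and user from this post:
# check_proxyuser /soft/hadoop/etc/hadoop/core-site.xml   hadoop centos
# check_proxyuser /soft/hadoop/etc/hadoop/httpfs-site.xml httpfs centos
```

The check only looks at property names; it does not validate the values or the XML structure.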
Distribute the configuration files
$>cd /soft/hadoop/etc/hadoop
$>xsync.sh core-site.xml
$>xsync.sh hdfs-site.xml
$>xsync.sh httpfs-site.xml
3.2 Restart the Hadoop and YARN processes
$>stop-dfs.sh
$>stop-yarn.sh
$>start-dfs.sh
$>start-yarn.sh
3.3 Start the HttpFS process
3.3.1 Start the process
$>/soft/hadoop/sbin/httpfs.sh start
3.3.2 Check port 14000
$>netstat -anop |grep 14000
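A plain grep for 14000 also matches lines where that number shows up as a PID or as a remote port. A stricter check is sketched below. It assumes the Linux net-tools column layout, where the local address is field 4 and the state is field 6; port_listening is a hypothetical helper.

```shell
# Hypothetical helper: read netstat output on stdin and succeed only if
# some socket is in LISTEN state on the given local port.
port_listening() {
  awk -v port="$1" '
    $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit !found }
  '
}

# Example:
# netstat -ant | port_listening 14000 && echo "HttpFS is listening"
```

The end-of-field anchor keeps 14000 from matching a longer port number that merely starts with the same digits.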
3.4 Edit the Hue configuration file
Here the Hadoop NameNode runs in HA mode, so Hue can only access HDFS files through HttpFS. Note that webhdfs_url points at port 14000, as shown below.
[/home/centos/hue-3.12.0/desktop/conf/hue.ini]
...
[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://mycluster:8020
# NameNode logical name.
logical_name=mycluster
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
webhdfs_url=http://s101:14000/webhdfs/v1
# Change this if your HDFS cluster is Kerberos-secured
## security_enabled=false
# In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
# have to be verified against certificate authority
## ssl_cert_ca_verify=True
# Directory of the Hadoop configuration
hadoop_conf_dir=/soft/hadoop/etc/hadoop
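To confirm that the configured endpoint really answers, the webhdfs_url value can be read back from hue.ini and probed with curl. This is only a sketch: hue_ini_value is a hypothetical helper that ignores section nesting and simply prints the first uncommented occurrence of a key.

```shell
# Hypothetical helper: print the first active (uncommented) value of KEY
# in a hue.ini file. Arguments: key, file.
hue_ini_value() {
  sed -n "s/^[[:space:]]*$1=//p" "$2" | head -n 1
}

# Example, using the path and user from this post:
# url=$(hue_ini_value webhdfs_url /home/centos/hue-3.12.0/desktop/conf/hue.ini)
# curl -s "$url/?op=GETHOMEDIRECTORY&user.name=centos"
```

If HttpFS and the proxy user are set up correctly, the request should answer with a small JSON document naming the user's home directory.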
3.5 Use MySQL as the Hue database
...
[[database]]
# Database engine is typically one of:
# postgresql_psycopg2, mysql, sqlite3 or oracle.
# Note that for sqlite3, 'name', below is a path to the filename. For other backends, it is the database name
# Note for Oracle, options={"threaded":true} must be set in order to avoid crashes.
# Note for Oracle, you can use the Oracle Service Name by setting "host=" and "port=" and then "name=<host>:<port>/<service_name>".
# Note for MariaDB use the 'mysql' engine.
engine=mysql
host=192.168.231.1
port=3306
user=root
password=root
# Execute this script to produce the database password. This will be used when 'password' is not set.
## password_script=/path/script
name=hue
## options={}
# Database schema, to be used only when public schema is revoked in postgres
## schema=
4. Initialize the MySQL database and create the tables
4.1 Create the hue database
Because hue.ini names the database hue, that database must be created first.
mysql>create database hue ;
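4.2 Initialize the tables
This step creates the tables and inserts some initial data. The tables are initialized with the hue syncdb command, which prompts for a username and password while it runs. For example:
# Synchronize the database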
$>~/hue-3.12.0/build/env/bin/hue syncdb
# Import data, mainly the tables needed by oozie, pig, and desktop
$>~/hue-3.12.0/build/env/bin/hue migrate
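4.3 Check that the tables were created in MySQL
Check whether the required tables were created in MySQL, as in the screenshot below:
mysql>use hue ;
mysql>show tables ;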
5. Start the Hue process
$>~/hue-3.12.0/build/env/bin/supervisor
The startup process is shown in the figure below:
6. Check the web UI
http://s101:8888/
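Open the login page and sign in with the account created earlier.
7. Access HDFS
Click the HDFS link in the upper-right corner to open the HDFS file browser.
8. Configure the ResourceManager
8.1 Edit the hue.ini configuration file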
[[yarn_clusters]]
...
# [[[ha]]]
# Resource Manager logical name (required for HA)
logical_name=cluster1
# Un-comment to enable
## submit_to=True
# URL of the ResourceManager API
resourcemanager_api_url=http://s101:8088
8.2 View job execution status
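The same job information is also available directly from the ResourceManager REST API at the resourcemanager_api_url configured above, which is handy for checking that address before looking in Hue. A rough sketch, where rm_apps_url is a hypothetical helper:

```shell
# Hypothetical helper: URL of the ResourceManager "apps" REST resource,
# optionally filtered by application state (for example RUNNING or FINISHED).
# Arguments: ResourceManager base URL, optional state.
rm_apps_url() {
  if [ -n "${2:-}" ]; then
    printf '%s/ws/v1/cluster/apps?states=%s\n' "$1" "$2"
  else
    printf '%s/ws/v1/cluster/apps\n' "$1"
  fi
}

# Example: list the running applications as JSON.
# curl -s -H 'Accept: application/json' "$(rm_apps_url http://s101:8088 RUNNING)"
```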
9. Configure Hive
9.1 Edit the hue.ini file
[beeswax]
# Host where HiveServer2 is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
hive_server_host=s101
# Port where HiveServer2 Thrift server runs on.
hive_server_port=10000
# Hive configuration directory, where hive-site.xml is located
hive_conf_dir=/soft/hive/conf
9.2 Install the dependency packages
Without the following packages, Hue fails with a SASL-related error claiming that HiveServer2 is not running.
$>sudo yum install -y cyrus-sasl-plain cyrus-sasl-devel cyrus-sasl-gssapi
9.3 Start the HiveServer2 server
$>/soft/hive/bin/hiveserver2
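9.4 Check the web UI
10. Configure HBase
10.1 Edit the hue.ini configuration file
The HBase setting is the address of the Thrift server, not of the master, and it must be wrapped in parentheses. The Thrift server has to be started separately.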
[hbase]
# Comma-separated list of HBase Thrift servers for clusters in the format of '(name|host:port)'.
# Use full hostname with security.
# If using Kerberos we assume GSSAPI SASL, not PLAIN.
hbase_clusters=(s101:9090)
# HBase configuration directory, where hbase-site.xml is located.
hbase_conf_dir=/soft/hbase/conf
10.2 Start the Thrift server
Note: the service name used to start the Thrift server is thrift. Remember: some documents say thrift2, but here it is thrift.
$>hbase-daemon.sh start thrift
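10.3 Check port 9090
10.4 View HBase in Hue
11. Configure Spark
11.1 Introduction
Hue integrates with Spark through Livy server, which plays a role similar to HiveServer2. It exposes a RESTful service that accepts HTTP requests from clients and forwards them to the Spark cluster. Livy server is not part of the Spark distribution and must be downloaded separately.
Note: in Hue, Scala or Python programs are written in the notebook. For the notebook to work, the Hadoop HttpFS process must be running. Do not forget this!
Download a reasonably recent version, otherwise some classes will not be found. The download URL is: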
http://mirrors.tuna.tsinghua.edu.cn/apache/incubator/livy/0.5.0-incubating/livy-0.5.0-incubating-bin.zip
11.2 Unpack
$>unzip livy-0.5.0-incubating-bin.zip -d /soft/
11.3 Start the Livy server
$>/soft/livy-0.5.0-incubating-bin/bin/livy-server
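Once the server is running, its REST interface can be exercised with curl, independently of Hue. The sketch below assumes Livy listens on s101 at 8998, its default port. The helper names are hypothetical, and the JSON bodies are built naively, so the code passed in must not contain double quotes or backslashes.

```shell
# Assumed Livy address; override by exporting LIVY_URL.
LIVY_URL=${LIVY_URL:-http://s101:8998}

# JSON body asking Livy for an interactive session of the given kind
# (spark for Scala, pyspark for Python).
livy_session_body() {
  printf '{"kind": "%s"}' "$1"
}

# JSON body for one statement. Naive quoting: no quotes or backslashes.
livy_statement_body() {
  printf '{"code": "%s"}' "$1"
}

# POST /sessions: create a session. Argument: session kind.
livy_create_session() {
  curl -s -X POST -H 'Content-Type: application/json' \
    -d "$(livy_session_body "$1")" "$LIVY_URL/sessions"
}

# POST /sessions/ID/statements: run code. Arguments: session id, code.
livy_run() {
  curl -s -X POST -H 'Content-Type: application/json' \
    -d "$(livy_statement_body "$2")" "$LIVY_URL/sessions/$1/statements"
}

# Example:
# livy_create_session spark   # the JSON response contains the session id
# livy_run 0 '1 + 1'          # once GET $LIVY_URL/sessions/0 reports state "idle"
```

If these calls work from the shell but the Hue notebook does not, the problem is more likely in the [spark] section of hue.ini or in HttpFS than in Livy itself.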
11.4 Configure Hue
Running jobs in local or yarn mode is recommended; here we configure spark://s101:7077.
[spark]
# Host address of the Livy Server.
livy_server_host=s101
# Port of the Livy Server.
livy_server_port=8998
# Configure Livy to start in local 'process' mode, or 'yarn' workers.
livy_server_session_kind=spark://s101:7077
11.5 Write a Scala program in the notebook
posted @ 2018-09-04 12:11 大道至简(老徐)