- Deploy by following the two documents below:
- https://blog.csdn.net/qq_35745940/article/details/131265747
- https://dolphinscheduler.apache.org/zh-cn/docs/3.1.7/guide/installation/pseudo-cluster
- Open http://hadoopm01:12345/dolphinscheduler/ui in a browser to log in to the web UI. The default username/password is admin/dolphinscheduler123
- Cluster nodes: hadoopm01 / hadoopm02
- Environment config directory: /home/dolphinscheduler/dolphinscheduler/bin/env
- The password for the Linux user dolphinscheduler is dolphinscheduler
- When adding an Oracle data source in DolphinScheduler, an extra Oracle JDBC driver (ojdbc8.jar) is required. On every machine, place it in the libs folder of each of the four service directories under /home/dolphinscheduler/dolphinscheduler/: alert-server, api-server, master-server, and worker-server (a sketch follows)
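- A minimal sketch of the copy, assuming the driver has already been downloaded to /opt/ojdbc8.jar (hypothetical path); repeat on every machine:
```bash
# Distribute ojdbc8.jar into all four services' libs directories on this node.
# /opt/ojdbc8.jar is an assumed download location.
for svc in alert-server api-server master-server worker-server; do
  cp /opt/ojdbc8.jar "/home/dolphinscheduler/dolphinscheduler/${svc}/libs/"
  chown dolphinscheduler:dolphinscheduler "/home/dolphinscheduler/dolphinscheduler/${svc}/libs/ojdbc8.jar"
done
```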
- DolphinScheduler Kerberos integration (not yet finished)
- https://blog.csdn.net/m0_37759590/article/details/131338837
- https://blog.csdn.net/SmellyKitty/article/details/128792355?utm_medium=distribute.pc_relevant.none-task-blog-2~default~baidujs_baidulandingword~default-1-128792355-blog-131338837.235^v38^pc_relevant_sort_base2&spm=1001.2101.3001.4242.1&utm_relevant_index=4
- https://blog.csdn.net/summer089089/article/details/107369994?spm=1001.2101.3001.6650.2&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7EBlogCommendFromBaidu%7ERate-2-107369994-blog-128792355.235%5Ev38%5Epc_relevant_sort_base2&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7EBlogCommendFromBaidu%7ERate-2-107369994-blog-128792355.235%5Ev38%5Epc_relevant_sort_base2&utm_relevant_index=3
- (Already tried this approach; it does not work) https://blog.csdn.net/qq_18453581/article/details/129377982?utm_medium=distribute.pc_relevant.none-task-blog-2~default~baidujs_baidulandingword~default-2-129377982-blog-131338837.235^v38^pc_relevant_sort_base2&spm=1001.2101.3001.4242.2&utm_relevant_index=5
- Fuda@2023
- Problem: DolphinScheduler reports errors when connecting to Hive with Kerberos authentication enabled:
- Option 1: add {"principal":"hive/_HOST@HADOOP.COM"} directly to the JSON connection-parameter string. No effect.
- Changing it to {"principal":"hive/hadoopm02@HADOOP.COM"} has no effect either.
- Copying all of CDH's hive-shims-*.jar packages into the dolphinscheduler folders has no effect either:
```bash
# Switch to root (password: Inspur@123)
su root
# Copy the CDH hive-shims jars into each service's libs directory
cp /opt/cloudera/parcels/CDH/jars/hive-shims-* /home/dolphinscheduler/dolphinscheduler/master-server/libs/
cp /opt/cloudera/parcels/CDH/jars/hive-shims-* /home/dolphinscheduler/dolphinscheduler/worker-server/libs/
cp /opt/cloudera/parcels/CDH/jars/hive-shims-* /home/dolphinscheduler/dolphinscheduler/api-server/libs/
# Hand ownership of the copied jars to the dolphinscheduler user
cd /home/dolphinscheduler/dolphinscheduler/master-server/libs
chown dolphinscheduler:dolphinscheduler hive-shims-*
cd /home/dolphinscheduler/dolphinscheduler/worker-server/libs
chown dolphinscheduler:dolphinscheduler hive-shims-*
cd /home/dolphinscheduler/dolphinscheduler/api-server/libs
chown dolphinscheduler:dolphinscheduler hive-shims-*
```
- Option 2: generate hdfs.keytab and configure common.properties
- Generate hdfs.keytab at the target path:
```bash
kadmin.local -q "xst -k /opt/hdfs.keytab hdfs/hdfs@HADOOP.COM"
```
- Distribute it to the /opt path on every machine (a sketch follows)
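- A minimal distribution sketch, assuming passwordless SSH as root between the nodes:
```bash
# Push the keytab to /opt on every node and make it readable by the
# dolphinscheduler user (host list assumed from the notes above).
for host in hadoopm01 hadoopm02; do
  scp /opt/hdfs.keytab "root@${host}:/opt/hdfs.keytab"
  ssh "root@${host}" "chown dolphinscheduler:dolphinscheduler /opt/hdfs.keytab && chmod 400 /opt/hdfs.keytab"
done
```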
- Edit the following part of common.properties:
```properties
# if resource.storage.type=HDFS, the user must have the permission to create directories under the HDFS root path
resource.hdfs.root.user=hdfs
# if resource.storage.type=S3, the value like: s3a://dolphinscheduler; if resource.storage.type=HDFS and namenode HA is enabled, you need to copy core-site.xml and hdfs-site.xml to conf dir
resource.hdfs.fs.defaultFS=hdfs://mycluster:8020
# whether to startup kerberos
hadoop.security.authentication.startup.state=true
# java.security.krb5.conf path
java.security.krb5.conf.path=/etc/krb5.conf
# login user from keytab username
login.user.keytab.username=hdfs/hdfs@HADOOP.COM
# login user from keytab path
login.user.keytab.path=/opt/hdfs.keytab
# kerberos expire time, the unit is hour
kerberos.expire.time=2
```
- Restart DolphinScheduler
- As the dolphinscheduler user, enter the bin directory: cd /home/dolphinscheduler/dolphinscheduler/bin
- Stop all DolphinScheduler services: ./stop-all.sh
- Start the DolphinScheduler services again: ./start-all.sh
- On each machine, complete HDFS authentication individually: kinit -kt /opt/hdfs.keytab hdfs/hdfs (verification sketch below)
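- A minimal sketch to confirm the keytab and ticket actually work on each node before retesting from the UI:
```bash
# Obtain a ticket from the keytab, then verify it end to end.
kinit -kt /opt/hdfs.keytab hdfs/hdfs@HADOOP.COM
klist            # should show a valid TGT for hdfs/hdfs@HADOOP.COM
hdfs dfs -ls /   # should list the HDFS root without a GSS/Kerberos error
```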
- After the restart, adding {"principal":"hive/_HOST@HADOOP.COM"} to the JSON string still has no effect
- Try placing hdfs-site.xml, core-site.xml, etc. into the dolphinscheduler folders (all tried already)
- Found that api-server, worker-server, and master-server each have their own common.properties file. Copy the edited file to all of them and try again (sketch below)
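- A minimal sync sketch, assuming the edited files are staged under /opt (hypothetical location):
```bash
# Copy the edited common.properties (plus core-site.xml and hdfs-site.xml,
# per the note above) into each service's own conf directory.
DS_HOME=/home/dolphinscheduler/dolphinscheduler
for svc in api-server worker-server master-server; do
  cp /opt/common.properties /opt/core-site.xml /opt/hdfs-site.xml "${DS_HOME}/${svc}/conf/"
  chown dolphinscheduler:dolphinscheduler "${DS_HOME}/${svc}/conf/common.properties" \
       "${DS_HOME}/${svc}/conf/core-site.xml" "${DS_HOME}/${svc}/conf/hdfs-site.xml"
done
```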
- Try reading the Hive data directly with DataX as a test to isolate the problem (see the sketch after the links below)
- https://betheme.net/houduan/72132.html?action=onClick
- https://blog.csdn.net/silentwolfyh/article/details/72852800
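- A minimal test sketch: a DataX job that reads a Hive table's underlying files via hdfsreader and prints them with streamwriter, exercising Kerberos outside DolphinScheduler. The table path, defaultFS, and the /opt/datax install location are assumptions; haveKerberos, kerberosKeytabFilePath, and kerberosPrincipal are hdfsreader's Kerberos options:
```bash
# Write a throwaway DataX job, then run it.
cat > /tmp/hive_read_test.json <<'EOF'
{
  "job": {
    "setting": { "speed": { "channel": 1 } },
    "content": [{
      "reader": {
        "name": "hdfsreader",
        "parameter": {
          "defaultFS": "hdfs://mycluster:8020",
          "path": "/user/hive/warehouse/test.db/demo_table/*",
          "fileType": "text",
          "fieldDelimiter": "\u0001",
          "column": ["*"],
          "haveKerberos": true,
          "kerberosKeytabFilePath": "/opt/hdfs.keytab",
          "kerberosPrincipal": "hdfs/hdfs@HADOOP.COM"
        }
      },
      "writer": { "name": "streamwriter", "parameter": { "print": true } }
    }]
  }
}
EOF
python /opt/datax/bin/datax.py /tmp/hive_read_test.json
```
If this succeeds, the keytab and HDFS access are fine and the problem is scoped to DolphinScheduler's Hive datasource handling.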
- Errors after reinstalling:
- 1: Timezone issue with DolphinScheduler task run times
- 2: DolphinScheduler errors when running Hive SQL statements (perhaps because YARN parameters are not configured in DolphinScheduler)
- 3: Data quality module
- 4: DolphinScheduler DataX data-collection configuration module
- 5: Scheduling Spark programs with DolphinScheduler