1. hive模糊搜索表:show tables like ‘*name*’;
  2. 查看表结构信息:desc table_name;
  3. 查看分区信息:show partitions table_name;
  4. 加载本地文件:load data local inpath ‘/xxx/test.txt’ overwrite into table dm.table_name;
  5. 从查询语句给table插入数据:insert overwrite table table_name partition(dt) select * from table_name;
  6. 导出数据到本地系统:insert overwrite local directory ‘/tmp/text’ select a.* from table_name a order by 1;
    1. 创建表时指定的一些属性:
    1. 字段分隔符:row format delimited fields terminated by ‘\t’
    1. 行分隔符:row format delimited lines terminated by ‘\n’
    1. 文件格式为文本型存储:stored as textfile
    1. 命令行操作:hive -e ‘select table_cloum from table’执行一个查询,在终端上显示mapreduce的进度,执行完毕后,最后把查询结果输出到终端上,接着hive进程退出,不会进入交互模式
    1. hive -S -e ‘select table_cloum from table’ -S,终端上的输出不会有mapreduce的进度,执行完毕,只会把查询结果输出到终端上。
  7. hive修改表名:alter table old_table_name rename to new_table_name;
  8. hive复制表结构:create table new_table_name like table_name;
  9. hive添加字段:alter table table_name add columns(columns_values bigint comment ‘comm_text’);
  10. hive修改字段:alter table table_name change old_column new_column string comment ‘comm_text’;
  11. 删除分区:alter table table_name drop partition(dt=’2021-11-30′);
  12. 添加分区:alter table table_name add partition (dt=’2021-11-30′);
  13. 删除空数据库:drop database myhive2;
  14. 强制删除数据库:drop database myhive2 cascade;
  15. 删除表:drop table score5;
  16. 清空表:truncate table score6;
  17. 向hive表中加载数据
    1. 直接向分区表中插入数据:insert into table score partition(month =’202107′) values (‘001′,’002′,’100’);
    1. 通过load方式加载数据:load data local inpath ‘/export/servers/hivedatas/score.csv’ overwrite into table score partition(month=’201806′);
    1. 通过查询方式加载数据:insert overwrite table score2 partition(month = ‘202106’) select s_id,c_id,s_score from score1;
    1. 查询语句中创建表并加载数据:create table score2 as select * from score1;
    1. 在创建表是通过location指定加载数据的路径:create external table score6 (s_id string,c_id string,s_score int) row format delimited fields terminated by ‘,’ location ‘/myscore’;
    1. export导出与import 导入 hive表数据(内部表操作):
    1. create table techer2 like techer; –依据已有表结构创建表
    1. export table techer to ‘/export/techer’;
    1. import table techer2 from ‘/export/techer’;
  18. hive表中数据导出
    1. insert导出
    1. 将查询的结果导出到本地:insert overwrite local directory ‘/export/servers/exporthive’ select * from score;
    1. 将查询的结果格式化导出到本地:insert overwrite local directory ‘/export/servers/exporthive’ row format delimited fields terminated by ‘\t’ collection items terminated by ‘#’ select * from student;
    1. 将查询的结果导出到HDFS上(没有local):insert overwrite directory ‘/export/servers/exporthive’ row format delimited fields terminated by ‘\t’ collection items terminated by ‘#’ select * from score;
    1. Hadoop命令导出到本地:dfs -get /export/servers/exporthive/000000_0 /export/servers/exporthive/local.txt;
    1. hive shell 命令导出
    1. 基本语法:(hive -f/-e 执行语句或者脚本 > file)hive -e “select * from myhive.score;” > /export/servers/exporthive/score.txt
    1. hive -f export.sh > /export/servers/exporthive/score.txt
    1. export导出到HDFS上:export table score to ‘/export/exporthive/score’;

3.1.4 Hive查询语句

  1. GROUP BY 分组:select s_id ,avg(s_score) avgscore from score group by s_id having avgscore > 85; 对分组后的数据进行筛选,使用 having
  2. join 连接:inner join 内连接;left join 左连接;right join 右链接;full join 全外链接。
  3. order by 排序:ASC(ascend): 升序(默认) DESC(descend): 降序
  4. sort by 局部排序:每个MapReduce内部进行排序,对全局结果集来说不是排序。
  5. distribute by 分区排序:类似MR中partition,进行分区,结合sort by使用

作者 张, 宴银

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注