-
[Hive Metastore] DBS 테이블 알아보기Hive 2024. 10. 1. 15:33반응형
- 목차
Hive Metastore Database 조회/생성 .
아래의 명령어들을 통해서 Hive Metastore 데이터베이스을 조회 및 생성할 수 있습니다.
ANSI SQL 의 Database 관련 쿼리와 동일하게 사용하실 수 있습니다.
spark.sql("show databases").show()spark.sql("CREATE DATABASE my_db LOCATION 'hdfs://namenode:8020/data/my_db';")저는 Hive Metastore 의 외부 데이터베이스로 MySQL 을 사용하였고, 아래와 같은 순서로 Spark - Hive - MySQL 이 연결되게 됩니다.
Spark -> Hive -> MySQL실제 MySQL 내부에서 databases 목록을 조회하게 되면, 아래와 같은 구조를 확인할 수 있습니다.
저는 metastore 라는 MySQL 데이터베이스를 사전에 생성해두었고,
show databases; +--------------------+ | Database | +--------------------+ | information_schema | | metastore | | mysql | | performance_schema | | sys | +--------------------+ 5 rows in set (0.00 sec)Hive 의 schematool 초기화를 통해서 아래와 같은 테이블 목록들을 생성하였습니다.
show tables; +-------------------------------+ | Tables_in_metastore | +-------------------------------+ | AUX_TABLE | | BUCKETING_COLS | | CDS | | COLUMNS_V2 | | COMPACTION_METRICS_CACHE | | COMPACTION_QUEUE | | COMPLETED_COMPACTIONS | | COMPLETED_TXN_COMPONENTS | | CTLGS | | DATABASE_PARAMS | | DATACONNECTORS | | DATACONNECTOR_PARAMS | | DBS | | DB_PRIVS | | DC_PRIVS | // .. 생략또한 DBS 데이터베이스는 생성된 Hive Database 의 메타데이터들이 보관되는 아래와 같이 저장됩니다.
select * from DBS\G; *************************** 1. row *************************** DB_ID: 1 DESC: Default Hive database DB_LOCATION_URI: file:/user/hive/warehouse NAME: default OWNER_NAME: public OWNER_TYPE: ROLE CTLG_NAME: hive CREATE_TIME: 1617268874 DB_MANAGED_LOCATION_URI: NULL TYPE: NATIVE DATACONNECTOR_NAME: NULL REMOTE_DBNAME: NULL *************************** 2. row *************************** DB_ID: 2 DESC: DB_LOCATION_URI: hdfs://namenode:8020/data/my_db NAME: my_db OWNER_NAME: spark OWNER_TYPE: USER CTLG_NAME: hive CREATE_TIME: 1617268955 DB_MANAGED_LOCATION_URI: NULL TYPE: NATIVE DATACONNECTOR_NAME: NULL REMOTE_DBNAME: NULL 2 rows in set (0.00 sec)유용한 명령어.
MySQL Driver 을 가지는 Hive Docker Image 만들기.
mkdir -p ./dockers cat <<EOF> ./dockers/hive-docker FROM apache/hive:4.0.1 USER root RUN apt-get update && apt-get install -y wget RUN wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.30/mysql-connector-java-8.0.30.jar \ -O /opt/hive/lib/mysql-connector-java-8.0.30.jar USER hive EOF docker build -t hive-mysql:4.0.1 -f ./dockers/hive-docker .MySQL & Hive Metastore 실행.
docker network create hive cat <<EOF> user.sql CREATE DATABASE metastore; CREATE USER 'hive'@'%' IDENTIFIED BY 'hive123'; GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%'; FLUSH PRIVILEGES; EOF docker run -d --network hive --name mysql --hostname mysql -e MYSQL_ROOT_PASSWORD=1234 -v ./user.sql:/docker-entrypoint-initdb.d/user.sql mysql:8.0.30 mkdir -p $(pwd)/conf cat <<EOF> $(pwd)/conf/hive-site.xml <configuration> <property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:mysql://mysql:3306/metastore?createDatabaseIfNotExist=true</value> </property> <property> <name>javax.jdo.option.ConnectionDriverName</name> <value>com.mysql.cj.jdbc.Driver</value> </property> <property> <name>javax.jdo.option.ConnectionUserName</name> <value>hive</value> </property> <property> <name>javax.jdo.option.ConnectionPassword</name> <value>hive123</value> </property> <property> <name>hive.metastore.uris</name> <value>thrift://hive:9083</value> </property> <property> <name>hive.metastore.authorization.storage.checks</name> <value>true</value> </property> </configuration> EOF docker run -it --rm --network hive \ -v $(pwd)/conf/hive-site.xml:/opt/hive/conf/hive-site.xml \ --entrypoint /bin/bash \ hive-mysql:4.0.1 -c "schematool -dbType mysql -initSchema --verbose" docker run -d --env SERVICE_NAME=metastore --name hive \ --network hive --hostname hive \ -v $(pwd)/conf/hive-site.xml:/opt/hive/conf/hive-site.xml \ hive-mysql:4.0.1 hive --service metastoreHadoop 실행.
mkdir -p ./conf cat <<EOF> ./conf/core-site.xml <configuration> <property> <name>fs.defaultFS</name> <value>hdfs://namenode:8020</value> <description>HDFS NameNode 주소 설정</description> </property> <property> <name>hadoop.http.staticuser.user</name> <value>hadoop</value> </property> </configuration> EOF cat <<EOF> ./conf/hdfs-site.xml <configuration> <property> <name>dfs.namenode.name.dir</name> <value>file:///data/hadoop/hdfs/namenode</value> <description>Block Metadata files</description> </property> <property> <name>dfs.datanode.data.dir</name> <value>file:///data/hadoop/hdfs/datanode</value> <description>Block files</description> </property> <property> <name>dfs.permissions.enabled</name> <value>false</value> </property> </configuration> EOF docker run -d --network hive --name namenode --hostname namenode \ -v ./conf/core-site.xml:/opt/hadoop/etc/hadoop/core-site.xml \ -v ./conf/hdfs-site.xml:/opt/hadoop/etc/hadoop/hdfs-site.xml \ -p 9870:9870 \ apache/hadoop:3.4.1 \ sh -c '[ ! -d /data/hadoop/hdfs/namenode ] && echo start namenode formating && hdfs namenode -format; hdfs namenode' docker run -d --network hive --name datanode1 --hostname datanode1 \ -v ./conf/core-site.xml:/opt/hadoop/etc/hadoop/core-site.xml \ -v ./conf/hdfs-site.xml:/opt/hadoop/etc/hadoop/hdfs-site.xml \ apache/hadoop:3.4.1 hdfs datanodespark user 권한 설정 및 디렉토리 생성.
docker run --platform linux/amd64 -it --rm --network hive --user hadoop apache/hadoop:3.4.1 sh -c ' whoami && \ hdfs dfs -mkdir -p hdfs://namenode:8020/data/my_db && \ hdfs dfs -chown spark:spark hdfs://namenode:8020/data/my_db 'Spark 실행.
docker run -d --name master --hostname master --network hive -p 8080:8080 -p 7077:7077 bitnami/spark:3.3.2 start-master.sh docker run -d --name worker1 --hostname worker1 --cpus 1 --memory 512M --network hive \ -v ./conf/hive-site.xml:/opt/bitnami/spark/conf/hive-site.xml \ bitnami/spark:3.3.2 start-worker.sh spark://master:7077 docker run -d --name worker2 --hostname worker2 --cpus 1 --memory 512M --network hive \ -v ./conf/hive-site.xml:/opt/bitnami/spark/conf/hive-site.xml \ bitnami/spark:3.3.2 start-worker.sh spark://master:7077 docker run -it --rm --name client --network hive -p 4040:4040 \ -v ./conf/hive-site.xml:/opt/bitnami/spark/conf/hive-site.xml \ bitnami/spark:3.3.2 spark-shell --master spark://master:7077 --conf spark.ui.port=4040반응형'Hive' 카테고리의 다른 글
[Hive] hive.metastore.warehouse.dir 알아보기 (0) 2024.11.18 [Hive] Remote Metastore Database 구성하기 (0) 2024.11.18 [Hive Metastore] hive.metastore.uris 알아보기 (0) 2024.02.04 Hive & Hadoop 연결하기 (0) 2023.12.17 Hive MySQL Metastore 알아보기 (0) 2023.12.17