# CREATE-REPOSITORY
## Name
CREATE REPOSITORY
## Description
This statement is used to create a repository. Repositories are used for backup and restore operations. Only root or superuser users can create repositories.
Syntax:

```sql
CREATE [READ ONLY] REPOSITORY `repo_name`
WITH [BROKER `broker_name`|S3|hdfs]
ON LOCATION `repo_location`
PROPERTIES ("key"="value", ...);
```
Notes:
- A repository can be created via an existing broker, by accessing cloud storage directly through the AWS S3 protocol, or by accessing HDFS directly.
- If the repository is read-only, only restore operations can be performed on it; otherwise, both backup and restore operations are available.
- PROPERTIES differ depending on whether a broker, S3, or HDFS is used; see the examples for details.
- ON LOCATION: for S3, this is followed by the bucket name.
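Once created, a repository is referenced by BACKUP and RESTORE statements, which is where the read-only distinction matters. A minimal sketch (the database, table, snapshot label, and timestamp below are hypothetical):

```sql
-- Back up a table into the repository (requires a writable repository).
BACKUP SNAPSHOT example_db.snapshot_label1
TO `example_repo`
ON (example_tbl);

-- Restore from the repository (allowed on read-only repositories as well).
RESTORE SNAPSHOT example_db.snapshot_label1
FROM `example_repo`
ON (example_tbl)
PROPERTIES ("backup_timestamp" = "2022-04-08-15-52-29");
```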
## Example
- Create a repository named bos_repo, relying on the BOS broker "bos_broker", with data root directory bos://palo_backup:

```sql
CREATE REPOSITORY `bos_repo`
WITH BROKER `bos_broker`
ON LOCATION "bos://palo_backup"
PROPERTIES
(
    "bos_endpoint" = "http://gz.bcebos.com",
    "bos_accesskey" = "bos_accesskey",
    "bos_secret_accesskey" = "bos_secret_accesskey"
);
```
- Create the same repository as Example 1, but read-only:

```sql
CREATE READ ONLY REPOSITORY `bos_repo`
WITH BROKER `bos_broker`
ON LOCATION "bos://palo_backup"
PROPERTIES
(
    "bos_endpoint" = "http://gz.bcebos.com",
    "bos_accesskey" = "bos_accesskey",
    "bos_secret_accesskey" = "bos_secret_accesskey"
);
```
- Create a repository named hdfs_repo, relying on the Baidu HDFS broker "hdfs_broker", with data root directory hdfs://hadoop-name-node:54310/path/to/repo/:

```sql
CREATE REPOSITORY `hdfs_repo`
WITH BROKER `hdfs_broker`
ON LOCATION "hdfs://hadoop-name-node:54310/path/to/repo/"
PROPERTIES
(
    "username" = "user",
    "password" = "password"
);
```
- Create a repository named s3_repo to access cloud storage directly, without going through a broker:

```sql
CREATE REPOSITORY `s3_repo`
WITH S3
ON LOCATION "s3://s3-repo"
PROPERTIES
(
    "s3.endpoint" = "http://s3-REGION.amazonaws.com",
    "s3.region" = "s3-REGION",
    "s3.access_key" = "AWS_ACCESS_KEY",
    "s3.secret_key" = "AWS_SECRET_KEY"
);
```
- Create a repository named hdfs_repo to access HDFS directly, without going through a broker:

```sql
CREATE REPOSITORY `hdfs_repo`
WITH hdfs
ON LOCATION "hdfs://hadoop-name-node:54310/path/to/repo/"
PROPERTIES
(
    "fs.defaultFS" = "hdfs://hadoop-name-node:54310",
    "hadoop.username" = "user"
);
```
- Create a repository named minio_repo to access MinIO storage directly through the S3 protocol:

```sql
CREATE REPOSITORY `minio_repo`
WITH S3
ON LOCATION "s3://minio_repo"
PROPERTIES
(
    "s3.endpoint" = "http://minio.com",
    "s3.access_key" = "MINIO_USER",
    "s3.secret_key" = "MINIO_PASSWORD",
    "s3.region" = "REGION",
    "use_path_style" = "true"
);
```
- Create a repository named minio_repo using temporary security credentials:

```sql
CREATE REPOSITORY `minio_repo`
WITH S3
ON LOCATION "s3://minio_repo"
PROPERTIES
(
    "s3.endpoint" = "AWS_ENDPOINT",
    "s3.access_key" = "AWS_TEMP_ACCESS_KEY",
    "s3.secret_key" = "AWS_TEMP_SECRET_KEY",
    "s3.session_token" = "AWS_TEMP_TOKEN",
    "s3.region" = "AWS_REGION"
);
```
- Create a repository using Tencent COS:

```sql
CREATE REPOSITORY `cos_repo`
WITH S3
ON LOCATION "s3://backet1/"
PROPERTIES
(
    "s3.access_key" = "ak",
    "s3.secret_key" = "sk",
    "s3.endpoint" = "http://cos.ap-beijing.myqcloud.com",
    "s3.region" = "ap-beijing"
);
```
- Create a repository and delete any snapshots that already exist in it:

```sql
CREATE REPOSITORY `s3_repo`
WITH S3
ON LOCATION "s3://s3-repo"
PROPERTIES
(
    "s3.endpoint" = "http://s3-REGION.amazonaws.com",
    "s3.region" = "s3-REGION",
    "s3.access_key" = "AWS_ACCESS_KEY",
    "s3.secret_key" = "AWS_SECRET_KEY",
    "delete_if_exists" = "true"
);
```
Note: only the S3 service supports the `delete_if_exists` property.
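To recreate a repository with different properties, the existing one must be dropped first. A minimal sketch; note that dropping a repository only removes its mapping in the cluster, the backed-up data in the remote storage is not deleted:

```sql
DROP REPOSITORY `s3_repo`;
```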
## Keywords
CREATE, REPOSITORY
## Best Practice
- A cluster can have multiple repositories. Only users with ADMIN privileges can create repositories.
- Any user can view the created repositories through the SHOW REPOSITORIES command.
- When performing data migration, the exact same repository must be created in both the source cluster and the destination cluster, so that the destination cluster can see, through this repository, the data snapshots backed up by the source cluster.
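Following the practices above, a repository and its contents can be inspected after creation; a minimal sketch (`example_repo` is hypothetical):

```sql
-- List all repositories in the cluster.
SHOW REPOSITORIES;

-- List the backup snapshots stored in a specific repository.
SHOW SNAPSHOT ON `example_repo`;
```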