Creating Cluster
Creating a Doris cluster in the compute-storage decoupled mode means creating the entire distributed system, which contains both FE and BE nodes. Within such a cluster, users can then create compute clusters. Each compute cluster is a group of computing resources consisting of one or more BE nodes.
A single FoundationDB + Meta Service + Recycler infrastructure can support multiple compute-storage decoupled clusters, where each compute-storage decoupled cluster is considered a data warehouse instance.
In the compute-storage decoupled mode, the registration and changes of nodes in a warehouse are managed by Meta Service. FE, BE, and Meta Service interact for service discovery and authentication.
Creating a Doris cluster in the compute-storage decoupled mode entails interaction with Meta Service. Meta Service provides standard HTTP APIs for resource management operations. For more information, refer to Meta Service API.
The compute-storage decoupled mode of Doris adopts a service discovery mechanism. The steps to create a compute-storage decoupled cluster can be summarized as follows:
- Register and specify the data warehouse instance and its storage backend.
- Register and specify the FE and BE nodes that make up the data warehouse instance, including the specific machines and how they form the cluster.
- Configure and start all the FE and BE nodes.
- 127.0.0.1:5000 in the examples on this page refers to the address of Meta Service. Replace it with the actual IP address and bRPC listening port for your own use case.
- Modify the configuration items in the following examples as needed.
Create cluster & storage vault
The first step is to register a data warehouse instance in Meta Service. A single Meta Service can support multiple data warehouse instances (i.e., multiple sets of FE-BE). Specifically, this process includes describing the required storage vault (i.e., the shared storage layer demonstrated in Overview) for that data warehouse instance. The options for the storage vault include HDFS and S3 (or object storage that supports the S3 protocol, such as AWS S3, GCS, Azure Blob, MinIO, Ceph, and Alibaba Cloud OSS). Storage vault is the remote shared storage used by Doris in the compute-storage decoupled mode. Users can configure multiple storage vaults for one data warehouse instance, and store different tables on different storage vaults.
This step involves calling the create_instance API of Meta Service. The key parameters include:

- instance_id: The ID of the data warehouse instance. It is typically a UUID string, such as 6ADDF03D-4C71-4F43-9D84-5FC89B3514F8. For simplicity in this guide, a regular string is used.
- name: The name of the data warehouse instance, which should be filled in according to actual needs. It should follow the format of [a-zA-Z][0-9a-zA-Z_]+.
- user_id: The ID of the user who creates the data warehouse instance. It should follow the format of [a-zA-Z][0-9a-zA-Z_]+.
- vault: The storage vault information, such as HDFS properties or S3 bucket details. Different storage vaults require different parameters.

For more information, refer to "create_instance" in Meta Service API.

Multiple compute-storage decoupled clusters (data warehouse instances) can be created by making multiple calls to the Meta Service create_instance interface.
Create cluster using HDFS as storage vault
To create a Doris cluster in the compute-storage decoupled mode using HDFS as the storage vault, configure the following items accurately and ensure that all nodes (including FE/BE nodes, Meta Service, and Recycler) have the necessary permissions to access the specified HDFS. This includes completing the Kerberos authorization configuration and connectivity checks for the machines (which can be tested using the Hadoop Client on the respective nodes).
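For example, a quick connectivity check from each node might look like the following. This is only a sketch: it assumes the Hadoop client is installed on the node, and the keytab, principal, and fs_name values are the illustrative ones used in the example below.

# Only needed when Kerberos authentication is enabled
kinit -kt /etc/emr.keytab hadoop/172.30.0.178@EMR-XXXYYY
# List the HDFS root directory to verify connectivity and permissions
hadoop fs -ls hdfs://172.21.0.44:4007/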
Parameter | Description | Required/Optional | Notes |
---|---|---|---|
instance_id | ID of the data warehouse instance | Required | Globally and historically unique, normally a UUID string |
name | Instance name. It should conform to the format of [a-zA-Z][0-9a-zA-Z_]+ | Optional | |
user_id | ID of the user who creates the instance. It should conform to the format of [a-zA-Z][0-9a-zA-Z_]+ | Required | |
vault | Storage vault | Required | |
vault.hdfs_info | Information of the HDFS storage vault | Required | |
vault.build_conf | Build configuration of the HDFS storage vault | Required | |
vault.build_conf.fs_name | HDFS name, normally the connection address | Required | |
vault.build_conf.user | User to connect to HDFS | Required | |
vault.build_conf.hdfs_kerberos_keytab | Kerberos Keytab path | Optional | Required when using Kerberos authentication |
vault.build_conf.hdfs_kerberos_principal | Kerberos Principal | Optional | Required when using Kerberos authentication |
vault.build_conf.hdfs_confs | Other configurations of HDFS | Optional | Can be filled in as needed |
vault.prefix | Prefix for data storage; used for data isolation | Required | Normally named after the specific business, such as big_data |
Example
curl -s "127.0.0.1:5000/MetaService/http/create_instance?token=greedisgood9999" -d \
'{
"instance_id": "sample_instance_id",
"name": "sample_instance_name",
"user_id": "sample_user_id",
"vault": {
"hdfs_info" : {
"build_conf": {
"fs_name": "hdfs://172.21.0.44:4007",
"user": "hadoop",
"hdfs_kerberos_keytab": "/etc/emr.keytab",
"hdfs_kerberos_principal": "hadoop/172.30.0.178@EMR-XXXYYY",
"hdfs_confs" : [
{
"key": "hadoop.security.authentication",
"value": "kerberos"
}
]
},
"prefix": "sample_prefix"
}
}
}'
Create cluster using S3 as storage vault
All object storage attributes are required in the creation statement. Specifically:

- When using object storage systems that support the S3 protocol, such as MinIO, make sure to test the connectivity and the correctness of the Access Key (AK) and Secret Access Key (SK). You can refer to AWS CLI with MinIO Server for further guidance.
- The value of the bucket field should be the name of the bucket, which does NOT include a schema prefix such as s3://.
- The external_endpoint should be kept the same as the endpoint value.
- If you are using non-cloud-provider object storage, you can fill in any values for the region and provider fields.
Parameter | Description | Required/Optional | Notes |
---|---|---|---|
instance_id | ID of the data warehouse instance in the compute-storage decoupled mode, normally a UUID string. It should conform to the format of [0-9a-zA-Z_-]+ . | Required | Example: 6ADDF03D-4C71-4F43-9D84-5FC89B3514F8 |
name | Instance name. It should conform to the format of [a-zA-Z][0-9a-zA-Z_]+ | Optional | |
user_id | ID of the user who creates the instance. It should conform to the format of [a-zA-Z][0-9a-zA-Z_]+ | Required | |
vault.obj_info | Object storage configuration | Required | |
vault.obj_info.ak | Object storage Access Key | Required | |
vault.obj_info.sk | Object storage Secret Key | Required | |
vault.obj_info.bucket | Object storage bucket name | Required | |
vault.obj_info.prefix | Prefix for data storage on object storage | Optional | If this parameter is empty, the default storage location will be in the root directory of the bucket. Example: big_data |
vault.obj_info.endpoint | Object storage endpoint | Required | The domain or ip:port, not including the scheme prefix such as http://. |
vault.obj_info.region | Object storage region | Required | If using MinIO, this parameter can be filled in with any value. |
vault.obj_info.external_endpoint | Object storage external endpoint | Required | Normally consistent with the endpoint. Compatible with OSS. Note the difference between OSS internal and external endpoints. |
vault.obj_info.provider | Object storage provider; options include OSS, S3, COS, OBS, BOS, GCP, and AZURE | Required | If using MinIO, simply fill in 'S3'. |
Example (AWS S3)
curl -s "127.0.0.1:5000/MetaService/http/create_instance?token=greedisgood9999" -d \
'{
"instance_id": "sample_instance_id",
"name": "sample_instance_name",
"user_id": "sample_user_id",
"vault": {
"obj_info": {
"ak": "ak_xxxxxxxxxxx",
"sk": "sk_xxxxxxxxxxx",
"bucket": "sample_bucket_name",
"prefix": "sample_prefix",
"endpoint": "s3.amazonaws.com",
"external_endpoint": "s3.amazonaws.com",
"region": "us-east1",
"provider": "AWS"
}
}
}'
Example (Tencent Cloud Object Storage)
curl -s "127.0.0.1:5000/MetaService/http/create_instance?token=greedisgood9999" -d \
'{
"instance_id": "sample_instance_id",
"name": "sample_instance_name",
"user_id": "sample_user_id",
"vault": {
"obj_info": {
"ak": "ak_xxxxxxxxxxx",
"sk": "sk_xxxxxxxxxxx",
"bucket": "sample_bucket_name",
"prefix": "sample_prefix",
"endpoint": "cos.ap-beijing.myqcloud.com",
"external_endpoint": "cos.ap-beijing.myqcloud.com",
"region": "ap-beijing",
"provider": "COS"
}
}
}'
Manage storage vault
A warehouse can be configured with one or more storage vaults. Different tables can be stored on different storage vaults.
Concepts
- vault name: The name of each storage vault, which is globally unique within the data warehouse instance, except for the built-in vault. The vault name is specified by the user when creating the storage vault.
- built-in vault: The remote shared storage that stores Doris system tables. It must be configured when creating the data warehouse instance, and its name is fixed as built_in_storage_vault. The data warehouse (FE) can only be started after the built-in vault is configured.
- default vault: The default storage vault at the data warehouse instance level. Users can specify any storage vault as the default storage vault, including the built-in vault. In the compute-storage decoupled mode, data must be stored on remote shared storage. If the user does not specify vault_name in the PROPERTIES section of the table creation statement, the data of that table will be stored in the default vault. The default vault can be reset, but the storage vault used by tables that have already been created will not change accordingly.

After configuring the built-in vault, you can create additional storage vaults as needed. After the FE starts successfully, you can perform storage vault operations through SQL statements, including creating storage vaults, viewing storage vaults, and specifying a storage vault during table creation.
Create storage vault
Syntax
CREATE STORAGE VAULT [IF NOT EXISTS] <vault_name>
PROPERTIES
("key" = "value",...)
<vault_name> is the user-defined name for the storage vault. It serves as the identifier for storage vault access.
Example
Create HDFS storage vault
CREATE STORAGE VAULT IF NOT EXISTS ssb_hdfs_vault
PROPERTIES (
"type"="hdfs", -- required
"fs.defaultFS"="hdfs://127.0.0.1:8020", -- required
"path_prefix"="big/data", -- optional, Normally named after the specifc business
"hadoop.username"="user" -- optional
"hadoop.security.authentication"="kerberos" -- optional
"hadoop.kerberos.principal"="hadoop/127.0.0.1@XXX" -- optional
"hadoop.kerberos.keytab"="/etc/emr.keytab" -- optional
);
Create S3 storage vault
CREATE STORAGE VAULT IF NOT EXISTS ssb_s3_vault
PROPERTIES (
"type"="S3", -- required
"s3.endpoint" = "oss-cn-beijing.aliyuncs.com", -- required
"s3.external_endpoint" = "oss-cn-beijing.aliyuncs.com", -- required
"s3.bucket" = "sample_bucket_name", -- required
"s3.region" = "bj", -- required
"s3.root.path" = "big/data/prefix", -- required
"s3.access_key" = "ak", -- required
"s3.secret_key" = "sk", -- required
"provider" = "cos", -- required
);
Newly created storage vaults may NOT be immediately visible to the BE nodes. If you import data into a table that uses a newly created storage vault, you may see errors for a short period (less than 1 minute) until the storage vault information is propagated to the BE nodes.
Properties
Parameter | Description | Required/Optional | Example |
---|---|---|---|
type | S3 and HDFS are currently supported. | Required | s3 or hdfs |
fs.defaultFS | HDFS vault parameter | Required | hdfs://127.0.0.1:8020 |
path_prefix | HDFS vault parameter, the path prefix for data storage, normally configured based on specific business. | Optional | big/data/dir |
hadoop.username | HDFS vault parameter | Optional | hadoop |
hadoop.security.authentication | HDFS vault parameter | Optional | kerberos |
hadoop.kerberos.principal | HDFS vault parameter | Optional | hadoop/127.0.0.1@XXX |
hadoop.kerberos.keytab | HDFS vault parameter | Optional | /etc/emr.keytab |
dfs.client.socket-timeout | HDFS vault parameter, measured in millisecond | Optional | 60000 |
s3.endpoint | S3 vault parameter | Required | oss-cn-beijing.aliyuncs.com |
s3.external_endpoint | S3 vault parameter | Required | oss-cn-beijing.aliyuncs.com |
s3.bucket | S3 vault parameter | Required | sample_bucket_name |
s3.region | S3 vault parameter | Required | bj |
s3.root.path | S3 vault parameter, path prefix for the actual data storage | Required | /big/data/prefix |
s3.access_key | S3 vault parameter | Required | |
s3.secret_key | S3 vault parameter | Required | |
provider | S3 vault parameter. Options include OSS, AWS S3, COS, OBS, BOS, GCP, and Microsoft Azure. If using MinIO, simply fill in 'S3'. | Required | cos |
View storage vault
Syntax
SHOW STORAGE VAULT
The returned result contains 4 columns, which are the name of the storage vault, the ID of the storage vault, the properties of the storage vault, and whether it is the default storage vault.
Example
mysql> show storage vault;
+------------------------+----------------+-------------------------------------------------------------------------------------------------------------+-----------+
| StorageVaultName | StorageVaultId | Propeties | IsDefault |
+------------------------+----------------+-------------------------------------------------------------------------------------------------------------+-----------+
| built_in_storage_vault | 1 | build_conf { fs_name: "hdfs://127.0.0.1:8020" } prefix: "_1CF80628-16CF-0A46-54EE-2C4A54AB1519" | false |
| hdfs_vault | 2 | build_conf { fs_name: "hdfs://127.0.0.1:8020" } prefix: "big/data/dir_0717D76E-FF5E-27C8-D9E3-6162BC913D97" | false |
+------------------------+----------------+-------------------------------------------------------------------------------------------------------------+-----------+
Set default storage vault
Syntax
SET <vault_name> AS DEFAULT STORAGE VAULT
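Example

A minimal example that makes the ssb_hdfs_vault created earlier on this page the instance-level default (the vault name is only illustrative):

SET ssb_hdfs_vault AS DEFAULT STORAGE VAULT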
Specify storage vault for table
In the table creation statement, if you specify storage_vault_name in the PROPERTIES, the data will be stored in the storage vault corresponding to the specified vault name. After the table is successfully created, storage_vault_name cannot be modified, which means the storage vault of a table cannot be changed.
Example
CREATE TABLE IF NOT EXISTS supplier (
s_suppkey int(11) NOT NULL COMMENT "",
s_name varchar(26) NOT NULL COMMENT "",
s_address varchar(26) NOT NULL COMMENT "",
s_city varchar(11) NOT NULL COMMENT "",
s_nation varchar(16) NOT NULL COMMENT "",
s_region varchar(13) NOT NULL COMMENT "",
s_phone varchar(16) NOT NULL COMMENT ""
)
UNIQUE KEY (s_suppkey)
DISTRIBUTED BY HASH(s_suppkey) BUCKETS 1
PROPERTIES (
"replication_num" = "1",
"storage_vault_name" = "ssb_hdfs_vault"
);
Built-in storage vault
When creating an instance, users can choose Vault Mode or Non-Vault Mode. In Vault Mode, the passed-in vault is set as the built-in storage vault, which is used to store internal table information (such as statistics tables). If the built-in storage vault is not created, the FE will not be able to start normally.

Users can also choose to store their new tables in the built-in storage vault. This can be done by setting the built-in storage vault as the default storage vault, or by setting the storage_vault_name of the table to built_in_storage_vault in the table creation statement.
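For illustration, either of the following stores new tables in the built-in vault. This is a sketch that reuses the syntax shown earlier on this page; the table name t and its column are hypothetical.

-- Option 1: make the built-in vault the instance-level default
SET built_in_storage_vault AS DEFAULT STORAGE VAULT

-- Option 2: reference the built-in vault explicitly when creating a table
CREATE TABLE IF NOT EXISTS t (k int NOT NULL)
DISTRIBUTED BY HASH(k) BUCKETS 1
PROPERTIES (
"replication_num" = "1",
"storage_vault_name" = "built_in_storage_vault"
);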
Modify storage vault
Some of the storage vault configurations are modifiable.
Coming soon
Delete storage vault
Only non-default storage vaults that are not referenced by any tables can be deleted.
Coming soon
Storage vault privilege
You can grant the privileges of a specific storage vault to a designated MySQL user, so that the user can specify that storage vault for a newly created table and view the information of that storage vault.
Syntax
GRANT
USAGE_PRIV
ON STORAGE VAULT <vault_name>
TO { ROLE | USER } {<role> | <user>}
Only the Admin user has the privilege to execute the GRANT statement, which is used to grant the privileges for a specified storage vault to a User/Role.

Users/Roles with the USAGE_PRIV privilege for a specific storage vault can perform the following operations:

- View the information of the storage vault using the SHOW STORAGE VAULT statement.
- Specify that storage vault in the PROPERTIES when creating a table.
Example
grant usage_priv on storage vault my_storage_vault to user1
Revoke storage vault privileges for a MySQL user.
Syntax
REVOKE
USAGE_PRIV
ON STORAGE VAULT <vault_name>
FROM { ROLE | USER } {<role> | <user>}
Only the Admin user has the privilege to execute the REVOKE statement, which is used to revoke the privileges that a User/Role has on a specific storage vault.
Example
revoke usage_priv on storage vault my_storage_vault from user1
Add FE
In the compute-storage decoupled mode, the node management interfaces for FE and BE are the same; only the parameter configurations differ.

The initial FE and BE nodes can be added through the Meta Service add_cluster interface.

The parameter list for the add_cluster interface is as follows:
Parameter | Description | Required/Optional | Notes |
---|---|---|---|
instance_id | ID of the data warehouse instance in the compute-storage decoupled mode, normally a UUID string. It should conform to the format of [0-9a-zA-Z_-]+ . | Required | Use the instance_id of the data warehouse instance created earlier |
cluster | Cluster object | Required | |
cluster.cluster_name | Cluster name. It should conform to the format of [a-zA-Z][0-9a-zA-Z_]+ . | Required | The FE cluster name is special. The default value of it is RESERVED_CLUSTER_NAME_FOR_SQL_SERVER . This can be modified by configuring cloud_observer_cluster_name in the fe.conf file. |
cluster.cluster_id | Cluster ID | Required | The FE cluster ID is special. The default value of it is RESERVED_CLUSTER_ID_FOR_SQL_SERVER . This can be modified by configuring cloud_observer_cluster_id in the fe.conf file. |
cluster.type | Cluster node type | Required | Two types are supported: SQL and COMPUTE. SQL corresponds to FE (the SQL service), while COMPUTE corresponds to BE (the compute nodes). |
cluster.nodes | Nodes in the cluster | Required | Array |
cluster.nodes.cloud_unique_id | cloud_unique_id of nodes. It should conform to the format of 1:<instance_id>:<string> , in which the string should conform to the format of [0-9a-zA-Z_-]+ . The value for each node should be different. | Required | cloud_unique_id in fe.conf and be.conf |
cluster.nodes.ip | Node IP | Required | When deploying FE/BE in FQDN mode, this field should be the domain name. |
cluster.nodes.host | Node domain name | Optional | This field is required when deploying FE/BE in FQDN mode. |
cluster.nodes.heartbeat_port | Heartbeat port of BE | Required for BE | heartbeat_service_port in be.conf |
cluster.nodes.edit_log_port | Edit log port of FE | Required for FE | edit_log_port in fe.conf |
cluster.nodes.node_type | FE node type | Required | This field is required when the cluster type is SQL . It can be either FE_MASTER or FE_OBSERVER . FE_MASTER indicates that the node is of Master role, and FE_OBSERVER indicates that the node is an Observer. Note that in an SQL type cluster, the nodes array can only have one FE_MASTER node, but it can include multiple FE_OBSERVER nodes. |
This is an example of adding one FE:
# Add FE
curl '127.0.0.1:5000/MetaService/http/add_cluster?token=greedisgood9999' -d '{
"instance_id":"sample_instance_id",
"cluster":{
"type":"SQL",
"cluster_name":"RESERVED_CLUSTER_NAME_FOR_SQL_SERVER",
"cluster_id":"RESERVED_CLUSTER_ID_FOR_SQL_SERVER",
"nodes":[
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_sql_server00",
"ip":"172.21.16.21",
"edit_log_port":12103,
"node_type":"FE_MASTER"
}
]
}
}'
# Confirm successful creation based on the returned result of the get_cluster command.
curl '127.0.0.1:5000/MetaService/http/get_cluster?token=greedisgood9999' -d '{
"instance_id":"sample_instance_id",
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_sql_server00",
"cluster_name":"RESERVED_CLUSTER_NAME_FOR_SQL_SERVER",
"cluster_id":"RESERVED_CLUSTER_ID_FOR_SQL_SERVER"
}'
If you need to add 2 FE nodes during the initial operation using the interface mentioned above, you can add the configuration for the additional node in the nodes array.

This is an example of adding an observer node:
...
"nodes":[
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_sql_server00",
"ip":"172.21.16.21",
"edit_log_port":12103,
"node_type":"FE_MASTER"
},
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_sql_server00",
"ip":"172.21.16.22",
"edit_log_port":12103,
"node_type":"FE_OBSERVER"
}
]
...
If you need to add or drop FE nodes, you may refer to the "Manage compute cluster" section on this page.
Create compute cluster
Users can create one or more compute clusters, and a compute cluster can consist of any number of BE nodes. This is also performed via the Meta Service add_cluster interface.

See the "Add FE" section above for more information about this interface.

Users can adjust the number of compute clusters and the number of nodes within each cluster based on their needs. Each compute cluster should have a unique cluster_name and cluster_id.
This is an example of adding a compute cluster that consists of 1 BE node.
# Add BE
curl '127.0.0.1:5000/MetaService/http/add_cluster?token=greedisgood9999' -d '{
"instance_id":"sample_instance_id",
"cluster":{
"type":"COMPUTE",
"cluster_name":"cluster_name0",
"cluster_id":"cluster_id0",
"nodes":[
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node0",
"ip":"172.21.16.21",
"heartbeat_port":9455
}
]
}
}'
# Confirm successful creation using get_cluster
curl '127.0.0.1:5000/MetaService/http/get_cluster?token=greedisgood9999' -d '{
"instance_id":"sample_instance_id",
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node0",
"cluster_name":"cluster_name0",
"cluster_id":"cluster_id0"
}'
If you need to add 2 BE nodes during the initial operation using the interface mentioned above, you can add the configuration for the additional node in the nodes array.
This is an example of specifying a compute cluster with 2 BE nodes:
...
"nodes":[
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node0",
"ip":"172.21.16.21",
"heartbeat_port":9455
},
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node0",
"ip":"172.21.16.22",
"heartbeat_port":9455
}
]
...
For instructions on adding or dropping BE nodes, refer to the "Manage compute cluster" section on this page.
If you need to continue adding more compute clusters, you can simply repeat the operations described in this section.
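For example, a second compute cluster could be registered as follows. This is only a sketch: the cluster name, cluster ID, cloud_unique_id, and IP below are illustrative and must not clash with values already in use.

curl '127.0.0.1:5000/MetaService/http/add_cluster?token=greedisgood9999' -d '{
"instance_id":"sample_instance_id",
"cluster":{
"type":"COMPUTE",
"cluster_name":"cluster_name1",
"cluster_id":"cluster_id1",
"nodes":[
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node10",
"ip":"172.21.16.23",
"heartbeat_port":9455
}
]
}
}'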
FE/BE configuration
Compared to the compute-storage coupled mode, the compute-storage decoupled mode requires additional configurations for the FE and BE:

- meta_service_endpoint: The address of Meta Service, which needs to be filled in for both the FE and BE.
- cloud_unique_id: This should be filled with the corresponding value from the add_cluster request sent to Meta Service when creating the cluster. Doris determines whether it is operating in the compute-storage decoupled mode based on this configuration.
fe.conf
meta_service_endpoint = 127.0.0.1:5000
cloud_unique_id = 1:sample_instance_id:cloud_unique_id_sql_server00
be.conf
In the following example, meta_service_use_load_balancer and enable_file_cache can be copied as-is for your use case. However, you might need to modify the other configuration items.

file_cache_path is a JSON array (configured according to the actual number of cache disks), and the definition of each field is as follows:

- path: The path to store the cached data, similar to storage_root_path in the compute-storage coupled mode.
- total_size: The expected upper limit of the cache space to be used.
- query_limit: The maximum amount of cache data that can be evicted when a single query misses the cache (to prevent large queries from evicting all the cache).

Since the cache needs to store data, it is best to use high-performance disks such as SSDs as the cache storage medium.
meta_service_endpoint = 127.0.0.1:5000
cloud_unique_id = 1:sample_instance_id:cloud_unique_id_compute_node0
meta_service_use_load_balancer = false
enable_file_cache = true
file_cache_path = [{"path":"/mnt/disk1/doris_cloud/file_cache","total_size":104857600000,"query_limit":10485760000}, {"path":"/mnt/disk2/doris_cloud/file_cache","total_size":104857600000,"query_limit":10485760000}]
Start/stop FE/BE
In the compute-storage decoupled mode of Doris, the startup and shutdown processes for the FE/BE are the same as those in the compute-storage coupled mode.

Since the compute-storage decoupled mode follows a service discovery model, there is no need to use commands like alter system add/drop frontend/backend to manage the nodes.
bin/start_be.sh --daemon
bin/stop_be.sh
bin/start_fe.sh --daemon
bin/stop_fe.sh
After startup, if the logs show that the above configuration items are all correct, the system has started to function normally, and you can connect to the FE through a MySQL client.
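For example, a quick check might look like this. It is only a sketch, assuming the FE query port is the default 9030 and the root user has not yet been given a password:

# Connect to the FE and list the registered frontends and backends
mysql -h 127.0.0.1 -P 9030 -u root -e 'SHOW FRONTENDS; SHOW BACKENDS;'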
Manage compute cluster
Add/drop FE/BE node
These steps are similar to those for creating a compute cluster. Specify the new nodes in Meta Service, and then start the corresponding nodes (ensure correct configuration of the new nodes). There is no need to use the alter system add/drop statements for additional operations.
In the compute-storage decoupled mode, you can add or drop multiple nodes at a time. However, it is recommended to add or drop nodes one by one.
Example
Add two BE nodes to compute cluster cluster_name0.
curl '127.0.0.1:5000/MetaService/http/add_node?token=greedisgood9999' -d '{
"instance_id":"sample_instance_id",
"cluster":{
"type":"COMPUTE",
"cluster_name":"cluster_name0",
"cluster_id":"cluster_id0",
"nodes":[
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node1",
"ip":"172.21.16.22",
"heartbeat_port":9455
},
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node2",
"ip":"172.21.16.23",
"heartbeat_port":9455
}
]
}
}'
Remove two BE nodes from compute cluster cluster_name0.
curl '127.0.0.1:5000/MetaService/http/drop_node?token=greedisgood9999' -d '{
"instance_id":"sample_instance_id",
"cluster":{
"type":"COMPUTE",
"cluster_name":"cluster_name0",
"cluster_id":"cluster_id0",
"nodes":[
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node1",
"ip":"172.21.16.22",
"heartbeat_port":9455
},
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_compute_node2",
"ip":"172.21.16.23",
"heartbeat_port":9455
}
]
}
}'
Add an FE Observer node. In the following example, node_type = FE_OBSERVER.

Currently, Doris does not support adding FE Followers in the compute-storage decoupled mode.
curl '127.0.0.1:5000/MetaService/http/add_node?token=greedisgood9999' -d '{
"instance_id":"sample_instance_id",
"cluster":{
"type":"SQL",
"cluster_name":"RESERVED_CLUSTER_NAME_FOR_SQL_SERVER",
"cluster_id":"RESERVED_CLUSTER_ID_FOR_SQL_SERVER",
"nodes":[
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_sql_server00",
"ip":"172.21.16.22",
"edit_log_port":12103,
"node_type":"FE_OBSERVER"
}
]
}
}'
Remove an FE node.
curl '127.0.0.1:5000/MetaService/http/drop_node?token=greedisgood9999' -d '{
"instance_id":"sample_instance_id",
"cluster":{
"type":"SQL",
"cluster_name":"RESERVED_CLUSTER_NAME_FOR_SQL_SERVER",
"cluster_id":"RESERVED_CLUSTER_ID_FOR_SQL_SERVER",
"nodes":[
{
"cloud_unique_id":"1:sample_instance_id:cloud_unique_id_sql_server00",
"ip":"172.21.16.22",
"edit_log_port":12103,
"node_type":"FE_MASTER"
}
]
}
}'
Add/drop compute cluster
To add a new compute cluster, you can refer to the "Create compute cluster" section on this page.
To drop a compute cluster, you can call the Meta Service API and shut down the corresponding nodes.
Example
Drop the compute cluster cluster_name0. (All parameters below are required.)
curl '127.0.0.1:5000/MetaService/http/drop_cluster?token=greedisgood9999' -d '{
"instance_id":"sample_instance_id",
"cluster":{
"type":"COMPUTE",
"cluster_name":"cluster_name0",
"cluster_id":"cluster_id0"
}
}'