Skip to content

TiDB

TiDB is a converged distributed database product that supports both online transaction processing (OLTP) and online analytical processing (OATP). A native distributed database, compatible with important features such as the MySQL 5.7 protocol and the MySQL ecosystem. The goal of TiDB is to provide users with one-stop OLTP, OLAP, and HTAP solutions, which are suitable for various Cases such as high availability, high requirements for strong consistency, and large data scale.

Overall structure of TiDB

The TiDB distributed database splits the overall architecture into multiple modules, and each module communicates with each other to form a complete TiDB system. The corresponding architecture diagram is as follows:

  • TiDB Server

    The SQL layer exposes the connection endpoints of the MySQL protocol, is responsible for accepting client connections, performing SQL parsing and optimization, and finally generating a distributed execution plan. The TiDB layer itself is stateless. In practice, multiple TiDB instances can be started, and a unified access address is provided externally through load balancing components (such as LVS, HAProxy, or F5). Client connections can be evenly distributed among multiple TiDB instances. In order to achieve the effect of load balancing. TiDB Server itself does not store data, but only parses SQL and forwards actual data read requests to the underlying storage node TiKV (or TiFlash).

  • PD (Placement Driver) Server

    The meta information management module of the entire TiDB cluster is responsible for storing the real-time data distribution of each TiKV node and the overall topology of the cluster, providing the TiDB Dashboard management and control interface, and assigning transaction IDs to distributed transactions. PD not only stores meta information, but also sends data scheduling commands to specific TiKV nodes according to the real-time data distribution status reported by TiKV nodes, which can be said to be the "brain" of the entire cluster. In addition, the PD itself is also composed of at least 3 nodes and has high availability capabilities. It is recommended to deploy an odd number of PD nodes.

  • Storage Node

  • TiKV Server: Responsible for storing data. From the outside, TiKV is a distributed Key-Value storage engine that provides transactions. The basic unit for storing data is Region, and each Region is responsible for storing data of a Key Range (the left-closed right-open interval from StartKey to EndKey), and each TiKV node is responsible for multiple Regions. TiKV's API provides native support for distributed transactions at the KV key-value pair level, and provides the isolation level of SI (Snapshot Isolation) by default, which is also the core of TiDB's support for distributed transactions at the SQL level. After the SQL layer of TiDB completes the SQL parsing, it will convert the SQL execution plan into an actual call to the TiKV API. Therefore, the data is stored in TiKV. In addition, the data in TiKV will automatically maintain multiple copies (the default is three copies), which naturally supports high availability and automatic failover.

  • TiFlash: TiFlash is a special type of storage node. Different from ordinary TiKV nodes, inside TiFlash, data is stored in the form of columns, and its main feature is to accelerate analytical use cases.

Storage of TiDB database

  • Key-Value Pair

    The choice of TiKV is the Key-Value model, and it provides an ordered traversal method. Two key points of TiKV data storage:

    • This is a huge Map (can be compared to C++'s std::map), which stores Key-Value Pairs.

    • The Key-Value pairs in this Map are ordered according to the binary order of the Key, that is, you can Seek to a certain Key position, and then continuously call the Next method to obtain the Key-Value greater than this Key in increasing order.

  • Local Storage (Rocks DB)

    For any persistent storage engine, data must be stored on disk after all, and TiKV is no exception. However, TiKV did not choose to write data directly to the disk, but saved the data in RocksDB, and RocksDB is responsible for the specific data landing. The reason for this is that developing a stand-alone storage engine requires a lot of work, especially for a high-performance stand-alone engine, which requires various meticulous optimizations. RocksDB is an excellent stand-alone KV storage engine open sourced by Facebook. It can meet various requirements of TiKV for a stand-alone engine. Here you can simply think of RocksDB as a stand-alone persistent Key-Value Map.

  • Raft protocol

    TiKV chooses the Raft algorithm to ensure that data will not be lost and errors will not occur in the case of a single machine failure. Simply put, it is to copy data to multiple machines, so that when a certain machine cannot provide services, the copies on other machines can still provide services. This data replication scheme is reliable and efficient, and can handle the failure of replicas.

  • Region

    TiKV chooses to divide Range by Key. A certain continuous Key is stored on one storage node. Divide the entire Key-Value space into many segments, and each segment is a series of consecutive Keys, called a Region. Try to keep the data stored in each Region within a certain size. Currently, the default size in TiKV is no more than 96MB. Each Region can be described by a left-closed right-open interval like [StartKey, EndKey].

  • MVCC

    TiKV implements Multi-Version Concurrency Control (MVCC).

  • Distributed ACID transactions

    The transaction of TiKV adopts the transaction model used by Google in BigTable: Percolator.

Set up a test environment

Kubernetes cluster

This test uses three virtual machine nodes to deploy a Kubernetes cluster, including 1 master node and 2 worker nodes. Kubelete version is 1.22.0.

HwameiStor local storage

  1. Deploy HwameiStor local storage on the Kubernetes cluster

  2. Configure a 100G local disk sdb for HwameiStor on the two worker nodes respectively

  3. Create storageClass

Deploy TiDB on Kubernetes

You can use TiDB Operator to deploy TiDB on Kubernetes. TiDB Operator is an automatic operation and maintenance system for TiDB clusters on Kubernetes, providing full lifecycle management of TiDB including deployment, upgrade, scaling, backup and recovery, and configuration changes. With TiDB Operator, TiDB can seamlessly run on public cloud or privately deployed Kubernetes clusters.

The correspondence between TiDB and TiDB Operator versions is as follows:

TiDB version Applicable TiDB Operator version
dev dev
TiDB >= 5.4 1.3
5.1 <= TiDB < 5.4 1.3 (recommended), 1.2
3.0 <= TiDB < 5.1 1.3 (recommended), 1.2, 1.1
2.1 <= TiDB < 3.0 1.0 (end of maintenance)

Deploy TiDB Operator

  1. Install TiDB CRDs

    kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/manifests/crd.yaml
    
  2. Install TiDB Operator

    helm repo add pingcap https://charts.pingcap.org/
    kubectl create namespace tidb-admin
    helm install --namespace tidb-admin tidb-operator pingcap/tidb-operator --version v1.3.2 \
    --set operatorImage=registry.cn-beijing.aliyuncs.com/tidb/tidb-operator:v1.3.2 \
    --set tidbBackupManagerImage=registry.cn-beijing.aliyuncs.com/tidb/tidb-backup-manager:v1.3.2 \
    --set scheduler.kubeSchedulerImageName=registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler
    
  3. Check TiDB Operator components

Deploy TiDB cluster

kubectl create namespace tidb-cluster && \
kubectl -n tidb-cluster apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/examples/basic/tidb-cluster.yaml
kubectl -n tidb-cluster apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/examples/basic/tidb-monitor.yaml

Connect TiDB cluster

yum -y install mysql-client
kubectl port-forward -n tidb-cluster svc/basic-tidb 4000 > pf4000.out &

Check and verify TiDB cluster status

  1. Create the Hello_world table

    create table hello_world (id int unsigned not null auto_increment primary key, v varchar(32));
    
  2. Query the TiDB version number

    select tidb_version()\G;
    
  3. Query Tikv storage status

    select * from information_schema.tikv_store_status\G;
    

HwameiStor storage configuration

Create a PVC for tidb-tikv and tidb-pd from storageClass local-storage-hdd-lvm:

kubectl get po basic-tikv-0-oyaml
kubectl get po basic-pd-0-oyaml

Test content

Database SQL Basic Ability Test

After completing the deployment of the database cluster, the following basic capability tests were performed and all passed.

Distributed transaction

Test purpose: to support the realization of the integrity constraints of distributed data operations, that is, ACID properties, under multiple isolation levels

Test steps:

  1. Create a test database CREATE DATABASE testdb

  2. Create a test table CREATE TABLE t_test ( id int AUTO_INCREMENT, name varchar(32), PRIMARY KEY (id) )

  3. Run the test script

Test results: Support the integrity constraints of distributed data operations, namely ACID properties, under multiple isolation levels

Object Isolation

Test purpose: to test different schemas to achieve object isolation

Test script:

create database if not exists testdb;
use testdb
create table if not exists t_test
( id bigint,
   name varchar(200),
   sale_time datetime default current_timestamp,
   constraint pk_t_test primary key (id)
);
insert into t_test(id,name) values (1,'a'),(2,'b'),(3,'c');
create user 'readonly'@'%' identified by "readonly";
grant select on testdb.* to readonly@'%';
select * from testdb.t_test;
update testdb.t_test set name='aaa';
create user 'otheruser'@'%' identified by "otheruser";

Test results: Support creating different schemas to achieve object isolation

Table operation support

Test purpose: Test whether it supports creating, deleting and modifying table data, DML, columns, and partition tables

Test steps: run the test script step by step after connecting to the database

Test script:

# Create and drop tables
drop table if exists t_test;
create table if not exists t_test
( id bigint default '0',
   name varchar(200) default '' ,
   sale_time datetime default current_timestamp,
   constraint pk_t_test primary key (id)
);
# delete and modify
insert into t_test(id,name) values (1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e');
update t_test set name='aaa' where id=1;
update t_test set name='bbb' where id=2;
delete from t_dml where id=5;
# Modify, add, delete columns
alter table t_test modify column name varchar(250);
alter table t_test add column col varchar(255);
insert into t_test(id,name,col) values(10,'test','new_col');
alter table t_test add column colwithdefault varchar(255) default 'aaaa';
insert into t_test(id,name) values(20,'testdefault');
insert into t_test(id,name,colwithdefault ) values(10,'test','non-default');
alter table t_test drop column colwithdefault;
# Partition table type (only some scripts are excerpted)
CREATE TABLE employees (
    id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT NOT NULL,
store_id INT NOT NULL
)

Test results: support for creating, deleting and modifying table data, DML, columns, and partition tables

Index Support

Test purpose: Verify multiple types of indexes (unique, clustered, partitioned, Bidirectional indexes, Expression-based indexes, hash indexes, etc.) and index rebuild operations.

Test script:

alter table t_test add unique index udx_t_test (name);
# The default is the primary key clustered index
ADMIN CHECK TABLE t_test;
create index time_idx on t_test(sale_time);
alter table t_test drop index time_idx;
admin show ddl jobs;
admin show ddl job queries 156;
create index time_idx on t_test(sale_time);

Test results: support creation, deletion, combination, single column, unique index

expressions

Test purpose: To verify that the expression of the distributed database supports statements such as if, casewhen, forloop, whileloop, loop exit when (up to 5 types)

Prerequisite: The database cluster has been deployed.

Test script:

SELECT CASE id WHEN 1 THEN 'first' WHEN 2 THEN 'second' ELSE 'OTHERS' END AS id_new FROM t_test;
SELECT IF(id>2,'int2+','int2-') from t_test;

Test results: Support statements such as if, case when, for loop, while loop, loop exit when (up to 5 types)

Execution plan analysis

Test purpose: Verify the execution plan parsing support of the distributed database

Prerequisite: The database cluster has been deployed.

Test script:

explain analyze select * from t_test where id NOT IN (1,2,4);
explain analyze select * from t_test a where EXISTS (select * from t_test b where a.id=b.id and b.id<3);
explain analyze SELECT IF(id>2,'int2+','int2-') from t_test;

Test results: support for execution plan parsing

Execution plan binding

Test purpose: Verify the execution plan binding feature of the distributed database

Test steps:

  1. View the current execution plan of the sql statement

  2. Using the binding feature

  3. View the execution plan after the sql statement is bound

  4. Delete the binding

Test script:

explain select * from employees3 a join employees4 b on a.id = b.id where a.lname='Johnson';
explain select /*+ hash_join(a,b) */ * from employees3 a join employees4 b on a.id = b.id where a.lname='Johnson';

Test result: It may not be hash_join when no hint is used, but it must be hash_join after using hint.

Common Functions

Test purpose: to verify the standard database features of distributed databases (supported feature types)

Test results: support standard database features

Explicit/Implicit Transactions

Test purpose: verify the transaction support of the distributed database

Test results: support for explicit and implicit transactions

Character set

Test purpose: verify the data type support of the distributed database

Test results: Currently only UTF-8 mb4 character set is supported

Lock support

Test purpose: verify the lock implementation of the distributed database

Test results: Describe the implementation of locks, the blocking situation in the case of R-R/R-W/W-W, and the deadlock handling method

Isolation level

Test purpose: verify the transaction isolation level of the distributed database

Test results: support si isolation level, support rc isolation level (4.0 GA version)

Distributed Complex Query

Test purpose: to verify the distributed complex query capabilities of distributed databases

Test results: support distributed complex queries and operations such as cross-node join, support window features, and hierarchical queries

System Security Test

This part tests system security. After the database cluster deployment is completed, the following security tests all pass.

Account Management and Permission Test

Test purpose: verify the account authority management of the distributed database

Test script:

select host,user,authentication_string from mysql.user;
create user tidb IDENTIFIED by 'tidb';
select host,user,authentication_string from mysql.user;
set password for tidb = password('tidbnew');
select host, user, authentication_string, Select_priv from mysql.user;
grant select on *.* to tidb;
flush privileges;
select host, user, authentication_string, Select_priv from mysql.user;
grant all privileges on *.* to tidb;
flush privileges;
select * from mysql.user where user='tidb';
revoke select on *.* from tidb;
flush privileges;
revoke all privileges on *.* from tidb;
flush privileges;
grant select(id) on test.TEST_HOTSPOT to tidb;
drop user tidb;

Test Results:

  • Support creation, modification and deletion of accounts, configuration and passwords, support separation of powers among security, audit and data management

  • According to different accounts, the authority control of each level of the database includes: instance/library/table/column level

Access control

Test purpose: To verify the access control of the distributed database, and the database data is controlled by granting basic additions, deletions, changes, and queries

Test script:

mysql -u root -h 172.17.49.222 -P 4000
drop user tidb;
drop user tidb1;
create user tidb IDENTIFIED by 'tidb';
grant select on tidb.* to tidb;
grant insert on tidb.* to tidb;
grant update on tidb.* to tidb;
grant delete on tidb.* to tidb;
flush privileges;
show grants for tidb;
exit;
mysql -u tidb -h 172.17.49.222 -ptidb -P 4000 -D tidb -e 'select * from aa;'
mysql -u tidb -h 172.17.49.222 -ptidb -P 4000 -D tidb -e 'insert into aa values(2);'
mysql -u tidb -h 172.17.49.222 -ptidb -P 4000 -D tidb -e 'update aa set id=3;'
mysql -u tidb -h 172.17.49.222 -ptidb -P 4000 -D tidb -e 'delete from aa where id=3;'

Test results: Database data is controlled by granting access rights for basic addition, deletion, modification, and query.

whitelist

Test purpose: verify the whitelist feature of the distributed database

Test script:

mysql -u root -h 172.17.49.102 -P 4000
drop user tidb;
create user tidb@'127.0.0.1' IDENTIFIED by 'tidb';
flush privileges;
select * from mysql.user where user='tidb';
mysql -u tidb -h 127.0.0.1 -P 4000 -ptidb
mysql -u tidb -h 172.17.49.102 -P 4000 -ptidb

Test results: support IP whitelist function, support IP segment wildcard operation

Operation Logging

Purpose of the test: to verify the operation monitoring capability of the distributed database

Test script: kubectl -ntidb-cluster logs tidb-test-pd-2 --tail 22

Test results: record key operations or wrong operations performed by users through the operation and maintenance management console or API

Operation and maintenance management test

This part tests system operation and maintenance. After the database cluster deployment is completed, the following operation and maintenance management tests all pass.

Data import and export

Test purpose: To verify the tool support for data import and export of distributed databases

Test script:

select * from sbtest1 into outfile '/sbtest1.csv';
load data local infile '/sbtest1.csv' into table test100;

Test results: support logical export and import at the table, schema, and database levels

Slow log query

Test purpose: to obtain SQL information of slow queries

Precondition: The SQL execution time must be greater than the configured slow query record threshold, and the SQL execution is complete

Test steps:

  1. Adjust the slow query threshold to 100ms

  2. Run sql

  3. Check the slow query information in log/system table/dashboard

Test script:

show variables like 'tidb_slow_log_threshold';
set tidb_slow_log_threshold=100;
select query_time, query from information_schema.slow_query where is_internal = false order by query_time desc limit 3;

Test result: slow query information can be obtained

Comments