0x2B|~0x2B – Page 3 – My broken wings still strong enough to cross the ocean with.

A Spring Cloud Toy Project

August 21, 2018 by gonwan·0 Comments

Recently played with the Spring/SpringBoot/SpringCloud stack with a toy project: https://github.com/gonwan/spring-cloud-demo. Just paste README.md here, and any pull request is welcome:

Introduction

The demo project is initialized from https://github.com/carnellj/spmia-chapter10. Additions are:

Code cleanup, bug fix, and better comments.
Java 9+ support.
Spring Boot 2.0 migration.
Switch from Postgres to MySQL, and from Kafka to RabbitMQ.
Easier local debugging by switching off service discovery and remote config file lookup.
Kubernetes support.
Swagger Integration.
Spring Boot Admin Integration.

The project includes:

[eureka-server]: Service for service discovery. Registered services are shown on its web frontend, running at 8761 port.
[config-server]: Service for config file management. Config files can be accessed via: http://${config-server}:8888/${appname}/${profile}. Where ${appname} is spring.application.name and ${profile} is something like dev, prd or default.
[zipkin-server]: Service to aggregate distributed tracing data, working with spring-cloud-sleuth. It runs at 9411 port. All cross service requests, message bus delivery are traced by default.
[zuul-server]: Gateway service to route requests, running at 5555 port.
[authentication-service]: OAuth2 enabled authentication service running at 8901. Redis is used for token cache. JWT support is also included. Spring Cloud Security 2.0 saves a lot when building this kind of services.
[organization-service]: Application service holding organization information, running at 8085. It also acts as an OAuth2 client to authentication-service for authorization.
[license-service]: Application service holding license information, running at 8080. It also acts as an OAuth2 client to authentication-service for authorization.
[config]: Config files hosted to be accessed by config-server.
[docker]: Docker compose support.
[kubernetes]: Kubernetes support.

NOTE: The new OAuth2 support in Spring is actively being developed. All functions are merging into core Spring Security 5. As a result, current implementation is suppose to change. See:

Tested Dependencies

Java 8+
Docker 1.13+
Kubernetes 1.11+

Building Docker Images

export BUILD_NAME=2.0.0
mvn clean package docker:build

1 2	export BUILD_NAME=2.0.0 mvn clean package docker:build

In case of running out of disk space, clean up unused images and volumes with:

docker rmi $(docker images -f "dangling=true" -q)
docker volume prune

1 2	docker rmi $(docker images -f "dangling=true" -q) docker volume prune

Running Docker Compose

export BUILD_NAME=2.0.0
docker-compose -f docker/docker-compose.yml up

1 2	export BUILD_NAME=2.0.0 docker-compose -f docker/docker-compose.yml up

Or with separate services:

docker-compose -f docker/docker-compose.yml up authentication-service organization-service license-service

1	docker-compose -f docker/docker-compose.yml up authentication-service organization-service license-service

Running Kubernetes

NOTE: Kubernetes does not support environment variable substitution by default.

kubectl create -f kubernetes/kubernetes.yml

1	kubectl create -f kubernetes/kubernetes.yml

Use Cases

Suppose you are using the kubernetes deployment.

Get OAuth2 token

curl is used here, and 31004 is the cluster-wide port of the Zuul gateway server:

# curl -u eagleeye:thisissecret http://172.16.87.12:31004/api/auth/oauth/token -X POST -d "grant_type=password&scope=webclient&username=user&password=password1"
{"access_token":"d3b817dc-fb7a-4e65-a080-d0e34c0dc4d5","token_type":"bearer","refresh_token":"a5d12d05-78ff-4170-ab4f-b9c4e9886358","expires_in":41496,"scope":"webclient"}

# curl -u eagleeye:thisissecret http://172.16.87.12:31004/api/auth/oauth/token -X POST -d "grant_type=password&scope=webclient&username=user&password=password1"

{"access_token":"d3b817dc-fb7a-4e65-a080-d0e34c0dc4d5","token_type":"bearer","refresh_token":"a5d12d05-78ff-4170-ab4f-b9c4e9886358","expires_in":41496,"scope":"webclient"}

Get organization info

Use the token returned from previous request.

# curl -H "Authorization: Bearer d3b817dc-fb7a-4e65-a080-d0e34c0dc4d5" http://172.16.87.12:31004/api/organization/v1/organizations/e254f8c-c442-4ebe-a82a-e2fc1d1ff78a
{"id":"e254f8c-c442-4ebe-a82a-e2fc1d1ff78a","name":"customer-crm-co","contactName":"Mark Balster","contactEmail":"mark.balster@custcrmco.com","contactPhone":"823-555-1212"}

# curl -H "Authorization: Bearer d3b817dc-fb7a-4e65-a080-d0e34c0dc4d5" http://172.16.87.12:31004/api/organization/v1/organizations/e254f8c-c442-4ebe-a82a-e2fc1d1ff78a

{"id":"e254f8c-c442-4ebe-a82a-e2fc1d1ff78a","name":"customer-crm-co","contactName":"Mark Balster","contactEmail":"mark.balster@custcrmco.com","contactPhone":"823-555-1212"}

Get license info associated with organization info

Use the token returned from previous request.

# curl -H "Authorization: Bearer d3b817dc-fb7a-4e65-a080-d0e34c0dc4d5" http://172.16.87.12:31004/api/license/v1/organizations/e254f8c-c442-4ebe-a82a-e2fc1d1ff78a/licenses/f3831f8c-c338-4ebe-a82a-e2fc1d1ff78a
{"id":"f3831f8c-c338-4ebe-a82a-e2fc1d1ff78a","organizationId":"e254f8c-c442-4ebe-a82a-e2fc1d1ff78a","organizationName":"customer-crm-co","contactName":"Mark Balster","contactPhone":"823-555-1212","contactEmail":"mark.balster@custcrmco.com","productName":"CustomerPro","licenseType":"user","licenseMax":100,"licenseAllocated":5,"comment":null}

# curl -H "Authorization: Bearer d3b817dc-fb7a-4e65-a080-d0e34c0dc4d5" http://172.16.87.12:31004/api/license/v1/organizations/e254f8c-c442-4ebe-a82a-e2fc1d1ff78a/licenses/f3831f8c-c338-4ebe-a82a-e2fc1d1ff78a

{"id":"f3831f8c-c338-4ebe-a82a-e2fc1d1ff78a","organizationId":"e254f8c-c442-4ebe-a82a-e2fc1d1ff78a","organizationName":"customer-crm-co","contactName":"Mark Balster","contactPhone":"823-555-1212","contactEmail":"mark.balster@custcrmco.com","productName":"CustomerPro","licenseType":"user","licenseMax":100,"licenseAllocated":5,"comment":null}

Distributed Tracing via Zipkin

Every response contains a correlation ID to help diagnose possible failures among service call. Run with curl -v to get it:

# curl -v ...
...
< sc-correlation-id: 3265b50156556c05
...

# curl -v ...

...

< sc-correlation-id: 3265b50156556c05

...

Search it in Zipkin to get all trace info, including latencies if you are interested in.
zipkin-1
zipkin-2

The license service caches organization info in Redis, prefixed with organizations:. So you may want to clear them to get a complete tracing of cross service invoke.

redis-cli -h 172.16.87.12 -c del $(redis-cli -h 172.16.87.12 -c keys organizations* | gawk '{ print $1 }')

1	redis-cli -h 172.16.87.12 -c del $(redis-cli -h 172.16.87.12 -c keys organizations* \| gawk '{ print $1 }')

Working with OAuth2

All OAuth2 tokens are cached in Redis, prefixed with oauth2:. There is also JWT token support. Comment/Uncomment @Configuration in AuthorizationServerConfiguration and JwtAuthorizationServerConfiguration classes to switch it on/off.

Swagger Integration

The organization service and license service have Swagger integration. Access via /swagger-ui.html.

Spring Boot Admin Integration

Spring Boot Admin is integrated into the eureka server. Access via: http://${eureka-server}:8761/admin.
sba-1

Deploying Kubernetes Cluster on CentOS 7

July 30, 2018 by gonwan·0 Comments

It is painful to deploying a Kubernetes cluster in mainland China. The installation requires access to Google servers, which is not so easy for every one. Fortunately, there are mirrors or alternative ways. I’ll use Docker v1.13 and Kubernetes v1.11 in the article.

1. Install Docker

CentOS SCL should be enabled first.

# yum install centos-release-scl
# yum install docker

1 2	# yum install centos-release-scl # yum install docker

2. Install Kubernetes

2.1 Add the Aliyun mirror for Kubernetes packages

# cat << EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
       http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# yum install -y kubelet kubectl kubeadm

# cat << EOF > /etc/yum.repos.d/kubernetes.repo

[kubernetes]

name=Kubernetes

baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64

enabled=1

gpgcheck=0

repo_gpgcheck=0

gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg

http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg

EOF

# yum install -y kubelet kubectl kubeadm

2.2 Precheck OS environmemt

# kubeadm init --kubernetes-version=v1.11.1

1	# kubeadm init --kubernetes-version=v1.11.1

Run the init command by specify the version, the access to Google server is avoided. The script also advices you to turn off firewalld, swap, selinux and enable kernel parameters:

# systemctl stop firewalld
# systemctl disable firewalld
# swapoff -a
# setenforce 0

# systemctl stop firewalld

# systemctl disable firewalld

# swapoff -a

# setenforce 0

Open /etc/sysconfig/selinux, change enforcing to permissive.
Create /etc/sysctl.d/k8s.conf with content:

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

1 2	net.bridge.bridge-nf-call-ip6tables = 1 net.bridge.bridge-nf-call-iptables = 1

# sysctl --system

1	# sysctl --system

Remember to comment out swap volumes from /etc/fstab.

2.3 Pull Kubernates images

Pull the Kubernetes images from docker/docker-cn mirror maintained by anjia0532. These are minimal images required for a Kubernetes master installation.

# docker pull registry.docker-cn.com/anjia0532/google-containers.kube-apiserver-amd64:v1.11.1
# docker pull registry.docker-cn.com/anjia0532/google-containers.kube-controller-manager-amd64:v1.11.1
# docker pull registry.docker-cn.com/anjia0532/google-containers.kube-scheduler-amd64:v1.11.1
# docker pull registry.docker-cn.com/anjia0532/google-containers.kube-proxy-amd64:v1.11.1
# docker pull registry.docker-cn.com/anjia0532/google-containers.pause:3.1
# docker pull registry.docker-cn.com/anjia0532/google-containers.etcd-amd64:3.2.18
# docker pull registry.docker-cn.com/anjia0532/google-containers.coredns:1.1.3

# docker pull registry.docker-cn.com/anjia0532/google-containers.kube-apiserver-amd64:v1.11.1

# docker pull registry.docker-cn.com/anjia0532/google-containers.kube-controller-manager-amd64:v1.11.1

# docker pull registry.docker-cn.com/anjia0532/google-containers.kube-scheduler-amd64:v1.11.1

# docker pull registry.docker-cn.com/anjia0532/google-containers.kube-proxy-amd64:v1.11.1

# docker pull registry.docker-cn.com/anjia0532/google-containers.pause:3.1

# docker pull registry.docker-cn.com/anjia0532/google-containers.etcd-amd64:3.2.18

# docker pull registry.docker-cn.com/anjia0532/google-containers.coredns:1.1.3

These version numbers comes from the kubeadm init command if you cannot access Google servers. These images should be retagged to gcr.io ones before next steps, or the kubeadm command line would not find them:

# docker tag registry.docker-cn.com/anjia0532/google-containers.kube-apiserver-amd64:v1.11.1 k8s.gcr.io/kube-apiserver-amd64:v1.11.1
# docker tag registry.docker-cn.com/anjia0532/google-containers.kube-controller-manager-amd64:v1.11.1 k8s.gcr.io/kube-controller-manager-amd64:v1.11.1
# docker tag registry.docker-cn.com/anjia0532/google-containers.kube-scheduler-amd64:v1.11.1 k8s.gcr.io/kube-scheduler-amd64:v1.11.1
# docker tag registry.docker-cn.com/anjia0532/google-containers.kube-proxy-amd64:v1.11.1 k8s.gcr.io/kube-proxy-amd64:v1.11.1
# docker tag registry.docker-cn.com/anjia0532/google-containers.pause:3.1 k8s.gcr.io/pause:3.1
# docker tag registry.docker-cn.com/anjia0532/google-containers.etcd-amd64:3.2.18 k8s.gcr.io/etcd-amd64:3.2.18
# docker tag registry.docker-cn.com/anjia0532/google-containers.coredns:1.1.3 k8s.gcr.io/coredns:1.1.3

# docker tag registry.docker-cn.com/anjia0532/google-containers.kube-apiserver-amd64:v1.11.1 k8s.gcr.io/kube-apiserver-amd64:v1.11.1

# docker tag registry.docker-cn.com/anjia0532/google-containers.kube-controller-manager-amd64:v1.11.1 k8s.gcr.io/kube-controller-manager-amd64:v1.11.1

# docker tag registry.docker-cn.com/anjia0532/google-containers.kube-scheduler-amd64:v1.11.1 k8s.gcr.io/kube-scheduler-amd64:v1.11.1

# docker tag registry.docker-cn.com/anjia0532/google-containers.kube-proxy-amd64:v1.11.1 k8s.gcr.io/kube-proxy-amd64:v1.11.1

# docker tag registry.docker-cn.com/anjia0532/google-containers.pause:3.1 k8s.gcr.io/pause:3.1

# docker tag registry.docker-cn.com/anjia0532/google-containers.etcd-amd64:3.2.18 k8s.gcr.io/etcd-amd64:3.2.18

# docker tag registry.docker-cn.com/anjia0532/google-containers.coredns:1.1.3 k8s.gcr.io/coredns:1.1.3

Now the output of docker images looks like:

REPOSITORY                                                                         TAG                 IMAGE ID            CREATED             SIZE
k8s.gcr.io/kube-apiserver-amd64                                                    v1.11.1             816332bd9d11        12 days ago         187 MB
registry.docker-cn.com/anjia0532/google-containers.kube-apiserver-amd64            v1.11.1             816332bd9d11        12 days ago         187 MB
k8s.gcr.io/kube-controller-manager-amd64                                           v1.11.1             52096ee87d0e        12 days ago         155 MB
registry.docker-cn.com/anjia0532/google-containers.kube-controller-manager-amd64   v1.11.1             52096ee87d0e        12 days ago         155 MB
k8s.gcr.io/kube-scheduler-amd64                                                    v1.11.1             272b3a60cd68        12 days ago         56.8 MB
registry.docker-cn.com/anjia0532/google-containers.kube-scheduler-amd64            v1.11.1             272b3a60cd68        12 days ago         56.8 MB
k8s.gcr.io/kube-proxy-amd64                                                        v1.11.1             d5c25579d0ff        12 days ago         97.8 MB
registry.docker-cn.com/anjia0532/google-containers.kube-proxy-amd64                v1.11.1             d5c25579d0ff        12 days ago         97.8 MB
k8s.gcr.io/pause                                                                   3.1                 da86e6ba6ca1        7 months ago        742 kB
registry.docker-cn.com/anjia0532/google-containers.pause                           3.1                 da86e6ba6ca1        7 months ago        742 kB
k8s.gcr.io/etcd-amd64                                                              3.2.18              b8df3b177be2        3 months ago        219 MB
registry.docker-cn.com/anjia0532/google-containers.etcd-amd64                      3.2.18              b8df3b177be2        3 months ago        219 MB
k8s.gcr.io/coredns                                                                 1.1.3               b3b94275d97c        2 months ago        45.6 MB
registry.docker-cn.com/anjia0532/google-containers.coredns                         1.1.3               b3b94275d97c        2 months ago        45.6 MB

REPOSITORY TAG IMAGE ID CREATED SIZE

k8s.gcr.io/kube-apiserver-amd64 v1.11.1 816332bd9d11 12 days ago 187 MB

registry.docker-cn.com/anjia0532/google-containers.kube-apiserver-amd64 v1.11.1 816332bd9d11 12 days ago 187 MB

k8s.gcr.io/kube-controller-manager-amd64 v1.11.1 52096ee87d0e 12 days ago 155 MB

registry.docker-cn.com/anjia0532/google-containers.kube-controller-manager-amd64 v1.11.1 52096ee87d0e 12 days ago 155 MB

k8s.gcr.io/kube-scheduler-amd64 v1.11.1 272b3a60cd68 12 days ago 56.8 MB

registry.docker-cn.com/anjia0532/google-containers.kube-scheduler-amd64 v1.11.1 272b3a60cd68 12 days ago 56.8 MB

k8s.gcr.io/kube-proxy-amd64 v1.11.1 d5c25579d0ff 12 days ago 97.8 MB

registry.docker-cn.com/anjia0532/google-containers.kube-proxy-amd64 v1.11.1 d5c25579d0ff 12 days ago 97.8 MB

k8s.gcr.io/pause 3.1 da86e6ba6ca1 7 months ago 742 kB

registry.docker-cn.com/anjia0532/google-containers.pause 3.1 da86e6ba6ca1 7 months ago 742 kB

k8s.gcr.io/etcd-amd64 3.2.18 b8df3b177be2 3 months ago 219 MB

registry.docker-cn.com/anjia0532/google-containers.etcd-amd64 3.2.18 b8df3b177be2 3 months ago 219 MB

k8s.gcr.io/coredns 1.1.3 b3b94275d97c 2 months ago 45.6 MB

registry.docker-cn.com/anjia0532/google-containers.coredns 1.1.3 b3b94275d97c 2 months ago 45.6 MB

Also KUBE_REPO_PREFIX and other environment variables can be used to customize the prefix. I have no time to verify them.

2.4 Start the Kubernetes master

Run the init script again and it will success with further guidelines:

# kubeadm init --kubernetes-version=v1.11.1
...
...
...
Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run (as a regular user):

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the addon options listed at:
  http://kubernetes.io/docs/admin/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 192.168.11.235:6443 --token d624xt.3eqs1udhr26w5luh --discovery-token-ca-cert-hash sha256:a0088da6f0445370457b1aeede4856caa580280e58cdea974f1110bc1b7d4a63

# kubeadm init --kubernetes-version=v1.11.1

...

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run (as a regular user):

mkdir -p $HOME/.kube

sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.

Run "kubectl apply -f [podnetwork].yaml" with one of the addon options listed at:

http://kubernetes.io/docs/admin/addons/

You can now join any number of machines by running the following on each node

as root:

kubeadm join 192.168.11.235:6443 --token d624xt.3eqs1udhr26w5luh --discovery-token-ca-cert-hash sha256:a0088da6f0445370457b1aeede4856caa580280e58cdea974f1110bc1b7d4a63

Run the mkdir/cp/chown command to enable kubectl usage. Then add the weave pod network. It may take some time, since images are pulled.

# kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

1	# kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version \| base64 \| tr -d '\n')"

Now the master is finished, verify with the Ready status:

# kubectl get nodes
NAME        STATUS    ROLES     AGE       VERSION
cdh-node1   Ready     master    2h        v1.11.1

# kubectl get nodes

NAME STATUS ROLES AGE VERSION

cdh-node1 Ready master 2h v1.11.1

2.4 Start the Kubernetes node(slave)

A Kubernetes node only requires kube-proxy-amd64 and pause images, pull these ones:

# docker pull registry.docker-cn.com/anjia0532/google-containers.kube-proxy-amd64:v1.11.1
# docker pull registry.docker-cn.com/anjia0532/google-containers.pause:3.1
# docker tag registry.docker-cn.com/anjia0532/google-containers.kube-proxy-amd64:v1.11.1 k8s.gcr.io/kube-proxy-amd64:v1.11.1
# docker tag registry.docker-cn.com/anjia0532/google-containers.pause:3.1 k8s.gcr.io/pause:3.1

# docker pull registry.docker-cn.com/anjia0532/google-containers.kube-proxy-amd64:v1.11.1

# docker pull registry.docker-cn.com/anjia0532/google-containers.pause:3.1

# docker tag registry.docker-cn.com/anjia0532/google-containers.kube-proxy-amd64:v1.11.1 k8s.gcr.io/kube-proxy-amd64:v1.11.1

# docker tag registry.docker-cn.com/anjia0532/google-containers.pause:3.1 k8s.gcr.io/pause:3.1

Weave images can also been prefetched:

# docker images
REPOSITORY                                                            TAG                 IMAGE ID            CREATED             SIZE
docker.io/weaveworks/weave-npc                                        2.4.0               647ad6d59818        4 days ago          49.1 MB
docker.io/weaveworks/weave-kube                                       2.4.0               86ff1a48ce14        4 days ago          131 MB

# docker images

REPOSITORY TAG IMAGE ID CREATED SIZE

docker.io/weaveworks/weave-npc 2.4.0 647ad6d59818 4 days ago 49.1 MB

docker.io/weaveworks/weave-kube 2.4.0 86ff1a48ce14 4 days ago 131 MB

Join the node to our Kubernetes master by running the command line in the kubeadm init output:

# kubeadm join 192.168.11.235:6443 --token d624xt.3eqs1udhr26w5luh --discovery-token-ca-cert-hash sha256:a0088da6f0445370457b1aeede4856caa580280e58cdea974f1110bc1b7d4a63

1	# kubeadm join 192.168.11.235:6443 --token d624xt.3eqs1udhr26w5luh --discovery-token-ca-cert-hash sha256:a0088da6f0445370457b1aeede4856caa580280e58cdea974f1110bc1b7d4a63

3. Verify Kubernetes cluster status

Verify nodes with:

# kubectl get nodes -o wide
NAME        STATUS    ROLES     AGE       VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME
cdh-node1   Ready     master    43m       v1.11.1   192.168.11.235           CentOS Linux 7 (Core)   3.10.0-862.el7.x86_64   docker://1.13.1
cdh-node2   Ready         41m       v1.11.1   192.168.11.236           CentOS Linux 7 (Core)   3.10.0-862.el7.x86_64   docker://1.13.1

# kubectl get nodes -o wide

NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME

cdh-node1 Ready master 43m v1.11.1 192.168.11.235 CentOS Linux 7 (Core) 3.10.0-862.el7.x86_64 docker://1.13.1

cdh-node2 Ready 41m v1.11.1 192.168.11.236 CentOS Linux 7 (Core) 3.10.0-862.el7.x86_64 docker://1.13.1

Verify internal pods with:

# kubectl get pods -o wide --all-namespaces
NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE       IP               NODE
kube-system   coredns-78fcdf6894-f4qsc                1/1       Running   0          43m       10.32.0.3        cdh-node1
kube-system   coredns-78fcdf6894-rx48j                1/1       Running   0          43m       10.32.0.2        cdh-node1
kube-system   etcd-cdh-node1                          1/1       Running   0          42m       192.168.11.235   cdh-node1
kube-system   kube-apiserver-cdh-node1                1/1       Running   0          42m       192.168.11.235   cdh-node1
kube-system   kube-controller-manager-cdh-node1       1/1       Running   0          42m       192.168.11.235   cdh-node1
kube-system   kube-proxy-f46pz                        1/1       Running   0          43m       192.168.11.235   cdh-node1
kube-system   kube-proxy-zrdm5                        1/1       Running   0          41m       192.168.11.236   cdh-node2
kube-system   kube-scheduler-cdh-node1                1/1       Running   0          42m       192.168.11.235   cdh-node1
kube-system   weave-net-l6z9q                         2/2       Running   1          41m       192.168.11.236   cdh-node2
kube-system   weave-net-tkfrs                         2/2       Running   0          42m       192.168.11.235   cdh-node1

# kubectl get pods -o wide --all-namespaces

NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE

kube-system coredns-78fcdf6894-f4qsc 1/1 Running 0 43m 10.32.0.3 cdh-node1

kube-system coredns-78fcdf6894-rx48j 1/1 Running 0 43m 10.32.0.2 cdh-node1

kube-system etcd-cdh-node1 1/1 Running 0 42m 192.168.11.235 cdh-node1

kube-system kube-apiserver-cdh-node1 1/1 Running 0 42m 192.168.11.235 cdh-node1

kube-system kube-controller-manager-cdh-node1 1/1 Running 0 42m 192.168.11.235 cdh-node1

kube-system kube-proxy-f46pz 1/1 Running 0 43m 192.168.11.235 cdh-node1

kube-system kube-proxy-zrdm5 1/1 Running 0 41m 192.168.11.236 cdh-node2

kube-system kube-scheduler-cdh-node1 1/1 Running 0 42m 192.168.11.235 cdh-node1

kube-system weave-net-l6z9q 2/2 Running 1 41m 192.168.11.236 cdh-node2

kube-system weave-net-tkfrs 2/2 Running 0 42m 192.168.11.235 cdh-node1

If the status of a pod is not Running, get the detailed info from:

# kubectl describe pod weave-net-l6z9q -n kube-system

1	# kubectl describe pod weave-net-l6z9q -n kube-system

If something goes wrong, and you cannot restore from it, simply reset the master/node:

# kubeadm reset

1	# kubeadm reset

4. Install Kubernetes Dashboard

By default, all user pods are allocated on Kubernetes nodes(slaves). Pull the dashboard image in advance on the node machine:

# docker pull registry.docker-cn.com/anjia0532/google-containers.kubernetes-dashboard-amd64:v1.8.3
# docker tag registry.docker-cn.com/anjia0532/google-containers.kubernetes-dashboard-amd64:v1.8.3 k8s.gcr.io/kubernetes-dashboard-amd64:v1.8.3

1 2	# docker pull registry.docker-cn.com/anjia0532/google-containers.kubernetes-dashboard-amd64:v1.8.3 # docker tag registry.docker-cn.com/anjia0532/google-containers.kubernetes-dashboard-amd64:v1.8.3 k8s.gcr.io/kubernetes-dashboard-amd64:v1.8.3

Install with alternative setup, since recommended setup is not so friendly in a development envronment:

# kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.8.3/src/deploy/alternative/kubernetes-dashboard.yaml

1	# kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.8.3/src/deploy/alternative/kubernetes-dashboard.yaml

Refer here for remote access:

# kubectl -n kube-system edit service kubernetes-dashboard

1	# kubectl -n kube-system edit service kubernetes-dashboard

Change type: ClusterIP to type: NodePort and save file. Next we need to check port on which Dashboard was exposed.

# kubectl -n kube-system get service kubernetes-dashboard
NAME                   TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
kubernetes-dashboard   NodePort   10.106.156.70           80:31023/TCP   34m

# kubectl -n kube-system get service kubernetes-dashboard

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE

kubernetes-dashboard NodePort 10.106.156.70 80:31023/TCP 34m

Now, you can access with: http://<master-ip>:31023/.
You can grant admin grant full admin privileges to Dashboard’s Service Account in the development environment for convenience:

# cat dashboard-admin.yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system
# kubectl create -f dashboard-admin.yaml

# cat dashboard-admin.yaml

apiVersion: rbac.authorization.k8s.io/v1beta1

kind: ClusterRoleBinding

metadata:

labels:

k8s-app: kubernetes-dashboard

roleRef:

apiGroup: rbac.authorization.k8s.io

kind: ClusterRole

subjects:

- kind: ServiceAccount

namespace: kube-system

# kubectl create -f dashboard-admin.yaml

5. Troubleshoting

In my office environment, errors occur and the coredns are always in CrashLoopBackOff status:

Failed to list *api.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

1	Failed to list *api.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout

I Googled a lot, read answers from Stackoverflow and Github, reset iptables/docker/kubernetes, but still failed to solve it. There ARE unresolved issues like #60315. So I tried to switch to flannel network instead of weave. First, Kubernetes and weave need to be reset:

# curl -L git.io/weave -o /usr/local/bin/weave
# chmod a+x /usr/local/bin/weave
# weave reset
# kubeadm reset

# curl -L git.io/weave -o /usr/local/bin/weave

# chmod a+x /usr/local/bin/weave

# weave reset

# kubeadm reset

This time, initialize kubeadm and network with:

# kubeadm init --kubernetes-version=v1.11.1 --pod-network-cidr=10.244.0.0/16
# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.10.0/Documentation/kube-flannel.yml

1 2	# kubeadm init --kubernetes-version=v1.11.1 --pod-network-cidr=10.244.0.0/16 # kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.10.0/Documentation/kube-flannel.yml

The flannel image can be pulled first:

# docker pull quay.io/coreos/flannel:v0.10.0-amd64

1	# docker pull quay.io/coreos/flannel:v0.10.0-amd64

Everything works. Also referred here.

Updated May 7, 2019: Kubernetes 1.13 finally add a command line switch to use an alternative repository. Simply run kubeadm with:

# kubeadm init --kubernetes-version=v1.14.1 --image-repository registry.aliyuncs.com/google_containers

1	# kubeadm init --kubernetes-version=v1.14.1 --image-repository registry.aliyuncs.com/google_containers

And verify with docker images.

Updated May 10, 2019: If using Ubuntu/Linuxmint, add repository with:

# curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
# cat << EOF > /etc/apt/sources.list.d/kubernetes.list 
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main 
EOF

# curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -

# cat << EOF > /etc/apt/sources.list.d/kubernetes.list

deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main

EOF

Updated June 3, 2019: flannel seems to have a close version dependency on kubernetes version. When deploying kubernetes 1.14, a specific git version should be used, according to the official document:

# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/62e44c867a2846fefb68bd5f178daf4da3095ccb/Documentation/kube-flannel.yml

1	# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/62e44c867a2846fefb68bd5f178daf4da3095ccb/Documentation/kube-flannel.yml

Updated Jan 11, 2022: Just deployed a new cluster with docker 20.10.12 & kubernetes 1.23.1.
1. kubeadm defaults to systemd, instead of cgroupfs as the container runtime cgroup driver. In docker case, edit /etc/docker/daemon.json, and restart docker service:

{
  "exec-opts": ["native.cgroupdriver=systemd"]
}

{

"exec-opts": ["native.cgroupdriver=systemd"]

}

2. flannel script updated:

# flannel v0.16.1
# kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml

1 2	# flannel v0.16.1 # kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml

3. kubernetes dashboard script updated:

# kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.4.0/aio/deploy/recommended.yaml

1	# kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.4.0/aio/deploy/recommended.yaml

The recommended configuration enables HTTPS, and an auto-generated certificate is used. Now follow the document to create an admin user and get the login token: https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/creating-sample-user.md. Get the token with:

# kubectl -n kubernetes-dashboard get secret $(kubectl -n kubernetes-dashboard get sa/admin-user -o jsonpath="{.secrets[0].name}") -o go-template="{{.data.token | base64decode}}"

1	# kubectl -n kubernetes-dashboard get secret $(kubectl -n kubernetes-dashboard get sa/admin-user -o jsonpath="{.secrets[0].name}") -o go-template="{{.data.token \| base64decode}}"

Solving iBooks Not Syncing in macOS

March 1, 2018 by gonwan·0 Comments

As a note here:

Go to Menu –> Store –> Check for Available Downloads, to refresh your iBooks login manually. Also make sure the iCloud option for iBooks is enabled in settings.

Updated July 4, 2022: On MacOS 12, Go to Settings –> Apple ID —> iCloud Drive, disable and re-enable the iBooks sync. If this does not work, logout and re-login the Apple ID.

Quest for Better Replication in MySQL: Galera vs. Group Replication

January 1, 2018 by gonwan·0 Comments

Original post: https://www.percona.com/blog/2017/02/24/battle-for-synchronous-replication-in-mysql-galera-vs-group-replication/

UPDATE: Some of the language in the original post was considered overly-critical of Oracle by some community members. This was not my intent, and I’ve modified the language to be less so. I’ve also changed term “synchronous” (which the use of is inaccurate and misleading) to “virtually synchronous.” This term is more accurate and already used by both technologies’ founders, and should be less misleading.

I also wanted to thank Jean-François Gagné for pointing out the incorrect sentence about multi-threaded slaves in Group Replication, which I also corrected accordingly.

In today’s blog post, I will briefly compare two major virtually synchronous replication technologies available today for MySQL.

More Than Asynchronous Replication

Thanks to the Galera plugin, founded by the Codership team, we’ve had the choice between asynchronous and virtually synchronous replication in the MySQL ecosystem for quite a few years already. Moreover, we can choose between at least three software providers: Codership, MariaDB and Percona, each with its own Galera implementation.

The situation recently became much more interesting when MySQL Group Replication went into GA (stable) stage in December 2016.

Oracle, the upstream MySQL provider, introduced its own replication implementation that is very similar in concept. Unlike the others mentioned above, it isn’t based on Galera. Group Replication was built from the ground up as a new solution. MySQL Group Replication shares many very similar concepts to Galera. This post doesn’t cover MySQL Cluster, another and fully-synchronous solution, that existed much earlier then Galera — it is a much different solution for different use cases.

In this post, I will point out a couple of interesting differences between Group Replication and Galera, which hopefully will be helpful to those considering switching from one to another (or if they are planning to test them).

This is certainly not a full list of all the differences, but rather things I found interesting during my explorations.

It is also important to know that Group Replication has evolved a lot before it went GA (its whole cluster layer was replaced). I won’t mention how things looked before the GA stage, and will just concentrate on latest available 5.7.17 version. I will not spend too much time on how Galera implementations looked in the past, and will use Percona XtraDB Cluster 5.7 as a reference.

Multi-Master vs. Master-Slave

Galera has always been multi-master by default, so it does not matter to which node you write. Many users use a single writer due to workload specifics and multi-master limitations, but Galera has no single master mode per se.

Group Replication, on the other hand, promotes just one member as primary (master) by default, and other members are put into read-only mode automatically. This is what happens if we try to change data on non-master node:

mysql> truncate test.t1;
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement

1 2	mysql> truncate test.t1; ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement

To change from single primary mode to multi-primary (multi-master), you have to start group replication with the group_replication_single_primary_mode variable disabled.
Another interesting fact is you do not have any influence on which cluster member will be the master in single primary mode: the cluster auto-elects it. You can only check it with a query:

mysql> SELECT * FROM performance_schema.global_status WHERE VARIABLE_NAME like 'group_replication%';
+----------------------------------+--------------------------------------+
| VARIABLE_NAME                    | VARIABLE_VALUE                       |
+----------------------------------+--------------------------------------+
| group_replication_primary_member | 329333cd-d6d9-11e6-bdd2-0242ac130002 |
+----------------------------------+--------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM performance_schema.global_status WHERE VARIABLE_NAME like 'group_replication%';

+----------------------------------+--------------------------------------+

| VARIABLE_NAME | VARIABLE_VALUE |

+----------------------------------+--------------------------------------+

| group_replication_primary_member | 329333cd-d6d9-11e6-bdd2-0242ac130002 |

+----------------------------------+--------------------------------------+

1 row in set (0.00 sec)

Or just:

mysql> show status like 'group%';
+----------------------------------+--------------------------------------+
| Variable_name                    | Value                                |
+----------------------------------+--------------------------------------+
| group_replication_primary_member | 329333cd-d6d9-11e6-bdd2-0242ac130002 |
+----------------------------------+--------------------------------------+
1 row in set (0.01 sec)

mysql> show status like 'group%';

+----------------------------------+--------------------------------------+

| Variable_name | Value |

+----------------------------------+--------------------------------------+

| group_replication_primary_member | 329333cd-d6d9-11e6-bdd2-0242ac130002 |

+----------------------------------+--------------------------------------+

1 row in set (0.01 sec)

To show the hostname instead of UUID, here:

mysql> select member_host as "primary master" from performance_schema.global_status join performance_schema.replication_group_members where variable_name='group_replication_primary_member' and member_id=variable_value;
+----------------+
| primary master |
+----------------+
| f18ff539956d   |
+----------------+
1 row in set (0.00 sec)

mysql> select member_host as "primary master" from performance_schema.global_status join performance_schema.replication_group_members where variable_name='group_replication_primary_member' and member_id=variable_value;

+----------------+

| primary master |

+----------------+

| f18ff539956d |

+----------------+

1 row in set (0.00 sec)

Replication: Majority vs. All

Galera delivers write transactions synchronously to ALL nodes in the cluster. (Later, applying happens asynchronously in both technologies.) However, Group Replication needs just a majority of the nodes confirming the transaction. This means a transaction commit on the writer succeeds and returns to the client even if a minority of nodes still have not received it.

In the example of a three-node cluster, if one node crashes or loses the network connection, the two others continue to accept writes (or just the primary node in Single-Primary mode) even before a faulty node is removed from the cluster.

If the separated node is the primary one, it denies writes due to the lack of a quorum (it will report the error ERROR 3101 (HY000): Plugin instructed the server to rollback the current transaction.). If one of the nodes receives a quorum, it will be elected to primary after the faulty node is removed from the cluster, and will then accept writes.

With that said, the “majority” rule in Group Replication means that there isn’t a guarantee that you won’t lose any data if the majority nodes are lost. There is a chance these could apply some transactions that aren’t delivered to the minority at the moment they crash.

In Galera, a single node network interruption makes the others wait for it, and pending writes can be committed once either the connection is restored or the faulty node removed from cluster after the timeout. So the chance of losing data in a similar scenario is lower, as transactions always reach all nodes. Data can be lost in Percona XtraDB Cluster only in a really bad luck scenario: a network split happens, the remaining majority of nodes form a quorum, the cluster reconfigures and allows new writes, and then shortly after the majority part is damaged.

Schema Requirements

For both technologies, one of the requirements is that all tables must be InnoDB and have a primary key. This requirement is now enforced by default in both Group Replication and Percona XtraDB Cluster 5.7. Let’s look at the differences.

Percona XtraDB Cluster:

mysql> create table nopk (a char(10));
Query OK, 0 rows affected (0.08 sec)

mysql> insert into nopk values ("aaa");
ERROR 1105 (HY000): Percona-XtraDB-Cluster prohibits use of DML command on a table (test.nopk) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER

mysql> create table m1 (id int primary key) engine=myisam;
Query OK, 0 rows affected (0.02 sec)

mysql> insert into m1 values(1);
ERROR 1105 (HY000): Percona-XtraDB-Cluster prohibits use of DML command on a table (test.m1) that resides in non-transactional storage engine with pxc_strict_mode = ENFORCING or MASTER

mysql> set global pxc_strict_mode=0;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into nopk values ("aaa");
Query OK, 1 row affected (0.00 sec)

mysql> insert into m1 values(1);
Query OK, 1 row affected (0.00 sec)

mysql> create table nopk (a char(10));

Query OK, 0 rows affected (0.08 sec)

mysql> insert into nopk values ("aaa");

ERROR 1105 (HY000): Percona-XtraDB-Cluster prohibits use of DML command on a table (test.nopk) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER

mysql> create table m1 (id int primary key) engine=myisam;

Query OK, 0 rows affected (0.02 sec)

mysql> insert into m1 values(1);

ERROR 1105 (HY000): Percona-XtraDB-Cluster prohibits use of DML command on a table (test.m1) that resides in non-transactional storage engine with pxc_strict_mode = ENFORCING or MASTER

mysql> set global pxc_strict_mode=0;

Query OK, 0 rows affected (0.00 sec)

mysql> insert into nopk values ("aaa");

Query OK, 1 row affected (0.00 sec)

mysql> insert into m1 values(1);

Query OK, 1 row affected (0.00 sec)

Before Percona XtraDB Cluster 5.7 (or in other Galera implementations), there were no such enforced restrictions. Users unaware of these requirements often ended up with problems.

Group Replication:

mysql> create table nopk (a char(10));
Query OK, 0 rows affected (0.04 sec)

mysql> insert into nopk values ("aaa");
ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin.

2017-01-15T22:48:25.241119Z 139 [ERROR] Plugin group_replication reported: 'Table nopk does not have any PRIMARY KEY. This is not compatible with Group Replication'

mysql> create table m1 (id int primary key) engine=myisam;
ERROR 3161 (HY000): Storage engine MyISAM is disabled (Table creation is disallowed).

mysql> create table nopk (a char(10));

Query OK, 0 rows affected (0.04 sec)

mysql> insert into nopk values ("aaa");

ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin.

2017-01-15T22:48:25.241119Z 139 [ERROR] Plugin group_replication reported: 'Table nopk does not have any PRIMARY KEY. This is not compatible with Group Replication'

mysql> create table m1 (id int primary key) engine=myisam;

ERROR 3161 (HY000): Storage engine MyISAM is disabled (Table creation is disallowed).

I am not aware of any way to disable these restrictions in Group Replication.

GTID

Galera has it’s own Global Transaction ID, which has existed since MySQL 5.5, and is independent from MySQL’s GTID feature introduced in MySQL 5.6. If MySQL’s GTID is enabled on a Galera-based cluster, both numerations exist with their own sequences and UUIDs.

Group Replication is based on a native MySQL GTID feature, and relies on it. Interestingly, a separate sequence block range (initially 1M) is pre-assigned for each cluster member.

WAN Support

The MySQL Group Replication documentation isn’t very optimistic on WAN support, claiming that both “Low latency, high bandwidth network connections are a requirement” and “Group Replication is designed to be deployed in a cluster environment where server instances are very close to each other, and is impacted by both network latency as well as network bandwidth.” These statements are found here and here. However there is network traffic optimization: Message Compression.

I don’t see group communication level tunings available yet, as we find in the Galera evs.* series of wsrep_provider_options.

Galera founders actually encourage trying it in geo-distributed environments, and some WAN-dedicated settings are available (the most important being WAN segments).

But both technologies need a reliable network for good performance.

State Transfers

Galera has two types of state transfers that allow syncing data to nodes when needed: incremental (IST) and full (SST). Incremental is used when a node has been out of a cluster for some time, and once it rejoins the other nodes has the missing write sets still in Galera cache. Full SST is helpful if incremental is not possible, especially when a new node is added to the cluster. SST automatically provisions the node with fresh data taken as a snapshot from one of the running nodes (donor). The most common SST method is using Percona XtraBackup, which takes a fast and non-blocking binary data snapshot (hot backup).

In Group Replication, state transfers are fully based on binary logs with GTID positions. If there is no donor with all of the binary logs (included the ones for new nodes), a DBA has to first provision the new node with initial data snapshot. Otherwise, the joiner will fail with a very familiar error:

2017-01-16T23:01:40.517372Z 50 [ERROR] Slave I/O for channel 'group_replication_recovery': Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.', Error_code: 1236

2017-01-16T23:01:40.517372Z 50 [ERROR] Slave I/O for channel 'group_replication_recovery': Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.', Error_code: 1236

The official documentation mentions that provisioning the node before adding it to the cluster may speed up joining (the recovery stage). Another difference is that in the case of state transfer failure, a Galera joiner will abort after the first try, and will shutdown its mysqld instance. The Group Replication joiner will then fall-back to another donor in an attempt to succeed. Here I found something slightly annoying: if no donor can satisfy joiner demands, it will still keep trying the same donors over and over, for a fixed number of attempts:

[root@cd81c1dadb18 /]# grep 'Attempt' /var/log/mysqld.log | tail
2017-01-16T22:57:38.329541Z 12 [Note] Plugin group_replication reported: 'Establishing group recovery connection with a possible donor. Attempt 1/10'
2017-01-16T22:57:38.539984Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 2/10'
2017-01-16T22:57:38.806862Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 3/10'
2017-01-16T22:58:39.024568Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 4/10'
2017-01-16T22:58:39.249039Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 5/10'
2017-01-16T22:59:39.503086Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 6/10'
2017-01-16T22:59:39.736605Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 7/10'
2017-01-16T23:00:39.981073Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 8/10'
2017-01-16T23:00:40.176729Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 9/10'
2017-01-16T23:01:40.404785Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 10/10'

[root@cd81c1dadb18 /]# grep 'Attempt' /var/log/mysqld.log | tail

2017-01-16T22:57:38.329541Z 12 [Note] Plugin group_replication reported: 'Establishing group recovery connection with a possible donor. Attempt 1/10'

2017-01-16T22:57:38.539984Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 2/10'

2017-01-16T22:57:38.806862Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 3/10'

2017-01-16T22:58:39.024568Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 4/10'

2017-01-16T22:58:39.249039Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 5/10'

2017-01-16T22:59:39.503086Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 6/10'

2017-01-16T22:59:39.736605Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 7/10'

2017-01-16T23:00:39.981073Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 8/10'

2017-01-16T23:00:40.176729Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 9/10'

2017-01-16T23:01:40.404785Z 12 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 10/10'

After the last try, even though it fails, mysqld keeps running and allows client connections…

Auto Increment Settings

Galera adjusts the auto_increment_increment and auto_increment_offset values according to the number of members in a cluster. So, for a 3-node cluster, auto_increment_increment will be “3” and auto_increment_offset from “1” to “3” (depending on the node). If a number of nodes change later, these are updated immediately. This feature can be disabled using the wsrep_auto_increment_control setting. If needed, these settings can be set manually.

Interestingly, in Group Replication the auto_increment_increment seems to be fixed at 7, and only auto_increment_offset is set differently on each node. This is the case even in the default Single-Primary mode! this seems like a waste of available IDs, so make sure that you adjust the group_replication_auto_increment_increment setting to a saner number before you start using Group Replication in production.

Multi-Threaded Slave Side Applying

Galera developed its own multi-threaded slave feature, even in 5.5 versions, for workloads that include tables in the same database. It is controlled with the wsrep_slave_threads variable. Group Replication uses a feature introduced in MySQL 5.7, where the number of applier threads is controlled with slave_parallel_workers. Galera will do multi-threaded replication based on potential conflicts of changed/locked rows. Group Replication parallelism is based on an improved LOGICAL_CLOCK scheduler, which uses information from writesets dependencies. This can allow it to achieve much better results than in normal asynchronous replication MTS mode. More details can be found here: http://mysqlhighavailability.com/zooming-in-on-group-replication-performance/

Flow Control

Both technologies use a technique to throttle writes when nodes are slow in applying them. Interestingly, the default size of the allowed applier queue in both is much different:

gcs.fc_limit (Galera) = 16 (the limit is increased automatically based on number of nodes, i.e. to 28 in 3-node cluster)
group_replication_flow_control_applier_threshold (Group Replication) = 25000.
Moreover, Group Replication provides separate certifier queue size, also eligible for the Flow Control trigger: group_replication_flow_control_certifier_threshold. One thing I found difficult, is checking the actual applier queue size, as the only exposed one via performance_schema.replication_group_member_stats is the Count_Transactions_in_queue (which only shows the certifier queue).

Network Hiccup/Partition Handling

In Galera, when the network connection between nodes is lost, those who still have a quorum will form a new cluster view. Those who lost a quorum keep trying to re-connect to the primary component. Once the connection is restored, separated nodes will sync back using IST and rejoin the cluster automatically.

This doesn’t seem to be the case for Group Replication. Separated nodes that lose the quorum will be expelled from the cluster, and won’t join back automatically once the network connection is restored. In its error log we can see:

2017-01-17T11:12:18.562305Z 0 [ERROR] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2017-01-17T11:12:18.631225Z 0 [Note] Plugin group_replication reported: 'getstart group_id ce427319'
2017-01-17T11:12:21.735374Z 0 [Note] Plugin group_replication reported: 'state 4330 action xa_terminate'
2017-01-17T11:12:21.735519Z 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-01-17T11:12:21.735527Z 0 [Note] Plugin group_replication reported: 'state 4257 action xa_exit'
2017-01-17T11:12:21.735553Z 0 [Note] Plugin group_replication reported: 'Exiting xcom thread'
2017-01-17T11:12:21.735558Z 0 [Note] Plugin group_replication reported: 'new state x_start'

2017-01-17T11:12:18.562305Z 0 [ERROR] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'

2017-01-17T11:12:18.631225Z 0 [Note] Plugin group_replication reported: 'getstart group_id ce427319'

2017-01-17T11:12:21.735374Z 0 [Note] Plugin group_replication reported: 'state 4330 action xa_terminate'

2017-01-17T11:12:21.735519Z 0 [Note] Plugin group_replication reported: 'new state x_start'

2017-01-17T11:12:21.735527Z 0 [Note] Plugin group_replication reported: 'state 4257 action xa_exit'

2017-01-17T11:12:21.735553Z 0 [Note] Plugin group_replication reported: 'Exiting xcom thread'

2017-01-17T11:12:21.735558Z 0 [Note] Plugin group_replication reported: 'new state x_start'

Its status changes to:

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 329333cd-d6d9-11e6-bdd2-0242ac130002 | f18ff539956d | 3306        | ERROR        |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM performance_schema.replication_group_members;

+---------------------------+--------------------------------------+--------------+-------------+--------------+

+---------------------------+--------------------------------------+--------------+-------------+--------------+

+---------------------------+--------------------------------------+--------------+-------------+--------------+

1 row in set (0.00 sec)

It seems the only way to bring it back into the cluster is to manually restart Group Replication:

mysql> START GROUP_REPLICATION;
ERROR 3093 (HY000): The START GROUP_REPLICATION command failed since the group is already running.
mysql> STOP GROUP_REPLICATION;
Query OK, 0 rows affected (5.00 sec)

mysql> START GROUP_REPLICATION;
Query OK, 0 rows affected (1.96 sec)

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 24d6ef6f-dc3f-11e6-abfa-0242ac130004 | cd81c1dadb18 | 3306        | ONLINE       |
| group_replication_applier | 329333cd-d6d9-11e6-bdd2-0242ac130002 | f18ff539956d | 3306        | ONLINE       |
| group_replication_applier | ae148d90-d6da-11e6-897e-0242ac130003 | 0af7a73f4d6b | 3306        | ONLINE       |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
3 rows in set (0.00 sec)

mysql> START GROUP_REPLICATION;

ERROR 3093 (HY000): The START GROUP_REPLICATION command failed since the group is already running.

mysql> STOP GROUP_REPLICATION;

Query OK, 0 rows affected (5.00 sec)

mysql> START GROUP_REPLICATION;

Query OK, 0 rows affected (1.96 sec)

mysql> SELECT * FROM performance_schema.replication_group_members;

+---------------------------+--------------------------------------+--------------+-------------+--------------+

+---------------------------+--------------------------------------+--------------+-------------+--------------+

+---------------------------+--------------------------------------+--------------+-------------+--------------+

3 rows in set (0.00 sec)

Note that in the above output, after the network failure, Group Replication did not stop. It waits in an error state. Moreover, in Group Replication a partitioned node keeps serving dirty reads as if nothing happened (for non-super users):

cd81c1dadb18 {test} ((none)) > SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST  | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 24d6ef6f-dc3f-11e6-abfa-0242ac130004 | cd81c1dadb18 | 3306        | ERROR        |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
1 row in set (0.00 sec)

cd81c1dadb18 {test} ((none)) > select * from test1.t1;
+----+-------+
| id | a     |
+----+-------+
| 1  | dasda |
| 3  | dasda |
+----+-------+
2 rows in set (0.00 sec)

cd81c1dadb18 {test} ((none)) > show grants;
+-------------------------------------------------------------------------------+
| Grants for test@%                                                             |
+-------------------------------------------------------------------------------+
| GRANT SELECT, INSERT, UPDATE, DELETE, REPLICATION CLIENT ON *.* TO 'test'@'%' |
+-------------------------------------------------------------------------------+
1 row in set (0.00 sec)

cd81c1dadb18 {test} ((none)) > SELECT * FROM performance_schema.replication_group_members;

+---------------------------+--------------------------------------+--------------+-------------+--------------+

+---------------------------+--------------------------------------+--------------+-------------+--------------+

+---------------------------+--------------------------------------+--------------+-------------+--------------+

1 row in set (0.00 sec)

cd81c1dadb18 {test} ((none)) > select * from test1.t1;

+----+-------+

| id | a |

+----+-------+

| 1 | dasda |

| 3 | dasda |

+----+-------+

2 rows in set (0.00 sec)

cd81c1dadb18 {test} ((none)) > show grants;

+-------------------------------------------------------------------------------+

| Grants for test@% |

+-------------------------------------------------------------------------------+

| GRANT SELECT, INSERT, UPDATE, DELETE, REPLICATION CLIENT ON *.* TO 'test'@'%' |

+-------------------------------------------------------------------------------+

1 row in set (0.00 sec)

A privileged user can disable super_read_only, but then it won’t be able to write:

cd81c1dadb18 {test} ((none)) > insert into test1.t1 set a="split brain";
ERROR 3100 (HY000): Error on observer while running replication hook 'before_commit'.

cd81c1dadb18 {root} ((none)) > select * from test1.t1;
+----+-------+
| id | a     |
+----+-------+
| 1 | dasda  |
| 3 | dasda  |
+----+-------+
2 rows in set (0.00 sec)

cd81c1dadb18 {test} ((none)) > insert into test1.t1 set a="split brain";

ERROR 3100 (HY000): Error on observer while running replication hook 'before_commit'.

cd81c1dadb18 {root} ((none)) > select * from test1.t1;

+----+-------+

| id | a |

+----+-------+

| 1 | dasda |

| 3 | dasda |

+----+-------+

2 rows in set (0.00 sec)

I found an interesting thing here, which I consider to be a bug. In this case, a partitioned node can actually perform DDL, despite the error:

cd81c1dadb18 {root} ((none)) > show tables in test1;
+-----------------+
| Tables_in_test1 |
+-----------------+
| nopk            |
| t1              |
+-----------------+
2 rows in set (0.01 sec)

cd81c1dadb18 {root} ((none)) > create table test1.split_brain (id int primary key);
ERROR 3100 (HY000): Error on observer while running replication hook 'before_commit'.

cd81c1dadb18 {root} ((none)) > show tables in test1;
+-----------------+
| Tables_in_test1 |
+-----------------+
| nopk            |
| split_brain     |
| t1              |
+-----------------+
3 rows in set (0.00 sec)

cd81c1dadb18 {root} ((none)) > show tables in test1;

+-----------------+

| Tables_in_test1 |

+-----------------+

| nopk |

| t1 |

+-----------------+

2 rows in set (0.01 sec)

cd81c1dadb18 {root} ((none)) > create table test1.split_brain (id int primary key);

ERROR 3100 (HY000): Error on observer while running replication hook 'before_commit'.

cd81c1dadb18 {root} ((none)) > show tables in test1;

+-----------------+

| Tables_in_test1 |

+-----------------+

| nopk |

| split_brain |

| t1 |

+-----------------+

3 rows in set (0.00 sec)

In a Galera-based cluster, you are automatically protected from that, and a partitioned node refuses to allow both reads and writes. It throws an error: ERROR 1047 (08S01): WSREP has not yet prepared node for application use. You can force dirty reads using the wsrep_dirty_reads variable.

There many more subtle (and less subtle) differences between these technologies – but this blog post is long enough already. Maybe next time 🙂

Batch Insert with MySQL

December 27, 2017 by gonwan·1 Comment

Adopting to using Spring Data JPA these day, there is a post saying: IDENTITY generator disables JDBC batch inserts. To figure out the impact, create a table with 10 data fields and an auto-increment id for testing. I am using MySQL 5.7.20 / MariaDB 10.3.3 / Spring Data JPA 1.11.8 / Hibernate 5.0.12.

CREATE TABLE `t_user` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `field1` varchar(255) DEFAULT NULL,
  `field2` varchar(255) DEFAULT NULL,
  `field3` varchar(255) DEFAULT NULL,
  `field4` varchar(255) DEFAULT NULL,
  `field5` varchar(255) DEFAULT NULL,
  `field6` varchar(255) DEFAULT NULL,
  `field7` varchar(255) DEFAULT NULL,
  `field8` varchar(255) DEFAULT NULL,
  `field9` varchar(255) DEFAULT NULL,
  `field10` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `t_user` (

`id` int(11) NOT NULL AUTO_INCREMENT,

`field1` varchar(255) DEFAULT NULL,

`field2` varchar(255) DEFAULT NULL,

`field3` varchar(255) DEFAULT NULL,

`field4` varchar(255) DEFAULT NULL,

`field5` varchar(255) DEFAULT NULL,

`field6` varchar(255) DEFAULT NULL,

`field7` varchar(255) DEFAULT NULL,

`field8` varchar(255) DEFAULT NULL,

`field9` varchar(255) DEFAULT NULL,

`field10` varchar(255) DEFAULT NULL,

PRIMARY KEY (`id`)

) ENGINE=InnoDB DEFAULT CHARSET=utf8;

And generate the persistence entity, add @GeneratedValue annotation:

package com.gonwan.spring.generated;

import javax.persistence.*;

@Entity
@Table(name = "t_user", schema = "test", catalog = "")
public class TUser {
    private int id;
    private String field1;
    private String field2;
    private String field3;
    private String field4;
    private String field5;
    private String field6;
    private String field7;
    private String field8;
    private String field9;
    private String field10;

    @Id
    @Column(name = "id", nullable = false)
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    /* mysql / table */
    //@GeneratedValue(strategy = GenerationType.TABLE, generator = "tableGenerator")
    //@TableGenerator(name = "tableGenerator", allocationSize = 100, table = "t_generator", pkColumnName = "gen_name", valueColumnName = "gen_value", pkColumnValue = "SEQ_USER")
    /* mariadb / sequence  */
    //@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "sequenceGenerator")
    //@SequenceGenerator(name = "sequenceGenerator", allocationSize = 100, sequenceName = "s_user")
    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    /* field getters/setters omitted. */

}

package com.gonwan.spring.generated;

import javax.persistence.*;

@Entity

@Table(name = "t_user", schema = "test", catalog = "")

public class TUser {

private int id;

private String field1;

private String field2;

private String field3;

private String field4;

private String field5;

private String field6;

private String field7;

private String field8;

private String field9;

private String field10;

@Id

@Column(name = "id", nullable = false)

@GeneratedValue(strategy = GenerationType.IDENTITY)

/* mysql / table */

//@GeneratedValue(strategy = GenerationType.TABLE, generator = "tableGenerator")

//@TableGenerator(name = "tableGenerator", allocationSize = 100, table = "t_generator", pkColumnName = "gen_name", valueColumnName = "gen_value", pkColumnValue = "SEQ_USER")

/* mariadb / sequence */

//@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "sequenceGenerator")

//@SequenceGenerator(name = "sequenceGenerator", allocationSize = 100, sequenceName = "s_user")

public int getId() {

return id;

}

public void setId(int id) {

this.id = id;

}

/* field getters/setters omitted. */

}

My benchmark runs to batch insert 2000 records in 1/2/4/8/16/32 concurrent threads.

1. IDENTITY

When using GenerationType.IDENTITY, result looks like:

Finished: threads=1, records_per_threads=2000, duration_in_ms=823
Finished: threads=2, records_per_threads=2000, duration_in_ms=609
Finished: threads=4, records_per_threads=2000, duration_in_ms=1188
Finished: threads=8, records_per_threads=2000, duration_in_ms=2329
Finished: threads=16, records_per_threads=2000, duration_in_ms=4577
Finished: threads=32, records_per_threads=2000, duration_in_ms=9579

Finished: threads=1, records_per_threads=2000, duration_in_ms=823

Finished: threads=2, records_per_threads=2000, duration_in_ms=609

Finished: threads=4, records_per_threads=2000, duration_in_ms=1188

Finished: threads=8, records_per_threads=2000, duration_in_ms=2329

Finished: threads=16, records_per_threads=2000, duration_in_ms=4577

Finished: threads=32, records_per_threads=2000, duration_in_ms=9579

As mentioned, Hibernate/JPA disables batch insert when using IDENTITY. Look into org.hibernate.event.internal.AbstractSaveEventListener#saveWithGeneratedId() for details. To make it clear, it DOES run faster when insert multiple entities in one transaction than in separated transactions. It saves transaction overhead, not round-trip overhead.

The generated key is eventually retrieved from java.sql.Statement#getGeneratedKeys(). And datasource-proxy is used to display the underlining SQL generated.

2. TABLE

Now switch to GenerationType.TABLE. Just uncomment the corresponding @GeneratedValue and @TableGenerator annotation. Result looks like:

Finished: threads=1, records_per_threads=2000, duration_in_ms=830
Finished: threads=2, records_per_threads=2000, duration_in_ms=854
Finished: threads=4, records_per_threads=2000, duration_in_ms=1775
Finished: threads=8, records_per_threads=2000, duration_in_ms=3479
Finished: threads=16, records_per_threads=2000, duration_in_ms=6542
Finished: threads=32, records_per_threads=2000, duration_in_ms=13768

Finished: threads=1, records_per_threads=2000, duration_in_ms=830

Finished: threads=2, records_per_threads=2000, duration_in_ms=854

Finished: threads=4, records_per_threads=2000, duration_in_ms=1775

Finished: threads=8, records_per_threads=2000, duration_in_ms=3479

Finished: threads=16, records_per_threads=2000, duration_in_ms=6542

Finished: threads=32, records_per_threads=2000, duration_in_ms=13768

To fix Hibernate deprecation warning and get better performance, add the line to application.properties:

spring.jpa.hibernate.use-new-id-generator-mappings=true

1	spring.jpa.hibernate.use-new-id-generator-mappings=true

I began to think that was the whole story for batch, and the datasource-proxy interceptor also traced down the batch SQL. But after I looked into dumped TCP packages using wireshark, I found the final SQL was still not in batch format. Say, they were in:

insert into `t_user` (field1, ...) values ('value1_1', ...);
insert into `t_user` (field1, ...) values ('value1_2', ...);
insert into `t_user` (field1, ...) values ('value1_3', ...);

insert into `t_user` (field1, ...) values ('value1_1', ...);

insert into `t_user` (field1, ...) values ('value1_2', ...);

insert into `t_user` (field1, ...) values ('value1_3', ...);

Instead of:

insert into `t_user` (field1, ...) values ('value1_1', ...), ('value1_2', ...), ('value1_3', ...);

1	insert into `t_user` (field1, ...) values ('value1_1', ...), ('value1_2', ...), ('value1_3', ...);

The latter one saves client/server round-trips and is recommended by MySQL. After adding rewriteBatchedStatements=true to my connection string, MySQL generated batch statements and result was much improved:

Finished: threads=1, records_per_threads=2000, duration_in_ms=433
Finished: threads=2, records_per_threads=2000, duration_in_ms=409
Finished: threads=4, records_per_threads=2000, duration_in_ms=708
Finished: threads=8, records_per_threads=2000, duration_in_ms=1566
Finished: threads=16, records_per_threads=2000, duration_in_ms=2926
Finished: threads=32, records_per_threads=2000, duration_in_ms=6388

Finished: threads=1, records_per_threads=2000, duration_in_ms=433

Finished: threads=2, records_per_threads=2000, duration_in_ms=409

Finished: threads=4, records_per_threads=2000, duration_in_ms=708

Finished: threads=8, records_per_threads=2000, duration_in_ms=1566

Finished: threads=16, records_per_threads=2000, duration_in_ms=2926

Finished: threads=32, records_per_threads=2000, duration_in_ms=6388

3. SEQUENCE

Last switch to GenerationType.SEQUENCE. Sequence is a new feature added in MariaDB 10.3 series. Create a sequence in MariaDB with:

CREATE SEQUENCE `s_user` START WITH 1 INCREMENT BY 100;

1	CREATE SEQUENCE `s_user` START WITH 1 INCREMENT BY 100;

Generally, the increment should match the one specified in @SequenceGenerator, at least >= allocationSize. See org.hibernate.id.enhanced.PooledOptimizer#generate().

Hibernate apparently does not support the new feature, I dealt with it by adding a new dialect:

package com.gonwan.spring;

import org.hibernate.dialect.MySQL5Dialect;

/*
 * Copied from org.hibernate.dialect.PostgreSQL81Dialect.
 */
public class MariaDB103Dialect extends MySQL5Dialect {

    @Override
    public boolean supportsSequences() {
        return true;
    }

    @Override
    public boolean supportsPooledSequences() {
        return true;
    }

    @Override
    public String getSequenceNextValString(String sequenceName) {
        return "select " + getSelectSequenceNextValString(sequenceName);
    }

    @Override
    public String getSelectSequenceNextValString(String sequenceName) {
        return "nextval (`" + sequenceName + "`)";
    }

}

package com.gonwan.spring;

import org.hibernate.dialect.MySQL5Dialect;

* Copied from org.hibernate.dialect.PostgreSQL81Dialect.

public class MariaDB103Dialect extends MySQL5Dialect {

@Override

public boolean supportsSequences() {

return true;

}

@Override

public boolean supportsPooledSequences() {

return true;

}

@Override

public String getSequenceNextValString(String sequenceName) {

return "select " + getSelectSequenceNextValString(sequenceName);

}

@Override

public String getSelectSequenceNextValString(String sequenceName) {

return "nextval (`" + sequenceName + "`)";

}

And add configuration:

spring.jpa.properties.hibernate.dialect=com.gonwan.spring.MariaDB103Dialect

1	spring.jpa.properties.hibernate.dialect=com.gonwan.spring.MariaDB103Dialect

supportsSequences() adds the sequence support. supportsPooledSequences() adds some pool-like optimization both supported by MariaDB and Hibernate. Otherwise, Hibernate uses tables to mimic sequences. Refer to org.hibernate.id.enhanced.SequenceStyleGenerator#buildDatabaseStructure(). Result with and without batch:

# without batch
Finished: threads=1, records_per_threads=2000, duration_in_ms=723
Finished: threads=2, records_per_threads=2000, duration_in_ms=615
Finished: threads=4, records_per_threads=2000, duration_in_ms=1147
Finished: threads=8, records_per_threads=2000, duration_in_ms=2195
Finished: threads=16, records_per_threads=2000, duration_in_ms=4687
Finished: threads=32, records_per_threads=2000, duration_in_ms=9312
# with batch
Finished: threads=1, records_per_threads=2000, duration_in_ms=298
Finished: threads=2, records_per_threads=2000, duration_in_ms=155
Finished: threads=4, records_per_threads=2000, duration_in_ms=186
Finished: threads=8, records_per_threads=2000, duration_in_ms=356
Finished: threads=16, records_per_threads=2000, duration_in_ms=695
Finished: threads=32, records_per_threads=2000, duration_in_ms=1545

# without batch

Finished: threads=1, records_per_threads=2000, duration_in_ms=723

Finished: threads=2, records_per_threads=2000, duration_in_ms=615

Finished: threads=4, records_per_threads=2000, duration_in_ms=1147

Finished: threads=8, records_per_threads=2000, duration_in_ms=2195

Finished: threads=16, records_per_threads=2000, duration_in_ms=4687

Finished: threads=32, records_per_threads=2000, duration_in_ms=9312

# with batch

Finished: threads=1, records_per_threads=2000, duration_in_ms=298

Finished: threads=2, records_per_threads=2000, duration_in_ms=155

Finished: threads=4, records_per_threads=2000, duration_in_ms=186

Finished: threads=8, records_per_threads=2000, duration_in_ms=356

Finished: threads=16, records_per_threads=2000, duration_in_ms=695

Finished: threads=32, records_per_threads=2000, duration_in_ms=1545

Dramatically improved when compared to the table generator. A sequence generator uses cache in memory(default 1000), and is optimized to eliminate lock when generating IDs.

4. Summary

	1 thread	2 threads	4 threads	8 threads	16 threads	32 threads
IDENTITY	823	609	1188	2329	4577	9579
TABLE	830	854	1775	3479	6542	13768
TABLE with batch	433	409	708	1566	2926	6388
SEQUENCE	723	615	1147	2195	4687	9312
SEQUENCE with batch	298	155	186	356	695	1545

From the summary table, IDENTITY is simplest. TABLE is a compromise to support batch insert. And SEQUENCE yields the best performance. Find the entire project in Github.