Vince Yuan's blog (ABANDONED, please visit http://vinceyuan.github.io ): 2015

Friday, September 04, 2015

I migrated my blogs to http://vinceyuan.github.io

This site is ABANDONED. Please visit http://vinceyuan.github.io.

Monday, June 08, 2015

Managing Docker images with tags

When you rebuild Docker images many times, managing them with tags becomes essential.
If you build an image without a tag, the default tag 'latest' is applied automatically.
docker build -t company/myimage .
The complete name is company/myimage:latest.

When you need to release a new version of company/myimage, you don't need to create a separate image such as company/myimage_v2. Build an image with the same name and a new tag.
docker build -t company/myimage:v2 .
Then point latest at it:
docker tag -f company/myimage:v2 company/myimage:latest

Now your new image has two tags: v2 and latest. When you start a container from company/myimage without specifying a tag, company/myimage:latest is used.
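
The default-tag rule can be illustrated with a small sketch. The function name parseImageRef is made up for this example and is not part of Docker; it ignores digest references and only shows how a missing tag falls back to latest.

```javascript
// Hypothetical helper, not part of Docker: splits an image reference
// into repository and tag, applying the default tag "latest".
function parseImageRef(ref) {
  const lastSlash = ref.lastIndexOf('/');
  const lastColon = ref.lastIndexOf(':');
  // A colon before the last slash belongs to a registry port
  // (for example a hypothetical localhost:5000/myimage), not to a tag.
  if (lastColon > lastSlash) {
    return { repository: ref.slice(0, lastColon), tag: ref.slice(lastColon + 1) };
  }
  return { repository: ref, tag: 'latest' };
}

console.log(parseImageRef('company/myimage'));
console.log(parseImageRef('company/myimage:v2'));
```

Both calls return the same repository; only the tag differs, which is why v2 and latest can point at the same image.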

Saturday, May 30, 2015

Docker Tricks and Tips

Recently I deployed my node.js app, Redis, Postgres and Nginx with Docker and wrote a tutorial. I want to share some tips and tricks about Docker.

Choose Debian as the base image.

The Debian image is much smaller than most other OS images, and it is recommended in Docker's official best practices. One drawback is that the packages in Debian's apt repositories are not always the latest, because Debian favors very stable versions. You may need to spend some time installing the versions you need.

Choose the same base image for all your own images.

If your images are based on the same base image, you save a lot of disk space, because they share a single copy of the base image on disk.

Build images for each step.

If your node.js app needs to access Redis and Postgres, you'd better create a Redis client image, a Postgres client image, a node.js image, and your app image, each based on the previous one. If you have two node.js apps, you don't need to install the Redis and Postgres clients and node.js again; just use the node.js image as the base.

Put your important data on the host instead of in the container.

If a container is deleted, all files inside it are deleted too. Use a volume for important data.

Be very careful with images whose Dockerfile has VOLUME.

If you use such an image without mapping a folder of the host, the container creates a volume under /var/lib/docker/vfs/dir/. When the container is deleted, that volume is not deleted automatically. If a lot of data is stored there, it can use up your disk space. You can install this useful tool https://github.com/cpuguy83/docker-volumes on your host to find and remove the dangling volumes.

You can run docker without sudo.

# Add theuser to docker group to run docker as a non-root user
# MUST log out and log in again for it to take effect
usermod -aG docker theuser

Delete the unnecessary files in the Dockerfile to save disk space.

For example,


FROM myredisclient
RUN apt-get update \
 && apt-get install -y wget \
 && echo 'deb http://apt.postgresql.org/pub/repos/apt/ wheezy-pgdg main' >> /etc/apt/sources.list.d/pgdg.list \
 && wget --no-check-certificate --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - \
 && apt-get update \
 && apt-get install -y --force-yes postgresql-client \
 && apt-get clean \
 && apt-get autoremove \
 && rm -rf /var/lib/apt/lists/*

Friday, May 29, 2015

Restoring/Backing up Postgres Database in a Docker Container

In the previous tutorial, I showed how to deploy a web app, Redis, Postgres and Nginx with Docker. This post shows how to restore and back up a Postgres database that is running in a Docker container. You can get all source code at https://github.com/vinceyuan/DockerizingWebAppTutorial.

I use pg_dump to dump a database to a text file, then compress it and upload it to Amazon S3 with s3cmd, a handy command line tool. I can't provide an S3 access_key and secret_key in this tutorial, so you cannot actually run it until you have your own.

Let's install s3cmd on the host.
apt-get install -y s3cmd
Run s3cmd --configure and enter your S3 access_key and secret_key. It creates .s3cfg in your home directory.

Restore

In the previous tutorial, after running Postgres in a container, we need to restore the database from a backup. First, get the backup.
cd /mydata && mkdir db_restore && cd db_restore
List the backup files in your S3 bucket.
s3cmd ls s3://db_dumps/
Download a backup file and unzip it.
s3cmd get s3://db_dumps/dump2015-05-22T09:15:13+0800.txt.gz
gunzip dump2015-05-22T09\:15\:13+0800.txt.gz
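
If the bucket holds many dumps, picking the newest one can be automated. The sketch below is only an illustration: latestDump is a hypothetical helper, and the file names other than the one used in this post are invented. It relies on the dump names using a zero-padded timestamp with a fixed +0800 offset, so sorting the names as strings also sorts them by time.

```javascript
// Hypothetical helper: choose the newest dump from a list of object names.
// Names are expected to look like dump2015-05-22T09:15:13+0800.txt.gz.
// The timestamp is zero-padded and the offset is fixed, so string order
// equals chronological order.
function latestDump(names) {
  const pattern = /^dump\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\+0800\.txt\.gz$/;
  const dumps = names.filter((name) => pattern.test(name)).sort();
  return dumps.length > 0 ? dumps[dumps.length - 1] : null;
}

console.log(latestDump([
  'dump2015-05-15T10:00:02+0800.txt.gz', // invented example name
  'dump2015-05-22T09:15:13+0800.txt.gz',
  'notes.txt',                           // ignored: not a dump
]));
```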

Run a container with Postgres client to restore db from dump.

docker run -d --name myredispgclient --link mypostgres:postgres -v /mydata/db_restore/:/tmp/ -i -t myredispgclient /bin/bash
If it does not show the console of the container, run this to get it:

# docker exec -i -t myredispgclient bash
$ env | grep POSTGRES_
$ psql -h $POSTGRES_PORT_5432_TCP_ADDR -p $POSTGRES_PORT_5432_TCP_PORT -U postgres
> \l # List all databases. Make sure mynodeappdb exists
> \q # Quit
$ psql -h $POSTGRES_PORT_5432_TCP_ADDR -p $POSTGRES_PORT_5432_TCP_PORT -U postgres -d mynodeappdb < /tmp/dump2015-05-22T09\:15\:13+0800.txt
$ exit # Exit the console of the container
# docker stop myredispgclient

Backup

Let's build the mys3cmd image, which has s3cmd 1.0 installed. The latest version is 1.5.2, but 1.5.2 fails to upload big files for me, so I have to use 1.0. This is the Dockerfile. It is based on the myredispgclient image.


FROM myredispgclient

RUN apt-get update \
 && wget -O- -q http://s3tools.org/repo/deb-all/stable/s3tools.key | apt-key add - \
 && wget -O/etc/apt/sources.list.d/s3tools.list http://s3tools.org/repo/deb-all/stable/s3tools.list \
 && apt-get update \
 && apt-get install -y --force-yes --no-install-recommends s3cmd=1.0.0-4 \
 && apt-get clean \
 && apt-get autoremove \
 && rm -rf /var/lib/apt/lists/*

Build it.
cd /DockerizingWebAppTutorial/dockerfiles/mys3cmd
docker build -t mys3cmd .

Let's build the mydbbackup2s3 image for backing up. It took me many attempts to write a working Dockerfile for mydbbackup2s3. That's why I created the mys3cmd image: with it, s3cmd does not have to be downloaded and installed every time I rebuild mydbbackup2s3. In the Dockerfile of mydbbackup2s3, we copy .pgpass and .s3cfg to /root/ (you need to provide .s3cfg yourself). .pgpass stores the password of the database, and we have to chmod 600 it. dump_db_and_upload.sh is a script I wrote to dump, compress, and upload the backup to S3.


FROM mys3cmd

COPY .pgpass /root/
COPY .s3cfg /root/
COPY dump_db_and_upload.sh /root/
RUN chmod 600 /root/.pgpass

VOLUME /db_dumps

CMD ["/root/dump_db_and_upload.sh"]

The first 'postgres' in .pgpass is the host name, which here is the alias of the linked container.
postgres:5432:mynodeappdb:postgres:postgres
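
To make the fields of that line explicit, here is a small sketch that splits it according to the PostgreSQL password file format (hostname:port:database:username:password). parsePgpassLine is a made-up name for this illustration, and the sketch does not handle passwords that contain escaped colons.

```javascript
// Field order per the PostgreSQL password file format:
// hostname:port:database:username:password
// Hypothetical helper for illustration; escaped colons are not handled.
function parsePgpassLine(line) {
  const [hostname, port, database, username, password] = line.trim().split(':');
  return { hostname, port: Number(port), database, username, password };
}

const entry = parsePgpassLine('postgres:5432:mynodeappdb:postgres:postgres');
console.log(entry.hostname); // the link alias, not an IP address
console.log(entry.database);
```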

dump_db_and_upload.sh


#!/bin/bash
# A script to dump the db, compress it, and upload the file to S3.
# Make it executable first, e.g. 'chmod 777 dump_db_and_upload.sh'
FILENAME=$(TZ=Asia/Hong_Kong date +"dump%Y-%m-%dT%H:%M:%S+0800.txt.gz")
FULLDIR="/db_dumps/"
FULLPATH="$FULLDIR$FILENAME"
S3PATH="s3://db_dumps/"
echo "Begin to dump mynodeappdb to $FULLPATH"
# We don't use $POSTGRES_PORT_5432_TCP_ADDR for the host, but the link name postgres.
# $POSTGRES_PORT_5432_TCP_ADDR will change, but the link name postgres does not change.
# We also use the link name postgres in .pgpass
pg_dump -h postgres -U postgres mynodeappdb | gzip > "$FULLPATH"
echo "Done"
echo "Begin to upload the dump to $S3PATH"
s3cmd put "$FULLPATH" "$S3PATH"
echo "Done"
echo "Delete the local dump"
rm "$FULLPATH"
echo "Finished dump and upload"
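
For readers who want to see how the FILENAME line in the script builds its name, here is a rough JavaScript equivalent. dumpFilename is a hypothetical function written only for this illustration; it assumes Hong Kong time is always UTC+8, which is what the hard-coded +0800 in the script implies.

```javascript
// Hypothetical equivalent of the FILENAME line in the shell script.
// Hong Kong time is UTC+8 with no daylight saving, so shifting the
// instant by 8 hours and formatting it as UTC gives the local wall time.
function dumpFilename(date) {
  const hkt = new Date(date.getTime() + 8 * 60 * 60 * 1000);
  const stamp = hkt.toISOString().slice(0, 19); // YYYY-MM-DDTHH:MM:SS
  return 'dump' + stamp + '+0800.txt.gz';
}

// 01:15:13 UTC on 22 May 2015 is 09:15:13 in Hong Kong.
console.log(dumpFilename(new Date(Date.UTC(2015, 4, 22, 1, 15, 13))));
```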

Build the image.
cd /DockerizingWebAppTutorial/dockerfiles/mydbbackup2s3
docker build -t mydbbackup2s3 .

Run the container. We don't add --restart=always here.
docker run -d --name mydbbackup2s3 --link mypostgres:postgres -v /mydata/db_dumps:/db_dumps mydbbackup2s3

It should dump the database and upload the backup immediately. The container exits when the backup is done.

Auto-backup

We should back up the database regularly and automatically. Let's create a cron job by running this command:
crontab -e

Enter the following lines, then save and quit. The cron job starts the mydbbackup2s3 container to back up the database every Tuesday.


#MIN (0-59)  HOUR (0-23)  DoM (1-31)  MONTH (1-12)  DoW (0-7)     CMD
#Dump mynodeappdb and upload to S3 every Tuesday at 10:00 Hong Kong time
     0            2           *           *            2          docker start mydbbackup2s3
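
The HOUR field is 2 although the comment says 10:00. That works out if the host's clock runs on UTC, which is an assumption here (the post does not state the host's time zone). A short sketch of the arithmetic, with hktHourToUtc as a made-up helper name:

```javascript
// Convert an hour in Hong Kong time (UTC+8, no daylight saving) to UTC.
// Returns the UTC hour and how many days the date shifts (0 or -1).
// Assumption: the cron daemon on the host evaluates times in UTC.
function hktHourToUtc(hourHkt) {
  const HKT_OFFSET_HOURS = 8;
  const shifted = hourHkt - HKT_OFFSET_HOURS;
  return shifted >= 0
    ? { hour: shifted, dayShift: 0 }
    : { hour: shifted + 24, dayShift: -1 };
}

console.log(hktHourToUtc(10)); // { hour: 2, dayShift: 0 }
```

For Hong Kong hours before 08:00 the UTC time falls on the previous day, so the day-of-week field would have to move back by one as well.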

Here are some tips about Docker.

Deploying a Web App, Redis, Postgres and Nginx with Docker

This tutorial introduces how to deploy a web app, Redis, Postgres and Nginx with Docker on the same server. In this tutorial, the web app is a node.js(express) app. We use Redis as a cache store, Postgres as the database, and Nginx as the reverse proxy server. You can get all source code at https://github.com/vinceyuan/DockerizingWebAppTutorial.

Why Docker

Docker is a container-based virtualization technology. The feature I like most is resource isolation. The traditional way of building a (low-traffic) website is to install the web app, cache, database and Nginx directly on one server. It is then hard to change the settings or the content much, because everything lives in the same environment and changing one thing may affect the others. With Docker, we can put each service in its own container. This keeps the host server very clean, and we can easily create, delete, change and re-create containers.

Install Docker on the host

Docker runs on 64-bit Linux only. If your Linux is 32-bit, you have to re-install a 64-bit version. My original OS was 32-bit CentOS. Now I am using 64-bit Debian 8. The main reasons I chose Debian are that its distribution size is small and that Docker recommends it in its Best Practices (it's ridiculous that almost all examples at docker.com use Ubuntu). The host's OS can actually be different from the container's OS, but I chose Debian instead of 64-bit CentOS because I don't want to spend any time on the differences. For example, the package management tools differ: Debian uses apt, CentOS uses yum.

Currently, Docker's official installation instructions for Debian 8 do not work. Run the following commands as root instead. theuser is a user on the host OS.

sudo su
curl -sSL https://get.docker.com/ | sh
# Add theuser to docker group to run docker as a non-root user
# MUST log out and log in again for it to take effect
usermod -aG docker theuser
# Start docker service
systemctl enable docker.service
systemctl start docker.service

Prepare


cd /
git clone https://github.com/vinceyuan/DockerizingWebAppTutorial.git

The folder /DockerizingWebAppTutorial contains everything we need. mynodeapp is a very simple node.js (express) app. It just reads a number from Redis and gets a query result from Postgres. There are several Dockerfiles in the dockerfiles folder. We will use them to build images.
Create folders:
cd / && mkdir mydata && cd mydata
mkdir redis_data && mkdir postgres_data && mkdir nginx_data
mkdir log_mynodeapp && mkdir log_nginx

Let's run the first container.

Redis

We use the official Redis image. Run it directly with this command:
docker run -d -v /mydata/redis_data:/data --name myredis --restart=always redis
-v /mydata/redis_data:/data means we mount the host folder /mydata/redis_data as the volume /data in the container. Redis will save dump.rdb at /mydata/redis_data on the host. If we don't mount a volume, Redis saves dump.rdb inside the container, and when the container is deleted, dump.rdb is deleted too. So we should always mount a volume for important data, e.g. database files and logs.
--name myredis means we name this container myredis
--restart=always means the container will restart after it quits unexpectedly. It also makes the container start automatically after the server reboots.

That command outputs:

$ docker run -d -v /mydata/redis_data:/data --name myredis --restart=always redis
Unable to find image 'redis:latest' locally
latest: Pulling from redis
7a3e804ed6c0: Pull complete
b96d1548a24e: Pull complete
5ba9a5b9710f: Pull complete
37f07aacbfe5: Pull complete
ec7f3a6b5dc6: Pull complete
499b313c4d4e: Pull complete
4416945429c6: Pull complete
0daf71066555: Pull complete
1f86439b265d: Pull complete
9e6288fa06c0: Pull complete
3c083702089f: Pull complete
71cc4c7123fc: Pull complete
91e5e3734476: Pull complete
8d7fb9bd09ab: Pull complete
e6b7cf8bf1b1: Pull complete
96182c1bd121: Pull complete
4b7672067154: Already exists
redis:latest: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
Digest: sha256:01b59520487a9ada4b8e31558c0580930a4e5f2a565a1cb85b66efe7c6ce810d
Status: Downloaded newer image for redis:latest
a96b6d2555e9f9fb1f70fea60f8cf75326cd331ebef9d4b667e322cea899d48c

It downloads redis:latest image from Docker Hub. Let's check if myredis container is running.

$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS               NAMES
a96b6d2555e9        redis:latest        "/entrypoint.sh redi   16 minutes ago      Up 16 minutes       6379/tcp            myredis

We can see myredis is running.

We need to run redis-cli in this container to set a value in Redis.

$ docker exec -i -t myredis bash
root@a96b6d2555e9:/data# redis-cli
127.0.0.1:6379> set number 1
OK
127.0.0.1:6379> save
OK
127.0.0.1:6379> exit
root@a96b6d2555e9:/data# exit

Postgres

We use the official Postgres image too. Just run it directly.
docker run -d --name mypostgres -e POSTGRES_PASSWORD=postgres -v /mydata/postgres_data:/var/lib/postgresql/data --restart=always postgres
-e POSTGRES_PASSWORD=postgres means we set the environment variable POSTGRES_PASSWORD to postgres.
-v /mydata/postgres_data:/var/lib/postgresql/data means we mount /mydata/postgres_data as a volume. This is very important: keeping the database files on the host keeps them safe when the container is deleted.
Create mynodeappdb:

$ docker exec -i -t mypostgres bash
root@11602c44f706:/# psql -U postgres
psql (9.4.2)
Type "help" for help.

postgres=# create database mynodeappdb;
CREATE DATABASE
postgres=# \q
root@11602c44f706:/# exit

We can see mypostgres and myredis are running.

$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                CREATED              STATUS              PORTS               NAMES
11602c44f706        postgres:latest     "/docker-entrypoint.   About a minute ago   Up About a minute   5432/tcp            mypostgres
a96b6d2555e9        redis:latest        "/entrypoint.sh redi   32 minutes ago       Up 32 minutes       6379/tcp            myredis

Redis client and Postgres client

The Dockerfile for redis client:


FROM debian:7
RUN apt-get update \
 && apt-get install -y redis-server \
 && apt-get clean \
 && apt-get autoremove \
 && rm -rf /var/lib/apt/lists/*

RUN service redis-server stop

It's based on debian:7. It actually installs both redis server and client. But we only need the client. So it stops redis-server.

Build it:


cd /DockerizingWebAppTutorial/dockerfiles/myredisclient
docker build -t myredisclient .

The Dockerfile for Postgres client:


FROM myredisclient
RUN apt-get update \
 && apt-get install -y wget \
 && echo 'deb http://apt.postgresql.org/pub/repos/apt/ wheezy-pgdg main' >> /etc/apt/sources.list.d/pgdg.list \
 && wget --no-check-certificate --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - \
 && apt-get update \
 && apt-get install -y --force-yes postgresql-client \
 && apt-get clean \
 && apt-get autoremove \
 && rm -rf /var/lib/apt/lists/*

It's based on myredisclient, because our web app needs to access both Redis and Postgres. The annoying thing is that the default postgresql-client in Debian's apt repository is a very old version (pg_dump will not work, because its version does not match the server's version). This Dockerfile adds the PostgreSQL apt repository and installs the latest version (currently 9.4).

Build it


cd /DockerizingWebAppTutorial/dockerfiles/myredispgclient
docker build -t myredispgclient .

We can see there are 5 images in the host.

$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
myredispgclient     latest              78b18351c561        6 minutes ago       132.5 MB
myredisclient       latest              bb2ac4846244        8 minutes ago       87.7 MB
postgres            latest              1636d90f0662        2 days ago          214 MB
redis               latest              4b7672067154        4 days ago          111 MB
debian              7                   b96d1548a24e        9 days ago          84.97 MB

Node.js

Let's build a Node.js image. In the Dockerfile for mynodejs image, we install node.js, express, forever and then set NODE_ENV production. In this example, I am not using the latest version.


FROM myredispgclient
RUN apt-get update \
 && apt-get install -y --force-yes --no-install-recommends \
      apt-transport-https \
      build-essential \
      curl \
      ca-certificates \
      git \
      lsb-release \
      python-all \
      rlwrap \
 && apt-get clean \
 && apt-get autoremove \
 && rm -rf /var/lib/apt/lists/*

RUN curl https://deb.nodesource.com/node/pool/main/n/nodejs/nodejs_0.10.30-1nodesource1~wheezy1_amd64.deb > node.deb \
 && dpkg -i node.deb \
 && rm node.deb

RUN npm install -g express@3.4.7 \
 && npm install -g forever \
 && npm cache clear

ENV NODE_ENV production

Build it.


cd /DockerizingWebAppTutorial/dockerfiles/mynodejs
docker build -t mynodejs .

mynodeapp

Then we build an image for mynodeapp. In Dockerfile, we run npm install, and use forever to run the node.js app. We don't use forever start, because we don't run it as a daemon (otherwise, the container will quit immediately).


FROM mynodejs

COPY . /src
RUN cd /src && npm install

VOLUME /log
CMD ["forever", "-l", "/log/forever.log", "-o", "/log/out.log", "-e", "/log/err.log", "/src/app.js"]

Build it


cd /DockerizingWebAppTutorial/mynodeapp
docker build -t mynodeapp .

Actually we can merge these 4 Dockerfiles into one to create one image. I build 4 images for re-using images. For example, if we want to build an image for another node.js app, we can write a Dockerfile based on mynodejs image. If we want to replace node.js with Go, we can write a Dockerfile based on myredispgclient.

The core code of mynodeapp:


var conString;
if ('development' == app.get('env')) {
  app.use(express.errorHandler());
  conString = "postgres://vince:@localhost/mynodeappdb"; // Use your db, user and password
} else {
  conString = "postgres://postgres:postgres@localhost/mynodeappdb"; // Use your db, user and password
}

var pgClient = new pg.Client(conString);
pgClient.connect(function(err) {
  if(err) return console.error('Could not connect to postgres', err);
  console.log('Connected to postgres');
});
var redisClient = redis.createClient(6379, '127.0.0.1', {});

app.get('/', function(req, res) {
  pgClient.query('SELECT NOW() AS "theTime"', function(err1, result) {
    redisClient.get("number", function(err2, reply) {
      res.render('index', { pgTime: result.rows[0].theTime, number: reply });
    });
  }); // Errors ignored in this example. You should check errors in a real project.
});

http.createServer(app).listen(app.get('port'), function() {
  console.log('Express server listening on port ' + app.get('port'));
});
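
The route handler in the excerpt above ignores err1 and err2. As a rough sketch of what checking them could look like, the handler can be built by a small factory. makeIndexHandler is a name invented for this illustration and is not in the tutorial's repository; the status code and messages are also just examples.

```javascript
// Hypothetical factory: returns the '/' handler with error checks added.
// pgClient and redisClient are expected to expose the callback-style
// query() and get() methods used in the excerpt above.
function makeIndexHandler(pgClient, redisClient) {
  return function (req, res) {
    pgClient.query('SELECT NOW() AS "theTime"', function (pgErr, result) {
      if (pgErr) {
        console.error('Postgres query failed', pgErr);
        return res.status(500).send('Database error');
      }
      redisClient.get('number', function (redisErr, reply) {
        if (redisErr) {
          console.error('Redis get failed', redisErr);
          return res.status(500).send('Cache error');
        }
        res.render('index', { pgTime: result.rows[0].theTime, number: reply });
      });
    });
  };
}

// Usage in the app: app.get('/', makeIndexHandler(pgClient, redisClient));
```

Passing the clients in as arguments also makes the handler easy to try out with stand-in objects, without a running database.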

There is a problem. We are using localhost or 127.0.0.1 for redis and postgres' host address. It works only when they are installed on the same server. But now they are in different containers. Even if we use --link, we still cannot access them via localhost and 127.0.0.1. We can use the following code to get correct host and port.


var redis_host = process.env.REDIS_PORT_6379_TCP_ADDR || '127.0.0.1';
var redis_port = process.env.REDIS_PORT_6379_TCP_PORT || 6379;
var db_host = process.env.POSTGRES_PORT_5432_TCP_ADDR || 'localhost';

REDIS_PORT_6379_TCP_ADDR is created by Docker if you run a container with --link myredis:redis. You can get Postgres user account, password, port from the environment variables too.
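
Putting those variables together, a minimal sketch of resolving both services might look like this. resolveServices is a hypothetical helper, not code from the tutorial's repository, and the user, password and database name are simply the values used in this post.

```javascript
// Build connection settings from the variables Docker injects for
// --link myredis:redis and --link mypostgres:postgres, falling back to
// local defaults when they are absent (e.g. in development).
function resolveServices(env) {
  const redis = {
    host: env.REDIS_PORT_6379_TCP_ADDR || '127.0.0.1',
    port: Number(env.REDIS_PORT_6379_TCP_PORT || 6379),
  };
  const dbHost = env.POSTGRES_PORT_5432_TCP_ADDR || 'localhost';
  const dbPort = Number(env.POSTGRES_PORT_5432_TCP_PORT || 5432);
  // User, password and database are the values used in this tutorial.
  const conString =
    'postgres://postgres:postgres@' + dbHost + ':' + dbPort + '/mynodeappdb';
  return { redis: redis, conString: conString };
}

const services = resolveServices(process.env);
console.log(services.redis.host, services.redis.port, services.conString);
```

Taking the environment as a parameter keeps the function usable both inside a linked container and on a development machine where the variables are not set.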

Run a container based on mynodeapp image. We also name the container mynodeapp. You can rename it whatever you like.

docker run -d --name mynodeapp --link mypostgres:postgres --link myredis:redis -v /mydata/log_mynodeapp:/log -p 3000:3000 --restart=always mynodeapp

By default, each container is isolated. --link allows a container to access another container. --link mypostgres:postgres means we can reach the mypostgres container under the alias 'postgres', much like localhost stands for 127.0.0.1.

-v /mydata/log_mynodeapp:/log mounts a volume. We want to keep logs in the host.

-p 3000:3000 maps host's port 3000 to container's port 3000. It is not mandatory. But with it, we can use curl localhost:3000 in the host to check if mynodeapp container runs correctly.

$ curl localhost:3000

<!DOCTYPE html><html><head><title></title><link rel="stylesheet" href="/stylesheets/style.css"></head><body><H1>mynodeapp</H1><p>Number from Redis: 1</p><p>Time from Postgres: Fri May 29 2015 09:47:54 GMT+0000 (UTC)</p></body></html>

The web app runs correctly in the container.

Nginx

Now we install Nginx. In the Dockerfile, we make directory /mynodeapp/public. A folder in the host will be mounted here.


FROM nginx

# Create folder for static files
RUN mkdir /mynodeapp && mkdir /mynodeapp/public
# copy sslcert files to /etc/nginx/ for https
#COPY mydomain.* /etc/nginx/
# copy conf
COPY nginx-docker.conf /etc/nginx/nginx.conf

In nginx-docker.conf, we use mynodeapp for the server address, because it is linked.



    upstream mynodeapp_upstream {
        server mynodeapp:3000;
        keepalive 64;
    }

Build the image and run the container.

cd /DockerizingWebAppTutorial/dockerfiles/mynginx

docker build -t mynginx .

Run mynginx container.
docker run -d --name mynginx --link mynodeapp:mynodeapp -v /mydata/nginx_data:/var/cache/nginx -v /mydata/log_nginx:/var/log/nginx -v /DockerizingWebAppTutorial/mynodeapp/public:/mynodeapp/public -p 80:80 -p 443:443 --restart=always mynginx
--link mynodeapp:mynodeapp means we link mynodeapp container to mynginx container. We don't link myredis and mypostgres because mynginx does not access them directly.
We also mount 2 folders for logging.
-p 443:443 is for https. However, this example does not provide ssl certificate files.

Run curl localhost and curl localhost/stylesheets/style.css to check if mynginx runs correctly.

# curl localhost
<!DOCTYPE html><html><head><title></title><link rel="stylesheet" href="/stylesheets/style.css"></head><body><H1>mynodeapp</H1><p>Number from Redis: 1</p><p>Time from Postgres: Fri May 29 2015 10:12:35 GMT+0000 (UTC)</p></body></html>
# curl localhost/stylesheets/style.css
body {
  padding: 50px;
  font: 14px "Lucida Grande", Helvetica, Arial, sans-serif;
}

a {
  color: #00B7FF;
}

Now we have finished deploying a web app, Redis, Postgres and Nginx with Docker. It took me a lot of time to deploy my real app with Docker. Luckily I tested in a VirtualBox VM first, and with Docker I can delete and re-create images and containers easily.

An important part is still missing: restoring and backing up the database. I will show you that in another tutorial. Here are some tips about Docker.

Friday, May 08, 2015

High memory usage of Emojis on iOS

Emoji are very popular. I am making a keyboard that contains many emoji. However, I found that the memory usage of emoji is too high. When the keyboard's memory usage gets high enough, iOS kills the keyboard immediately. (The limit looks like 40 MB on iOS 8.3.)

Emoji are just Unicode characters. Apple renders them as lovely icons, which causes the high memory usage. That alone would be OK, because we all know icons use a lot of memory. The problem is that after you destroy the view that uses emoji, their memory is not released. It looks like iOS keeps the emoji cache in the app or app extension. That is acceptable in an app, but not in an app extension.

What I have to do is delete many emoji from my keyboard.