Tuesday, October 18, 2011

Containers on Linux

At Oracle OpenWorld we talked about Linux Containers. Here is an example of getting a Linux container going with Oracle Linux 6.1, UEK2 beta and btrfs. This is just an example, not released, production, bug-free... for those that don't read README files ;-)
This container example uses the existing Linux cgroups features in the mainline kernel (also present in UEK and UEK2) and the lxc tools to create the environments.
Example assumptions :
- Host OS is Oracle Linux 6.1 with UEK2 beta.
- using btrfs filesystem for containers (to make use of snapshot capabilities)
- mounting the fs in /container
- use Oracle VM templates as a base environment
- Oracle Linux 5 containers

I have a second disk on my test machine (/dev/sdb) which I will use for this exercise.

# mkfs.btrfs -L container /dev/sdb

# mkdir -p /container
# mount /dev/sdb /container

# mount
/dev/mapper/vg_wcoekaersrv4-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext4 (rw)
/dev/mapper/vg_wcoekaersrv4-lv_home on /home type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/mapper/loop0p2 on /mnt type ext3 (rw)
/dev/mapper/loop1p2 on /mnt2 type ext3 (rw)
/dev/sdb on /container type btrfs (rw)

lxc tools installed...

# rpm -qa|grep lxc
lxc-libs-0.7.5-2.x86_64
lxc-0.7.5-2.x86_64
lxc-devel-0.7.5-2.x86_64

The lxc tools come with template scripts:

# ls /usr/lib64/lxc/templates/

lxc-altlinux lxc-busybox lxc-debian lxc-fedora lxc-lenny
lxc-ol4 lxc-ol5 lxc-opensuse lxc-sshd lxc-ubuntu
I created one for Oracle Linux 5 : lxc-ol5.

Download the Oracle VM template for OL5 from http://edelivery.oracle.com/linux. I used OVM_EL5U5_X86_PVM_10GB.
We want to be able to create one environment that can be used in both container and VM mode, to avoid duplicate effort.
Untar the VM template.
# tar zxvf OVM_EL5U5_X86_PVM_10GB.tar.gz
These are the steps needed (to be automated in the future)...
Copy the content of the VM virtual disk's root filesystem into a btrfs subvolume in order to easily clone the base template.
My template configure script defines :
template_path=/container/ol5-template

- create subvolume ol5-template on /container

# btrfs subvolume create /container/ol5-template
Create subvolume '/container/ol5-template'
- loopback mount the Oracle VM template System image / partition
# kpartx -a System.img 
# kpartx -l System.img
loop0p1 : 0 192717 /dev/loop0 63
loop0p2 : 0 21607425 /dev/loop0 192780
loop0p3 : 0 4209030 /dev/loop0 21800205
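
As a small aside, the kpartx -l listing can be read as "name : 0 <size in sectors> <device> <start sector>". The sketch below turns it into sizes in MiB, assuming 512-byte sectors; partition_sizes is a hypothetical helper name, and the sample input is copied from the listing above.

```shell
# Sketch: summarise `kpartx -l` output as "partition size start-sector".
# Assumes 512-byte sectors and the column layout shown in the listing:
#   name : 0 <sectors> <device> <start>
partition_sizes() {
    awk '$2 == ":" { printf "%s %d MiB start=%s\n", $1, $4 * 512 / 1048576, $6 }'
}

# Sample data copied from the listing; on a real host you would run
#   kpartx -l System.img | partition_sizes
partition_sizes <<'EOF'
loop0p1 : 0 192717 /dev/loop0 63
loop0p2 : 0 21607425 /dev/loop0 192780
loop0p3 : 0 4209030 /dev/loop0 21800205
EOF
```

The second partition comes out at roughly 10 GB, which is consistent with it holding the root filesystem of the 10GB template.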
kpartx sets up a loop device for the virtual disk image and a /dev/mapper entry for each of its partitions. I need the 2nd partition, so let's mount loop0p2, which contains the Oracle Linux 5 / filesystem of the template.
# mount /dev/mapper/loop0p2 /mnt

# ls /mnt
bin boot dev etc home lib lost+found media misc mnt opt proc
root sbin selinux srv sys tftpboot tmp u01 usr var
Great, now we have the entire template / filesystem available. Let's copy this into our subvolume. This subvolume will then become the basis for all OL5 containers.
# cd /mnt
# tar cvf - * | ( cd /container/ol5-template ; tar xvf - ; )
In the near future we will put some automation around the above steps.
# pwd
/container/ol5-template

# ls
bin boot dev etc home lib lost+found media misc mnt opt proc
root sbin selinux srv sys tftpboot tmp u01 usr var
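
The tar pipe used for the copy can be tried out on scratch directories first. The sketch below is a variation, not the exact command from this post: it archives "." instead of "*" so that hidden files at the top level are also copied (the shell glob "*" skips them), and it extracts with -p to keep permission bits. copy_tree and the file names are made up for illustration.

```shell
# Sketch: copy a directory tree with a tar pipe.
# copy_tree is a hypothetical helper, not part of the lxc tools.
copy_tree() {
    from=$1
    to=$2
    # "." rather than "*" so top-level dotfiles are included;
    # -p keeps the permission bits when extracting.
    ( cd "$from" && tar cf - . ) | ( cd "$to" && tar xpf - )
}

# Try it on throwaway directories rather than a real template.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/etc"
echo "ol5test" > "$src/etc/hostname"
echo "hidden" > "$src/.hidden-example"
copy_tree "$src" "$dst"
ls -A "$dst"
```

For the real template this would be called as copy_tree /mnt /container/ol5-template, run as root so that file ownership is preserved.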
From this point on, the lxc-create script, using the template config as an argument, should be able to automatically create a snapshot and set up the filesystem correctly.
# lxc-create -n ol5test1 -t ol5

Cloning base template /container/ol5-template to /container/ol5test1 ...
Create a snapshot of '/container/ol5-template' in '/container/ol5test1'
Container created : /container/ol5test1 ...
Container template source : /container/ol5-template
Container config : /etc/lxc/ol5test1
Network : eth0 (veth) on virbr0
'ol5' template installed
'ol5test1' created

# ls /etc/lxc/ol5test1/
config fstab
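
The generated config file is not reproduced in this post. As a rough idea of what such a file could contain, here is a sketch that writes a minimal LXC 0.7-style config into a scratch directory. The keys are standard lxc.conf settings and the values mirror the lxc-create output above (veth on virbr0, rootfs under /container), but write_lxc_config, the lxc.tty value and the exact set of keys are assumptions.

```shell
# Sketch: write a minimal LXC 0.7-style config for a container.
# write_lxc_config is a hypothetical helper; the real file generated
# by the lxc-ol5 template may differ.
write_lxc_config() {
    cname=$1
    outdir=$2
    mkdir -p "$outdir"
    cat > "$outdir/config" <<EOF
lxc.utsname = $cname
lxc.rootfs = /container/$cname
lxc.mount = /etc/lxc/$cname/fstab
lxc.tty = 4
lxc.network.type = veth
lxc.network.link = virbr0
lxc.network.name = eth0
lxc.network.flags = up
EOF
}

# Write into a scratch directory instead of /etc/lxc.
confdir=$(mktemp -d)
write_lxc_config ol5test1 "$confdir"
cat "$confdir/config"
```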

# ls /container/ol5test1/
bin boot dev etc home lib lost+found media misc mnt opt proc
root sbin selinux srv sys tftpboot tmp u01 usr var
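
The "Create a snapshot of ..." line in the lxc-create output looks like the message printed by btrfs subvolume snapshot, so the cloning step inside the template script presumably boils down to something like the sketch below. clone_template and the RUN variable are hypothetical names; the real lxc-ol5 script is not shown here. By default the sketch only prints the command (a dry run).

```shell
# Sketch of the cloning step a template script might perform.
# RUN defaults to "echo", so nothing is changed unless RUN is set
# to an empty string (and the script is run as root).
RUN=${RUN-echo}

clone_template() {
    template_path=$1
    rootfs=$2
    # A btrfs snapshot is copy-on-write: the clone is created almost
    # instantly and shares unmodified blocks with the template.
    $RUN btrfs subvolume snapshot "$template_path" "$rootfs"
}

clone_template /container/ol5-template /container/ol5test1
```

Because the snapshot shares blocks with the template, each new container only consumes extra space for the files it changes.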
Now that the container is created and configured, we can simply start it:
# lxc-start -n ol5test1
INIT: version 2.86 booting
Welcome to Enterprise Linux Server
Press 'I' to enter interactive startup.
Setting clock (utc): Sun Oct 16 06:08:27 EDT 2011 [ OK ]
Loading default keymap (us): [ OK ]
Setting hostname ol5test1: [ OK ]
raidautorun: unable to autocreate /dev/md0
Checking filesystems
[ OK ]
mount: can't find / in /etc/fstab or /etc/mtab
Mounting local filesystems: [ OK ]
Enabling local filesystem quotas: [ OK ]
Enabling /etc/fstab swaps: [ OK ]
INIT: Entering runlevel: 3
Entering non-interactive startup
Starting sysstat: Calling the system activity data collector (sadc):
[ OK ]
Starting background readahead: [ OK ]
Flushing firewall rules: [ OK ]
Setting chains to policy ACCEPT: nat mangle filter [ OK ]
Applying iptables firewall rules: [ OK ]
Loading additional iptables modules: no [FAILED]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0:
Determining IP information for eth0... done.
[ OK ]
Starting system logger: [ OK ]
Starting kernel logger: [ OK ]
Enabling ondemand cpu frequency scaling: [ OK ]
Starting irqbalance: [ OK ]
Starting portmap: [ OK ]
FATAL: Could not load /lib/modules/2.6.39-100.0.12.el6uek.x86_64/modules.dep: No such file or directory
Starting NFS statd: [ OK ]
Starting RPC idmapd: Error: RPC MTAB does not exist.
Starting system message bus: [ OK ]
Starting o2cb: [ OK ]
Can't open RFCOMM control socket: Address family not supported by protocol

Mounting other filesystems: [ OK ]
Starting PC/SC smart card daemon (pcscd): [ OK ]
Starting HAL daemon: [FAILED]
Starting hpiod: [ OK ]
Starting hpssd: [ OK ]
Starting sshd: [ OK ]
Starting cups: [ OK ]
Starting xinetd: [ OK ]
Starting crond: [ OK ]
Starting xfs: [ OK ]
Starting anacron: [ OK ]
Starting atd: [ OK ]
Starting yum-updatesd: [ OK ]
Starting Avahi daemon... [FAILED]
Starting oraclevm-template...
Regenerating SSH host keys.
Stopping sshd: [ OK ]
Generating SSH1 RSA host key: [ OK ]
Generating SSH2 RSA host key: [ OK ]
Generating SSH2 DSA host key: [ OK ]
Starting sshd: [ OK ]
Regenerating up2date uuid.
Setting Oracle validated configuration parameters.

Configuring network interface.
Network device: eth0
Hardware address: 52:19:C0:EF:78:C4

Do you want to enable dynamic IP configuration (DHCP) (Y|n)?

...
This will run the well-known Oracle VM template configure scripts and set up the container the same way as it would an Oracle VM guest.

The session that runs lxc-start is the local console. It is best to run this session inside screen so you can disconnect and reconnect.

At this point, I can use lxc-console to log into the local console of the container or, since the container has its internal network up and sshd is running, I can also just ssh into the guest.
# lxc-console -n ol5test1 -t 1

Enterprise Linux Enterprise Linux Server release 5.5 (Carthage)
Kernel 2.6.39-100.0.12.el6uek.x86_64 on an x86_64

host login:
I can simply get out of the console by entering ctrl-a q.

From inside the container :
# mount
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

# /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 52:19:C0:EF:78:C4
inet addr:192.168.122.225 Bcast:192.168.122.255 Mask:255.255.255.0
inet6 addr: fe80::5019:c0ff:feef:78c4/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:141 errors:0 dropped:0 overruns:0 frame:0
TX packets:19 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8861 (8.6 KiB) TX bytes:2476 (2.4 KiB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:560 (560.0 b) TX bytes:560 (560.0 b)

# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 2124 656 ? Ss 06:08 0:00 init [3]
root 397 0.0 0.0 1780 596 ? Ss 06:08 0:00 syslogd -m 0
root 400 0.0 0.0 1732 376 ? Ss 06:08 0:00 klogd -x
root 434 0.0 0.0 2524 368 ? Ss 06:08 0:00 irqbalance
rpc 445 0.0 0.0 1868 516 ? Ss 06:08 0:00 portmap
root 469 0.0 0.0 1920 740 ? Ss 06:08 0:00 rpc.statd
dbus 509 0.0 0.0 2800 576 ? Ss 06:08 0:00 dbus-daemon --system
root 578 0.0 0.0 10868 1248 ? Ssl 06:08 0:00 pcscd
root 610 0.0 0.0 5196 712 ? Ss 06:08 0:00 ./hpiod
root 615 0.0 0.0 13520 4748 ? S 06:08 0:00 python ./hpssd.py
root 637 0.0 0.0 10168 2272 ? Ss 06:08 0:00 cupsd
root 651 0.0 0.0 2780 812 ? Ss 06:08 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
root 660 0.0 0.0 5296 1096 ? Ss 06:08 0:00 crond
root 745 0.0 0.0 1728 580 ? SNs 06:08 0:00 anacron -s
root 753 0.0 0.0 2320 340 ? Ss 06:08 0:00 /usr/sbin/atd
root 817 0.0 0.0 25580 10136 ? SN 06:08 0:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
root 819 0.0 0.0 2616 1072 ? SN 06:08 0:00 /usr/libexec/gam_server
root 830 0.0 0.0 7116 1036 ? Ss 06:08 0:00 /usr/sbin/sshd
root 2998 0.0 0.0 2368 424 ? Ss 06:08 0:00 /sbin/dhclient -1 -q -lf /var/lib/dhclient/dhclient-eth0.leases -pf /var/run/dhc
root 3102 0.0 0.0 5008 1376 ? Ss 06:09 0:00 login -- root
root 3103 0.0 0.0 1716 444 tty2 Ss+ 06:09 0:00 /sbin/mingetty tty2
root 3104 0.0 0.0 1716 448 tty3 Ss+ 06:09 0:00 /sbin/mingetty tty3
root 3105 0.0 0.0 1716 448 tty4 Ss+ 06:09 0:00 /sbin/mingetty tty4
root 3138 0.0 0.0 4584 1436 tty1 Ss 06:11 0:00 -bash
root 3167 0.0 0.0 4308 936 tty1 R+ 06:12 0:00 ps aux
From the host :
# lxc-info -n ol5test1
state: RUNNING
pid: 16539

# lxc-kill -n ol5test1

# lxc-monitor -n ol5test1
'ol5test1' changed state to [STOPPING]
'ol5test1' changed state to [STOPPED]
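
For scripting, the state can be pulled out of the lxc-info output shown above. This is a minimal sketch that assumes the "state: VALUE" line format from that output; container_state is a hypothetical helper that reads the text on stdin.

```shell
# Sketch: extract the state field from `lxc-info` output.
# container_state is a hypothetical helper, not an lxc command.
container_state() {
    awk '$1 == "state:" { print $2 }'
}

# Sample input copied from the output above; on a real host:
#   lxc-info -n ol5test1 | container_state
printf 'state: RUNNING\npid: 16539\n' | container_state
```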
So creating more containers is trivial. Just keep running lxc-create.
# lxc-create -n ol5test2 -t ol5

# btrfs subvolume list /container
ID 297 top level 5 path ol5-template
ID 299 top level 5 path ol5test1
ID 300 top level 5 path ol5test2
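
Running lxc-create repeatedly is easy to wrap in a loop. The sketch below is a dry run by default and only prints the commands; create_containers, RUN and the numbering scheme are made up for illustration.

```shell
# Sketch: create a numbered series of containers from one template.
# RUN defaults to "echo" (dry run); set RUN to an empty string and
# run as root to actually call lxc-create.
RUN=${RUN-echo}

create_containers() {
    template=$1
    prefix=$2
    i=$3
    last=$4
    while [ "$i" -le "$last" ]; do
        $RUN lxc-create -n "${prefix}${i}" -t "$template"
        i=$((i + 1))
    done
}

# Would create ol5test3, ol5test4 and ol5test5 from the ol5 template.
create_containers ol5 ol5test 3 5
```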
lxc-tools will be uploaded to the uek2 beta channel so you can start playing with this.

Oracle Linux 4 example

Here is the same principle for Oracle Linux 4, using the template create script lxc-ol4. I started out using the OVM_EL4U7_X86_PVM_4GB template and followed the same steps.

# kpartx -a System.img 

# kpartx -l System.img
loop0p1 : 0 64197 /dev/loop0 63
loop0p2 : 0 8530515 /dev/loop0 64260
loop0p3 : 0 4176900 /dev/loop0 8594775

# mount /dev/mapper/loop0p2 /mnt

# cd /mnt

# btrfs subvolume create /container/ol4-template
Create subvolume '/container/ol4-template'

# tar cvf - * | ( cd /container/ol4-template ; tar xvf - ; )

# lxc-create -n ol4test1 -t ol4

Cloning base template /container/ol4-template to /container/ol4test1 ...
Create a snapshot of '/container/ol4-template' in '/container/ol4test1'
Container created : /container/ol4test1 ...
Container template source : /container/ol4-template
Container config : /etc/lxc/ol4test1
Network : eth0 (veth) on virbr0
'ol4' template installed
'ol4test1' created

# lxc-start -n ol4test1
INIT: version 2.85 booting
/etc/rc.d/rc.sysinit: line 80: /dev/tty5: Operation not permitted
/etc/rc.d/rc.sysinit: line 80: /dev/tty6: Operation not permitted
Setting default font (latarcyrheb-sun16): [ OK ]

Welcome to Enterprise Linux
Press 'I' to enter interactive startup.
Setting clock (utc): Sun Oct 16 09:34:56 EDT 2011 [ OK ]
Initializing hardware... storage network audio done [ OK ]
raidautorun: unable to autocreate /dev/md0
Configuring kernel parameters: error: permission denied on key 'net.core.rmem_default'
error: permission denied on key 'net.core.rmem_max'
error: permission denied on key 'net.core.wmem_default'
error: permission denied on key 'net.core.wmem_max'
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.core_uses_pid = 1
fs.file-max = 327679
kernel.msgmni = 2878
kernel.msgmax = 8192
kernel.msgmnb = 65536
kernel.sem = 250 32000 100 142
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.sysrq = 1
fs.aio-max-nr = 3145728
net.ipv4.ip_local_port_range = 1024 65000
kernel.shmmax = 4398046511104
[FAILED]
Loading default keymap (us): [ OK ]
Setting hostname ol4test1: [ OK ]
Remounting root filesystem in read-write mode: [ OK ]
mount: can't find / in /etc/fstab or /etc/mtab
Mounting local filesystems: [ OK ]
Enabling local filesystem quotas: [ OK ]
Enabling swap space: [ OK ]
INIT: Entering runlevel: 3
Entering non-interactive startup
Starting sysstat: [ OK ]
Setting network parameters: error: permission denied on key 'net.core.rmem_default'
error: permission denied on key 'net.core.rmem_max'
error: permission denied on key 'net.core.wmem_default'
error: permission denied on key 'net.core.wmem_max'
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.core_uses_pid = 1
fs.file-max = 327679
kernel.msgmni = 2878
kernel.msgmax = 8192
kernel.msgmnb = 65536
kernel.sem = 250 32000 100 142
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.sysrq = 1
fs.aio-max-nr = 3145728
net.ipv4.ip_local_port_range = 1024 65000
kernel.shmmax = 4398046511104
[FAILED]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
Starting system logger: [ OK ]
Starting kernel logger: [ OK ]
Starting portmap: [ OK ]
Starting NFS statd: [FAILED]
Starting RPC idmapd: Error: RPC MTAB does not exist.
Mounting other filesystems: [ OK ]
Starting lm_sensors: [ OK ]
Starting cups: [ OK ]
Generating SSH1 RSA host key: [ OK ]
Generating SSH2 RSA host key: [ OK ]
Generating SSH2 DSA host key: [ OK ]
Starting sshd: [ OK ]
Starting xinetd: [ OK ]
Starting crond: [ OK ]
Starting xfs: [ OK ]
Starting anacron: [ OK ]
Starting atd: [ OK ]
Starting system message bus: [ OK ]
Starting cups-config-daemon: [ OK ]
Starting HAL daemon: [ OK ]
Starting oraclevm-template...
Regenerating SSH host keys.
Stopping sshd: [ OK ]
Generating SSH1 RSA host key: [ OK ]
Generating SSH2 RSA host key: [ OK ]
Generating SSH2 DSA host key: [ OK ]
Starting sshd: [ OK ]
Regenerating up2date uuid.
Setting Oracle validated configuration parameters.

Configuring network interface.
Network device: eth0
Hardware address: D2:EC:49:0D:7D:80

Do you want to enable dynamic IP configuration (DHCP) (Y|n)?
...
...

# lxc-console -n ol4test1

Enterprise Linux Enterprise Linux AS release 4 (October Update 7)
Kernel 2.6.39-100.0.12.el6uek.x86_64 on an x86_64

localhost login:
