Category Archives: Linux

Kernel space – User space – Containers – Virtualisation

How many times have I heard “well, a container is like a super light-weight virtual machine”? And yes, I admit it: I used to say that too.

But I wasn’t happy with this answer, so I did some research. I think I now have a better understanding, and I feel the pain of the friends to whom I was simplistically (and wrongly) saying that – public apologies 😛 🙂

 

So… let’s start…

 

Concept 1: Virtual memory.

Virtual memory is the collective memory that processes can address (RAM, disk swap, etc).

Within this virtual memory, there is generally a separation between 2 types:

  • kernel space: reserved for the kernel and, generally, drivers
  • user space: for the applications, including libraries

This separation serves to provide memory protection and hardware protection from malicious or errant software behavior.

NOTE1: User space is not namespace.

 

NOTE2: FUSE is not really related to this topic, but the name could confuse someone. So, just to clarify: FUSE (Filesystem in Userspace) is a software interface for Unix-like operating systems that lets non-privileged users create their own file systems without editing kernel code. This is achieved by running the file system code in user space, while the FUSE module provides only a “bridge” to the actual kernel interfaces.

Modern kernels have cgroups and namespace capabilities.

  • Cgroups restrict what you can USE -> CPU, memory, storage, network, devices, etc. They also allow you to ‘freeze’ a group of processes.
  • Namespaces restrict what you can SEE -> PID, mnt, UID/GID, etc…

Container runtimes (like LXC, Docker, etc…) use cgroups and namespaces to create separate, isolated user-space entities called ‘containers‘.
Containers have very little overhead because their processes make system calls directly to the host kernel => no need for emulation or a virtual machine.

They use the host’s kernel (this is a key difference from virtualisation). So, currently, you cannot run Windows containers on a Linux host. But you can still run different Linux distributions, as they all share the host’s kernel.
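To make the “what you SEE” idea a bit more concrete, here is a minimal Python sketch (my own illustration, not something a container runtime ships). It assumes a Linux /proc filesystem, where each process exposes its namespaces as symlinks under /proc/<pid>/ns. The function names are made up for this example.

```python
import os


def namespace_ids(pid="self"):
    """Return {namespace_name: identifier} for a process, read from /proc.

    Each entry in /proc/<pid>/ns is a symlink such as 'pid:[4026531836]'.
    Returns an empty dict where /proc is not available (e.g. non-Linux).
    """
    ns_dir = "/proc/%s/ns" % pid
    if not os.path.isdir(ns_dir):
        return {}
    ids = {}
    for name in sorted(os.listdir(ns_dir)):
        try:
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Entry may be unreadable without sufficient privileges
            continue
    return ids


def shared_namespaces(a, b):
    """Names of the namespaces that two processes have in common."""
    return sorted(name for name in a if name in b and a[name] == b[name])


if __name__ == "__main__":
    for name, ident in namespace_ids().items():
        print("%-10s %s" % (name, ident))
```

Comparing the output for a process on the host with one inside a container should show different identifiers for the namespaces the runtime has unshared, while the kernel underneath stays the same.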

Virtualisation: fully isolated OS, running its own kernel.

  • Full virtualised: (e.g. VMware, VirtualBox, ESXi…). The OS in the VM is not aware that it is running in a VM. The hypervisor emulates the hardware platform for the guest OS and translates its hardware access requests to the physical hardware. The hypervisor provides the drivers to the guest OS.
    => higher overhead because of hardware virtualisation BUT best isolation and security
  • Para virtualised: (e.g. Xen, or KVM with virtio drivers) the OS in the VM knows it is virtualised. Drivers send their requests to the host’s hardware via the hypervisor. Hardware is not emulated BUT the OS runs in isolation.
    => better performance and ability to use recent hardware drivers directly BUT the guest OS needs to be modified to use paravirtualised devices

NOTE: Emulation (e.g. QEMU) is not platform virtualisation.
With emulation you can run a different architecture (e.g. ARM…) on a host that has a different instruction set (e.g. i386). Performance is clearly not ideal.



Tips for RHCSA certification

Just a collection of notes and screenshots that can help in getting ready for the RHCSA exam.
Based on RHEL 7.

 

Boot systems into different targets manually

 

 Configure networking and hostname resolution statically or dynamically

 

Interrupt the boot process in order to gain access to a system

 

Mount and unmount CIFS network file systems

 

Configure a system to use time services

timedatectl

timedatectl list-timezones
timedatectl set-timezone America/Phoenix


timedatectl set-time 9:00:00
timedatectl set-ntp true/false
timedatectl

 

Bridge / Bond interfaces CentOS/RedHat

Just a few notes about how to bridge or bond network interfaces on CentOS/RedHat systems

# Install the required packages

yum install bridge-utils


BRIDGE
------

/etc/sysconfig/network-scripts/

#ifcfg-br0
DEVICE=br0
TYPE=Bridge
IPADDR=192.168.1.1
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
NM_CONTROLLED=no
DELAY=0

#ifcfg-eth0
DEVICE=eth0
TYPE=Ethernet
HWADDR=AA:BB:CC:DD:EE:FF
BOOTPROTO=none
ONBOOT=yes
NM_CONTROLLED=no
BRIDGE=br0


#### USE SCREEN!!
service network restart 

================================
BOND >>> 2 or more eth interfaces!
----

#ifcfg-eth0
DEVICE=eth0
TYPE=Ethernet
USERCTL=no
SLAVE=yes
MASTER=bond0
BOOTPROTO=none
HWADDR=AA:BB:CC:DD:EE:FF
NM_CONTROLLED=no

#ifcfg-eth1
DEVICE=eth1
TYPE=Ethernet
USERCTL=no
SLAVE=yes
MASTER=bond0
BOOTPROTO=none
HWADDR=AA:BB:CC:DD:EE:FF
NM_CONTROLLED=no

#ifcfg-bond0
DEVICE=bond0
ONBOOT=yes
BONDING_OPTS='mode=1 miimon=100'
BRIDGE=br0
NM_CONTROLLED=no

#ifcfg-br0
DEVICE=br0
ONBOOT=yes
TYPE=Bridge
IPADDR=192.168.1.1
NETMASK=255.255.255.0
NM_CONTROLLED=no


# ifup bond0
#### USE SCREEN!!
# service network restart 


==========================

For DHCP and not static

DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes

 


Systemd – find what’s wrong with systemctl

True: the latest changes in Linux distros didn’t make me really happy.
I still like to use init.d to start a process (it took me a while to get used to the service yourservice status syntax), and so on.

Anyway, the major distros don’t seem to be looking back, and we need to get used to this 🙂

I have a few Raspberry Pis at home, and I’ve noticed that after a restart I was experiencing various weird behaviours. The main two:

  • stuck and not rebooting
  • receiving strange logrotate email alerts (e.g. /etc/cron.daily/logrotate:
    gzip: stdin: file size changed while zipping)

I tried to ignore them, but when you issue a reboot from a remote location and it doesn’t reboot, you understand that you should start checking what’s going on, instead of just unplugging and replugging your Pi.

 

And here is the discovery: systemctl

This magic command was able to show me the services with issues, and helped me slowly find out what was wrong with logrotate and my reboot. Or, better, I realised that after fixing what was marked as failed, I no longer experienced any weird behaviour.

So, here are a few steps that I’d like to share – to maybe help someone else in the future, myself included – as I tend to forget things if I don’t use them 🙂

To check if your system is healthy or not:

systemctl status

The State line in the output should report “running”. If you get “degraded”, well, there is definitely something wrong.

Use the following to check what has failed:

systemctl --failed

Now, investigate those specific services. Analyse their status and logs, or simply try to restart them to see the error live:

systemctl status <broken_service>

journalctl _PID=<PID_of_broken_service>

tail /var/log/<broken_service>

systemctl restart <broken_service>
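If you have several Pis to check, the “what has failed” step can be scripted. This is a minimal Python sketch, assuming systemctl accepts --no-legend and --plain so that each output line starts with the unit name. The function names are mine, and the parsing is deliberately naive.

```python
import shutil
import subprocess


def parse_failed_units(output):
    """Extract unit names from 'systemctl --failed --no-legend --plain' output.

    Each non-empty line starts with the unit name followed by its
    LOAD, ACTIVE and SUB states and a description.
    """
    units = []
    for line in output.splitlines():
        fields = line.split()
        if fields:
            units.append(fields[0])
    return units


def failed_units():
    """Run systemctl and return the names of the failed units."""
    result = subprocess.run(
        ["systemctl", "--failed", "--no-legend", "--plain"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=False,
    )
    return parse_failed_units(result.stdout)


if __name__ == "__main__" and shutil.which("systemctl"):
    for unit in failed_units():
        print("FAILED:", unit)
```

An empty list is the scripted equivalent of a “running” system state.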

 

After fixing everything, I rebooted a few times and checked the overall status again after each reboot to make sure it was “running”.

In my case, I had a few issues with “systemd-modules-load.service”. This was probably related to my dist-upgrade. Some old and no longer existing modules were still listed in /etc/modules and, of course, the service wasn’t able to load them, failing miserably.
I tested each module using modprobe <module_name> and commented out the ones that were failing. Restarted and voilà, status… running!
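Testing the modules one by one can also be automated. Here is a rough Python sketch of the idea: read the names from /etc/modules and ask modprobe, in dry-run mode, whether each one can be resolved. The function names are my own, and I am assuming that anything after the first word on a line can be ignored for this check.

```python
import shutil
import subprocess


def listed_modules(text):
    """Return module names from /etc/modules-style text.

    Blank lines and lines starting with '#' are skipped; only the first
    word of each remaining line is taken as the module name.
    """
    names = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.append(line.split()[0])
    return names


def loadable(module):
    """True if 'modprobe --dry-run' can resolve the module."""
    result = subprocess.run(
        ["modprobe", "--dry-run", module],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


if __name__ == "__main__" and shutil.which("modprobe"):
    try:
        with open("/etc/modules") as handle:
            for name in listed_modules(handle.read()):
                print("%-20s %s" % (name, "ok" if loadable(name) else "MISSING"))
    except OSError:
        print("/etc/modules not found")
```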

On another Pi I had some issues with Apache, but I can’t remember how I fixed them. Still, the goal of this post is mostly to make everyone aware that systemctl can give you some interesting information about the system, so you can focus your energy on the failed services.

I admit in all honesty that I don’t have much of a clue why all the issues disappeared after fixing these failed services. In fact, one Pi with the same non-existent modules listed rebooted fine, while another one got stuck during boot. Again, I could probably troubleshoot further but I have a life to live as well 🙂

 


OVH API notes

 

Create App: https://eu.api.ovh.com/createApp/

Python wrapper project: https://github.com/ovh/python-ovh

Web API control panel: https://eu.api.ovh.com/console/#/

# Where to find the ID of your application created from the portal

/me/api/application


# If you use the python script to get the consumer_key
# You can find the ID here, filtering by application ID

/me/api/credential


# Here you can find your serviceName (project ID)
# - I went mad trying to work out what this was!

/cloud/project

 

Example of ovh.conf file

[default]
endpoint=ovh-eu

[ovh-eu]
application_key=my_app_key
application_secret=my_application_secret
;consumer_key=my_consumer_key

;consumer_key needs to be uncommented once you have got it
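A quick way to check whether the file is complete is to parse it. The sketch below uses Python’s standard configparser, which treats lines starting with ; as comments, so the commented-out consumer_key is reported as missing. It only looks at the text of the file; it does not talk to the OVH API, and the helper name is made up.

```python
import configparser

SAMPLE = """\
[default]
endpoint=ovh-eu

[ovh-eu]
application_key=my_app_key
application_secret=my_application_secret
;consumer_key=my_consumer_key
"""


def missing_keys(conf_text):
    """List the credentials still missing for the default endpoint."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    endpoint = parser.get("default", "endpoint")
    required = ("application_key", "application_secret", "consumer_key")
    return [key for key in required if not parser.has_option(endpoint, key)]


if __name__ == "__main__":
    print(missing_keys(SAMPLE))
```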

 

 

Custom python script to allow access only to a specific project under my Cloud OVH account

# -*- encoding: utf-8 -*-

import ovh

# create a client using configuration
client = ovh.Client()

# Request full access to /cloud/project/<PROJECT_ID>/
ck = client.new_consumer_key_request()
ck.add_recursive_rules(ovh.API_READ_WRITE, '/cloud/project/<PROJECT_ID>/')

## Request full access to ALL
#ck = client.new_consumer_key_request()
# ck.add_recursive_rules(ovh.API_READ_WRITE, '/')

# Request token
validation = ck.request()

print("Please visit %s to authenticate" % validation['validationUrl'])
input("and press Enter to continue...")

# Print consumerKey
print("Btw, your 'consumerKey' is '%s'" % validation['consumerKey'])

 

How to create a script

  1. Create the app from the link above
  2. Get the keys and store them safely
  3. Install the OVH python wrapper
  4. Create ovh.conf file and use the keys from your app
  5. Use the python example (or mine) to get the consumerKey
  6. Update ovh.conf with the consumerKey
  7. Create your script and have fun! 🙂

Script example to get a list of snapshots:

# -*- encoding: utf-8 -*-
import json
import ovh

serviceName="<PROJECT_ID>"
region="GRA3"


# Auth
client = ovh.Client()


result = client.get("/cloud/project/%s/snapshot" % serviceName,
    flavorType=None,
    region="%s" % region,
)


# Pretty print
print(json.dumps(result, indent=4))

 

Email notification for successful SSH connection

If you manage a remote server, and you are a bit paranoid about the bad guys out there, it could be nice to have some sort of notification every time an SSH connection is successful.

I found this post and it seems to work pretty well for me too.
I’ve installed this on my CentOS7 server and it seems to be working well! Of course, this is in addition to an aggressive Fail2Ban setup.

  1. Make sure you have your MTA (Postfix/Sendmail…) configured to deliver emails to the user root
  2. Make sure you get the emails for the user root (otherwise doesn’t make any sense 😛 )
  3. Create this script as /usr/local/bin/send-mail-on-ssh-login.sh (this is a slightly modified version of the one in the original post):
    #!/bin/sh
    if [ "$PAM_TYPE" != "open_session" ]
    then
      exit 0
    else
      {
        echo "User: $PAM_USER"
        echo "Remote Host: $PAM_RHOST"
        echo "Service: $PAM_SERVICE"
        echo "TTY: $PAM_TTY"
        echo "Date: `date`"
        echo "Server: `uname -a`"
      } | mail -s "$PAM_SERVICE login on `hostname -s` from user $PAM_USER@$PAM_RHOST" root
    fi
    exit 0
    
  4. Set the permission:
    chmod +x /usr/local/bin/send-mail-on-ssh-login.sh
  5. Append this line to /etc/pam.d/sshd
    session optional pam_exec.so /usr/local/bin/send-mail-on-ssh-login.sh
  6.  …and that’s it! 😉

 

If you’d like to have a specific domain/IP whitelisted, for example if you don’t want to get notified when you connect from your office or your home (a fixed IP, or a hostname that always resolves to your current IP, is required), you can use this version of the script:

#!/bin/bash
if [ "$PAM_TYPE" != "open_session" ]; then
  exit 0
else
  MSG="$PAM_SERVICE login on `hostname -s` from user $PAM_USER@$PAM_RHOST"
  # check if the PAM_RHOST is shown as IP
  echo "$PAM_RHOST" | grep -q -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
  if [ $? -eq 0 ]; then
    SRCIP=$PAM_RHOST
  else
    SRCIP=$(dig +short $PAM_RHOST)
  fi
  SAFEIP=$(dig +short myofficedomain.com)
  if [ "$SRCIP" == "$SAFEIP" ]; then
    echo "Authorised $MSG" | logger
  else
  {
    echo "User: $PAM_USER"
    echo "Remote Host: $PAM_RHOST"
    echo "Service: $PAM_SERVICE"
    echo "TTY: $PAM_TTY"
    echo "Date: `date`"
    echo "Server: `uname -a`"
  } | mail -s "Unexpected $MSG" root
  fi
fi
exit 0

The script will send an email ONLY if the source IP is not the one from myofficedomain.com; however, it will log the authentication in /var/log/messages using logger command.
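The decision logic of the script is easier to reason about (and to test) outside of PAM. Here is a small Python sketch of the same idea. It is not a drop-in replacement: the resolver is passed in as a function (the shell version uses dig +short), and all names are invented for this example. One difference worth noting: the ipaddress module rejects strings like 999.1.1.1, which the grep pattern above would accept.

```python
import ipaddress


def is_ip(value):
    """True if value is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def should_notify(rhost, safe_ips, resolve):
    """Decide whether a login from rhost deserves an email.

    resolve is a callable mapping a hostname to a list of IP strings
    (the shell script uses 'dig +short' for this step).
    """
    source_ips = [rhost] if is_ip(rhost) else resolve(rhost)
    # Notify unless every source address is in the safe list
    return not source_ips or any(ip not in safe_ips for ip in source_ips)
```

A host name that does not resolve at all is treated as “unexpected”, which felt like the safer default to me.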

Chef – notes

Website: https://www.chef.io
Learning site: https://learn.chef.io

As with any other configuration management tool, the main goal is to automate and keep consistency in the infrastructure:

  • create files if missing
  • ignore file/task if already up to date
  • replace with original version if modified
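Those three behaviours are what people usually call idempotence. The Python sketch below imitates them for a single file. It is only an analogy to show the logic, not how Chef implements its file resource, and converge_file is a name I made up.

```python
import os
import tempfile


def converge_file(path, content):
    """Bring a file to the desired content; return the action taken."""
    if not os.path.exists(path):
        action = "created"
    else:
        with open(path) as handle:
            if handle.read() == content:
                return "up to date"
        action = "replaced"
    with open(path, "w") as handle:
        handle.write(content)
    return action


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "myfile")
    print(converge_file(target, "hello\n"))   # file is missing
    print(converge_file(target, "hello\n"))   # nothing to do
    with open(target, "w") as handle:
        handle.write("tampered\n")
    print(converge_file(target, "hello\n"))   # drift is corrected
```

Running the same call twice in a row changes nothing the second time, which is the property that makes repeated runs safe.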

Typically, Chef consists of three parts:

  1. your workstation – where you create your recipes/cookbooks
  2. a Chef server – hosts the active version of the recipes/cookbooks (central repository) and manages the nodes
  3. nodes – machines managed by the Chef server. FYI, every node has the Chef client installed.

[Diagram of workstation, Chef server and nodes – picture source https://learn.chef.io]

Generally, you write your cookbooks on your workstation and push them to the Chef Server. The node(s) communicate with the Chef Server via chef-client, pull the cookbooks and execute them.

There is no communication between the workstation and the node EXCEPT for the initial bootstrap task. This is the only time when the workstation connects directly to the node and provides the details required to communicate with the Chef Server (Chef Server’s URL, validation key). It also installs chef on the node and runs chef-client for the first time. During this run, the node gets registered on the Chef Server and receives a unique client.pem key, which chef-client will use to authenticate afterwards.
The information gets stored in a PostgreSQL DB, and there is some indexing happening as well in Apache Solr (Elasticsearch in a Chef Server cluster environment).

Further explanation here: https://docs.chef.io/chef_overview.html

Some terms:

  • resource: a part of the system in a desired state (e.g. package installed, file created…);
  • recipe: contains the declaration of resources; basically, the things to do;
  • cookbook: a collection of recipes, templates, attributes, etc… basically, the final collection of everything.

Important to remember:

  • there are default actions. If no action is specified, the default one applies (e.g. :create for a file),
  • in the recipe you define WHAT but not HOW. The “how” is managed by Chef itself,
  • the order is important! For example, make sure to define the installation of a package BEFORE enabling its service. ONLY attributes can be listed in any order.


Labs

Test images: http://chef.github.io/bento/ and https://atlas.hashicorp.com/bento
=> you can get these boxes using Vagrant

Example, how to get CentOS7 for Virtualbox and start it/connect/remove:

vagrant box add bento/centos-7.2 --provider=virtualbox

vagrant init bento/centos-7.2

vagrant up

vagrant ssh

vagrant destroy


Software links and info:

Chef DK: it provides tools (chef, knife, berks…) to manage your servers remotely from your workstation.
Download link here.

To communicate with the Chef Server, your workstation needs to have .chef/knife.rb file configured as well:

# See http://docs.chef.io/config_rb_knife.html for more information on knife configuration options

current_dir = File.dirname(__FILE__)
log_level                :info
log_location             STDOUT
node_name                "admin"
client_key               "#{current_dir}/admin.pem"
chef_server_url          "https://chef-server.test/organizations/myorg123"
cookbook_path            ["#{current_dir}/../cookbooks"]

Make sure to also have admin.pem (the RSA key) in the same .chef directory.

To fetch and verify the SSL certificate from the Chef server:

knife ssl fetch

knife ssl check

 

Chef DK also provides tools to allow you to configure a machine directly, but it is just for testing purposes. Syntax example:

chef-client --local-mode myrecipe.rb

 

 

Chef Server: Download here.
To remember: Chef Server needs RSA keys (command line switch --filename) to communicate. We have the user’s key and the organisation key (chef-validator key).
There are different types of installation. Here you can find more information. And here more detail about the new HA version.

Chef Server can have a web interface, if you also install the Chef Management Console:

# chef-server-ctl install chef-manage

 

Alternatively you can use Hosted Chef service.

Chef Client:
(From official docs) The chef-client accesses the Chef server from the node on which it’s installed to get configuration data, performs searches of historical chef-client run data, and then pulls down the necessary configuration data. After the chef-client run is finished, the chef-client uploads updated run data to the Chef server.

 


Handy commands:

# Create a cookbook (structure) called chef_test01, into cookbooks dir
chef generate cookbook cookbooks/chef_test01

# Create a template for file "index.html"
# this will generate a file "index.html.erb" under the "templates" folder of the cookbook
chef generate template cookbooks/chef_test01 index.html

# Run a specific recipe web.rb of a cookbook, locally
# --runlist + --local-mode
chef-client --local-mode --runlist 'recipe[chef_test01::web]'

# Upload cookbook to Chef server
knife cookbook upload chef_test01

# Verify uploaded cookbooks (and versions)
knife cookbook list

# Bootstrap a node (to do ONCE)
# knife bootstrap ADDRESS --ssh-user USER --sudo --identity-file IDENTITY_FILE --node-name NODE_NAME
# Opt: --run-list 'recipe[RECIPE_NAME]'
knife bootstrap 10.0.3.1 --ssh-port 22 --ssh-user user1 --sudo --identity-file /home/me/keys/user1_private_key --node-name node1
# Verify that the node has been added
knife node list
knife node show node1

# Run cookbook on one node
# (--attribute ipaddress is used if the node has no resolvable FQDN)
knife ssh 'name:node1' 'sudo chef-client' --ssh-user user1 --identity-file /home/me/keys/user1_private_key --attribute ipaddress

# Delete the data about your node from the Chef server
knife node delete node1
knife client delete node1

# Delete Cookbook on Chef Server (select which version)
# use  --all --yes if you want remove everything
knife cookbook delete chef_test01

# Delete a role
knife role delete web

 


Practical examples:

Create file/directory

directory '/my/path'

file '/my/path/myfile' do
  content 'Content to insert in myfile'
  owner 'user1'
  group 'user1'
  mode '0644'
end

Package management

package 'httpd'

service 'httpd' do
  action [:enable, :start]
end

Use of template

template '/var/www/html/index.html' do
  source 'index.html.erb'
end

Use variables in the template

<html>
  <body>
    <h1>hello from <%= node['fqdn'] %></h1>
  </body>
</html>

 


General notes

Chef Supermarket

link here – Community cookbook repository.
The best way to get a cookbook from Chef Supermarket is the Berkshelf command (berks), as it resolves all the dependencies. knife supermarket download does NOT fetch dependencies.

Add the cookbooks in Berksfile

source 'https://supermarket.chef.io'
cookbook 'chef-client'

And run

berks install

This will download the cookbooks and dependencies in ~/.berkshelf/cookbooks

Then to upload ALL to Chef Server, best way:

# Production
berks upload 

# Just to test (ignore SSL check)
berks upload --no-ssl-verify

 

Roles

A role defines the function of a node.
Roles are stored as objects on the Chef server.
Create them with knife role create OR (better) knife role from file <role/myrole.json>. Using JSON is recommended as it can be version controlled.

Examples of web.json role:

{
   "name": "web",
   "description": "Role for Web Server",
   "json_class": "Chef::Role",
   "override_attributes": {
   },
   "chef_type": "role",
   "run_list": ["recipe[chef_test01::default]",
                "recipe[chef_test01::web]"
   ],
   "env_run_lists": {
   }
}
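Since the role is plain JSON, it is easy to sanity-check before pushing it. This Python sketch (my own helper, nothing provided by Chef) loads a role and lists which cookbook and recipe each run_list entry points to. It assumes that an entry without “::” refers to the cookbook’s default recipe, and it ignores nested roles.

```python
import json
import re

ROLE = """
{
   "name": "web",
   "description": "Role for Web Server",
   "json_class": "Chef::Role",
   "override_attributes": {},
   "chef_type": "role",
   "run_list": ["recipe[chef_test01::default]",
                "recipe[chef_test01::web]"],
   "env_run_lists": {}
}
"""


def recipes_in_role(role_text):
    """Return (cookbook, recipe) pairs from a role's run_list."""
    role = json.loads(role_text)
    pairs = []
    for item in role.get("run_list", []):
        match = re.fullmatch(r"recipe\[([^:\]]+)(?:::([^\]]+))?\]", item)
        if match:
            # A bare 'recipe[cookbook]' means the cookbook's default recipe
            pairs.append((match.group(1), match.group(2) or "default"))
    return pairs


if __name__ == "__main__":
    for cookbook, recipe in recipes_in_role(ROLE):
        print("%s::%s" % (cookbook, recipe))
```

json.loads also fails loudly on a stray comma, which catches the most common mistake before knife does.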

Commands:

# Push a role
knife role from file roles/web.json
knife role from file roles/db.json

# Check what's available
knife role list

# View the role pushed
knife role show web

# Assign a role to a specific node
knife node run_list set node1 "role[web]"
knife node run_list set node2 "role[db]"

# Verify
knife node show node1
knife node show node2

To apply the changes you need to run chef-client on the node.

You can also verify:

knife status 'role:web' --run-list

 


Kitchen

All the following is extracted from the official https://learn.chef.io

Test Kitchen helps speed up the development process by applying your infrastructure code on test environments from your workstation, before you apply your work in production.

Test Kitchen runs your infrastructure code in an isolated environment that resembles your production environment. With Test Kitchen, you continue to write your Chef code from your workstation, but instead of uploading your code to the Chef server and applying it to a node, Test Kitchen applies your code to a temporary environment, such as a virtual machine on your workstation or a cloud or container instance.

When you use the chef generate cookbook command to create a cookbook, Chef creates a file named .kitchen.yml in the root directory of your cookbook. .kitchen.yml defines what’s needed to run Test Kitchen, including which virtualisation provider to use, how to run Chef, and what platforms to run your code on.

Kitchen steps:

[Diagram: Kitchen workflow]

Handy commands:

$ kitchen list
$ kitchen create
$ kitchen converge

 

Dynamic MOTD on Centos7

Just a few steps!

Install figlet package:

yum install figlet

 

Create /etc/motd.sh script with this content:

#!/bin/sh
#
clear
figlet -f slant $(hostnamectl --pretty)
printf "\n"
printf "\t- %s\n\t- Kernel %s\n" "$(cat /etc/redhat-release)" "$(uname -r)"
printf "\n"



date=`date`
load=`cat /proc/loadavg | awk '{print $1}'`
root_usage=`df -h / | awk '/\// {print $(NF-1)}'`
memory_usage=`free -m | awk '/Mem:/ { printf("%3.1f%%", $3/$2*100) }'`
swap_usage=`free -m | awk '/Swap:/ { if ($2 > 0) printf("%3.1f%%", $3/$2*100); else printf("n/a") }'`
users=`users | wc -w`
time=`uptime | grep -ohe 'up .*' | sed 's/,/\ hours/g' | awk '{ printf $2" "$3 }'`
processes=`ps aux | wc -l`
ethup=$(ip -4 ad | grep 'state UP' | awk -F ":" '!/^[0-9]*: ?lo/ {print $2}')
ip=$(ip ad show dev $ethup |grep -v inet6 | grep inet|awk '{print $2}')

echo "System information as of: $date"
echo
printf "System load:\t%s\tIP Address:\t%s\n" $load $ip
printf "Memory usage:\t%s\tSystem uptime:\t%s\n" $memory_usage "$time"
printf "Usage on /:\t%s\tSwap usage:\t%s\n" $root_usage $swap_usage
printf "Local Users:\t%s\tProcesses:\t%s\n" $users $processes
echo

[ -f /etc/motd.tail ] && cat /etc/motd.tail || true

Make the script executable:

chmod +x /etc/motd.sh

Append a call to this script to /etc/profile so that it is executed as the last command when a user logs in:

echo "/etc/motd.sh" >> /etc/profile

Try and have fun! 🙂
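If the awk one-liners look cryptic, here is what the memory and swap percentages boil down to, written as a small Python sketch. The sample text imitates the column layout that free -m prints on CentOS 7, with numbers I made up, and usage_percent is a hypothetical helper name.

```python
def usage_percent(free_output, label):
    """Return used/total as a percentage for a 'Mem:' or 'Swap:' row.

    Returns None when the row is missing or its total is zero
    (for example, a machine with no swap configured).
    """
    for line in free_output.splitlines():
        fields = line.split()
        if fields and fields[0] == label:
            total, used = int(fields[1]), int(fields[2])
            return round(used / total * 100, 1) if total else None
    return None


# Made-up sample in the layout of 'free -m'
SAMPLE = """\
              total        used        free      shared  buff/cache   available
Mem:           3790         758        2394           8         637        2790
Swap:             0           0           0
"""

if __name__ == "__main__":
    print("Memory usage:", usage_percent(SAMPLE, "Mem:"))
    print("Swap usage:", usage_percent(SAMPLE, "Swap:"))
```

The zero-total guard matters on machines without swap, where a plain division would fail.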

 

If you are using Debian, here the other guide.

LVM – How to

Intro

LVM is a very powerful technology, and can really make a Sysadmin’s life easier.
However, it is something that we generally set up at the beginning (most of the time it’s now set up automatically during the installation process), and it’s well known… when we stop using something, we tend to forget how to use it.

This is why I’m writing this how-to, mostly to keep track of the major features and commands, in case I need them again in the future 😉

Before proceeding, please digest the following journey of a poor physical device that gets abstracted into usable pieces.

                                                          VG                        VG
                                                   +---------------+         +---------------+
                                                   |      PV       |  +--->  |               |
                                   PV              | +-----------+ |         |  LV           |
                              +-----------+        | |  8E LVM   | |         |               |
              8E LVM          |  8E LVM   |        | | +-------+ | |         +---------------+
             +-------+        | +-------+ |        | | | +---+ | | |  +--->  +---------------+
+---+        | +---+ |        | | +---+ | |        | | | |DEV| | | |         |               |
|DEV| +----> | |DEV| | +----> | | |DEV| | | +----> | | | +---+ | | |         |  LV           |
+---+        | +---+ |        | | +---+ | |        | | +-------+ | |         |               |
  1.         +-------+        | +-------+ |        | +-----------+ |  +--->  |               |
                2.            +-----------+        |               |         +---------------+
                                   3.              |      PV       |         +---------------+
                                                   | +-----------+ |         |               |
 1. Original Device:                               | |  8E LVM   | |  +--->  |  LV           |
    (physical/virtual disk/partition/raid)         | | +-------+ | |         |               |
 2. fdisk'd to label 8E for LVM                    | | | +---+ | | |         |               |
 3. initialised as LVM Physical Volume             | | | |DEV| | | |         |               |
 4. Added in a LVM Volume Group                    | | | +---+ | | |  +--->  |               |
 5. "partitioned" in single/multiple               | | +-------+ | |         |               |
    LVM Logical Volumes                         4. | +-----------+ |      5. |               |
                                                   +---------------+         +---------------+

 

 

Prepare partitions

First of all, we need to find which device(s) we want to setup for LVM

fdisk -l

[root@n1 ~]# fdisk -l

Disk /dev/xvda: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000b7f16

    Device Boot      Start         End      Blocks   Id  System
/dev/xvda1   *        2048    41943039    20970496   83  Linux

Disk /dev/md1: 4996 MB, 4996726784 bytes, 9759232 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/md2: 4996 MB, 4996726784 bytes, 9759232 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/md3: 4996 MB, 4996726784 bytes, 9759232 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

We can see 3 md devices, probably RAID devices. These are the ones that we are going to use for our LVM exercise.

Now, let’s create an LVM partition.

fdisk <device> => n , p , 1 , (enter) , (enter) , t , 8e , w

[root@n1 ~]# fdisk /dev/md1
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table
Building a new DOS disklabel with disk identifier 0x50b03cd2.

Command (m for help): n
Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-9759231, default 2048): 
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-9759231, default 9759231): 
Using default value 9759231
Partition 1 of type Linux and of size 4.7 GiB is set

Command (m for help): t
Selected partition 1
Hex code (type L to list all codes): 8e
Changed type of partition 'Linux' to 'Linux LVM'

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Do the same for all the devices that you want to use for LVM. In my example, I’ve done this for /dev/md1, /dev/md2 and /dev/md3.

Shortcut (risky but quicker) 🙂

echo -e "o\nn\np\n1\n\n\nt\n8e\nw" | fdisk /dev/mdx
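For future me, who will have forgotten what each letter means: the sketch below spells out the keystrokes of the shortcut and rebuilds the exact string passed to echo -e. Note the leading o, which is not in the manual sequence above: it creates a fresh DOS partition table, and that is a big part of why the shortcut is risky. The names are just for this example.

```python
# Each tuple is (keystroke, meaning); an empty keystroke accepts the default.
FDISK_STEPS = [
    ("o", "create a new empty DOS partition table"),
    ("n", "add a new partition"),
    ("p", "make it a primary partition"),
    ("1", "partition number 1"),
    ("", "accept the default first sector"),
    ("", "accept the default last sector"),
    ("t", "change the partition type"),
    ("8e", "hex code for Linux LVM"),
    ("w", "write the table to disk and exit"),
]


def fdisk_input(steps):
    """Join the keystrokes the way 'echo -e' feeds them to fdisk."""
    return "\n".join(key for key, _ in steps)


if __name__ == "__main__":
    for key, meaning in FDISK_STEPS:
        print("%-4s %s" % (key or "<CR>", meaning))
```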

All seems now good to go: we have Linux LVM partitions!

Disk /dev/md1: 4996 MB, 4996726784 bytes, 9759232 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x50b03cd2

    Device Boot      Start         End      Blocks   Id  System
/dev/md1p1            2048     9759231     4878592   8e  Linux LVM

Disk /dev/md2: 4996 MB, 4996726784 bytes, 9759232 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x4b68e9a3

    Device Boot      Start         End      Blocks   Id  System
/dev/md2p1            2048     9759231     4878592   8e  Linux LVM

Disk /dev/md3: 4996 MB, 4996726784 bytes, 9759232 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x8dcf9ba0

    Device Boot      Start         End      Blocks   Id  System
/dev/md3p1            2048     9759231     4878592   8e  Linux LVM

Time to start to configure LVM

Configure LVM

First of all, we need to make these Linux LVM partitions able to be part of a group (VG). I always find it tricky to remember the logic behind this. Think of it this way: the device is currently just labelled “Linux LVM”, but it still needs to be initialised as a physical volume (PV).

pvcreate <dev>

[root@n1 ~]# pvcreate /dev/md1p1
  Physical volume "/dev/md1p1" successfully created.
[root@n1 ~]# pvcreate /dev/md2p1
  Physical volume "/dev/md2p1" successfully created.
[root@n1 ~]# pvcreate /dev/md3p1
  Physical volume "/dev/md3p1" successfully created.

Now these guys are ready to be part of a group. In this case a Volume Group (VG).
Let’s check that it’s actually true:
pvs

[root@n1 ~]# pvs
  PV         VG Fmt  Attr PSize PFree
  /dev/md1p1    lvm2 ---  4.65g 4.65g
  /dev/md2p1    lvm2 ---  4.65g 4.65g
  /dev/md3p1    lvm2 ---  4.65g 4.65g

Time to create a group with these devices (this could be done also with just a single device):

vgcreate <lvmgroupname> <dev> <dev> …

[root@n1 ~]# vgcreate mylvmvg /dev/md1p1 /dev/md2p1 /dev/md3p1
  Volume group "mylvmvg" successfully created

Now, let’s check again with pvs and vgs

[root@n1 ~]# pvs
  PV         VG      Fmt  Attr PSize PFree
  /dev/md1p1 mylvmvg lvm2 a--  4.65g 4.65g
  /dev/md2p1 mylvmvg lvm2 a--  4.65g 4.65g
  /dev/md3p1 mylvmvg lvm2 a--  4.65g 4.65g
[root@n1 ~]# vgs
  VG      #PV #LV #SN Attr   VSize  VFree 
  mylvmvg   3   0   0 wz--n- 13.95g 13.95g

Now pvs shows the VG column no longer empty but with mylvmvg. And vgs tells us that the VG is about 14GB in size, fully free, with no LV in it.

Good! Now, let’s make some LVs (logical volumes). These will be the new “partitions/disks” that we will be actually able to format, mount and use! 🙂

lvcreate -n <name> -L xGB <vg_group_name>

[root@n1 ~]# lvcreate -n part1 -L 2GB mylvmvg
  Logical volume "part1" created.

Some checks to verify:

[root@n1 ~]# vgs
  VG      #PV #LV #SN Attr   VSize  VFree 
  mylvmvg   3   1   0 wz--n- 13.95g 11.95g
[root@n1 ~]# lvs
  LV    VG      Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  part1 mylvmvg -wi-a----- 2.00g

A new LV appears in vgs and lvs shows the 2GB volume that we have created.
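If you wonder where 4.65g, 13.95g and 11.95g come from, the numbers can be reproduced with a bit of arithmetic. The sketch below assumes LVM’s usual defaults, which I did not change: a 4 MiB physical extent and a data area starting 1 MiB into each PV. Both values are assumptions on my side, and the helper names are made up.

```python
EXTENT_MIB = 4          # assumed default physical extent size
DATA_START_MIB = 1      # assumed default offset of the first extent


def pv_extents(partition_kib):
    """Whole extents that fit in a partition after the metadata area."""
    usable_mib = partition_kib / 1024 - DATA_START_MIB
    return int(usable_mib // EXTENT_MIB)


def gib(extents):
    """Size of a number of extents in GiB, rounded to two decimals."""
    return round(extents * EXTENT_MIB / 1024, 2)


per_pv = pv_extents(4878592)          # 'Blocks' column from fdisk, in KiB
vg_total = 3 * per_pv                 # three identical PVs
lv_part1 = 2 * 1024 // EXTENT_MIB     # a 2 GiB logical volume

print(gib(per_pv), gib(vg_total), gib(vg_total - lv_part1))
```

Under those assumptions the three values line up with the PSize, VSize and VFree columns shown above.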

Let’s create another one, but this time using all the remaining space (using the -l 100%VG option instead of -L xGB; -l 100%FREE expresses “whatever is left” more explicitly)

[root@n1 ~]# lvcreate -n part2 -l 100%VG mylvmvg
  Logical volume "part2" created.
[root@n1 ~]# vgs
  VG      #PV #LV #SN Attr   VSize  VFree
  mylvmvg   3   2   0 wz--n- 13.95g    0 
[root@n1 ~]# lvs
  LV    VG      Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  part1 mylvmvg -wi-a-----  2.00g                                                    
  part2 mylvmvg -wi-a----- 11.95g                                                    
[root@n1 ~]# 

Magic!
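
To see why the -l 100%VG call above ended up with 11.95g rather than the full 13.95g, here is a small sketch of the arithmetic. It is a simplified model written for this post, not LVM's actual code: the function name resolve_extents is made up, and the extent counts (3570 in the VG, 3058 free) are derived from the 512 and 3058 extents that the resize commands report further down.

```shell
#!/bin/sh
# Simplified model (an assumption, not LVM's real code) of how
# `lvcreate -l N%VG` and `-l N%FREE` turn into a number of extents.
# Usage: resolve_extents <percent> <VG|FREE> <vg_total_extents> <vg_free_extents>
resolve_extents() {
    pct=$1; base=$2; total=$3; free=$4
    case "$base" in
        VG)   want=$(( total * pct / 100 )) ;;
        FREE) want=$(( free * pct / 100 )) ;;
        *)    echo "unknown base: $base" >&2; return 1 ;;
    esac
    # A percentage is an upper limit: never more than what is still free.
    if [ "$want" -gt "$free" ]; then want=$free; fi
    echo "$want"
}

# Numbers from this walkthrough: 3570 extents in the VG, 3058 of them
# free once part1 (512 extents) exists.
resolve_extents 100 VG   3570 3058   # capped at the free extents
resolve_extents 100 FREE 3570 3058   # same result, asked for directly
resolve_extents 50  FREE 3570 3058   # half of what is left
```

Both 100% variants land on the same 3058 extents here, which is why the transcript works either way; %FREE just says what we mean.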

Now, we have two devices, both ‘a’ -> active and ready to be formatted:
mkfs.ext4 <device>

[root@n1 ~]# mkfs.ext4 /dev/mylvmvg/part1 
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
131072 inodes, 524288 blocks
26214 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=536870912
16 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done 

I’ve done this for /dev/mylvmvg/part1 and /dev/mylvmvg/part2.

Let’s create the mount points and mount them:

[root@n1 ~]# mkdir -p /mountpoint1 /mountpoint2

[root@n1 ~]# mount -t ext4 /dev/mylvmvg/part1 /mountpoint1

[root@n1 ~]# mount -t ext4 /dev/mylvmvg/part2 /mountpoint2

[root@n1 ~]# mount | grep mylvmvg
/dev/mapper/mylvmvg-part1 on /mountpoint1 type ext4 (rw,relatime,data=ordered)
/dev/mapper/mylvmvg-part2 on /mountpoint2 type ext4 (rw,relatime,data=ordered)

[root@n1 ~]# df -Th | grep mapper
/dev/mapper/mylvmvg-part1 ext4      2.0G  6.0M  1.8G   1% /mountpoint1
/dev/mapper/mylvmvg-part2 ext4       12G   41M   11G   1% /mountpoint2

As you can see, the devices now appear as /dev/mapper/mylvmvg-partX. You can use either /dev/mylvmvg/partX or /dev/mapper/mylvmvg-partX, as both point to the same device. Theoretically, the mapper one is recommended (my bad!).
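
If you ever need to build the mapper path in a script, the naming rule is simple: VG name, a dash, LV name, with any dash inside a name doubled. A minimal sketch follows (mapper_path is a made-up helper name, and my-vg/data-lv are hypothetical names used only to show the doubling):

```shell
#!/bin/sh
# Build the /dev/mapper path for a VG/LV pair.
# device-mapper joins the two names with a single '-', so any '-'
# inside a name is doubled to keep the result unambiguous.
mapper_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

mapper_path mylvmvg part1     # the volume used in this post
mapper_path my-vg data-lv     # hypothetical names containing dashes
```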

Now the 2 devices are ready to be used like a typical disk/partition formatted with an ext4 filesystem.


Resize Logical Volume

Now, imagine that part1 is too small, and you need more space. And luckily, your part2 volume has plenty. Is there any way to “steal” some space from part2 and give it to part1?
Ooohh yesss! 🙂

How?

  1. shrink part2 logical volume AND its filesystem
  2. expand part1 logical volume AND its filesystem

Here are the commands, with comments inline:

# Important: the -r option (it RESIZES the filesystem as part of the same operation)
[root@n1 ~]# lvreduce -L -5GB -r /dev/mylvmvg/part2 
Do you want to unmount "/mountpoint2"? [Y|n] y
fsck from util-linux 2.23.2
/dev/mapper/mylvmvg-part2: 12/783360 files (0.0% non-contiguous), 92221/3131392 blocks
resize2fs 1.42.9 (28-Dec-2013)
Resizing the filesystem on /dev/mapper/mylvmvg-part2 to 1820672 (4k) blocks.
The filesystem on /dev/mapper/mylvmvg-part2 is now 1820672 blocks long.

  Size of logical volume mylvmvg/part2 changed from 11.95 GiB (3058 extents) to 6.95 GiB (1778 extents).
  Logical volume mylvmvg/part2 successfully resized.

# Here we can see that part2 is now smaller than before
[root@n1 ~]# lvs
  LV    VG      Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  part1 mylvmvg -wi-ao---- 2.00g                                                    
  part2 mylvmvg -wi-ao---- 6.95g                                                    

# And here we can see 5GB available in the vg
[root@n1 ~]# vgs
  VG      #PV #LV #SN Attr   VSize  VFree
  mylvmvg   3   2   0 wz--n- 13.95g 5.00g

# We assign the 5GB available to part1
[root@n1 ~]# lvextend -L +5GB -r /dev/mylvmvg/part1
  Size of logical volume mylvmvg/part1 changed from 2.00 GiB (512 extents) to 7.00 GiB (1792 extents).
  Logical volume mylvmvg/part1 successfully resized.
resize2fs 1.42.9 (28-Dec-2013)
Filesystem at /dev/mapper/mylvmvg-part1 is mounted on /mountpoint1; on-line resizing required
old_desc_blocks = 1, new_desc_blocks = 1
The filesystem on /dev/mapper/mylvmvg-part1 is now 1835008 blocks long.

# No more Free space
[root@n1 ~]# vgs
  VG      #PV #LV #SN Attr   VSize  VFree
  mylvmvg   3   2   0 wz--n- 13.95g    0 

# part1 is now 7GB (prev 2GB)
[root@n1 ~]# lvs
  LV    VG      Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  part1 mylvmvg -wi-ao---- 7.00g                                                    
  part2 mylvmvg -wi-ao---- 6.95g                                                    

# df shows the new sizes as well
[root@n1 ~]# df -Th | grep mapper
/dev/mapper/mylvmvg-part1 ext4      6.9G  9.1M  6.6G   1% /mountpoint1
/dev/mapper/mylvmvg-part2 ext4      6.8G   37M  6.4G   1% /mountpoint2
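
The extent and block counts printed by lvreduce, lvextend and resize2fs above all line up, and it is a useful sanity check to redo the maths. This sketch assumes the LVM default extent size of 4 MiB (which matches the transcript: 2.00 GiB = 512 extents) and the 4 KiB ext4 block size shown by mkfs; the helper names are made up.

```shell
#!/bin/sh
# Assumption: 4 MiB extents (LVM default; 512 extents = 2.00 GiB above).
EXTENT_MIB=4

extents_to_mib() { echo $(( $1 * EXTENT_MIB )); }
# ext4 blocks are 4 KiB here, so 256 blocks make 1 MiB.
blocks_to_mib()  { echo $(( $1 / 256 )); }

moved=$(( 3058 - 1778 ))      # extents taken away from part2
echo "extents moved:       $moved"
echo "MiB moved:           $(extents_to_mib "$moved")"
echo "part1 extents after: $(( 512 + moved ))"
echo "part2 LV size (MiB): $(extents_to_mib 1778)"
echo "part2 FS size (MiB): $(blocks_to_mib 1820672)"
```

The last two lines print the same number: after the shrink, the filesystem fills its logical volume exactly, which is what -r is there to guarantee.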

 

Move logical volume onto a new RAID array

Now, let’s imagine that one of the 3 initial md devices is having problems, or we simply want to move to a faster/bigger RAID array.
The magic of LVM is that we can actually do this with NO DOWNTIME!

How?

In this example we assume that a new /dev/md10 device is attached to our server and we want to remove the /dev/md2 device.

  1. We need to take the new device and go through all the previous steps:
    1. fdisk
    2. pvcreate
  2. After that, we need to add this initialised device to the existing volume group (vg)
  3. Move whatever is stored on the old physical device
  4. Shrink the volume group
  5. Remove the old device

[root@n1 ~]# echo -e "o\nn\np\n1\n\n\nt\n8e\nw" | fdisk /dev/md10
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table
Building a new DOS disklabel with disk identifier 0x465fad01.

Command (m for help): Building a new DOS disklabel with disk identifier 0x5aa41f03.

Command (m for help): Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): Partition number (1-4, default 1): First sector (2048-104791935, default 2048): Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-104791935, default 104791935): Using default value 104791935
Partition 1 of type Linux and of size 50 GiB is set

Command (m for help): Selected partition 1
Hex code (type L to list all codes): Changed type of partition 'Linux' to 'Linux LVM'

Command (m for help): The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

[root@n1 ~]# fdisk -l | grep md10
Disk /dev/md10: 53.7 GB, 53653471232 bytes, 104791936 sectors
/dev/md10p1            2048   104791935    52394944   8e  Linux LVM
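
The echo -e one-liner piped into fdisk above is a bit cryptic, so here it is spelled out, one answer per line. This is only a sketch of the same keystrokes (fdisk_answers is a made-up name); calling it just prints the answers and touches no disk.

```shell
#!/bin/sh
# Print, one per line, the answers that the one-liner feeds to fdisk.
fdisk_answers() {
    echo o     # create a new empty DOS partition table
    echo n     # new partition
    echo p     # primary
    echo 1     # partition number 1
    echo       # first sector: accept the default
    echo       # last sector: accept the default (use the whole device)
    echo t     # change the partition type (partition 1 is auto-selected)
    echo 8e    # 8e = Linux LVM
    echo w     # write the table and exit
}

# Harmless on its own: only prints the nine answers.
fdisk_answers

# Real use (destructive, rewrites the partition table of the target):
#   fdisk_answers | fdisk /dev/md10
```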

[root@n1 ~]# pvcreate /dev/md10p1
  Physical volume "/dev/md10p1" successfully created.

[root@n1 ~]# pvs
  PV          VG      Fmt  Attr PSize  PFree 
  /dev/md10p1         lvm2 ---  49.97g 49.97g
  /dev/md1p1  mylvmvg lvm2 a--   4.65g     0 
  /dev/md2p1  mylvmvg lvm2 a--   4.65g     0 
  /dev/md3p1  mylvmvg lvm2 a--   4.65g     0 

[root@n1 ~]# vgextend mylvmvg /dev/md10p1
  Volume group "mylvmvg" successfully extended

[root@n1 ~]# pvs
  PV          VG      Fmt  Attr PSize  PFree 
  /dev/md10p1 mylvmvg lvm2 a--  49.96g 49.96g
  /dev/md1p1  mylvmvg lvm2 a--   4.65g     0 
  /dev/md2p1  mylvmvg lvm2 a--   4.65g     0 
  /dev/md3p1  mylvmvg lvm2 a--   4.65g     0 

[root@n1 ~]# vgs
  VG      #PV #LV #SN Attr   VSize  VFree 
  mylvmvg   4   2   0 wz--n- 63.91g 49.96g

Now this is where the new bits start:
pvmove, vgreduce, pvremove

[root@n1 ~]# pvmove /dev/md2p1
  /dev/md2p1: Moved: 0.00%
  /dev/md2p1: Moved: 5.63%
  /dev/md2p1: Moved: 11.51%
  ...
  /dev/md2p1: Moved: 92.61%
  /dev/md2p1: Moved: 98.07%
  /dev/md2p1: Moved: 100.00%

# Here we can see 4 physical volumes, and a size of ~64GB
[root@n1 ~]# vgs
  VG      #PV #LV #SN Attr   VSize  VFree 
  mylvmvg   4   2   0 wz--n- 63.91g 49.96g

# We can see also that /dev/md2p1 is now fully FREE
[root@n1 ~]# pvs
  PV          VG      Fmt  Attr PSize  PFree 
  /dev/md10p1 mylvmvg lvm2 a--  49.96g 45.32g
  /dev/md1p1  mylvmvg lvm2 a--   4.65g     0 
  /dev/md2p1  mylvmvg lvm2 a--   4.65g  4.65g
  /dev/md3p1  mylvmvg lvm2 a--   4.65g     0 
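
The "fully FREE" check can also be scripted, which is handy before running vgreduce from automation. This is a rough sketch (pv_is_empty is a made-up helper) that reads pvs-style output on stdin and compares the last two columns, PSize and PFree. It compares the rounded values as displayed, so treat it as a quick check, not a guarantee.

```shell
#!/bin/sh
# Succeeds when the given PV shows PSize == PFree in `pvs` output read
# from stdin, i.e. nothing appears to be allocated on it.
pv_is_empty() {
    awk -v pv="$1" '$1 == pv && $NF == $(NF-1) { found = 1 }
                    END { exit !found }'
}

# Sample copied from the pvs output above.
# On a real system: pvs | pv_is_empty /dev/md2p1
sample='  PV          VG      Fmt  Attr PSize  PFree
  /dev/md10p1 mylvmvg lvm2 a--  49.96g 45.32g
  /dev/md1p1  mylvmvg lvm2 a--   4.65g     0
  /dev/md2p1  mylvmvg lvm2 a--   4.65g  4.65g
  /dev/md3p1  mylvmvg lvm2 a--   4.65g     0'

for pv in /dev/md1p1 /dev/md2p1; do
    if printf '%s\n' "$sample" | pv_is_empty "$pv"; then
        echo "$pv is empty"
    else
        echo "$pv still holds data"
    fi
done
```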

# we can safely remove this device from the vg
[root@n1 ~]# vgreduce mylvmvg /dev/md2p1
  Removed "/dev/md2p1" from volume group "mylvmvg"

[root@n1 ~]# vgs
  VG      #PV #LV #SN Attr   VSize  VFree 
  mylvmvg   3   2   0 wz--n- 59.26g 45.32g

# /dev/md2p1 doesn't belong to any VG anymore
[root@n1 ~]# pvs
  PV          VG      Fmt  Attr PSize  PFree 
  /dev/md10p1 mylvmvg lvm2 a--  49.96g 45.32g
  /dev/md1p1  mylvmvg lvm2 a--   4.65g     0 
  /dev/md2p1          lvm2 ---   4.65g  4.65g
  /dev/md3p1  mylvmvg lvm2 a--   4.65g     0 

# Remove and confirm: no more /dev/md2p1
[root@n1 ~]# pvremove /dev/md2p1
  Labels on physical volume "/dev/md2p1" successfully wiped.

[root@n1 ~]# pvs
  PV          VG      Fmt  Attr PSize  PFree 
  /dev/md10p1 mylvmvg lvm2 a--  49.96g 45.32g
  /dev/md1p1  mylvmvg lvm2 a--   4.65g     0 
  /dev/md3p1  mylvmvg lvm2 a--   4.65g     0 

 

In this example we let LVM decide where to put the data that was stored on the /dev/md2 device.
Just for reference, we could have specified the destination physical device (e.g. if we were planning to remove more devices and wanted to make sure that the data ended up on the new RAID and was not spread across the other disks):

pvmove /dev/md2p1 /dev/md10p1

Or, in case we just wanted to move a specific logical volume, let’s say part1:

pvmove -n part1 /dev/md2p1 /dev/md10p1
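
To recap the whole migration, here is a small dry-run sketch that only prints the commands, in order, for replacing one physical volume with another. It assumes the new device is already partitioned; plan_pv_replace is a made-up helper name, and nothing is executed, so you can review the plan before running it by hand as root.

```shell
#!/bin/sh
# Dry run: print the commands for replacing one PV with another.
# Nothing is executed here.
# Usage: plan_pv_replace <vg> <old_pv> <new_pv>
plan_pv_replace() {
    vg=$1; old=$2; new=$3
    echo "pvcreate $new"         # initialise the new device for LVM
    echo "vgextend $vg $new"     # add it to the volume group
    echo "pvmove $old $new"      # move the data, with an explicit destination
    echo "vgreduce $vg $old"     # drop the old device from the VG
    echo "pvremove $old"         # wipe the LVM label from the old device
}

plan_pv_replace mylvmvg /dev/md2p1 /dev/md10p1
```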

 

…happy LVM’ing! 😉

Vim – remove the yellow highlight

Be honest. It has happened to you too: you open a file that you edited a while ago with Vim, and it still shows that terribly annoying yellow highlight from an old search. And I’m sure you probably gave up, thinking that time would eventually remove it.

Wrong! 😛

Here is how to get rid of it:

  1. open the file
  2. press ESC
  3. type :nohl
  4. press Enter

Alternatively, the way I keep using (which seems easier to remember) is to search for some crazy string that doesn’t appear in the file.
For example:

  1. open the file
  2. press ESC
  3. type /fkjsaddflkjasd;flka
    (just type random stuff)
  4. press Enter

Done 😉