Tag Archive for 'LVM'

ZFS on Linux

This weekend I installed ZFS on FUSE/Linux to try out ZFS on my Linux server:

  • ZFS on Wikipedia:

    ZFS is a file system designed by Sun Microsystems for the Solaris Operating System. The features of ZFS include support for high storage capacities, integration of the concepts of filesystem and volume management, snapshots and copy-on-write clones, continuous integrity checking and automatic repair, RAID-Z and native NFSv4 ACLs. ZFS is implemented as open-source software, licensed under the Common Development and Distribution License (CDDL).

  • ZFS features:
    • Pooled Storage Model
    • Always consistent on disk
    • Protection from data corruption
    • Live data scrubbing
    • Instantaneous snapshots and clones
    • Fast native backup and restore
    • Highly scalable
    • Built in compression
    • Simplified administration model

I have not had time to test much yet, but I hope to investigate a bit more:


alegrome# lsb_release -d
Description:    Debian GNU/Linux 4.0 (etch)

alegrome# uname -a
Linux alegrome 2.6.18-6-686 #1 SMP Mon Aug 18 08:42:39 UTC 2008 i686 GNU/Linux

alegrome# zpool status -v
  pool: zpool01
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zpool01     ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            hde3    ONLINE       0     0     0
            hdg3    ONLINE       0     0     0

errors: No known data errors

alegrome# zfs list
NAME               USED  AVAIL  REFER  MOUNTPOINT
zpool01            134K  5.60G    21K  /zfs
zpool01/incoming    18K  5.60G    18K  /zfs/incoming
zpool01/tmp         18K  5.60G    18K  /zfs/tmp
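
For reference, a layout like the one listed above could be produced with commands along these lines. This is only a sketch, not a record of what was actually run: the device paths /dev/hde3 and /dev/hdg3 and the -m /zfs mountpoint are inferred from the output above, and run is a hypothetical helper that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: create a mirrored pool with two child filesystems.
# Assumption: device names and mountpoint are inferred from the listing above.

# Hypothetical helper: print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

make_pool() {
    run zpool create -m /zfs zpool01 mirror /dev/hde3 /dev/hdg3
    run zfs create zpool01/incoming
    run zfs create zpool01/tmp
}

# Dry run by default: only prints the three commands.
make_pool
```

Running it with DRY_RUN=0 as root would execute the commands for real, so check the device names first.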

Mirroring an lvol in Linux LVM

For anyone coming from HP-UX, mirroring an lvol under LVM is easy: use lvextend -m. On Linux, lvextend -m simply does not work…

In fact, the lvcreate command does have a -m option, and that one does work (I have tested it)! But lvextend does not accept it.

I downloaded the lvm2 sources. In commands.h I discovered the lvconvert command, which I did not know about:

alegrome# lvconvert
  Exactly one of --mirrors or --snapshot arguments required.
  lvconvert: Change logical volume layout

lvconvert [-m|--mirrors Mirrors [--corelog]]
        [--alloc AllocationPolicy]
        [-d|--debug]
        [-h|-?|--help]
        [-v|--verbose]
        [--version]
        LogicalVolume[Path] [PhysicalVolume[Path]...]

lvconvert [-s|--snapshot]
        [-c|--chunksize]
        [-d|--debug]
        [-h|-?|--help]
        [-v|--verbose]
        [-Z|--zero {y|n}]
        [--version]
        OriginalLogicalVolume[Path] SnapshotLogicalVolume[Path]

This command does seem to work. Let's try mirroring an lvol:

# lvconvert -m 1 /dev/vg02/lvweb
  Logical volume lvweb converted.

Let's look with lvdisplay at what the command has done:

alegrome# lvdisplay -m lvweb
  --- Logical volume ---
  LV Name                /dev/vg02/lvweb
  VG Name                vg02
  LV UUID                9I4wK7-2hn7-j4dI-o5yT-ngYx-8Wtd-m3no6q
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                200.00 MB
  Current LE             50
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:6

  --- Segments ---
  Logical extent 0 to 49:
    Type                mirror
    Mirrors             2
    Mirror size         50
    Mirror log volume   lvweb_mlog
    Mirror region size  512.00 KB
    Mirror original:
      Logical volume    lvweb_mimage_0
      Logical extents   0 to 49
    Mirror destinations:
      Logical volume    lvweb_mimage_1
      Logical extents   0 to 49

Note there: Type = mirror.

As you can see, lvweb is now mirrored. To be honest, I have not seen this documented anywhere (has anyone seen more about this?).

To remove the mirror (reduce), it would be done like this:

# lvconvert -m 0 /dev/vg02/lvweb
  Logical volume lvweb converted.

# lvdisplay -m lvweb
  --- Logical volume ---
  LV Name                /dev/vg02/lvweb
  VG Name                vg02
  LV UUID                9I4wK7-2hn7-j4dI-o5yT-ngYx-8Wtd-m3no6q
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                200.00 MB
  Current LE             50
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:6

  --- Segments ---
  Logical extent 0 to 49:
    Type                linear
    Physical volume     /dev/hdh4
    Physical extents    18370 to 18419

With lvs we can see the mirroring status:

alegrome# lvs lvweb
  LV     VG   Attr   LSize Origin Snap%  Move Log         Copy%
  lvweb  vg02 mwi-ao 200M                    lvweb_mlog   12.22

In Attr, the “m” stands for mirror.
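
The Attr and Copy% columns can also be checked from a script. The sketch below assumes the two values are obtained with lvs --noheadings -o lv_attr,copy_percent; mirror_state is a hypothetical helper that only parses that text, so the demo line runs without LVM.

```shell
#!/bin/sh
# Classify one line of "lvs --noheadings -o lv_attr,copy_percent" output.
# Input on stdin, e.g. "  mwi-ao 12.22".
mirror_state() {
    awk '{
        kind = substr($1, 1, 1)          # first Attr character: volume type
        # "m" = mirrored, "M" = mirrored without initial sync
        if (kind != "m" && kind != "M") { print "not mirrored"; exit }
        if ($2 + 0 >= 100) print "in sync"
        else print "syncing " $2 "%"
    }'
}

# Demo with the values from the lvs output above (no LVM needed).
echo "  mwi-ao 12.22" | mirror_state

# On a real system (assumption: root, lvm2 installed):
#   lvs --noheadings -o lv_attr,copy_percent vg02/lvweb | mirror_state
```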

This post was prompted by a comment from Rubik on a post by Ivan about “Crear un raid 1 a partir de un disco con datos sin formatear”. Thanks to both.

Redistributing the PV links of a VG (HP-UX): Update

Regarding the previous post on how to redistribute the PV links of a VG across the different paths to the disks, I am grateful for RuBiCK's comment pointing to the -s option of the pvchange command:

NAME
   pvchange - change characteristics and access path of a physical volume
   in an LVM volume group

SYNOPSIS
...
   /usr/sbin/pvchange [-A autobackup] -s pv_path
...
     -s  Immediately begin accessing the associated
         physical volume named by pv_path.
...

With this command there is no need to remove and re-add the PV links to the VG. Besides being safer, it also shortens the script considerably, which becomes:

#!/usr/bin/ksh
# Distribute the PV links across the controllers
set -o nounset

VG=$1
I=0
for PV in $(
   vgdisplay -v $VG | grep "PV Name" |
   grep -v "Alternate Link" | awk '{ print $3 }' )
do

   LISTA_LINKS=$( pvdisplay $PV | grep "PV Name" | awk '{ print $3 }' | sort )
   NUM_LINKS=$( echo $LISTA_LINKS | wc -w )

   NEW=$(( I % NUM_LINKS + 1 ))
   PRI=$( echo $LISTA_LINKS | awk "{ print \$${NEW} }" )

   pvchange -A n -s $PRI
   (( I = I + 1 ))

done
vgcfgbackup $VG

Thanks to RuBiCK for the contribution ;-).

Redistributing the PV links of a VG (HP-UX)

Inspired by RuBiCK's post on how to extend a VG with all the alternate PV links of each PV, it occurred to me to write a script that distributes all the PV links across the different paths to the disks (that is, across the available controllers). This does not apply if you are using a driver that balances disk access itself and does not use PV links (for example PowerPath).

For example, take a VG with 3 PVs. Each PV is visible through 4 paths, via controllers c4, c6, c8 and c10.

Most likely the primary path of all 3 PVs goes through c4, while the other paths are on standby. Although this is not comparable to PowerPath, it is more useful to spread the I/O across all available paths. To do so, we can redistribute the primary paths across the available paths.

For example: PV1 via c4, PV2 via c6 and PV3 via c8 (and so on with the remaining disks…)
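
The assignment is a plain round-robin over the list of paths. A minimal illustration, using the controller names from the example; pick_primary is a hypothetical helper and no LVM command is involved:

```shell
#!/bin/sh
# Round-robin choice of a primary path.
# usage: pick_primary INDEX LINK...   (INDEX is the 0-based position of the PV)
pick_primary() {
    idx=$1
    shift                      # drop INDEX, leaving only the links
    shift $(( idx % $# ))      # skip to the link selected for this PV
    echo "$1"
}

i=0
for pv in PV1 PV2 PV3 PV4 PV5; do
    echo "$pv -> $(pick_primary "$i" c4 c6 c8 c10)"
    i=$(( i + 1 ))
done
```

With four paths, the fifth PV wraps around to c4 again.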

To do this online, the script removes the paths that will not be the primary and re-adds them (in the right order):

# pvdisplay /dev/dsk/c4t0d4
--- Physical volumes ---
PV Name                     /dev/dsk/c4t0d4
PV Name                     /dev/dsk/c6t0d4     Alternate Link
PV Name                     /dev/dsk/c8t0d4     Alternate Link
PV Name                     /dev/dsk/c10t0d4    Alternate Link

# vgreduce -A n $VG /dev/dsk/c8t0d4 /dev/dsk/c10t0d4 /dev/dsk/c4t0d4

# vgextend -A n $VG /dev/dsk/c8t0d4 /dev/dsk/c10t0d4 /dev/dsk/c4t0d4

# pvdisplay /dev/dsk/c6t0d4
--- Physical volumes ---
PV Name                     /dev/dsk/c6t0d4
PV Name                     /dev/dsk/c8t0d4     Alternate Link
PV Name                     /dev/dsk/c10t0d4    Alternate Link
PV Name                     /dev/dsk/c4t0d4     Alternate Link
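
To verify the result from a script, the pvdisplay output can be split into primary and alternate links. A small sketch, assuming the output format shown above; classify_links is a hypothetical helper that reads pvdisplay text on stdin:

```shell
#!/bin/sh
# Print "primary <device>" or "alternate <device>" for each PV Name line.
classify_links() {
    awk '/PV Name/ {
        if ($0 ~ /Alternate Link/) print "alternate", $3
        else print "primary", $3
    }'
}

# Demo with the output shown above (no HP-UX needed).
classify_links <<'EOF'
--- Physical volumes ---
PV Name                     /dev/dsk/c6t0d4
PV Name                     /dev/dsk/c8t0d4     Alternate Link
PV Name                     /dev/dsk/c10t0d4    Alternate Link
PV Name                     /dev/dsk/c4t0d4     Alternate Link
EOF

# On a real system:
#   pvdisplay /dev/dsk/c6t0d4 | classify_links
```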

I wrote the following script to do the job automatically for all the disks in a VG:

#!/usr/bin/ksh
# Distribute the PV links across the controllers
set -o nounset

# loop over each primary link:
VG=$1
FILE=$( mktemp )
J=0
for PV in $(
   vgdisplay -v $VG | grep "PV Name" |
   grep -v "Alternate Link" | awk '{ print $3 }' )
do
   (( J = J + 1 ))
   I=1

   LISTA_LINKS=$( pvdisplay $PV | grep "PV Name" | awk '{ print $3 }' | sort )
   NUM_LINKS=$( echo $LISTA_LINKS | wc -w )

   for LINK in $( pvdisplay $PV | grep "PV Name" | awk '{ print $3 }' )
   do
      (( N = ( I + J ) % NUM_LINKS + 1 ))
      echo $PV $( echo $LINK | tr "t" "/" | cut -d/ -f 4 ) $LINK $I $N
      (( I = I + 1 ))
   done
done | sort -k 5 > $FILE

for PV in $(
   vgdisplay -v $VG | grep "PV Name" |
   grep -v "Alternate Link" | awk '{ print $3 }' )
do

   # List of links
   set -- $( grep $PV" " $FILE | awk '{ print $3 }' )

   PRIMARY=$1
   shift
   ALTERNATES=$*

   # is the future primary OK? (if not, skip this PV)
   pvdisplay $PRIMARY >/dev/null 2>&1 || continue

   vgreduce -A n $VG $ALTERNATES
   vgextend -A n $VG $ALTERNATES

done
vgcfgbackup $VG
rm $FILE

XFS and LVM snapshots

This is another article I wrote some years ago (2003-09-29, original URL). It used to be quite popular, so I’ve decided to recover it and publish it again here:

I want to take a consistent snapshot of my /home, which is an XFS filesystem created on an LVM logical volume:

# grep home /etc/mtab
/dev/vg01/lvhome /home xfs rw,noatime 0 0

# lvdisplay /dev/vg01/lvhome
--- Logical volume ---
LV Name                /dev/vg01/lvhome
VG Name                vg01
LV Write Access        read/write
LV Status              available
LV #                   2
# open                 1
LV Size                800 MB
Current LE             200
Allocated LE           200
Allocation             next
Read ahead sectors     1024
Block device           58:1

In order to get a consistent image of the filesystem in the snapshot, we need to freeze it, so that the log journal is flushed and no further modifications are made to it.

So the idea is: freeze, take the snapshot, and then unfreeze (see xfs_freeze(8)).

# xfs_freeze -f /home
# lvcreate -l 30 -n lvsnap_home -s /dev/vg01/lvhome
# xfs_freeze -u /home
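
If these three steps go into a script, it is worth making sure the filesystem is unfrozen even when lvcreate fails. A minimal sketch under that assumption; snapshot_frozen is a hypothetical helper whose arguments mirror the commands above:

```shell
#!/bin/sh
# Freeze, snapshot, and always unfreeze.
# usage: snapshot_frozen MOUNTPOINT ORIGIN_LV SNAP_NAME EXTENTS
snapshot_frozen() {
    mnt=$1; origin=$2; snap=$3; extents=$4

    xfs_freeze -f "$mnt" || return 1

    rc=0
    lvcreate -l "$extents" -n "$snap" -s "$origin" || rc=$?

    # Unfreeze unconditionally so a failed lvcreate cannot leave writers blocked.
    xfs_freeze -u "$mnt" || rc=1
    return "$rc"
}

# Example (needs root, xfsprogs and LVM):
#   snapshot_frozen /home /dev/vg01/lvhome lvsnap_home 30
```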

I’ve created the snapshot with 30 extents which is ~15% of the original LV.
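
The figure comes from the Current LE value in the lvdisplay output above: 30 of 200 extents. A small sketch of the arithmetic, with snap_extents as a hypothetical helper that rounds up:

```shell
#!/bin/sh
# usage: snap_extents TOTAL_LE PERCENT
# Prints the number of extents covering PERCENT% of TOTAL_LE, rounded up.
snap_extents() {
    echo $(( ($1 * $2 + 99) / 100 ))
}

snap_extents 200 15   # lvhome has 200 LEs; 15% of that is 30
```

How much space a snapshot really needs depends on how much the origin changes while the snapshot exists, so the percentage is a judgment call.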

Now we can mount the snapshot, and for example make a backup of the filesystem while users can continue to work.

# mount /dev/vg01/lvsnap_home /mnt/tmp/
mount: block device /dev/vg01/lvsnap_home is write-protected, mounting read-only
mount: wrong fs type, bad option, bad superblock on /dev/vg01/lvsnap_home,
or too many mounted file systems

The mount error message is a little confusing, but looking at the console messages or syslog, we can see the explanation:

kernel: XFS: Filesystem lvm(58,2) has duplicate UUID - can't mount

Uff… actually we are in trouble: as expected, the snapshot is an image of the original filesystem, so it also has its UUID.

The solution could be:

# xfs_admin -U generate /dev/vg01/lvsnap_home

But it can’t be done because the snapshot is read-only.

NOTE: I’ve read that there is an LVM kernel patch and a userland tools patch for mounting an LVM snapshot r/w, but I’ve not seen them.

Actually, I’ve found the solution in Documentation/filesystems/xfs.txt: use the nouuid mount option:

# mount -o ro,nouuid /dev/vg01/lvsnap_home /mnt/tmp/

That’s it:

alegrome:~# df /home /mnt/tmp/
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/vg01/lvhome        814400    201508    612892  25% /home
/dev/vg01/lvsnap_home   814400    201504    612896  25% /mnt/tmp
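
Putting the pieces together, a backup from the snapshot might look like the sketch below. backup_from_snapshot is a hypothetical helper, the archive path in the example is an assumption, and tar stands in for whatever backup tool is actually used.

```shell
#!/bin/sh
# Mount the snapshot read-only, archive it, then release the snapshot.
# usage: backup_from_snapshot SNAP_DEV MOUNTPOINT ARCHIVE
backup_from_snapshot() {
    snap=$1; mnt=$2; archive=$3

    mount -o ro,nouuid "$snap" "$mnt" || return 1

    rc=0
    tar -cf "$archive" -C "$mnt" . || rc=$?
    umount "$mnt" || rc=1

    # A snapshot fills up as the origin changes, so remove it once done.
    lvremove -f "$snap" || rc=1
    return "$rc"
}

# Example (needs root; /backup/home.tar is a made-up path):
#   backup_from_snapshot /dev/vg01/lvsnap_home /mnt/tmp /backup/home.tar
```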

UPDATED (Sat, 31 Mar 2007 14:40:59 +0200):
This article applies to LVM10. It may be obsolete if you use LVM2, as it seems that LVM2 snapshots are now R/W by default.



