Thursday, January 9, 2014

RAID-0 (stripe) on Solaris 10 using Solaris Volume Manager

The first step is to prepare the hard drives for RAID 0. We will create one big partition (slice 0) that spans the whole drive. Run format and select the first drive of the stripe (disk 1 in this example):



root@jsc-x4100-17:~# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c2t0d0 
          /pci@7b,0/pci1022,7458@11/pci1000,3060@2/sd@0,0
       1. c2t1d0 
          /pci@7b,0/pci1022,7458@11/pci1000,3060@2/sd@1,0
       2. c2t2d0 
          /pci@7b,0/pci1022,7458@11/pci1000,3060@2/sd@2,0
       3. c2t3d0 
          /pci@7b,0/pci1022,7458@11/pci1000,3060@2/sd@3,0
Specify disk (enter its number): 1
selecting c2t1d0
[disk formatted]


FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        fdisk      - run the fdisk program
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format>p


PARTITION MENU:
        0      - change `0' partition
        1      - change `1' partition
        2      - change `2' partition
        3      - change `3' partition
        4      - change `4' partition
        5      - change `5' partition
        6      - change `6' partition
        7      - change `7' partition
        select - select a predefined table
        modify - modify a predefined partition table
        name   - name the current table
        print  - display the current table
        label  - write partition map and label to the disk
        !<cmd> - execute <cmd>, then return
        quit
partition> 0

 Part      Tag    Flag     Cylinders        Size            Blocks
  0       home    wm       1 - 8920       68.33GB    (8920/0/0) 143299800

Enter partition id tag[home]:home

Enter partition permission flags[wm]:wm

Enter new starting cyl[1]: 1
Enter partition size[143299800b, 8920c, 8920e, 69970.61mb, 68.33gb]: $
partition> label
Ready to label disk, continue? y
partition> q

FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        fdisk      - run the fdisk program
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> q


At this point disk 1 is partitioned, with slice 0 spanning from cylinder 1 to the end of the drive ($). Instead of repeating the same steps for disks 2 and 3, we will use prtvtoc to print disk 1's partition table and fmthard to apply that table to disks 2 and 3 (all disks are identical).



root@jsc-x4100-17:~# prtvtoc /dev/rdsk/c2t1d0s2 > /var/tmp/prtvtoc.c2t1d0s2
root@jsc-x4100-17:~# fmthard -s /var/tmp/prtvtoc.c2t1d0s2 /dev/rdsk/c2t2d0s2
fmthard:  New volume table of contents now in place.
root@jsc-x4100-17:~# fmthard -s /var/tmp/prtvtoc.c2t1d0s2 /dev/rdsk/c2t3d0s2
fmthard:  New volume table of contents now in place.
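
With more than a couple of target disks the copy step is easier as a loop. Here is a minimal sketch that only prints the prtvtoc and fmthard commands instead of running them, so they can be reviewed first. The function name clone_vtoc_cmds is my own invention; the device names are the ones from this example.

```shell
#!/bin/sh
# Dry run: print (do not execute) the commands that copy the partition
# table of a source disk to each target disk.
# clone_vtoc_cmds is a hypothetical helper name, not a Solaris tool.
clone_vtoc_cmds() {
    src=$1
    shift
    vtoc=/var/tmp/prtvtoc.${src}s2
    echo "prtvtoc /dev/rdsk/${src}s2 > ${vtoc}"
    for dst in "$@"; do
        echo "fmthard -s ${vtoc} /dev/rdsk/${dst}s2"
    done
}

clone_vtoc_cmds c2t1d0 c2t2d0 c2t3d0
```

Once the printed commands look right, the same output can be piped into sh on the Solaris box.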

The next step is to create replicas of the metadevice state database. This database holds the configuration and state of all metadevices and hot spare pools on the system. Since this information is important, we will create 3 replicas, one on each drive. A replica can be created on any slice, including a slice that will later become part of a metadevice, and a single slice can hold more than one replica. If one or more replicas fail, the volume manager compares the remaining ones and uses a majority consensus algorithm to decide which replicas are valid. The command to create replicas is metadb.


root@jsc-x4100-17:~# metadb -a -f c2t1d0s0 c2t2d0s0 c2t3d0s0

-a adds database replicas and -f forces the operation (we have to force it because no state database replicas exist yet). Use metadb -i to check the state of the replicas. In our case the replicas are active (a flag) and up to date (u flag):


root@jsc-x4100-17:~# metadb -i
        flags           first blk       block count
     a        u         16              8192            /dev/dsk/c2t1d0s0
     a        u         16              8192            /dev/dsk/c2t2d0s0
     a        u         16              8192            /dev/dsk/c2t3d0s0
 r - replica does not have device relocation information
 o - replica active prior to last mddb configuration change
 u - replica is up to date
 l - locator for this replica was read successfully
 c - replica's location was in /etc/lvm/mddb.cf
 p - replica's location was patched in kernel
 m - replica is master, this is replica selected as input
 W - replica has device write errors
 a - replica is active, commits are occurring to this replica
 M - replica had problem with master blocks
 D - replica had problem with data blocks
 F - replica had format problems
 S - replica is too small to hold current data base
 R - replica had device read errors
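
The majority consensus rule is easy to put into numbers: a majority simply means more than half of the replicas. The sketch below computes that threshold for a given replica count. replica_majority is a hypothetical helper of mine, not an SVM command.

```shell
#!/bin/sh
# Majority needed among N state database replicas: half plus one.
# replica_majority is a hypothetical helper, not part of SVM.
replica_majority() {
    echo $(( $1 / 2 + 1 ))
}

for n in 3 9; do
    echo "$n replicas: majority is $(replica_majority "$n")"
done
```

With the 3 replicas created above, 2 of them have to agree; this is one reason to spread replicas over separate drives.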

metadb's -c switch sets how many replicas to create on each slice. Had we issued -c 3 on three slices, we would have ended up with 9 metadevice state database replicas:


root@jsc-x4100-17:~# metadb -a -f -c 3 c2t1d0s0 c2t2d0s0 c2t3d0s0

root@jsc-x4100-17:~# metadb -i

        flags           first blk       block count
     a        u         16              8192            /dev/dsk/c2t1d0s0
     a        u         8208            8192            /dev/dsk/c2t1d0s0
     a        u         16400           8192            /dev/dsk/c2t1d0s0
     a        u         16              8192            /dev/dsk/c2t2d0s0
     a        u         8208            8192            /dev/dsk/c2t2d0s0
     a        u         16400           8192            /dev/dsk/c2t2d0s0
     a        u         16              8192            /dev/dsk/c2t3d0s0
     a        u         8208            8192            /dev/dsk/c2t3d0s0
     a        u         16400           8192            /dev/dsk/c2t3d0s0
 r - replica does not have device relocation information
 o - replica active prior to last mddb configuration change
 u - replica is up to date
 l - locator for this replica was read successfully
 c - replica's location was in /etc/lvm/mddb.cf
 p - replica's location was patched in kernel
 m - replica is master, this is replica selected as input
 W - replica has device write errors
 a - replica is active, commits are occurring to this replica
 M - replica had problem with master blocks
 D - replica had problem with data blocks
 F - replica had format problems
 S - replica is too small to hold current data base
 R - replica had device read errors
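
The "first blk" column in this output follows a simple pattern: the first replica on a slice starts at block 16 and every replica is 8192 blocks long, so the next one starts right after it. A small sketch that reproduces the column, with both numbers taken from the output above rather than from any SVM documentation:

```shell
#!/bin/sh
# Starting block of each replica on one slice, inferred from the
# metadb -i output: first replica at block 16, 8192 blocks per replica.
# replica_offsets is a hypothetical helper name.
replica_offsets() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        echo $(( 16 + i * 8192 ))
        i=$(( i + 1 ))
    done
}

replica_offsets 3
```

For -c 3 this prints 16, 8208 and 16400, matching the three entries shown for each slice.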

Once we have the metadevice state databases we can proceed with creating the metadevice. The command is metainit <metadevice name> <number of stripes> <width> <logical name of slice1> <slice2> .... The number of stripes parameter determines how many stripes the metadevice is made of. For example, if the number of stripes is 1 we are creating a simple stripe; if it equals the number of slices (each stripe then being one slice wide) we have a concatenation. width specifies the number of slices that make up a stripe. In our case the number of stripes will be 1 and the width 3:


root@jsc-x4100-17:~# metainit d0 1 3 c2t1d0s0 c2t2d0s0 c2t3d0s0
d0: Concat/Stripe is setup

To verify the stripe and get some information about it we use the metastat command:


root@jsc-x4100-17:~# metastat
d0: Concat/Stripe
    Size: 429835140 blocks (204 GB)
    Stripe 0: (interlace: 32 blocks)
        Device     Start Block  Dbase   Reloc
        c2t1d0s0      16065     Yes     Yes
        c2t2d0s0      16065     Yes     Yes
        c2t3d0s0      16065     Yes     Yes

Device Relocation Information:
Device   Reloc  Device ID
c2t1d0   Yes    id1,sd@SSEAGATE_ST973401LSUN72G_3710ZJ07____________3LB0ZJ07
c2t2d0   Yes    id1,sd@SSEAGATE_ST973401LSUN72G_3710ZGLR____________3LB0ZGLR
c2t3d0   Yes    id1,sd@SSEAGATE_ST973401LSUN72G_3710Z1DG____________3LB0Z1DG
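
The "interlace: 32 blocks" value means data is laid out in chunks of 32 blocks (16 KB with 512-byte blocks) that rotate across the three slices. The sketch below shows the textbook RAID-0 address mapping with these two numbers. It is only an illustration of the idea: it is not taken from the SVM implementation and it ignores the start block offset of each slice. stripe_map is a hypothetical helper name.

```shell
#!/bin/sh
# Textbook RAID-0 address mapping, using the values reported by metastat:
# interlace of 32 blocks across 3 slices. Illustration only; this is not
# taken from the SVM implementation.
INTERLACE=32
WIDTH=3

# Print "slice_index offset_within_slice" for a logical block number.
stripe_map() {
    blk=$1
    chunk=$(( blk / INTERLACE ))   # which interlace-sized chunk
    slice=$(( chunk % WIDTH ))     # chunks rotate across the slices
    row=$(( chunk / WIDTH ))       # complete rows before this chunk
    echo "$slice $(( row * INTERLACE + blk % INTERLACE ))"
}

for b in 0 31 32 64 96 100; do
    echo "logical block $b -> slice/offset $(stripe_map "$b")"
done
```

Blocks 0 to 31 land on the first slice, 32 to 63 on the second, 64 to 95 on the third, and block 96 wraps around to the first slice again. This is why sequential I/O gets spread over all three drives.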


The final step is to create 4 soft partitions within metadevice d0. To create a soft partition we use metainit with the -p switch and give the size of the soft partition as the last parameter (in our example 204 GB / 4 = 51 GB):



root@jsc-x4100-17:~# metainit d1 -p d0 51g
d1: Soft Partition is setup
root@jsc-x4100-17:~# metainit d2 -p d0 51g
d2: Soft Partition is setup
root@jsc-x4100-17:~# metainit d3 -p d0 51g
d3: Soft Partition is setup
root@jsc-x4100-17:~# metainit d4 -p d0 51g
d4: Soft Partition is setup
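
The size arithmetic can be double-checked in 512-byte blocks, the unit metastat reports. The sketch below converts 51 GB to blocks and compares four such partitions with the size of d0 (429835140 blocks). gb_to_blocks is a hypothetical helper, and the check ignores any small per-partition overhead SVM may add.

```shell
#!/bin/sh
# Check that four 51 GB soft partitions fit into d0.
# Sizes are in 512-byte blocks, as printed by metastat.
D0_BLOCKS=429835140

# gb_to_blocks is a hypothetical helper: GB (2^30 bytes) to 512-byte blocks.
gb_to_blocks() {
    echo $(( $1 * 1024 * 1024 * 1024 / 512 ))
}

sp=$(gb_to_blocks 51)
echo "one soft partition:   $sp blocks"
echo "four soft partitions: $(( 4 * sp )) blocks"
echo "left over in d0:      $(( D0_BLOCKS - 4 * sp )) blocks"
```

Four partitions of 52 GB would need 436207616 blocks, more than d0 has, so 51g is the largest whole-gigabyte size that fits four times.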

You can verify this with metastat:


root@jsc-x4100-17:home# metastat
d4: Soft Partition
    Device: d0
    State: Okay
    Size: 106954752 blocks (51 GB)
        Extent              Start Block              Block count
             0                320864384                106954752

d0: Concat/Stripe
    Size: 429835140 blocks (204 GB)
    Stripe 0: (interlace: 32 blocks)
        Device     Start Block  Dbase        State Reloc Hot Spare
        c2t1d0s0      16065     Yes           Okay   Yes
        c2t2d0s0      16065     Yes           Okay   Yes
        c2t3d0s0      16065     Yes           Okay   Yes

d3: Soft Partition
    Device: d0
    State: Okay
    Size: 106954752 blocks (51 GB)
        Extent              Start Block              Block count
             0                213909600                106954752

d2: Soft Partition
    Device: d0
    State: Okay
    Size: 106954752 blocks (51 GB)
        Extent              Start Block              Block count
             0                106954816                106954752

d1: Soft Partition
    Device: d0
    State: Okay
    Size: 106954752 blocks (51 GB)
        Extent              Start Block              Block count
             0                       32                106954752

Device Relocation Information:
Device   Reloc  Device ID
c2t1d0   Yes    id1,sd@SSEAGATE_ST973401LSUN72G_3710ZJ07____________3LB0ZJ07
c2t2d0   Yes    id1,sd@SSEAGATE_ST973401LSUN72G_3710ZGLR____________3LB0ZGLR
c2t3d0   Yes    id1,sd@SSEAGATE_ST973401LSUN72G_3710Z1DG____________3LB0Z1DG
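
The start blocks in this output are not exact multiples of the partition size: every extent is preceded by a 32-block gap. I am assuming that gap purely from the numbers above and have not checked what SVM stores there. Under that assumption the start blocks can be reproduced like this (sp_start is a hypothetical helper):

```shell
#!/bin/sh
# Reproduce the soft partition start blocks printed by metastat,
# assuming (from that output only) a 32-block gap before every extent.
GAP=32
SIZE=106954752   # blocks in one 51 GB soft partition

# sp_start N prints the start block of the Nth soft partition (1 = d1).
sp_start() {
    n=$1
    echo $(( n * GAP + (n - 1) * SIZE ))
}

for n in 1 2 3 4; do
    echo "d$n starts at block $(sp_start "$n")"
done
```

The results (32, 106954816, 213909600, 320864384) match the Start Block values of d1 through d4.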

After this you can use the soft partitions like any other partition: create file systems on them, mount them, run ufsdump, ufsrestore, etc. For example:


root@jsc-x4100-17:~# echo y|newfs /dev/md/rdsk/d1

/dev/md/rdsk/d1:        106954752 sectors in 17408 cylinders of 48 tracks, 128 sectors
        52224.0MB in 1088 cyl groups (16 c/g, 48.00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
.....................
super-block backups for last 10 cylinder groups at:
 105978656, 106077088, 106175520, 106273952, 106372384, 106470816, 106569248,
 106667680, 106766112, 106864544

Repeat this step for /dev/md/rdsk/d2, /dev/md/rdsk/d3 and /dev/md/rdsk/d4. When you're finished you can happily mount the soft partitions at the locations where zones will be installed.
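
The newfs and mount steps for all four soft partitions can be generated in one go. This is a dry-run sketch that only prints the commands. The mount points /zones/zone1 to /zones/zone4 are made up for illustration, so substitute your real zone paths; fs_cmds is a hypothetical helper name.

```shell
#!/bin/sh
# Dry run: print the newfs and mount commands for soft partitions d1..d4.
# The mount points /zones/zone1 .. /zones/zone4 are hypothetical.
fs_cmds() {
    for n in 1 2 3 4; do
        echo "echo y | newfs /dev/md/rdsk/d$n"
        echo "mkdir -p /zones/zone$n"
        echo "mount -F ufs /dev/md/dsk/d$n /zones/zone$n"
    done
}

fs_cmds
```

Note that newfs takes the raw device (/dev/md/rdsk) while mount takes the block device (/dev/md/dsk).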

If you want to remove your metadevice and metadb, reverse the steps. After unmounting any file systems on them, first remove the soft partitions from the metadevice:


root@jsc-x4100-17:~# metaclear -p d0
d4: Soft Partition is cleared
d3: Soft Partition is cleared
d2: Soft Partition is cleared
d1: Soft Partition is cleared

Then remove the metadevice:


root@jsc-x4100-17:~# metaclear d0
d0: Concat/Stripe is cleared

And finally the metadb replicas:


root@jsc-x4100-17:~# metadb -f -d c2t1d0s0 c2t2d0s0 c2t3d0s0
