Sunday, June 21, 2015

Correcting device paths when replacing fiber boot disks after a ufsrestore


Goal

A Sun Microsystems platform that boots from a fiber disk depends on the disk's World Wide Number (WWN) being included in the disk's physical device path in order to boot properly. The example below is from a Sun Fire 280R system:

Physical Device Path: /pci@8,600000/SUNW,qlc@4/fp@0,0/disk@w210000203746f423,0:a
WWN: w210000203746f423


If the primary boot device is replaced by a disk that was restored with ufsrestore (from a ufsdump backup), the WWN of the replacement disk must be incorporated into the Solaris Operating System device path and associated with the correct Solaris OS logical device.
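
Because the WWN is buried in a long device path, it can help to extract it mechanically before comparing paths. The following is a minimal sketch and not part of the original procedure: wwn_of is a hypothetical helper name, and it assumes the WWN appears as a "w" followed by 16 hexadecimal digits after the final "@" in the path, as in the Sun Fire 280R example above.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not a Solaris command.
# Print the WWN token ("w" + 16 hex digits) of a device path.
# Prints nothing if the path carries no WWN.
wwn_of() {
    echo "$1" | sed -n 's/.*@\(w[0-9a-fA-F]\{16\}\),.*/\1/p'
}

# Example with the physical device path from this document.
wwn_of '/pci@8,600000/SUNW,qlc@4/fp@0,0/disk@w210000203746f423,0:a'
# prints: w210000203746f423
```

A path without a WWN, such as /pci@8,600000/SUNW,qlc@4/fp@0,0/disk@0,0, produces no output, which makes an empty result easy to test for.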

Solution

Procedure for correcting device paths on the Sun Fire 280R:
To show this as a working example, we copied c1t0d0s0 to c1t1d0s0 on a Sun Fire 280R via ufsdump and ufsrestore, then installed the boot-block. The two disks were switched, emulating the replacement of the primary disk.
Note: The ufsdump and ufsrestore process is not part of the scope of this document.

Upon boot, the root filesystem will not be mounted correctly, and the boot process may fail, as seen below:

Sun Fire 280R (UltraSPARC-III) , No Keyboard 
OpenBoot 4.0, 2048 MB memory installed, Serial #16459995. 
Ethernet address 8:0:20:fb:28:db, Host ID: 80fb28db. 
Rebooting with command: boot disk
Boot device: /pci@8,600000/SUNW,qlc@4/fp@0,0/disk@0,0  File and args:
SunOS Release 5.8 Version Generic_108528-10 64-bit 
Copyright 1983-2001 Sun Microsystems, Inc. All rights reserved. 
configuring IPv4 interfaces: eri0. 
Hostname: ib-sf280r 
mount: /dev/dsk/c1t0d0s0 is not this fstype. 
/sbin/rcS: /etc/dfs/sharetab: cannot create
failed to open /etc/coreadm.conf
syseventd: Unable to open daemon lock file '/etc/sysevent/syseventd_lock':'Read-only file system'
INIT: Cannot create /var/adm/utmpx
INIT: failed write of utmpx entry:"  "
INIT: failed write of utmpx entry:"  " 
INIT: SINGLE USER MODE
Type control-d to proceed with normal startup,(or give root password for system maintenance):
single-user privilege assigned to /dev/console. 
Entering System Maintenance Mode

The output from df -k may show the physical disk device mounted, but not the logical device (c1t0d0s0):

# df -k
Filesystem            kbytes    used   avail capacity  Mounted on
/pci@8,600000/SUNW,qlc@4/fp@0,0/disk@w210000203746f423,0:a
4131866  888592 3201956    22%    /
/proc                      0       0       0     0%    /proc
fd                         0       0       0     0%    /dev/fd
mnttab                     0       0       0     0%    /etc/mnttab
swap                 2756696       0 2756696     0%    /var/run


The disk that the system booted from is disk@w210000203746f423,0:a. This disk's WWN does not match the WWN that the logical boot device named in the vfstab file (c1t0d0s0) is linked to. To correct this, boot to the mini-root from a Solaris OS CD-ROM or a network install image ("boot cdrom -s" or "boot net -s"), then recreate the link from the logical device name (c1t0d0s0) to the physical device path (/devices) by following the steps below:

ok boot net -s
INIT: SINGLE USER MODE
Mount the root filesystem on /mnt:
# mount /dev/dsk/c1t0d0s0 /mnt


Determine (from the vfstab file) the expected logical boot device (c1t0d0s0)

# more /mnt/etc/vfstab
#device         device          mount           FS      fsck    mount   mount
#to mount       to fsck         point           type    pass    at boot options
#
#/dev/dsk/c1d0s2 /dev/rdsk/c1d0s2 /usr          ufs     1       yes     -
fd      -       /dev/fd fd      -       no      -
/proc   -       /proc   proc    -       no      -
/dev/dsk/c1t0d0s1       -       -       swap    -       no      -
/dev/dsk/c1t0d0s0       /dev/rdsk/c1t0d0s0      /       ufs     1       no
swap    -       /tmp    tmpfs   -       yes     -


Determine whether the correct physical disk, as identified by its WWN (World Wide Number), is linked to the logical name of the root device from the vfstab file, /dev/dsk/c1t0d0s0:
# cd /mnt/dev/dsk
# ls -al c1t0d0s0
lrwxrwxrwx   1 root     root          70 Sep 10 12:29 c1t0d0s0 ->
../../devices/pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w2100002037c80681,0:a


This shows that the link points to the disk with WWN w2100002037c80681. It should point to w210000203746f423, the disk shown in the df -k output above.
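
The comparison above can be sketched as a small check. This is an illustration under stated assumptions, not part of the original procedure: wwn_of, link_target, and check_link are hypothetical helper names, the link target is read by parsing ls -l output, and the WWN is assumed to be a "w" followed by 16 hexadecimal digits.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not Solaris commands.

# Print the WWN token ("w" + 16 hex digits) of a device path, or nothing.
wwn_of() {
    echo "$1" | sed -n 's/.*@\(w[0-9a-fA-F]\{16\}\),.*/\1/p'
}

# Print the target of a symbolic link by parsing ls -l output.
link_target() {
    ls -l "$1" | sed 's/.* -> //'
}

# check_link LINK EXPECTED_WWN
# Succeeds only if LINK points at a device path carrying EXPECTED_WWN.
check_link() {
    target=`link_target "$1"`
    actual=`wwn_of "$target"`
    if [ "$actual" = "$2" ]; then
        echo "OK: $1 -> $actual"
    else
        echo "MISMATCH: $1 -> ${actual:-no WWN}, expected $2"
        return 1
    fi
}
```

For the example in this document, running check_link /mnt/dev/dsk/c1t0d0s0 w210000203746f423 from the miniroot would report a mismatch while the link still points at w2100002037c80681.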
To fix the problem:
Make the device paths and instance numbers in the restored path_to_inst file (/mnt/etc/path_to_inst) match the miniroot devices from the cdrom/net boot:
# devfsadm -r /mnt -p /mnt/etc/path_to_inst


Make the /devices directory on the restored disk match the mini-root devices created by the cdrom/net boot. This may not be necessary if the correct device tree already exists:
# cd /devices
# find . -print|cpio -pduVm /mnt/devices
0 blocks


Create the links from the correct logical device to the physical device:
# disks -r /mnt
Notice the change in the link from c1t0d0s0 to the correct disk:
# cd /mnt/dev/dsk
# ls -l c1t0d0s0
lrwxrwxrwx   1 root root 70 Sep 11 10:38 c1t0d0s0 ->
../../devices/pci@8,600000/SUNW,qlc@4/fp@0,0/ssd@w210000203746f423,0:a
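
The three repair commands above can be collected into one reviewable sequence. The sketch below is an illustration only: the run wrapper, the fix_device_links function, and the ROOT and DRY_RUN variables are hypothetical names, while the commands and their arguments are the ones shown in this document. By default it only prints the commands; it executes them only when DRY_RUN is set to 0, which is meaningful only from the miniroot.

```shell
#!/bin/sh
# ROOT: where the restored root filesystem is mounted (/mnt in the text).
ROOT=${ROOT:-/mnt}
# DRY_RUN=1 (default) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

# Print or execute a command, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

fix_device_links() {
    # 1. Rebuild path_to_inst on the restored disk from the miniroot devices.
    run devfsadm -r "$ROOT" -p "$ROOT/etc/path_to_inst"
    # 2. Copy the miniroot /devices tree onto the restored disk.
    run sh -c "cd /devices && find . -print | cpio -pduVm $ROOT/devices"
    # 3. Recreate the logical disk links under $ROOT/dev.
    run disks -r "$ROOT"
}

fix_device_links
```

Printing the sequence first gives a chance to confirm that ROOT points at the restored disk and not at the miniroot itself.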


Alternate fix if using a Solaris 8 OS cdrom or net image:
Move the old path_to_inst file aside:
# mv /mnt/etc/path_to_inst /mnt/etc/orig.path_to_inst
Remove all the old device links:
# rm /mnt/dev/rdsk/c* ; rm /mnt/dev/dsk/c* ; rm /mnt/dev/rmt/*
Rebuild the device structure using the devfsadm command:
# devfsadm -r /mnt -p /mnt/etc/path_to_inst


Now that the boot device is associated with the correct logical and physical Solaris OS device, this disk will boot, provided the OpenBoot[TM] PROM (OBP) monitor has a correct boot alias (such as disk or disk1). It is advisable to set the OBP boot-device variable with luxadm before rebooting. A boot alias can be created at the OBP at a later time. The boot-device setting before and after the luxadm command:
Before :
# eeprom|grep boot-device
boot-device=/pci@8,600000/SUNW,qlc@4/fp@0,0/disk@w2100002037c80681,0:a
# luxadm set_boot_dev /dev/dsk/c1t0d0s0
Do you want to change boot-device to the new setting  (y/n) y
After :
# eeprom|grep boot-device
boot-device=/pci@8,600000/SUNW,qlc@4/fp@0,0/disk@w210000203746f423,0:a
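
Before rebooting, the boot-device value can be checked against the /dev/dsk link so that both name the same WWN. This is a minimal sketch, not part of the original procedure: wwn_of, boot_device_of, and same_wwn are hypothetical helper names, and only the first entry of boot-device is examined.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not Solaris commands.

# Print the WWN token ("w" + 16 hex digits) of a device path, or nothing.
wwn_of() {
    echo "$1" | sed -n 's/.*@\(w[0-9a-fA-F]\{16\}\),.*/\1/p'
}

# Read eeprom output on stdin; print the first entry of boot-device.
boot_device_of() {
    sed -n 's/^boot-device=\([^ ]*\).*/\1/p'
}

# same_wwn PATH1 PATH2
# Succeeds only if both paths carry the same, non-empty WWN.
same_wwn() {
    a=`wwn_of "$1"`
    b=`wwn_of "$2"`
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Intended use on the running system (not executed here):
#   bd=`eeprom | boot_device_of`
#   tgt=`ls -l /dev/dsk/c1t0d0s0 | sed 's/.* -> //'`
#   same_wwn "$bd" "$tgt" && echo "boot-device matches c1t0d0s0"
```

The OBP path uses disk@ where the Solaris link uses ssd@, so the sketch compares only the WWN token and not the full path.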

*NOTE: If the platform has a graphics card installed, a reconfiguration boot is required to allow X-windows to start on the console.

# reboot
A reboot at this point will boot from the correct boot device, using the boot-device variable. If a boot alias is desired, use show-disks to obtain the device path, create the alias with nvalias, and then set boot-device to the new alias:
ok show-disks
a) /pci@8,600000/SUNW,qlc@4/fp@0,0/disk
b) /pci@8,700000/scsi@6,1/disk
c) /pci@8,700000/scsi@6/disk
q) NO SELECTION
Enter Selection, q to quit: a
/pci@8,600000/SUNW,qlc@4/fp@0,0/disk has been selected.
Type ^Y ( Control-Y ) to insert it in the command line.
e.g. ok nvalias mydev ^Y
for creating devalias mydev for
/pci@8,600000/SUNW,qlc@4/fp@0,0/disk
ok nvalias new-disk /pci@8,600000/SUNW,qlc@4/fp@0,0/disk@w210000203746f423,0:a
ok devalias
new-disk                 /pci@8,600000/SUNW,qlc@4/fp@0,0/disk@w210000203746f423,0:a
ok setenv boot-device new-disk
boot-device =         new-disk
ok boot
 
The system should now boot correctly.
