Classic Shell Scripting - Arnold Robbins [168]
At the same time, create final password file entries for the users who originally had multiple UIDs and for UIDs that had multiple users.
Create the final password file.
Create the list of commands to change file ownership, and then run the commands. As will be seen, this has some aspects that require careful planning.
In passing, we note that all the code here operates under the assumption that usernames and UID numbers are not reused more than twice. This shouldn't be a problem in practice, but it is worth being aware of in case a more complicated situation comes along one day.
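The assumption can be checked before any real work is done. The following is only a sketch and not part of the original procedure: the function name check_reuse is made up for illustration, and it assumes ordinary colon-separated password-file input. It reports any username tied to more than two distinct UIDs, and any UID tied to more than two distinct usernames.

```shell
# check_reuse: hypothetical helper, not from the book.
# Reads password-format files named on the command line and reports
# usernames or UIDs that take part in more than two distinct pairings.
check_reuse () {
    awk -F: '
        !seen[$1, $3]++ { nuids[$1]++; nnames[$3]++ }   # count each pairing once
        END {
            for (u in nuids)
                if (nuids[u] > 2)
                    print "user " u " has " nuids[u] " UIDs"
            for (id in nnames)
                if (nnames[id] > 2)
                    print "UID " id " has " nnames[id] " users"
        }' "$@" | sort
}
```

Run as check_reuse followed by the password files to be merged; no output means the assumption holds for that input.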
Separating Users by Manageability
Merging the password files is easy. The files are named u1.passwd and u2.passwd, respectively. The sort command does the trick. We use tee to save the file and simultaneously print it on standard output where we can see it:
$ sort u1.passwd u2.passwd | tee merge1
abe:x:105:10:Honest Abe Lincoln:/home/abe:/bin/bash
adm:x:3:4:adm:/var/adm:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
ben:x:201:10:Ben Franklin:/home/ben:/bin/bash
ben:x:301:10:Ben Franklin:/home/ben:/bin/bash
betsy:x:1110:10:Betsy Ross:/home/betsy:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin
camus:x:112:10:Albert Camus:/home/camus:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
dorothy:x:110:10:Dorothy Gale:/home/dorothy:/bin/bash
george:x:1100:10:George Washington:/home/george:/bin/bash
jhancock:x:200:10:John Hancock:/home/jhancock:/bin/bash
jhancock:x:300:10:John Hancock:/home/jhancock:/bin/bash
root:x:0:0:root:/root:/bin/bash
root:x:0:0:root:/root:/bin/bash
tj:x:105:10:Thomas Jefferson:/home/tj:/bin/bash
tolstoy:x:2076:10:Leo Tolstoy:/home/tolstoy:/bin/bash
toto:x:110:10:Toto Gale:/home/toto:/bin/bash
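Before splitting the file, a couple of standard pipelines give a quick preview of the conflicts in it. This is a sketch rather than part of the procedure described here; the function names dup_names and dup_uids are hypothetical.

```shell
# dup_names, dup_uids: hypothetical helpers for previewing conflicts
# in a merged password file such as merge1.

# Usernames that appear with more than one distinct UID.
dup_names () {
    cut -d: -f1,3 "$1" | sort -u | cut -d: -f1 | sort | uniq -d
}

# UIDs that are assigned to more than one distinct username.
dup_uids () {
    cut -d: -f1,3 "$1" | sort -u | cut -d: -f2 | sort -n | uniq -d
}
```

Applied to the merge1 listing above, dup_names merge1 should print ben and jhancock, and dup_uids merge1 should print 105 and 110. Exact duplicates such as the two root entries are dropped by sort -u and so are not reported.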
Example 11-3 presents splitout.awk. This script separates the merged file into three new files, named dupusers, dupids, and unique1, respectively.
Example 11-3. The splitout.awk program
#! /bin/awk -f
# $1   $2     $3  $4  $5        $6      $7
# user:passwd:uid:gid:long name:homedir:shell
BEGIN { FS = ":" }
# name[ ] --- indexed by username
# uid[ ] --- indexed by uid
# if a duplicate appears, decide the disposition
{
    if ($1 in name) {
        if ($3 in uid)
            ;   # name and uid identical, do nothing
        else {
            print name[$1] > "dupusers"
            print $0 > "dupusers"
            delete name[$1]
            # remove saved entry with same name but different uid
            remove_uid_by_name($1)
        }
    } else if ($3 in uid) {
        # we know $1 is not in name, so save duplicate ID records
        print uid[$3] > "dupids"
        print $0 > "dupids"
        delete uid[$3]
        # remove saved entry with same uid but different name
        remove_name_by_uid($3)
    } else
        name[$1] = uid[$3] = $0   # first time this record was seen
}
END {
    for (i in name)
        print name[i] > "unique1"
    close("unique1")
    close("dupusers")
    close("dupids")
}
function remove_uid_by_name(n,    i, f)
{
    for (i in uid) {
        split(uid[i], f, ":")
        if (f[1] == n) {
            delete uid[i]
            break
        }
    }
}
function remove_name_by_uid(id,    i, f)
{
    for (i in name) {
        split(name[i], f, ":")
        if (f[3] == id) {
            delete name[i]
            break
        }
    }
}
The program works by keeping a copy of each input line in two arrays. The first is indexed by username, the second by UID number. The first time a record is seen, the username and UID number have not been stored in either array, so a copy of the line is saved in both.
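The two-array bookkeeping can be tried in isolation. The sketch below is not taken from the program; the function probe_arrays and its arguments are hypothetical. It stores each input record under both keys with the same chained assignment, then uses the in operator to ask whether a given username or UID has been seen.

```shell
# probe_arrays: hypothetical demo, not from the book.
# usage: probe_arrays USERNAME UID < passwd-format-input
probe_arrays () {
    awk -F: -v user="$1" -v id="$2" '
        { name[$1] = uid[$3] = $0 }     # one record, saved under both keys
        END {
            by_name = (user in name)    # 1 if the username was stored
            by_uid  = (id in uid)       # 1 if the UID was stored
            print by_name, by_uid
        }'
}

printf '%s\n' 'abe:x:105:10:Honest Abe Lincoln:/home/abe:/bin/bash' |
    probe_arrays tj 105      # prints "0 1": tj is new, but UID 105 is taken
```

A result of "0 1" is the situation that sends a record to the UID-duplicate branch of the main rule.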
When an exact duplicate record (the username and UID are identical) is seen, nothing is done with it, since we already have the information. If the username has been seen but the UID is new, both records are written to the dupusers file, and the copy of the first record in the uid array is removed, since we don't need it. Similar logic applies to records where the UID has been seen before but the username doesn't match.
When the END rule is executed, all the records remaining in the name array represent unique records. They are written to the unique1 file, and then all the files are closed.
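One way to gain confidence in the result is to confirm that no record was lost: the distinct lines of the merged file should be exactly the distinct lines of the three output files taken together. The following is a sketch under that assumption; check_split is a hypothetical name, not something the program provides.

```shell
# check_split: hypothetical sanity check, not part of splitout.awk.
# usage: check_split merged unique dupusers dupids
check_split () {
    want=$(sort -u "$1")                 # distinct records that went in
    got=$(sort -u "$2" "$3" "$4")        # distinct records that came out
    if [ "$want" = "$got" ]; then
        echo "ok: every distinct record landed in an output file"
    else
        echo "mismatch between $1 and the split files" >&2
        return 1
    fi
}
```

For the files in this section it would be called as check_split merge1 unique1 dupusers dupids. All four files must exist; awk creates an output file only when something is printed to it, so an input with no duplicates of one kind leaves that file missing.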
remove_uid_by_name( ) and remove_name_by_uid( ) are awk