Unless it is formatted:
inode permissions hard-links owner group ...
266705 crw-rw---- 1 root uucp
major minor day month date hh:mm:ss year path
4, 68 Sun Apr 20 09:27:33 2003 /dev/ttyS4
NOTE: that pesky comma after the major number
NOTE: the 'path' may be multiple fields:
/home/mszick/core
/proc/982/fd/0 -> /dev/null
/proc/982/fd/1 -> /home/mszick/.xsession-errors
/proc/982/fd/13 -> /tmp/tmpfZVVOCs (deleted)
/proc/982/fd/7 -> /tmp/kde-mszick/ksycoca
/proc/982/fd/8 -> socket:[11586]
/proc/982/fd/9 -> pipe:[11588]
If that isn't enough to keep your parser guessing,
either or both of the path components may be relative:
../Built-Shared -> Built-Static
../linux-2.4.20.tar.bz2 -> ../../../SRCS/linux-2.4.20.tar.bz2
The first character of the 11 (10?) character permissions field:
's' Socket
'd' Directory
'b' Block device
'c' Character device
'l' Symbolic link
NOTE: Hard links not marked - test for identical inode numbers
on identical filesystems.
All information about hard linked files are shared, except
for the names and the name's location in the directory system.
NOTE: A "Hard link" is known as a "File Alias" on some systems.
'-' An undistingushed file
Followed by three groups of letters for: User, Group, Others
Character 1: '-' Not readable; 'r' Readable
Character 2: '-' Not writable; 'w' Writable
Character 3, User and Group: Combined execute and special
'-' Not Executable, Not Special
'x' Executable, Not Special
's' Executable, Special
'S' Not Executable, Special
Character 3, Others: Combined execute and sticky (tacky?)
'-' Not Executable, Not Tacky
'x' Executable, Not Tacky
't' Executable, Tacky
'T' Not Executable, Tacky
Followed by an access indicator
Haven't tested this one, it may be the eleventh character
or it may generate another field
' ' No alternate access
'+' Alternate access
LSfieldsDoc
ListDirectory()
{
local -a T
local -i of=0 # Default return in variable
# OLD_IFS=$IFS # Using BASH default ' \t\n'
case "$#" in
3) case "$1" in
-of) of=1 ; shift ;;
* ) return 1 ;;
esac ;;
2) : ;; # Poor man's "continue"
*) return 1 ;;
esac
# NOTE: the (ls) command is NOT quoted (")
T=( $(ls --inode --ignore-backups --almost-all --directory \
--full-time --color=none --time=status --sort=none \
--format=long $1) )
case $of in
# Assign T back to the array whose name was passed as $2
0) eval $2=\( \"\$\{T\[@\]\}\" \) ;;
# Write T into filename passed as $2
1) echo "${T[@]}" > "$2" ;;
esac
return 0
}
# # # # # Is that string a legal number? # # # # #
#
# IsNumber "Var"
# # # # # There has to be a better way, sigh...
IsNumber()
{
local -i int
if [ $# -eq 0 ]
then
return 1
else
(let int=$1) 2>/dev/null
return $? # Exit status of the let thread
fi
}
# # # # # Index Filesystem Directory Information # # # # #
#
# IndexList "Field-Array-Name" "Index-Array-Name"
# or
# IndexList -if Field-Array-Filename Index-Array-Name
# IndexList -of Field-Array-Name Index-Array-Filename
# IndexList -if -of Field-Array-Filename Index-Array-Filename
# # # # #
: << IndexListDoc
Walk an array of directory fields produced by ListDirectory
Having suppressed the line breaks in an otherwise line oriented
report, build an index to the array element which starts each line.
Each line gets two index entries, the first element of each line
(inode) and the element that holds the pathname of the file.
The first index entry pair (Line-Number==0) are informational:
Index-Array-Name[0] : Number of "Lines" indexed
Index-Array-Name[1] : "Current Line" pointer into Index-Array-Name
The following index pairs (if any) hold element indexes into
the Field-Array-Name per:
Index-Array-Name[Line-Number * 2] : The "inode" field element.
NOTE: This distance may be either +11 or +12 elements.
Index-Array-Name[(Line-Number * 2) + 1] : The "pathname" element.
NOTE: This distance may be a variable number of elements.
Next line index pair for Line-Number+1.
IndexListDoc
IndexList()
{
local -a LIST # Local of listname passed
local -a -i INDEX=( 0 0 ) # Local of index to return
local -i Lidx Lcnt
local -i if=0 of=0 # Default to variable names
case "$#" in # Simplistic option testing
0) return 1 ;;
1) return 1 ;;
2) : ;; # Poor man's continue
3) case "$1" in
-if) if=1 ;;
-of) of=1 ;;
* ) return 1 ;;
esac ; shift ;;
4) if=1 ; of=1 ; shift ; shift ;;
*) return 1
esac
# Make local copy of list
case "$if" in
0) eval LIST=\( \"\$\{$1\[@\]\}\" \) ;;
1) LIST=( $(cat $1) ) ;;
esac
# Grok (grope?) the array
Lcnt=${#LIST[@]}
Lidx=0
until (( Lidx >= Lcnt ))
do
if IsNumber ${LIST[$Lidx]}
then
local -i inode name
local ft
inode=Lidx
local m=${LIST[$Lidx+2]} # Hard Links field
ft=${LIST[$Lidx+1]:0:1} # Fast-Stat
case $ft in
b) ((Lidx+=12)) ;; # Block device
c) ((Lidx+=12)) ;; # Character device
*) ((Lidx+=11)) ;; # Anything else
esac
name=Lidx
case $ft in
-) ((Lidx+=1)) ;; # The easy one
b) ((Lidx+=1)) ;; # Block device
c) ((Lidx+=1)) ;; # Character device
d) ((Lidx+=1)) ;; # The other easy one
l) ((Lidx+=3)) ;; # At LEAST two more fields
# A little more elegance here would handle pipes,
#+ sockets, deleted files - later.
*) until IsNumber ${LIST[$Lidx]} || ((Lidx >= Lcnt))
do
((Lidx+=1))
done
;; # Not required
esac
INDEX[${#INDEX[*]}]=$inode
INDEX[${#INDEX[*]}]=$name
INDEX[0]=${INDEX[0]}+1 # One more "line" found
# echo "Line: ${INDEX[0]} Type: $ft Links: $m Inode: \
# ${LIST[$inode]} Name: ${LIST[$name]}"
else
((Lidx+=1))
fi
done
case "$of" in
0) eval $2=\( \"\$\{INDEX\[@\]\}\" \) ;;
1) echo "${INDEX[@]}" > "$2" ;;
esac
return 0 # What could go wrong?
}
# # # # # Content Identify File # # # # #
#
# DigestFile Input-Array-Name Digest-Array-Name
# or
# DigestFile -if Input-FileName Digest-Array-Name
# # # # #
# Here document used as a comment block.
: <
The key (no pun intended) to a Unified Content File System (UCFS)
is to distinguish the files in the system based on their content.
Distinguishing files by their name is just, so, 20th Century.
The content is distinguished by computing a checksum of that content.
This version uses the md5sum program to generate a 128 bit checksum
representative of the file's contents.
There is a chance that two files having different content might
generate the same checksum using md5sum (or any checksum). Should
that become a problem, then the use of md5sum can be replace by a
cyrptographic signature. But until then...
The md5sum program is documented as outputting three fields (and it
does), but when read it appears as two fields (array elements). This
is caused by the lack of whitespace between the second and third field.
So this function gropes the md5sum output and returns:
[0] 32 character checksum in hexidecimal (UCFS filename)
[1] Single character: ' ' text file, '*' binary file
[2] Filesystem (20th Century Style) name
Читать дальше