    awk --version | head -1

Output: awk version 20070501

AWK syntax for search

    awk '/search pattern1/ {Actions} /search pattern2/ {Actions}' file

Let's start with a simple input file

    cat employee.txt
    100 Thomas Manager Sales $65,000
    200 Jason Developer Technology $65,500
    300 John Sysadmin Technology $77,000
    400 Emily Manager Marketing $99,500
    500 Randy DBA Technology $66,000

Print everything from this input file

    awk '{print;}' employee.txt
    100 Thomas Manager Sales $65,000
    200 Jason Developer Technology $65,500
    300 John Sysadmin Technology $77,000
    400 Emily Manager Marketing $99,500
    500 Randy DBA Technology $66,000

Search for pattern and then print the matching lines

    awk '/Thomas/ {print;} /Emily/ {print;}' employee.txt
    100 Thomas Manager Sales $65,000
    400 Emily Manager Marketing $99,500

Find employees with employee id greater than 200

    awk '$1 >200' employee.txt
    300 John Sysadmin Technology $77,000
    400 Emily Manager Marketing $99,500
    500 Randy DBA Technology $66,000

Print list of employees in the Technology department

    awk '$4 ~/Technology/' employee.txt
    200 Jason Developer Technology $65,500
    300 John Sysadmin Technology $77,000
    500 Randy DBA Technology $66,000

Print only specific fields
    awk '{print $2,$5;}' employee.txt
    Thomas $65,000
    Jason $65,500
    John $77,000
    Emily $99,500
    Randy $66,000

NF is a built-in variable that holds the total number of fields in the current record, so $NF is the last field. We could therefore write the above query as

    awk '{print $2,$NF;}' employee.txt

Suppose we have a series of actions, e.g., we want to print a header on the first line, followed by the result of the query on subsequent lines, and finally a termination message on the last line. Here is the syntax to accomplish this

    awk 'BEGIN {Action} Actions END {Action}'

Print a headline, then specific fields from the input file, and finally an exit message

    awk 'BEGIN {print "Name\tDesignation\tDepartment\tSalary";} {print $2,"\t",$3,"\t",$4,"\t",$NF;} END{print "Report Generated\n--------------";}' employee.txt
    Name Designation Department Salary
    Thomas Manager Sales $65,000
    Jason Developer Technology $65,500
    John Sysadmin Technology $77,000
    Emily Manager Marketing $99,500
    Randy DBA Technology $66,000
    Report Generated
    --------------

Another example

    awk 'BEGIN {count=0;} $4 ~ /Technology/ {count++;} END { print "Number of employees in Technology Dept =",count;}' employee.txt
    Number of employees in Technology Dept = 3
    FS       : Input Field Separator
    OFS      : Output Field Separator
    RS       : Record Separator
    ORS      : Output Record Separator
    NR       : Number of Records
    NF       : Number of Fields in a record
    FILENAME : Name of the current input file
    FNR      : Number of Records relative to the current input file

Examples

    awk -F ':' 'commands' inputfilename
    awk 'BEGIN{FS=":";}'
    awk 'BEGIN{OFS="=";} {print $2,$NF;}' employee.txt
    awk 'BEGIN {RS="\n\n";FS="\n";} {print $1,$2;}' employee.txt
    awk 'BEGIN{ORS="\n\n";} {print $1,$2;}' employee.txt
    awk 'BEGIN{OFS="=";} {print $1,$2;}END {print NR, "records are processed";}' employee.txt
    awk '{print NR,"->",NF}' employee.txt
    awk '{print FILENAME}' employee.txt
    awk '{print FILENAME, FNR;}' employee.txt
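The commands above are listed without their output. As a small sketch of how FS, OFS, NR, NF and $NF interact, the following runs awk on two colon-separated records; the sample data and the variable name fields_demo are made up for illustration and are not part of the original examples.

```shell
# Hypothetical colon-separated sample data (not from the original article).
data='alice:30:paris
bob:25:london'

# FS splits input fields on ":", OFS joins output fields with "=".
# NR is the current record number, NF the field count, $NF the last field.
fields_demo=$(printf '%s\n' "$data" |
  awk 'BEGIN { FS = ":"; OFS = "=" } { print NR, NF, $1, $NF }')

printf '%s\n' "$fields_demo"
# 1=3=alice=paris
# 2=3=bob=london
```

Because OFS only applies between comma-separated arguments of print, writing `print NR NF` (no comma) would concatenate the values without any separator.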
    int(x)      : the integer part of x, truncated toward zero (not rounded)
    sqrt(x)     : positive square root of x
    exp(x)      : exponential of x (e ^ x), reports an error if x is out of range
    log(x)      : natural logarithm of x, if x is positive; otherwise, reports an error
    sin(x)      : the sine of x, with x in radians
    cos(x)      : the cosine of x, with x in radians
    atan2(y, x) : the arctangent of y / x in radians
    rand()      : a random number r with 0 <= r < 1. The value can be 0 but is never 1.

string manipulation:

    index(in, find)                       : position of the first occurrence of "find" in string "in" (0 if not found)
    length(string)                        : number of characters in the string
    match(string, regexp)                 : position of the first match of regexp in the given string (0 if none)
    split(string, array [, fieldsep])     : split the string into pieces, store them in array, return the number of pieces
    sprintf(format, expression1,...)      : same as in C++
    sub(regexp, replacement [, target])   : replace the first match of regexp
    gsub(regexp, replacement [, target])  : same as sub, but global substitution
    substr(string, start [, length])      : return a substring
    tolower(string)                       : same as in C++
    toupper(string)                       : same as in C++

Input/Output:

    close(filename)
    system(command)
    systime()
    strftime([format [, timestamp]])

strftime supports the following date format specifications:
    %a  The locale's abbreviated weekday name.
    %A  The locale's full weekday name.
    %b  The locale's abbreviated month name.
    %B  The locale's full month name.
    %c  The locale's "appropriate" date and time representation.
    %d  The day of the month as a decimal number (01--31).
    %H  The hour (24-hour clock) as a decimal number (00--23).
    %I  The hour (12-hour clock) as a decimal number (01--12).
    %j  The day of the year as a decimal number (001--366).
    %m  The month as a decimal number (01--12).
    %M  The minute as a decimal number (00--59).
    %p  AM/PM designations associated with a 12-hour clock.
    %S  The second as a decimal number (00--61).
    %U  The week number of the year (the first Sunday as the first day of week one) as a decimal number (00--53).
    %w  The weekday as a decimal number (0--6). Sunday is day zero.
    %W  The week number of the year (the first Monday as the first day of week one) as a decimal number (00--53).
    %x  The locale's "appropriate" date representation.
    %X  The locale's "appropriate" time representation.
    %y  The year without century as a decimal number (00--99).
    %Y  The year with century as a decimal number (e.g., 1995).
    %Z  The time zone name or abbreviation.
    %%  A literal `%'.

Examples:
    awk 'BEGIN{print int(3.534); print int(4); print int(-5.223); print int(-5); }'
    3
    4
    -5
    -5

    awk 'BEGIN{print log(12); print log(0); print log(1); print log(-1); }'
    2.48491
    -inf
    0
    nan

    awk 'BEGIN{ print sqrt(16); print sqrt(0); print sqrt(-12); }'
    4
    0
    nan

    awk 'BEGIN{ print exp(123434346); print exp(0); print exp(-12); }'
    inf
    1
    6.14421e-06

    awk 'BEGIN { print sin(3.1415926); print sin(atan2(0,-1)); print sin(90); }'
    5.35898e-08
    1.22465e-16
    0.893997

    awk 'BEGIN { print cos(3.1415926); print cos(atan2(0,-1)); print cos(90); }'
    -1
    -1
    -0.448074

The following example generates 1000 random numbers between 0 and 100 and shows how often each number occurred. The script file rand.awk can be found here.
    awk -f rand.awk
    0 Occured 12 times
    1 Occured 7 times
    2 Occured 4 times
    3 Occured 8 times
    4 Occured 7 times
    5 Occured 11 times
    6 Occured 7 times
    7 Occured 12 times
    8 Occured 5 times
    9 Occured 12 times
    10 Occured 13 times
    11 Occured 8 times
    12 Occured 12 times
    13 Occured 10 times
    14 Occured 6 times
    15 Occured 11 times
    16 Occured 14 times
    17 Occured 9 times
    18 Occured 15 times
    19 Occured 12 times
    20 Occured 10 times
    21 Occured 9 times
    22 Occured 13 times
    23 Occured 9 times
    24 Occured 11 times
    25 Occured 8 times
    26 Occured 8 times
    27 Occured 8 times
    28 Occured 11 times
    29 Occured 14 times
    30 Occured 11 times
    31 Occured 7 times
    32 Occured 11 times
    33 Occured 10 times
    34 Occured 11 times
    35 Occured 8 times
    36 Occured 10 times
    37 Occured 5 times
    38 Occured 10 times
    39 Occured 10 times
    40 Occured 9 times
    41 Occured 5 times
    42 Occured 2 times
    43 Occured 12 times
    44 Occured 8 times
    45 Occured 8 times
    46 Occured 10 times
    47 Occured 13 times
    48 Occured 11 times
    49 Occured 11 times
    50 Occured 4 times
    51 Occured 12 times
    52 Occured 13 times
    53 Occured 13 times
    54 Occured 3 times
    55 Occured 9 times
    56 Occured 6 times
    57 Occured 12 times
    58 Occured 11 times
    59 Occured 16 times
    60 Occured 11 times
    61 Occured 11 times
    62 Occured 13 times
    63 Occured 14 times
    64 Occured 15 times
    65 Occured 16 times
    66 Occured 11 times
    67 Occured 11 times
    68 Occured 15 times
    69 Occured 7 times
    70 Occured 8 times
    71 Occured 10 times
    72 Occured 6 times
    73 Occured 10 times
    74 Occured 12 times
    75 Occured 9 times
    76 Occured 13 times
    77 Occured 13 times
    78 Occured 10 times
    79 Occured 8 times
    80 Occured 11 times
    81 Occured 11 times
    82 Occured 10 times
    83 Occured 12 times
    84 Occured 9 times
    85 Occured 6 times
    86 Occured 12 times
    87 Occured 10 times
    88 Occured 17 times
    89 Occured 7 times
    90 Occured 11 times
    91 Occured 12 times
    92 Occured 16 times
    93 Occured 14 times
    94 Occured 11 times
    95 Occured 8 times
    96 Occured 5 times
    97 Occured 4 times
    98 Occured 6 times
    99 Occured 8 times
    100 Occured times

The following example generates 5 random numbers between 5 and 50 using srand. The script file srand.awk can be found here.

    awk -f srand.awk
    7
    13
    14
    49

More examples
    awk 'BEGIN { print index("peanut", "an") }'
    3

    awk 'BEGIN { print length("abcde")}'
    5

    awk 'BEGIN { print length(15 * 35)}'
    3                    # because 15*35 = "525"

    awk 'BEGIN { print match("I am at Fermilab this week", /lab/)}'
    14                   # position where "lab" starts; a bare lab (no slashes) would be an empty variable and match at 1

    awk 'BEGIN { print split("cul-de-sac", a, "-")}'
    3

    awk 'BEGIN {str = "daabaaa"; sub(/a*/, "c&c", str); print str;}'
    ccdaabaaa            # a* first matches the empty string before "d"

    awk 'BEGIN {str = "The candidate came."; sub(/candidate/, "& and his wife", str); print str; }'
    The candidate and his wife came.

Count the total number of fields in a file
    awk '{ total += NF}; END {print total}' employee.txt
    25

Print the even-numbered lines

    awk 'NR % 2 == 0' employee.txt
    200 Jason Developer Technology $65,500
    400 Emily Manager Marketing $99,500

Find the employee with the highest employee ID. The matching line has to be saved, because a bare print in END would only show the last line read, whether or not it holds the maximum.

    awk '$1 > maxid { maxid=$1; maxline=$0 }; END { print maxline }' employee.txt
    500 Randy DBA Technology $66,000
    awk '{ if($1>200) print; }' employee.txt
    300 John Sysadmin Technology $77,000
    400 Emily Manager Marketing $99,500
    500 Randy DBA Technology $66,000

    awk '{ if($1>200) print $2, $3; else print $3, $4}' employee.txt
    Manager Sales
    Developer Technology
    John Sysadmin
    Emily Manager
    Randy DBA

    awk '{ if($1<200) print $2; else if ($1<400) print $3, $4; else print $5}' employee.txt
    Thomas
    Developer Technology
    Sysadmin Technology
    $99,500
    $66,000

Concatenate every 3 lines of input with a comma.

    awk 'ORS=NR%3?",":"\n"' employee.txt
    100 Thomas Manager Sales $65,000,200 Jason Developer Technology $65,500,300 John Sysadmin Technology $77,000
    400 Emily Manager Marketing $99,500,500 Randy DBA Technology $66,000,
    awk 'BEGIN { while (count++<50) string=string "x"; print string }'
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    awk 'BEGIN{count=1; do print count, "I am bored, printing this 100 times"; while(count++<100)}'
    1 I am bored, printing this 100 times
    2 I am bored, printing this 100 times
    ....
    100 I am bored, printing this 100 times

    awk '{ for (i = 1; i <= NF; i++) total = total+$i; print i,NF,total }; END { print total }' employee.txt
    6 5 100
    6 5 300
    6 5 600
    6 5 1000
    6 5 1500
    1500

Cool stuff: Print the fields in reverse order on every line

    awk 'BEGIN{ORS="";}{ for (i=NF; i>0; i--) print $i," "; print "\n"; }' employee.txt
    $65,000 Sales Manager Thomas 100
    $65,500 Technology Developer Jason 200
    $77,000 Technology Sysadmin John 300
    $99,500 Marketing Manager Emily 400
    $66,000 Technology DBA Randy 500

More examples
    awk 'BEGIN{ x=1; while(1) {print "Break after 10 iterations"; if ( x==10 ) break; x++;} }'
    Break after 10 iterations
    Break after 10 iterations
    Break after 10 iterations
    Break after 10 iterations
    Break after 10 iterations
    Break after 10 iterations
    Break after 10 iterations
    Break after 10 iterations
    Break after 10 iterations
    Break after 10 iterations

    awk 'BEGIN{ x=1; while(x<=20) { if(x>5 && x<=15){ x++; continue;} print "Value of x",x;x++;} }'
    Value of x 1
    Value of x 2
    Value of x 3
    Value of x 4
    Value of x 5
    Value of x 16
    Value of x 17
    Value of x 18
    Value of x 19
    Value of x 20

    awk 'BEGIN{ x=1; while(x<=10) {if(x==5){exit;} print "Value of x",x;x++;} }'
    Value of x 1
    Value of x 2
    Value of x 3
    Value of x 4
    cat duplicates.txt
    foo
    bar
    foo
    baz
    bar

Remove duplicates

    awk '!($0 in array) { array[$0]; print }' duplicates.txt
    foo
    bar
    baz

Reverse the order of lines in the above file

    awk '{ a[i++] = $0 } END { for (j=i-1; j>=0;) print a[j--] }' duplicates.txt
    bar
    baz
    foo
    bar
    foo

List all words and their frequency

    awk 'BEGIN {print "Word\tCount";} {word[$0]++;} END{ for (var in word) print var,"\t",word[var]; }' duplicates.txt
    Word Count
    baz 1
    foo 2
    bar 2

Generate tables in an html file. The script file generateHtml.awk can be found here.
awk -f generateHtml.awk employee.txt >> table.html
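Since generateHtml.awk itself is not reproduced here, the following is only a guess at what such a script might do: a minimal sketch that wraps every input field in a table cell. The inline program and the variable name html_table are hypothetical stand-ins, not the author's script.

```shell
# Hypothetical stand-in for generateHtml.awk; the real script is not shown.
html_table=$(printf '100 Thomas Manager\n200 Jason Developer\n' |
  awk 'BEGIN { print "<table>" }
       {
         # Build one <tr> per input line, one <td> per field.
         row = ""
         for (i = 1; i <= NF; i++) row = row "<td>" $i "</td>"
         print "<tr>" row "</tr>"
       }
       END { print "</table>" }')

printf '%s\n' "$html_table"
# <table>
# <tr><td>100</td><td>Thomas</td><td>Manager</td></tr>
# <tr><td>200</td><td>Jason</td><td>Developer</td></tr>
# </table>
```

The BEGIN and END blocks emit the opening and closing tags exactly once, which is the same header/body/footer pattern used for the report example earlier.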
    awk '
    # Function to obtain a random non-negative integer less than n
    function randint(n) { return int(n * rand()) }

    # Function to roll a simulated die.
    function roll(n) { return 1 + int(rand() * n) }

    # Roll 3 six-sided dice and
    # print total number of points (once per input line).
    { printf("%d points\n", roll(6)+roll(6)+roll(6)) }'
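The dice example gives different output on each seed, so here is a deterministic sketch of a user-defined function, assuming the employee.txt layout shown earlier. The helper to_number is a made-up name; it uses gsub to strip "$" and "," so that the salary field can be added up.

```shell
# Same five records as employee.txt, inlined so the example is self-contained.
employees='100 Thomas Manager Sales $65,000
200 Jason Developer Technology $65,500
300 John Sysadmin Technology $77,000
400 Emily Manager Marketing $99,500
500 Randy DBA Technology $66,000'

salary_total=$(printf '%s\n' "$employees" | awk '
  # Hypothetical helper: remove "$" and "," and force a numeric value.
  # Scalars are passed by value, so the field itself is left untouched.
  function to_number(s) { gsub(/[$,]/, "", s); return s + 0 }
  { total += to_number($NF) }
  END { print total }')

echo "$salary_total"   # 373000
```

Without the helper, awk would treat "$65,000" as the number 0, which is why the earlier for-loop example only summed the employee IDs.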
    echo "Hello" | sed 's/Hell/Heaven/'
    Heaveno

Using & as the matched string

    echo "123 abc" | sed 's/[0-9]*/& &/'
    123 123 abc

    echo "Kalanand Mishra" | sed 's/Kal*/(&)/'
    (Kal)anand Mishra

    echo kalanand mishra | sed 's/[^ ]*/(&)/'
    (kalanand) mishra

Using \1, \2, ... to keep part of the pattern

    echo abcd123 | sed 's/\([a-z]*\).*/\1/'
    abcd

    echo kalanand mishra | sed 's/\([a-z]*\).*/\1/'
    kalanand

Similarly, we can switch the two words around

    echo kalanand mishra | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/'
    mishra kalanand

Removing duplicate words

    echo kalanand kalanand mishra | sed 's/\([a-z]*\) \1/\1/'
    kalanand mishra

Use "/g" option for global replacement

    echo kalanand mishra | sed 's/[^ ]*/(&)/g'
    (kalanand) (mishra)

Keep the first occurrence of the word but delete the second:
    echo Mishra, Mishra Kalanand | sed 's/[a-zA-Z]* //2'
    Mishra, Kalanand

    echo Mishra, Mishra Kalanand | sed 's/[a-zA-Z]* /DELETED /2'
    Mishra, DELETED Kalanand

    echo Mishra, Mishra Kalanand | sed 's/[a-zA-Z]* //2' | sed 's/[a-zA-Z]*, //1'
    Kalanand

Multiple commands with the -e option

    echo Kalanand Mishra | sed -e 's/a/A/' -e 's/h/H/'
    KAlanand MisHra

Filenames on the command line

    sed 's/^#.*//' employee.txt duplicates.txt | grep -v '^$' | wc -l
    10

The "-n" option suppresses the automatic printing of each line

    echo kalanand mishra | sed -n 's/[^ ]*/(&)/g'
    # Nothing is printed

Execute sed commands from a script file

    sed -f sedscript <filename>

where sedscript could look like this:

    # This script is called 'sedscript'
    # sed comment - This script changes lower case vowels to upper case
    s/a/A/g
    s/e/E/g
    s/i/I/g
    s/o/O/g
    s/u/U/g

Example:

    echo kalanand mishra | sed -f sedscript
    kAlAnAnd mIshrA

Passing arguments into a sed script
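    echo kalanand mishra | sed 's/'$0'/&/'
    kalanand mishra

The single quotes are closed just before $0 and reopened after it, so the shell expands the variable before sed sees the expression. Inside a shell script you would normally put a positional parameter such as $1 there.

Restricting to a line number or range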
These examples rely on an echo that expands \n (as in zsh or dash); in bash, use echo -e or printf instead.

    echo "123\n112233\n112222233333" | sed '3 s/[0-9][0-9]*//'
    123
    112233
    #empty line

    echo "123\n112233\n112222233333" | sed '2,3 s/[0-9][0-9]*//'
    123
    #empty line
    #empty line

In an address, "$" means "last line". So the above query could be written as

    echo "123\n112233\n112222233333" | sed '2,$ s/[0-9][0-9]*//'

Patterns

    echo "#0123\n#3456\n789" | sed '/^#/ s/[0-9][0-9]*//'
    #
    #
    789

Transform with y

    echo kalanand mishra | sed 'y/abcdef/ABCDEF/'
    kAlAnAnD mishrA
    sed 's/#.*//' file

    echo "I am going\n#home\nto dinner" | sed 's/#.*//'
    I am going

    to dinner

Eliminate Comments and Empty Lines Using sed

    sed 's/#.*//;/^$/d' file

    echo "I am going\n#home\nto dinner\n\nback to work again" | sed 's/#.*//;/^$/d'
    I am going
    to dinner
    back to work again

Eliminate HTML Tags from file Using sed

    sed 's/<[^>]*>//g' index.html

Delete Last X Number of Characters From Each Line (X = 3 in this example)

    echo "kalanand\nmishra" | sed 's/...$//'
    kalan
    mis

Substitute Only When the Line Matches with the Pattern

    # If the line matches the pattern “-”, everything from “-” to the end of the line is replaced with the empty string.
    echo "This is\n-Kalanand\nMishra" | sed '/\-/s/\-.*//g'
    This is

    Mishra

Changing the PATH

    sed 's|myfolder/mysubfolder/myfile|myotherfolder/myotherfile|' config

Double-space a file
    sed G file

Double-space a file which already has blank lines in it. Output file should contain no more than one blank line between lines of text.

    sed '/^$/d;G' file

Triple space a file

    sed 'G;G'

Undo double-spacing (assumes even-numbered lines are always blank)

    sed 'n;d'

Insert a blank line above every line which matches "regex"

    sed '/regex/{x;p;x;}'

Insert a blank line below every line which matches "regex"

    sed '/regex/G'

Insert a blank line above and below every line which matches "regex"

    sed '/regex/{x;p;x;G;}'

Number each line of a file (simple left alignment). Using a tab instead of space will preserve margins.
    sed = filename | sed 'N;s/\n/\t/'

Count lines (emulates "wc -l")

    sed -n '$='

Substitute (find and replace) "foo" with "bar" on each line

    sed 's/foo/bar/'                       # replaces only 1st instance in a line
    sed 's/foo/bar/4'                      # replaces only 4th instance in a line
    sed 's/foo/bar/g'                      # replaces ALL instances in a line
    sed 's/\(.*\)foo\(.*foo\)/\1bar\2/'    # replace the next-to-last case
    sed 's/\(.*\)foo/\1bar/'               # replace only the last case

Substitute "foo" with "bar" EXCEPT for lines which contain "baz"

    sed '/baz/!s/foo/bar/g'

Substitute "foo" with "bar" ONLY for lines which contain "baz"

    sed '/baz/s/foo/bar/g'

Change "scarlet" or "ruby" or "puce" to "red"

    sed 's/scarlet/red/g;s/ruby/red/g;s/puce/red/g'

Join pairs of lines side-by-side (like "paste")

    sed '$!N;s/\n/ /'

If a line ends with a backslash, append the next line to it

    sed -e :a -e '/\\$/N; s/\\\n//; ta'

Delete every 8th line

    sed 'n;n;n;n;n;n;n;d;'

Delete ALL blank lines from a file (same as "grep '.' ")

    sed '/^$/d'    # method 1
    sed '/./!d'    # method 2

A use case: extract the version number from a string (only the version, without other numbers). For example, I have: Chromium 12.0.742.112 Ubuntu 11.04. I want: 12.0.742.112 instead of: 12.0.742.11211.04.
sed 's/[^0-9.]*\([0-9.]*\).*/\1/'
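To see the expression at work, the sketch below feeds it the sample string from the use case; the variable name version is just for illustration.

```shell
# [^0-9.]* skips the leading "Chromium ", the group captures the first run
# of digits and dots, and the trailing .* swallows the rest of the line.
version=$(echo "Chromium 12.0.742.112 Ubuntu 11.04" |
  sed 's/[^0-9.]*\([0-9.]*\).*/\1/')

echo "$version"   # 12.0.742.112
```

Simply deleting every character that is not a digit or dot would instead glue both numbers together, which is the 12.0.742.11211.04 result the use case wants to avoid.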
    sort --help

I get the following output from GNU manual

    Usage: sort [OPTION]... [FILE]...
    Write sorted concatenation of all FILE(s) to standard output.

    Mandatory arguments to long options are mandatory for short options too.
    Ordering options:

      -b, --ignore-leading-blanks  ignore leading blanks
      -d, --dictionary-order       consider only blanks and alphanumeric characters
      -f, --ignore-case            fold lower case to upper case characters
      -g, --general-numeric-sort   compare according to general numerical value
      -i, --ignore-nonprinting     consider only printable characters
      -M, --month-sort             compare (unknown) < `JAN' < ... < `DEC'
      -n, --numeric-sort           compare according to string numerical value
      -r, --reverse                reverse the result of comparisons

    Other options:

      -c, --check                  check whether input is sorted; do not sort
      -k, --key=POS1[,POS2]        start a key at POS1, end it at POS2 (origin 1)
      -m, --merge                  merge already sorted files; do not sort
      -o, --output=FILE            write result to FILE instead of standard output
      -s, --stable                 stabilize sort by disabling last-resort comparison
      -S, --buffer-size=SIZE       use SIZE for main memory buffer
      -t, --field-separator=SEP    use SEP instead of non-blank to blank transition
      -T, --temporary-directory=DIR  use DIR for temporaries, not $TMPDIR or /tmp;
                                   multiple options specify multiple directories
      -u, --unique                 with -c, check for strict ordering;
                                   without -c, output only the first of an equal run
      -z, --zero-terminated        end lines with 0 byte, not newline
          --help                   display this help and exit
          --version                output version information and exit

    POS is F[.C][OPTS], where F is the field number and C the character position
    in the field. OPTS is one or more single-letter ordering options, which
    override global ordering options for that key. If no key is given, use the
    entire line as the key.

    SIZE may be followed by the following multiplicative suffixes:
    % 1% of memory, b 1, K 1024 (default), and so on for M, G, T, P, E, Z, Y.

    With no FILE, or when FILE is -, read standard input.

    *** WARNING ***
    The locale specified by the environment affects sort order.
    Set LC_ALL=C to get the traditional sort order that uses
    native byte values.

Simple sort
    echo "John\nJack\nZeck\nJake\nJoseph\nJoe" | sort
    Jack
    Jake
    Joe
    John
    Joseph
    Zeck

Reverse sort (-r option):

    echo "John\nJack\nZeck\nJake\nJoseph\nJoe" | sort -r
    Zeck
    Joseph
    John
    Joe
    Jake
    Jack

Other useful options:
-f ignore case
-s stable sort
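The two options above are easiest to understand side by side. The sketch below is an illustration on made-up input, run under LC_ALL=C so that the byte-order tie-breaking is predictable; the expected output assumes GNU sort.

```shell
# Hypothetical input: the same two letters in mixed case.
names='b
A
a
B'

# -f folds lower case to upper case for comparison; lines that then compare
# equal are ordered by a last-resort byte comparison of the whole line.
folded=$(printf '%s\n' "$names" | LC_ALL=C sort -f)

# Adding -s disables that last resort, so equal lines keep their input order.
stable=$(printf '%s\n' "$names" | LC_ALL=C sort -f -s)

echo $folded   # A a B b
echo $stable   # A a b B
```

Note that "b" stays ahead of "B" in the stable run only because it came first in the input.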
    echo "5\n4\n12\n1\n3\n56" | sort
    1
    12
    3
    4
    5
    56

So, the simple sort of numbers is not what we really want. But we can use the "-n" option to get the expected result.

    echo "5\n4\n12\n1\n3\n56" | sort -n
    1
    3
    4
    5
    12
    56

And, if our lines happen to have some leading blanks, we can easily ignore those and still sort correctly (using the -b flag):
Without -b, lines with more leading blanks sort first:

    echo "A\na\nb\n  B\n  C\n  E\n D\n C\n" | sort
      B
      C
      E
     C
     D
    A
    a
    b

With -b, the leading blanks are ignored:

    echo "A\na\nb\n  B\n  C\n  E\n D\n C\n" | sort -b
    A
      B
      C
     C
     D
      E
    a
    b

We may want to use the unique (-u) option to remove duplicates from the sort result

    echo "A\na\nb\n  B\n  C\n  E\n D\n C\n" | sort -b -u
    A
      B
      C
     D
      E
    a
    b
Sorting by column number in a multi-column file

    ls -1l | sort -k5
    drwx------+ 2 kalanand staff 68 Feb 4 2008 Mail
    drwxr-xr-x+ 2 kalanand staff 68 Feb 4 2008 nt_files
    drwxr-xr-x+ 2 kalanand staff 68 Feb 4 2008 private
    drwxr-xr-x+ 18 kalanand staff 612 Nov 25 13:27 public_html

Note: The column separator is, by default, any blank character. We can change it using the "-t" option (e.g., -t:). We can also sort on a range of columns:

    ls -1l | sort -k5,9 -n
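As a small sketch of the "-t" option mentioned in the note, the following sorts colon-separated records numerically on their second field. The name:age data and the variable by_age are made up for illustration.

```shell
# -t: sets ":" as the field separator; -k2,2n restricts the key to
# field 2 only and compares it numerically.
by_age=$(printf 'carol:30\nalice:4\nbob:12\n' | sort -t: -k2,2n)

printf '%s\n' "$by_age"
# alice:4
# bob:12
# carol:30
```

Writing the key as -k2,2 rather than -k2 matters once there are more fields, because -k2 alone extends the key from field 2 to the end of the line.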
Examples:

Without requiring unique option

    echo "John\nJack\nZeck\nJake\nJoseph\nJoe\nJake" | sort
    Jack
    Jake
    Jake
    Joe
    John
    Joseph
    Zeck

With uniq

    echo "John\nJack\nZeck\nJake\nJoseph\nJoe\nJake" | sort | uniq
    Jack
    Jake
    Joe
    John
    Joseph
    Zeck

Identify duplicates (-d)

    echo "John\nJack\nZeck\nJake\nJoseph\nJoe\nJake" | sort | uniq -d
    Jake

Count occurrences (-c)

    echo "John\nJack\nZeck\nJake\nJoseph\nJoe\nJake" | sort | uniq -c
    1 Jack
    2 Jake
    1 Joe
    1 John
    1 Joseph
    1 Zeck

Skip leading fields in comparisons (-f)

    echo "John 100\nJack 551\nZeck 185\nJake 411\nJoseph 56\nJoe 21\nJake 29" | sort | uniq -f 1
    Jack 551
    Jake 29
    Jake 411
    Joe 21
    John 100
    Joseph 56
    Zeck 185
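In the example above every second field is different, so "-f 1" removes nothing. A minimal sketch with made-up data in which skipping the first field does collapse adjacent lines:

```shell
# With -f 1 the first field is ignored, so "1 apple" and "2 apple"
# compare equal and only the first of them is kept.
deduped=$(printf '1 apple\n2 apple\n3 pear\n' | uniq -f 1)

printf '%s\n' "$deduped"
# 1 apple
# 3 pear
```

As with plain uniq, only adjacent lines are compared, so the input normally has to be sorted on the fields that matter first.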
    split [OPTION] [INPUT [PREFIX]]

Description: Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default size is 1000 lines, and default PREFIX is 'x'. With no INPUT, or when INPUT is -, read standard input.

    -a, --suffix-length=N   use suffixes of length N (default 2)
    -b, --bytes=SIZE        put SIZE bytes per output file
    -C, --line-bytes=SIZE   put at most SIZE bytes of lines per output file
    -d, --numeric-suffixes  use numeric suffixes instead of alphabetic
    -l, --lines=NUMBER      put NUMBER lines per output file

Use case: Let's say I have to send a personal video file of size 100MB to my friends through Gmail, but Gmail has a maximum file upload limit of 20MB. I can split the file into 5 smaller files of 20MB each and then upload the small files.

    split -b 20m <Largefilename> <Smallfilename>

where 20m is the size of each output file in MB; to split in KB, put k instead of m. The split files have names <Smallfilename>x, where x = "aa", "ab", "ac", ... etc.

To merge the part files back into a single large file, concatenate them:

    cat Smallfilename* > Largefilename
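The split-and-join round trip can be checked on a small scale. The sketch below is an illustration only: it uses a 2500-byte dummy file in a temporary directory, and the file names big.bin, part_ and joined.bin are made up.

```shell
# Work in a throwaway directory so nothing else is touched.
workdir=$(mktemp -d)

# Create a 2500-byte dummy file and split it into pieces of at most 1000 bytes.
head -c 2500 /dev/zero > "$workdir/big.bin"
split -b 1000 "$workdir/big.bin" "$workdir/part_"

# The shell expands part_* in alphabetical order (aa, ab, ac),
# which is exactly the order needed to rebuild the file.
parts=$(ls "$workdir"/part_* | wc -l | tr -d ' ')
cat "$workdir"/part_* > "$workdir/joined.bin"

# cmp -s is silent and succeeds only if both files are byte-identical.
if cmp -s "$workdir/big.bin" "$workdir/joined.bin"; then
  roundtrip=ok
else
  roundtrip=failed
fi

echo "$parts pieces, round trip $roundtrip"   # 3 pieces, round trip ok
rm -rf "$workdir"
```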
    wc -w *.txt | sort -nr | head -4
    find . -name '*.txt' -print | xargs grep '[A-Z][A-Z]'
    awk '{if($1>300) print $1,$2,$3,$4}' employee.txt | sort -n | uniq | head -5
    awk '{print $1,$2,$3,$4}' employee.txt | sort -u | wc -l
    find . -name employee.txt | xargs grep '[0-5]' | awk '{print $1, " ", $2, " ", $3, " ", $4, " ", $5}' | sort -u | wc -l
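The pipelines above are shown without output. As one worked sketch that chains awk, sort, uniq and head, the following finds the department with the most employees, assuming the employee.txt records shown earlier (inlined here so the example runs on its own; the variable names are made up).

```shell
employees='100 Thomas Manager Sales $65,000
200 Jason Developer Technology $65,500
300 John Sysadmin Technology $77,000
400 Emily Manager Marketing $99,500
500 Randy DBA Technology $66,000'

# 1. awk extracts the department column.
# 2. sort groups identical names so that uniq -c can count them.
# 3. sort -nr puts the largest count first, head -1 keeps it.
# 4. The final awk swaps the columns and drops uniq's leading padding.
top_dept=$(printf '%s\n' "$employees" |
  awk '{ print $4 }' | sort | uniq -c | sort -nr | head -1 |
  awk '{ print $2, $1 }')

echo "$top_dept"   # Technology 3
```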