
grep - syntax and options reference quick guide


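grep searches its input for lines that match a pattern.

shell> grep foo file(s)

shell> grep foo *

Both return all the lines in the named file(s) that contain a string matching
the expression “foo” (which may be a regular expression). The second form
searches every file in the current directory.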
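As a rough sketch of the two forms above, the snippet below builds a throwaway directory with two small files and runs both commands. The directory, the file names and their contents are made up for illustration.

```shell
# Hypothetical sample data in a temporary directory.
tmpdir=$(mktemp -d)
printf 'foo bar\nbaz\nfood\n' > "$tmpdir/a.txt"
printf 'nothing here\nfoo again\n' > "$tmpdir/b.txt"

# One file: only the matching lines are printed.
one=$(grep foo "$tmpdir/a.txt")
echo "$one"

# Several files: each matching line is prefixed with its file name.
many=$(cd "$tmpdir" && grep foo *)
echo "$many"

rm -rf "$tmpdir"
```

Note that “food” matches too, because grep looks for the pattern anywhere in the line.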

Another way of using grep is to have it read data from STDIN
and write the matching lines to STDOUT. For example,

shell> ls | grep blah

lists all files in the current directory whose names contain the string “blah”.
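A small sketch of the same filtering idea, with printf standing in for ls so that the input is predictable (the file names are invented):

```shell
# printf plays the role of `ls`; grep keeps only the lines containing "blah".
filtered=$(printf 'blah.txt\nnotes.md\nmy_blah_file\n' | grep blah)
echo "$filtered"
```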

Some common options:

-v - invert (print all lines except those that contain the pattern).

-i - ignore case of letters (small and capital treated as the same).

-l - (list) print only the names of the files where matches were found.

-s - suppress error messages about nonexistent or unreadable files.

-c - print only a count of the lines that contain the pattern.

-n - precede each line by its line number in the file (first line is 1).

-h - prevent the name of the file containing the matching line from being
prepended to that line (used when searching multiple files).

-w - search for the expression as a word, as if surrounded by \< and \>.
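To get a feel for a few of these options, here is a minimal sketch that runs them against four invented lines held in a shell variable:

```shell
# Hypothetical sample input.
sample='Apple pie
apple tart
pineapple
banana'

inverted=$(printf '%s\n' "$sample" | grep -v apple)    # lines without "apple"
nocase=$(printf '%s\n' "$sample" | grep -i -c apple)   # count, ignoring case
numbered=$(printf '%s\n' "$sample" | grep -n banana)   # line number prefix
words=$(printf '%s\n' "$sample" | grep -w apple)       # "apple" as a whole word

echo "$inverted"; echo "$nocase"; echo "$numbered"; echo "$words"
```

Here -v keeps “Apple pie” because the match is case sensitive, and -w skips “pineapple” because “apple” is not a separate word there.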

Example: find all cgi scripts in the directory which
call a certain stored procedure:

grep -il myproc *.cgi

Example: count how many ‘httpd’ processes are running:

ps -ef | grep httpd | wc -l

(The count may include the grep process itself, since its command line also contains “httpd”.)

Example: - pipe several greps to filter out things:

ps -ef | grep -v '^oracle' | grep -v '^root' | grep -v '^nobody'

Example: find all files in the directory tree containing a certain pattern:

find . -type f -print | xargs grep -ls 'your pattern' /dev/null
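One possible variant, sketched here rather than taken from the original page, lets find call grep directly with -exec ... {} +, which also copes with spaces in file names. The directory layout below is invented for the demonstration.

```shell
# Hypothetical tree with a space in one directory name.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/sub dir"
printf 'your pattern is here\n' > "$tmpdir/sub dir/hit.txt"
printf 'nothing\n' > "$tmpdir/miss.txt"

# -exec ... {} + passes file names to grep in batches, much like xargs.
found=$(cd "$tmpdir" && find . -type f -exec grep -ls 'your pattern' {} +)
echo "$found"

rm -rf "$tmpdir"
```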

Commonly used find commands:

find . -mtime -1 -print
- find files modified in the last 24 hours

find . -mtime -7 -name 'j*html' -print
- find files matching j*html that were modified within the last 7 days

find . -name '*.pl' -exec perl -wc {} \;
- syntax-check many perl files at once

Note: in the ps example above, the patterns are
regular expressions, so the ‘^’ means “match at the beginning of
the line”.

Example: using dot (matches any single character) and star
(the preceding character may occur 0 or more times):

The file for these examples:

>cat file
big
bad bug
bag
bigger
boogy

Wildcards #1:

>grep 'b.*g' file
big
bad bug
bag
bigger
boogy

Wildcards #2:

>grep 'b.*g.' file
bigger
boogy

Wildcards #3:

>grep 'ggg*' file
bigger

Note: If the pattern consists of several words,
grep will treat the 2nd word as a file name. So you need to
surround the whole pattern with single quotes. Use
double quotes if you want the shell to expand variables, for
example:

grep "$HOME" file
- searches file for the name of your home directory

grep '$HOME' file
- searches for the string $HOME
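The difference between the two quoting styles can be sketched as follows. DEMO_DIR is a hypothetical variable used in place of $HOME so that the result does not depend on who runs it.

```shell
# Hypothetical variable and input lines.
DEMO_DIR=/home/alice
input='path is /home/alice
literal $DEMO_DIR here'

# Double quotes: the shell expands the variable before grep sees it.
expanded=$(printf '%s\n' "$input" | grep "$DEMO_DIR")

# Single quotes: grep receives the text $DEMO_DIR unchanged.
literal=$(printf '%s\n' "$input" | grep '$DEMO_DIR')

echo "$expanded"
echo "$literal"
```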

Note: If the pattern contains ‘$’ or other characters that are
special to the shell (such as ?  *  \  [  ]), the shell may interpret
them before passing them to grep. To avoid this, single-quote the
pattern or escape the characters with backslashes. Some characters also
have special meaning to grep itself (like dot or ‘^’). If you want just
the character itself, escape it with a backslash; the escape goes
inside the single-quoted pattern.

Example:

grep 'hello\.gif' file
- matches lines containing hello.gif

grep 'hello.gif' file
- matches lines containing hello-gif , hello1gif , helloagif , etc.
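A quick sketch of the escaped and unescaped dot on three invented names. The last command uses -F, which tells grep to treat the whole pattern as a fixed string, as another way to get a literal dot.

```shell
# Hypothetical input.
names='hello.gif
hello-gif
hello1gif'

exact=$(printf '%s\n' "$names" | grep 'hello\.gif')    # dot escaped: literal dot
loose=$(printf '%s\n' "$names" | grep -c 'hello.gif')  # dot matches any character
fixed=$(printf '%s\n' "$names" | grep -F 'hello.gif')  # whole pattern is literal

echo "$exact"; echo "$loose"; echo "$fixed"
```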

Example: using ‘?’ (means ‘may be one’):

grep 'bugg\?y' file
- matches all of the following: bugy , buggy but not bugggy

grep 'Fred\(eric\)\? Smith' file
- matches Fred Smith or Frederic Smith
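The two patterns above can be tried on a few invented lines. This sketch assumes GNU grep, where \? is available in basic regular expressions.

```shell
# Assumes GNU grep (\? is a GNU extension to basic regular expressions).
bugs=$(printf 'bugy\nbuggy\nbugggy\n' | grep 'bugg\?y')
smiths=$(printf 'Fred Smith\nFrederic Smith\nFreddy Smith\n' | grep 'Fred\(eric\)\? Smith')

echo "$bugs"
echo "$smiths"
```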

Other regex constructs:

grep '\(abc\)*' file
- matches abc , abcabcabc etc. (i.e. , any number of repetitions of the
string abc , including the empty string.)

grep '[Hh]ello' file
- matches lines containing hello or Hello

Ranges:

[0-3]   is the same as   [0123]

[a-k]   is the same as   [abcdefghijk]

[A-C] is the same as [ABC]

[A-Ca-k] is the same as [ABCabcdefghijk]

There are also some alternate forms:

[[:alpha:]] is the same as [a-zA-Z]

[[:upper:]] is the same as [A-Z]

[[:lower:]] is the same as [a-z]

[[:digit:]] is the same as [0-9]

[[:alnum:]] is the same as [0-9a-zA-Z]

[[:space:]] matches any white space including tabs
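A small sketch of ranges and named classes on invented input. The range example sets LC_ALL=C as a precaution, since how a range like [a-c] is interpreted can depend on the locale.

```shell
# Hypothetical input.
lines='abc
ABC
123
a1 b2'

digits=$(printf '%s\n' "$lines" | grep '[[:digit:]]')       # any line with a digit
upper=$(printf '%s\n' "$lines" | grep '[[:upper:]]')        # any line with a capital
spaced=$(printf '%s\n' "$lines" | grep '[[:space:]]')       # any line with white space
ranged=$(printf '%s\n' "$lines" | LC_ALL=C grep '^[a-c]*$') # only the letters a to c

echo "$digits"; echo "$upper"; echo "$spaced"; echo "$ranged"
```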

The [] may be used to search for non-matches. This is done by
putting a ‘^’ as the first character inside the square brackets.

Example:

grep '([^()]*)a' file
- returns any line containing a pair of parentheses that are innermost
(don’t have parentheses inside them) and are followed by the letter “a”.
So it matches these lines

(hello)a

(aksjdhaksj d ka)a

But not this

x=(y+2(x+1))a
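The three sample lines above can be fed to the pattern directly, as in this short sketch:

```shell
# The three lines from the example; only the first two should match.
parens=$(printf '(hello)a\n(aksjdhaksj d ka)a\nx=(y+2(x+1))a\n' | grep '([^()]*)a')
echo "$parens"
```

The third line fails because its innermost pair, (x+1), is followed by “)” rather than “a”.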

Note:

A closing square bracket loses its special meaning if placed
first in a list. For example []12] matches ] , 1, or 2.

A dash - loses its usual meaning inside lists if it is placed
last.

A caret ^ loses its special meaning if it is not placed first.

Most special characters lose their meaning inside square brackets.

Note that a $ sign loses its meaning if characters follow it.
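These placement rules can be sketched on a few invented lines, each containing one of the characters in question:

```shell
# Hypothetical input.
chars='a]b
x-y
p^q
plain'

bracket=$(printf '%s\n' "$chars" | grep '[]12]')  # ] placed first is literal
dash=$(printf '%s\n' "$chars" | grep '[+-]')      # - placed last is literal
caret=$(printf '%s\n' "$chars" | grep '[+^]')     # ^ not placed first is literal

echo "$bracket"; echo "$dash"; echo "$caret"
```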

Matching a Specific Number Of Repetitions of a Pattern:

Example: searching for a 7 digit phone number like this:

grep '[[:digit:]]\{3\}[ -]\?[[:digit:]]\{4\}' file

This matches phone numbers, possibly containing a dash or whitespace
in the middle.
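A minimal sketch of the phone number pattern on four invented numbers. The class name is written as [[:digit:]], inside a bracket expression, and GNU grep is assumed for \?.

```shell
# Hypothetical numbers; the last one has only two leading digits.
phones='555-1234
555 1234
5551234
55-1234'

matched=$(printf '%s\n' "$phones" | grep '[[:digit:]]\{3\}[ -]\?[[:digit:]]\{4\}')
echo "$matched"
```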

The $ character matches the end of the line. The ^ character matches
the beginning of the line.

Example:

grep '^[[:space:]]*hello[[:space:]]*$' file
- matches lines that contain only the word hello, optionally surrounded by white space

grep '^From.*somename' /var/spool/mail/myname
- searches mail inbox for headers from a particular person.
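The effect of the two anchors can be sketched on four invented lines, one of which has leading and trailing spaces:

```shell
# Hypothetical input; the second line is "   hello  " with spaces on both sides.
text=$(printf 'hello\n   hello  \nhello world\nsay hello\n')

only=$(printf '%s\n' "$text" | grep -c '^[[:space:]]*hello[[:space:]]*$')
starts=$(printf '%s\n' "$text" | grep '^hello')
ends=$(printf '%s\n' "$text" | grep 'hello$')

echo "$only"; echo "$starts"; echo "$ends"
```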

Vertical line means ‘either this or that’:

Example:

grep 'cat\|dog' file
- matches lines containing the string “cat” or the string “dog”

grep 'I am a \(cat\|dog\)' file
- matches lines containing the string “I am a cat” or the string “I am a dog”.
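A short sketch of alternation on invented lines, assuming GNU grep for \|. The last command adds -w to show how to restrict the match to whole words.

```shell
# Hypothetical input.
animals=$(printf 'I am a cat\nI am a dog\nI am a cow\ncatalog\n')

either=$(printf '%s\n' "$animals" | grep 'cat\|dog')            # substring match
phrase=$(printf '%s\n' "$animals" | grep 'I am a \(cat\|dog\)') # grouped alternatives
whole=$(printf '%s\n' "$animals" | grep -w 'cat\|dog')          # whole words only

echo "$either"; echo "$phrase"; echo "$whole"
```

Without -w, “catalog” matches as well, because it contains “cat”.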

Backreferences

The expression <H[1-6]>.*</H[1-6]> is not good enough to match
html headers, since it also matches <H1>Hello world</H3> (an error:
the opening tag is different from the closing one). The solution is to
use a backreference.

A backreference is an expression \n where n is
a number. It matches the contents of the n’th set of parentheses
in the expression.

Examples:

<H\([1-6]\)>.*</H\1> matches what we were trying to match before.

“Mr \(dog\|cat\) came home to Mrs \1 and they went to visit Mr \(dog\|cat\)
and Mrs \2 to discuss the meaning of life” matches only when each Mrs has
the same name as the Mr mentioned just before her.
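To see the backreference at work, this sketch compares the plain pattern with a backreferenced one on three invented header lines:

```shell
# Hypothetical input; the second line has mismatched tags.
html='<H1>Hello world</H1>
<H1>Hello world</H3>
<H2>Sub</H2>'

# Without a backreference all three lines match.
loose=$(printf '%s\n' "$html" | grep -c '<H[1-6]>.*</H[1-6]>')

# \1 must repeat the digit captured by the first \( \) group.
strict=$(printf '%s\n' "$html" | grep '<H\([1-6]\)>.*</H\1>')

echo "$loose"
echo "$strict"
```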

egrep

Despite the origin of the name (extended), egrep actually has less
functionality, as it is designed for compatibility with the traditional
egrep. A better way to do an extended “grep” is to use grep -E, which uses
extended regular expression syntax without loss of functionality.

grep                        grep -E                    Available for egrep?
a\+                         a+                         yes
a\?                         a?                         yes
expression1\|expression2    expression1|expression2    yes
\(expression\)              (expression)               yes
\{m,n\}                     {m,n}                      no
\{,n\}                      {,n}                       no
\{m,\}                      {m,}                       no
\{m\}                       {m}                        no
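As an illustration of the first two columns, this sketch runs the same search in both syntaxes on invented input and expects identical results. GNU grep is assumed for \? in the basic form.

```shell
# Hypothetical input.
data='aaa
b
color
colour'

basic=$(printf '%s\n' "$data" | grep 'colou\?r')       # basic syntax
extended=$(printf '%s\n' "$data" | grep -E 'colou?r')  # extended syntax

basic_rep=$(printf '%s\n' "$data" | grep 'a\{2,3\}')
ext_rep=$(printf '%s\n' "$data" | grep -E 'a{2,3}')

echo "$basic"; echo "$extended"; echo "$basic_rep"; echo "$ext_rep"
```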

———————-

This page adapted from http://www.pegasus.rutgers.edu/~elflord/unix/grep.html

Posted Monday, December 28th, 2009 at 12:48 pm.