Grep
For example, grep can find a 10-letter crossword puzzle word matching the pattern s _ _ o _ l y _ _ d in a dictionary file containing hundreds of thousands of correctly spelled English words.
The first command to enter is one that takes us to the directory holding a file of 234,936 English words, one word per line.
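A minimal sketch of that first step, assuming the word list lives in /usr/share/dict (its usual home on macOS and BSD systems; the original does not say where the file is). The `DICT_DIR` variable is a hypothetical name used only here, and the line count may differ between versions of the file.

```shell
# Assumed location of the word list; adjust for your system.
DICT_DIR="${DICT_DIR:-/usr/share/dict}"

if [ -f "$DICT_DIR/web2" ]; then
    cd "$DICT_DIR"      # move into the directory that holds web2
    wc -l < web2        # count the lines, i.e. the words, in the file
else
    echo "web2 not found in $DICT_DIR"
fi
```

Once the shell is in that directory, the file can be referred to simply as web2 in later commands.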
The “web2” File of English Words
Using grep
To run grep, type “grep” on the command line, followed by a space and the pattern you are trying to match, followed by another space and the path to and name of the file you want to search. Blank spaces are the standard UNIX way to separate command line elements. You run the command by pressing the “return” or “enter” key.
grep Commands
Again, we are using only the simplest capabilities of grep, so we need only the word “grep” followed by one or more spaces, followed by a pattern to match, followed by the path to a file (our “web2” file) that we are going to search.
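As a sketch of that command shape, the following builds a tiny stand-in word list (the `words` variable and the file contents are made up for illustration) and searches it for a literal string.

```shell
# Create a small stand-in for web2, one word per line.
words=$(mktemp)
printf '%s\n' apple grape pineapple plum > "$words"

# grep, a space, the pattern, a space, then the file to search.
grep apple "$words"
```

This prints every line that contains the letters "apple" anywhere in it, here apple and pineapple.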
grep pattern indicators include:
. | (a period) This means “Match any character”. |
^ | (a caret or “shift 6”) This means the next character in the pattern must be at the beginning of a line. |
$ | (a dollar sign or “shift 4”) This means the previous character in the pattern must be at the end of a line. |
Other than that, we just enter the explicit letters that we are looking for. For example the following search:
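A minimal sketch of such a search, using a made-up pattern and a made-up five-word list (neither is from the original page). The single quotes are a precaution added in this sketch so the shell passes `^`, `.` and `$` to grep untouched.

```shell
# Small stand-in word list, one word per line.
words=$(mktemp)
printf '%s\n' cat cot coat scat cut > "$words"

# ^ pins "c" to the start of the line, "." matches any one character,
# and $ pins "t" to the end of the line: three-letter entries c?t.
grep '^c.t$' "$words"
```

Only cat, cot and cut are printed. The entry coat has one letter too many between the anchors, and scat does not begin with "c".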
We are using grep’s beginning-of-line and end-of-line markers rather than the beginning-of-word “\<” and end-of-word “\>” markers because the words are arranged one per line, so each line is exactly one entry and anchoring to the line is all we need.
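To see how the two kinds of markers differ, this sketch uses a made-up file that mixes a one-word line with a multi-word line. It assumes a grep that supports `\<` and `\>`, which GNU and BSD grep provide as an extension.

```shell
# Two lines: one holds a single word, the other holds a phrase.
lines=$(mktemp)
printf '%s\n' 'cat' 'the cat sat' > "$lines"

# Word markers match "cat" wherever it stands as a whole word: both lines.
grep -c '\<cat\>' "$lines"

# Line anchors match only when "cat" is the entire line: one line.
grep -c '^cat$' "$lines"
```

The `-c` option prints the number of matching lines instead of the lines themselves, so the two commands print 2 and 1.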
Let’s Do It!
So now let’s look at the problem posed at the beginning of this page. We will try to match s _ _ o _ l y _ _ d (10 letters).
The required pattern for this is:
^s..o.ly..d$
This indicates that the word is 10 letters long, that there must be an “s” in the first position, an “o” in the fourth position, an “l” in the sixth position, a “y” in the seventh position, and a “d” in the tenth position, which must be the last letter of the 10-letter word.
After this pattern come one or more spaces, followed by the name of the file to be searched, which is “web2”, the file containing the 234,936 English words.
Our command line is thus:
grep ^s..o.ly..d$ web2
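To try the crossword pattern without the full word list, the sketch below runs it against a few stand-in entries. The file contents are chosen for illustration, the pattern is the one implied by the letter positions described above, and the output on the real web2 may include other words.

```shell
# Small stand-in for web2, one word per line.
words=$(mktemp)
printf '%s\n' schoolyard schoolyards stronghold symbolized shroudlike > "$words"

# s in position 1, o in 4, l in 6, y in 7, d in 10, and nothing else on the line.
# Single quotes keep the shell from interpreting any pattern characters.
grep '^s..o.ly..d$' "$words"
```

Of these five entries only schoolyard fits. The entry schoolyards fails because the `$` anchor requires the line to end right after the "d".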
Assignment
To receive my assessment of how well you understood this tutorial, search for the following pattern using grep in web2:
_ a _ _ _ f _ c _ n _ _ _ (13 letters)