I’ve kind of half understood grep for a while, but assumed that I didn’t really get it at all. I thought I knew nothing at all about sed. I took some time this weekend to sit down and actually learn about these two commands and discovered I already knew a good deal about both of them, and I filled in some of what I didn’t know pretty easily. Both are a lot simpler and more straightforward than I thought they were.
grep comes from “global regular expression print”. This is not really an acronym, but comes from the old-time ed line editor tool. In that tool, if you wanted to globally search a file you were editing and print the lines that matched, you’d type g/re/p, where re is the regular expression you are using to search. The functionality got pulled out of ed and made into a standalone tool, grep.
grep, in its simplest use, searches all the lines of a given file or files and prints all the lines that match a regular expression. Syntax:
grep <options> <expression> <file(s)>
So if you want to search the file animals.txt for all the lines that contain the word dog, you just type:
grep dog animals.txt
It’s usually suggested that you include the expression in single quotes. This prevents a lot of potential problems, such as misinterpretations of spaces or unintentional expansion:
grep 'flying squirrel' animals.txt
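Here is a small sketch of both forms in action. The contents of animals.txt are made up for illustration, and the file lives in a throwaway directory so nothing real gets touched.

```shell
# Hypothetical sample data in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' 'dog' 'cat' 'flying squirrel' 'squirrel' 'hot dog stand' > "$workdir/animals.txt"

# Unquoted single word: prints every line that contains "dog".
dog_lines=$(grep dog "$workdir/animals.txt")
printf '%s\n' "$dog_lines"

# Quoted phrase: the space stays part of the expression instead of
# splitting it into an expression and an extra file name.
squirrel_lines=$(grep 'flying squirrel' "$workdir/animals.txt")
printf '%s\n' "$squirrel_lines"

rm -rf "$workdir"
```

With this sample file, the first search prints the dog and hot dog stand lines, and the second prints only flying squirrel.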
It’s also a good idea to explicitly use the -e flag before the expression. This explicitly tells grep that the thing coming next is the expression. Say you had a file that was a list of items, each preceded with a dash, and you wanted to search for -dog:
grep '-dog' animals.txt
Even with the quotes, grep will try to parse -dog as a command line flag. This handles it:
grep -e '-dog' animals.txt
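A quick sketch of the dash problem, again with a made-up file. It also shows the standard -- end-of-options marker, which is another way to get the same result.

```shell
# Hypothetical list where every item starts with a dash.
workdir=$(mktemp -d)
printf '%s\n' '-dog' '-cat' '-flying squirrel' > "$workdir/animals.txt"

# -e marks the next argument as the expression, even though it starts with a dash.
with_e=$(grep -e '-dog' "$workdir/animals.txt")

# "--" ends option parsing, so everything after it is treated as an operand.
with_double_dash=$(grep -- '-dog' "$workdir/animals.txt")

printf '%s\n' "$with_e" "$with_double_dash"

rm -rf "$workdir"
```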
You can search multiple files at the same time with wildcards:
grep -e 'dog' *
This will find all the lines that contain dog in any file in the current directory.
You can also recurse directories using the -r flag:
grep -r -e 'dog' *
You can combine flags, but make sure that e is the last one before the expression, because whatever follows it is taken as the expression:
grep -re 'dog' *
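To see the recursive search at work, here is a sketch with a hypothetical directory tree. The directory and file names are invented for the example, and the output is sorted only to make the order predictable.

```shell
# Hypothetical tree: pets/animals.txt and pets/more/snacks.txt
workdir=$(mktemp -d)
mkdir -p "$workdir/pets/more"
printf '%s\n' 'dog' 'cat' > "$workdir/pets/animals.txt"
printf '%s\n' 'hot dog' 'bird' > "$workdir/pets/more/snacks.txt"

# Separate flags: -r descends into subdirectories, -e introduces the expression.
separate=$(cd "$workdir/pets" && grep -r -e 'dog' * | sort)

# Combined flags: same thing, with e last so 'dog' is read as the expression.
combined=$(cd "$workdir/pets" && grep -re 'dog' * | sort)

printf '%s\n' "$separate"

rm -rf "$workdir"
```

Each matching line comes back prefixed with the path of the file it was found in, which is how you tell the matches apart.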
A very common use of grep is to pipe the output of another command into it:
cat animals.txt | grep -e 'dog'
This simple example is exactly the same as just using grep with the file name, so it is an unnecessary use of cat, but if you have some other command that generates a bunch of text, this is very useful.
grep simply outputs its results to stdout, which is normally the terminal. You could pipe that into another command or save it to a new file:
grep -e 'dog' animals.txt > dogs.txt
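Both directions can be sketched in a few lines. Here printf stands in for "some other command that generates a bunch of text", and the file names are placeholders.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'dog' 'cat' 'hot dog stand' > "$workdir/animals.txt"

# Input side: filter the output of another command.
piped=$(printf '%s\n' 'big dog' 'small cat' | grep -e 'dog')

# Output side: save the matching lines to a new file...
grep -e 'dog' "$workdir/animals.txt" > "$workdir/dogs.txt"

# ...or pipe them into another command, here counting the matches.
match_count=$(grep -e 'dog' "$workdir/animals.txt" | wc -l | tr -d ' ')

printf '%s\n' "$piped" "$match_count"

rm -rf "$workdir"
```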
When you get into more complex regular expressions, you’ll need to start escaping the special characters you use to construct them, like parentheses and brackets:
grep -e '\(flying \)\?squirrel' animals.txt
This can quickly become a pain. Time for extended regular expressions, using the -E flag:
grep -Ee '(flying )?squirrel' animals.txt
Much easier. Note that -E has nothing at all to do with -e. That confused me earlier. You should use both in this case. You may have heard of the tool egrep. This is simply grep -E. In some systems egrep is literally a shell script that calls grep -E. In others it’s a separate executable, but it’s just grep -E under the hood.
egrep -e '(flying )?squirrel' animals.txt
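If you want to convince yourself that the escaped form and the -E form are the same search, a sketch like this compares them on a made-up file. One assumption to flag: the \? in the basic form is an extension that GNU grep understands, so that line may behave differently on other systems.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'flying squirrel' 'squirrel' 'dog' > "$workdir/animals.txt"

# Basic syntax: grouping and the optional marker need backslashes.
# (\? is an extension, not part of POSIX basic regular expressions.)
basic=$(grep -e '\(flying \)\?squirrel' "$workdir/animals.txt")

# Extended syntax: the same expression without the backslashes.
extended=$(grep -E -e '(flying )?squirrel' "$workdir/animals.txt")

printf '%s\n' "$extended"
if [ "$basic" = "$extended" ]; then
  echo "both forms selected the same lines"
fi

rm -rf "$workdir"
```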
The above covers most of what you need to know to use basic grep. There are some other useful flags you can check into as well:
-o prints only the text that matches the expression, instead of the whole line
-h suppresses the file name that is otherwise printed before each line when searching multiple files
-n prints the line number of each printed line
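The three flags above are easy to try side by side. This is a sketch on two invented files, not an exhaustive tour.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'cat' 'my dog barks' 'flying squirrel' > "$workdir/animals.txt"
printf '%s\n' 'hot dog' > "$workdir/snacks.txt"

# -o: only the matching text, not the whole line.
only_match=$(grep -o -e 'dog' "$workdir/animals.txt")

# -n: prefix each printed line with its line number.
numbered=$(grep -n -e 'dog' "$workdir/animals.txt")

# -h: searching two files would normally prefix each line with its
# file name; -h leaves that prefix off.
no_names=$(grep -h -e 'dog' "$workdir/animals.txt" "$workdir/snacks.txt")

printf '%s\n' "$only_match" "$numbered" "$no_names"

rm -rf "$workdir"
```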
Simple Text Search
If the thing you are searching for is simple text, you can use grep -F or fgrep. These work the same way, but you can’t use regular expressions; you just search for a simple string.
grep -Fe 'dog' animals.txt
fgrep -e 'dog' animals.txt
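The difference only shows up when the search string contains a character that regular expressions treat specially. A sketch with a made-up file, searching for dog followed by a literal period:

```shell
workdir=$(mktemp -d)
printf '%s\n' 'dog.' 'dogs' 'doge' > "$workdir/animals.txt"

# As a regular expression, "." matches any character, so all three lines match.
regex_hits=$(grep -e 'dog.' "$workdir/animals.txt")

# With -F the pattern is a plain string, so only the literal "dog." matches.
fixed_hits=$(grep -F -e 'dog.' "$workdir/animals.txt")

printf 'regex:\n%s\nfixed:\n%s\n' "$regex_hits" "$fixed_hits"

rm -rf "$workdir"
```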
You can also use grep with Perl syntax for regular expressions. This is a lot more powerful than normal grep regex syntax, but a lot more complex, so only use it if you really need it. It’s also not supported on every system. To use it, use the -P flag:
grep -Pe 'dog' animals.txt
In this simple case, using Perl syntax gives us nothing beyond the usual syntax. Also note that pgrep is NOT an alternate form of grep -P; it is an unrelated tool that looks up running processes. So much for consistency.
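For a case where Perl syntax does add something, here is a sketch using a negative lookahead, which the basic and extended syntaxes don't have. Since -P isn't available everywhere, the example checks for it first; the file contents and the fallback message are made up.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'dog' 'doghouse' 'cat' > "$workdir/animals.txt"

# Probe for -P support before relying on it.
if printf 'x\n' | grep -P -e 'x' >/dev/null 2>&1; then
  # (?!house) is a negative lookahead: match "dog" only when it is
  # not immediately followed by "house".
  perl_hits=$(grep -P -e 'dog(?!house)' "$workdir/animals.txt")
else
  perl_hits='grep -P unavailable'
fi

printf '%s\n' "$perl_hits"

rm -rf "$workdir"
```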
I thought that I knew next to nothing about
sed, but it turns out that I’ve been using it for a few years for text replacement in
sed stands for “stream editor” and also hails from the ed line editor program. The syntax is:
sed <options> command <file(s)>
The two main options you’ll use most of the time are -e and -E, which work the same way they do in grep.
The most common use of sed is to replace text in files. There are other uses which can edit the files in other ways, but I’ll stick to the basic replacement use case.
sed reads each line of text in a file or files and looks for a match. It then performs a replacement on the matched text and prints out the resulting lines. The expression to use for replacement is s/x/y/, where x is the text you are looking for, and y is what to replace it with. So to replace the first cat on each line with the word feline:
sed -e 's/cat/feline/' animals.txt
sed will print every line of text in the file, whether or not it found a match. But the lines that it matched will be changed the way you specified.
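A short sketch makes both points visible: unmatched lines pass through untouched, and the input file itself is not modified. The file contents are invented for the example.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'cat' 'dog' 'cat and cat' > "$workdir/animals.txt"

# Every line is printed; only the first "cat" on each line is replaced.
replaced=$(sed -e 's/cat/feline/' "$workdir/animals.txt")
printf '%s\n' "$replaced"

# The original file still holds the old text, since sed only wrote to stdout.
original=$(cat "$workdir/animals.txt")

rm -rf "$workdir"
```

Note that the third line comes out as feline and cat, because without any extra flags only the first match on a line is replaced.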
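After the final slash in the expression, you can add other regex flags like g for global (replace every match on a line, not just the first) or i for case insensitivity (an extension that not every version of sed supports).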
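Here is a sketch of what the flags change, feeding sed a single line through a pipe. The last command assumes a sed that accepts the case-insensitive flag, such as GNU sed, so treat it as something to try rather than something guaranteed to work everywhere.

```shell
# No flag: only the first match on the line is replaced.
first_only=$(printf 'cat and cat\n' | sed -e 's/cat/feline/')

# g: every match on the line is replaced.
every_match=$(printf 'cat and cat\n' | sed -e 's/cat/feline/g')

printf '%s\n' "$first_only" "$every_match"

# g and i together: also matches "Cat". Assumes the i flag is supported
# (it is an extension, for example in GNU sed).
printf 'Cat and cat\n' | sed -e 's/cat/feline/gi'
```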
sed just outputs to stdout. You can redirect that to another file using > or pipe it to another process using |. But do NOT save the output back to the original file. Try it out some time on a test file that you don’t care about and see what happens.
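If you do want the edited text to end up in the original file, one common workaround is to write to a temporary file first and then move it into place. This is only a sketch with placeholder file names; redirecting straight onto the input fails because the shell empties the file before sed gets to read it.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'cat' 'dog' > "$workdir/animals.txt"

# Write the result to a second file, and replace the original only
# if sed finished successfully.
sed -e 's/cat/feline/' "$workdir/animals.txt" > "$workdir/animals.txt.new" \
  && mv "$workdir/animals.txt.new" "$workdir/animals.txt"

updated=$(cat "$workdir/animals.txt")
printf '%s\n' "$updated"

rm -rf "$workdir"
```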
There are lots of other options you can use with sed, but the above will probably get you by for a while to come. As you need more, just read up.
The biggest thing I took away from the couple of hours I spent on this was actually how easy these two commands were to learn. I’d been avoiding doing so for a long time and now wish that I had spent the effort on this much earlier.