
Basic Log Grepping, Searching, and Sorting On Linux / Unix
by Cory Rauch 2007-02-10 Category: Linux-System

In this tutorial I'll cover basic searching, grepping, and sorting of logs generated on Linux or Unix. These techniques are useful for digging interesting information out of log files. Below I'll give some examples of how to use the powerful grep command.

What is Grep

Grep is a command-line utility included in most Unix and Linux distributions that searches text, either in files or in the output of another command, and prints the lines that match a pattern. For example, below we search the file test.txt for the string 'test'; every line that contains it is printed to the console.

# cat test.txt | grep 'test'

Notice that above we use another command, cat, to output the text file, then pipe it to grep to search for the string 'test'. A pipe, denoted by the | character, sends the output of one command to the input of another, much as the name implies.
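The cat is optional here, since grep can also open a file by itself. As a small sketch (the contents of test.txt below are invented for illustration), both forms should print the same matching lines:

```shell
# Use a scratch directory so no real file is overwritten.
workdir=$(mktemp -d)

# Hypothetical sample file; only two of its lines contain "test".
printf '%s\n' 'this is a test' 'nothing here' 'another test line' > "$workdir/test.txt"

# Form used in the article: cat writes the file, the pipe feeds it to grep.
piped=$(cat "$workdir/test.txt" | grep 'test')

# Equivalent form: grep reads the file directly, no pipe needed.
direct=$(grep 'test' "$workdir/test.txt")

echo "$piped"
echo "$direct"
```

The piped form becomes more useful once several commands are chained together, as in the later examples.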

Usage Scenarios

To use grep for searching logs, we first want to see what we are working with. For example, in a referrer log I might notice that the file looks like this:

.....
http://www.improvedsource.com/view.php/GUI/1/ -> /search/images/search.gif
http://www.improvedsource.com/view.php/GUI/1/ -> /search/results.php
http://www.improvedsource.com/view.php/GUI/1/ -> /search/images/point.gif
http://www.improvedsource.com/view.php/GUI/1/ -> /search/images/seperator.gif
http://www.improvedsource.com/view.php/GUI/1/ -> /view.php/GUI/1/images/bg.jpg
- -> /view.php/Software-Bugs/12/
http://www.improvedsource.com/view.php/GUI/1/ -> /view.php/GUI/1/images/bg.jpg
http://www.improvedsource.com/view.php/GUI/1/ -> /view.php/GUI/1/images/bg.jpg
http://www.improvedsource.com/view.php/GUI/1/ -> /view.php/GUI/1/images/bg.jpg
- -> /rss.php/latest.rdf
- -> /rss.php/latest.rdf
http://www.google.com/search?hl=en&q=web+form+services&btnG=Google+Search -> /view.php/Web-Forms/10/
....

Looking at this snippet, I can see that Google referrals include the search string. This is interesting data, because it shows what people searched for to reach my web page. I could use it to find topics worth writing about that I haven't covered yet. So I could grep for google.com like so:

# cat referrer_log* | grep 'google.com'

This would give me all google.com entries. (Strictly speaking, the unescaped dot in the pattern matches any character, but that is close enough for a log search like this.) That's great, but it would still be difficult to read through all those entries just to get an idea of the most popular queries from Google. What I can do instead is collapse duplicate entries and sort them by number of occurrences. To do this, see below:

# cat referrer_log* | grep 'google.com' | sort | uniq -ic | sort -n
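As a quick check of how the pieces of this pipeline behave, here is a sketch on a few made-up referrer lines (they are not taken from the real log). It uses only the -c option of uniq and compares the result with and without the first sort:

```shell
# Hypothetical referrer lines; the "linux" URL appears twice, but not on adjacent lines.
sample=$(printf '%s\n' \
  'http://www.google.com/search?q=linux' \
  'http://www.google.com/search?q=grep' \
  'http://www.google.com/search?q=linux')

# Without sorting, uniq only merges adjacent duplicates, so all three lines survive.
unsorted=$(printf '%s\n' "$sample" | uniq -c | wc -l)

# Sorting first puts identical lines next to each other, so uniq can count them.
counted=$(printf '%s\n' "$sample" | sort | uniq -c | sort -n)

echo "lines left without the first sort: $unsorted"
echo "$counted"
```

With the first sort in place, the repeated URL is reported once with its count, and sort -n moves it to the bottom of the list.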

Looking at the above command, you see three new commands added. First comes sort, which does what the name implies. Then we pipe to uniq, which collapses repeated lines into one; the -c option prefixes each line with its occurrence count, and -i ignores case when comparing lines. We include the first sort so that uniq works properly, since uniq only compares adjacent lines. Finally, sort -n sorts numerically; because uniq starts each line with the count, the result is ordered from least to most frequent.

Now, how can we extract just a portion of the matching line, you may ask? This is a little more advanced and requires some knowledge of regular expressions. For example, for Google we could type:

# cat referrer_log* | grep 'google.com' | grep -ohP 'q=(.+?)&' | sort | uniq -ic | sort -n

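Notice the grep -ohP command that was added. The -o option prints only the part of each line that matches, rather than the whole line; -h suppresses file names in the output; and -P (a GNU grep option) enables Perl-compatible regular expressions, which the non-greedy .+? requires. In our example the regular expression 'q=(.+?)&' matches only the Google query string in the URL, so the output would look something like this: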

....
3 q=linux+desktop+weather&
3 q=reduce+boottime+linux&
3 q=sysbotz&
4 q=coolest+linux+desktop&
4 q=debian+linux+boot+time+load+kernel&
4 q=ubuntu+php+5.2&
7 q=cool+linux+desktops&
......
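The extraction step above depends on grep -P, which is a GNU grep option and may not be available on other Unix systems. As one possible alternative, the sketch below uses sed to pull out the query, tr to turn the + separators back into spaces, and sort -rn so the most popular query comes first. The sample URLs are invented for illustration, and the pattern assumes the query is passed in a q= parameter as in the log snippet earlier:

```shell
# Hypothetical referrer URLs standing in for the real log.
sample=$(printf '%s\n' \
  'http://www.google.com/search?hl=en&q=cool+linux+desktops&btnG=Google+Search' \
  'http://www.google.com/search?hl=en&q=ubuntu+php+5.2&btnG=Google+Search' \
  'http://www.google.com/search?hl=en&q=cool+linux+desktops&btnG=Google+Search')

# sed keeps only the text between "q=" and the next "&" (or end of line);
# tr replaces each "+" with a space;
# sort | uniq -c counts duplicates, and sort -rn lists the highest count first.
queries=$(printf '%s\n' "$sample" \
  | sed -n 's/.*[?&]q=\([^&]*\).*/\1/p' \
  | tr '+' ' ' \
  | sort | uniq -c | sort -rn)

echo "$queries"
```

Requiring a ? or & before q= keeps the pattern from matching other parameters whose names merely end in q.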

