A friendly Finnish hacker. I am a technology consultant, open source advocate and entrepreneur. My areas of expertise cover HTML5, Python, Plone, JavaScript, WebGL, UNIX and the mobile web. Mikko likes sushi and Angry Birds and dislikes winter. Mikko is a DZone MVB and is not an employee of DZone.

Power Searching Using UNIX grep

12.27.2012

UNIX grep is a command-line tool for searching for text strings inside files. (It should not be confused with find, which matches file names and properties.) This blog post gives some hints on how to use grep to search files quickly and efficiently.

Example: searching the Plone source tree for “content-core” examples in page template files

Some notes about grep

  • Grep can search multiple files and directory trees
  • Grep can be tuned to be faster
  • Grep output can be friendly and colorized

As with many UNIX tools, due to legacy and backwards compatibility, grep doesn’t do these things out of the box and simply provides you with a plain, bare-bones interface.

1. Install GNU grep

GNU grep supports plenty of options, such as better coloring, that are missing from BSD grep, which ships with BSD-based operating systems like OSX. You can install GNU grep from the grep package of MacPorts. See the ztanesh README for an example sudo port install command.
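To see which grep comes first on your PATH, a small check along these lines may help. This is only a sketch: the function name is_gnu_grep is made up for this illustration, and it assumes that GNU grep identifies itself with the text "GNU grep" in its --version output.

```shell
# Hypothetical helper (the name is ours, not from ztanesh): succeeds when
# the grep found first on PATH reports itself as GNU grep.
is_gnu_grep() {
  command grep --version 2>/dev/null | command grep -q "GNU grep"
}

if is_gnu_grep; then
  echo "GNU grep detected"
else
  echo "Not GNU grep; on OSX consider installing the MacPorts grep package"
fi
```

If the check fails right after installing, the new binary is probably not ahead of the system grep on your PATH.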

2. Searching multiple files

Below is an example of a case-insensitive (-i), recursive (-R) search of a folder that only includes (--include) .py files. In other words, it searches all Python files in the source tree for the word “foobar”:

grep -Ri --include="*.py" foobar ~/code/mixnap/krusovice-src
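If you want to try the same flags without touching a real project, the following sketch builds a throwaway tree first. The file names and the demo_dir and matches variables are invented for this example.

```shell
# Build a throwaway tree so the command can be tried safely.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/pkg"
printf 'def FooBar():\n    pass\n' > "$demo_dir/pkg/module.py"
printf 'foobar in a text file\n' > "$demo_dir/pkg/notes.txt"

# -R recurses, -i ignores case, --include limits the search to *.py files.
# module.py matches despite the different case; notes.txt is never opened.
matches=$(grep -Ri --include="*.py" foobar "$demo_dir")
echo "$matches"

# Clean up the temporary tree.
rm -rf "$demo_dir"
```

Only the line from module.py should be listed, because --include filters on the file name before the contents are searched.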

3. Using colors

You can colorize parts of the grep output, such as the file name, the line number, the highlighted match and the context lines around the result.

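Below is my example of setting the GREP_OPTIONS and GREP_COLORS environment variables. They must be exported so that grep, which runs as a child process, can see them:

export GREP_OPTIONS="--color=always"
export GREP_COLORS="ms=01;37:mc=01;37:sl=:cx=01;30:fn=35:ln=32:bn=32:se=36"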

Note: Use GREP_COLORS, not the deprecated GREP_COLOR environment variable, as the former provides many more options.
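Later GNU grep releases are reported to have deprecated GREP_OPTIONS, so it may be safer to pass --color explicitly. The sketch below does this with a small wrapper function; the name cgrep is made up for this example, and the color string is the one shown above.

```shell
# Hypothetical wrapper (the name cgrep is ours): applies the colors to a
# single grep invocation and passes --color explicitly, so it does not
# depend on GREP_OPTIONS being honored.
cgrep() {
  GREP_COLORS="ms=01;37:mc=01;37:sl=:cx=01;30:fn=35:ln=32:bn=32:se=36" \
    command grep --color=always "$@"
}

# The matching line is printed with the match wrapped in color escape codes.
printf 'alpha\nfoobar\n' | cgrep foobar
```

Because --color=always emits escape codes even when the output is piped, prefer plain grep when feeding results into other programs.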

4. Search as ASCII

By default, grep decodes input files using the encoding set in your locale environment variables. This costs CPU cycles. If you are searching for a plain ASCII match, as is typical with programming language source code files, you can gain a lot of speed by disabling the decoding. Override the LC_CTYPE environment variable when running grep:

LC_CTYPE=POSIX grep....

The slowdown is a GNU grep bug that was fixed in version 2.7.
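One caveat: if LC_ALL is set in your environment, it takes precedence over LC_CTYPE, and the override above has no effect. The sketch below therefore sets LC_ALL instead, which is a substitution on our part rather than the setting used in this post. The wrapper name ascii_grep is also made up for this example.

```shell
# Hypothetical wrapper (the name ascii_grep is ours). LC_ALL, when set,
# overrides LC_CTYPE, so we set LC_ALL to force the POSIX locale for this
# one grep invocation only.
ascii_grep() {
  LC_ALL=POSIX command grep "$@"
}

# Behaves like plain grep for ASCII input; only the first line matches.
printf 'plain ascii line\nanother line\n' | ascii_grep -i "ASCII"
```

Whether this is any faster depends on your grep version and locale, so time it on your own source tree before relying on it.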

5. Show lines around the match

You can specify the --before-context and --after-context options, which show the text surrounding the matching line. The --line-number switch is also very useful when dealing with source code files.
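As a minimal illustration of these switches, the following sketch searches seven lines of made-up input and asks for one line of context on each side. The variable name context_out is ours.

```shell
# Seven lines of input; the match is on line 4.
context_out=$(printf 'one\ntwo\nthree\nfoobar\nfive\nsix\nseven\n' |
  grep --line-number --before-context=1 --after-context=1 foobar)

# Matching lines are numbered "N:" and context lines "N-".
echo "$context_out"
```

With GNU grep this should print "3-three", "4:foobar" and "5-five" on separate lines. The short forms of the same options are -n, -B and -A.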

6. ZSH shell search alias

This wraps it all together. We define a ZSH function, search, which gives us a shortcut for searching multiple files in a folder tree:

# Search for an ASCII string in multiple files under the current working directory
# E.g.
# search "foobar" "*.html"
# search "foobar" "*.html" myfolder
# By default we exclude dotted files and directories (.git, .svn)
function search() {

  # The pattern is mandatory; print usage and fail if it is missing
  if [[ -z "$1" ]] ; then
    echo "Usage: search \"pattern\" \"*.filemask\" \"path\""
    return 1
  fi

  # Did we get a path argument? Default to the current directory
  local search_path
  if [[ -z "$3" ]] ; then
    search_path="."
  else
    search_path="$3"
  fi

  # LC_CTYPE=POSIX reportedly speeds up ASCII search about 20x
  # https://twitter.com/jlaurila/status/86750682094374912

  # We use specially tuned GREP colors - make sure you have GNU grep on OSX
  # https://github.com/miohtama/ztanesh/blob/master/README.rst

  GREP_COLORS="ms=01;37:mc=01;37:sl=:cx=01;30:fn=35:ln=32:bn=32:se=36" LC_CTYPE=POSIX \
  grep -Ri "$1" --line-number --before-context=3 --after-context=3 --color=always --include="$2" --exclude=".*" "$search_path"/*
}

This, and other ZSH goodies, are available in the ztanesh package on GitHub.

7. Turn off OS native file indexing

If you use grep as your primary search tool, I suggest you turn off your operating system's search indexing, such as OSX Spotlight. It just takes up disk space and CPU cycles.

Published at DZone with permission of Mikko Ohtamaa, author and DZone MVB. (source)
