Some tools look like you will never replace them.
One of those (for me) is grep. It does what it does very
well (remarks about the shortcomings of regexen in general aside).
It works reasonably well with Unicode/UTF-8 (a great opportunity to
Fail Miserably for any tool, viz. a2ps).
Yet, the other day I read
about ack, which claims to
be "better than grep, a search tool for programmers". Woo. Better
than grep? In what way?
The ack homepage lists
the top ten reasons why one should use it instead of grep.
Actually, it's thirteen reasons but then some are dupes. So I'd say
"about ten reasons". Let's look at them in order.
-
It's blazingly fast because it only searches the stuff you want
searched.
Wait, how does it know what I want? A DWIM-Interface at last? Not
quite. First off, ack is faster than grep
for simple searches. Here's an example:
$ time ack 1Jsztn-000647-SL exim_main.log >/dev/null
real 0m3.463s
user 0m3.280s
sys 0m0.180s
$ time grep -F 1Jsztn-000647-SL exim_main.log >/dev/null
real 0m14.957s
user 0m14.770s
sys 0m0.160s
Two notes: first, yes, the file was in the page cache before I
ran ack; second, I even made it easy for grep by telling
it explicitly I was looking for a fixed string (not that it helped
much, the same command without -F was faster by about
0.1s). Oh and for completeness, the exim logfile I searched has
about two million lines and is 250M. I've run those tests ten times
for each; the times shown above are typical.
So yes, for simple searches, ack is faster than grep.
Let's try with a more complicated pattern, then. This time, let's
use the pattern (klausman|gentoo) on the same file. Note
that we have to use -E for grep to use extended
regexen, which ack in turn does not need, since it
(almost) always uses them. Here, grep takes its sweet
time: 3:56, nearly four minutes. In contrast, ack
accomplished the same task in 49 seconds (all times averaged over
ten runs, then rounded to integer seconds).
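If you want to repeat the comparison on your own logs, the procedure above (warm the cache, run ten times, average, round to whole seconds) can be sketched roughly as follows. The helper name run_n and the tiny stand-in log are my own placeholders, not anything from grep or ack, and date +%s is assumed to be available (it is on GNU and BSD systems). Put ack in place of grep to time the other side.

```shell
# run_n is a hypothetical helper: run a command N times, discard its output.
run_n() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null
        i=$((i + 1))
    done
}

# Small stand-in for the 250M exim log; point "log" at a real file instead.
log=$(mktemp)
printf '%s\n' 'mail for klausman' 'unrelated line' 'list gentoo-dev' >"$log"

# Read the file once so it sits in the page cache before timing starts.
cat "$log" >/dev/null

for pattern in 'klausman' '(klausman|gentoo)'; do
    start=$(date +%s)
    run_n 10 grep -E "$pattern" "$log"
    end=$(date +%s)
    # Integer division: average over ten runs, rounded down to seconds.
    echo "$pattern: $(( (end - start) / 10 ))s per run"
done
```

On a three-line file both patterns will report 0s, of course; the difference only shows up on a file of serious size.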
As for the "being clever" side of speed, see points 5 and 6 below.
-
ack is pure Perl, so it runs on Windows just fine.
This isn't relevant to me, since I don't use Windows for
anything where I might need grep. That said, it might be a killer
feature for others.
-
The standalone version uses no non-standard modules, so you can
put it in your ~/bin without fear.
Ok, this is not so much a feature as a hard criterion. If I
needed extra modules for the whole thing to run, that'd be a deal
breaker. I already have tons of libraries, I don't need more
undergrowth around my dependency tree.
-
Searches recursively through directories by default, while
ignoring .svn, CVS and other VCS directories.
This is a feature, yet one that wouldn't pry me away from grep:
-r is there (though it distinctly feels like an
afterthought). Since ack ignores a certain set of files
and directories, its recursive capabilities were there from the
start, making it feel more seamless.
-
ack ignores most of the crap you don't want to search
To be precise:
- VCS directories
- blib, the Perl build directory
- backup files like foo~ and #foo#
- binary files, core dumps, etc.
Most of the time, I don't want to search those (and have to
exclude them with grep -v from find results). Of
course, this ignore-mode can be switched off with ack
(-u). All that said, it sure makes command lines shorter
(and easier to read and construct). Also, this is the first spot
where ack's Perl-centrism shows. I don't mind, even though I
prefer that other language with P.
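For comparison, this is roughly the find-plus-grep -v dance that ack saves you from. It is only a sketch: the function name search_tree and the throwaway tree are made up for illustration, the exclusion patterns cover just the VCS, blib and backup cases from the list above (not binaries or core dumps), and filenames containing newlines would break it.

```shell
# Build a small throwaway tree with the kinds of files ack skips.
tree=$(mktemp -d)
mkdir -p "$tree/src" "$tree/.svn" "$tree/CVS" "$tree/blib"
echo 'needle' >"$tree/src/main.c"
echo 'needle' >"$tree/src/main.c~"
echo 'needle' >"$tree/.svn/entries"
echo 'needle' >"$tree/CVS/Entries"
echo 'needle' >"$tree/blib/lib.pm"

# search_tree DIR PATTERN (hypothetical helper):
# list files, drop VCS/blib directories and foo~ / #foo# backups,
# then print the names of the remaining files that match PATTERN.
search_tree() {
    find "$1" -type f \
        | grep -v -E '/(\.svn|CVS|blib)/' \
        | grep -v -E '(~|/#[^/]*#)$' \
        | while IFS= read -r f; do grep -l -- "$2" "$f"; done
}

search_tree "$tree" needle
```

Only src/main.c survives the filters here, which is about what a plain ack needle in that directory is meant to give you without any of the plumbing.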
-
Ignoring .svn directories means that ack is faster than grep
for searching through trees.
Dupe. See point 5.
-
Lets you specify file types to search, as in --perl or
--nohtml.
While at first glance, this may seem limited, ack comes
with a plethora of definitions (45 if I counted correctly), so it's
not as Perl-centric as it may seem from the example. This feature
saves command-line space (if there's such a thing), since it avoids
wild find-constructs. The docs mention that --perl also
checks the shebang line of files that don't have a suffix, but make
no mention of the other "shipped" file type recognizers doing
so.
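To make the shebang fallback concrete, here is a minimal sketch of how such a type check could work. This is my guess at the idea, not ack's actual code: the function name is_perl and the suffix list (.pl, .pm, .t) are assumptions for illustration.

```shell
# is_perl FILE (hypothetical helper): succeed if FILE looks like Perl.
is_perl() {
    name=${1##*/}
    case $name in
        *.pl|*.pm|*.t) return 0 ;;   # assumed suffix list
        *.*) return 1 ;;             # some other suffix: not Perl
    esac
    # No suffix at all: fall back to the shebang line.
    head -n 1 "$1" | grep -q -E '^#!.*perl'
}

# Throwaway files to try it on.
d=$(mktemp -d)
printf '#!/usr/bin/perl\nprint "hi\\n";\n' >"$d/script"
printf 'print "hi\\n";\n' >"$d/tool.pl"
printf '#!/bin/sh\necho hi\n' >"$d/run"
printf 'just notes\n' >"$d/notes.txt"

for f in "$d"/*; do
    if is_perl "$f"; then
        echo "$f"
    fi
done
```

This prints the paths of script and tool.pl and skips the shell script and the text file.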
-
File-filtering capabilities usable without searching with ack
-f. This lets you create lists of files of a given type.
This mostly is a consequence of the feature above. Even if it
weren't there, you could simply search for ".".
-
Color highlighting of search results.
While I've looked upon color in shells as kinda childish for a
while, I wouldn't want to miss syntax highlighting in vim, colors
for ls (if they're not as sucky as the defaults we had for years)
or match highlighting for grep. It's really neat to see that yes,
the pattern you grepped for indeed matches what you think it does.
Especially during evolutionary construction of command lines and
shell scripts.
-
Uses real Perl regular expressions, not a GNU subset
Again, this doesn't bother me much. I use
egrep/grep -E all the time, anyway. And I'm no
Perl programmer, so I don't get withdrawal symptoms every time I
use another regex engine.
-
Allows you to specify output using Perl's special
variables
This sounds neat, yet I don't really have a use case for
it. Also, my perl-fu is weak, so I probably won't use it anyway.
Still, might be a killer feature for you.
The docs have an example:
ack '(Mr|Mr?s)\. (Smith|Jones)' --output='$&'
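In Perl, $& holds the text matched by the last pattern, so the example prints just the matched names. The closest thing I know of on the grep side is -o (a GNU/BSD extension, not POSIX), which prints only the matching part of each line. It is not a full equivalent, since it cannot format the output any further. A small sketch, with a made-up function name and made-up input:

```shell
# titles (hypothetical helper): print only the matched "title + name" parts of stdin.
titles() {
    grep -o -E '(Mr|Mr?s)\. (Smith|Jones)'
}

printf '%s\n' \
    'Dear Mr. Smith,' \
    'cc: Mrs. Jones and Ms. Smith' \
    'no names here' \
    | titles
# prints:
#   Mr. Smith
#   Mrs. Jones
#   Ms. Smith
```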
-
Many command-line switches are the same as in GNU grep:
Specifically mentioned are -w, -c and
-l. It's always nice if you don't have to look up all the
flags every time.
-
Command name is 25% fewer characters to type! Save days of
free-time! Heck, it's 50% shorter compared to grep -r
Okay, now we have proof that the ack webmaster not only can't
count, he's also making up reasons for fun. Works for me.
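For the pedants, the percentages are easy to check with shell arithmetic. The helper name saved is made up for this sketch; it uses integer division, so results are rounded down.

```shell
# saved SHORT LONG (hypothetical helper):
# percent fewer characters in SHORT than in LONG, rounded down.
saved() {
    echo $(( (${#2} - ${#1}) * 100 / ${#2} ))
}

saved ack grep        # prints 25
saved ack 'grep -r'   # prints 57 (the space counts as a keystroke)
saved ack 'grep-r'    # prints 50 (space not counted)
```

So the 25% holds, and the 50% only works out if you don't count the space.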
Bottom line: yes, ack is an exciting new tool which
partly replaces grep. That said, a drop-in replacement it ain't.
While the standalone version of ack needs nothing but a Perl
interpreter and its standard modules, that may not work out on
embedded systems (vs. a binary with no deps besides libc). This
might also be an issue if you need grep early on during
boot and /usr (where your perl resides) isn't mounted yet. Also,
default behaviour is divergent enough that it might yield nasty
surprises if you just drop in ack instead of grep. Still, I
recommend giving ack a try if you ever use grep
on the command line. If you're a coder who often needs to search
through working copies/checkouts, even more so.
Update
I've written
a followup on this, including some tips for day-to-day usage
(and an explanation of grep's sucky performance).
Original link: http://blog.i-no.de//archives/20...