Performance measurements of brute-force search for a substring in lines - Arkanis Development
Stuff about programming, technology, life… err… how about the universe?
http://arkanis.de/weblog.xml
2021-01-03T04:16:23+01:00
Comment by Marc Seeger
http://arkanis.de/weblog/2020-12-28-performance-measurements-of-brute-force-search-for-a-substring-in-lines#comment-2021-01-03-04-16-23-marc-seeger
2021-01-03T04:16:23+01:00
<p>Woah! I knew ripgrep was a bit faster, but this difference is quite impressive.
I actually work with some of the coreutils maintainers, but I only found some older posts about the perf differences.</p>
<p>tl;dr from a 2016 post: grep supports things like back-references and full PCRE, and apparently the code that handles non-UTF-8 multibyte strings makes things a bit more complex. Without it, though, many people in at least Japan and Korea might have issues <span class="smiley smile">:)</span></p>
<p>Either way: Always love it when your blog pops up in my feed reader <span class="smiley smile">:)</span></p>
Comment by Stephan
http://arkanis.de/weblog/2020-12-28-performance-measurements-of-brute-force-search-for-a-substring-in-lines#comment-2020-12-30-06-32-16-stephan
2020-12-30T06:32:16+01:00
<p>Hi Marc, long time no see. Thanks for dropping by. <span class="smiley smile">:)</span></p>
<p>The original text files are already gone, but the gen-lines.rb script is deterministic, so it was no pain to generate them again. I gave it a spin and added the numbers for ripgrep, Silver Searcher and ack to the post (see the update at the end).</p>
<p>The blog posts about ripgrep are pretty interesting. And its performance is equally impressive. As are the capabilities of ack and Silver Searcher.</p>
Comment by Marc Seeger
http://arkanis.de/weblog/2020-12-28-performance-measurements-of-brute-force-search-for-a-substring-in-lines#comment-2020-12-29-17-07-34-marc-seeger
2020-12-29T17:07:34+01:00
<p>Fun post!
If you happen to have the text files still sitting around, I would LOVE to know how</p>
<p>ripgrep <a href="https://blog.burntsushi.net/ripgrep/">https://blog.burntsushi.net/ripgrep/</a>
ag aka the silver searcher: <a href="https://geoff.greer.fm/ag/">https://geoff.greer.fm/ag/</a>
ack <a href="https://beyondgrep.com/">https://beyondgrep.com/</a></p>
<p>are holding up.
Most of them are optimized for recursive searches, but ripgrep in particular has some impressive "single file" benchmarks: <a href="https://blog.burntsushi.net/ripgrep/#single-file-benchmarks">https://blog.burntsushi.net/ripgrep/#single-file-benchmarks</a></p>