Yes, all longest regex matches in linear time is possible
Comments
nextaccountic
ieviev
Sorry, finished the post just now with more comparisons on other inputs
The reason is just that the normal mode is faster in average non pathological cases
tracnar
Could you have a heuristics based on the input size and the pattern to decide what to use?
ieviev
Yes, this is entirely possible. you can even explore the automaton eagerly and detect if it's possible to loop from an accepting state to a nonaccepting one.
Exciting stuff for future work
nextaccountic
Ripgrep does something like thhis. It has a meta regex engine that switches engine when it finds what looks like pathological cases (or rather, the regex-automata crate does, which is used by the regex crate, which powers ripgrep).
https://docs.rs/regex-automata/latest/regex_automata/meta/st...
Ripgrep in turn exposes some knobs to tweak the heuristics
https://github.com/BurntSushi/ripgrep/blob/master/FAQ.md#how...
> input size normal hardened speedup w/ hardened
> 1,000 0.7ms 28us 25x
> 5,000 18ms 146us 123x
> 10,000 73ms 303us 241x
> 50,000 1.8s 1.6ms 1,125x
Why is there a normal mode if hardened mode is faster for all input sizes?