Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Yeah, realized this later. Maybe the trie (or whatever) index can have a concept of "any single character" built in to it. This should be easier than making into a full automata.


http://blog.notdot.net/2010/07/Damn-Cool-Algorithms-Levensht...

See the part about how you would intersect a trie and a levenshtein automata and find the ones that must be in both. It's not as hard as one would think

(Tries are already DFA)

You can also do it for other types of indexes.


The simple way to do this is to have a default child in the trie. But even then, it can get absurdly large. Look at something like "aaaa"(4), for instance.


That's basically what this is, except you do it for the query rather than for the index, and you build the structure more intelligently than just inserting all possible insertions/deletions/substitutions. If you wanted to make an index like that you could build a Levenshtein DFA for each term in the index, and then union all those together.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: