We have started porting it to Antlr to make it language-agnostic, though we are still open to looking at other options as well. Does anyone have good experience with this or ideas for alternative approaches?
There are quite a few components so common that people refer to them only by their part number (ESP8266, 555, WS2812, etc.) - are you planning to support things like that too?
Right now it looks like only common/standard parts are parseable (resistors, LEDs, etc.).
I think once you are entering actual manufacturer part numbers you are better off deferring this to a parts database or search engine. In my uses I have been deferring this to an Octopart search. You can see this working on e.g. https://bom-builder.kitspace.org (enter something into the description field and click around on the wand icons).
We are planning to include standard part naming schemes such as Pro Electron [1] and JEDEC [2] to recognize common diodes and transistors where it makes sense. The more interesting approach to me is to actually try to recognize transistor descriptions such as "npn sot23 125mW beta=200 ic>=200mA". There is some work in progress, by dvc94ch, for both approaches (if I recall correctly) on the antlr branch: https://github.com/monostable/electro-grammar/pull/4
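To make the transistor-description idea concrete, here is a rough Python sketch of how a string like "npn sot23 125mW beta=200 ic>=200mA" could be broken into fields. It is not electro-grammar's implementation or output format: the package list, the parameter names (beta, hfe, ic, vce) and the result keys are all assumptions made up for illustration, and a real grammar would need to handle far more variation.

```python
import re

# Hypothetical vocabularies and field names, not electro-grammar's own.
POLARITIES = {"npn", "pnp"}
PACKAGES = {"sot23", "sot223", "to92", "to220"}

# SI prefixes used to normalise values such as "125mW" to base units.
SI = {"u": 1e-6, "m": 1e-3, "": 1.0, "k": 1e3}

# A number, an optional SI prefix and a unit, e.g. "125mW" or "200mA".
QUANTITY = re.compile(r"^(\d+(?:\.\d+)?)(u|m|k)?(W|A|V)$")
# A named parameter with a comparison, e.g. "beta=200" or "ic>=200mA".
CONSTRAINT = re.compile(r"^(beta|hfe|ic|vce)(>=|<=|=|>|<)(.+)$", re.IGNORECASE)


def parse_value(text):
    """Return a float for '200' or '200mA' style values, or None."""
    m = QUANTITY.match(text)
    if m:
        return float(m.group(1)) * SI[m.group(2) or ""]
    try:
        return float(text)
    except ValueError:
        return None


def parse_transistor(description):
    """Collect whatever fields can be recognised; unknown tokens are skipped."""
    result = {"type": "transistor"}
    for token in description.split():
        lower = token.lower()
        if lower in POLARITIES:
            result["polarity"] = lower
        elif lower in PACKAGES:
            result["package"] = lower
        elif (m := CONSTRAINT.match(token)):
            name, op, raw = m.groups()
            value = parse_value(raw)
            if value is not None:
                result[name.lower()] = (op, value)
        elif (m := QUANTITY.match(token)) and m.group(3) == "W":
            # A bare wattage is taken to be the power rating.
            result["power"] = parse_value(token)
    return result
```

Calling parse_transistor("npn sot23 125mW beta=200 ic>=200mA") yields a dict with the polarity, the package, the power in watts, and an (operator, value) pair for each constraint. Splitting on whitespace is the weakest part of this sketch; a generated parser from a proper grammar would not depend on it.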
That gets pretty fuzzy, though. For example, does ESP8266 refer to the actual chip itself (unlikely, at least in a hobbyist context) or one of the variety of modules? These general designations aren't really referring to specific parts, so much as classes of parts. That's probably not so useful here, given that this looks to be focused on parsing BOMs for ordering purposes.
The last time I looked at Antlr and tried to use it, it was a pain: version mismatches and incompatible changes. Plain old bison/yacc/lex or some of the newer Rust parsing libraries would probably be simpler and easier to maintain.
Yeah, dvc94ch, who wrote most of what's been done for v2 so far, was suggesting Rust and then WebAssembly. I'm kind of warming to it after using Antlr for a while. We both really need JS versions though, so I am still not sold. Compiling with Emscripten might be an option but it could be a lot of pain as well.
I’ve just tinkered with wasm and Rust, but it was surprisingly straightforward. It seems there is a huge amount of effort being put into the ecosystem. Great project so far! I’ll have to check out the bom tools.
Emscripten can compile to asm.js (as well as WebAssembly), and asm.js is just a subset of JS. CanIUse will tell you some browsers don't support it, but it really means they don't support accelerating it. Your code may very well still run in those browsers (but I'm not sure what determines whether it does).
(I'm not joking. I've been messing with parsers and such in Prolog the past week or so and I'm just blown away. I'm happy but also I feel stupid for not learning it sooner. I've wasted so much time and effort. I've been working too hard. Just learn Prolog and then use it, the total time saved will pay for the learning curve by the end of the year.)
I am reading up on this a bit and it seems very interesting but I am not quite sure it would be able to fill the gap that something like Antlr fills. We want to generate parsers in various languages (JS, Python, Go) from a single grammar and existing implementations of "Parsing with Derivatives" seem to be experimental and mostly in functional languages.
Prolog is really interesting too but how would it work in practice for our use case?
Ah, I misunderstood what you were asking about. My suggestions don't really apply so much to your case.
You might take a look at the Meta-II metacompiler. It's really simple, and by the same token it's trivial to port to different languages. It's part of the basis for VPRI's PEG (Parsing Expression Grammar) parser generators.
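To give a feel for why this family of parsers ports so easily, here is a small Python sketch of PEG-style building blocks (literal, sequence, ordered choice, one-or-more). This is neither Meta-II itself nor VPRI's code, just an illustration under my own assumptions; the tiny "resistance" grammar and its unit list are made up.

```python
# Each parser is a function: (text, pos) -> new position, or None on failure.

def lit(s):
    """Match the literal string s at the current position."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def seq(*parsers):
    """Match every parser in order, threading the position through."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def choice(*parsers):
    """Ordered choice: the first alternative that matches wins."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse


def many1(p):
    """Match p one or more times, greedily."""
    def parse(text, pos):
        end = p(text, pos)
        if end is None:
            return None
        while True:
            nxt = p(text, end)
            if nxt is None or nxt == end:
                return end
            end = nxt
    return parse


# A made-up grammar: resistance <- digit+ ("ohm" / "k" / "M")
digit = choice(*[lit(c) for c in "0123456789"])
number = many1(digit)
unit = choice(lit("ohm"), lit("k"), lit("M"))
resistance = seq(number, unit)


def matches(parser, text):
    """True if the parser consumes the whole input."""
    return parser(text, 0) == len(text)
```

With these definitions, matches(resistance, "100k") succeeds while matches(resistance, "100") fails because the unit is missing. The whole thing is a handful of closures, so rewriting it in JS or Go is mostly mechanical, though that is hand-porting a small library per language rather than generating parsers from one grammar file.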
Nearley is amazing. The authors even implemented Joop Leo's optimization. If the other published optimizations were implemented as well, Nearley would probably become the Earley parser implementation. There is an issue suggesting this but it's marked as wontfix.
Maybe they are not interested in optimization for the sake of optimization? For my use case it has been plenty fast anyway (not that it's a particularly intensive use case).
Working with Nearley was a fun experience and I recommend the blog post that got me started with it to anyone who is interested in writing a parser in JS: https://medium.com/@gajus/parsing-absolutely-anything-in-jav...