For those wondering about the use case: this is very useful when streaming structured output from LLMs, such as JSON responses. I needed something performant for my local Raspberry Pi agent and have been using streaming-json-js [1], but its development appears to have been a bit dormant over the past year. I'll definitely take a look at your jsonriver and see how it compares!
Do any LLMs support constrained generation of newline-delimited JSON? Or have you found that they're generally reliable enough that you don't need constrained sampling?
[1] https://github.com/karminski/streaming-json-js
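
To make the newline-delimited JSON question above concrete, here is a minimal TypeScript sketch of consuming such a stream chunk by chunk. It is not taken from jsonriver or streaming-json-js; `NdjsonBuffer`, `push`, and `flush` are hypothetical names made up for illustration, and the sketch assumes the model emits one compact JSON value per line (no pretty-printing, so raw newlines only appear between records).

```typescript
// Hypothetical helper for illustration; not part of any library mentioned above.
class NdjsonBuffer {
  // Text received so far that does not yet end in a newline.
  private pending = "";

  // Feed one text chunk; returns the values of every line this chunk completed.
  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an unfinished line ("" if the chunk ended on "\n").
    this.pending = lines.pop() ?? "";
    const values: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      // Skip blank lines; JSON.parse throws on a malformed record.
      if (trimmed !== "") values.push(JSON.parse(trimmed));
    }
    return values;
  }

  // Call once at end of stream to parse a final line with no trailing newline.
  flush(): unknown[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest === "" ? [] : [JSON.parse(rest)];
  }
}

// Usage: chunk boundaries fall in the middle of records, as they would
// with token-by-token LLM output.
const demoBuffer = new NdjsonBuffer();
const demoChunks = ['{"a":', '1}\n{"b"', ':2}\n{"c":3}'];
for (const chunk of demoChunks) {
  for (const value of demoBuffer.push(chunk)) console.log(value);
}
for (const value of demoBuffer.flush()) console.log(value);
```

The trade-off versus an incremental parser is that nothing is available until a whole line has arrived, so this only helps when each record is small; it also does nothing to guarantee the model actually produces valid lines, which is where constrained sampling would come in.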