
It's a shame they couldn't use YAML instead. I compared them and YAML uses about 20% fewer tokens. However, I can understand accuracy, derived from frequency, being more important than token budget.


I think YAML actually uses more tokens than JSON without indents, especially with deep data. For example "," being a single token makes JSON quite compact.

You can compare JSON and YAML on https://platform.openai.com/tokenizer


I would imagine JSON is easier for an LLM to understand (and for humans!) because it doesn't rely on indentation and confusing syntax for lists, strings, etc.


It's a lot more straightforward to use JSON programmatically than YAML.


If you are using any kind of type checking instead of blindly trusting generated JSON, it's exactly the same amount of work.


It really shouldn't be, though. That is, not unless you're parsing or emitting it ad hoc, for example by assuming that an expression like:

  "{" + $someKey + ":" + $someValue + "}"

produces valid JSON. It does - sometimes - and then it's indeed easier to work with. It'll also blow up in your face. Using JSON the right way - via a proper parser and serializer - should be identical to using YAML or any other equivalent format.
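To make the failure mode concrete, here is a minimal Python sketch comparing ad hoc concatenation with a real serializer. The helper names (naive_object, safe_object, is_valid_json) and the sample values are made up for illustration, and the naive version adds the quotes that the expression above leaves out.

```python
import json

def naive_object(key, value):
    # Ad hoc concatenation: no escaping of the key or the value.
    return '{"' + key + '":"' + value + '"}'

def safe_object(key, value):
    # A proper serializer escapes quotes, backslashes, and control characters.
    return json.dumps({key: value})

def is_valid_json(text):
    # True if the text parses as JSON, False otherwise.
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

plain = naive_object("name", "Ada")               # happens to be valid
broken = naive_object("quote", 'She said "hi"')   # embedded quotes break it
fixed = safe_object("quote", 'She said "hi"')     # escaped by the serializer

print(is_valid_json(plain), is_valid_json(broken), is_valid_json(fixed))
```

The concatenated version only works while the inputs contain nothing that needs escaping, which is the "sometimes" part.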


Even if the APIs for both were equally simple, modules for manipulating JSON are way more likely to be available in the stdlib of whatever language you’re using.


JSON can be minified.
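For anyone who wants to try this, a small sketch using Python's standard json module. The sample data is made up, and character count is only a rough proxy here; actual token counts depend on the tokenizer.

```python
import json

# Hypothetical sample data, just for illustration.
data = {"users": [{"name": "Ada", "tags": ["a", "b"]}], "ok": True}

# Indented output, the way JSON is usually shown to humans.
pretty = json.dumps(data, indent=2)

# Minified output: no newlines, and no spaces after "," or ":".
minified = json.dumps(data, separators=(",", ":"))

print(len(pretty), len(minified))
print(minified)
```

Both strings parse back to the same data, so the whitespace is pure overhead for a machine reader.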



