
A neat way of dealing with sparse input is to take the entire chat history (if any) into account and ask the LLM to expand the query so that the semantic search has more to work with. More generally, using the LLM to enrich the user query with context from the previous conversation, or having it produce an entirely fake document from the sparse query, can work well to improve the vectors you use in the similarity search. The main concern with this strategy is latency, since it adds another generation hop before you can query the vector DB. A rough sketch of that hop is below.
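To make that concrete, here is a minimal sketch of the expansion hop, assuming the OpenAI Python SDK; the model names, the prompt wording, and the vector_db.search helper are all placeholders, not anything specific from this thread:

    # Sketch: expand a sparse query with chat history before embedding it.
    # Assumptions: OpenAI Python SDK; `vector_db` is a hypothetical client
    # exposing a search(embedding) method; model names are placeholders.
    from openai import OpenAI

    client = OpenAI()

    def expand_query(query: str, chat_history: list[dict]) -> str:
        # Ask the LLM to fold prior conversation context into the query.
        messages = chat_history + [{
            "role": "user",
            "content": "Rewrite this search query so it is self-contained, "
                       "using any relevant context from our conversation: "
                       + query,
        }]
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
        )
        return resp.choices[0].message.content

    def retrieve(query: str, chat_history: list[dict], vector_db):
        expanded = expand_query(query, chat_history)  # the extra generation hop
        emb = client.embeddings.create(
            model="text-embedding-3-small",
            input=expanded,
        ).data[0].embedding
        return vector_db.search(emb)

The latency cost is visible in retrieve(): one full chat completion has to finish before the embedding lookup can even start.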


Interesting. Do you have specific examples or a link to a post detailing this?


The approach is based on hypothetical document embeddings (HyDE). Here is a good description of it in the context of langchain: https://python.langchain.com/docs/use_cases/question_answeri...
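Roughly, the LangChain version looks like this; treat it as a sketch, since the exact import paths and the "web_search" prompt key depend on which LangChain version you're on:

    # Sketch of HyDE via LangChain; import paths vary by version,
    # so this is illustrative rather than exact.
    from langchain.chains import HypotheticalDocumentEmbedder
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.llms import OpenAI

    base_embeddings = OpenAIEmbeddings()
    llm = OpenAI()

    # The LLM writes a hypothetical answer document; that document
    # (not the raw query) is what gets embedded for similarity search.
    embeddings = HypotheticalDocumentEmbedder.from_llm(
        llm, base_embeddings, prompt_key="web_search"
    )
    vector = embeddings.embed_query("What did the president say about inflation?")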

The original paper proposing this technique can be found here: https://arxiv.org/pdf/2212.10496.pdf



