Looking for ideas on chunking reservation-related conversations and context-aware RAG
Hey guys, a few meetings back I briefly discussed the problem I am trying to solve. I am stuck on how best to create the vector store, as simply chunking all messages with some kind of sentence splitting does not work well.

I manage a property portfolio on platforms like Airbnb, handling customer support through the entire guest journey (pre-booking to post-stay). I'm building a RAG system to help automate responses to guest inquiries.

Here are the questions I have, with some business context below.

## Technical Questions

1. **Vector Database Strategy**
   - How to structure embeddings for different information types?
   - Chunking Strategy Challenges:
     - Single message chunks: Lose conversation context
     - Multi-message chunks: How many messages maintain coherence?
     - Entire conversation chunks: May be too broad for specific queries
   - How to preserve booking context (guest state, property details) within chunks?
   - Should property-specific and global information be in separate vector spaces?
   - How to handle property hierarchies in vector search?

2. **Temporal Relevance**
   - How to weight conversation recency differently based on query type?
   - How to combine current property documents with historical conversations?

3. **Context-Aware Retrieval**
   - How to incorporate guest journey state into the retrieval process?
   - How to handle property relationships (e.g., similar apartments sharing info)?
   - How to balance property-specific vs. global policy information?

4. **Security and Policy Compliance**
   - How to ensure RAG responses respect security policies based on guest journey state?
   - How to handle platform-specific rules in responses?

## Data Sources and Unique Challenges

### 1. Historical Conversations (around 10,000 reservations over 7 years, each having 10-40 messages during the client journey)

- Stored in PostgreSQL
- Time relevance varies by query type:

```
Example A: "What's the WiFi password?"
→ Recent conversations only relevant (passwords change)