
Memberships

AI Developer Accelerator

Public • 3.8k • Free

Data Alchemy

Public • 22.2k • Free

Ecom Phenom Workshop

Private • 70 • Free

AI Developer Accelerator Pro

Private • 32 • $49/m

53 contributions to AI Developer Accelerator
CrewAI Docs in a single file.
Hey everyone! Now there's a link that contains the complete CrewAI documentation in a .txt file, so you can feed it into any LLM or agent, or even add it using '@' in Cursor. https://docs.crewai.com/llms-full.txt This should make things way easier. Same goes for Anthropic/Claude: http://docs.anthropic.com/llms-full.txt
6
7
New comment 3d ago
1 like • 6d
Very useful, thanks Bastian
Looking for ideas on chunking reservation-related conversations and context-aware RAG
Hey guys, I discussed this briefly a few meetings back: the problem I am trying to solve. I am stuck on how best to create the vector store, as simply chunking all messages with some kind of sentence splitting does not work well.

I manage a property portfolio on platforms like Airbnb, handling customer support through the entire guest journey (pre-booking to post-stay). I'm building a RAG system to help automate responses to guest inquiries. Here are my questions; some business context is below.

## Technical Questions

1. **Vector Database Strategy**
   - How to structure embeddings for different information types?
   - Chunking strategy challenges:
     - Single-message chunks: lose conversation context
     - Multi-message chunks: how many messages maintain coherence?
     - Entire-conversation chunks: may be too broad for specific queries
   - How to preserve booking context (guest state, property details) within chunks?
   - Should property-specific and global information be in separate vector spaces?
   - How to handle property hierarchies in vector search?
2. **Temporal Relevance**
   - How to weight conversation recency differently based on query type?
   - How to combine current property documents with historical conversations?
3. **Context-Aware Retrieval**
   - How to incorporate guest journey state into the retrieval process?
   - How to handle property relationships (e.g., similar apartments sharing info)?
   - How to balance property-specific vs. global policy information?
4. **Security and Policy Compliance**
   - How to ensure RAG responses respect security policies based on guest journey state?
   - How to handle platform-specific rules in responses?

## Data Sources and Unique Challenges

### 1. Historical Conversations (around 10,000 reservations over 7 years, each having 10-40 messages during the client journey)

- Stored in PostgreSQL
- Time relevance varies by query type:

```
Example A: "What's the WiFi password?" → Recent conversations only relevant (passwords change)
```
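Not a definitive answer to the question above, but one possible middle ground between single-message and whole-conversation chunks is a sliding window with overlap, with the reservation metadata copied onto every chunk so retrieval can be filtered by property or journey state. A minimal sketch (the field names like `reservation_id` are made up for illustration):

```python
def chunk_conversation(messages, window=6, overlap=2, metadata=None):
    """Slide a fixed-size window over a conversation with overlap,
    so each chunk keeps some surrounding context; copy the
    reservation metadata onto every chunk for filtered retrieval."""
    step = window - overlap
    chunks = []
    for start in range(0, max(len(messages) - overlap, 1), step):
        group = messages[start:start + window]
        if not group:
            break
        chunks.append({
            "text": "\n".join(f"{m['sender']}: {m['text']}" for m in group),
            "metadata": dict(metadata or {}),
        })
        if start + window >= len(messages):
            break
    return chunks

msgs = [{"sender": "guest", "text": f"message {i}"} for i in range(10)]
chunks = chunk_conversation(msgs, metadata={"reservation_id": "R1"})
```

Tuning `window` and `overlap` per query type (small windows for factual lookups like WiFi passwords, larger ones for policy discussions) is one way to address the coherence question.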
0
8
New comment 8d ago
1 like • 9d
How long are the largest messages you are receiving, in tokens? OpenAI 4o-mini can process 128K tokens of input (roughly 100,000 words) and output 16,000 tokens. Why do you need to chunk at all?
D&D using LLM
I remember one of the community members mentioned he was doing something with D&D. Not sure if you have seen this: https://obie.medium.com/my-kids-and-i-just-played-d-d-with-chatgpt4-as-the-dm-43258e72b2c6
0
0
Can't find your great code snippets - I wrote an app for that
I have 90+ VS Code project envs with hundreds of Python libraries. I will share the code, but here is a screenshot.
6
3
New comment 14d ago
Going from PDF to Chunks the smart way
I got asked on yesterday's call how to take a PDF into chunks for RAG in a more consistent way. The first challenge with converting any PDF file is dealing with the unique underlying way the PDF document may be formatted. Much of that formatting has no impact on the printed output, but it does have an impact if you are using Python to extract with LangChain, often making the output inconsistent, with sections wrongly aggregated for the chunking process. A better approach that has worked consistently for me is to first convert the PDF into Markdown, then convert the Markdown into chunks:

Step One:

```python
import pymupdf4llm
import pathlib

# Convert PDF to markdown
md_text = pymupdf4llm.to_markdown("input.pdf")

# Save markdown to file (optional) -- or just keep it as a string
pathlib.Path("output.md").write_bytes(md_text.encode())
```

Step Two:

```python
from langchain_text_splitters import MarkdownHeaderTextSplitter

# Define headers to split on
headers_to_split_on = [
    ("#", "Header 1"),
    ("##", "Header 2"),
    ("###", "Header 3"),
]

# Initialize splitter
markdown_splitter = MarkdownHeaderTextSplitter(headers_to_split_on)

# Split the markdown text
md_header_splits = markdown_splitter.split_text(md_text)
```
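For anyone curious what the header splitter is doing conceptually, here is a rough stdlib-only sketch (illustrative only, not the actual LangChain implementation): each chunk is the text under one heading, tagged with the heading path that leads to it.

```python
import re

def split_by_headers(md_text, max_level=3):
    """Split markdown into chunks at headings up to max_level,
    attaching the current heading path as metadata."""
    chunks = []
    path = {}   # current heading text per level, e.g. {1: "Guide", 2: "Setup"}
    body = []

    def flush():
        if any(line.strip() for line in body):
            chunks.append({"metadata": dict(path),
                           "content": "\n".join(body).strip()})
        body.clear()

    for line in md_text.splitlines():
        m = re.match(r"^(#{1,%d})\s+(.*)" % max_level, line)
        if m:
            flush()
            level = len(m.group(1))
            # Entering a new heading invalidates deeper levels
            path = {k: v for k, v in path.items() if k < level}
            path[level] = m.group(2).strip()
        else:
            body.append(line)
    flush()
    return chunks
```

This is why the Markdown detour pays off: the heading structure gives the splitter unambiguous section boundaries that raw PDF extraction rarely provides.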
2
3
New comment 14d ago
0 likes • 15d
@Tom Welsh there is a way I found; need to talk to @Brandon Hancock about turning it on
1-10 of 53
Paul Miller
4
43 points to level up
@paul-miller-1511
Co-founder of two SaaS startups. Part-time amateur Python dev working with AI. Political advocate for public infrastructure in New Zealand.

Active 2d ago
Joined Apr 28, 2024
Auckland, New Zealand.