
Memberships

Adonis Gang

Private • 147.8k • Free

Optimization Nation

Private • 212 • Free

Building in public by Daniel

Private • 9.3k • Free

AI Automation Agency Hub

Private • 32.8k • Free

ON IC (old)

Private • 26 • Paid

Data Alchemy

Public • 19.8k • Free

2 contributions to Data Alchemy
Automatically Boost your RAG Knowledgebase with Images | Free plug-n-play script to do so on auto-pilot
Hey everyone!

A few days ago, I posted an article about a Python script I developed that helped me seamlessly transcribe and integrate 3,000+ visual slides full of crucial information into my RAG Knowledgebase, completely on autopilot. Now, I’ve given the script a full makeover to make it as easy as possible to plug-and-play!

The article took off on Medium, so I knew I had to bring it here for you all to benefit:

[ARTICLE LINK + SCRIPT LINK & FREE PROMPT TEMPLATES]
[LinkedIn Post link in case you prefer it]

This all started because I was developing a RAG chatbot for a client and... I ran into a pretty big problem. All the info he provided was in slides full of text & images! I had to figure out a way to implement these slides into the Knowledgebase (and it had to work).

So I got to work and developed a script to do so on autopilot. It leverages the vision capabilities of the latest LLM models to transcribe the slides accurately. OCR just couldn’t cut it because it ignores the layout and misses out on crucial images.

But that’s not all! The script is loaded with extra features:

- SUMMARIZING & VECTORIZING: The script doesn’t just transcribe; it summarizes key concepts and creates vectors to ensure your Knowledgebase captures everything. These vectors are essential for data integration.
- FOLDER PROCESSING: It processes every subfolder in your directory, so no image gets left behind. Perfect for managing large datasets.
- SMART FILE NAMING: The script updates transcription filenames with vector counts, so you always know where you stand.
- MERGING TRANSCRIPTIONS: You can merge all transcriptions into a single file, whether by folder or into one master file, keeping your data organized and accessible.
- VECTOR COUNTING: Get a quick snapshot of your data volume with vector counts for each main folder. Great for ensuring completeness.
- VECTOR UPLOADING: Finally, it uploads all vectors to the Qdrant vector store (but you can switch to another provider with a simple code tweak).
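
The folder-processing, file-naming, and merging features listed above can be illustrated with a minimal standard-library sketch. This is not the script from the article: the function names, the `_3v` filename suffix, and the list of image extensions are all assumptions made for illustration, and the vision-model transcription and Qdrant upload steps are deliberately left out.

```python
from pathlib import Path

# Assumed slide formats; the real script may accept a different set.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def find_images(root: Path) -> list[Path]:
    """Return every image under root, including all nested subfolders."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def with_vector_count(transcript: Path, count: int) -> Path:
    """Rename a transcript so its filename carries its vector count.

    The "<name>_<count>v" pattern is a hypothetical naming convention.
    """
    target = transcript.with_name(
        f"{transcript.stem}_{count}v{transcript.suffix}"
    )
    return transcript.rename(target)


def merge_transcripts(folder: Path, output: Path) -> int:
    """Concatenate every .txt transcript under folder into one file.

    Returns the number of transcripts merged. The output file itself is
    skipped so that re-running the merge does not fold it back in.
    """
    parts = sorted(
        p for p in folder.rglob("*.txt")
        if p.resolve() != output.resolve()
    )
    merged = "\n\n".join(
        p.read_text(encoding="utf-8").strip() for p in parts
    )
    output.write_text(merged + "\n", encoding="utf-8")
    return len(parts)
```

Calling `merge_transcripts` on a single subfolder gives a per-folder file, while calling it on the top-level directory gives one master file, which mirrors the two merge modes the post describes.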
New comment 17d ago
1 like • 21d
@Marcio Pacheco hope it's helpful!
1 like • 17d
@Ana Crosatto Thomsen Glad you liked it! :)
Hello from Madrid!
Hey everyone,

Glad to be here. I’m Marcos, born and raised in Madrid, Spain. I enjoy business, productivity, fitness, reading, and developing solutions that actually help people.

I’m currently studying Business Management & Technology, but I’ve been in the entrepreneurship game since I was 14 (turning 20 this year), with a bunch of losses but some big wins as well.

I pivoted away from my last business when market trends shifted and it got harder and harder to make healthy profits. Still, I made some good money with it and started refining my Python development and business skills. Sadly, as I neared my final months in the industry, one of the biggest players, which owed me multiple five figures after months of delayed payments (may not sound like a lot to some but hey, I was younger and it was tough), went bankrupt.

With that rough end to my journey in that industry, I pivoted to AI automations and developing solutions with artificial intelligence, using my previous coding knowledge to my advantage.

Fast forward a couple of years and I’ve developed a custom-coded SaaS platform in Python to help solve student queries for driving schools all across Spain (all backend, using WhatsApp as the frontend for both messaging and user management). I’ve even been featured in some newspapers and radio programs in my country! :)

I also developed a program that automatically generates tailored, in-depth health and fitness analyses for customers using info from a form (or even imported data from fitness apps like Strong).

(Lil’ rant here: in my opinion, these tailored documents are an amazing way to add value and will surely have a place in the future of B2B and B2C sales, whether as lead magnets, value-adders, or stand-alone products.)

As for now, I’m focusing on building my brand on social media and attracting more clients.

Anyways, that’s mostly it. Hope I didn’t make my story too boring. FYI, my DMs are fully open to anyone in this community, so feel free to reach out.

Thanks for reading this far.
Hope you have an amazing rest of your day :)
New comment Jul 17
2 likes • Jul 16
@Lukas Haak thanks for the warm welcome Lukas!
1 like • Jul 16
@Marcio Pacheco thanks Marcio, appreciate your message :)
Marcos Santiago
3
41 points to level up
@marcos-santiago-3730
Entrepreneur & Developer

Active 4d ago
Joined Jul 16, 2024
Madrid, Spain