
wb

whitebill

AI & ML interests

None yet

Recent Activity

reacted to as-cle-bert's post with 🔥 4 days ago

🎉 Early New Year releases 🎉

Hi HuggingFacers 🤗, I decided to ship early this year, and here's what I came up with:

PdfItDown (https://github.com/AstraBert/PdfItDown) - If you're like me, and you have all your RAG pipeline optimized for PDFs, but not for other data formats, here is your solution! With PdfItDown, you can convert Word documents, presentations, HTML pages, markdown sheets and (why not?) CSVs and XMLs to PDF format, for seamless integration with your RAG pipelines. Built upon MarkItDown by Microsoft.
GitHub Repo 👉 https://github.com/AstraBert/PdfItDown
PyPI Package 👉 https://pypi.org/project/pdfitdown/

SenTrEv v1.0.0 (https://github.com/AstraBert/SenTrEv/tree/v1.0.0) - If you need to evaluate the retrieval performance of your text embedding models, I have good news for you 🥳🥳 The new release of SenTrEv now supports dense and sparse retrieval (thanks to FastEmbed by Qdrant) with text-based file formats (.docx, .pptx, .csv, .html, .xml, .md, .pdf) and new relevance metrics!
GitHub Repo 👉 https://github.com/AstraBert/SenTrEv
Release Notes 👉 https://github.com/AstraBert/SenTrEv/releases/tag/v1.0.0
PyPI Package 👉 https://pypi.org/project/sentrev/

Happy New Year and have fun! 🥂

Organizations

medical-ai-linkdata