wb
whitebill
AI & ML interests
None yet
Recent Activity
reacted to as-cle-bert's post with 🔥 4 days ago
Early New Year releases
Hi HuggingFacers 🤗, I decided to ship early this year, and here's what I came up with:
PdfItDown (https://github.com/AstraBert/PdfItDown) - If you're like me and have your whole RAG pipeline optimized for PDFs but not for other data formats, here is your solution! With PdfItDown, you can convert Word documents, presentations, HTML pages, markdown sheets and (why not?) CSVs and XMLs to PDF format, for seamless integration with your RAG pipelines. Built upon MarkItDown by Microsoft.
GitHub Repo: https://github.com/AstraBert/PdfItDown
PyPi Package: https://pypi.org/project/pdfitdown/
SenTrEv v1.0.0 (https://github.com/AstraBert/SenTrEv/tree/v1.0.0) - If you need to evaluate the retrieval performance of your text embedding models, I have good news for you 🥳🥳
The new release for SenTrEv now supports dense and sparse retrieval (thanks to FastEmbed by Qdrant) with text-based file formats (.docx, .pptx, .csv, .html, .xml, .md, .pdf) and new relevance metrics!
GitHub Repo: https://github.com/AstraBert/SenTrEv
Release Notes: https://github.com/AstraBert/SenTrEv/releases/tag/v1.0.0
PyPi Package: https://pypi.org/project/sentrev/
Happy New Year and have fun!
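For readers new to retrieval evaluation, here is a minimal sketch of the kind of relevance metric the post alludes to. The post does not name the metrics SenTrEv computes, so hit rate@k and mean reciprocal rank (MRR) are assumptions chosen for illustration. The function names and toy data below are hypothetical and are not part of SenTrEv's API.

```python
from __future__ import annotations

from typing import Hashable, Sequence


def reciprocal_rank(ranked_ids: Sequence[Hashable], relevant_id: Hashable) -> float:
    """Return 1/rank of the relevant item, or 0.0 if it was not retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0


def evaluate(
    runs: Sequence[tuple[Sequence[Hashable], Hashable]], k: int = 3
) -> dict[str, float]:
    """Average hit rate@k and MRR over (ranked_ids, relevant_id) pairs."""
    if not runs:
        raise ValueError("need at least one query to evaluate")
    # A query counts as a hit when its relevant item is among the top k results.
    hits = sum(1 for ranked, rel in runs if rel in ranked[:k])
    mrr = sum(reciprocal_rank(ranked, rel) for ranked, rel in runs)
    return {"hit_rate": hits / len(runs), "mrr": mrr / len(runs)}


# Hypothetical toy data: (retrieved ids in ranked order, the one relevant id).
runs = [
    (["d1", "d2", "d3"], "d1"),  # relevant item ranked first
    (["d4", "d5", "d6"], "d5"),  # relevant item ranked second
    (["d7", "d8", "d9"], "d0"),  # relevant item not retrieved
]
scores = evaluate(runs, k=3)
print(scores)
```

In practice, the ranked lists would come from querying a vector store with each embedding model under test, so the same queries can be scored across models.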
reacted to cfahlgren1's post 7 days ago
DeepSeek-V3 (https://huggingface.co/deepseek-ai/DeepSeek-V3) is very good! I have been playing with it and found that it is really good at one-shotting a pretty good landing page.
You can play with it here: https://deepseek-artifacts.vercel.app
All the responses get saved in the cfahlgren1/react-code-instructions dataset (https://huggingface.co/datasets/cfahlgren1/react-code-instructions). Hopefully we can build one of the biggest, highest-quality frontend datasets on the Hub 💪