Stephen Genusa PRO

StephenGenusa

AI & ML interests

LFM, LLM, Quantization, Vision, RAG/Hybrid/Graph, Multimodality, NLP (will take us further down the road with existing LLM tech)

Recent Activity

liked a model 8 days ago

prithivMLmods/Triangulum-10B

reacted to vincentg64's post with 🔥 13 days ago

LLM 2.0, RAG & Non-Standard Gen AI on GitHub https://mltblog.com/3DsyZSq In this article, I share my latest Gen AI and LLM advances, featuring innovative approaches radically different from both standard AI and classical ML/NLP. The focus is on doing better with less, using efficient architectures, new algorithms and evaluation metrics. It originates from research that I started long ago. It gained significant momentum in the last two years. See background and history at https://mltblog.com/4g2sKTv. OpenAI, Perplexity, Anthropic, Llama and others typically follow the trend and implement solutions very similar to mines within 3 to 6 months after I publish new milestones. For instance, multi-tokens, knowledge graph tokens, multi-indexes, real-time fine-tuning, mixtures of experts, LLM routers, small enterprise sub-LLMs, prompt distillation, relevancy scoring engine, deep contextual retrieval, optimum agentic chunking, and modern UI instead of the basic prompt box. I keep adding new features all the time, staying ahead of competition. ➡️ Read full article with links to GitHub, at https://mltblog.com/3DsyZSq

liked a model 19 days ago

bespokelabs/Bespoke-MiniCheck-7B

View all activity

Organizations

StephenGenusa's activity

liked a model 8 days ago

prithivMLmods/Triangulum-10B

Text Generation • Updated 4 days ago • 79 • 9

reacted to vincentg64's post with 🔥 13 days ago

Post

2214

LLM 2.0, RAG & Non-Standard Gen AI on GitHub https://mltblog.com/3DsyZSq

In this article, I share my latest Gen AI and LLM advances, featuring innovative approaches radically different from both standard AI and classical ML/NLP. The focus is on doing better with less, using efficient architectures, new algorithms and evaluation metrics. It originates from research that I started long ago. It gained significant momentum in the last two years. See background and history at https://mltblog.com/4g2sKTv.

OpenAI, Perplexity, Anthropic, Llama and others typically follow the trend and implement solutions very similar to mines within 3 to 6 months after I publish new milestones. For instance, multi-tokens, knowledge graph tokens, multi-indexes, real-time fine-tuning, mixtures of experts, LLM routers, small enterprise sub-LLMs, prompt distillation, relevancy scoring engine, deep contextual retrieval, optimum agentic chunking, and modern UI instead of the basic prompt box. I keep adding new features all the time, staying ahead of competition.

➡️ Read full article with links to GitHub, at https://mltblog.com/3DsyZSq

1 reply

liked a model 19 days ago

bespokelabs/Bespoke-MiniCheck-7B

Text Classification • Updated 19 days ago • 5.95k • 57

upvoted a paper 3 months ago

SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Paper • 2401.15024 • Published Jan 26, 2024 • 69

reacted to m-ric's post with 🚀 3 months ago

Post

1283

𝗔𝗱𝗱 𝘀𝗼𝘂𝗿𝗰𝗲 𝗵𝗶𝗴𝗵𝗹𝗶𝗴𝗵𝘁𝗶𝗻𝗴 𝘁𝗼 𝘆𝗼𝘂𝗿 𝗥𝗔𝗚 𝘀𝘆𝘀𝘁𝗲𝗺! 📄💡

RAG systems are supposed to make your LLM's answer more trustworthy, by inserting in the prompt some supporting documents from a knowledge base : we say that we're "adding some context".

👎 But if you don't know which part of the answer has been generated based on which input tokens, it's hard to tell wether it was effectively grounded in the context knowledge or not!

🤔 I've been working on the question: is it possible to add notes to the answer linking to which part of the context they're generated from?

And I've found a great solution: a great technique called Layer-wise Relevance Propagation (LRP), showcased in a paper at ICML `24 by Reduan Achtibat et al allows, allows to precisely score how important each input token was in generating your output! They've made it into a library called LXT.

📊 For each generated output token, LXT gives you attribution scores for each input token.

⚙️ So I've worked a bit more on aggregating these scores into meaningful spans between successive input and output tokens, and I finally obtained my desired result: RAG with source highlighting!

Try the demo here 👉 m-ric/rag_highlights

Caveats:
- It slows down generation (for now quite a lot, could hopefully be reduced a lot)
- For now it supports only specific models: Llama models and Mixtral

If there's enough interest in this solution, I can improve it further and spin it off into a specific library for RAG! 🚀

upvoted a paper 3 months ago

MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling

Paper • 2409.16160 • Published Sep 24, 2024 • 33

reacted to Wauplin's post with 🤗 3 months ago

Post

2893

What a great milestone to celebrate! The huggingface_hub library is slowly becoming a cornerstone of the Python ML ecosystem when it comes to interacting with the @huggingface Hub. It wouldn't be there without the hundreds of community contributions and feedback! No matter if you are loading a model, sharing a dataset, running remote inference or starting jobs on our infra, you are for sure using it! And this is only the beginning so give a star if you wanna follow the project 👉 https://github.com/huggingface/huggingface_hub

1 reply

New activity in mattshumer/ref_70_e3 4 months ago

🚩 Report: Ethical issue(s)

#73 opened 4 months ago by

StephenGenusa

replied to m-ric's post 4 months ago

I think there will be a big breakthrough as well, but I'd be surprised if it happens soon. If it does, I'd be happy. While the architectures of LLMs continue to advance I don't see any evidence that significant progress is being made and I personally think the architectures are too primitive and inherently self-limiting. I am also a believer that bigger does not necessarily mean better. I think we've reached the limits or are near the point of reaching the limits of where size dictates how powerful the LLM is.

Therefore, I think, given the current architectural limitations, the external limits, namely those dictated by power availability, and the many resources/costs of building better LLMs, will slow AI development until a radical change comes along.

We've managed to survive without them and now that we have them, they are a great step forward and we'll continue using and improving what we have. There are many improvements that can be made around the LLM using NLP to improve what we expect from LLMs and that's where the focus will turn for the time being, such as xLLM. Better architectures are going to have to take into account the difference in statistical models of representations of the world and the way humans communicate through speech and writing.

replied to vincentg64's post 4 months ago

Vincent, thank you for your time, effort and especially for your willingness to share your expertise. I am really looking forward to this!

reacted to vincentg64's post with ❤️ 4 months ago

Post

1454

Hyperfast Contextual Custom LLM with Agents, Multitokens, Explainable AI, and Distillation https://mltblog.com/4dNPSnB

New additions to this ground-breaking system include multi-token distillation when processing prompts, agents to meet user intent, more NLP, and a command prompt menu accepting both standard prompts and various actions.

I also added several illustrations, featuring xLLM in action with a full session and sample commands to fine-tune in real-time. All the code, input sources (anonymized corporate corpus from fortune 100 company), contextual backend tables including embeddings, are on GitHub. My system has zero weight, no transformer, and no neural network. It relies on explainable AI, does not require training, is fully reproducible, and fits in memory. Yet your prompts can retrieve relevant full text entities from the corpus with no latency — including URLs, categories, titles, email addresses, and so on — thanks to well-designed architecture.

Read more, get the code, paper and everything for free, at https://mltblog.com/4dNPSnB

2 replies

reacted to ybelkada's post with 🔥 5 months ago

Post

3779

FalconMamba 7B - a new model from TII (Technology Innovation Institute) is out !

- Blogpost: https://huggingface.co/blog/falconmamba
- Link to collection: tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a
- Link to playground: tiiuae/falcon-mamba-playground

liked a model 5 months ago

LLM4Binary/llm4decompile-22b-v2

Text Generation • Updated Jun 25, 2024 • 197 • 16

reacted to MonsterMMORPG's post with 🚀❤️🔥 5 months ago

Post

5270

FLUX Local & Cloud Tutorial With SwarmUI - FLUX: The Groundbreaking Open Source txt2img Model Outperforms Midjourney & Others - FLUX: The Anticipated Successor to SD3

🔗 Comprehensive Tutorial Video Link ▶️ https://youtu.be/bupRePUOA18

FLUX represents a milestone in open source txt2img technology, delivering superior quality and more accurate prompt adherence than #Midjourney, Adobe Firefly, Leonardo Ai, Playground Ai, Stable Diffusion, SDXL, SD3, and Dall E3. #FLUX, a creation of Black Forest Labs, boasts a team largely comprised of #StableDiffusion's original developers, and its output quality is truly remarkable. This statement is not hyperbole; you'll witness its capabilities in the tutorial. This guide will demonstrate how to effortlessly install and utilize FLUX models on your personal computer and cloud platforms like Massed Compute, RunPod, and a complimentary Kaggle account.

🔗 FLUX Setup Guide (publicly accessible) ⤵️
▶️ https://www.patreon.com/posts/106135985

🔗 FLUX Models One-Click Robust Automatic Downloader Scripts ⤵️
▶️ https://www.patreon.com/posts/109289967

🔗 Primary Windows SwarmUI Tutorial (Essential for Usage Instructions) ⤵️
▶️ https://youtu.be/HKX8_F1Er_w

🔗 Cloud-based SwarmUI Tutorial (Massed Compute - RunPod - Kaggle) ⤵️
▶️ https://youtu.be/XFUZof6Skkw

🔗 SECourses Discord Server for Comprehensive Support ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 SECourses Reddit Community ⤵️
▶️ https://www.reddit.com/r/SECourses/

🔗 SECourses GitHub Repository ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion

🔗 Official FLUX 1 Launch Announcement Blog Post ⤵️
▶️ https://blackforestlabs.ai/announcing-black-forest-labs/

Video Segments

0:00 Introduction to the state-of-the-art open source txt2img model FLUX
5:01 Process for integrating FLUX model into SwarmUI
....

liked a model 6 months ago

THUDM/chatglm3-6b-128k

Updated Dec 5, 2024 • 241 • 77

liked 3 models 9 months ago