Knut Jägersberg

KnutJaegersberg

jagersbergknut

AI & ML interests

NLP, opinion mining, narrative intelligence

Recent Activity

liked a model about 20 hours ago

PowerInfer/SmallThinker-3B-Preview

liked a model 1 day ago

nomic-ai/modernbert-embed-base-unsupervised

liked a model 1 day ago

nomic-ai/modernbert-embed-base

KnutJaegersberg's activity

reacted to s3nh's post with ❤️ 5 days ago

Welcome back,

Small language model enthusiasts and GPU-poor OSS enjoyers, let's connect.
I just created an organization whose main goal is to have fun with smaller models that are tunable on consumer-grade GPUs. Feel free to join and let's have some fun. Much love ;3

https://huggingface.co/SmolTuners

3 replies

posted an update 12 days ago

Intelligence Potentiation: An Evolutionary Perspective on AI Agent Designs

I found it useful to think of AI agent design as progressing up a ladder, through evolutionary selection.

https://huggingface.co/blog/KnutJaegersberg/intelligence-potentiation

reacted to sayakpaul's post with 🤗 22 days ago

Introducing a high-quality open-preference dataset to further this line of research for image generation.

Despite being such an integral component of modern image generation, open preference datasets are a rarity!

So, we decided to work on one with the community!

Check it out here:
https://huggingface.co/blog/image-preferences

7 replies

reacted to ariG23498's post with 🤗 27 days ago

We are blessed with another iteration of PaliGemma. Google launches PaliGemma 2.

google/paligemma-2-release-67500e1e1dbfdd4dee27ba48

merve/paligemma2-vqav2

posted an update 27 days ago

Practical Consciousness Theory for AI System Design

Wrote a blog post about practical consciousness theory

https://huggingface.co/blog/KnutJaegersberg/practical-consciousness-theory

posted an update about 1 month ago

DrNicefellow/Qwen-QwQ-32B-Preview-4.25bpw-exl2

Rumor has it this is currently the best model for 24 GB VRAM local usage.

posted an update about 1 month ago

openGPT-X/Teuken-7B-instruct-research-v0.4

New European LLM

posted an update 4 months ago

appvoid/arco

On average, arco consistently outperforms every SOTA model below 600M parameters.

posted an update 5 months ago

Wrote a blog post with some ideas about prompt engineering

https://huggingface.co/blog/KnutJaegersberg/first-principles-prompt-engineering

posted an update 5 months ago

mobiuslabsgmbh/Llama-3.1-70b-instruct_4bitgs64_hqq

This 4-bit HQQ quant reportedly keeps 99% of the original model's performance across various benchmarks!

posted an update 5 months ago

neuralmagic/Meta-Llama-3.1-405B-Instruct-FP8

FP8 requant of the big Llama, using 20% less memory

posted an update 6 months ago

Decensored Gemma2-27b

TheDrummer/Big-Tiger-Gemma-27B-v1

reacted to merve's post with 👍 6 months ago

Just shipped: introduction to vision language models (aka image-text-to-text) https://huggingface.co/tasks/image-text-to-text

Learn about more machine learning tasks at https://huggingface.co/tasks

posted an update 6 months ago

Unsocial Intelligence: an Investigation of the Assumptions of AGI Discourse

I don't agree with some of the assertions made here, but it is an interesting paper and a good overview.

https://arxiv.org/abs/2401.13142

reacted to merve's post with ❤️ 6 months ago

Florence-2 is a new vision foundation model capable of a wide variety of tasks 🤯
Demo 👉🏻 gokaygokay/Florence-2
Collection 👉🏻 microsoft/florence-6669f44df0d87d9c3bfb76de

This model can handle tasks that vary from OCR to semantic segmentation.

The difference from previous models is that the authors have compiled a dataset of 126M images with 5.4B annotations, labelled by their own data engine and pseudo-labelled by smaller specialized models and APIs.

The model has a similar architecture to previous models: an image encoder and a multimodality encoder with a text decoder. The authors have compiled the multitask dataset with prompts for each task.

You can also fine-tune this model on any task of your choice. The authors also released results on downstream tasks, both with the vision encoder frozen and with it unfrozen 🤓📉
They have released fine-tuned models too; you can find them in the collection above 🤗

3 replies

reacted to merve's post with 🔥 7 months ago

Finally @CVPR2024 is here! 🩷
Have you claimed your papers and linked your models/datasets/demos?
This will increase visibility and impact of your paper 💫

To index your papers, go here
CVPR2024/CVPR2024-papers
Find your paper, click on paper page link, index the paper, then click on your name (workflow is below 👇🏻)
If you'd like to add links to your paper, go here CVPR2024/update-CVPR2024-papers
Log in, find your paper's ID, retrieve the paper, fill in the info, and submit!

replied to s3nh's post 7 months ago

Don't burn out! Lighten up again, will you?

posted an update 7 months ago

What We Learned from a Year of Building with LLMs

It outlines a nice perspective.

“When a measure becomes a target, it ceases to be a good measure.”

— Goodhart’s Law

https://www.oreilly.com/radar/what-we-learned-from-a-year-of-building-with-llms-part-i/

reacted to s3nh's post with ❤️ 7 months ago

GPU Poor POV: Burnout

Sometimes we do not have the energy to post about AI and new methods.
And that's totally OK, I guess.
Remember to sleep well and drink a lot of water. Have a great day :D <3

2 replies

replied to BramVanroy's post 9 months ago

It mixed up stuff in the output and gave weird answers. I didn't have that problem with other models. Maybe the update they released solved that issue; I just never cared, given the alternatives.

Articles

Intelligence Potentiation: An Evolutionary Perspective on AI Agent Designs

Practical Consciousness Theory for AI System Design

Perspectives for first principles prompt engineering

Towards actively reasoning LLM systems
