
Ali El Filali

alielfilali01

AI & ML interests

AI Psychometrician? | NLP (mainly for Arabic) | Other interests include reinforcement learning and cognitive science, among others

Recent Activity

updated a dataset 1 day ago
inceptionai/requests-dataset
upvoted a collection 2 days ago
Deepseek Papers
upvoted a paper 2 days ago
DeepSeek-V3 Technical Report

Organizations

Gradio-Themes-Party · Arabic Machine Learning · BigLAM: BigScience Libraries, Archives and Museums · Stable Diffusion Dreambooth Concepts Library · Blog-explorers · ASAS AI · Nt3awnou · Qwen · Mixed Arabic Datasets · ZeroGPU Explorers · 2A2I Legacy Models & Datasets · AtlasIA · 2A2I · Open Arabic LLM Leaderboard · MLX Community · Social Post Explorers · C4AI Community · Dev Mode Explorers · Chinese LLMs on Hugging Face · ThinkAI · KABOUR · Hugging Face Discord Community · llmc · Arabic Translation Prompt Engineering · Inception · Dataset Tools · ml-fw-prerelease · Data Is Better Together Contributor · Donut Earthers 🍩 · QudraTech

alielfilali01's activity

reacted to suayptalha's post with ❤️ 2 days ago
🚀 Introducing First Hugging Face Integration of minGRU Models from the paper "Were RNNs All We Needed?"

🖥 I have integrated next-generation RNNs, specifically minGRU, which offer faster performance compared to Transformer architectures, into HuggingFace. This allows users to leverage the lighter and more efficient minGRU models with the "transformers" library for both usage and training.

💻 I integrated two main tasks: MinGRUForSequenceClassification and MinGRUForCausalLM.

MinGRUForSequenceClassification:
You can use this class for Sequence Classification tasks. I also trained a Sentiment Analysis model with the stanfordnlp/imdb dataset.

MinGRUForCausalLM:
You can use this class for Causal Language Model tasks such as GPT or Llama. I also trained an example model with the roneneldan/TinyStories dataset. You can fine-tune and use it!
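
As an illustration of how such custom classes are typically loaded (a sketch based on the post, not verified against the repo; the checkpoint id below is a placeholder, pick a real one from the collection linked under Links):

```python
# Minimal sketch: loading a custom minGRU checkpoint through transformers'
# trust_remote_code path. The model id is a placeholder, not a real checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "suayptalha/minGRU-example"  # hypothetical id, see the collection below
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```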

🔗 Links:
Models: suayptalha/mingru-676fe8d90760d01b7955d7ab
GitHub: https://github.com/suayptalha/minGRU-hf
LinkedIn Post: https://www.linkedin.com/posts/suayp-talha-kocabay_mingru-a-suayptalha-collection-activity-7278755484172439552-wNY1

📰 Credits:
Paper Link: https://arxiv.org/abs/2410.01201

I am thankful to Leo Feng, Frederick Tung, Mohamed Osama Ahmed, Yoshua Bengio and Hossein Hajimirsadeghi for their papers.
posted an update 2 days ago
~75% on the challenging GPQA with only 40M parameters 🔥🥳

GREAT ACHIEVEMENT! Or is it?

This new work, "Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation", takes the mystery out of many models whose results I personally suspected, especially on leaderboards other than the English one, like the Open Arabic LLM Leaderboard OALL/Open-Arabic-LLM-Leaderboard.

The authors of this work, first started by training a model on the GPQA data, which, unsurprisingly, led to the model achieving 100% performance.

Afterward, they trained what they referred to as a 'legitimate' model on legitimate data (MedMCQA). However, they introduced a distillation loss from the earlier, 'cheated' model.

What they discovered was fascinating: the knowledge of GPQA leaked through this distillation loss, even though the legitimate model was never explicitly trained on GPQA during this stage.

This raises important questions about the careful use of distillation in model training, especially when the training data is opaque. As they demonstrated, it's apparently possible to (intentionally or unintentionally) leak test data through this method.
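
To make the leakage mechanism concrete, here is a minimal sketch of a standard knowledge-distillation loss (my illustration, not the paper's code): the soft targets of a benchmark-contaminated teacher can pass that knowledge to a student that never sees the benchmark itself.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets from the (possibly benchmark-contaminated) teacher.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard loss on the "legitimate" training data (e.g. MedMCQA labels).
    hard_loss = F.cross_entropy(student_logits, labels)
    # Any alpha > 0 lets the teacher's knowledge (benchmark leakage included) flow in.
    return alpha * soft_loss + (1 - alpha) * hard_loss
```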

Find out more: Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation (2412.15255)
replied to their post 19 days ago
posted an update 19 days ago
Unpopular opinion: Open Source takes courage to do!

Not everyone is brave enough to release what they have done (the way they've done it) into the wild to be judged!
It really requires a high level of "knowing what the hell you are doing"! It's kind of a superpower!

Cheers to the heroes here who see this!
reacted to takarajordan's post with 🔥❤️ 19 days ago
I'm super excited to release my first open-source text dataset:

WorldScenario 20K is a novel dataset of 20,000 synthetically generated multi-stakeholder scenarios designed to simulate real-world decision-making processes. Each scenario explores a unique environmental, societal, or economic issue.

I used the brand new meta-llama/Llama-3.3-70B-Instruct model to generate this dataset, and I put it through some post-processing to clean it and evaluate it for diversity.

I'd appreciate some feedback and thoughts on my new release! Thanks!

takarajordan/WorldScenario_20K
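
For anyone curious how such synthetic scenarios are typically produced, a minimal sketch using the same model through the Hugging Face Inference API (my assumption about the pipeline, not takarajordan's actual code; the prompt is invented):

```python
# Sketch of synthetic multi-stakeholder scenario generation (assumed pipeline).
# Requires an HF token with inference access configured in the environment.
from huggingface_hub import InferenceClient

client = InferenceClient(model="meta-llama/Llama-3.3-70B-Instruct")

prompt = (
    "Write a multi-stakeholder scenario about a coastal city facing rising sea levels. "
    "List the stakeholders, their conflicting interests, and the decision to be made."
)
response = client.chat_completion(
    messages=[{"role": "user", "content": prompt}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```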
reacted to AdinaY's post with 🤗 21 days ago
Updates from the Chinese community last week 🔥

LLM:
✨ Sailor 2, multilingual model supporting 10+ South-East Asian languages by Sea AI Lab. https://huggingface.co/sailor2

MLLM:
✨ InternVL 2.5, new open multimodal LLM by OpenGVLab
https://huggingface.co/collections/OpenGVLab/internvl-25-673e1019b66e2218f68d7c1c
✨ Qwen2-VL 2B/7B/72B base model, the latest iteration of our Qwen-VL model by Alibaba Qwen
Qwen/qwen2-vl-66cee7455501d7126940800d

Video model:
✨ HunyuanVideo, 13B open video model by Tencent
tencent/HunyuanVideo

Reasoning model:
✨ LLaMA-O1 🦙 base & supervised model; pretrain & finetune datasets and demo all released
zh-ai-community/reasoning-models-67409fb3aa1ed78f10087cd7

Audio model:
✨ Fish Speech 1.5, text-to-speech in 13 languages, trained on 1M+ hours of audio by FishAudio
fishaudio/fish-speech-1.5
✨ ClearVoice, an advanced voice processing framework by Alibaba Tongyi SpeechAI https://huggingface.co/alibabasglab

More details 👉 https://huggingface.co/zh-ai-community
posted an update 23 days ago
Apparently I forgot to put this here!

Well, this is a bit late, but consider giving our recent blog a read if you are interested in evaluation.

You don't have to be into Arabic NLP to read it; the main contribution we are introducing is a new evaluation measure for NLG. We made the first application of this measure on Arabic for now, and we will be working with colleagues from the community to expand it to other languages.

Blog:
Rethinking LLM Evaluation with 3C3H: AraGen Benchmark and Leaderboard
https://huggingface.co/blog/leaderboard-3c3h-aragen

Space:
inceptionai/AraGen-Leaderboard

Give it a read and let me know your thoughts 🤗
reacted to dvilasuero's post with ❤️🔥 25 days ago
🌍 Announcing Global-MMLU: an improved MMLU Open dataset with evaluation coverage across 42 languages, built with Argilla and the Hugging Face community.

Global-MMLU is the result of months of work with the goal of advancing Multilingual LLM evaluation. It's been an amazing open science effort with collaborators from Cohere For AI, Mila - Quebec Artificial Intelligence Institute, EPFL, Massachusetts Institute of Technology, AI Singapore, National University of Singapore, KAIST, Instituto Superior Técnico, Carnegie Mellon University, CONICET, and University of Buenos Aires.

🏷️ +200 contributors used Argilla to identify MMLU questions where regional, dialect, or cultural knowledge was required to answer correctly. 85% of the questions required Western-centric knowledge!

Thanks to this annotation process, the open dataset contains two subsets:

1. 🗽 Culturally Agnostic: no specific regional or cultural knowledge is required.
2. ⚖️ Culturally Sensitive: requires dialect, cultural, or geographic knowledge to answer correctly.

Moreover, we provide high quality translations of 25 out of 42 languages, thanks again to the community and professional annotators leveraging Argilla on the Hub.

I hope this will ensure a better understanding of the limitations and challenges for making open AI useful for many languages.

Dataset: CohereForAI/Global-MMLU
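
A quick way to poke at the dataset (a sketch; the language config, split name, and the column separating the two subsets are assumptions based on the post, so check the dataset card for the exact schema):

```python
from datasets import load_dataset

# "en" is assumed to be one of the 42 language configs; see the dataset card.
ds = load_dataset("CohereForAI/Global-MMLU", "en", split="test")

# The post describes two subsets (Culturally Agnostic vs Culturally Sensitive);
# the column name and label below are guesses, inspect ds.column_names for the real ones.
print(ds.column_names)
sensitive = ds.filter(lambda row: row.get("cultural_sensitivity_label") == "CS")
print(f"{len(sensitive)} culturally sensitive questions")
```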
reacted to stas's post with ❤️ 28 days ago
If you remember my work on MAMF - to find the realistic TFLOPS achievable ceiling - the Intel AI team has shared their measurements and they scored ...

an incredible 99.4% TFLOPS efficiency for Gaudi 2!

That's quite amazing! Your ROI on these accelerators will be very high.

The full table is here: https://github.com/stas00/ml-engineering/tree/master/compute/accelerator#maximum-achievable-matmul-flops-comparison-table

As we have seen competitors' achievable efficiency get worse with each new generation, I'm looking forward to seeing if Gaudi 3 will keep the bar high!

Thanks to Avi Rubin, Lakshman Chari, Imtiaz Sajwani, Ramy J and Zhiqi Tao for helping to get these numbers to the community.
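
For readers who want a rough sense of how such a measurement is taken, here is a simplified sketch of timing a large matmul and converting it to achieved TFLOPS (an illustration on a CUDA device, not the MAMF script itself, and much cruder than it):

```python
import time
import torch

def achievable_matmul_tflops(n=8192, dtype=torch.bfloat16, iters=50, device="cuda"):
    a = torch.randn(n, n, dtype=dtype, device=device)
    b = torch.randn(n, n, dtype=dtype, device=device)
    # Warm up so kernel selection does not skew the timing.
    for _ in range(5):
        a @ b
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        a @ b
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    flops_per_matmul = 2 * n ** 3  # n*n*n multiply-adds
    return flops_per_matmul * iters / elapsed / 1e12

# Compare the returned number against the accelerator's advertised peak TFLOPS
# to get an efficiency percentage like the 99.4% quoted above.
print(f"{achievable_matmul_tflops():.1f} achievable TFLOPS")
```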
reacted to AdinaY's post with ❤️ 28 days ago
2023 & 2024 Top Downloaded (all time) Open Models on the Hub are both from the Chinese community 👀

2023 👉 BGE base by BAAI
BAAI/bge-base-en-v1.5
2024 👉 Qwen 2.5 by Alibaba Qwen
Qwen/Qwen2.5-1.5B-Instruct

Can't wait to see what incredible models the Chinese community will bring in 2025 🚀

✨ Follow https://huggingface.co/zh-ai-community to get the latest updates from the Chinese community
✨ Explore the 2024 Year in Review huggingface/open-source-ai-year-in-review-2024
reacted to clem's post with ❤️ about 1 month ago
Hugging Face is becoming the best place to share the most viral AI apps with spaces.

Kolors Virtual Try-on just crossed 6,000,000 unique visitors & is now the #5 most popular space. Congrats to the Kwai Kolors team!

Kwai-Kolors/Kolors-Virtual-Try-On
reacted to ariG23498's post with 🚀 about 1 month ago
reacted to vincentg64's post with 🧠 about 1 month ago
There is no such thing as a Trained LLM https://mltblog.com/3CEJ9Pt

What I mean here is that traditional LLMs are trained on tasks irrelevant to what they will do for the user. It's like training a plane to operate efficiently on the runway, but not to fly. In short, it is almost impossible to train an LLM, and evaluating one is just as challenging. Then, training is not even necessary. In this article, I dive into all these topics.

➡️ Training LLMs for the wrong tasks

Since the beginning with BERT, training an LLM has typically consisted of predicting the next tokens in a sentence, or removing some tokens and then having your algorithm fill in the blanks. You optimize the underlying deep neural networks to perform these supervised learning tasks as well as possible. Typically, it involves growing the list of tokens in the training set to billions or trillions, increasing the cost and time to train. However, recently, there has been a tendency to work with smaller datasets, by distilling the input sources and token lists. After all, out of one trillion tokens, 99% are noise and do not contribute to improving the results for the end user; they may even contribute to hallucinations. Keep in mind that human beings have a vocabulary of about 30,000 keywords, and that the number of potential standardized prompts on a specialized corpus (and thus the number of potential answers) is less than a million.
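
For readers unfamiliar with that training objective, the next-token task the article describes reduces to a shifted cross-entropy over the vocabulary; a minimal PyTorch sketch (illustrative only, not the article's code):

```python
import torch
import torch.nn.functional as F

def next_token_loss(logits, input_ids):
    # logits: (batch, seq_len, vocab_size) produced by the model for input_ids.
    # Each position is trained to predict the *next* token, hence the shift.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = input_ids[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )
```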

➡️ Read the full article at https://mltblog.com/3CEJ9Pt, also featuring issues with evaluation metrics and the benefits of untrained LLMs.
reacted to malhajar's post with 🔥 about 1 month ago
🇫🇷 Official launch of the OpenLLM French Leaderboard: an open-source initiative to benchmark the evaluation of French-language LLMs

After a lot of effort and sweat with Alexandre Lavallee, we are thrilled to announce that the OpenLLMFrenchLeaderboard is live on Hugging Face (space url: le-leadboard/OpenLLMFrenchLeaderboard), the very first platform dedicated to evaluating large language models (LLMs) in French. 🇫🇷✨

This long-haul project is above all a labor of passion, but more than anything an absolute necessity. It is becoming urgent and vital to work toward more transparency in this strategic domain of so-called multilingual LLMs. The first building block is therefore the establishment of a systematic and systemic evaluation of current and future models.

Is your French AI model ready to stand out? Submit it in our space and see how you compare against other models.

❓ How it works:
Submit your French LLM for evaluation, and we will test it on reference benchmarks specifically adapted for the French language; our benchmark suite includes:

- BBH-fr: Complex reasoning
- IFEval-fr: Instruction following
- GPQA-fr: Advanced knowledge
- MUSR-fr: Narrative reasoning
- MATH_LVL5-fr: Mathematical abilities
- MMMLU-fr: Multitask understanding

The process is still manual, but we are working on automating it, with the support of the Hugging Face community.

@clem, shall we get ready for a space upgrade? 😍👀

It's not just about numbers; it's about creating an AI that truly reflects our language, our culture, and our values. OpenLLMFrenchLeaderboard is our personal contribution to shaping the future of LLMs in France.
reacted to elliesleightholm's post with 🤗❤️ about 1 month ago
reacted to LukeNeumann's post with 🤯 about 1 month ago
Nine years ago, I uploaded the first 8K resolution video to YouTube and I've been stockpiling 8K footage ever since: https://www.youtube.com/watch?v=sLprVF6d7Ug&t

Should @Overlaiapp release the first open-source 8K video dataset?

Could anyone even fine-tune a model with this? 😅
reacted to monsoon-nlp's post with ❤️ about 1 month ago
Great to see Tatta Bio release an embeddings version of their DNA/protein language model 🧬: tattabio/gLM2_650M_embed
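
A hedged sketch of pulling sequence embeddings from that checkpoint (the loading path, output attribute, and pooling are assumptions; the model may ship custom code, so check the model card for the intended usage):

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "tattabio/gLM2_650M_embed"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # toy protein sequence
inputs = tokenizer(seq, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
# Mean-pool the last hidden state as a simple sequence embedding (assumption).
embedding = outputs.last_hidden_state.mean(dim=1)
print(embedding.shape)
```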