huu-ontocord
committed on
Update README.md
# Multi-Domain Expert Learning (M*DEL)

## How to increase knowledge without breaking the bank?

🍩 Ontocord.AI 🍩 and the open science community.
M*DEL is a volunteer open science community for creating better mixtures of experts with volunteers from:

Bedrock AI, TurkuNLP, ETH, Redmond.AI, Incite, MICS CentraleSupelec, Centro de Excelência em Inteligência Artificial, VietAI, Technion - Israel Institute of Technology, Nous Research, University of Western Australia, KoboldAI Community, LAION.AI, Mila, Luleå University of Technology, Juelich Supercomputing Center, Tokyo Tech, RIKEN, Together
- [Try out our current proof of concept](https://huggingface.co/Multi-Domain-Expert-Layers/meow_1b/)
OSS AI models can lead to increased innovation, accessibility, transparency, and community building. However, we need a mechanism to train more capable models in an efficient and modular way.
The proposed method, which we call Multi-Domain Expert Learning (MDEL), involves branching from a base model, training each branch independently on a specific domain for specific layers or other adapters, and merging the trained models at the end. Additionally, the specific layers or adapters are kept as experts, with a classifier used as a router to activate the experts during inference. This approach makes it easy to increase a model's expertise, to independently train more "adapters", and to reuse previously trained experts and models without retraining, resulting in a modular and efficient system.
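The branch-train-merge-and-route idea above can be sketched in a few lines. This is a toy illustration, not the project's actual code: the domain names, keyword-based router, and per-domain "deltas" standing in for trained expert layers are all hypothetical.

```python
# Toy sketch of MDEL-style inference routing (hypothetical names and shapes;
# the real project trains expert layers branched from a base model).
# A classifier scores the input per domain, and the winning domain's expert
# layers are activated on top of the shared base computation.

def classify_domain(tokens, domains):
    """Toy router: score each domain by keyword overlap with the input."""
    scores = {d: sum(1 for t in tokens if t in kw) for d, kw in domains.items()}
    return max(scores, key=scores.get)

def run_expert(hidden, expert_delta):
    """Toy expert layer: the base hidden state plus a domain-specific delta."""
    return [h + d for h, d in zip(hidden, expert_delta)]

# Hypothetical experts: per-domain deltas standing in for trained layers.
DOMAINS = {"medicine": {"dose", "patient"}, "law": {"court", "statute"}}
EXPERTS = {"medicine": [1.0, 2.0], "law": [3.0, -1.0]}

def mdel_forward(tokens, base_hidden):
    domain = classify_domain(tokens, DOMAINS)  # router activates one expert
    return domain, run_expert(base_hidden, EXPERTS[domain])

domain, out = mdel_forward(["the", "patient", "dose"], [1.0, 1.0])
print(domain, out)  # routes to the medicine expert
```

Because each expert is trained independently against the same base, new domains can be added later without retraining the others.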
In this effort, we seek international labs and open-science-aligned researchers and companies in various countries to each train a set of domain experts of their choosing, enabling international participation and knowledge sharing. This will also lower training costs and reduce environmental impact through reuse and lower energy usage. Currently we have volunteers from four continents, and we are looking for more.
We will be using a variant of the [c-BTM](https://arxiv.org/pdf/2303.14177v1.pdf) method and will focus on models ranging from 7B to 70B parameters.
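For intuition, c-BTM-style inference ensembles the per-cluster experts by weighting each expert's next-token distribution by how close the context is to that expert's cluster. The sketch below is a hypothetical simplification (toy embeddings, two experts, three-token vocabulary), not the method's actual implementation or the variant we will use.

```python
# Hypothetical sketch of c-BTM-style ensembling: each cluster expert emits
# next-token probabilities, mixed with weights from softmax(-distance) of
# the context embedding to each cluster centroid.
import math

def mix_experts(context_embedding, centroids, expert_probs, temperature=1.0):
    """Weight each expert's distribution by its cluster's proximity to the context."""
    dists = [math.dist(context_embedding, c) for c in centroids]
    exps = [math.exp(-d / temperature) for d in dists]
    weights = [e / sum(exps) for e in exps]
    vocab = len(expert_probs[0])
    return [sum(w * p[i] for w, p in zip(weights, expert_probs)) for i in range(vocab)]

# Two toy experts; the context sits exactly on cluster 0's centroid,
# so the mixture stays close to expert 0's distribution.
probs = mix_experts(
    context_embedding=[0.0, 0.0],
    centroids=[[0.0, 0.0], [10.0, 0.0]],
    expert_probs=[[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]],
)
```

Sparse activation follows naturally: experts whose weight falls below a threshold can simply be skipped at inference time.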