davidschulte committed on
Commit
0b5719e
·
verified ·
1 Parent(s): a4c158f

Upload README.md with huggingface_hub

  1. README.md +142 -5
README.md CHANGED
@@ -1,9 +1,146 @@
1
  ---
2
  tags:
3
- - model_hub_mixin
4
- - pytorch_model_hub_mixin
5
  ---
6
 
7
- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
8
- - Library: [More Information Needed]
9
- - Docs: [More Information Needed]
1
  ---
2
+ base_model: bert-base-multilingual-uncased
3
+ datasets:
4
+ - MichiganNLP/TID-8
5
+ license: apache-2.0
6
  tags:
7
+ - embedding_space_map
8
+ - BaseLM:bert-base-multilingual-uncased
9
  ---
10
 
11
+ # ESM MichiganNLP/TID-8
12
+
13
+ <!-- Provide a quick summary of what the model is/does. -->
14
+
15
+
16
+
17
+ ## Model Details
18
+
19
+ ### Model Description
20
+
21
+ <!-- Provide a longer summary of what this model is. -->
22
+
23
+ ESM
24
+
25
+ - **Developed by:** David Schulte
26
+ - **Model type:** ESM
27
+ - **Base Model:** bert-base-multilingual-uncased
28
+ - **Intermediate Task:** MichiganNLP/TID-8
29
+ - **ESM architecture:** linear
30
+ - **Language(s) (NLP):** [More Information Needed]
31
+ - **License:** Apache-2.0 license
32
+
33
+ ## Training Details
34
+
35
+ ### Intermediate Task
36
+ - **Task ID:** MichiganNLP/TID-8
37
+ - **Subset [optional]:** friends_qia-atr
38
+ - **Text Column:** question
39
+ - **Label Column:** answer_label
40
+ - **Dataset Split:** train
41
+ - **Sample size [optional]:** 10000
42
+ - **Sample seed [optional]:** 42
43
+
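The sample size and seed listed above imply a reproducible subsample of the training split. The following is a minimal sketch of seeded subsampling using only the standard library. It is not the sampling code used to train this ESM, which the card does not show; the helper name `sample_indices` and the split size of 50000 are made up for illustration.

```python
import random
from typing import List


def sample_indices(dataset_size: int, sample_size: int = 10000, seed: int = 42) -> List[int]:
    """Pick a reproducible subset of row indices without replacement."""
    if sample_size >= dataset_size:
        # Nothing to subsample: keep every row.
        return list(range(dataset_size))
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    return sorted(rng.sample(range(dataset_size), sample_size))


# 50000 is a made-up split size, not the real size of the dataset subset.
indices = sample_indices(dataset_size=50000)
```

Calling the helper twice with the same seed returns the same indices, which is the property a fixed sample seed is meant to provide.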
44
+ ### Training Procedure [optional]
45
+
46
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
47
+
48
+ #### Language Model Training Hyperparameters [optional]
49
+ - **Epochs:** 3
50
+ - **Batch size:** 32
51
+ - **Learning rate:** 2e-05
52
+ - **Weight Decay:** 0.01
53
+ - **Optimizer:** AdamW
54
+
55
+ #### ESM Training Hyperparameters [optional]
56
+ - **Epochs:** 10
57
+ - **Batch size:** 32
58
+ - **Learning rate:** 0.001
59
+ - **Weight Decay:** 0.01
60
+ - **Optimizer**: AdamW
61
+
62
+
63
+ ### Additional training details [optional]
64
+
65
+
66
+ ## Model Evaluation
67
+
68
+ ### Evaluation of fine-tuned language model [optional]
69
+
70
+
71
+ ### Evaluation of ESM [optional]
72
+ MSE:
73
+
74
+ ### Additional evaluation details [optional]
75
+
76
+
77
+
78
+ ## What are Embedding Space Maps?
79
+
80
+ <!-- This section describes the evaluation protocols and provides the results. -->
81
+ Embedding Space Maps (ESMs) are neural networks that approximate the effect of fine-tuning a language model on a task. They can be used to quickly transform embeddings from a base model to approximate how a fine-tuned model would embed the input text.
82
+ ESMs can be used for intermediate task selection with the ESM-LogME workflow.
83
+
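To make the idea concrete, here is a minimal sketch of what a linear ESM does to a single embedding, assuming that "linear" means one affine layer `W @ e + b`. The function names, the toy 3-dimensional vectors, and the target embedding are hypothetical and chosen for illustration; real embeddings from bert-base-multilingual-uncased have 768 dimensions, and the actual implementation lives in the hf-dataset-selector package.

```python
from typing import List, Sequence


def apply_linear_esm(embedding: Sequence[float],
                     weight: Sequence[Sequence[float]],
                     bias: Sequence[float]) -> List[float]:
    """Map a base-model embedding with an affine layer: W @ e + b."""
    return [
        sum(w_ij * e_j for w_ij, e_j in zip(row, embedding)) + b_i
        for row, b_i in zip(weight, bias)
    ]


def mean_squared_error(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Average squared difference between two embeddings of equal length."""
    if len(predicted) != len(target):
        raise ValueError("embeddings must have the same dimension")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


# Toy values, not learned parameters.
base_embedding = [1.0, 2.0, 3.0]
weight = [[1.0, 0.0, 0.0],
          [0.0, 2.0, 0.0],
          [0.0, 0.0, 0.5]]
bias = [0.5, 0.0, -1.0]

# Approximate the fine-tuned embedding from the base embedding.
approx = apply_linear_esm(base_embedding, weight, bias)

# Hypothetical embedding produced by the fine-tuned model for the same text.
fine_tuned_embedding = [1.5, 4.0, 1.5]
error = mean_squared_error(approx, fine_tuned_embedding)
```

The mean squared error between the mapped embedding and the fine-tuned embedding is one plausible reading of the MSE entry in the evaluation section; the card itself does not define it.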
84
+ ## How can I use Embedding Space Maps for Intermediate Task Selection?
85
+ [![PyPI version](https://img.shields.io/pypi/v/hf-dataset-selector.svg)](https://pypi.org/project/hf-dataset-selector)
86
+
87
+ We release **hf-dataset-selector**, a Python package for intermediate task selection using Embedding Space Maps.
88
+
89
+ **hf-dataset-selector** fetches the ESMs available for a given language model and uses them to find the best dataset for intermediate training on the target task. ESMs are found by their tags on the Hugging Face Hub.
90
+
91
+ ```python
92
+ from hfselect import Dataset, compute_task_ranking
93
+
94
+ # Load target dataset from the Hugging Face Hub
95
+ dataset = Dataset.from_hugging_face(
96
+     name="stanfordnlp/imdb",
97
+     split="train",
98
+     text_col="text",
99
+     label_col="label",
100
+     is_regression=False,
101
+     num_examples=1000,
102
+     seed=42
103
+ )
104
+
105
+ # Fetch ESMs and rank tasks
106
+ task_ranking = compute_task_ranking(
107
+     dataset=dataset,
107
+     model_name="bert-base-multilingual-uncased"
109
+ )
110
+
111
+ # Display top 5 recommendations
112
+ print(task_ranking[:5])
113
+ ```
114
+
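The tag-based lookup mentioned above can also be approximated by hand. The sketch below assumes that ESMs carry the two tags shown in this card's metadata (`embedding_space_map` and `BaseLM:<base model>`) and queries the Hub with `HfApi.list_models` from the `huggingface_hub` library. The helper names `esm_tags` and `find_esm_repos` are hypothetical, and hf-dataset-selector may discover ESMs differently internally.

```python
from typing import List


def esm_tags(base_model: str) -> List[str]:
    """Build the tag list this model card uses for a given base model."""
    return ["embedding_space_map", f"BaseLM:{base_model}"]


def find_esm_repos(base_model: str, limit: int = 20) -> List[str]:
    """Return repo IDs of Hub models carrying both ESM tags (needs network access)."""
    # Imported lazily so the tag helper works without the dependency installed.
    from huggingface_hub import HfApi

    api = HfApi()
    models = api.list_models(filter=esm_tags(base_model), limit=limit)
    return [model.id for model in models]
```

For example, `find_esm_repos("bert-base-multilingual-uncased")` should list repositories tagged like this one, provided the tagging assumption holds.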
115
+ For more information on how to use ESMs, please have a look at the [official GitHub repository](https://github.com/davidschulte/hf-dataset-selector).
116
+
117
+ ## Citation
118
+
119
+
120
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
121
+ If you use this Embedding Space Map, please cite our [paper](https://arxiv.org/abs/2410.15148).
122
+
123
+ **BibTeX:**
124
+
125
+
126
+ ```
127
+ @misc{schulte2024moreparameterefficientselectionintermediate,
128
+ title={Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning},
129
+ author={David Schulte and Felix Hamborg and Alan Akbik},
130
+ year={2024},
131
+ eprint={2410.15148},
132
+ archivePrefix={arXiv},
133
+ primaryClass={cs.CL},
134
+ url={https://arxiv.org/abs/2410.15148},
135
+ }
136
+ ```
137
+
138
+
139
+ **APA:**
140
+
141
+ ```
142
+ Schulte, D., Hamborg, F., & Akbik, A. (2024). Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning. arXiv preprint arXiv:2410.15148.
143
+ ```
144
+
145
+ ## Additional Information
146
+