---
license: apache-2.0
language:
- en
base_model:
- nasa-impact/nasa-smd-ibm-v0.1
pipeline_tag: token-classification
tags:
- astronomy
- uat
---

# KAILAS
KAILAS (Keyword Labeler At SciX, also known as Indus-UAT-Labeler or nasa-smd-ibm-v0.1_UAT_Labeler) is a RoBERTa-based, encoder-only transformer model, domain-adapted for NASA Science Mission Directorate (SMD) applications. It is fine-tuned on scientific journals and articles relevant to NASA SMD, with the aim of improving natural language technologies such as information retrieval and intelligent search.  
This fork was fine-tuned on proprietary data from the SciX Digital Library (https://scixplorer.org/, formerly NASA-ADS) to label text with Unified Astronomy Thesaurus (UAT) labels (https://astrothesaurus.org/).
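
A minimal usage sketch, assuming the model is loaded through the `transformers` pipeline for the `token-classification` task named in this card's metadata. The repository id `your-org/KAILAS`, the helper names `load_labeler` and `top_labels`, and the 0.5 threshold are illustrative assumptions, not part of the released model. The exact output fields may also differ from what is assumed here.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical placeholder: substitute the actual Hugging Face repository id.
MODEL_ID = "your-org/KAILAS"


def load_labeler(model_id: str = MODEL_ID):
    """Build a pipeline for the task declared in the card's `pipeline_tag`."""
    # Imported lazily so that `top_labels` can be used without transformers installed.
    from transformers import pipeline

    return pipeline(task="token-classification", model=model_id)


def top_labels(
    predictions: Iterable[dict], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Collapse per-prediction dicts into unique labels with their best score.

    Assumes each prediction carries a `score` and a label under one of the
    keys `entity_group`, `entity`, or `label`; entries without a label are skipped.
    """
    best: Dict[str, float] = {}
    for pred in predictions:
        label = pred.get("entity_group") or pred.get("entity") or pred.get("label")
        if label is None:
            continue
        score = float(pred["score"])
        # Keep the highest score seen for each label.
        if label not in best or score > best[label]:
            best[label] = score
    kept = [(label, score) for label, score in best.items() if score >= threshold]
    # Highest-scoring labels first.
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

With a real repository id in place, `top_labels(load_labeler()(abstract_text))` would return the labels scoring at or above the threshold for a string `abstract_text`. The threshold is a tunable assumption and should be adjusted against held-out data.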

## Model Details
- **Base Model**: RoBERTa
- **Tokenizer**: Custom
- **Parameters**: 125M

## Training Data
- Titles, abstracts, body text, and acknowledgments from 18K recent, high-quality astronomy papers
- Approximately 217M tokens


<!-- ## Note -->

<!-- ## Citation -->
<!-- If you find this work useful, please cite using the following bibtex citation: -->

<!-- ## Disclaimer -->