Upload folder using huggingface_hub
BUILT
CHANGED
@@ -1 +1 @@
-
+2025-01-17T15:15:52.449967
GIT_REV
CHANGED
@@ -1 +1 @@
-
+1ff61eb
GIT_REV_LEX
CHANGED
@@ -1 +1 @@
-
+1ff61eb
README.md
CHANGED
@@ -20,7 +20,7 @@ model-index:
 split: train
 metrics:
 - type: coverage
-value: 0.
+value: 0.8415324536382167
 name: Coverage
 - type: coverage
 value: 1.0
@@ -32,7 +32,7 @@ model-index:
 value: 0.9999580703997988
 name: Coverage ($.)
 - type: coverage
-value: 0.
+value: 0.7740509710590406
 name: Coverage (ADJA)
 - type: coverage
 value: 0.7548407611333322
@@ -80,7 +80,7 @@ model-index:
 value: 0.0618080812117821
 name: Coverage (NE)
 - type: coverage
-value: 0.
+value: 0.7440593189565299
 name: Coverage (NN)
 - type: coverage
 value: 0.9799275737196068
@@ -183,21 +183,34 @@ model-index:
 name: Coverage (XY)
 ---
 
-# DWDSmor
+# DWDSmor – German morphology
 
-_SFST/SMOR/DWDS-based German morphology_
 
 
 
 
+DWDSmor implements the **lemmatisation and morphological analysis** of
+word forms as well as the **generation of paradigms of lexical words**
+in **written German**. Finite state transducers (automata) map word
+forms to specifications of corresponding lexical words and tagging
+which represents morphological properties. By traversing such
+transducers
 
-
-
-
+1. a given word form can be analysed and lemmatised, or
+1. a lexical word together with a set of morphological tagging will
+generate corresponding inflected word forms.
+
+The automata are compiled and traversed via
+[SFST](https://www.cis.uni-muenchen.de/~schmid/tools/SFST/), a C++
+library and toolbox for finite-state transducers (FSTs). Their
+coverage of the German language depends on
+
+1. the DWDSmor grammar, defining the rules by which word formation happens, and
+1. a lexicon, assigning inflection classes to lexical words.
 
 ## Usage
 
-DWDSmor is available via PyPI:
+DWDSmor as a Python library is available via the package index PyPI:
 
 ``` plaintext
 pip install dwdsmor
@@ -224,8 +237,7 @@ generation:
 scripts for morphological analysis and for paradigm generation by
 means of DWDSmor transducers.
 * `share/` contains XSLT stylesheets for extracting lexical entries in SMORLemma
-format
-found in `samples/`.
+format from XML sources of DWDS articles.
 * `lexicon/dwds/` contains scripts for building DWDSmor lexica by means of the
 XSLT stylesheets in `share/` and DWDS sources in `lexicon/dwds/wb/`, which are
 not part of this repository.
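The README text added in this diff says that one transducer can be traversed in two directions: from a word form to its analyses, or from a lexical word plus tags to inflected forms. The sketch below illustrates only that two-way idea. It uses a plain lookup table as a stand-in for a transducer, it is not the DWDSmor or SFST API, and the tag names (`NN`, `Nom`, `Sg`, ...) and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy relation between analyses and word forms.
# Assumption: tags are illustrative only, not the DWDSmor tag set.
PAIRS = [
    (("Haus", "NN", "Nom", "Sg"), "Haus"),
    (("Haus", "NN", "Dat", "Sg"), "Haus"),
    (("Haus", "NN", "Nom", "Pl"), "Häuser"),
    (("Haus", "NN", "Dat", "Pl"), "Häusern"),
]

ANALYSES = defaultdict(list)  # word form -> list of analyses
FORMS = defaultdict(list)     # analysis -> list of word forms
for analysis, form in PAIRS:
    ANALYSES[form].append(analysis)
    FORMS[analysis].append(form)


def analyse(form):
    """Return all (lemma, pos, case, number) analyses of a word form."""
    return list(ANALYSES.get(form, []))


def generate(lemma, pos, *tags):
    """Return all word forms for a lexical word and its tags."""
    return list(FORMS.get((lemma, pos, *tags), []))


def lemmatise(form):
    """Return the set of lemmas found among the analyses of a word form."""
    return {analysis[0] for analysis in analyse(form)}
```

A real transducer encodes this relation compactly as states and transitions instead of enumerating every pair, which is what lets it cover productive word formation; the table here only shows that analysis and generation are the same relation read in opposite directions.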
finite.a
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:95bfdcf91767d315c623b4cc48a43f715578575249b99357523d0289536554ee
+size 1135931
finite.ca
CHANGED
Binary files a/finite.ca and b/finite.ca differ
index.a
CHANGED
Binary files a/index.a and b/index.a differ
index.ca
CHANGED
Binary files a/index.ca and b/index.ca differ
index.csv.lzma
CHANGED
Binary files a/index.csv.lzma and b/index.csv.lzma differ
lemma.a
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:dd4241272ed62e7ad712d3fee625978580819d0022fb4daed528aea02884f327
+size 1235294
lemma.ca
CHANGED
Binary files a/lemma.ca and b/lemma.ca differ
morph.a
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:36591d8397ca7d2bfcb6bbc2fc8ef265081fe2cda411c9007fac7cc7a46dd75e
+size 1242812
morph.ca
CHANGED
Binary files a/morph.ca and b/morph.ca differ
root.a
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:ad73caec7518a7b8908a44df97a5a56c25dc37c19352c0ca3417d7e8a7907396
+size 6985697
root.ca
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:57d7e5e6aca069ea2bd4add3321fab28cae874bb5aee19dbba10ddab9b788f94
+size 3635046
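The `.a` and `.ca` entries above are Git LFS pointer files: three `key value` lines giving the spec version, the object id (hash algorithm and digest) and the size in bytes. As a minimal sketch of reading such a pointer, assuming the pointer text is already available as a string; the function name `parse_lfs_pointer` and the returned dictionary layout are hypothetical, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is a key, one space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value

    # The oid has the form '<algorithm>:<hex digest>'.
    algorithm, _, digest = fields.get("oid", "").partition(":")
    size = fields.get("size")
    return {
        "version": fields.get("version"),
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(size) if size else None,
    }


# Pointer content taken from the new side of the root.ca diff.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:57d7e5e6aca069ea2bd4add3321fab28cae874bb5aee19dbba10ddab9b788f94
size 3635046
"""
info = parse_lfs_pointer(pointer)
```

Comparing `info["digest"]` and `info["size"]` against a downloaded file is one way to check that the automaton fetched matches the one recorded in this commit.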