feat: update the README
README.md
CHANGED
@@ -21,8 +21,8 @@ library_name: transformers
 
 # ReaderLM-v2
 
-`ReaderLM-v2` is the second generation of
-
+`ReaderLM-v2` is the second generation of [ReaderLM-v1](https://huggingface.co/jinaai/reader-lm-1.5b), a **1.5B** parameter language model that converts raw HTML into formatted markdown or structured JSON with improved accuracy and better support for longer contexts.
+Supporting multiple languages (29 in total), `ReaderLM-v2` is specialized for tasks involving HTML parsing, transformation, and text extraction.
 
 ## Model Overview
 
@@ -33,12 +33,12 @@ It supports multiple languages (29 in total) and is specialized for tasks involv
 
 ## What's New in `ReaderLM-v2`
 
-`ReaderLM-v2` features several
+`ReaderLM-v2` features several improvements over [ReaderLM-v1](https://huggingface.co/jinaai/reader-lm-1.5b):
 
 - **Better Markdown Generation**: Generates cleaner, more readable Markdown output.
-- **JSON Output**:
+- **JSON Output**: Produce structured JSON-formatted text, enabling structured extraction for further downstream processing.
 - **Longer Context Handling**: Can handle up to 512K tokens, which is beneficial for large HTML documents.
-- **Multilingual Support**: Covers 29 languages for broader
+- **Multilingual Support**: Covers 29 languages for broader applications across international web data.
 
 ---
 