nan committed on
Commit 1256ad3 · 1 Parent(s): bd791d0

feat: update the README

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -21,8 +21,8 @@ library_name: transformers
 
  # ReaderLM-v2
 
- `ReaderLM-v2` is the second generation of Jina ReaderLM, a **1.5B** parameter language model that converts raw HTML into beautifully formatted markdown or JSON with superior accuracy and improved longer context handling.
- It supports multiple languages (29 in total) and is specialized for tasks involving HTML parsing, transformation, and text extraction.
+ `ReaderLM-v2` is the second generation of [ReaderLM-v1](https://huggingface.co/jinaai/reader-lm-1.5b), a **1.5B** parameter language model that converts raw HTML into formatted markdown or structured JSON with improved accuracy and better support for longer contexts.
+ Supporting multiple languages (29 in total), `ReaderLM-v2` is specialized for tasks involving HTML parsing, transformation, and text extraction.
 
  ## Model Overview
 
@@ -33,12 +33,12 @@ It supports multiple languages (29 in total) and is specialized for tasks involv
 
  ## What's New in `ReaderLM-v2`
 
- `ReaderLM-v2` features several significant improvements over its predecessor:
+ `ReaderLM-v2` features several improvements over [ReaderLM-v1](https://huggingface.co/jinaai/reader-lm-1.5b):
 
  - **Better Markdown Generation**: Generates cleaner, more readable Markdown output.
- - **JSON Output**: Can produce JSON-formatted text, enabling structured extraction for further downstream processing.
+ - **JSON Output**: Produce structured JSON-formatted text, enabling structured extraction for further downstream processing.
  - **Longer Context Handling**: Can handle up to 512K tokens, which is beneficial for large HTML documents.
- - **Multilingual Support**: Covers 29 languages for broader application across international web data.
+ - **Multilingual Support**: Covers 29 languages for broader applications across international web data.
 
  ---
 