numb3r3 commited on
Commit
bd791d0
·
verified ·
1 Parent(s): 14fbb0c

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +6 -4
README.md CHANGED
@@ -37,7 +37,7 @@ It supports multiple languages (29 in total) and is specialized for tasks involv
37
 
38
  - **Better Markdown Generation**: Generates cleaner, more readable Markdown output.
39
  - **JSON Output**: Can produce JSON-formatted text, enabling structured extraction for further downstream processing.
40
- - **Longer Context Handling**: Can handle up to 512K tokens, which is beneficial for large HTML documents or combined transformations.
41
  - **Multilingual Support**: Covers 29 languages for broader application across international web data.
42
 
43
  ---
@@ -50,10 +50,12 @@ For a more hands-on experience in a hosted environment, see the [Google Colab No
50
  ## On Google Colab
51
 
52
  The easiest way to experience `ReaderLM-v2` is by running our [Colab notebook](https://colab.research.google.com/drive/1FfPjZwkMSocOLsEYH45B3B4NxDryKLGI?usp=sharing),
53
- The notebook runs on a free T4 GPU tier and uses vLLM and Triton for faster inference. You can feed any website’s HTML directly into the model.
 
 
 
 
54
 
55
- • For simple HTML-to-Markdown tasks, you only need to provide the raw HTML (no special instructions).
56
- • For JSON output and instruction-based extraction, use the prompt formatting guidelines in the notebook.
57
 
58
  ## Local Usage
59
 
 
37
 
38
  - **Better Markdown Generation**: Generates cleaner, more readable Markdown output.
39
  - **JSON Output**: Can produce JSON-formatted text, enabling structured extraction for further downstream processing.
40
+ - **Longer Context Handling**: Can handle up to 512K tokens, which is beneficial for large HTML documents.
41
  - **Multilingual Support**: Covers 29 languages for broader application across international web data.
42
 
43
  ---
 
50
  ## On Google Colab
51
 
52
  The easiest way to experience `ReaderLM-v2` is by running our [Colab notebook](https://colab.research.google.com/drive/1FfPjZwkMSocOLsEYH45B3B4NxDryKLGI?usp=sharing),
53
+ The notebook demonstrates HTML-to-markdown conversion, JSON extraction, and instruction-following using the HackerNews frontpage as an example.
54
+ The notebook is optimized for Colab's free T4 GPU tier and requires `vllm` and `triton` for acceleration and running.
55
+ Feel free to test it with any website.
56
+ For HTML-to-markdown tasks, simply input the raw HTML without any prefix instructions.
57
+ However, JSON output and instruction-based extraction require specific prompt formatting as shown in the examples.
58
 
 
 
59
 
60
  ## Local Usage
61