---
license: apache-2.0
---

# Try out in the Hosted inference API

In the panel on the right, you can try out the model (note that it only handles short input sequences). Enter the document you want to summarize there.
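
The hosted model can also be called programmatically through the Hugging Face Inference API. A minimal sketch, assuming a valid API token (the `build_request` and `summarize` helpers are illustrative, not part of this repository):

```python
import json
import urllib.request

# Hosted Inference API endpoint for this model
API_URL = "https://api-inference.huggingface.co/models/philippelaban/summary_loop10"

def build_request(document, api_token):
    # Assemble an HTTP request with the auth header and JSON payload
    # expected by the Inference API (helper name is illustrative)
    data = json.dumps({"inputs": document}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=data,
        headers={"Authorization": "Bearer " + api_token,
                 "Content-Type": "application/json"},
    )

def summarize(document, api_token):
    # Send the request and decode the JSON response
    with urllib.request.urlopen(build_request(document, api_token)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```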

# Model Loading

The model (based on the GPT-2 base architecture) can be loaded as follows:

```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("philippelaban/summary_loop10")
tokenizer = GPT2TokenizerFast.from_pretrained("philippelaban/summary_loop10")
```

# Example Use

```python
document = "Bouncing Boulders Point to Quakes on Mars. A preponderance of boulder tracks on the red planet may be evidence of recent seismic activity. If a rock falls on Mars, and no one is there to see it, does it leave a trace? Yes, and it's a beautiful herringbone-like pattern, new research reveals. Scientists have now spotted thousands of tracks on the red planet created by tumbling boulders. Delicate chevron-shaped piles of Martian dust and sand frame the tracks, the team showed, and most fade over the course of a few years. Rockfalls have been spotted elsewhere in the solar system, including on the moon and even a comet. But a big open question is the timing of these processes on other worlds — are they ongoing or did they predominantly occur in the past?"

model = model.cuda()  # move the model to the GPU, since the inputs below are placed there
tokenized_document = tokenizer([document], max_length=300, truncation=True, return_tensors="pt")["input_ids"].cuda()
input_shape = tokenized_document.shape
outputs = model.generate(tokenized_document, do_sample=False, max_length=500, num_beams=4, num_return_sequences=4, no_repeat_ngram_size=6, return_dict_in_generate=True, output_scores=True)
candidate_sequences = outputs.sequences[:, input_shape[1]:]  # Remove the encoded text, keep only the summary
candidate_scores = outputs.sequences_scores.tolist()

for candidate_tokens, score in zip(candidate_sequences, candidate_scores):
    summary = tokenizer.decode(candidate_tokens)
    print("[Score: %.3f] %s" % (score, summary[:summary.index("END")]))
```
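
One caveat with the loop above: `summary.index("END")` raises a `ValueError` if a candidate contains no "END" marker. A more defensive variant (the `truncate_at_end` helper is a suggestion, not part of the original code):

```python
def truncate_at_end(summary, end_token="END"):
    # Keep only the text before the first end marker;
    # return the full text unchanged if the marker is absent
    idx = summary.find(end_token)
    return summary[:idx] if idx != -1 else summary
```

In the example loop, `summary[:summary.index("END")]` can then be replaced by `truncate_at_end(summary)`.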

# Example output

```
[Score: -0.212] Here's what you need to know about rockfalls on Mars Before it's here, it's on the red planet
[Score: -0.277] Here's what you need to know about rockfalls on Mars Before it's here, it's on the red planet. Here are some facts about rockfalls on Mars
[Score: -0.320] Here's what you need to know about rockfalls on Mars Before it's here, it's on the red planet. Here are some facts about rockfalls on Mars: -- The tracks have been spotted on the red planet
[Score: -0.364] Here's what you need to know about rockfalls on Mars Before it's here, it's on the red planet. Here are some facts about rockfalls on Mars: -- The tracks have been spotted on the red planet Before it's there, it's
```
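
The scores printed above come from `outputs.sequences_scores`; for beam search in `transformers` these are length-penalized sequence log-probabilities, so values closer to 0 indicate more likely candidates. A small sketch using just the four scores above:

```python
import math

# Beam-search sequence scores are (length-penalized) log-probabilities,
# so exponentiating gives relative likelihoods; the maximum is the top beam
scores = [-0.212, -0.277, -0.320, -0.364]
relative_likelihoods = [math.exp(s) for s in scores]
best_index = max(range(len(scores)), key=lambda i: scores[i])
```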

# Github repo

More information, the scoring function, the training script, and an example training log are available in the Github repo: https://github.com/CannyLab/summary_loop