BART (Bidirectional and Auto-Regressive Transformers) Architecture

BART’s primary task is to generate clean, semantically coherent text from corrupted text, but it can also be used for a variety of other NLP tasks such as language translation, question answering, text summarization, and paraphrasing.

As BART is an autoencoder model, it consists of an encoder and a decoder. For its encoder, BART uses a bidirectional encoder like the one used in BERT, and for its decoder, it uses an autoregressive decoder like the one that forms the core of the GPT-1 model.

An autoregressive decoder is a neural network architecture that, at every time step, uses the tokens generated so far to predict the next token. It is important to remember that the decoder also takes as input the representation produced by its corresponding encoder network.
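The decoding loop described above can be sketched in a few lines. This is a minimal illustration, not BART's actual decoder: `greedy_decode` and `toy_next_token` are hypothetical names, and the toy predictor simply copies the encoder's tokens so the loop can run without a trained model.

```python
from typing import Callable, List


def greedy_decode(
    next_token: Callable[[List[str], List[str]], str],
    encoder_output: List[str],
    bos: str = "<s>",
    eos: str = "</s>",
    max_len: int = 20,
) -> List[str]:
    """Generate tokens one at a time, feeding each prediction back in."""
    generated = [bos]
    for _ in range(max_len):
        # The prediction depends on the encoder output and on everything generated so far.
        token = next_token(encoder_output, generated)
        generated.append(token)
        if token == eos:
            break
    return generated


def toy_next_token(encoder_output: List[str], generated: List[str]) -> str:
    """Hypothetical stand-in for a trained decoder: copy the encoder tokens, then stop."""
    position = len(generated) - 1  # tokens produced so far, not counting <s>
    if position < len(encoder_output):
        return encoder_output[position]
    return "</s>"


result = greedy_decode(toy_next_token, ["the", "cat", "sat"])
print(result)
```

In a real model, `next_token` would be a forward pass through the decoder followed by picking the most likely token (or sampling, or beam search).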

Both the encoder and the decoder are built by stacking multiple blocks or layers, where each block processes information in a specific way.

Each layer consists of three primary blocks:

Multi-head Attention block
Addition and Normalization block
Feed-forward layers

Multi-head attention block

This is one of the most important blocks, as in this layer multiple levels of masking (replacing random tokens in a sentence with the <mask> token) are applied in parallel threads:

Parallel #1 thread: Entire sentence is replaced by the <mask> token.

Parallel #2 thread: Multiple bi-gram tokens are replaced by the <mask> token.

Parallel #3+ thread: Arbitrary words within the sentence are replaced by the <mask> token.
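The three masking levels listed above can be sketched as three independent corruptions of the same sentence. This is only an illustration under stated assumptions: the function names, the whitespace tokenization, and the number of masked spans are hypothetical, and the variants are built one after another here rather than in real parallel threads.

```python
import random
from typing import List

MASK = "<mask>"


def mask_sentence(tokens: List[str]) -> List[str]:
    """Level 1: replace the entire sentence with a single mask token."""
    return [MASK]


def mask_bigrams(tokens: List[str], n_spans: int, rng: random.Random) -> List[str]:
    """Level 2: replace n_spans non-overlapping pairs of adjacent tokens with one mask each."""
    # Only even start positions are candidates, so chosen pairs can never overlap.
    starts = set(rng.sample(range(0, len(tokens) - 1, 2), n_spans))
    out, i = [], 0
    while i < len(tokens):
        if i in starts:
            out.append(MASK)
            i += 2  # skip both tokens of the masked bi-gram
        else:
            out.append(tokens[i])
            i += 1
    return out


def mask_words(tokens: List[str], n_words: int, rng: random.Random) -> List[str]:
    """Level 3: replace n_words arbitrary single tokens with a mask."""
    positions = set(rng.sample(range(len(tokens)), n_words))
    return [MASK if i in positions else tok for i, tok in enumerate(tokens)]


tokens = "the quick brown fox jumps over".split()
rng = random.Random(0)  # fixed seed so the run is repeatable
variants = [
    mask_sentence(tokens),
    mask_bigrams(tokens, 2, rng),
    mask_words(tokens, 2, rng),
]
for variant in variants:
    print(" ".join(variant))
```

Each variant is built from the original tokens rather than from another variant, so no corruption depends on the result of a previous one.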

This masking is done in parallel rather than sequentially, to avoid accumulating errors from earlier steps for the same input sentence.

Addition and Normalization block

The values produced by different blocks can lie in different ranges. Hence, before they are combined, the output of a block is added to that block's input (a residual connection), and the sum is normalized so that all values lie on a common scale (layer normalization rescales each vector to zero mean and unit variance). This ensures that no single block dominates when the values are added together.

Feed-forward Layers

The feed-forward layers are a basic building block of any neural network and are composed of hidden layers containing a fixed number of neurons. These layers process the information coming from the previous layers, store what they learn as weights, and forward the processed information to the next layer. Feed-forward layers move information in one direction only, from input to output.
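The addition-and-normalization and feed-forward blocks described above can be sketched in plain Python for a single token vector. This is a minimal sketch, not BART's implementation: the function names and the tiny weights are made up for illustration, the learnable scale and shift of layer normalization are omitted, and the ReLU activation is an assumption chosen for brevity.

```python
import math
from typing import List


def layer_norm(x: List[float], eps: float = 1e-5) -> List[float]:
    """Rescale a vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def add_and_norm(x: List[float], sublayer_out: List[float]) -> List[float]:
    """Residual connection followed by layer normalization."""
    return layer_norm([a + b for a, b in zip(x, sublayer_out)])


def feed_forward(x, w1, b1, w2, b2) -> List[float]:
    """Two linear maps with a ReLU in between; each row of w1/w2 holds one neuron's weights."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, row)) + b) for row, b in zip(w1, b1)]
    return [sum(h * w for h, w in zip(hidden, row)) + b for row, b in zip(w2, b2)]


# Hypothetical toy sizes: a 4-dimensional token vector and 2 hidden neurons.
x = [1.0, 2.0, 3.0, 4.0]
w1 = [[0.5, 0.0, 0.0, 0.0], [0.0, 0.5, 0.0, 0.0]]
b1 = [0.0, 0.0]
w2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
b2 = [0.0, 0.0, 0.0, 0.0]

ff_out = feed_forward(x, w1, b1, w2, b2)
y = add_and_norm(x, ff_out)
print(ff_out)
print(y)
```

The same `add_and_norm` step would also wrap the attention block, so every block hands the next one values on a comparable scale.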

BART Model for Text Auto Completion in NLP

BART stands for Bidirectional and Auto-Regressive Transformers. It is a denoising autoencoder: a pre-trained sequence-to-sequence model that uses masked language modeling for natural language generation and translation. It was developed by Lewis et al. in 2019. The BART architecture is an encoder-decoder network that combines a BERT-style encoder with a GPT-style decoder. BART models can be fine-tuned on small supervised datasets for domain-specific tasks.

Denoising autoencoder

An autoencoder is a special type of neural network that learns to encode an input sentence into a lower-dimensional representation and to decode that representation back into the original input sentence. In the general case, when the input and output sentences of an autoencoder are the same, the network learns over many iterations to map input tokens directly to output tokens, and the embedded representation learned between them becomes redundant. Therefore, we modify the input sentence by randomly deleting word tokens and replacing them with a special <mask> token, so that the network has to reconstruct the original sentence from the corrupted one.
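One common way to write the denoising objective is as a reconstruction loss. The symbols are introduced here for illustration and do not appear in the text above: $x = (x_1, \dots, x_T)$ is the original sentence, $\tilde{x}$ is its corrupted version, and $\theta$ denotes the model parameters.

```latex
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_{<t}, \tilde{x}\right)
```

The encoder reads the corrupted sentence $\tilde{x}$, and the decoder is trained to assign high probability to each original token $x_t$ given the tokens before it.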

Implementation

Implementing a pre-trained BART model for automatic text completion:

Base from facebook bart /

from transformers import BartForConditionalGeneration, BartTokenizer

# Load the pre-trained model and its tokenizer (the first load downloads the weights and takes a while).
bart_model = BartForConditionalGeneration.from_pretrained("facebook/bart-large", forced_bos_token_id=0)
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
# Put <mask> where the model should fill in the missing text.
sent = "-----------your text here----- <mask> -----your text here ---"
tokenized_sent = tokenizer(sent, return_tensors='pt')
# Generate the completed sequence and decode it back into a string.
generated_encoded = bart_model.generate(tokenized_sent['input_ids'])
print(tokenizer.batch_decode(generated_encoded, skip_special_tokens=True)[0])