Unlocking the Power of Deep Learning for Clause Classification: Revolutionizing Commercial Applications

In the dynamic landscape of commercial operations, efficiency and accuracy in document processing are paramount. Traditional methods of analyzing legal clauses and contracts have often been time-consuming and prone to human error. However, with the advent of deep learning technologies, particularly in the realm of clause classification, a new era of automation and precision has emerged.

This is a fine-tuned version of "google-bert/bert-base-cased" for clause classification, trained on more than 3,200 clause examples extracted from contracts annotated by the Atticus Project [https://www.atticusprojectai.org/].

Through initiatives like the Atticus Project and ongoing advances in AI, deep learning is positioned to play a central role in commercial document analysis, extracting efficiency, insight, and value from the large volume of contractual text that businesses rely on.

Real-World Applications

In practice, the integration of deep learning for clause classification extends across various industries:

  • Legal Services: Law firms and legal departments leverage deep learning to streamline contract review processes and extract key information efficiently.

  • Finance and Insurance: Deep learning models assist in analyzing complex financial agreements, identifying clauses related to risk factors, liabilities, and compliance.

  • Healthcare and Pharmaceuticals: Companies in highly regulated sectors use deep learning for analyzing patient contracts, supplier agreements, and regulatory documents.

test_accuracy: 88 %

Labels:

"0": "Anti-Assignment",
"1": "Audit_Rights",
"2": "Cap_On_Liability",
"3": "Covenant_Not_To_Sue",
"4": "Effective_Date",
"5": "Expiration_Date",
"6": "Governing_Law",
"7": "Insurance",
"8": "License_Grant",
"9": "Non-Transferable_License",
"10": "Notice_ Period_To_Terminate_Renewal",
"11": "Parties",
"12": "Post-Termination_Services",
"13": "Renewal_Term",
"14": "Revenue/Profit_Sharing",
"15": "Uncapped_Liability",
"16": "Warranty_Duration"
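The list above maps class indices to clause types. As a minimal sketch, the same mapping can be written as a Python dictionary together with its inverse. The names `ID2LABEL` and `LABEL2ID` are illustrative, the keys are written as integers rather than strings, and the label strings are copied verbatim from the list.

```python
# Class index -> clause type, copied verbatim from the label list above.
ID2LABEL = {
    0: "Anti-Assignment",
    1: "Audit_Rights",
    2: "Cap_On_Liability",
    3: "Covenant_Not_To_Sue",
    4: "Effective_Date",
    5: "Expiration_Date",
    6: "Governing_Law",
    7: "Insurance",
    8: "License_Grant",
    9: "Non-Transferable_License",
    10: "Notice_ Period_To_Terminate_Renewal",
    11: "Parties",
    12: "Post-Termination_Services",
    13: "Renewal_Term",
    14: "Revenue/Profit_Sharing",
    15: "Uncapped_Liability",
    16: "Warranty_Duration",
}

# Inverse mapping: clause type -> class index.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

print(len(ID2LABEL))              # number of clause types
print(LABEL2ID["Governing_Law"])  # index of one clause type
```

A loaded Transformers model normally exposes a mapping of this kind as `model.config.id2label`; whether it matches this list exactly should be checked against the downloaded configuration.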

Usage

To load the model, first install the transformers library in your environment:

pip install transformers

Then create a text-classification pipeline:

from transformers import pipeline
classifier = pipeline("text-classification", model="mauro/bert-base-uncased-finetuned-clause-type")

Pipelines are the easiest way to use a model.

This is an example clause:

clause = """ The foregoing license shall be transferable or sublicensable by Parent Group solely 
to a Permitted Party  and  subject to the restrictions herein  with any sale or transfer of a 
Parent business that utilizes the Licensed SpinCo IP If Parent enters an agreement to transfer 
the License_Granted to it under this Section 3 1 in connection with any sale or transfer of a 
Parent business  then SpinCo and members of the SpinCo Group shall be made third party 
beneficiaries under such transfer agreement to enforce breaches of the license 
3 If SpinCo enters an agreement to transfer the License_Granted to it under this  
Section 3 2  in connection with any sale or transfer of a SpinCo business  then Parent 
and members of the Parent Group shall be made third party beneficiaries under such transfer 
agreement to enforce breaches of the license     Such agreement shall prohibit any further 
sublicensing or transfer of rights by the Permitted Party  or  in the case of a sale or 
transfer of a Parent business  the transferee  or any use of the Licensed SpinCo IP outside 
the scope of the License_Granted to Parent herein     Such agreement shall prohibit any further 
transfer of rights by such party or any use of the transferred Intellectual Property outside the 
scope of the License_Granted to SpinCo herein"""

classifier(clause, return_all_scores=False)

The result will be:

[{'label': 'Non-Transferable_License', 'score': 0.989809513092041}]
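The pipeline returns a list with one dictionary per input text, each holding a `label` and a `score`. A minimal sketch of how such a result might be post-processed is shown here: the prediction is accepted only when the score clears a threshold. The helper name `accept_prediction` and the default threshold of 0.9 are assumptions chosen for illustration, not part of the model or the library.

```python
from typing import Optional


def accept_prediction(result: list, min_score: float = 0.9) -> Optional[str]:
    """Return the predicted label if its score is at least min_score, else None."""
    top = result[0]  # prediction for the first (and only) input text
    return top["label"] if top["score"] >= min_score else None


# Output copied from the example above.
result = [{'label': 'Non-Transferable_License', 'score': 0.989809513092041}]

print(accept_prediction(result))                  # accepted with the default threshold
print(accept_prediction(result, min_score=0.99))  # rejected with a stricter threshold
```

Clauses that fall below the threshold can then be routed to manual review instead of being labeled automatically.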

Visualization

This step requires Matplotlib and pandas:

pip install matplotlib pandas

import matplotlib.pyplot as plt
import pandas as pd

# Get the scores for every class, not just the top one
preds = classifier(clause, return_all_scores=True)

# Build a DataFrame with one row per label
df = pd.DataFrame([[x['label'], x['score']] for x in preds[0]], columns=['label', 'score'])

# Plot the probability distribution as a bar chart
plt.bar(df['label'], df['score'])
plt.xlabel('label')
plt.ylabel('score')
plt.title('Probability distribution over all clause types')
plt.xticks(rotation=90)
plt.show()

You will get the probability distribution of all classes:

[Figure: bar chart of the probability distribution over all clause types]
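When a plot is not needed, the same scores can be ranked and printed as text. The sketch below assumes the nested list shape used in the visualization code (`preds[0]` holds the scores for the first input). The helper name `top_k` is hypothetical, and the sample scores are made up for illustration; they are not real model output.

```python
def top_k(preds, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    scores = preds[0]  # scores for the first (and only) input text
    ranked = sorted(scores, key=lambda x: x["score"], reverse=True)
    return [(x["label"], x["score"]) for x in ranked[:k]]


# Illustrative values only, not real model output.
sample_preds = [[
    {"label": "License_Grant", "score": 0.006},
    {"label": "Non-Transferable_License", "score": 0.990},
    {"label": "Anti-Assignment", "score": 0.003},
    {"label": "Governing_Law", "score": 0.001},
]]

# Print the three most likely clause types with their scores.
for label, score in top_k(sample_preds, k=3):
    print(f"{label:<28} {score:.3f}")
```

With real output, `sample_preds` would be replaced by the `preds` value returned by the classifier.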


License: Apache-2.0

Model size: 108M params (Safetensors, tensor type F32)