Xtest committed on
Commit
3da1f00
·
1 Parent(s): d9a3671

Update README.md

![accuracy.jpg](https://cdn-uploads.huggingface.co/production/uploads/64be563701f1983a8694272a/-6Z2jNad4FtI-5nruDg-A.jpeg)<br/>
![confident .jpg](https://cdn-uploads.huggingface.co/production/uploads/64be563701f1983a8694272a/yLTdnznqCVBWp6cy2dmJp.jpeg)

Files changed (1)
  1. README.md +52 -3
README.md CHANGED
@@ -2,6 +2,55 @@
2
  license: bigscience-openrail-m
3
  metrics:
4
  - accuracy
5
- tags:
6
- - not-for-all-audiences
7
- ---
+ ---
6
+
7
+
8
+ # Document Classification with LayoutLM
9
+
10
+ This repository contains code for classifying documents with the LayoutLM model. The goal is to classify document types such as birth certificates, driving licenses, social security documents, and tax documents using layout-aware deep learning.
11
+
12
+ ## Table of Contents
13
+
14
+ - [Introduction](#introduction)
15
+ - [Features](#features)
16
+ - [Getting Started](#getting-started)
17
+ - [Prerequisites](#prerequisites)
18
+ - [Installation](#installation)
19
+ - [Usage](#usage)
20
+ - [Data Preprocessing](#data-preprocessing)
21
+ - [Training](#training)
22
+ - [Evaluation](#evaluation)
23
+ - [Model Inference](#model-inference)
24
+ - [Contributing](#contributing)
25
+ - [License](#license)
26
+
27
+ ## Introduction
28
+
29
+ Document classification is a key task in domains such as legal, finance, and healthcare. This project uses the LayoutLM model, which captures both the content and the structure of a document by combining each word's text with its bounding box. With this model, we achieved 89% accuracy on our test dataset.
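
As an illustration of the bounding box input mentioned above, here is a minimal sketch of how pixel coordinates might be prepared for LayoutLM. It assumes the convention used by the Hugging Face Transformers LayoutLM implementation, where each box is `[x0, y0, x1, y1]` scaled to a 0-1000 range. The helper name `normalize_bbox`, the sample words, and the page size are hypothetical and are not taken from this repository.

```python
from typing import List, Sequence


def normalize_bbox(
    bbox: Sequence[float], page_width: float, page_height: float
) -> List[int]:
    """Scale a pixel-space box [x0, y0, x1, y1] to the 0-1000 range.

    Hypothetical helper for illustration; the repository's own
    preprocessing may differ.
    """
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")
    x0, y0, x1, y1 = bbox
    scaled = [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]
    # Clamp so that OCR boxes slightly outside the page stay in range.
    return [min(1000, max(0, value)) for value in scaled]


# Made-up OCR output for a 1000 x 2000 pixel page (assumption).
words = ["Date", "of", "birth"]
pixel_boxes = [(50, 100, 200, 300), (210, 100, 260, 300), (270, 100, 450, 300)]
normalized_boxes = [normalize_bbox(box, 1000, 2000) for box in pixel_boxes]

for word, box in zip(words, normalized_boxes):
    print(word, box)
```

Each word is paired with one normalized box, so the model receives the token text and its position on the page together.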
30
+
31
+ ## Features
32
+
33
+ - Document classification using LayoutLM.
34
+ - Data preprocessing scripts for handling text and bounding box information.
35
+ - Training pipeline for fine-tuning the LayoutLM model.
36
+ - Evaluation scripts to measure model performance.
37
+ - Model inference code for classifying new documents.
38
+
39
+ ## Getting Started
40
+
41
+ ### Prerequisites
42
+
43
+ Before running the code, make sure you have the following prerequisites installed:
44
+
45
+ - Python 3.x
46
+ - PyTorch
47
+ - Transformers library by Hugging Face
48
+ - Datasets library by Hugging Face
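
One way to confirm the prerequisites listed above is a short check script. The sketch below uses only the Python standard library and assumes the usual import names (`torch`, `transformers`, `datasets`); the function name `missing_packages` is hypothetical and not part of this repository.

```python
import importlib.util
from typing import Dict, List

# Import name -> human-readable label. The import names are assumptions
# based on how these libraries are normally installed.
REQUIRED: Dict[str, str] = {
    "torch": "PyTorch",
    "transformers": "Transformers (Hugging Face)",
    "datasets": "Datasets (Hugging Face)",
}


def missing_packages(required: Dict[str, str] = REQUIRED) -> List[str]:
    """Return the labels of packages that cannot be found for import."""
    return [
        label
        for module, label in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```

The script only checks that each package can be located; it does not verify versions.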
49
+
50
+ ### Installation
51
+
52
+ 1. Clone this repository to your local machine:
53
+
54
+ ```bash
55
+ git clone https://github.com/atulpokharel-gp/Document-Classification-using-LayoutLM
56
+ cd Document-Classification-using-LayoutLM