---
title: Amazon Review Sentiment Analysis
emoji: 📊
colorFrom: indigo
colorTo: indigo
sdk: streamlit
sdk_version: 1.26.0
app_file: app.py
pinned: false
license: openrail
---

# Amazon review sentiment analysis
![GitHub Repo stars](https://img.shields.io/github/stars/tikendraw/Amazon-review-sentiment-analysis?style=flat&logo=github&logoColor=white&label=Github%20Stars)

Welcome to the Amazon Review Sentiment Analysis project! This repository contains code for training a sentiment analysis model on a large dataset of Amazon reviews using Long Short-Term Memory (LSTM) neural networks. The trained model can predict the sentiment (positive or negative) of Amazon reviews. The dataset used for training consists of over 2 million reviews, totaling 2.6 GB of data.

<img src='https://img.shields.io/badge/TensorFlow-FF6F00?style=for-the-badge&logo=tensorflow&logoColor=white'>

<img src='https://img.shields.io/badge/scikit--learn-%23F7931E.svg?style=for-the-badge&logo=scikit-learn&logoColor=white'>

<img src='https://img.shields.io/badge/Polars-CD792C.svg?style=for-the-badge&logo=Polars&logoColor=white'>


## Table of Contents
* Introduction
* Dataset
* Model
* Getting Started
    * Prerequisites
    * Training
    * Prediction
    * Running the Streamlit App
* Contributing
* Acknowledgements

## Introduction
Sentiment analysis is the process of determining the sentiment or emotion expressed in a piece of text. In this project, we focus on predicting whether Amazon reviews are positive or negative based on their text content. We use LSTM neural networks, a type of recurrent neural network (RNN), to capture the sequential patterns in the text data and make accurate sentiment predictions.

## Dataset
The dataset used for this project is a large collection of Amazon reviews, comprising more than 2 million reviews with a total size of 2.6 GB. It is available [on Kaggle](https://www.kaggle.com/datasets/kritanjalijain/amazon-reviews). It contains both positive and negative reviews, making it suitable for training a sentiment analysis model.

### Challenges 
* The dataset is very large (2.6 GB), with 2.6 million reviews.
* Machine resources are a limiting factor: holding multiple variables of processed data in memory quickly exhausts RAM.

### Workarounds
* Used Polars for data manipulation and preprocessing (it runs computations in parallel and can avoid loading the entire dataset into memory).
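
The principle behind this workaround, processing the data in bounded pieces instead of holding everything in RAM, can be illustrated with a minimal standard-library sketch. This is not Polars code and not the project's actual preprocessing; the helper names, the batch size, and the assumption that the label sits in the first CSV column are all hypothetical.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def iter_batches(rows: Iterable, batch_size: int) -> Iterator[List]:
    """Yield lists of at most `batch_size` items without materializing all rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def count_labels(csv_file, batch_size: int = 10_000) -> Dict[str, int]:
    """Count rows per label, assuming (hypothetically) the label is the first column."""
    counts: Dict[str, int] = {}
    for batch in iter_batches(csv.reader(csv_file), batch_size):
        # Only one batch is held in memory at a time.
        for row in batch:
            counts[row[0]] = counts.get(row[0], 0) + 1
    return counts


# Tiny in-memory stand-in for the real file (made-up rows).
sample = io.StringIO(
    '2,"Great","Loved it"\n'
    '1,"Bad","Broke in a day"\n'
    '2,"Nice","Works well"\n'
)
label_counts = count_labels(sample, batch_size=2)
print(label_counts)
```

Peak memory here depends on the batch size rather than on the file size, which is the property that matters when the full dataset does not fit comfortably in RAM.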

## Model
The sentiment analysis model is built using TensorFlow and Keras libraries. We employ LSTM layers to effectively capture the sequential nature of text data. The model is trained on the labeled Amazon reviews dataset, and its performance is evaluated using various metrics such as accuracy, precision, recall, and F1-score.

## Model Architecture
```
Model: "model_lstm"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_3 (InputLayer)        [(None, 175)]             0         
                                                                 
 embedding_layer (Embedding)  (None, 175, 8)           2400000   
                                                                 
 lstm_layer_1 (LSTM)         (None, 175, 16)           1600      
                                                                 
 lstm_layer_2 (LSTM)         (None, 16)                2112      
                                                                 
 dropout_layer (Dropout)     (None, 16)                0         
                                                                 
 dense_layer_1 (Dense)       (None, 64)                1088      
                                                                 
 dense_layer_2_final (Dense)  (None, 1)                65        
                                                                 
=================================================================
Total params: 2,404,865
Trainable params: 2,404,865
Non-trainable params: 0
_________________________________________________________________
```
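
The parameter counts in the summary can be checked by hand. The sketch below uses the standard formulas for embedding, LSTM, and dense layers; the vocabulary size of 300,000 is not stated anywhere in this README and is only inferred from 2,400,000 / 8.

```python
def embedding_params(vocab_size: int, embed_dim: int) -> int:
    """One embed_dim-sized vector per vocabulary entry."""
    return vocab_size * embed_dim


def lstm_params(input_dim: int, units: int) -> int:
    """Four gates, each with an input kernel, a recurrent kernel, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim: int, units: int) -> int:
    """Weight matrix plus one bias per unit."""
    return input_dim * units + units


# Inferred from the summary (2,400,000 embedding params / 8 dims), not stated in the README.
VOCAB_SIZE = 2_400_000 // 8

layer_params = {
    "embedding_layer": embedding_params(VOCAB_SIZE, 8),
    "lstm_layer_1": lstm_params(input_dim=8, units=16),
    "lstm_layer_2": lstm_params(input_dim=16, units=16),
    "dense_layer_1": dense_params(input_dim=16, units=64),
    "dense_layer_2_final": dense_params(input_dim=64, units=1),
}
total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name:22s} {count:>10,}")
print(f"{'total':22s} {total_params:>10,}")
```

Almost all of the parameters sit in the embedding table; the recurrent and dense layers together account for fewer than 5,000.
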
## Model Performance

| Model             | Accuracy           | Precision          | Recall             | F1                 | Description                                      |
|-------------------|--------------------|--------------------|--------------------|--------------------|--------------------------------------------------|
| model0: Naive Bayes | 84.79% | 84.82% | 84.79% | 84.79% |  |
| model1: **LSTM** (in use) | 94.06% | 94.06% | 94.06% | 94.06% | Small LSTM model with vectorizer and embedding layer |
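
For reference, the four metrics in the table can be computed from binary labels and predictions as sketched below. This is a plain-Python illustration treating 1 as the positive class; how the figures above were actually computed (for example, which averaging was used) is not stated in this README, and the sample labels are made up.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```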

## Getting Started
Follow these steps to get started with the project:

### Prerequisites
* Python 3.x
* TensorFlow
* Keras
* Polars
* Streamlit

You can install the required dependencies using the following command:

```
pip install -r requirements.txt
```

### Training
To train the LSTM model, run the `train.py` script:

```
python3 train.py
```
This script will preprocess the dataset, train the model, and save the trained weights to disk.

### Prediction

To use the trained model for making predictions on new reviews, run the `predict.py` script:

```
python3 predict.py
```
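
The contents of `predict.py` are not shown in this README. As a rough sketch of the two steps that usually surround a model with a `(None, 175)` input and a single output unit, the code below pads or truncates a token sequence to 175 entries and maps a score to a label. The pad value of 0, right-side padding, the 0.5 threshold, and the reading of the output as a positive-class probability are all assumptions, and the function names are hypothetical.

```python
from typing import List

MAX_LEN = 175  # sequence length taken from the model summary's input shape


def pad_or_truncate(token_ids: List[int], max_len: int = MAX_LEN, pad_id: int = 0) -> List[int]:
    """Cut the sequence to max_len, or right-pad it with pad_id (assumed to be 0)."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))


def label_from_score(score: float, threshold: float = 0.5) -> str:
    """Map a score in [0, 1] to a label, assuming it is the positive-class probability."""
    return "positive" if score >= threshold else "negative"


# Made-up token ids and scores for illustration.
short_review = pad_or_truncate([12, 7, 431])
long_review = pad_or_truncate(list(range(1, 201)))
print(len(short_review), len(long_review))
print(label_from_score(0.94), label_from_score(0.10))
```
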
### Running the Streamlit App
We've also provided a user-friendly Streamlit app for interacting with the trained model. Run the app using the following command:
```
streamlit run app.py
```
This will launch a local web app where you can input your own Amazon review and see the model's sentiment prediction.

## Contributing
Contributions are welcome! If you find any issues or have suggestions for improvements, please feel free to open an issue or create a pull request.


## Acknowledgements
We would like to express our gratitude to the open-source community for providing invaluable resources and tools that made this project possible.

## Don't Forget to Star!
If you find this project interesting or useful, please consider starring the repository. Your support is greatly appreciated!

Happy coding!