---
license: mit
datasets:
  - custom
metrics:
  - mean_squared_error
  - mean_absolute_error
  - r2_score
model_name: Fertilizer Recommendation System
tags:
  - random-forest
  - regression
  - multioutput
  - classification
  - agriculture
  - soil-nutrients
---

# Fertilizer Application Recommendation System

## Overview

This model predicts fertilizer requirements for a crop from its crop type, target yield, field size, and soil properties. It combines a Random Forest regressor, which predicts numerical values (e.g., nutrient needs), with a Random Forest classifier, which predicts categorical values (e.g., fertilizer application instructions).

## Training Data

The model was trained on a custom dataset containing the following features:

- Crop Name
- Target Yield
- Field Size
- pH (water)
- Organic Carbon
- Total Nitrogen
- Phosphorus (M3)
- Potassium (exch.)
- Soil moisture

The target variables include:

**Numerical Targets**:
- Nitrogen (N) Need
- Phosphorus (P2O5) Need
- Potassium (K2O) Need
- Organic Matter Need
- Lime Need
- Lime Application - Requirement
- Organic Matter Application - Requirement
- 1st Application - Requirement (1)
- 1st Application - Requirement (2)
- 2nd Application - Requirement (1)

**Categorical Targets**:
- Lime Application - Instruction
- Lime Application
- Organic Matter Application - Instruction
- Organic Matter Application
- 1st Application
- 1st Application - Type fertilizer (1)
- 1st Application - Type fertilizer (2)
- 2nd Application
- 2nd Application - Type fertilizer (1)

## Model Training

The model was trained using the following steps:

1. **Data Preprocessing**:
   - Handling missing values
   - Scaling numerical features using `StandardScaler`
   - One-hot encoding categorical features

2. **Modeling**:
   - Splitting the dataset into training and testing sets
   - Training a `RandomForestRegressor` for numerical targets using a `MultiOutputRegressor`
   - Training a `RandomForestClassifier` for categorical targets using a `MultiOutputClassifier`

3. **Evaluation**:
   - Evaluating the models on the test set, using Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared (R2) score for regression, and accuracy for classification.
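
The steps above could be sketched roughly as follows. This is a minimal illustration on synthetic data, not the training script used for this model: the reduced column set, the made-up values and labels, the `SimpleImputer` strategies, the 80/20 split, and the hyperparameters are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier, MultiOutputRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in for the custom dataset (all values and labels are made up).
X = pd.DataFrame({
    "Crop Name": rng.choice(["maize(corn)", "beans"], size=n),
    "Target Yield": rng.uniform(1000, 5000, size=n),
    "pH (water)": rng.uniform(4.5, 7.5, size=n),
})
y_num = pd.DataFrame({
    "Nitrogen (N) Need": 0.02 * X["Target Yield"] + rng.normal(0, 1, size=n),
    "Lime Need": (7.0 - X["pH (water)"]).clip(lower=0) * 100,
})
y_cat = pd.DataFrame({
    "Lime Application": np.where(X["pH (water)"] < 5.5, "apply", "none"),
    "1st Application": np.where(X["Crop Name"] == "beans", "planting", "top-dress"),
})

# Encode each categorical target as integers, one LabelEncoder per column.
encoders = {col: LabelEncoder().fit(y_cat[col]) for col in y_cat}
y_cat_enc = pd.DataFrame({col: encoders[col].transform(y_cat[col]) for col in y_cat})

# 1. Preprocessing: impute missing values, scale numeric features, one-hot encode categoricals.
numeric_features = ["Target Yield", "pH (water)"]
categorical_features = ["Crop Name"]
preprocessor = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_features),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_features),
])

# 2. Modeling: split once so that features and both target sets stay aligned.
X_train, X_test, y_num_train, y_num_test, y_cat_train, y_cat_test = train_test_split(
    X, y_num, y_cat_enc, test_size=0.2, random_state=0
)
X_train_t = preprocessor.fit_transform(X_train)
X_test_t = preprocessor.transform(X_test)

numerical_model = MultiOutputRegressor(RandomForestRegressor(n_estimators=50, random_state=0))
numerical_model.fit(X_train_t, y_num_train)

categorical_model = MultiOutputClassifier(RandomForestClassifier(n_estimators=50, random_state=0))
categorical_model.fit(X_train_t, y_cat_train)

# Each model returns one column per target.
print(numerical_model.predict(X_test_t).shape)
print(categorical_model.predict(X_test_t).shape)
```

Fitting the preprocessor on the training split only, then reusing it on the test split, keeps test data out of the scaling statistics.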

## Evaluation Metrics

The model was evaluated using the following metrics:

- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
- R-squared (R2) Score
- Accuracy for categorical targets
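
As a small illustration of how these metrics can be computed with scikit-learn, the sketch below uses toy arrays. The numbers are invented for the example and are not results from this model.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    mean_absolute_error,
    mean_squared_error,
    r2_score,
)

# Toy values for two numerical targets (rows = samples, columns = targets).
y_true = np.array([[100.0, 40.0], [120.0, 50.0], [80.0, 30.0]])
y_pred = np.array([[110.0, 40.0], [120.0, 45.0], [70.0, 35.0]])

# multioutput="raw_values" reports one score per target column
# instead of averaging across targets.
mse = mean_squared_error(y_true, y_pred, multioutput="raw_values")
mae = mean_absolute_error(y_true, y_pred, multioutput="raw_values")
r2 = r2_score(y_true, y_pred, multioutput="raw_values")

# Toy encoded labels for a single categorical target.
labels_true = [0, 1, 0, 2]
labels_pred = [0, 1, 2, 2]
acc = accuracy_score(labels_true, labels_pred)

print("MSE per target:", mse)
print("MAE per target:", mae)
print("R2 per target:", r2)
print("Accuracy:", acc)
```

Reporting the regression metrics per target is one possible choice here, since the numerical targets may be on different scales.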

## How to Use

### Input Format

The model expects one record per prediction, given as a JSON-style object (a Python `dict` in the inference example) with the following fields:

- "Crop Name": String
- "Target Yield": Numeric
- "Field Size": Numeric
- "pH (water)": Numeric
- "Organic Carbon": Numeric
- "Total Nitrogen": Numeric
- "Phosphorus (M3)": Numeric
- "Potassium (exch.)": Numeric
- "Soil moisture": Numeric
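
A payload with these fields could be checked before inference along the following lines. The helper `validate_input` and the `EXPECTED_FIELDS` mapping are hypothetical names introduced for illustration; they are not part of the published repository.

```python
import json
from numbers import Real

# Field names and types as listed above.
EXPECTED_FIELDS = {
    "Crop Name": str,
    "Target Yield": Real,
    "Field Size": Real,
    "pH (water)": Real,
    "Organic Carbon": Real,
    "Total Nitrogen": Real,
    "Phosphorus (M3)": Real,
    "Potassium (exch.)": Real,
    "Soil moisture": Real,
}


def validate_input(payload: str) -> dict:
    """Parse a JSON string and check that every expected field is present with the expected type."""
    data = json.loads(payload)
    missing = [name for name in EXPECTED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"Missing fields: {missing}")
    for name, expected_type in EXPECTED_FIELDS.items():
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise TypeError(
                f"{name!r} should be {expected_type.__name__}, got {type(value).__name__}"
            )
    return data


payload = json.dumps({
    "Crop Name": "maize(corn)",
    "Target Yield": 3600.0,
    "Field Size": 1.0,
    "pH (water)": 6.1,
    "Organic Carbon": 11.4,
    "Total Nitrogen": 1.1,
    "Phosphorus (M3)": 1.8,
    "Potassium (exch.)": 3.0,
    "Soil moisture": 20.0,
})
record = validate_input(payload)
print(record["Crop Name"], record["pH (water)"])
```

The returned `dict` has the same shape as the `input_data` used in the inference example.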

### Preprocessing Steps

The inference script performs the following steps:

- Loads the models and the preprocessor.
- Defines the categorical and numerical targets.
- Loads the label encoders.
- Defines a function, `make_predictions`, that preprocesses the input data, makes predictions, and decodes the categorical predictions.
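
The decoding step can be illustrated in isolation. The sketch below fits a `LabelEncoder` on made-up labels (not this model's actual classes) and shows why a fallback is useful: `inverse_transform` raises `ValueError` when it receives a code the encoder has never seen. The helper `safe_decode` is a hypothetical name.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Made-up labels; classes are stored in sorted order, so
# "broadcast" -> 0, "incorporate" -> 1, "none" -> 2.
le = LabelEncoder().fit(["broadcast", "incorporate", "none"])


def safe_decode(encoder, codes):
    """Map integer codes back to labels, falling back to 'Unknown' if any code is unseen."""
    try:
        return list(encoder.inverse_transform(codes))
    except ValueError:
        return ["Unknown"] * len(codes)


decoded = safe_decode(le, np.array([0, 2]))   # both codes are known
fallback = safe_decode(le, np.array([0, 7]))  # 7 is not a known code

print(decoded)
print(fallback)
```

Note that a single unseen code makes the whole batch fall back to `"Unknown"`, which is acceptable for single-record predictions but coarse for larger batches.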

### Inference Procedure

```python
import pandas as pd
from joblib import load
from huggingface_hub import hf_hub_download
from sklearn.preprocessing import LabelEncoder

# Load models and preprocessor
preprocessor_path = hf_hub_download(repo_id='DNgigi/FertiliserApplication', filename='preprocessor.joblib')
numerical_model_path = hf_hub_download(repo_id='DNgigi/FertiliserApplication', filename='numerical_model.joblib')
categorical_model_path = hf_hub_download(repo_id='DNgigi/FertiliserApplication', filename='categorical_model.joblib')

preprocessor = load(preprocessor_path)
numerical_model = load(numerical_model_path)
categorical_model = load(categorical_model_path)

# Define categorical targets
categorical_targets = [
    'Lime Application - Instruction',
    'Lime Application',
    'Organic Matter Application - Instruction',
    'Organic Matter Application',
    '1st Application',
    '1st Application - Type fertilizer (1)',
    '1st Application - Type fertilizer (2)',
    '2nd Application',
    '2nd Application - Type fertilizer (1)',
    '1st Application_1',
    '1st Application - Type fertilizer (1)_3',
    '1st Application - Type fertilizer (2)_5',
    '2nd Application_6',
    '1st Application_21',
    '1st Application - Type fertilizer (1)_23',
    '1st Application - Type fertilizer (2)_25',
    '2nd Application_26',
    '2nd Application - Type fertilizer (1)_28'
]

# Define numerical targets
numerical_targets = [
    'Nitrogen (N) Need',
    'Phosphorus (P2O5) Need',
    'Potassium (K2O) Need',
    'Organic Matter Need',
    'Lime Need',
    'Lime Application - Requirement',
    'Organic Matter Application - Requirement',
    '1st Application - Requirement (1)',
    '1st Application - Requirement (2)',
    '2nd Application - Requirement (1)'
]

# Load one label encoder per categorical target
label_encoders = {
    col: load(
        hf_hub_download(
            repo_id='DNgigi/FertiliserApplication',
            filename=f'label_encoder_{col}.joblib',
        )
    )
    for col in categorical_targets
}

def make_predictions(input_data):
    # Convert input data to DataFrame
    input_df = pd.DataFrame([input_data])

    # Preprocess the input data
    X_transformed = preprocessor.transform(input_df)

    # Predict with numerical model
    numerical_predictions = numerical_model.predict(X_transformed)

    # Predict with categorical model
    categorical_predictions_encoded = categorical_model.predict(X_transformed)

    # Decode categorical predictions back to their original labels
    categorical_predictions_decoded = {}
    for i, col in enumerate(categorical_targets):
        le = label_encoders[col]
        try:
            categorical_predictions_decoded[col] = le.inverse_transform(categorical_predictions_encoded[:, i])
        except ValueError:
            # inverse_transform raises ValueError for codes the encoder has not seen
            categorical_predictions_decoded[col] = ["Unknown"] * len(categorical_predictions_encoded[:, i])

    # Combine numerical and categorical predictions into a dictionary
    predictions_combined = {col: numerical_predictions[0, i] for i, col in enumerate(numerical_targets)}
    predictions_combined.update({col: categorical_predictions_decoded[col][0] for col in categorical_targets})

    return predictions_combined

# Example usage
input_data = {
    'Crop Name': 'maize(corn)',
    'Target Yield': 3600.0,
    'Field Size': 1.0,
    'pH (water)': 6.1,
    'Organic Carbon': 11.4,
    'Total Nitrogen': 1.1,
    'Phosphorus (M3)': 1.8,
    'Potassium (exch.)': 3.0,
    'Soil moisture': 20.0
}

predictions = make_predictions(input_data)

print("Predicted Fertilizer Requirements:")
for col, pred_value in predictions.items():
    print(f"{col}: {pred_value}")
```