	---
	tags:
	- model_hub_mixin
	- pytorch_model_hub_mixin
	pipeline_tag: tabular-regression
	library_name: pytorch
	datasets:
	- gvlassis/california_housing
	metrics:
	- rmse
	---

	# wide-and-deep-net-california-housing-v3

	A wide & deep neural network trained on the California Housing dataset. It is a PyTorch adaptation of the TensorFlow model in Chapter 10 of Aurélien Géron's book 'Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow'.

	![](https://raw.githubusercontent.com/sambitmukherjee/handson-ml3-pytorch/main/chapter10/Figure_10-15.png)

	The model takes eight features: `'MedInc'`, `'HouseAge'`, `'AveRooms'`, `'AveBedrms'`, `'Population'`, `'AveOccup'`, `'Latitude'` and `'Longitude'`. It predicts `'MedHouseVal'`.

	The first five features (`'MedInc'`, `'HouseAge'`, `'AveRooms'`, `'AveBedrms'` and `'Population'`) flow through the wide path.

	The last six features (`'AveRooms'`, `'AveBedrms'`, `'Population'`, `'AveOccup'`, `'Latitude'` and `'Longitude'`) flow through the deep path.

	Note: The features `'AveRooms'`, `'AveBedrms'` and `'Population'` flow through both the wide path and the deep path.
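
The routing described above amounts to two overlapping column slices. A minimal sketch in plain Python, assuming the columns keep the order listed above (the names `wide_features`, `deep_features` and `shared` are illustrative and not taken from the original notebook):

```python
# Feature order as listed in this model card.
features = ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms',
            'Population', 'AveOccup', 'Latitude', 'Longitude']

wide_features = features[:5]   # first five columns -> wide path
deep_features = features[2:]   # last six columns -> deep path

# Features that feed both paths (order preserved).
shared = [name for name in wide_features if name in deep_features]

print(wide_features)
print(deep_features)
print(shared)
```

The same `[:5]` and `[2:]` slices are applied to the standardized feature matrix in the Usage section.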

	The model also has an auxiliary head. The main head and the auxiliary head both predict the same target (`'MedHouseVal'`). As mentioned in the book, this is a regularization technique: it is intended to ensure that the "underlying part of the network" (i.e., the deep path) learns something useful on its own, without relying on the rest of the network.
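
The model's `forward` method (see Usage) combines the losses of the two heads with weights 0.9 and 0.1, where each loss is a summed squared error. That arithmetic can be sketched in plain Python; the function names and the sample numbers below are made up for illustration:

```python
def sum_squared_error(predictions, targets):
    """Summed (not averaged) squared error, like nn.MSELoss(reduction='sum')."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets))


def combined_loss(main_preds, aux_preds, targets, main_weight=0.9, aux_weight=0.1):
    """Weighted sum of the main-head and auxiliary-head losses."""
    main_loss = sum_squared_error(main_preds, targets)
    aux_loss = sum_squared_error(aux_preds, targets)
    return main_weight * main_loss + aux_weight * aux_loss


# Hypothetical predictions for three examples.
targets = [1.0, 2.0, 3.0]
main_preds = [1.5, 2.0, 2.0]   # squared errors: 0.25, 0.0, 1.0 -> 1.25
aux_preds = [2.0, 2.0, 5.0]    # squared errors: 1.0, 0.0, 4.0 -> 5.0

loss = combined_loss(main_preds, aux_preds, targets)
print(loss)  # 0.9 * 1.25 + 0.1 * 5.0, i.e. about 1.625
```

Because the auxiliary head sees only the deep path's activations, its share of the loss pushes the deep path to be predictive by itself.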

	Code: https://github.com/sambitmukherjee/handson-ml3-pytorch/blob/main/chapter10/wide_and_deep_net_california_housing_v3.ipynb

	Experiment tracking: https://wandb.ai/sadhaklal/wide-and-deep-net-california-housing

	## Usage

	```python
	from sklearn.datasets import fetch_california_housing

	housing = fetch_california_housing(as_frame=True)

	from sklearn.model_selection import train_test_split

	# Same splits as used for training (random_state=42).
	X_train_full, X_test, y_train_full, y_test = train_test_split(housing['data'], housing['target'], test_size=0.25, random_state=42)
	X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full, test_size=0.25, random_state=42)

	# Standardize all splits with the training-set statistics.
	X_means, X_stds = X_train.mean(axis=0), X_train.std(axis=0)
	X_train = (X_train - X_means) / X_stds
	X_valid = (X_valid - X_means) / X_stds
	X_test = (X_test - X_means) / X_stds

	import torch

	device = torch.device("cpu")

	from dataclasses import dataclass
	from typing import Optional

	@dataclass
	class WideAndDeepNetOutput:
	    main_output: torch.Tensor
	    aux_output: torch.Tensor
	    main_loss: Optional[torch.Tensor] = None
	    aux_loss: Optional[torch.Tensor] = None
	    loss: Optional[torch.Tensor] = None

	import torch.nn as nn
	from huggingface_hub import PyTorchModelHubMixin

	class WideAndDeepNet(nn.Module, PyTorchModelHubMixin):
	    def __init__(self):
	        super().__init__()
	        self.hidden1 = nn.Linear(6, 30)
	        self.hidden2 = nn.Linear(30, 30)
	        self.main_head = nn.Linear(35, 1)  # 5 wide features + 30 deep activations
	        self.aux_head = nn.Linear(30, 1)   # deep activations only
	        self.main_loss_fn = nn.MSELoss(reduction='sum')
	        self.aux_loss_fn = nn.MSELoss(reduction='sum')

	    def forward(self, input_wide, input_deep, label=None):
	        act = torch.relu(self.hidden1(input_deep))
	        act = torch.relu(self.hidden2(act))
	        concat = torch.cat([input_wide, act], dim=1)
	        main_output = self.main_head(concat)
	        aux_output = self.aux_head(act)
	        if label is not None:
	            main_loss = self.main_loss_fn(main_output.squeeze(), label)
	            aux_loss = self.aux_loss_fn(aux_output.squeeze(), label)
	            loss = 0.9 * main_loss + 0.1 * aux_loss
	            return WideAndDeepNetOutput(main_output, aux_output, main_loss, aux_loss, loss)
	        else:
	            return WideAndDeepNetOutput(main_output, aux_output)

	model = WideAndDeepNet.from_pretrained("sadhaklal/wide-and-deep-net-california-housing-v3")
	model.to(device)
	model.eval()

	# Let's predict on 3 unseen examples from the test set:
	print(f"Ground truth housing prices: {y_test.values[:3]}")
	new = {
	    'input_wide': torch.tensor(X_test.values[:3, :5], dtype=torch.float32),  # first five features
	    'input_deep': torch.tensor(X_test.values[:3, 2:], dtype=torch.float32)   # last six features
	}
	new = {k: v.to(device) for k, v in new.items()}
	with torch.no_grad():
	    output = model(**new)
	print(f"Predicted housing prices: {output.main_output.squeeze()}")
	```

	## Metric

	RMSE on the test set: 0.574
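
Because the model's loss functions use `reduction='sum'`, a test-set RMSE can be recovered by accumulating summed squared errors over batches, dividing by the number of examples, and taking the square root. The sketch below shows that arithmetic in plain Python. It is an assumption about how such a figure might be computed, not necessarily the procedure used in the linked notebook, and the batch values are invented:

```python
import math


def rmse_from_batches(batches):
    """Compute RMSE from an iterable of (predictions, targets) batches."""
    total_squared_error = 0.0
    num_examples = 0
    for predictions, targets in batches:
        total_squared_error += sum((p - t) ** 2 for p, t in zip(predictions, targets))
        num_examples += len(targets)
    return math.sqrt(total_squared_error / num_examples)


# Two hypothetical batches of (predictions, targets).
batches = [
    ([2.0, 4.0], [1.0, 4.0]),   # squared errors: 1.0, 0.0
    ([0.0, 3.0], [0.0, 0.0]),   # squared errors: 0.0, 9.0
]
rmse = rmse_from_batches(batches)
print(rmse)  # sqrt(10 / 4)
```

Summing per batch and dividing once at the end keeps the result independent of the batch size, which averaging per-batch RMSE values would not.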

	---

	This model has been pushed to the Hub using the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration.