Hi folks! I’m struggling with an error in some code that I inherited and have been working on. I don’t fully understand how the package works, so I may be making some silly mistakes. I have trained an Informer on a bunch of data, but when I try to use `model.generate()` to get predictions, I get this error message:
```
Cell In[14], line 14
      7 for batch in test_dataloader:
      8     with torch.no_grad():
      9         # print(batch["past_time_features"].to(device).shape)
     10         # print(batch["past_values"].to(device).shape)
     11         # print(batch["future_time_features"].to(device).shape)
     12         # print(batch["past_observed_mask"].to(device).shape)
---> 14         outputs = model.generate(
     15             static_categorical_features=batch["static_categorical_features"].to(device)
     16             if model_config.num_static_categorical_features > 0
     17             else None,
     18             static_real_features=batch["static_real_features"].to(device)
     19             if model_config.num_static_real_features > 0
     20             else None,
     21             past_time_features=batch["past_time_features"].to(device),
     22             past_values=batch["past_values"].to(device),
     23             future_time_features=batch["future_time_features"].to(device),
     24             past_observed_mask=batch["past_observed_mask"].to(device),
     25         )
     26     forecasts_.append(outputs.sequences.cpu().numpy())

File ~/.conda/envs/pytorch-1.13.1/lib/python3.11/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File ~/.conda/envs/pytorch-1.13.1/lib/python3.11/site-packages/transformers/models/informer/modeling_informer.py:2076, in InformerForPrediction.generate(self, past_values, past_time_features, future_time_features, past_observed_mask, static_categorical_features, static_real_features, output_attentions, output_hidden_states)
   2073 #SAM ADD. DONT FORGET TO REMOVE
   2074 #print(lagged_sequence)
   2075 print(reshaped_lagged_sequence.shape, repeated_features[:, : k + 1].shape)
-> 2076 decoder_input = torch.cat((reshaped_lagged_sequence, repeated_features[:, : k + 1]), dim=-1)
   2078 dec_output = decoder(inputs_embeds=decoder_input, encoder_hidden_states=repeated_enc_last_hidden)
   2079 dec_last_hidden = dec_output.last_hidden_state

RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 70 but got size 1 for tensor number 1 in the list.
```
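For context, the failing operation is just a `torch.cat` along the last dimension, which requires every other dimension to agree. Here is a tiny standalone repro of the same error, using the shapes from my print statement below:

```python
import torch

# Same shapes as reshaped_lagged_sequence and repeated_features[:, : k + 1]
a = torch.zeros(51200, 70, 6)
b = torch.zeros(51200, 1, 14)

# Fails: concatenating on dim 2 (i.e. -1) requires dims 0 and 1 to match,
# but dim 1 is 70 vs. 1.
torch.cat((a, b), dim=-1)
```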
I printed the sizes of these two tensors: `reshaped_lagged_sequence` has the correct shape, `[51200, 70, 6]`, whereas `repeated_features` has the wrong shape, `[51200, 1, 14]`. Since the concatenation is along the last dimension, the time dimension (dim 1) has to match, and that’s where the 70 vs. 1 mismatch comes from.
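In case the input shapes matter, this is roughly how I inspected a single batch from the dataloader (a minimal sketch; it assumes the same batch keys used in the generation loop below):

```python
# Grab one batch and print the time/feature dimensions going into generate().
# With context_length=70 and prediction_length=110, I'd expect the "future"
# tensors to have 110 time steps.
batch = next(iter(test_dataloader))
for key in ("past_values", "past_time_features",
            "future_time_features", "past_observed_mask"):
    print(key, tuple(batch[key].shape))
```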
Here’s the code I used to create the dataloader:
```python
import yaml
import torch
from accelerate import Accelerator
from transformers import InformerConfig, InformerForPrediction

# month_of_year comes from gluonts in our setup (as in the HF time-series
# tutorial); create_train_dataloader is our own helper and test_dataset is
# defined earlier in the notebook.
from gluonts.time_feature import month_of_year

accelerator = Accelerator()
device = accelerator.device

num_variates = 6

model = InformerForPrediction.from_pretrained(
    'anomaly_detection/70-110_pretrained_model_new/hf_model'
)
model.to(device)
model.eval()

with open("bigger_model_hyperparameters.yml", 'r') as f:
    config = yaml.safe_load(f)

model_config = InformerConfig(
    input_size=num_variates,
    has_labels=False,
    prediction_length=110,
    context_length=70,
    lags_sequence=[0],
    num_time_features=len(config['time_features']) + 1,
    dropout=0.2,
    encoder_layers=config['num_encoder_layers'],
    decoder_layers=config['num_decoder_layers'],
    d_model=config['d_model'],
)

test_dataloader = create_train_dataloader(
    config=model_config,
    dataset=test_dataset,
    time_features=[month_of_year if x == 'month_of_year' else None
                   for x in config['time_features']],
    batch_size=512,
    num_batches_per_epoch=10,
    add_objid=True,
)
```
Then, here’s the code I used to try to generate the forecasts:
```python
context = 70
prediction = 110

model.eval()

forecasts_ = []
for batch in test_dataloader:
    with torch.no_grad():
        outputs = model.generate(
            static_categorical_features=batch["static_categorical_features"].to(device)
            if model_config.num_static_categorical_features > 0
            else None,
            static_real_features=batch["static_real_features"].to(device)
            if model_config.num_static_real_features > 0
            else None,
            past_time_features=batch["past_time_features"].to(device),
            past_values=batch["past_values"].to(device),
            future_time_features=batch["future_time_features"].to(device),
            past_observed_mask=batch["past_observed_mask"].to(device),
        )
    forecasts_.append(outputs.sequences.cpu().numpy())
```
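The only thing I plan to do with `forecasts_` afterwards is stack it, something like the sketch below. My understanding from the docs is that, for multivariate inputs, `outputs.sequences` should be `(batch, num_parallel_samples, prediction_length, input_size)`:

```python
import numpy as np

# Stack the per-batch sample arrays into one array over the whole test set.
forecasts = np.vstack(forecasts_)
print(forecasts.shape)  # expected: (num_series, num_samples, 110, 6)
```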
Also, if it’s relevant, here’s the output of `transformers-cli env`:
- `transformers` version: 4.27.4
- Platform: Linux-5.14.21-150400.24.81_12.0.86-cray_shasta_c-x86_64-with-glibc2.31
- Python version: 3.11.2
- Huggingface_hub version: 0.13.3
- PyTorch version (GPU?): 2.1.0 (True)
I’m trying to run this on the NERSC platform, which is why the Platform string above looks unusual.
I’ve tried adjusting various parameters, adding actual lags, changing devices, and re-training the model, but nothing seems to affect the size of this tensor. Let me know if you need any more details; any help would be appreciated!