Apply for community grant: Academic project (gpu)
This space is for the online demo of the ACL 2024 paper ReactXT: Understanding Molecular “Reaction-ship” via Reaction-Contextualized Molecule-Text Pretraining
We kindly apply for GPU resources to deploy the demo online. This is an open-source project for academic purposes.
Thanks for your help @hysts! Now I can see the free-grant ZeroGPU option in Settings / Space hardware.
However, there's a spinning loading indicator at the top right of the ZeroGPU card, and I cannot select ZeroGPU on this page. Does this mean the GPU I applied for is still in the queue and I need to wait?
I see the following error in the log:
Collecting flash_attn
Downloading flash_attn-2.5.9.post1.tar.gz (2.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.6/2.6 MB 358.9 MB/s eta 0:00:00
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'error'
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [20 lines of output]
fatal: not a git repository (or any of the parent directories): .git
/tmp/pip-install-s0a7kjdi/flash-attn_efa6a25f031e41fc80fbaf9954824612/setup.py:78: UserWarning: flash_attn was requested, but nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
warnings.warn(
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-s0a7kjdi/flash-attn_efa6a25f031e41fc80fbaf9954824612/setup.py", line 134, in <module>
CUDAExtension(
File "/usr/local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1074, in CUDAExtension
library_dirs += library_paths(cuda=True)
File "/usr/local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1201, in library_paths
if (not os.path.exists(_join_cuda_home(lib_dir)) and
File "/usr/local/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2407, in _join_cuda_home
raise OSError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
torch.__version__ = 2.2.0+cu121
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
I think it's because you added flash_attn in your requirements.txt here, but on ZeroGPU, CUDA is not available at build time, so I think you need to install it at startup instead, along the lines of the sketch below.
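A minimal sketch of such a startup-time install (my reconstruction, not the exact snippet from this thread; it assumes a prebuilt flash-attn wheel exists for the Space's torch/CUDA versions):

import os
import subprocess

# FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE lets the flash-attn setup skip the
# nvcc compile step during metadata generation, so pip can fall back to a
# prebuilt wheel instead of needing CUDA at build time.
subprocess.run(
    "pip install flash-attn --no-build-isolation",
    env=os.environ | {"FLASH_ATTENTION_SKIP_CUDA_BUILD": "TRUE"},
    shell=True,
    check=True,
)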
Got it, I'll try to fix it. Thanks for your reply!
@hysts
Sorry to bother you, but I encountered some problems with spaces.GPU, and I couldn't find a solution in the docs.
My model class uses an nn.Module imported from another file in my Space, and that imported nn.Module does not end up on the GPU.
In my app.py, I do:
@spaces.GPU
@torch.no_grad()
def predict(self, rxn_dict, temperature=1):
    graphs, prompt_tokens = self.tokenize(rxn_dict)
    result_dict = rxn_dict
    samples = {'graphs': graphs, 'prompt_tokens': prompt_tokens}
    prediction = self.model.blip2opt.generate(
        samples,
        do_sample=self.args.do_sample,
        num_beams=self.args.num_beams,
        max_length=self.args.max_inference_len,
        min_length=self.args.min_inference_len,
        num_captions=self.args.num_generate_captions,
        temperature=temperature,
        use_graph=True
    )[0]
    # map placeholder tokens in the output back to the extracted molecule strings
    for k, v in result_dict['extracted_molecules'].items():
        prediction = prediction.replace(v, k)
    result_dict['prediction'] = prediction
    return result_dict
Here, self.model.blip2opt uses graph_encoder (an instance of GNN in model/gin_model.py, line 213) to encode samples['graphs'], and an OPTForCausalLM to encode the text.
When running the above code, I got the following error:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 216, in thread_wrapper
res = future.result()
File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/user/app/tmp.py", line 205, in predict
prediction = self.model.generate(
File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/user/app/model/blip2_opt.py", line 378, in generate
graph_embeds, graph_masks = self.graph_encoder(graphs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/app/model/gin_model.py", line 275, in forward
x = self.x_embedding1(x[:,0]) + self.x_embedding2(x[:,1])
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 162, in forward
return F.embedding(
File "/usr/local/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
Traceback (most recent call last):
File "/home/user/app/tmp.py", line 284, in
main(args)
File "/home/user/app/tmp.py", line 277, in main
online_chat(example_inputs[0])
File "/home/user/app/tmp.py", line 272, in online_chat
result = infer_runner.predict(data_item, temperature=temperature)
File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 177, in gradio_handler
raise res.value
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
I've checked that the input tensors are put on the GPU correctly, but the model parameters are still on the CPU.
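For anyone hitting the same thing, a quick sanity check like this makes the mismatch visible (a sketch; I'm assuming graphs is a torch_geometric Batch with node features in graphs.x):

# Illustrative device check (assumes `graphs` is a PyG Batch with `x`):
print(graphs.x.device)                                              # cuda:0
print(next(self.model.blip2opt.graph_encoder.parameters()).device)  # cpu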
I tried wrapping the forward function of the GNN class with @spaces.GPU as well, but got this warning:
/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py:77: UserWarning: Using a ZeroGPU function outside of Gradio caching or request might block the app
warnings.warn("Using a ZeroGPU function outside of Gradio caching or request might block the app")
Oh, I think I fixed it by manually moving the model to CUDA at runtime:
@spaces.GPU
@torch.no_grad()
def predict(self, rxn_dict, temperature=1):
    graphs, prompt_tokens = self.tokenize(rxn_dict)
    # move the model to the GPU inside the @spaces.GPU context,
    # where CUDA is actually available
    self.model.blip2opt = self.model.blip2opt.to('cuda')
    result_dict = rxn_dict
    samples = {'graphs': graphs, 'prompt_tokens': prompt_tokens}
    prediction = self.model.blip2opt.generate(
        samples,
        do_sample=self.args.do_sample,
        num_beams=self.args.num_beams,
        max_length=self.args.max_inference_len,
        min_length=self.args.min_inference_len,
        num_captions=self.args.num_generate_captions,
        temperature=temperature,
        use_graph=True
    )[0]
    for k, v in result_dict['extracted_molecules'].items():
        prediction = prediction.replace(v, k)
    result_dict['prediction'] = prediction
    return result_dict
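As a side note on the design choice: if I read the ZeroGPU docs correctly, you can also call .to('cuda') once at startup, outside the decorated function, and ZeroGPU will apply the placement when a GPU is attached per request. A minimal sketch, with load_model as a hypothetical stand-in for the Space's actual loading code:

import spaces
import torch

model = load_model()  # hypothetical loader; the model lives on CPU at startup
model.to('cuda')      # ZeroGPU records the placement and applies it per request

@spaces.GPU
@torch.no_grad()
def predict(inputs):
    # tensors created per request still need an explicit move
    return model(inputs.to('cuda'))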
Thanks for your patience.