How to set the batch size for inference

I am a beginner, and it seems that Transformers only processes one request at a time. Is there a way to increase parallelism and handle multiple requests in one batch, i.e. batch_size=n?


hi @allenwang37
I don’t know if this answers your question:
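One option is the pipeline API, which takes a `batch_size` argument so several inputs are packed into a single forward pass. Here is a minimal sketch, assuming a recent `transformers` release; `gpt2`, the prompts, and the generation settings are just placeholders:

```python
from transformers import pipeline

# Placeholder model; swap in whatever checkpoint you actually use
generator = pipeline("text-generation", model="gpt2", device=0)  # device=0 -> first GPU

# GPT-2-style models ship without a pad token, so reuse EOS for padding the batch
generator.tokenizer.pad_token_id = generator.model.config.eos_token_id

prompts = [
    "The quick brown fox",
    "Once upon a time",
    "In a galaxy far away",
    "The meaning of life is",
]

# batch_size groups the prompts so the GPU runs several requests per forward pass
outputs = generator(prompts, batch_size=4, max_new_tokens=20)
for out in outputs:
    print(out[0]["generated_text"])
```

Batching only pays off when the GPU has headroom; on CPU, or with very uneven prompt lengths, the padding can waste enough compute to make it slower, which is why pipelines default to batch_size=1.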

If you have multiple GPUs:
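One common pattern (again only a sketch, with `gpt2` as a stand-in checkpoint) is to load the model with `device_map="auto"`, which requires the `accelerate` package and shards the weights across all visible GPUs, then feed it a padded batch:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; substitute your own checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token   # no pad token by default in GPT-2-style models
tokenizer.padding_side = "left"             # left-pad so generation continues right after each prompt

# device_map="auto" (needs `pip install accelerate`) spreads the weights
# over every visible GPU so larger models fit in memory
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompts = ["Hello, my name is", "The capital of France is"]

# Pad the prompts to a common length so they fit in one tensor
inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)

with torch.no_grad():
    generated = model.generate(
        **inputs,
        max_new_tokens=20,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```

Note that `device_map="auto"` splits the weights, not the requests: the whole batch still flows through one sharded copy of the model.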
