sparknlp-t5-closed-book-question-answering / pages /Workflow & Model Overview.py

abdullahmubeen10

Upload 5 files

5da5ab8 verified 5 months ago

	import streamlit as st

	# Page configuration
	st.set_page_config(
	layout="wide",
	initial_sidebar_state="auto"
	)

	# Custom CSS for better styling
	st.markdown("""
	<style>
	.main-title {
	font-size: 36px;
	color: #4A90E2;
	font-weight: bold;
	text-align: center;
	}
	.sub-title {
	font-size: 24px;
	color: #4A90E2;
	margin-top: 20px;
	}
	.section {
	background-color: #f9f9f9;
	padding: 15px;
	border-radius: 10px;
	margin-top: 20px;
	}
	.section h2 {
	font-size: 22px;
	color: #4A90E2;
	}
	.section p, .section ul {
	color: #666666;
	}
	.link {
	color: #4A90E2;
	text-decoration: none;
	}
	</style>
	""", unsafe_allow_html=True)

	# Title
	st.markdown('<div class="main-title">Automatically Answer Questions (CLOSED BOOK)</div>', unsafe_allow_html=True)

	# Introduction Section
	st.markdown("""
	<div class="section">
	<p>Closed-book question answering is a challenging task where a model is expected to generate accurate answers to questions without access to external information or documents during inference. This approach relies solely on the pre-trained knowledge embedded within the model, making it ideal for scenarios where retrieval-based methods are not feasible.</p>
	<p>On this page, we explore how to build a pipeline that answers questions in a closed-book setting. We use a T5 Transformer model fine-tuned for closed-book question answering to generate answers to a variety of trivia-style questions.</p>
	</div>
	""", unsafe_allow_html=True)

	# T5 Transformer Overview
	st.markdown('<div class="sub-title">Understanding the T5 Transformer for Closed-Book QA</div>', unsafe_allow_html=True)

	st.markdown("""
	<div class="section">
	<p>The T5 (Text-To-Text Transfer Transformer) model by Google is a versatile transformer-based model designed to handle a wide range of NLP tasks in a unified text-to-text format. For closed-book question answering, T5 is fine-tuned to generate answers directly from its internal knowledge without relying on external sources.</p>
	<p>The model takes the question as input text, prefixed with a task string, and generates the answer as output text based on what it learned during training. This makes it useful where access to external data sources is limited or impractical, although answers can be wrong when the relevant fact was not captured during training.</p>
	</div>
	""", unsafe_allow_html=True)

	# Performance Section
	st.markdown('<div class="sub-title">Performance and Benchmarks</div>', unsafe_allow_html=True)

	st.markdown("""
	<div class="section">
	<p>Closed-book variants of T5 have been benchmarked on question-answering datasets such as Natural Questions and TriviaQA. In these evaluations they often produced correct answers without referencing any external data, with larger models generally scoring higher.</p>
	<p>This makes T5 a practical option for virtual assistants, educational tools, and other scenarios where pre-trained knowledge is sufficient to provide responses.</p>
	</div>
	""", unsafe_allow_html=True)

	# Implementation Section
	st.markdown('<div class="sub-title">Implementing Closed-Book Question Answering</div>', unsafe_allow_html=True)

	st.markdown("""
	<div class="section">
	<p>The following example demonstrates how to implement a closed-book question answering pipeline using Spark NLP. The pipeline includes a document assembler, a sentence detector to identify questions, and the T5 model to generate answers.</p>
	</div>
	""", unsafe_allow_html=True)

	st.code('''
	import sparknlp
	from sparknlp.base import DocumentAssembler
	from sparknlp.annotator import SentenceDetectorDLModel, T5Transformer
	from pyspark.ml import Pipeline

	# Start (or reuse) a Spark session with Spark NLP loaded
	spark = sparknlp.start()

	# Wrap the raw text column in Spark NLP's document annotation
	document_assembler = DocumentAssembler()\\
	    .setInputCol("text")\\
	    .setOutputCol("documents")

	# Split each document into sentences; each sentence is treated as one question
	sentence_detector = SentenceDetectorDLModel\\
	    .pretrained("sentence_detector_dl", "en")\\
	    .setInputCols(["documents"])\\
	    .setOutputCol("questions")

	# Load the closed-book QA model and set the task string used for each question
	t5 = T5Transformer\\
	    .pretrained("google_t5_small_ssm_nq")\\
	    .setTask('trivia question:')\\
	    .setInputCols(["questions"])\\
	    .setOutputCol("answers")

	pipeline = Pipeline().setStages([document_assembler, sentence_detector, t5])

	data = spark.createDataFrame([["What is the capital of France?"]]).toDF("text")
	result = pipeline.fit(data).transform(data)
	result.select("answers.result").show(truncate=False)
	''', language='python')

	# Example Output
	st.text("""
	+---------------------------+
	|answers.result             |
	+---------------------------+
	|[Paris]                    |
	+---------------------------+
	""")

	# Model Info Section
	st.markdown('<div class="sub-title">Choosing the Right T5 Model</div>', unsafe_allow_html=True)

	st.markdown("""
	<div class="section">
	<p>Several T5 models are available, each pre-trained on different datasets and tasks. For closed-book question answering, it is important to select a model that has been fine-tuned specifically for this task. The model used in the example, "google_t5_small_ssm_nq", is a small T5 checkpoint fine-tuned on Natural Questions for closed-book question answering.</p>
	<p>For more complex or varied questions, consider a larger T5 variant (for example, a Base or Large size) that has also been fine-tuned for question answering; it may improve accuracy at the cost of more memory and slower inference. Explore the available models on the <a class="link" href="https://sparknlp.org/models?annotator=T5Transformer" target="_blank">Spark NLP Models Hub</a> to find the best fit for your application.</p>
	</div>
	""", unsafe_allow_html=True)

	# References Section
	st.markdown('<div class="sub-title">References</div>', unsafe_allow_html=True)

	st.markdown("""
	<div class="section">
	<ul>
	<li><a class="link" href="https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html" target="_blank">Google AI Blog</a>: Exploring Transfer Learning with T5</li>
	<li><a class="link" href="https://sparknlp.org/models?annotator=T5Transformer" target="_blank">Spark NLP Model Hub</a>: Explore T5 models</li>
	<li>Model used: <a class="link" href="https://sparknlp.org/2022/05/31/google_t5_small_ssm_nq_en_3_0.html" target="_blank">google_t5_small_ssm_nq</a></li>
	<li><a class="link" href="https://github.com/google-research/text-to-text-transfer-transformer" target="_blank">GitHub</a>: T5 Transformer repository</li>
	<li><a class="link" href="https://arxiv.org/abs/1910.10683" target="_blank">T5 Paper</a>: Detailed insights from the developers</li>
	</ul>
	</div>
	""", unsafe_allow_html=True)

	st.markdown('<div class="sub-title">Community & Support</div>', unsafe_allow_html=True)

	st.markdown("""
	<div class="section">
	<ul>
	<li><a class="link" href="https://sparknlp.org/" target="_blank">Official Website</a>: Documentation and examples</li>
	<li><a class="link" href="https://join.slack.com/t/spark-nlp/shared_invite/zt-198dipu77-L3UWNe_AJ8xqDk0ivmih5Q" target="_blank">Slack</a>: Live discussion with the community and team</li>
	<li><a class="link" href="https://github.com/JohnSnowLabs/spark-nlp" target="_blank">GitHub</a>: Bug reports, feature requests, and contributions</li>
	<li><a class="link" href="https://medium.com/spark-nlp" target="_blank">Medium</a>: Spark NLP articles</li>
	<li><a class="link" href="https://www.youtube.com/channel/UCmFOjlpYEhxf_wJUDuz6xxQ/videos" target="_blank">YouTube</a>: Video tutorials</li>
	</ul>
	</div>
	""", unsafe_allow_html=True)

	st.markdown('<div class="sub-title">Quick Links</div>', unsafe_allow_html=True)

	st.markdown("""
	<div class="section">
	<ul>
	<li><a class="link" href="https://sparknlp.org/docs/en/quickstart" target="_blank">Getting Started</a></li>
	<li><a class="link" href="https://nlp.johnsnowlabs.com/models" target="_blank">Pretrained Models</a></li>
	<li><a class="link" href="https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples/python/annotation/text/english" target="_blank">Example Notebooks</a></li>
	<li><a class="link" href="https://sparknlp.org/docs/en/install" target="_blank">Installation Guide</a></li>
	</ul>
	</div>
	""", unsafe_allow_html=True)