metadata
title: Raccoon
emoji: 🦝
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: 1.10.0
python_version: 3.9
app_file: app.py
pinned: false
license: mit
Raccoon
Installation
It is recommended to use a virtual environment created with venv.
The following steps set up the project:
- If using Apple Silicon, install Rust:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
and cmake:
brew install cmake
- Create the virtual environment:
python3 -m venv .venv
- Activate the virtual environment:
source .venv/bin/activate
- To deactivate the virtual environment, run
deactivate
within the virtual environment.
- Install the required packages:
.venv/bin/pip install -r requirements.txt
.venv/bin/pip install -e .
- Create a custom search engine in Google.
- Create an API key for the custom search engine.
- Add the custom search engine ID and API key to .streamlit/secrets.toml:
google_search_api_key = "api-key"
google_search_engine_id = "search-engine-id"
- To start the interface:
streamlit run app.py
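As a rough illustration of how the two secrets configured above could be used, here is a minimal sketch that builds a request URL for the Google Custom Search JSON API. The helper name `build_search_url` and its parameters are hypothetical and not taken from this repository. The commented-out part assumes the app reads the values through `st.secrets`.

```python
from urllib.parse import urlencode

# Public endpoint of the Google Custom Search JSON API.
SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key: str, engine_id: str, query: str, num_results: int = 10) -> str:
    """Return the request URL for one Custom Search query (hypothetical helper)."""
    if not 1 <= num_results <= 10:
        # The API accepts between 1 and 10 results per request.
        raise ValueError("num_results must be between 1 and 10")
    params = {"key": api_key, "cx": engine_id, "q": query, "num": num_results}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# Inside the Streamlit app the values would come from .streamlit/secrets.toml
# (assumption about how the app is wired):
#   import streamlit as st
#   url = build_search_url(
#       st.secrets["google_search_api_key"],
#       st.secrets["google_search_engine_id"],
#       "health care",
#   )
url = build_search_url("api-key", "search-engine-id", "health care", num_results=5)
print(url)
```

Keeping the URL construction separate from the HTTP call makes it easy to check without network access or real credentials.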
Todo
- Improve fetched content.
- Fix issue of duplicate content extracted by beautifulsoup.
- Exclude code from content.
- Find sentences that contain the search keywords.
- Find sentences that contain the search keywords, taking into account different spellings (e.g. "health care" vs. "healthcare").
- Get some content from every search result.
- Divs with text and tags: extract the text from the tags, then decompose the tags. Keep the order of the content and avoid duplicates.
- Summarization currently requires truncation. Find a solution where it is not needed.
- Support German content with a language switcher.
- Improve queries to include more keywords (expand abbreviations and define context).
- Control the number of results from the UI.
- Control summary length via settings: https://docs.streamlit.io/library/advanced-features/session-state
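One possible approach to the keyword-sentence item above, sketched under assumptions: sentences are split on `.`, `!` and `?`, and both the sentence and the keyword are normalized by lowercasing and dropping every non-alphanumeric character, so "health care", "health-care" and "healthcare" compare equal. The function names are hypothetical, and the ASCII-only normalization would need extending for German content.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def find_keyword_sentences(text: str, keywords: list[str]) -> list[str]:
    """Return the sentences of `text` that contain any keyword, ignoring spacing and hyphens."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Skip keywords that normalize to an empty string (they would match everything).
    targets = [normalize(k) for k in keywords if normalize(k)]
    return [s for s in sentences if any(t in normalize(s) for t in targets)]


text = (
    "Health care costs are rising. The weather is nice. "
    "Many healthcare workers agree! Is health-care reform needed?"
)
for sentence in find_keyword_sentences(text, ["healthcare"]):
    print(sentence)
```

Because spaces are removed before matching, a keyword can also match across unrelated word boundaries (for example "heart" inside "the art"), so a stricter token-based comparison may be preferable for short keywords.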