cheesyFishes committed: Update README.md

README.md (changed):
# 🗂️ LlamaIndex 🦙

[![PyPI - Downloads](https://img.shields.io/pypi/dm/llama-index)](https://pypi.org/project/llama-index/)
[![GitHub contributors](https://img.shields.io/github/contributors/jerryjliu/llama_index)](https://github.com/jerryjliu/llama_index/graphs/contributors)
[![Discord](https://img.shields.io/discord/1059199217496772688)](https://discord.gg/dGcwcsnxhU)
[![Ask AI](https://img.shields.io/badge/Phorm-Ask_AI-%23F2777A.svg?&logo=data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iNSIgaGVpZ2h0PSI0IiBmaWxsPSJub25lIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPgogIDxwYXRoIGQ9Ik00LjQzIDEuODgyYTEuNDQgMS40NCAwIDAgMS0uMDk4LjQyNmMtLjA1LjEyMy0uMTE1LjIzLS4xOTIuMzIyLS4wNzUuMDktLjE2LjE2NS0uMjU1LjIyNmExLjM1MyAxLjM1MyAwIDAgMS0uNTk1LjIxMmMtLjA5OS4wMTItLjE5Mi4wMTQtLjI3OS4wMDZsLTEuNTkzLS4xNHYtLjQwNmgxLjY1OGMuMDkuMDAxLjE3LS4xNjkuMjQ2LS4xOTFhLjYwMy42MDMgMCAwIDAgLjItLjEwNi41MjkuNTI5IDAgMCAwIC4xMzgtLjE3LjY1NC42NTQgMCAwIDAgLjA2NS0uMjRsLjAyOC0uMzJhLjkzLjkzIDAgMCAwLS4wMzYtLjI0OS41NjcuNTY3IDAgMCAwLS4xMDMtLjIuNTAyLjUwMiAwIDAgMC0uMTY4LS4xMzguNjA4LjYwOCAwIDAgMC0uMjQtLjA2N0wyLjQzNy43MjkgMS42MjUuNjcxYS4zMjIuMzIyIDAgMCAwLS4yMzIuMDU4LjM3NS4zNzUgMCAwIDAtLjExNi4yMzJsLS4xMTYgMS40NS0uMDU4LjY5Ny0uMDU4Ljc1NEwuNzA1IDRsLS4zNTctLjA3OUwuNjAyLjkwNkMuNjE3LjcyNi42NjMuNTc0LjczOS40NTRhLjk1OC45NTggMCAwIDEgLjI3NC0uMjg1Ljk3MS45NzEgMCAwIDEgLjMzNy0uMTRjLjExOS0uMDI2LjIyNy0uMDM0LjMyNS0uMDI2TDMuMjMyLjE2Yy4xNTkuMDE0LjMzNi4wMy40NTkuMDgyYTEuMTczIDEuMTczIDAgMCAxIC41NDUuNDQ3Yy4wNi4wOTQuMTA5LjE5Mi4xNDQuMjkzYTEuMzkyIDEuMzkyIDAgMCAxIC4wNzguNThsLS4wMjkuMzJaIiBmaWxsPSIjRjI3NzdBIi8+CiAgPHBhdGggZD0iTTQuMDgyIDIuMDA3YTEuNDU1IDEuNDU1IDAgMCAxLS4wOTguNDI3Yy0uMDUuMTI0LS4xMTQuMjMyLS4xOTIuMzI0YTEuMTMgMS4xMyAwIDAgMS0uMjU0LjIyNyAxLjM1MyAxLjM1MyAwIDAgMS0uNTk1LjIxNGMtLjEuMDEyLS4xOTMuMDE0LS4yOC4wMDZsLTEuNTYtLjEwOC4wMzQtLjQwNi4wMy0uMzQ4IDEuNTU5LjE1NGMuMDkgMCAuMTczLS4wMS4yNDgtLjAzM2EuNjAzLjYwMyAwIDAgMCAuMi0uMTA2LjUzMi41MzIgMCAwIDAgLjEzOS0uMTcyLjY2LjY2IDAgMCAwIC4wNjQtLjI0MWwuMDI5LS4zMjFhLjk0Ljk0IDAgMCAwLS4wMzYtLjI1LjU3LjU3IDAgMCAwLS4xMDMtLjIwMi41MDIuNTAyIDAgMCAwLS4xNjgtLjEzOC42MDUuNjA1IDAgMCAwLS4yNC0uMDY3TDEuMjczLjgyN2MtLjA5NC0uMDA4LS4xNjguMDEtLjIyMS4wNTUtLjA1My4wNDUtLjA4NC4xMTQtLjA5Mi4yMDZMLjcwNSA0IDAgMy45MzhsLjI1NS0yLjkxMUExLjAxIDEuMDEgMCAwIDEgLjM5My41NzIuOTYyLjk2MiAwIDAgMSAuNjY2LjI4NmEuOTcuOTcgMCAwIDEgLjMzOC0uMTRDMS4xMjIuMTIgMS4yMy4xMSAxLjMyOC4xMTlsMS41OTMuMTRjLjE2LjAxNC4zLjA0Ny40MjMuMWExLjE3IDEuMTcgMCAwIDEgLjU0NS40NDhjLjA2MS4wOTUuMTA5LjE5My4xNDQuMjk1YTEuNDA2IDEuNDA2IDAgMCAxIC4wNzcuNTgzbC0uMDI4LjMyMloiIGZpbGw9IndoaXRlIi8+CiAgPHBhdGggZD0iTTQuMDgyIDIuMDA3YTEuNDU1IDEuNDU1IDAgMCAxLS4wOTguNDI3Yy0uMDUuMTI0LS4xMTQuMjMyLS4xOTIuMzI0YTEuMTMgMS4xMyAwIDAgMS0uMjU0LjIyNyAxLjM1MyAxLjM1MyAwIDAgMS0uNTk1LjIxNGMtLjEuMDEyLS4xOTMuMDE0LS4yOC4wMDZsLTEuNTYtLjEwOC4wMzQtLjQwNi4wMy0uMzQ4IDEuNTU5LjE1NGMuMDkgMCAuMTczLS4wMS4yNDgtLjAzM2EuNjAzLjYwMyAwIDAgMCAuMi0uMTA2LjUzMi41MzIgMCAwIDAgLjEzOS0uMTcyLjY2LjY2IDAgMCAwIC4wNjQtLjI0MWwuMDI5LS4zMjFhLjk0Ljk0IDAgMCAwLS4wMzYtLjI1LjU3LjU3IDAgMCAwLS4xMDMtLjIwMi41MDIuNTAyIDAgMCAwLS4xNjgtLjEzOC42MDUuNjA1IDAgMCAwLS4yNC0uMDY3TDEuMjczLjgyN2MtLjA5NC0uMDA4LS4xNjguMDEtLjIyMS4wNTUtLjA1My4wNDUtLjA4NC4xMTQtLjA5Mi4yMDZMLjcwNSA0IDAgMy45MzhsLjI1NS0yLjkxMUExLjAxIDEuMDEgMCAwIDEgLjM5My41NzIuOTYyLjk2MiAwIDAgMSAuNjY2LjI4NmEuOTcuOTcgMCAwIDEgLjMzOC0uMTRDMS4xMjIuMTIgMS4yMy4xMSAxLjMyOC4xMTlsMS41OTMuMTRjLjE2LjAxNC4zLjA0Ny40MjMuMWExLjE3IDEuMTcgMCAwIDEgLjU0NS40NDhjLjA2MS4wOTUuMTA5LjE5My4xNDQuMjk1YTEuNDA2IDEuNDA2IDAgMCAxIC4wNzcuNTgzbC0uMDI4LjMyMloiIGZpbGw9IndoaXRlIi8+Cjwvc3ZnPgo=)](https://www.phorm.ai/query?projectId=c5863b56-6703-4a5d-87b6-7e6031bf16b6)
LlamaIndex (GPT Index) is a data framework for your LLM application. Building with LlamaIndex typically involves working with LlamaIndex core and a chosen set of integrations (or plugins). There are two ways to start building with LlamaIndex in Python:

1. **Starter**: [`pip install llama-index`](https://pypi.org/project/llama-index/). A starter Python package that includes core LlamaIndex as well as a selection of integrations.

2. **Customized**: [`pip install llama-index-core`](https://pypi.org/project/llama-index-core/). Install core LlamaIndex and add the LlamaIndex integration packages from [LlamaHub](https://llamahub.ai/) that your application requires. There are over 300 LlamaIndex integration packages that work seamlessly with core, allowing you to build with your preferred LLM, embedding, and vector store providers.
### Important Links

LlamaIndex.TS [(TypeScript/JavaScript)](https://github.com/run-llama/LlamaIndexTS)

[Documentation](https://docs.llamaindex.ai/en/stable/)

[Twitter](https://twitter.com/llama_index)

[Discord](https://discord.gg/dGcwcsnxhU)
### Ecosystem

- LlamaHub [(community library of data loaders)](https://llamahub.ai)
- LlamaLab [(cutting-edge AGI projects using LlamaIndex)](https://github.com/run-llama/llama-lab)

## 🚀 Overview

**NOTE**: This README is not updated as frequently as the documentation. Please check out the documentation above for the latest updates!
### Context

- LLMs are a phenomenal piece of technology for knowledge generation and reasoning. They are pre-trained on large amounts of publicly available data.
- How do we best augment LLMs with our own private data?

We need a comprehensive toolkit to help perform this data augmentation for LLMs.

### Proposed Solution

That's where **LlamaIndex** comes in. LlamaIndex is a "data framework" to help you build LLM apps. It provides the following tools:

- Offers **data connectors** to ingest your existing data sources and data formats (APIs, PDFs, docs, SQL, etc.).
- Provides ways to **structure your data** (indices, graphs) so that this data can be easily used with LLMs.
- Provides an **advanced retrieval/query interface over your data**: feed in any LLM input prompt, get back retrieved context and knowledge-augmented output.
- Allows easy integrations with your outer application framework (e.g. with LangChain, Flask, Docker, ChatGPT, or anything else).

LlamaIndex provides tools for both beginner users and advanced users. Our high-level API allows beginner users to use LlamaIndex to ingest and query their data in 5 lines of code. Our lower-level APIs allow advanced users to customize and extend any module (data connectors, indices, retrievers, query engines, reranking modules) to fit their needs.
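To make the retrieve-then-augment idea above concrete, here is a toy, stdlib-only sketch of what a retrieval step does: rank document chunks by similarity to the query and prepend the best matches to the LLM prompt. The bag-of-words "embedding" and the sample documents are illustrative stand-ins, not LlamaIndex's implementation:

```python
from collections import Counter
from math import sqrt


def embed(text: str) -> Counter:
    # toy "embedding": bag-of-words term counts (real systems use neural embeddings)
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # cosine similarity between two sparse term-count vectors
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # rank documents by similarity to the query and keep the top k
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


docs = [
    "LlamaIndex connects LLMs to your private data.",
    "Bananas are rich in potassium.",
    "A vector index ranks chunks by embedding similarity.",
]
context = retrieve("How do LLMs use private data?", docs)
# the retrieved context is prepended to the prompt before calling the LLM
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: How do LLMs use private data?"
```

A real index swaps the toy embedding for a neural embedding model and a vector store, but the retrieve-then-augment flow is the same.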
## 📄 Documentation

Full documentation can be found [here](https://docs.llamaindex.ai/en/latest/).

Please check it out for the most up-to-date tutorials, how-to guides, references, and other resources!
## 💻 Example Usage

The LlamaIndex Python library is namespaced such that import statements which include `core` imply that the core package is being used. In contrast, those statements without `core` imply that an integration package is being used.

```python
# typical pattern
from llama_index.core.xxx import ClassABC  # core submodule xxx
from llama_index.xxx.yyy import (
    SubclassABC,
)  # integration yyy for submodule xxx

# concrete example
from llama_index.core.llms import LLM
from llama_index.llms.openai import OpenAI
```
|
90 |
+
|
91 |
+
To get started, we can install llama-index directly using the starter dependencies (mainly OpenAI):
|
92 |
+
|
93 |
+
```sh
|
94 |
pip install llama-index
|
95 |
```
|
96 |
|
97 |
+
Or we can do a more custom isntallation:
|
98 |
+
|
99 |
+
```sh
|
100 |
+
# custom selection of integrations to work with core
|
101 |
+
pip install llama-index-core
|
102 |
+
pip install llama-index-llms-openai
|
103 |
+
pip install llama-index-llms-ollama
|
104 |
+
pip install llama-index-embeddings-huggingface
|
105 |
+
pip install llama-index-readers-file
|
106 |
+
```
|
107 |
+
|
108 |
+
To build a simple vector store index using OpenAI:
|
109 |
|
|
|
110 |
```python
|
111 |
import os
|
|
|
112 |
|
113 |
+
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
|
114 |
+
|
115 |
+
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
|
116 |
+
|
117 |
+
documents = SimpleDirectoryReader("YOUR_DATA_DIRECTORY").load_data()
|
118 |
+
index = VectorStoreIndex.from_documents(documents)
|
119 |
```
|
To build a simple vector store index using non-OpenAI models, we can leverage Ollama and HuggingFace. This assumes you've already installed Ollama and have pulled the model you want to use.

```python
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# set the LLM
Settings.llm = Ollama(
    model="llama3.1:latest",
    temperature=0.1,
    request_timeout=360.0,
)

# set the embed model
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5",
    embed_batch_size=2,
)

documents = SimpleDirectoryReader("YOUR_DATA_DIRECTORY").load_data()
index = VectorStoreIndex.from_documents(documents)
```
To query:

```python
query_engine = index.as_query_engine()
query_engine.query("YOUR_QUESTION")
```

Or chat:

```python
chat_engine = index.as_chat_engine(chat_mode="condense_plus_context")
chat_engine.chat("YOUR_MESSAGE")
```
By default, data is stored in-memory.
To persist to disk (under `./storage`):

```python
index.storage_context.persist()
```
To reload from disk:

```python
from llama_index.core import StorageContext, load_index_from_storage

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir="./storage")

# load index
index = load_index_from_storage(storage_context)
```
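Conceptually, persisting and reloading amounts to serializing the index's stores to files in a directory and rebuilding them on load. A stdlib-only sketch of that round trip (the single `docstore.json` file here is an illustrative stand-in, not LlamaIndex's actual storage layout):

```python
import json
import tempfile
from pathlib import Path


def persist(index_data: dict, persist_dir: str) -> None:
    # serialize the "index" to a JSON file inside the storage directory
    path = Path(persist_dir)
    path.mkdir(parents=True, exist_ok=True)
    (path / "docstore.json").write_text(json.dumps(index_data))


def load(persist_dir: str) -> dict:
    # rebuild the in-memory "index" from the persisted file
    return json.loads((Path(persist_dir) / "docstore.json").read_text())


storage_dir = tempfile.mkdtemp()  # stand-in for "./storage"
persist({"doc_1": "LlamaIndex connects LLMs to private data"}, storage_dir)
restored = load(storage_dir)
```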