Usage#
To use this framework, you must first install AutoGluon RAG:
git clone https://github.com/autogluon/autogluon-rag
cd autogluon-rag
# Create a virtual environment (using Python, or conda if you prefer)
python3 -m virtualenv venv
source venv/bin/activate
#Install the package
pip install -e .
You can now use the package in two ways.
Use AutoGluon-RAG through the command line as agrag
:#
AutoGluon-RAG
usage: agrag [-h] [--config_file] [--preset_quality] [--web_urls [...]]
[--base_urls [...]] [--parse_urls_recursive] [--data_dir]
AutoGluon-RAG - Retrieval-Augmented Generation Pipeline
options:
-h, --help show this help message and exit
--config_file Path to the configuration file
--preset_quality Preset quality settings for the RAG pipeline
(default: medium_quality)
--web_urls [ ...] List of URLs to use for RAG
--base_urls [ ...] List of base URLs to restrict web URL parsing.
Only URLs stemming from a base URL will be
processed.
--parse_urls_recursive
Enable recursive parsing of all URLs from the
provided web URL list
--data_dir Directory containing files to use for RAG.
Supports local or S3 paths.
Use AutoGluon-RAG through code:#
from agrag.agrag import AutoGluonRAG
def ag_rag():
agrag = AutoGluonRAG(
preset_quality="medium_quality", # or path to config file
web_urls=["https://auto.gluon.ai/stable/index.html"],
base_urls=["https://auto.gluon.ai/stable/"],
parse_urls_recursive=True,
data_dir="s3://autogluon-rag-github-dev/autogluon_docs/"
)
agrag.initialize_rag_pipeline() # Initializes all modules in the RAG pipeline
agrag.generate_response("What is AutoGluon?") # Generator
if __name__ == "__main__":
ag_rag()
Configuring Parameters for AutoGluon-RAG:#
Using AutoGluonRAG
class#
For a list of configurable parameters that can be passed into the AutoGluonRAG
class, refer to the tutorial here.
Using Configuration File#
You can also use a configuration file with AutoGluonRAG
.
The configuration file contains the specific parameters to use for each module in the RAG pipeline. For an example of a config file, please refer to example_config.yaml
in src/agrag/configs/
. For specific details about the parameters in each individual module, refer to the README
files in each module in src/agrag/modules/
.
There is also a shared
section in the config file for parameters that do not refer to a specific module. Currently, the parameters in shared
are:
pipeline_batch_size: Optional batch size to use for pre-processing stage (Data Processing, Embedding, Vector DB Module). This represents the number of files in each batch. The default value is 20.