In advanced mode, the Chunking transformation improves the effectiveness of natural language processing (NLP) and retrieval-augmented generation (RAG) by breaking text down and converting it to a more efficient form. The transformation can split large pieces of text into smaller segments, called chunks, and it can clean the text and make it more semantically consistent for vector embedding.
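To illustrate the general idea of cleaning and chunking, the following Python sketch normalizes whitespace and then splits the text into fixed-size character chunks that overlap slightly. It is a conceptual example only and does not represent how the Chunking transformation is implemented. The function names and the chunk_size and overlap parameters are assumptions made for illustration.

```python
import re


def clean_text(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split cleaned text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that content near a
    chunk boundary appears in both neighboring chunks.
    Both parameters are hypothetical and chosen only for this sketch.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and less than chunk_size")

    cleaned = clean_text(text)
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(cleaned), step):
        chunks.append(cleaned[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(cleaned):
            break
    return chunks


if __name__ == "__main__":
    sample = "Chunking   splits large text\ninto smaller segments for embedding."
    for i, chunk in enumerate(chunk_text(sample, chunk_size=30, overlap=5)):
        print(i, repr(chunk))
```

Character-based splitting is the simplest strategy to show. Splitting on sentence or paragraph boundaries usually keeps each chunk more semantically coherent.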
You can pass output from a Chunking transformation to a Vector Embedding transformation to create vector embeddings for the text. A Chunking transformation increases the content's relevance before the Target transformation writes the embeddings and metadata to a vector database. For more information, see Vector Embedding transformation.
Note: The Chunking transformation can't run in a serverless runtime environment on AWS, and it can't run on GPUs. If the transformation runs on a GPU-enabled cluster, GPUs are disabled and the transformation uses CPUs instead.