Introduction to Pinecone Connector
You can use Pinecone Connector to write unstructured text files, such as PDFs, as vector embeddings along with their associated metadata to the Pinecone vector database.
You can use a Pinecone connection as a target in mappings in advanced mode. When you run a mapping in advanced mode or a task based on a mapping in advanced mode, Data Integration analyzes the data logic in the mapping and automatically assigns that logic to run on an advanced cluster.
Example
You run a law firm that stores a massive repository of legal documents, contracts, and case files in PDF format in Microsoft Azure Data Lake Storage Gen2.
Legal professionals need efficient ways to search and retrieve relevant documents quickly, often using complex queries that go beyond simple keyword searches.
To enhance the speed and accuracy of document retrieval, you want to integrate your data into the Pinecone vector database. This integration allows legal professionals to efficiently query high-dimensional vectors that represent complex data, such as text and tables, streamlining their search processes and enhancing overall productivity.
To integrate your data into Pinecone, perform the following steps:
1. Create a mapping in advanced mode.
2. Add a Source transformation to read data from Microsoft Azure Data Lake Storage Gen2.
3. Add a Chunking transformation to split the text into smaller manageable chunks for enhanced vector representation.
4. Add an Embedding transformation to convert the text to vector embeddings.
5. Add a Target transformation to use the Pinecone connection and load the vector embeddings with their associated metadata to Pinecone.
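The chunking, embedding, and loading steps above can be pictured with a minimal Python sketch. It is a conceptual illustration only, not how Data Integration implements the transformations, which you configure in the mapping. The function names, the chunk size and overlap values, the document ID, and the file path are all hypothetical, and `toy_embed` is a placeholder that stands in for a real embedding model.

```python
import hashlib
import math


def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping character windows (illustrative sizes)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def toy_embed(text, dim=8):
    """Placeholder embedding, NOT a real model: hash each token into one of
    `dim` buckets, count the hits, then L2-normalize the counts."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_records(doc_id, text, source_path):
    """Pair each chunk's vector with an ID and metadata."""
    records = []
    for i, chunk in enumerate(chunk_text(text)):
        records.append({
            "id": f"{doc_id}-chunk-{i}",
            "values": toy_embed(chunk),
            "metadata": {"source": source_path, "chunk_index": i, "text": chunk},
        })
    return records


# Hypothetical document text and path, used only for this illustration.
sample_text = "This agreement is made between the parties. " * 20
records = build_records("contract-001", sample_text, "contracts/contract-001.pdf")
print(len(records), "records; first ID:", records[0]["id"])
```

Each record carries an ID, a vector, and metadata, which is the general shape of a vector record in Pinecone. In the mapping, the Target transformation writes the records for you, so no client code is needed. Note that the vector length produced by the embedding step must match the dimension of the target Pinecone index.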