Public Procurement


Unstructured Data Extraction

Achievements

  • Developed and deployed a pipeline that uses Retrieval-Augmented Generation (RAG) and Large Language Models (LLMs) to extract key information from public procurement documents.
  • Implemented the solution on Google Cloud Platform (GCP) for scalability and performance.
  • Streamlined document analysis to support decision-making in public procurement.
  • Designed an architecture that supports text extraction, storage, and querying.

Context

This project aimed to streamline analysis and decision-making for public procurement documents. It combines Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs) to extract information from unstructured text and place it in context.

The pipeline consists of the following key components:

  • Text Extraction: Processes unstructured text from public procurement documents so it can be analyzed.
  • Information Retrieval: Uses RAG to identify and extract relevant data points.
  • Scalability: Runs on GCP to support high availability and performance for large-scale processing.
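
The retrieval step listed above can be illustrated with a minimal sketch. It is not the project's implementation: the function names (chunk_text, retrieve), the chunk sizes, and the bag-of-words cosine similarity are assumptions chosen to keep the example self-contained. A production RAG pipeline would more likely use learned embeddings and a vector store.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def tokenize(text: str) -> list[str]:
    """Lowercase and keep alphanumeric runs only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical passages standing in for extracted procurement text.
    passages = [
        "The contract value is 2.5 million euros.",
        "Bids must be submitted by 30 June.",
        "The contracting authority is the City of Example.",
    ]
    print(retrieve("When must bids be submitted?", passages, k=1))
```

The retrieved chunks would then be passed to the LLM as context, so that the model answers from the document text rather than from its general knowledge.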

Technologies Used

  • Google Cloud Platform (GCP): For scalable and reliable deployment.
  • Retrieval-Augmented Generation (RAG): To enhance the accuracy and relevance of information retrieval.
  • Large Language Models (LLMs): For intelligent text processing and querying.
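
As a rough sketch of how an LLM could turn retrieved passages into structured data, the example below builds a prompt and parses a JSON answer. Everything in it is an assumption made for illustration: the field names, the prompt wording, and the llm parameter, which stands for any function that takes a prompt string and returns the model's text response. No specific model or GCP service is implied.

```python
import json
from typing import Callable, Optional, Sequence

# Hypothetical fields; the source does not list which data points were extracted.
FIELDS = ("contracting_authority", "submission_deadline", "estimated_value")


def build_prompt(fields: Sequence[str], passages: Sequence[str]) -> str:
    """Assemble a prompt that asks for a JSON object with the given keys."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    keys = ", ".join(fields)
    return (
        "Using only the passages below, return a JSON object with the keys: "
        f"{keys}. Use null for any value that is not stated.\n\n"
        f"Passages:\n{context}"
    )


def extract_fields(
    passages: Sequence[str],
    llm: Callable[[str], str],
    fields: Sequence[str] = FIELDS,
) -> dict[str, Optional[str]]:
    """Call the model and keep only the expected keys from its answer."""
    raw = llm(build_prompt(fields, passages))
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}  # the model did not return valid JSON
    if not isinstance(data, dict):
        data = {}
    return {name: data.get(name) for name in fields}


if __name__ == "__main__":
    # Stand-in for a real model call, used only to make the sketch runnable.
    def fake_llm(prompt: str) -> str:
        return json.dumps({"submission_deadline": "30 June"})

    print(extract_fields(["Bids must be submitted by 30 June."], fake_llm))
```

Restricting the output to a fixed set of keys and treating malformed responses as empty keeps downstream storage and querying predictable, at the cost of silently dropping anything the model adds beyond those keys.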