SummaLegal
- #Legal
A solution for summarizing legal-domain text documents using the OpenAI GPT-3/4 Davinci model. The goal was to create a set of scripts that could process legal documents of up to 25k words and generate a summary in a specific format.
- Natural Language Processing

Impact
Legal Document Summarization Solution:
- Algorithm that converts PDF to text and produces a short summary via the OpenAI GPT-3/4 API
Services we provided
Algorithm that converts PDF to text and produces a short summary via the OpenAI GPT-3/4 API
Tech Stack
GPT 3.5
GPT 4
Flask
PyPDF2

Challenges and Solutions
🧐 Challenges
- Text compression: Condensing the text by 50% while preserving critical information.
- Accuracy and quality control: Ensuring the accuracy of the compressed text through a rigorous quality control process.
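As a rough illustration of the compression target, the check below measures a summary's length against the original by word count. It is only a sketch: the function names, the word-count metric, and the 10% tolerance are assumptions made for this example, not details of the delivered quality control process.

```python
def compression_ratio(original: str, summary: str) -> float:
    """Return the summary length as a fraction of the original, by word count."""
    original_words = len(original.split())
    if original_words == 0:
        raise ValueError("original text is empty")
    return len(summary.split()) / original_words


def meets_target(original: str, summary: str,
                 target: float = 0.5, tolerance: float = 0.1) -> bool:
    """True when the ratio is within `tolerance` of `target` (both assumed values)."""
    return abs(compression_ratio(original, summary) - target) <= tolerance
```

A length check like this only covers the compression side; whether critical information survived still needs a separate review.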
💡 Solutions
We used the OpenAI GPT-3 Davinci model to perform recursive chunk summarization on the legal documents.
- The recursive chunk summarization algorithm broke the text into smaller chunks of sentences and summarized each chunk using the GPT-3 model
- Summarized chunks were combined and summarized again until the final output was reached
- Output was formatted into three sections: “What this means,” “Why it matters,” and “Some other details that are relevant.”
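The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the helper names, the naive regex sentence splitter, the `max_words` chunk size, and the `max_rounds` guard are all assumptions. The model call is passed in as a plain `summarize` function, so no particular OpenAI client API is implied.

```python
import re
from typing import Callable, List

# Assumption: any function mapping a text chunk to a shorter text,
# e.g. a wrapper around a model API call.
Summarizer = Callable[[str], str]

SECTION_TITLES = (
    "What this means",
    "Why it matters",
    "Some other details that are relevant",
)


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def chunk_sentences(sentences: List[str], max_words: int) -> List[str]:
    """Group whole sentences into chunks of at most `max_words` words.

    A single sentence longer than `max_words` becomes its own chunk.
    """
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def recursive_summarize(text: str, summarize: Summarizer,
                        max_words: int = 1500, max_rounds: int = 10) -> str:
    """Summarize chunk by chunk, then re-summarize the joined summaries
    until the text fits in one chunk (or `max_rounds` is reached)."""
    for _ in range(max_rounds):
        chunks = chunk_sentences(split_sentences(text), max_words)
        if len(chunks) <= 1:
            break
        text = " ".join(summarize(chunk) for chunk in chunks)
    return summarize(text)  # final pass over the single remaining chunk


def format_output(what: str, why: str, details: str) -> str:
    """Lay out the three sections under their titles."""
    bodies = (what, why, details)
    return "\n\n".join(f"{t}\n{b}" for t, b in zip(SECTION_TITLES, bodies))
```

Because `summarize` is injected, the recursion can be exercised with a stub (for example, one that returns the first sentence of each chunk) before wiring in a real model call. The `max_rounds` guard stops the loop if a summarizer fails to shorten its input.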
User flow