FUW TRENDS IN SCIENCE & TECHNOLOGY JOURNAL

(A Peer Review Journal)
e–ISSN: 2408–5162; p–ISSN: 2048–5170

AUTOMATIC SUMMARIZATION OF LEGAL DOCUMENTS USING SUMY
Pages: 307-315
Shakirat Aderonke Salihu et al


Keywords: Contracts, Privacy Policy, SUMY, ATS, ToS

Abstract

Automatic Text Summarization (ATS) is a natural language processing technique that extracts or generates a shorter version of a lengthy text while preserving its overall meaning. The growth of digital organizations has produced an influx of lengthy digital contracts, including Privacy Policies (PP) and Terms of Service (ToS). This paper therefore aims to develop an ATS system for such contracts. Sumy, a Python-based library, was used to summarize these contracts, supplied either as plain text or as a URL. Sumy implements multiple extractive techniques, such as Luhn, Edmundson, Latent Semantic Analysis (LSA), and TextRank, and accepts input in multiple languages. Following an in-depth assessment of these techniques, LSA was found to perform best on PP and ToS, with an F1-score of 76.1%, while Luhn scored lowest at 46.3%. It is recommended that organizations adopt this system to improve contract readability and save time.
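
The abstract reports F1-scores for each technique but does not state how they were computed. As an illustration only, the sketch below assumes a unigram-overlap comparison (in the style of ROUGE-1) between a generated summary and a human-written reference summary. The function names `tokenize` and `overlap_f1`, the tokenization rule, and the sample sentences are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def overlap_f1(candidate, reference):
    """Return unigram-overlap precision, recall and F1 for two texts.

    Assumption: scoring is based on clipped word overlap, so a word is
    counted at most as many times as it occurs in both texts.
    """
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # multiset intersection
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical reference and system summaries of a privacy-policy clause.
reference = "We collect your email address and share it with advertisers."
candidate = "We share your email address with advertisers."
scores = overlap_f1(candidate, reference)
print(scores)
```

In this toy case every word of the candidate appears in the reference, so precision is 1.0, while recall is lower because the candidate omits three reference words. F1 is the harmonic mean of the two, which penalizes summaries that are either too short to cover the reference or padded with unrelated text.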
