Summarize your documents according to your own criteria

Challenge

Financial analysis, auditing, market research and tender responses all depend on the rapid analysis of large volumes of unstructured documents. These documents come in many forms:

  • Annual reports
  • ESG reports
  • Financial analysis reports
  • Calls for tender, requests for proposal (RFPs)…

Analyzing these documents consists of extracting the same kind of information from each one. This information can take different forms: figures, paragraphs, entire pages, a simple name, etc.

Here are some frequent examples:

  • Composition of the board of directors in annual reports
  • Scope 1, 2 and 3 CO2-equivalent emissions in ESG reports
  • Technical data inside an RFP
  • Regional strategy within financial analysis notes

Basic keyword-based approaches cannot cope with this complexity.

This kind of task can mobilize teams of experts for hundreds of hours.

Solution

With its advanced extraction functions (Paragraph Extract, Value Extract, Question Extract), the reciTAL platform provides a suite of intelligent tools to optimize your analysis processes. Combining these tools automates the generation of “summaries” built according to your needs.

  • Paragraph Extract: starting from a first example of text, the platform learns to suggest comparable sections in each document by applying syntactic and semantic similarity algorithms whose performance improves as more examples are provided.
  • Value Extract: by specifying a location in the document (start, end, top of page, bottom of page, proximity to a word or expression), the platform extracts standard entities (NER) or specific entities through user-defined regular expressions.
  • Question Extract: just fill in the list of questions you are interested in and let the engine return the most relevant paragraphs for each question, so your experts only have to review a few pieces of text instead of a whole document.
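
To make the ideas behind these functions more concrete, here is a minimal Python sketch. It is not the reciTAL API: the function names (rank_paragraphs, extract_value_near), the sample paragraphs and the simple bag-of-words cosine similarity are all assumptions chosen for illustration, and the word-overlap score is only a stand-in for the platform's syntactic and semantic similarity algorithms. The sketch ranks paragraphs against an example text or a question, and extracts a value with a regular expression located near a given expression.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_paragraphs(query, paragraphs, top_k=1):
    """Return the top_k paragraphs most similar to the query text.

    The query can be an example paragraph (Paragraph Extract style)
    or a question (Question Extract style). Hypothetical helper.
    """
    query_vec = Counter(tokenize(query))
    scored = [(cosine_similarity(query_vec, Counter(tokenize(p))), p) for p in paragraphs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def extract_value_near(text, anchor, pattern, window=80):
    """Return the first regex match within `window` characters after `anchor`.

    Illustrates the "proximity to a word or expression" idea. Hypothetical helper.
    """
    idx = text.lower().find(anchor.lower())
    if idx == -1:
        return None
    start = idx + len(anchor)
    match = re.search(pattern, text[start:start + window])
    return match.group(0) if match else None


# Invented sample paragraphs, for illustration only.
paragraphs = [
    "The board of directors is composed of nine members, including four independent directors.",
    "Scope 1 emissions amounted to 12,400 tCO2e in 2022, down 8% from the previous year.",
    "Revenue in the Asia-Pacific region grew thanks to new retail partnerships.",
]

best = rank_paragraphs("Who sits on the board of directors?", paragraphs)
value = extract_value_near(paragraphs[1], "Scope 1 emissions", r"\d[\d,]*(?:\.\d+)?")
print(best[0])
print(value)
```

In this toy example, the question about the board is matched to the first paragraph, and the number following "Scope 1 emissions" is pulled out of the second. A production system would rely on learned semantic representations rather than plain word overlap.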

reciTAL assets

SaaS or on-premises deployment

OCR for scanned documents

Complete set of APIs to build complex workflows

Customer benefits

Personalized reports generated to “structure unstructured documents”

Time saving

Paragraph Extract accuracy > 95% with 10 examples.