Case study

Invoice Field Recognition – Automation of invoice data recognition, extraction, and processing

Industry: Healthcare
Cooperation period: 2025
A person in a white medical coat holds a tablet and stylus, standing next to an ultrasound machine in a medical office.

About client

HAMMERmed Medical Polska is one of the market leaders in the distribution of medical devices in Poland. The company actively participates in public procurement procedures, offering a broad portfolio of products to medical facilities across the country.

A leading distributor of medical products and modern treatment methods on the Polish market.

About project

The client approached us with the need to improve invoice processing within its operational accounting processes. The existing workflow relied on manually reading data from PDF documents, which was time-consuming and carried a significant risk of human error.

As the organization continued to grow and the volume of documents increased, manual processing became increasingly inefficient and difficult to scale.

Solution

In response to these needs, we delivered a solution enabling automated invoice field recognition, dynamic data transformation, and secure storage in a database system, based on a scalable Microsoft Azure cloud architecture.

A key requirement was maintaining flexibility and the ability to easily adapt the solution to changing business needs.

Additionally, the solution ensures high data consistency and quality in downstream database systems, enabling further use of the data in analytical processes.

Implementation and development

The full scope of work for HAMMERmed consisted of four key phases, ensuring delivery of a solution aligned with the client’s expectations.

Phase I – Data analysis and preparation

  • Review and classification of invoice formats provided by the client,
  • Identification of key fields (e.g. amounts, dates, invoice numbers, suppliers),
  • Design of input and output data structures,
  • Development of data standardization guidelines.
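
As an illustration of what the designed output structure might look like, here is a minimal sketch in Python. The class name, field names, and validation rules are hypothetical; the case study names only the kinds of fields involved (amounts, dates, invoice numbers, suppliers), not the actual schema.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class InvoiceRecord:
    """Hypothetical output structure; all field names are illustrative."""

    invoice_number: str
    supplier: str
    issue_date: date
    gross_amount: Decimal
    currency: str = "PLN"  # assumed default, not stated in the case study

    def __post_init__(self) -> None:
        # Basic consistency checks applied before a record reaches the database.
        if not self.invoice_number.strip():
            raise ValueError("invoice_number must not be empty")
        if self.gross_amount < 0:
            raise ValueError("gross_amount must not be negative")


record = InvoiceRecord(
    invoice_number="FV/2025/001",
    supplier="Example Supplier Sp. z o.o.",
    issue_date=date(2025, 3, 14),
    gross_amount=Decimal("1234.56"),
)
```

Typed fields with early validation are one way to support the data consistency goal described in the Solution section: malformed values are rejected before they are stored.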

Phase II – Value extraction and transformation mechanism

A module was designed and implemented to enable:

  • Reading content from PDF files using a specialized PDF field reader,
  • Application of a dynamic parameters dictionary defining fields to be extracted and transformation rules,
  • Real-time data modification, such as date standardization, amount formatting, and text normalization,
  • Flexible extension of extracted fields without changes to application code.
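
The idea of a parameters dictionary driving extraction and transformation can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the delivered module: the rule names, field labels, and supported formats (`PARAMETERS`, `TRANSFORMS`, dates written as DD.MM.YYYY, amounts written with a decimal comma) are all hypothetical.

```python
import re
from datetime import datetime
from decimal import Decimal


def normalize_text(value: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", value).strip()


def standardize_date(value: str) -> str:
    """Convert a date in one of the assumed formats to ISO 8601."""
    for fmt in ("%d.%m.%Y", "%d-%m-%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {value!r}")


def format_amount(value: str) -> Decimal:
    """Parse an amount with a decimal comma and space separators, e.g. '1 234,56'."""
    cleaned = re.sub(r"\s", "", value).replace(",", ".")
    return Decimal(cleaned).quantize(Decimal("0.01"))


# Registry of available transformation rules (names are illustrative).
TRANSFORMS = {
    "text": normalize_text,
    "date": standardize_date,
    "amount": format_amount,
}

# Parameters dictionary: raw field label -> output name and transformation rule.
# In practice this could be loaded from a JSON file or a database table.
PARAMETERS = {
    "Invoice No": {"target": "invoice_number", "transform": "text"},
    "Issue date": {"target": "issue_date", "transform": "date"},
    "Gross": {"target": "gross_amount", "transform": "amount"},
}


def apply_parameters(raw_fields: dict, parameters: dict) -> dict:
    """Apply the configured rule to every configured field present in the input."""
    result = {}
    for source_name, rule in parameters.items():
        if source_name not in raw_fields:
            continue
        transform = TRANSFORMS[rule["transform"]]
        result[rule["target"]] = transform(raw_fields[source_name])
    return result


raw = {"Invoice No": "  FV/2025/001 ", "Issue date": "14.03.2025", "Gross": "1 234,56"}
row = apply_parameters(raw, PARAMETERS)
```

Because the fields and rules live in data rather than in code, adding a new field means adding one entry to `PARAMETERS`. Application code changes only when a new kind of transformation is needed.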

Phase III – Process automation

  • Development of a fully automated invoice processing workflow: retrieving files from a shared repository, launching the extraction mechanism, transforming data according to defined rules, and saving results to the database,
  • Implementation of scheduling and monitoring processes,
  • Use of a serverless architecture, enabling scalability and handling a variable number of documents without the need to manage servers.
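
The workflow steps above (retrieve, extract, transform, save) can be sketched as a single function. This is a simplified sketch under stated assumptions: a local folder stands in for the shared repository, SQLite stands in for the target database, and `extract_fields` and `transform_fields` are hypothetical placeholders for the real extraction and transformation modules.

```python
import logging
import sqlite3
import tempfile
from pathlib import Path

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("invoice_workflow")


def extract_fields(pdf_path: Path) -> dict:
    """Placeholder for the PDF field reader; returns fixed demo values."""
    return {"invoice_number": f" {pdf_path.stem} ", "gross_amount": "100.00"}


def transform_fields(fields: dict) -> dict:
    """Placeholder for rule-based transformation; here it only trims whitespace."""
    return {name: value.strip() for name, value in fields.items()}


def process_inbox(inbox: Path, conn: sqlite3.Connection) -> int:
    """Process every PDF in `inbox` that is not stored yet; return the number saved."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices ("
        "file_name TEXT PRIMARY KEY, invoice_number TEXT, gross_amount TEXT)"
    )
    processed = 0
    for pdf_path in sorted(inbox.glob("*.pdf")):
        already_saved = conn.execute(
            "SELECT 1 FROM invoices WHERE file_name = ?", (pdf_path.name,)
        ).fetchone()
        if already_saved:
            log.info("skipping %s (already processed)", pdf_path.name)
            continue
        try:
            fields = transform_fields(extract_fields(pdf_path))
            conn.execute(
                "INSERT INTO invoices VALUES (?, ?, ?)",
                (pdf_path.name, fields["invoice_number"], fields["gross_amount"]),
            )
            conn.commit()
            processed += 1
        except Exception:
            # One bad document is logged and must not stop the whole batch.
            log.exception("failed to process %s", pdf_path.name)
    return processed


with tempfile.TemporaryDirectory() as tmp:
    inbox = Path(tmp)
    (inbox / "FV-001.pdf").write_bytes(b"%PDF-1.4 demo")
    (inbox / "FV-002.pdf").write_bytes(b"%PDF-1.4 demo")
    conn = sqlite3.connect(":memory:")
    first_run = process_inbox(inbox, conn)   # saves both files
    second_run = process_inbox(inbox, conn)  # nothing new to save
    row_count = conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
    conn.close()
```

Keying the table on the file name makes repeated runs safe, which matters when the workflow is started by a scheduler and may see the same files again.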

Phase IV – Cloud infrastructure implementation

The solution was delivered using Microsoft Azure and included:

  • Runtime infrastructure based on Docker containerization,
  • Serverless environment (e.g. Azure Functions or Azure Container Apps),
  • Shared storage for incoming invoices,
  • Secure database for extracted data,
  • Logging and observability mechanisms,
  • Automated deployment and versioning of components.

Solution architecture

01. Input and output layer – HAMMERmed

  1. Invoices (PDF) are uploaded to shared storage,
  2. Processed data is saved to the output database.

02. Compute layer – Azure

  • Containerized Invoice Scrapper App running in a Docker container in a serverless environment,
  • Processing triggered automatically by events (e.g. new file upload).
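
To illustrate event-triggered processing, the sketch below decides whether an incoming storage event should start invoice processing. It assumes events shaped like Azure Event Grid "blob created" notifications; the case study does not say which trigger mechanism was used, and the container name `invoices` and the function name are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional

BLOB_CREATED = "Microsoft.Storage.BlobCreated"


def invoice_blob_from_event(event: dict, container: str = "invoices") -> Optional[str]:
    """Return the blob name if the event announces a new PDF in `container`, else None."""
    if event.get("eventType") != BLOB_CREATED:
        return None
    # Event Grid storage subjects look like:
    # /blobServices/default/containers/<container>/blobs/<blob name>
    prefix = f"/blobServices/default/containers/{container}/blobs/"
    subject = event.get("subject", "")
    if not subject.startswith(prefix):
        return None
    blob_name = subject[len(prefix):]
    if PurePosixPath(blob_name).suffix.lower() != ".pdf":
        return None
    return blob_name


sample_event = {
    "eventType": "Microsoft.Storage.BlobCreated",
    "subject": "/blobServices/default/containers/invoices/blobs/2025/FV-001.pdf",
    "data": {"contentType": "application/pdf"},
}
blob_to_process = invoice_blob_from_event(sample_event)
```

Filtering on event type, container, and file extension keeps the compute layer from starting for uploads that are not invoices.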

03. Application logic layer

  • PDF field reader – extracts fields from documents,
  • Parameters dictionary – defines fields and transformation rules,
  • Transform value – real-time data processing module,
  • Final data preparation and save – prepares the final records and persists them in the database.
Architecture diagram of the invoice processing application: invoice PDF files flow from shared storage to the “Invoice scrapper app”, which runs in a Docker container in a serverless environment on Microsoft Azure, and the results are saved to a database. The process covers reading PDF fields, transforming values using the parameters dictionary, and final data preparation and saving.

“The 3Soft team delivered a comprehensive implementation for us – from data analysis and standardization, through the development of a content extraction module and transformation logic, to the design of a modern cloud architecture. By leveraging serverless mechanisms and containerization, the developed solution can flexibly handle varying document volumes without engaging internal IT resources.

The entire process – from invoice ingestion, through data extraction and transformation using Machine Learning-based solutions, to data persistence – was fully automated. This significantly reduced operational processing time and eliminated human errors.

We appreciate the professionalism of the 3Soft team, their ability to quickly understand the business context, and their flexibility in execution. We confidently recommend 3Soft as a reliable, high-quality technology partner in the area of intelligent business process automation.”

Tomasz Rajca

IT Project Manager, HAMMERmed Medical Polska

Key results

Implementation of the solution enabled the client to:

  • Reduce invoice data processing time through full automation,
  • Eliminate errors resulting from manual data entry,
  • Enable process scalability depending on workload,
  • Flexibly introduce new rules and fields without changing application logic,
  • Increase the efficiency of financial and operational processes across the organization.
