Validation System for Large Data Sets

The solution helps companies collect, sort, and analyze big data in one system

Project Overview

The solution helps large companies collect, sort, and analyze data by category in one system, and format, run, and import files automatically at any convenient time. Businesses get an estimate of data quality and a history of how the data has changed by month or by day. The quality estimate reports the error rate and the total amount of data. The project uses a fully distributed cloud architecture, which allows individual services to be developed, tested, and scaled independently of each other.
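As a rough illustration of what a per-day quality estimate could look like, here is a minimal Python sketch. The `ValidationResult` shape, the `quality_by_day` function, and the sample dates are all assumptions made for this example; the project's actual data model is not described.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class ValidationResult:
    """Outcome of validating one record (hypothetical shape)."""
    checked_on: date
    is_valid: bool


def quality_by_day(results):
    """Return {day: {"total": n, "errors": k, "error_rate": k / n}}."""
    counts = defaultdict(lambda: {"total": 0, "errors": 0})
    for r in results:
        bucket = counts[r.checked_on]
        bucket["total"] += 1
        if not r.is_valid:
            bucket["errors"] += 1
    # Every bucket has total >= 1, so the division is always defined.
    return {
        day: {**c, "error_rate": c["errors"] / c["total"]}
        for day, c in sorted(counts.items())
    }


# Made-up sample data, for illustration only.
sample = [
    ValidationResult(date(2024, 1, 1), True),
    ValidationResult(date(2024, 1, 1), False),
    ValidationResult(date(2024, 1, 2), True),
]
report = quality_by_day(sample)
```

Grouping by `(year, month)` instead of the full date would give the monthly view in the same way.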

In this project, our team's task was to migrate the client's existing system to an external service and validate the data. The main difficulty was the sheer volume of data, which was inconvenient to manage and manipulate.


What Was Done

To automate the transfer of data to an external service and to organize the information, we developed filters for quick search and sorting, which the system uses to select valid data. These filters also help to find anomalies and errors in huge data sets.
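The idea of filters that separate valid data from errors can be sketched in a few lines of Python. This is only an illustration under assumptions: the rule names, the record fields (`id`, `amount`), and the `split_valid` helper are hypothetical and are not taken from the project.

```python
from typing import Callable

Record = dict
Rule = Callable[[Record], bool]

# Hypothetical rules; real rules would depend on the client's data.
RULES: dict[str, Rule] = {
    "has_id": lambda r: bool(r.get("id")),
    "amount_is_non_negative": lambda r: (
        isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0
    ),
}


def split_valid(records, rules=RULES):
    """Return (valid, errors); errors holds (record, failed_rule_names) pairs."""
    valid, errors = [], []
    for record in records:
        failed = [name for name, rule in rules.items() if not rule(record)]
        if failed:
            errors.append((record, failed))
        else:
            valid.append(record)
    return valid, errors


# Made-up sample records, for illustration only.
records = [
    {"id": "A1", "amount": 10.5},
    {"id": "", "amount": 3},
    {"id": "A3", "amount": -1},
]
valid, errors = split_valid(records)
```

Keeping the names of the failed rules next to each rejected record makes it possible to report which kind of error occurred, not only how many.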

Data can be loaded automatically through the user interface, or files can be sent manually to sources or to the cloud. The application accepts a range of data formats, such as Excel and JSON. The user can also set up a schedule for launching files, for example once a week, or on certain days at certain times.
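A schedule of the kind described above ("on certain days at certain times") comes down to computing the next launch time. The Python sketch below shows one way to do that; the `next_run` function and its parameters are assumptions for illustration, and time zones are ignored for simplicity.

```python
from datetime import datetime, time, timedelta


def next_run(now, weekdays, at):
    """Return the first datetime strictly after `now` that falls on one of
    `weekdays` (0 = Monday ... 6 = Sunday) at clock time `at`."""
    if not weekdays:
        raise ValueError("at least one weekday is required")
    # Today plus the next 7 days covers every weekday, including the case
    # where today matches but the scheduled time has already passed.
    for offset in range(8):
        day = now.date() + timedelta(days=offset)
        candidate = datetime.combine(day, at)
        if day.weekday() in weekdays and candidate > now:
            return candidate
    raise ValueError("weekdays must contain values between 0 and 6")


# 1 January 2024 was a Monday; the run time of 09:00 has already passed.
now = datetime(2024, 1, 1, 10, 0)
weekly = next_run(now, {0}, time(9, 0))       # Mondays only
twice = next_run(now, {0, 3}, time(9, 0))     # Mondays and Thursdays
```

A "once a week" schedule is the special case with a single weekday in the set.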

The system has a fully distributed cloud architecture and is divided into microservices. The parts are independent of each other, and the site itself sends commands between the microservices. Data is uploaded not only to the database but also to a search index, which is why search and validation take a matter of seconds. Because of this design, you can use a small part of the functionality (for example, 10%) without running the whole system. Individual parts can be scaled separately: for example, two validation modules can be started to work in parallel and then turned on or off as needed to save money.
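The pattern of several validation modules pulling work from a shared queue can be illustrated with a small in-process Python sketch. It uses threads and `queue.Queue` from the standard library as a stand-in for separately deployed services and a cloud message queue; the `validate` check, the record shape, and the function names are hypothetical, and the project's actual wiring is not described here.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to shut down


def validate(record):
    """Placeholder check (hypothetical): a record is valid if it has an id."""
    return bool(record.get("id"))


def worker(tasks, results):
    """Take records from the shared queue until the STOP sentinel arrives."""
    while True:
        record = tasks.get()
        if record is STOP:
            break
        results.put((record, validate(record)))


def run_validation(records, worker_count=2):
    """Validate records with `worker_count` workers running concurrently."""
    tasks, results = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(worker_count)
    ]
    for w in workers:
        w.start()
    for record in records:
        tasks.put(record)
    for _ in workers:
        tasks.put(STOP)  # one sentinel per worker
    for w in workers:
        w.join()
    # Result order depends on thread scheduling, so it is not guaranteed.
    return [results.get() for _ in range(len(records))]


outcome = run_validation([{"id": 1}, {"id": None}, {"id": 2}])
```

Changing `worker_count` is the in-process equivalent of starting or stopping validation modules: the producer side does not need to know how many consumers are running.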

We also developed visualisations of the data in the form of various graphs.

Project team: 2 developers, 1 PM.
Technology: Azure Search Index, .NET Web API, React, Azure Functions, Azure Service Bus, Azure Queue.
