The project set out to build a scalable, cost-effective, and flexible data system for storing, processing, and analyzing medical records received from clinics in North America (NA).

The unstructured data would be annotated and classified, and the classified data would then be used for retrospective study analysis in R&D activities, investigation of drug efficacy, and evidence-based medicine. This would give healthcare providers easy access to best industry practices and strengthen their capabilities in preventative medicine.

The company had accumulated a large volume of raw data from its customers. The data is heterogeneous and unclassified, so it cannot be used for reports, statistics, data mining, and similar tasks. New data is continuously received from external customers through an Apache Storm and RabbitMQ solution.

Small samples of the data (~100k records) are annotated manually. A framework is then used to train models on the annotated samples and to apply those models to the remaining records, which yields data suitable for further analysis and for making predictions.
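
The annotate, train, and apply workflow described above can be illustrated with a minimal sketch. The description does not name the framework or the model type, so the example below substitutes a small multinomial Naive Bayes text classifier written with the Python standard library only. The function names (`tokenize`, `train`, `predict`), the labels, and the sample records are all hypothetical and exist purely for illustration; a real run would use the ~100k manually annotated records rather than four.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train(annotated):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    label_counts = Counter()            # number of records per label
    word_counts = defaultdict(Counter)  # per-label token frequencies
    vocab = set()
    for text, label in annotated:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"label_counts": label_counts, "word_counts": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-posterior for the given text."""
    total = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, count in model["label_counts"].items():
        score = math.log(count / total)  # log prior
        n_words = sum(model["word_counts"][label].values())
        for tok in tokenize(text):
            # Laplace (add-one) smoothing so unseen tokens do not zero out the score.
            freq = model["word_counts"][label][tok]
            score += math.log((freq + 1) / (n_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical manually annotated sample (stand-in for the ~100k annotated records).
annotated_sample = [
    ("patient reports chest pain and shortness of breath", "cardiology"),
    ("ecg shows irregular heart rhythm", "cardiology"),
    ("rash and itching on forearm", "dermatology"),
    ("skin lesion biopsy scheduled", "dermatology"),
]

# Hypothetical unannotated records (stand-in for the rest of the data).
unlabeled_records = [
    "heart rhythm irregular with chest pain",
    "itching skin rash on neck",
]

model = train(annotated_sample)
predictions = [predict(model, record) for record in unlabeled_records]
print(predictions)  # ['cardiology', 'dermatology']
```

The split into `train` and `predict` mirrors the two phases in the text: the model is fitted once on the small annotated sample and then applied to every remaining record, so the expensive manual annotation effort stays bounded while the classified output grows with the data.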


