Ingenious Framework For Resilient And Reliable Data Pipeline

KARI VENKATRAM, GEETHA MARY A

Abstract

Data integration is one of the critical requirements for enterprise systems. Data pipelines are used for data integration among distributed databases, and real-time data integration is an essential requirement in the Internet of Things era. The quality of service (QoS) of data pipelines is impacted by data integration issues during data synchronization. Present data pipeline systems focus mainly on real-time data synchronization and throughput, that is, on scalable solutions for distributed data pipeline platforms. In this context, data reliability problems such as data inconsistency, incompleteness, and conflicts in reconciliation have not been prioritized, because controlling and monitoring mechanisms were not a focus. The ingenious framework presented in this article is a comprehensive framework that includes controller, monitoring, and auditing components to address those issues; validation, evaluation, and durability aspects are also considered in this solution. Message reconciliation and automatic retransmission are proposed as part of the validation and auditing components to measure the effectiveness of the framework. Reliability, partitioning, fault tolerance, durability, and scalability are among the measures used to evaluate effectiveness. Features such as auditing, reconciliation, and retransmission of error messages improve QoS. Auto-retransmission restricts the error rate to below 1% in the data pipeline while delivering high throughput (180K/90K messages for producer/consumer) and latency as low as 5 milliseconds for 1-12 nodes. These measures were compared with industry benchmarks, and the framework was found to outperform state-of-the-art existing frameworks.
