32 subscribers


Real-Time Change Data Capture and Data Integration with Apache Kafka and Qlik
Getting data from a database management system (DBMS) into Apache Kafka® in real time is a subject of ongoing innovation. John Neal (Principal Solution Architect, Qlik) and Adam Mayer (Senior Technical Product Marketing Manager, Qlik) explain how using change data capture (CDC) to ingest data into Kafka enables real-time, data-driven insights.
It can be challenging to ingest data in real time. It is even more challenging when you have multiple data sources, including traditional databases, mainframes, and systems such as SAP and Oracle. Extracting data in batch for transfer and replication is slow and often incurs significant performance penalties, while analytical queries are often even more resource-intensive and prohibitively expensive to run on production transactional databases. CDC instead captures source operations as an ordered, incremental sequence of change events, which are then written to Kafka.
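To make the idea of turning source operations into a sequence of events more concrete, here is a minimal Python sketch. It is not how Qlik or any Kafka connector implements CDC: the `ChangeEvent` shape, the `capture_changes` and `to_kafka_record` helpers, and the sample `customers` table are all hypothetical, and a plain list stands in for the records a real producer would send to a Kafka topic.

```python
import json
from dataclasses import dataclass, asdict
from itertools import count
from typing import Any, Optional


@dataclass
class ChangeEvent:
    sequence: int            # incrementing position in the change stream
    table: str
    op: str                  # "insert", "update", or "delete"
    key: Any                 # primary key of the affected row
    before: Optional[dict]   # row state before the operation, if any
    after: Optional[dict]    # row state after the operation, if any


def capture_changes(table, operations):
    """Turn (op, key, values) source operations into ordered change events."""
    rows = {}                # hypothetical in-memory copy of the table state
    seq = count(1)
    events = []
    for op, key, values in operations:
        before = rows.get(key)
        if op == "delete":
            after = None
            rows.pop(key, None)
        else:
            # Build a new dict so the "before" image is left untouched.
            after = {**(before or {}), **values}
            rows[key] = after
        events.append(ChangeEvent(next(seq), table, op, key, before, after))
    return events


def to_kafka_record(event):
    """Serialize an event as a (key, value) pair of bytes (assumed JSON format)."""
    key = json.dumps({"table": event.table, "key": event.key}).encode()
    value = json.dumps(asdict(event)).encode()
    return key, value


operations = [
    ("insert", 1, {"name": "Ada", "city": "London"}),
    ("update", 1, {"city": "Paris"}),
    ("delete", 1, {}),
]
events = capture_changes("customers", operations)
records = [to_kafka_record(e) for e in events]
```

Keying each record by table and primary key is a design assumption here; in a real Kafka deployment it would keep all changes to one row in the same partition, and therefore in order.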
Once this data is available in Kafka topics, it can be used for both analytical and operational use cases. Data can be consumed and modeled for analytics by individual groups across your organization. Meanwhile, the same Kafka topics can be used to help power microservice applications and help ensure data governance without impacting your production data source. Kafka makes it easy to integrate your CDC data into your data warehouses, data lakes, NoSQL databases, microservices, and any other system.
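The point that one topic can serve analytical and operational consumers independently can be sketched as follows. This is a toy model, not the Kafka client API: `InMemoryTopic`, the group names, and the order events are hypothetical, and the class only imitates a single-partition topic in which each consumer group tracks its own read offset.

```python
from collections import defaultdict


class InMemoryTopic:
    """Toy stand-in for a single-partition Kafka topic (not a real client)."""

    def __init__(self):
        self.log = []                     # append-only event log
        self.offsets = defaultdict(int)   # next offset to read, per consumer group

    def append(self, event):
        self.log.append(event)

    def poll(self, group):
        """Return events this group has not read yet and advance its offset."""
        start = self.offsets[group]
        batch = self.log[start:]
        self.offsets[group] = len(self.log)
        return batch


topic = InMemoryTopic()
topic.append({"op": "insert", "key": 1, "after": {"status": "new", "total": 40}})
topic.append({"op": "update", "key": 1, "after": {"status": "paid", "total": 40}})
topic.append({"op": "insert", "key": 2, "after": {"status": "new", "total": 25}})

# Analytical use: rebuild the latest state per order and aggregate it.
latest = {}
for event in topic.poll("analytics"):
    latest[event["key"]] = event["after"]
total_revenue = sum(row["total"] for row in latest.values())

# Operational use: a separate group reads the same events to act on paid orders.
paid_orders = [
    event["key"]
    for event in topic.poll("shipping-service")
    if event["after"]["status"] == "paid"
]
```

Both groups read from the log rather than from the source database, which is why adding another consumer puts no extra load on the production system in this model.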
Adam and John highlight a few use cases where they see real-time Kafka data ingestion, processing, and analytics moving the needle, including real-time customer predictions, supply chain optimizations, and operational reporting. Finally, Adam and John cap it off with a discussion of why capturing and tracking data changes is critical for improving the quality of the data behind your machine learning models.
EPISODE LINKS
- Fast Track Business Insights with Data in Motion
- Watch the video version of this podcast
- Join the Confluent Community
- Learn more with Kafka tutorials, resources, and guides at Confluent Developer
- Live demo: Intro to Event-Driven Microservices with Confluent
- Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)
265 episodes
All episodes
- Apache Kafka 3.5 - Kafka Core, Connect, Streams, & Client Updates (11:25)
- How to use Data Contracts for Long-Term Schema Management (57:28)
- How to use Python with Apache Kafka (31:57)
- Next-Gen Data Modeling, Integrity, and Governance with YODA (55:55)
- Migrate Your Kafka Cluster with Minimal Downtime (1:01:30)
- Real-Time Data Transformation and Analytics with dbt Labs (43:41)
- What is the Future of Streaming Data? (41:29)
- What can Apache Kafka Developers learn from Online Gaming? (55:32)
- How to use OpenTelemetry to Trace and Monitor Apache Kafka Systems (50:01)
- What is Data Democratization and Why is it Important? (47:27)
- Git for Data: Managing Data like Code with lakeFS (30:42)
- Using Kafka-Leader-Election to Improve Scalability and Performance (51:06)
- Real-Time Machine Learning and Smarter AI with Data Streaming (38:56)
- The Present and Future of Stream Processing (31:19)
- Top 6 Worst Apache Kafka JIRA Bugs (1:10:58)
- Learn How Stream-Processing Works The Simplest Way Possible (31:29)
- Building and Designing Events and Event Streams with Apache Kafka (53:06)
- Rethinking Apache Kafka Security and Account Management (41:23)
- Real-time Threat Detection Using Machine Learning and Apache Kafka (29:18)
- Improving Apache Kafka Scalability and Elasticity with Tiered Storage (29:32)
- Decoupling with Event-Driven Architecture (38:38)
- If Streaming Is the Answer, Why Are We Still Doing Batch? (43:58)
- Security for Real-Time Data Stream Processing with Confluent Cloud (48:33)
- Running Apache Kafka in Production (58:44)
- Build a Real Time AI Data Platform with Apache Kafka (37:18)
- Optimizing Apache JVMs for Apache Kafka (1:11:42)
- Application Data Streaming with Apache Kafka and Swim (39:10)
- International Podcast Day - Apache Kafka Edition | Streaming Audio Special (1:02:22)
- Real-Time Stream Processing, Monitoring, and Analytics With Apache Kafka (34:07)
- Reddit Sentiment Analysis with Apache Kafka-Based Microservices (35:23)
- Capacity Planning Your Apache Kafka Cluster (1:01:54)
- Streaming Real-Time Sporting Analytics for World Table Tennis (34:29)
- Real-Time Event Distribution with Data Mesh (48:59)
- Apache Kafka Security Best Practices (39:10)
- What Could Go Wrong with a Kafka JDBC Connector? (41:10)
- Apache Kafka Networking with Confluent Cloud (37:22)
- Event-Driven Systems and Agile Operations (53:22)
- Streaming Analytics and Real-Time Signal Processing with Apache Kafka (1:06:33)
- Blockchain Data Integration with Apache Kafka (50:59)
- Automating Multi-Cloud Apache Kafka Cluster Rollouts (48:29)
- Common Apache Kafka Mistakes to Avoid (1:09:43)
- Tips For Writing Abstracts and Speaking at Conferences (48:56)
- How I Became a Developer Advocate (29:48)
- Data Mesh Architecture: A Modern Distributed Data Model (48:42)
- Flink vs Kafka Streams/ksqlDB: Comparing Stream Processing Tools (55:55)
- Practical Data Pipeline: Build a Plant Monitoring System with ksqlDB (33:56)
- Scaling Apache Kafka Clusters on Confluent Cloud ft. Ajit Yagaty and Aashish Kohli (49:07)
- Streaming Analytics on 50M Events Per Day with Confluent Cloud at Picnic (34:41)
- Optimizing Apache Kafka's Internals with Its Co-Creator Jun Rao (48:54)
- Using Event-Driven Design with Apache Kafka Streaming Applications ft. Bobby Calderwood (51:09)
- Monitoring Extreme-Scale Apache Kafka Using eBPF at New Relic (38:25)
- Confluent Platform 7.1: New Features + Updates (10:01)
- Scaling an Apache Kafka Based Architecture at Therapie Clinic (1:10:56)
- Bridging Frontend and Backend with GraphQL and Apache Kafka ft. Gerard Klijs (23:13)
- Building Real-Time Data Governance at Scale with Apache Kafka ft. Tushar Thole (42:58)
- Handling 2 Million Apache Kafka Messages Per Second at Honeycomb (41:36)
- Serverless Stream Processing with Apache Kafka ft. Bill Bejeck (42:23)
- The Evolution of Apache Kafka: From In-House Infrastructure to Managed Cloud Service ft. Jay Kreps (46:32)
- Intro to Event Sourcing with Apache Kafka ft. Anna McDonald (30:14)
- Expanding Apache Kafka Multi-Tenancy for Cloud-Native Systems ft. Anna Povzner and Anastasia Vela (31:01)
- Optimizing Cloud-Native Apache Kafka Performance ft. Alok Nikhil and Adithya Chandra (30:40)
- From Batch to Real-Time: Tips for Streaming Data Pipelines with Apache Kafka ft. Danica Fine (29:50)
- Real-Time Change Data Capture and Data Integration with Apache Kafka and Qlik (34:51)
- Modernizing Banking Architectures with Apache Kafka ft. Fotios Filacouris (34:59)
- Running Hundreds of Stream Processing Applications with Apache Kafka at Wise (31:08)
- Lessons Learned From Designing Serverless Apache Kafka ft. Prachetaa Raghavan (28:20)
- Using Apache Kafka as Cloud-Native Data System ft. Gwen Shapira (33:57)
- ksqlDB Fundamentals: How Apache Kafka, SQL, and ksqlDB Work Together ft. Simon Aubury (30:42)
- Explaining Stream Processing and Apache Kafka ft. Eugene Meidinger (29:28)
- Handling Message Errors and Dead Letter Queues in Apache Kafka ft. Jason Bell (37:41)
- Confluent Platform 7.0: New Features + Updates (12:16)
- Real-Time Stream Processing with Kafka Streams ft. Bill Bejeck (35:32)
- Automating Infrastructure as Code with Apache Kafka and Confluent ft. Rosemary Wang (30:08)
- Getting Started with Spring for Apache Kafka ft. Viktor Gamov (32:44)
- Powering Event-Driven Architectures on Microsoft Azure with Confluent (38:42)
- Automating DevOps for Apache Kafka and Confluent ft. Pere Urbón-Bayes (26:08)
- Intro to Kafka Connect: Core Components and Architecture ft. Robin Moffatt (31:18)
- Designing a Cluster Rollout Management System for Apache Kafka ft. Twesha Modi (30:08)
- Apache Kafka 3.0 - Improving KRaft and an Overview of New Features (15:17)
- How to Build a Strong Developer Community with Global Engagement ft. Robin Moffatt and Ale Murray (35:18)