From Batch to Real-Time: Tips for Streaming Data Pipelines with Apache Kafka ft. Danica Fine
Implementing an event-driven data pipeline can be challenging, but doing so within the context of a legacy architecture is even more complex. Having spent three years building a streaming data infrastructure and being on the first team at a financial organization to implement Apache Kafka® event-driven data pipelines, Danica Fine (Senior Developer Advocate, Confluent) shares insights into the development process and how ksqlDB and Kafka Connect became instrumental to the implementation.
By moving away from batch processing to streaming data pipelines with Kafka, data can be distributed with greater scalability and resiliency. Kafka decouples the source systems from the target systems, so you can react to data as it changes while ensuring accurate data in the target system.
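As a rough, hypothetical illustration of that decoupling (not code from the episode), the producing side below only knows about a Kafka topic; it never needs to know which downstream systems will eventually consume the events. The broker address and the orders topic are assumptions made for the sketch.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class OrderEventProducer {
    public static void main(String[] args) {
        // Hypothetical broker address and topic name, for illustration only.
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // The source system publishes the change event and is done; any number
        // of independent consumers can react to the same record later.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "order-123", "{\"status\":\"SHIPPED\"}"));
            producer.flush();
        }
    }
}

Any target system then reads from the same topic with its own consumer group, at its own pace, without the source ever being aware of it.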
In order to transition from monolithic micro-batching applications to real-time microservices that could integrate with a decades-old legacy system, Danica and her team started developing Kafka connectors to connect to the various source and target systems. Key pieces of that work included:
- Kafka connectors: Building two major connectors for the data pipeline: a source connector to stream data from the legacy data source into Kafka, and a sink connector to pipe data from Kafka back into the legacy architecture (a hypothetical connector-registration sketch follows this list).
- Algorithm: Implementing Kafka Streams applications to migrate data from the monolithic architecture to a stream processing architecture (see the Kafka Streams sketch below).
- Data join: Leveraging Kafka Connect and the JDBC source connector to bring in all data streams and complete the pipeline (the JDBC source connector also appears in the sketch below).
- Streams join: Using ksqlDB to join streams; the legacy data system continues to produce one stream of data while the Kafka data pipeline produces another (see the ksqlDB join sketch below).
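As a sketch of the connector-based step (first and third bullets), registering a JDBC source connector against the Kafka Connect REST API might look roughly like the following; the connector name, database, table, and configuration values are hypothetical, and the real pipeline's legacy source will differ.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterJdbcSourceConnector {
    public static void main(String[] args) throws Exception {
        // Hypothetical connector name, connection details, and table.
        String config = """
            {
              "name": "legacy-db-source",
              "config": {
                "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
                "connection.url": "jdbc:postgresql://legacy-db:5432/accounts",
                "connection.user": "connect",
                "connection.password": "secret",
                "mode": "timestamp+incrementing",
                "timestamp.column.name": "updated_at",
                "incrementing.column.name": "id",
                "table.whitelist": "transactions",
                "topic.prefix": "legacy."
              }
            }
            """;

        // Kafka Connect exposes a REST API (port 8083 by default) for managing connectors.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(config))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}

A sink connector pointing back at the legacy system is registered the same way, just with a sink connector class and target-side connection settings.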
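The Kafka Streams step (second bullet) broadly means re-expressing the old batch logic as a continuously running topology. Below is a minimal sketch under assumed topic names (legacy.transactions in, transactions.enriched out); the actual application logic from the episode is not shown here.

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class LegacyMigrationTopology {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "legacy-migration-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Consume the records emitted by the (hypothetical) legacy source connector,
        // apply one step of the former batch algorithm per record, and write the
        // result to a topic the downstream sink connector can read.
        KStream<String, String> legacyEvents = builder.stream("legacy.transactions");
        legacyEvents
                .filter((key, value) -> value != null)
                .mapValues(LegacyMigrationTopology::applyBatchStep)
                .to("transactions.enriched");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }

    // Placeholder for one step of the original batch algorithm.
    private static String applyBatchStep(String value) {
        return value.trim();
    }
}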
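For the stream-stream join (last bullet), the ksqlDB side amounts to a CREATE STREAM ... AS SELECT with a join window. Here is a hedged sketch submitted through ksqlDB's Java client; the stream names, columns, and the one-hour window are illustrative assumptions, not taken from the episode.

import io.confluent.ksql.api.client.Client;
import io.confluent.ksql.api.client.ClientOptions;

public class JoinLegacyAndPipelineStreams {
    public static void main(String[] args) throws Exception {
        // Hypothetical ksqlDB server address (default REST port 8088).
        ClientOptions options = ClientOptions.create()
                .setHost("localhost")
                .setPort(8088);
        Client client = Client.create(options);

        // Join the stream still produced by the legacy system with the new
        // pipeline's stream over a one-hour window; all names are illustrative.
        String joinSql =
                "CREATE STREAM reconciled_transactions AS "
              + "SELECT l.id, l.amount AS legacy_amount, p.amount AS pipeline_amount "
              + "FROM legacy_transactions l "
              + "JOIN pipeline_transactions p WITHIN 1 HOUR "
              + "ON l.id = p.id "
              + "EMIT CHANGES;";
        client.executeStatement(joinSql).get();

        client.close();
    }
}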
As a final tip, Danica suggests breaking algorithms down into discrete process steps. She also describes how her experience relates to the Data Pipelines course on Confluent Developer and encourages anyone interested in learning more to check it out.
EPISODE LINKS
- Data Pipelines course
- Introduction to Streaming Data Pipelines with Apache Kafka and ksqlDB
- Guided Exercise on Building Streaming Data Pipelines
- Migrating from a Legacy System to Kafka Streams
- Watch the video version of this podcast
- Join the Confluent Community
- Learn more with Kafka tutorials, resources, and guides at Confluent Developer
- Live demo: Intro to Event-Driven Microservices with Confluent
- Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)
ALL EPISODES
- Apache Kafka 3.5 - Kafka Core, Connect, Streams, & Client Updates (11:25)
- How to use Data Contracts for Long-Term Schema Management (57:28)
- How to use Python with Apache Kafka (31:57)
- Next-Gen Data Modeling, Integrity, and Governance with YODA (55:55)
- Migrate Your Kafka Cluster with Minimal Downtime (1:01:30)
- Real-Time Data Transformation and Analytics with dbt Labs (43:41)
- What is the Future of Streaming Data? (41:29)
- What can Apache Kafka Developers learn from Online Gaming? (55:32)
- How to use OpenTelemetry to Trace and Monitor Apache Kafka Systems (50:01)
- What is Data Democratization and Why is it Important? (47:27)
- Git for Data: Managing Data like Code with lakeFS (30:42)
- Using Kafka-Leader-Election to Improve Scalability and Performance (51:06)
- Real-Time Machine Learning and Smarter AI with Data Streaming (38:56)
- The Present and Future of Stream Processing (31:19)
- Top 6 Worst Apache Kafka JIRA Bugs (1:10:58)
- Learn How Stream-Processing Works The Simplest Way Possible (31:29)
- Building and Designing Events and Event Streams with Apache Kafka (53:06)
- Rethinking Apache Kafka Security and Account Management (41:23)
- Real-time Threat Detection Using Machine Learning and Apache Kafka (29:18)
- Improving Apache Kafka Scalability and Elasticity with Tiered Storage (29:32)
- Decoupling with Event-Driven Architecture (38:38)
- If Streaming Is the Answer, Why Are We Still Doing Batch? (43:58)
- Security for Real-Time Data Stream Processing with Confluent Cloud (48:33)
- Running Apache Kafka in Production (58:44)
- Build a Real Time AI Data Platform with Apache Kafka (37:18)
- Optimizing Apache JVMs for Apache Kafka (1:11:42)
- Application Data Streaming with Apache Kafka and Swim (39:10)