Running Apache Kafka in Production
What are some recommendations to consider when running Apache Kafka® in production? Jun Rao, one of the original Kafka creators, as well as an ongoing committer and PMC member, shares the essential wisdom he's gained from developing Kafka and dealing with a large number of Kafka use cases.
Here are 6 recommendations for maximizing Kafka in production:
1. Nail Down the Operational Part
When setting up your cluster, in addition to dealing with the usual architectural issues, make sure to also invest time into alerting, monitoring, logging, and other operational concerns. Managing a distributed system can be tricky and you have to make sure that all of its parts are healthy together. This will give you a chance at catching cluster problems early, rather than after they have become full-blown crises.
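To give that advice a concrete flavor, here is a minimal Java sketch using the standard Kafka Admin API that computes consumer lag, one of the simplest early-warning signals for an unhealthy pipeline. The bootstrap address and consumer group ID are placeholders, and a real setup would feed these numbers into whatever alerting system you already run.

    import org.apache.kafka.clients.admin.*;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    import java.util.Map;
    import java.util.Properties;
    import java.util.stream.Collectors;

    public class LagCheck {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // placeholder address

            try (Admin admin = Admin.create(props)) {
                // Committed offsets for the consumer group we care about (hypothetical group ID).
                Map<TopicPartition, OffsetAndMetadata> committed =
                    admin.listConsumerGroupOffsets("orders-service")
                         .partitionsToOffsetAndMetadata().get();

                // Latest (log-end) offsets for the same partitions.
                Map<TopicPartition, OffsetSpec> request = committed.keySet().stream()
                    .collect(Collectors.toMap(tp -> tp, tp -> OffsetSpec.latest()));
                Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> latest =
                    admin.listOffsets(request).all().get();

                // Lag = log-end offset minus committed offset; alert when it keeps growing.
                committed.forEach((tp, meta) -> {
                    long lag = latest.get(tp).offset() - meta.offset();
                    System.out.printf("%s lag=%d%n", tp, lag);
                });
            }
        }
    }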
2. Reason Properly About Serialization and Schemas Up Front
At the Kafka API level, events are just bytes, which gives your application the flexibility to use various serialization mechanisms. Avro has the benefit of decoupling schemas from data serialization, whereas Protobuf is often preferred by teams accustomed to remote procedure calls; JSON Schema is user-friendly but verbose. When you are choosing your serialization, it's a good time to reason about schemas, which should be well-thought-out contracts between your publishers and subscribers. You should know who owns a schema as well as the path for evolving that schema over time.
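As an illustration of where that serialization choice actually lands, here is a hedged producer sketch in Java. It assumes Confluent's KafkaAvroSerializer and a Schema Registry are available; the bootstrap address, registry URL, topic name, and Order schema are placeholders, not anything prescribed in the episode.

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    import java.util.Properties;

    public class AvroProducerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");        // placeholder address
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
            // Assumes Confluent's Avro serializer and a Schema Registry are on the classpath.
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                      "io.confluent.kafka.serializers.KafkaAvroSerializer");
            props.put("schema.registry.url", "http://schema-registry:8081");          // placeholder URL

            // The schema is the contract between publishers and subscribers; it lives
            // apart from the serialized bytes and can evolve under compatibility rules.
            Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Order\",\"fields\":"
                + "[{\"name\":\"id\",\"type\":\"string\"},{\"name\":\"amount\",\"type\":\"double\"}]}");
            GenericRecord order = new GenericData.Record(schema);
            order.put("id", "order-123");
            order.put("amount", 42.0);

            try (KafkaProducer<String, Object> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("orders", "order-123", order));    // hypothetical topic
            }
        }
    }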
3. Use Kafka As a Central Nervous System Rather Than As a Single Cluster
Teams typically start out with a single, independent Kafka cluster, but they could benefit, even from the outset, by thinking of Kafka more as a central nervous system that they can use to connect disparate data sources. This enables data to be shared among more applications.
4. Utilize Dead Letter Queues (DLQs)
DLQs can keep service delays from blocking the processing of your messages. For example, instead of using a unique topic for each customer to which you need to send data (potentially millions of topics), you may prefer a shared topic, or a series of shared topics, that carries events for all of your customers. But if you are sending to multiple customers from a shared topic and one customer's REST API is down, then instead of delaying the process entirely, you can divert that customer's events into a dead letter queue and process them later from that queue.
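A minimal sketch of that pattern in Java might look like the following: consume from the shared topic, and when delivery to one customer fails, publish the event to a dead letter topic instead of blocking the loop. The topic names, group ID, and the deliverToCustomerApi helper are hypothetical stand-ins, not anything specified in the episode.

    import org.apache.kafka.clients.consumer.*;
    import org.apache.kafka.clients.producer.*;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.kafka.common.serialization.StringSerializer;

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    public class DlqSketch {
        public static void main(String[] args) {
            Properties consumerProps = new Properties();
            consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");   // placeholder
            consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "customer-dispatcher");    // hypothetical group
            consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
            consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

            Properties producerProps = new Properties();
            producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");   // placeholder
            producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
            producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
                 KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
                consumer.subscribe(List.of("customer-events"));                          // shared topic
                while (true) {
                    for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                        try {
                            deliverToCustomerApi(record);                                // hypothetical delivery call
                        } catch (Exception e) {
                            // Divert the failed event to the DLQ so the rest of the batch keeps flowing.
                            producer.send(new ProducerRecord<>("customer-events.dlq",
                                                               record.key(), record.value()));
                        }
                    }
                }
            }
        }

        // Hypothetical stand-in for the per-customer REST call that may be down.
        static void deliverToCustomerApi(ConsumerRecord<String, String> record) throws Exception {
            // ... call the customer's endpoint here ...
        }
    }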
5. Understand Compacted Topics
By default in Kafka topics, data is kept by time. But there is also another type of topic, a compacted topic, which stores data by key and replaces old data with new data as it comes in. This is particularly useful for working with data that is updateable, for example, data that may be coming in through a change-data-capture log. A practical example of this would be a retailer that needs to update prices and product descriptions to send out to all of its locations.
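For reference, a hedged sketch of creating such a compacted topic with the Kafka AdminClient is shown below; the topic name, partition count, and replication factor are placeholders chosen only for illustration.

    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;
    import org.apache.kafka.common.config.TopicConfig;

    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    public class CompactedTopicSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");  // placeholder address

            try (Admin admin = Admin.create(props)) {
                // cleanup.policy=compact keeps only the latest value per key,
                // which suits updateable data such as a "price by product ID" feed.
                NewTopic prices = new NewTopic("product-prices", 6, (short) 3)     // placeholder sizing
                    .configs(Map.of(TopicConfig.CLEANUP_POLICY_CONFIG,
                                    TopicConfig.CLEANUP_POLICY_COMPACT));
                admin.createTopics(List.of(prices)).all().get();
            }
        }
    }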
6. Imagine New Use Cases Enabled by Kafka's Recent Evolution
The biggest recent change in Kafka's history is its migration to the cloud. By using Kafka there, you can reserve your engineering talent for business logic. The unlimited storage enabled by the cloud also means that you can truly keep data forever at reasonable cost, and thus you don't have to build a separate system for your historical data needs.
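On the storage point, the per-topic knob that corresponds to "keep data forever" is the retention setting; a hedged sketch of switching it off for one topic follows. The topic name is a placeholder, and truly unlimited, low-cost retention still depends on the tiered or cloud storage sitting behind the cluster.

    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;
    import org.apache.kafka.common.config.TopicConfig;

    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    public class InfiniteRetentionSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");        // placeholder address

            try (Admin admin = Admin.create(props)) {
                // retention.ms = -1 disables time-based deletion, so the topic keeps data indefinitely.
                ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "orders"); // placeholder topic
                AlterConfigOp keepForever = new AlterConfigOp(
                    new ConfigEntry(TopicConfig.RETENTION_MS_CONFIG, "-1"), AlterConfigOp.OpType.SET);
                admin.incrementalAlterConfigs(Map.of(topic, List.of(keepForever))).all().get();
            }
        }
    }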
EPISODE LINKS
Chapters
1. Intro (00:00:00)
2. Nail down the operational part (00:02:40)
3. Reason properly about serialization and schemas up front (00:08:11)
4. Utilize Dead Letter Queues (00:17:16)
5. Use Kafka as a central nervous system vs. a single cluster (00:27:55)
6. Understand compacted topics (00:36:56)
7. Imagine new use cases enabled by Kafka's recent evolution (00:46:40)
8. It's a wrap! (00:57:01)