Running Apache Kafka Efficiently on the Cloud ft. Adithya Chandra
Focused on optimizing Apache Kafka® performance with maximized efficiency, Confluent’s Product Infrastructure team has been actively exploring opportunities for scaling out Kafka clusters. The team can now run Kafka workloads with half the typical memory usage while saving infrastructure costs, a change they have tested and safely rolled out across Confluent Cloud.
After spending seven years at Amazon Web Services (AWS) working on search services and Amazon Aurora as a software engineer, Adithya Chandra decided to apply his expertise in cluster management, load balancing, elasticity, and performance of search and storage clusters to the Confluent team.
Last year, Confluent shipped Tiered Storage, which moves eligible data from a Kafka broker to remote storage. Because most of the data now lives in remote storage, brokers can be upgraded to better storage volumes backed by solid-state drives (SSDs). Compared to hard disk drives (HDDs), SSDs deliver higher throughput and fast random IO, but cost more per provisioned gigabyte. Given that SSDs handle random IO well and support higher throughput, Confluent started investigating whether it was possible to run Kafka with less RAM, which is much more expensive per gigabyte than SSD storage. Cloud instance types with the same CPU but half the memory were 20% cheaper.
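To make the cost argument concrete, here is a minimal arithmetic sketch. Only the 20% price difference comes from the episode description; the broker count, the hourly price, and the function name `annual_savings` are hypothetical values chosen for illustration.

```python
HOURS_PER_YEAR = 24 * 365


def annual_savings(brokers: int, hourly_price: float, discount: float = 0.20) -> float:
    """Yearly savings from moving every broker to a cheaper instance type.

    `discount` is the relative price difference between the two instance
    types (0.20 reflects the "20% cheaper" figure in the description).
    `brokers` and `hourly_price` are hypothetical inputs.
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    baseline = brokers * hourly_price * HOURS_PER_YEAR
    return baseline * discount


if __name__ == "__main__":
    # Hypothetical fleet: 100 brokers at 1.00 currency unit per hour.
    print(f"{annual_savings(100, 1.00):,.0f} saved per year")
```

The saving scales linearly with fleet size, which is why a per-instance difference of this size matters for a large managed service.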
In this episode, Adithya covers how to run Kafka more efficiently on Confluent Cloud and dives into the following:
- Memory allocation on an instance running Kafka
- What is a JVM heap? Why should it be sized? How much is enough? What are the downsides of a small heap?
- Memory usage of Datadog, Kubernetes, and other processes, and allocating memory correctly
- What is the ideal page cache size? What is a page cache used for? Are there any parameters that can be tuned? How does Kafka use the page cache?
- Testing via the simulation of a variety of workloads using Trogdor
- High-throughput, high-connection, and high-partition tests and their results
- Available cloud hardware and finding the best fit, including choosing the number of instance types, migrating from one instance to another, and using nodepools to migrate brokers safely, one by one
- What do you do when your preferred hardware is not available? Can you run hybrid Kafka clusters if the preferred instance is not widely available?
- Building infrastructure that allows you to perform testing easily and that can support newer hardware faster (ARM processors, SSDs, etc.)
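The memory questions in the list above can be framed as a simple budget: whatever the JVM heap, JVM overhead, and side processes (monitoring agents, Kubernetes components) do not reserve is left for the operating system's page cache. The sketch below is an illustration under assumed numbers, not Confluent's actual sizing; every figure and name in it (`MemoryBudget`, `page_cache_seconds`, the GiB values, the write rate) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryBudget:
    """Rough memory split on a broker instance (all values in GiB, hypothetical)."""
    total_gib: float
    jvm_heap_gib: float
    jvm_overhead_gib: float  # metaspace, thread stacks, direct buffers
    agents_gib: float        # monitoring agents, Kubernetes components, etc.

    @property
    def page_cache_gib(self) -> float:
        """Memory left over for the OS page cache."""
        reserved = self.jvm_heap_gib + self.jvm_overhead_gib + self.agents_gib
        remaining = self.total_gib - reserved
        if remaining < 0:
            raise ValueError("reserved memory exceeds instance memory")
        return remaining


def page_cache_seconds(page_cache_gib: float, write_mib_per_s: float) -> float:
    """Approximate seconds of recent writes that fit in the page cache.

    Assumes the cache holds only recently written log data, which is a
    simplification: reads of older data also compete for the same memory.
    """
    if write_mib_per_s <= 0:
        raise ValueError("write rate must be positive")
    return page_cache_gib * 1024 / write_mib_per_s


if __name__ == "__main__":
    # Hypothetical comparison: same reservations, half the instance memory.
    full = MemoryBudget(total_gib=64, jvm_heap_gib=6, jvm_overhead_gib=1, agents_gib=2)
    half = MemoryBudget(total_gib=32, jvm_heap_gib=6, jvm_overhead_gib=1, agents_gib=2)
    for name, budget in (("full", full), ("half", half)):
        seconds = page_cache_seconds(budget.page_cache_gib, write_mib_per_s=50)
        print(f"{name}: {budget.page_cache_gib} GiB page cache, ~{seconds:.0f} s of writes")
```

Because the fixed reservations do not shrink with the instance, halving total memory cuts the page cache by more than half in this sketch. Consumers that lag beyond what the cache holds read from disk instead, which is where fast random IO on SSDs becomes relevant.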
265 episodes
All episodes
- Apache Kafka 3.5 - Kafka Core, Connect, Streams, & Client Updates (11:25)
- How to use Data Contracts for Long-Term Schema Management (57:28)
- How to use Python with Apache Kafka (31:57)
- Next-Gen Data Modeling, Integrity, and Governance with YODA (55:55)
- Migrate Your Kafka Cluster with Minimal Downtime (1:01:30)
- Real-Time Data Transformation and Analytics with dbt Labs (43:41)
- What is the Future of Streaming Data? (41:29)
- What can Apache Kafka Developers learn from Online Gaming? (55:32)
- How to use OpenTelemetry to Trace and Monitor Apache Kafka Systems (50:01)
- What is Data Democratization and Why is it Important? (47:27)
- Git for Data: Managing Data like Code with lakeFS (30:42)
- Using Kafka-Leader-Election to Improve Scalability and Performance (51:06)
- Real-Time Machine Learning and Smarter AI with Data Streaming (38:56)
- The Present and Future of Stream Processing (31:19)
- Top 6 Worst Apache Kafka JIRA Bugs (1:10:58)
- Learn How Stream-Processing Works The Simplest Way Possible (31:29)
- Building and Designing Events and Event Streams with Apache Kafka (53:06)
- Rethinking Apache Kafka Security and Account Management (41:23)
- Real-time Threat Detection Using Machine Learning and Apache Kafka (29:18)
- Improving Apache Kafka Scalability and Elasticity with Tiered Storage (29:32)
- Decoupling with Event-Driven Architecture (38:38)
- If Streaming Is the Answer, Why Are We Still Doing Batch? (43:58)
- Security for Real-Time Data Stream Processing with Confluent Cloud (48:33)
- Running Apache Kafka in Production (58:44)
- Build a Real Time AI Data Platform with Apache Kafka (37:18)
- Optimizing Apache JVMs for Apache Kafka (1:11:42)
- Application Data Streaming with Apache Kafka and Swim (39:10)
- International Podcast Day - Apache Kafka Edition | Streaming Audio Special (1:02:22)
- Real-Time Stream Processing, Monitoring, and Analytics With Apache Kafka (34:07)
- Reddit Sentiment Analysis with Apache Kafka-Based Microservices (35:23)
- Capacity Planning Your Apache Kafka Cluster (1:01:54)
- Streaming Real-Time Sporting Analytics for World Table Tennis (34:29)
- Real-Time Event Distribution with Data Mesh (48:59)
- Apache Kafka Security Best Practices (39:10)
- What Could Go Wrong with a Kafka JDBC Connector? (41:10)
- Apache Kafka Networking with Confluent Cloud (37:22)
- Event-Driven Systems and Agile Operations (53:22)
- Streaming Analytics and Real-Time Signal Processing with Apache Kafka (1:06:33)
- Blockchain Data Integration with Apache Kafka (50:59)
- Automating Multi-Cloud Apache Kafka Cluster Rollouts (48:29)
- Common Apache Kafka Mistakes to Avoid (1:09:43)