DoK Talks #103 - Performant and Version-Aware Analytics With Spark & lakeFS on K8s // Itai Admi
https://go.dok.community/slack
https://dok.community/
ABSTRACT OF THE TALK
Spark and lakeFS are revolutionizing large-scale, version-aware data processing. Is it possible to run this architecture on Kubernetes? We’ll cover the fastest way to get this environment up and running and the benefits it brings. Finally, we’ll show how horizontal scaling and the lakeFS Hadoop Filesystem avoid processing bottlenecks as workloads increase.
BIO
Itai is an R&D team leader at Treeverse, the company behind open-source lakeFS. He thrives on finding creative solutions to complex problems, especially when they involve code. Previously, Itai worked at Microsoft and Ridge on data infrastructure, tooling, and performance. Itai received his B.Sc. in Computer Science and an MBA from Tel Aviv University.
KEY TAKE-AWAYS FROM THE TALK
- Importance of building reproducible data pipelines.
- Managing your data the same way you manage your code.
243 episodes
All episodes
- Implementing Data & Databases on K8s within the Dutch Government | DoKC Town Hall 44:54
- Unsticking Ourselves from Glue: Migrating PayIt’s Data Pipelines to Argo Workflows and Hera | DoKC Town Hall 23:17
- Repel Boarders! How to find a Kubernetes operator that really protects your data | DoKC Town Hall 19:22
- DoK + Apache Spark | DoKC Town Hall 19:52
- DoK @ Comcast - Deliver Business Outcomes & Improved DevX with Data Services on K8s | DoKC Town Hall 16:43
- DoK Talks - What is Kafka? The rise of one of the world's most used streaming data technologies // Abbey Russell 15:28
- DoK Talks - (almost)Everything you need to know about stateful cloud native network applications // W Watson 43:39
- The Outer Nerd #001 - Dungeons & Dragons - Why should you care? // Abhi Vaidyanatha, Fabian Met & Chase Christensen 58:25
- DoK Talks #155 - Databases at the edge with K3s and ARM devices // Sergio Méndez 49:40
- DoK Talks #154 - StatefulSets in K8 // Srinivas Karnati 31:55
- Data-driven Diversity, Equity, and Inclusion // Lisa-Marie Namphy, Melissa Logan, Tiffany Jachja, Audra Montenegro & Cortney Nickerson (DoK Day North America 2022) 19:50
- Formula 1 telemetry processing using Apache Kafka on Kubernetes // Paolo Patierno (DoK Day North America 2022) 15:36
- Choosing Kubernetes for Stateful Applications // Akshay Ram & Peter Schuurman (DoK Day North America 2022) 18:31
- Kubernetes 360º - Data driven observability - from Secrets to logs // Ben Hirschberg (DoK Day North America 2022) 17:11
- Shifting Left Stateful Applications In Kubernetes // Viktor Farcic (DoK Day North America 2022) 15:52
- Medical - Healthcare Data on Kubernetes // Olyvia Rakshit & Prasad Dorbala (DoK Day North America 2022) 13:41
- Highly Available Postgres Clusters In Kubernetes // John Long & Jonathan Gonzalez (DoK Day North America 2022) 15:04
- Inter-Cluster PostreSQL on Kubernetes // Julian Fischer (DoK Day North America 2022) 17:07
- Open Source Databases on Kubernetes- Best Practices // Peter Zaitsev (DoK Day North America 2022) 16:04
- The Kubernetes Native Database // Jeffrey Carpenter (DoK Day North America 2022) 16:26
- Databases on Kubernetes: Why are they important? // With Bhavin Shah, Xing Yang, Gabriele Bartolini & Patrick McFadin (DoK Day North America 2022) 34:51
- Data streaming on Kubernetes // Yaniv Ben Hemo (DoK Day North America 2022) 13:51
- Architecting Your First Event Driven Serverless Streaming Applications on K8 // Timothy Spann (DoK Day North America 2022) 13:29
- Fybrik - A Kubernetes based platform for governed data use // Flora Gilboa-Solomon, Alexey Roytman, Maryna Strelchuk & Barry Hijkoop (DoK Day North America 2022) 20:59
- The Challenges of Data Processing On Kubernetes - A look at Spark, Flink, Dask, and Ray // Holden Karau (DoK Day North America 2022) 20:09
- Scaling our SaaS offering to thousands of clusters // Dax McDonald (DoK Day North America 2022) 21:04
- Why we decided to migrate our Jaeger storage to ClickHouse on Kubernetes // Arul Jegadish Francis (DoK Day North America 2022) 13:48
- Building a Digital Factory for the Sheet Metal Industry // Elie Assi (From the DoK Day North America 2022) 20:48
- How we built our Big Data Stack (almost) entirely on top of Kubernetes // Neylson Crepalde (From DoK Day NA 2022) 16:00
- Dok Talks #153 - CRD Panel // Eyar Zilberman & Álvaro Hernández 58:05