Player FM - Internet Radio Done Right
Added two years ago
Content provided by The Data Bros and The Firebolt Data Bros. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by The Data Bros and The Firebolt Data Bros or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://he.player.fm/legal.
Database Technology in the Age of AI with DuckDB Labs co-creator Hannes Mühleisen
In this episode of The Data Engineering Show, host Benjamin and co-host Eldad sit down with Hannes Mühleisen, CEO of DuckDB Labs and co-creator of DuckDB.
Together, they:
- Talk about the journey of DuckDB, an open-source analytical database system designed as a universal data wrangling tool.
- Explain how DuckDB differs from SQLite, contrasting analytical and transactional use cases.
- Discuss DuckDB’s ability to run in-process as a component and its approach to innovation, including building its own Parquet reader.
- Explore DuckDB’s simple and efficient ecosystem, which lets developers add custom functionality without compromising the stability of the core.
- Consider Hannes' perspective on the role of AI in databases.
- Delve into the system’s infrastructure, its design choices, and the team’s dedication to delivering a consistently reliable database system.
If you enjoyed this episode, make sure to subscribe, rate, and review it on Apple Podcasts, Spotify, and YouTube Podcasts.
Hannes Mühleisen is the CEO of DuckDB Labs and a professor in the Netherlands, known for co-creating DuckDB, an open-source analytical database system. With a research background in the Database Architectures group at CWI, he has led the development of DuckDB as a universal data wrangling tool that can run everywhere from phones to space satellites. Under his leadership, DuckDB has reached 10 million downloads per month and become a go-to solution for analytical database needs. His commitment to keeping DuckDB lightweight, portable, and hardware-agnostic while maintaining high performance has changed how developers approach analytical database solutions. As both an academic and a technology leader, Hannes brings unique insights into database architecture, open-source development, and the future of analytical data processing.
Episode Highlights:
- The Purpose of DuckDB (01:04)
Hannes describes what DuckDB is and what it is designed to do: a tool that understands SQL and is built specifically to simplify complex analytical use cases.
- SQLite vs DuckDB (02:53)
Hannes compares the two systems. SQLite, he says, is an amazing system, but it is meant for transactional use cases rather than analytical queries, whereas DuckDB is designed specifically for analytical use cases.
- The Importance of Collaboration (08:14)
Hannes stresses the need for community collaboration, since the database engine space has only about a hundred people worldwide, all trying to solve the same problems. He shares his admiration for a team in Munich, praising them for implementing concepts that had previously only been described in papers.
- The Component-Based Architecture of DuckDB (11:25)
Hannes highlights a distinctive feature of DuckDB: it can be used as a component. He explains that the in-process architecture succeeds because of the in-memory data sharing it makes possible.
- The Parquet Reader Journey (17:51)
Hannes explains that he built his own Parquet reader out of necessity, although he would have preferred not to. He recounts how a developer from Germany, Uwe Korn, donated a Parquet reader to the Apache Arrow project, where it became so tightly coupled to the rest of the project that the two were hard to use independently. Hannes adds that a competent Parquet reader has no choice but to become a database engine, which he finds one of the interesting things about building one.
- The Role of AI in Database Interaction (22:41)
Hannes says he does not think AI has a place inside the database engine itself. Where he does see a role for it is optimization, noting that the researchers who built their careers on optimization are out of jobs. He explains that AI should assist with tasks rather than take over execution entirely.
- SQL - A Defined Interface (29:20)
Hannes introduces the relational API, a tool for building queries programmatically, which he says simplifies a programmer's work. Although he agrees that a well-defined interface is important for components like databases, he also argues that SQL can provide relatively well-defined behavior within a single system.
- The Golden Age of Databases (38:57)
Hannes concludes the episode by thanking Firebolt and other engineers for taking on core engine work. He shares his excitement about the current golden age of databases, which is showcasing what is possible.
Quotes:
- “DuckDB is a universal data wrangling tool. It is a relational data management system that speaks SQL designed to do well on analytical use cases.”
- “We call ourselves the SQLite for analytics because it explains the original design goal of DuckDB very well.”
- “Within the database engine space, we are all working to solve the same problems, and that's like, a hundred of us on the planet.”
- “It actually turns out in order to make a competent parquet reader, you do need query execution. There is just no way around it.”
- “I really like this golden age of databases we are in and personally, as somebody who really likes tables and SQL, I'm quite happy to see things like Firebolt and others really working on core engine stuff.”
The Data Engineering Show is handcrafted by our friends over at: fame.so
Previous guests include: Joseph Machado of LinkedIn, Matthew Weingarten of Disney, Joe Reis and Matt Housley, authors of The Fundamentals of Data Engineering, Zach Wilson of Eczachly Inc, Megan Lieu of Deepnote, Erik Heintare of Bolt, Lior Solomon of Vimeo, Krishna Naidu of Canva, Mike Cohen of Substack, Jens Larsson of Ark, Gunnar Tangring of Klarna, Yoav Shmaria of Similarweb and Xiaoxu Gao of Adyen.
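Check out our three most downloaded episodes:
- Zach Wilson on What Makes a Great Data Engineer
- Joe Reis and Matt Housley on The Fundamentals of Data Engineering
- Bill Inmon, The Godfather of Data Warehousing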
58 episodes
All episodes
How Rising Wave Is Redefining Real-Time Data with Postgres Power (31:36)
In this episode of The Data Engineering Show, host Benjamin and co-host Eldad sit down with Yingjun Wu, founder and CEO of Rising Wave, to explore the evolution of stream processing systems and the innovations his company is bringing to the space.
What you’ll learn:
- Yingjun's journey from academic research in stream processing to founding Rising Wave, and the challenges of building trust in a new database system.
- How Rising Wave's architecture, which uses S3 as primary storage, scales within seconds, while other systems can take hours to scale.
- The competitive landscape of stream processing, where Rising Wave's Postgres compatibility provides a significant ease-of-use advantage.
- How one major company reduced its CPU requirements from 20,000 to just 600 by switching from a traditional stream processing system to Rising Wave.
- The rising importance of Apache Iceberg as a destination for stream processing output, helping companies avoid vendor lock-in.
- How streaming systems fit into modern data stacks, especially as companies seek to avoid being locked into proprietary systems.
Yingjun Wu is the founder and CEO of Rising Wave, a stream processing system built in Rust and designed with a cloud-native architecture. With a PhD focused on stream processing and database systems, Yingjun previously worked at Redshift and IBM Research before founding Rising Wave. His company has developed a system that achieves significant performance and resource efficiency advantages over traditional stream processing solutions, while maintaining Postgres compatibility for ease of use.
Episode Highlights:
- The Origins of Rising Wave (00:30)
Yingjun shares his background in stream processing from his PhD days and explains how his experience at Redshift revealed the need for better stream processing solutions, especially since many data warehouse workloads involve data ingested from streaming sources like Kinesis or Kafka.
- Building a System from Scratch (04:10)
Yingjun describes the challenging first two to three years of developing Rising Wave without customers, highlighting how trust is a major barrier for new database systems. After 2.5 years, they secured their first customers, including a startup and several larger companies, which helped establish Rising Wave's credibility.
- The Current Stream Processing Landscape (07:47)
Benjamin asks about the current stream processing space, and Yingjun positions Rising Wave as a leader, particularly for SQL-based workloads. He highlights several key advantages of Rising Wave, including its Rust-based implementation and S3-based storage architecture.
- S3 as Primary Storage (10:27)
Yingjun explains their decision to use S3 as primary storage from day one, despite its slowness and expense. He discusses how they have optimized for these challenges and says he would still make the same architectural choice today because of benefits like simplified state management and superior elastic scaling.
- The Business Model (13:52)
Rising Wave offers open-source, cloud, and on-premise versions of its product. Yingjun notes that many highly regulated industries require on-premise deployment, including customers in the banking and aerospace sectors.
- Typical Users and Competitive Advantages (15:01)
When asked about their typical users, Yingjun explains that they compete directly with Flink but have an ease-of-use advantage thanks to Postgres compatibility. Their users are either new to stream processing or migrating from systems like Spark Streaming or Flink because of performance issues or development complexity.
- Apache Iceberg Integration (19:25)
Yingjun discusses how Apache Iceberg is emerging as an important destination for Rising Wave output, as companies seek to avoid vendor lock-in with proprietary data warehouses. He explains that Rising Wave typically performs ETL functions before data is sent to Iceberg tables.
- The Future of Data Management (32:06)
The conversation concludes with a discussion about Iceberg becoming a "single source of truth" for data, with multiple specialized query engines potentially accessing the same data. Yingjun and Eldad share perspectives on how this shift away from proprietary data lock-in is changing the data ecosystem.
Episode Resources: Rising Wave website; Yingjun Wu on LinkedIn

Revolutionizing Data Governance with DataStrato’s Unified Open Source Approach (23:36)
In this episode of The Data Engineering Show, the bros sit down with Lisa Cao, Product Manager at DataStrato, to explore data catalogs and Apache Gravitino, a unified metadata lake used to manage access and perform data governance for all data sources.
What You’ll Learn:
- How Apache Gravitino differs from others like Unity Catalog and Polaris by being able to support multiple catalog systems.
- What the “Push-Down Permission Management” security model is and how to implement it across different data systems.
- How to maintain consistent governance across various query engines like Spark, Trino, and Flink.
- Why interoperability, flexibility, and the open source ecosystem are becoming more important dynamics in data infrastructure than performance benchmarking.
- How to evaluate new data tools based on their real-world adoption rather than social media hype.
If you enjoyed this episode, make sure to subscribe, rate, and review it on Apple Podcasts, Spotify, and YouTube Podcasts.
Lisa Cao is a Product Manager at DataStrato, specializing in AI/ML product partnerships and developer relations. With deep expertise in data catalog technologies and open-source ecosystems, she plays a key role in developing Apache Gravitino, an ASF incubating project that provides a unified governance and security layer for diverse data systems. Her work in developing extensible catalog frameworks has helped organizations manage complex data environments across multiple platforms.
Episode Highlights:
- What is Apache Gravitino? (01:24)
Apache Gravitino is a meta-catalog that serves as a unified data governance and security layer for managing different data systems. Lisa shares that Gravitino was the first to release an Iceberg REST catalog, which was then open sourced for the general community; Polaris and Unity Catalog were later announced as open source as well. She highlights that although Gravitino, Polaris, and Unity Catalog are very similar, Gravitino differs in that it can support multiple catalogs.
- Unifying AI/ML and Big Data Stack (03:15)
Gravitino offers more than a data catalog: its model catalogs are a first step toward merging the two worlds of AI/ML and big data catalogs. Lisa shares the goal of effective management, that is, creating a system that can store and manage different types of models, track changes to the models, and control access to the models.
- Simplifying Data Governance (10:49)
Think of Gravitino as a “traffic cop” that helps to manage and secure data from multiple sources. It is crucial to have a system that provides unified access control across all data sources, allowing teams to manage access and data governance so that ML teams don't have to worry about access. Lisa says that Apache Gravitino makes data accessible to different teams and users while making sure that it is secure and governed appropriately.
- Gravitino’s Query Engine Solution (21:34)
Every query engine has its own way of managing data, which makes it difficult to switch between engines because everything has to be reconfigured. Lisa highlights that Gravitino solves this problem by providing a single layer of data governance that works across multiple query engines.
- Navigating the Fast-Paced World of Data Engineering (24:41)
Lisa talks about how fast the data engineering space is moving and shares some advice for keeping up: don’t try to learn everything at once, don't go too deep into every tool, and look for real-world adoption. She warns that social media hype can amplify the messaging around a new tool and make it seem as if everyone is using it, when in reality actual adoption is hard to see.
Episode Resources: Apache Gravitino website

Database Technology in the Age of AI with DuckDB Labs co-creator Hannes Mühleisen (30:52)

AI and Data Movement: Trends and Best Practices with Estuary’s Daniel Pálma (30:33)
In this episode of The Data Engineering Show , the bros sit with Daniel Pálma, Head of Marketing at Estuary. Join them as they; Talk about Daniel’s career transition from data engineering to marketing and how his background in data engineering has been a tremendous help to his marketing competence. Discuss the role of AI in the evolution of data movement ensuring a faster and easier process of creating data pipelines. Shine light on the challenges of vector databases and structured data in AI applications. Delve into the future of Apache Iceberg and data lakehouses, highlighting their current challenges. Shares insights on the golden age of data expressing the need for more data engineers, data analysts and data practitioners in the data space. If you enjoyed this episode, make sure to subscribe, rate, and review it on Apple Podcasts, Spotify, and YouTube Podcasts, instructions on how to do this are [insert link]. Daniel Pálma serves as Head of Marketing at Estuary, bringing a unique blend of technical expertise and marketing acumen to the data integration space. With nearly a decade of experience as a data engineer across startups, enterprises, and consulting roles, Daniel made a strategic pivot to marketing to help bridge the gap between complex technical solutions and their practical applications for data practitioners. His background in data engineering enables him to deeply understand the customers' challenges and create authentic, education-focused marketing content that resonates with technical audiences. Daniel’s thought leadership and content creation in the data engineering space, combined with his hands-on technical experience, positions him as a valuable voice in conversations about the evolution of data infrastructure and integration technologies. 
The Data Engineering Show is handcrafted by our friends over at: fame.so Previous guests include: Joseph Machado of LinkedIn, Matthew Weingarten of Disney, Joe Reis and Matt Housley, authors of The Fundamentals of Data Engineering, Zach Wilson of Eczachly Inc, Megan Lieu of Deepnote, Erik Heintare of Bolt, Lior Solomon of Vimeo, Krishna Naidu of Canva, Mike Cohen of Substack, Jens Larsson of Ark, Gunnar Tangring of Klarna, Yoav Shmaria of Similarweb and Xiaoxu Gao of Adyen. Check out our three most downloaded episodes: Zach Wilson on What Makes a Great Data Engineer; Joe Reis and Matt Housley on The Fundamentals of Data Engineering; Bill Inmon, The Godfather of Data Warehousing…

1 AI and Data Change Management with Chad Sanderson, CEO Gable AI 36:43
In this episode of The Data Engineering Show, host Benjamin and co-host Eldad sit down with Chad Sanderson, CEO and co-founder of Gable AI, to explore the world of data change management. Join them as they: delve into the challenges of data quality, how it degrades over time, and the one-sided data quality checks on the “last mile” of the data supply chain; talk about how Gable works through a three-layer flow: identifying data production points, tracing the data flow, and communicating the impact of changes before they reach production; explain why the gap between data producers and consumers needs to be bridged and how Gable emphasizes effective communication and a shared understanding of data change management across teams; shed light on how AI can enhance data management by extracting semantics from code and effectively managing the translation output; and discuss Chad’s vision for 2025, which is to help companies start to care about data and how changes made to data affect other people. Chad Sanderson is the CEO and co-founder of Gable AI, a data change management platform. Chad has over a decade of experience in the data engineering and infrastructure space, holding significant roles at major companies like Microsoft, Oracle, and Sephora, where he focused on data quality and governance challenges. He is a former Head of Data at Convoy, a LinkedIn writer, and a published author. He lives in Seattle, Washington, and is the Chief Operator of the Data Quality Camp. His journey from data scientist to data engineer and ultimately to CEO was driven by a desire to transform how organizations manage and utilize data. Gable AI addresses the complexities of the data supply chain by providing tools for code scanning, data contracts, and governance as code, enabling teams to proactively manage data changes and their impact. If you enjoyed this episode, make sure to subscribe, rate, and review it on Apple Podcasts, Spotify, and YouTube.
Episode Resources: Gable AI website, Chad Sanderson on LinkedIn

1 Tech Stacks and Tradeoffs: Xudo's Founder on Picking the Right Tools for BI Success 24:56
Wouter Trappers is the founder of Xudo and shares his slightly unconventional path from philosopher to data consultant with the Bros in this latest episode of The Data Engineering Show. Wouter’s grounding in philosophy has proved to be a shaping influence on his approach to business intelligence. For Wouter, BI is much more than a software solution: it is all about change management and aligning leadership with data projects. They discuss: From Excel to Expert: From basic Excel tasks to full mastery of BI tools like QlikView, Wouter has blended his technical and philosophical approaches to data to become a bona fide expert. Data Strategy as Transformation: Good change management principles have to be adhered to if a BI project is going to bear fruit. Focus on leadership alignment, KPI clarity, and user empowerment instead of simply implementing software. Challenges of Starting Small: Wouter offers smaller companies tips on bootstrapping their data journey using existing tools, practical education, and even Gen AI. Balancing Scales: Smaller startups face a very different set of challenges than large enterprises. Wouter’s combination of philosophy and pragmatism brings fresh takes to building effective data solutions.

1 Data Rewind: Conversation Highlights from Zach Wilson, Matthew Housley, Joe Reis, and Krishnan Viswanathan 28:02
In this special roundup episode of The Data Engineering Show, the Bros revisit some of the best bits from episodes with data thought leaders Zach Wilson, Matthew Housley, Joe Reis, and Krishnan Viswanathan, spotlighting essential trends and lessons learned across the evolving data engineering landscape. From data observability to bridging academia with real-world practice, this episode covers perspectives on where data engineering is heading and why certain challenges persist. Topics include: Foundations of Data Engineering: Zach Wilson emphasizes the importance of core, tech-agnostic skills in data modeling, quality assurance, and storytelling. Drawing on his experiences at Airbnb and in education, he argues that effective data engineering hinges on creating robust data models, quality controls, and persuasive narratives rather than expertise in any single tool or language. Bridging Academia and Practice: Matthew Housley and Joe Reis delve into the need for better data education, emphasizing hands-on experience and data fundamentals over tool-specific training, and advocate for apprenticeships and real-world collaborations in educational settings. Legacy Meets Modern in Data Engineering: Krishnan Viswanathan reflects on recurring themes in data engineering and the importance of adapting legacy approaches to new data needs, underscoring the challenges and benefits of vendor-built versus in-house solutions. Join the Bros for a well-rounded exploration of current themes in data engineering, filled with practical advice for data professionals at any stage of their journey.

1 The Resurgence of SQL: Insights from Ryanne Dolan from LinkedIn 32:57
In this episode of The Data Engineering Show, the bros, Eldad and Benjamin, are joined by Ryanne Dolan from LinkedIn to discuss the innovative Hoptimator (H2) project. This conversation reveals how LinkedIn has improved its data pipelines by automating the setup and management of complex workflows. Together they cover: Automated Data Pipelines: Ryanne explains how Hoptimator allows users to create and manage data pipelines using just a simple SQL SELECT query, streamlining the process of setting up Kafka topics, Flink jobs, and schemas. Integration with Kubernetes: The project utilizes Kubernetes to handle infrastructure tasks, treating Kubernetes as a database for managing state. This integration simplifies the orchestration of data workflows and automates routine tasks. Consumer-Driven Model: Ryanne discusses the shift from a producer-driven to a consumer-driven data model, emphasizing the importance of understanding and addressing consumer needs to reduce engineering complexity and optimize data systems. Future of Data Engineering: The conversation touches on the ongoing experimental nature of Hoptimator and its potential to transform data engineering practices, highlighting its impact on LinkedIn's data infrastructure.

1 Vector Databases Won’t Replace SQL - Andy Pavlo 42:59
SQL’s slow. SQL’s stupid. We hear these claims every time a new shiny tool enters the market, only to realize five years later when the hype dies down that SQL is actually a good idea. In this super techie episode of the Data Engineering Show, Andy Pavlo, Associate Professor at Carnegie Mellon University, joins the bros to delve into database internals and optimization. Andy discusses leveraging ML for autonomous database optimization, using Postgres for practical applications, tuning production databases safely, and why SQL is here to stay.

1 How ZoomInfo transitioned from data graveyards to ROI-driven data projects 39:46
Too often, expensive resources and man-hours are spent on dashboards no one uses, resulting in zero ROI. Philip Zelitchenko, VP of Data & Analytics at ZoomInfo, met the bros to talk about adopting product management principles to ensure data projects have value, and to provide an unfiltered peek into ZoomInfo’s data stack and unique tech culture.

1 Matthew Weingarten from Disney Streaming about Data Quality Best Practices 27:21
Matthew Weingarten, Lead Data Engineer at Disney Streaming, talks about principles essential for data quality, cost optimization, debugging, and data modeling, as adopted by the world's leading companies.

1 Joseph Machado, Senior Data Engineer @ LinkedIn talks best practices 25:59
Data engineering should be less about the stack and more about best practices. While tools may change, foundational principles will remain constant. Joseph Machado, Senior Data Engineer at LinkedIn, is on The Data Engineering Show to talk about principles that are key to success, leveraging AI for automation, and adopting software engineering methods.

1 Professors Joe Hellerstein and Joseph Gonzalez on LLMs 46:07
Joe Hellerstein is the Jim Gray Professor of Computer Science at Berkeley and Joseph Gonzalez is an Associate Professor in the Electrical Engineering and Computer Science department. They’ve inspired generations of database enthusiasts (including Benji and Eldad) and have come on the show to talk about all things LLM and RunLLM, which they co-founded. If you consider yourself a hardcore engineer, this episode is for you.

1 Megan Lieu on powerful notebooks that enable collaboration 31:31
There are two types of data influencers on LinkedIn: (1) those who talk directly about the products and companies they work for, and (2) those who provide more general guidance, tips, and opinions. Can influencers actually be passionate about the products they’re developing and talk about them straightforwardly without sounding salesy? We’re kicking off 2024 with the amazing Megan Lieu on a new Data Engineering Show episode. Megan is one of those influencers who combine the two approaches, and with almost 100K followers, her content seems to be resonating with many data folks. She talked to the bros about her approach to data advocacy as well as the power of notebooks, especially when they become broader and enable collaboration.

1 Transitioning from software engineering to data engineering 29:48
Every data team should have at least one data engineer with a software engineering background. This time on The Data Engineering Show, the guest is Xiaoxu Gao, an inspiring Python and data engineering expert with 10.6K followers on Medium. She’s a data engineer at Adyen with a software engineering background, and she met the bros to talk about why both software and data engineering skills are so important. Without software engineering skills you’ll be limited to the rigid capabilities of your stack. But without data engineering skills you’ll find it hard to be cost-effective and see the bigger picture.
The Data Engineering Show

1 Vin Vashishta explains why we should stop using dashboards 35:45
Vin Vashishta, the guy we all love to follow, has never seen a dashboard with positive ROI. This time on The Data Engineering Show, he met the bros to talk about the difference between BI dashboards and analytics that actually introduce knowledge. It’s no longer just about the data volume; it’s about quality and relevance.

1 Joe Reis and Matt Housley on the fundamentals of data engineering 42:11
After co-writing the best-selling book ‘Fundamentals of Data Engineering’, Joe Reis and Matt Housley joined the bros for some much-needed ranting, priceless data advice, and good laughs. So why are we still talking about providing business value and dashboards, even though we don’t really have anything new to say? If there are so many great tools in the data stack, why are we still so troubled? How can we focus more on things like data governance and data quality that’ll actually push the industry forward?

1 Bill Inmon, the Godfather of Data Warehousing 30:32
As people in the data industry go, Bill Inmon is among the top, often seen as the godfather of the data warehouse. In this Data Engineering Show episode, Bill Inmon talks about surviving rabbit holes throughout the evolution of data, the data modeling renaissance, and why ChatGPT is not Textual ETL.

1 Large-scale data engineering at Momentive.ai - Meenal Iyer 38:40
As companies scale, data gets messy. The data team says one thing, the business team says something completely different. Meenal Iyer, VP Data at Momentive.ai, met the Data Bros to talk about enforcing collaboration in large organizations to ensure what she considers the three most important data factors: Adoption, Trust, and Value.

1 Data engineering from the early 2000s till today - BlackRock 41:49
When it comes to data management, have we come a long way since the early 2000s? Or has it simply taken us 20 years to finally realize that you can’t scale properly without data modeling? With over 20 years of experience in the data space, leading engineering teams at Cisco, Oracle, Greenplum, and now as Sr. Director of Engineering at BlackRock, Krishnan Viswanathan talks about the data engineering challenges that existed two decades ago and still exist today.

1 Zach Wilson on what makes a great data engineer 34:02
How good you are at Spark or Flink ≠ how good you are at data engineering. After years of data engineering experience at Airbnb, Netflix, and Facebook, Zach Wilson is now focused on spreading the knowledge in EcZachly and all over social media. He met Benjamin Wagner to explain why data modeling and storytelling are more important than the actual tech, why data engineering is going to see more job growth than data science, and what brought him to start creating content, reaching over 250K followers on LinkedIn.

1 How ZipRecruiter and Yotpo power self-service data platforms that work 45:48
Data engineers are not paid to do support. Liran Yogev, Director of Engineering at ZipRecruiter, and Doron Porat, Director of Infrastructure at Yotpo, talk about building resilient self-service products that keep customers happy and engineers calm. They walked the bros through their data stacks and explained how ZipRecruiter is completely rebuilding its data layer from scratch.

1 Data Observability with Millions of Users - Barr Moses 38:36
Barr Moses, CEO of Monte Carlo, explains the difference between data quality and data observability, and how to make sure your data is accurate in a world where so many different teams are accessing it.

1 How Amplitude Engineers Process 5 Trillion Real-time Events 27:59
Weichen Wang, Senior Engineering Manager at Amplitude, came to meet the bros to talk about Amplitude's cutting-edge data stack and how it processes 5 Trillion real-time events while dealing with mutable data and massive scale.

1 Making Observability a Key Business Driver 48:59
80% of the code that you write doesn’t work on the first try. And that’s fine. But knowing which 80% is not working and which 20% is working is the actual challenge. After 10 years at Facebook, managing and scaling the Seattle site to over 6000 engineers(!), Vijaye Raji founded Statsig to make observability automated and real-time. How is the semantic layer managed? How was the Statsig team able to build an observability product that handles real-time, ever-changing metadata? What are Vijaye’s main takeaways from engineering at Facebook? Tune in.

1 A ClickHouse Review from a Practitioner’s Point of View 34:43
Sudeep Kumar, Principal Engineer at Salesforce, is a ClickHouse fan. He considers the shift to ClickHouse one of his biggest accomplishments during his eBay days and walks Boaz through his experience with the platform: how on one hand it handled 2B events per minute, but on the other it required rollups, which compromised granularity when extending time windows. Besides a ClickHouse review from a practitioner’s point of view, Sudeep tells us about interesting use cases he’s working on at Salesforce.

The Creator of Airflow About His Recipe for Smart Data-Driven Companies 45:56
According to Maxime Beauchemin, CEO & Founder at Preset and creator of Apache Superset and Apache Airflow, it's not straightforward to understand what you're really getting into, or how vast a range of skills is required to build a thriving company. Picking the right systems and services is key to a successful start, and can help you avoid the chaos of having too many tools spread across multiple teams. Plus, Max walks the bros through the genesis of Airflow, Superset & Presto, and Airflow's old-school marketing approach that won the hearts of developers across the world. And just like the Terminator, once the machine takes over, you can't stop.

How Similarweb Delivers Customer Facing Analytics Over 100s of TBs 37:11
According to Yoav Shmaria, VP R&D Platform at Similarweb, the best way to manage data warehouse costs is to tag every table, database and running ETL job to get good cost granularity for every feature. Besides handy cost management tips, Yoav walks the bros through the tech stack he implemented to analyze 100s of TBs of web data and serve fast customer-facing analytics. Full disclosure: Similarweb is a Firebolt customer, but the bros kept it objective, and there’s no Firebolt talk in this episode.

How Klarna Designed a New Data Platform in the Cloud 40:37
Klarna is one of the leading fintech companies in the world, valued at $45B. While many corporations are “stuck” on-prem, Klarna made the move and today is a cloud-only company. Gunnar Tangring, Klarna’s Lead Data Engineer, tells Boaz what this new modernized stack looks like.

How Eventbrite is Modernizing its Data Stack 23:25
Archana shares Eventbrite’s data stack modernization process, and how you get engineers to adopt new technologies like dbt, which may be outside their comfort zone.

A Deep Dive into Slack's Data Architecture 34:06
Growing from a startup to a public and then an acquired company meant that Slack’s sales org was scaling rapidly. Apun Hiran, Slack’s Director of Software Engineering, explains how the data stack and architecture evolved to support this growth with more reliable and timely metrics. Speaker: Apun Hiran, Director of Software Engineering (Data), Slack. Hosts: Eldad and Boaz Farkash, CEO and CPO, Firebolt.

Transitioning Scopely’s 5.5 PB Data Platform to the Modern Data Stack 31:52
Should data engineering AND BI be handled by the same people? According to Jonathan Palmer, VP Data Platform at Scopely: YES, by Analytics Engineers. His team of Analytics Engineers is in the final stages of transitioning 5.5 PB of data, including 15B events per day, to the modern data stack. Tune in to learn how they did it.

Getting rid of raw data with Jens Larsson 29:01
Why would you create ugly data? According to Jens Larsson, you shouldn’t even go near raw data. Jens started off at Google, went on to manage data science at Spotify, caught the startup bug at Tink, and recently joined an exciting new company called Ark Kapital, together with Spotify’s former VP Analytics. Jens explains how he and his team killed the notion of raw data at Tink and walks us through the Google, Spotify and Ark Kapital data stacks.

How Zendesk engineers manage customer-facing data applications 33:28
This time on the Data Engineering Show, Eldad abandoned his brother Boaz, but it’s OK because Boaz got the full 30 minutes to talk to one of the most interesting people in the data space. Ananth Packkildurai is Principal Software Engineer at Zendesk and runs one of the strongest newsletters in data, Data Engineering Weekly. He talked about data applications at Zendesk and how they’re built, technologies that excite him like data lineage and data catalogs, and the best routes for software engineers to get their hands dirty in the data world. Interviewer: Boaz Farkash. Zendesk guest: Ananth Packkildurai, Principal Software Engineer.

How are those data intensive customer facing apps engineered at Gong? 26:16
Gong manages hundreds of thousands of videoconferences and millions of emails PER DAY, which add up to hundreds of TBs. The Data Bros met Yarin Benado, Gong’s engineering manager, to understand what is required to move to a modern data stack to support all this, what this stack looks like, and why it all comes down to data quality at the end of the day.

How Bolt Engineers Are Designing Its Next-Gen Data Platform 35:55
Bolt's ride-hailing app serves 2B users in Europe and Africa and handles 500K queries every day. Erik Heintare, along with Bolt's engineering team, is in the midst of designing a next-gen data platform and shares how it's going to solve their biggest data challenges. Guest: Erik Heintare, Senior Analytics Engineer at Bolt. Hosts: Eldad and Boaz Farkash, AKA The Data Bros.

How did Agoda scale its data platform to support 1.5T events per day? 38:40
Scaling a data platform to support 1.5T events per day requires complicated technical migrations and alignment between hundreds of engineers. Want to see how Agoda did it? Guests: Amir Arad, Director of Machine Learning, Agoda, and Shaun Sit, Senior Dev Manager, Agoda. Hosts: The Data Bros, Eldad and Boaz Farkash.

It’s the mother of all development projects. You use it daily. And so do 65M developers around the world. This time on the Data Engineering Show: a deep dive into GitHub’s data stack. Arfon Smith and KimYen (Truong) Ladia shared GitHub’s data engineering challenges and solutions and explained why every developer should know and adopt the ADR protocol.

Building Data Products For Data Engineers 39:51
What does a tech stack that always needs to be at the forefront of technology look like? Roy Miara from Explorium talks about building data products for the audience that can’t be fooled: data engineers.

How Vimeo Keeps Data Intact with 85B Events Per Month 40:13
How does the Vimeo data team deal with 2 PB of data and 85B events per month? What made them recently build a data ops team? What data tool does the team love? And why (the hell) did they call their legacy platform Fatal Attraction? Guest: Lior Solomon, VP Data Engineering at Vimeo.

How Substack's Data Stack Supports 500K Paying Subscribers 24:24
Substack is an amazing, if not the most amazing, content publishing platform out there. Essentially, it allows anyone to become a journalist or to start their own newsletter and charge for subscriptions. So how did they build a data stack that can support all of their 500K paying subscribers? Guest: Mike Cohen, Data Engineer at Substack. Hosts: The Data Bros, Eldad and Boaz Farkash, CEO and CPO at Firebolt.

A Technical Deep Dive to Yelp's Data Infrastructure - With Steven Moy 50:09
As an expert in query engines and performance-related challenges, Steven Moy explains how Yelp handled its huge data growth over the past ten years. Guest: Steven Moy, Software Engineer at Yelp. Hosts: The Data Bros, Eldad and Boaz Farkash, CEO and CPO at Firebolt.

How Canva's Data Engineers and Analysts Support 55M Active Users 43:18
Canva is one of the hottest, if not the hottest, graphic design platforms out there. Only a week ago it was announced that they reached a staggering $16 billion valuation, after having seen even stronger growth during the pandemic. With 55 million active users and around $500 million in annual revenue, it seems that Canva is unstoppable. So how do Canva analysts and engineers scale their data platforms to meet the company's insane growth? Guest: Krishna Naidu, Data Engineer at Canva. Hosts: The Data Bros, Eldad and Boaz Farkash, CEO and CPO at Firebolt.

How AppsFlyer Delivers Sub-Second BI to 1000 Looker Users - With Alexandra Sudilovsky 31:46
AppsFlyer has exploded in size, growing from a small company of 200 people to 1000 people in just three years. Handling a huge amount of data every day while the company itself is growing quickly comes with many challenges. Guest: Alexandra Sudilovsky, Senior BI Expert at AppsFlyer. Hosts: The Data Bros, Eldad and Boaz Farkash, CEO and CPO at Firebolt.

The Data Engineering Show is a podcast for data engineering and BI practitioners to go beyond theory and learn from the biggest influencers in tech about their practical day-to-day data challenges and solutions in a casual and fun setting.