#289 Building The Right Foundations For Generative AI - Interview W/ May Xu Data Mesh Radio podcast

Data Mesh Radio
Summer Hiatus Announcement - Back in August 4:28

1 year ago

Taking a needed break to focus on getting healthy. Be back in August!

#306 Building with People for People - Swisscom's Data Mesh Approach and Learnings - Interview w/ Mirela Navodaru 1:09:06

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Mirela's LinkedIn: https://www.linkedin.com/in/mirelanavodaru/ In this episode, Scott interviewed Mirela Navodaru, Enterprise and Solution Architect for Data, Analytics, and AI at Swisscom. Some key takeaways/thoughts from Mirela's point of view: Specifically at Swisscom, it's not about doing data mesh. They want to make data a key part of all their major decisions - operational and strategic - and data mesh means they can put the data production and consumption in far more people's hands. Data mesh is a way to achieve their data goals, not the goal. When you are trying to get people bought in to something like data mesh, you always have to consider what is in it for them. Yes, the overall organization benefiting is great but it’s not the best selling point 😅 try to develop your approach to truly benefit everyone. Data literacy is crucial to getting the most value from data mesh. Data mesh is not about throwing away the important knowledge your data people have but it's about unlocking the value of the knowledge your business people have to be shared with the rest of the organization effectively, reliably, and scalably. ?Controversial? You really have to talk to a lot of people early in your data mesh journey to discover the broader benefits to the organization. 
That way you can talk to people's specific challenges to get them bought in. When designing your journey, it is important to get input from a large number of people. When talking data as a product versus data products, the first is the core concept and the second is the deliverables. Scott note: this is a really simple but powerful delineation "No value, no party." If there isn't a value proposition, there shouldn't be any action. You need to stay focused on value because there are so many potential places to focus in a data mesh implementation. You have to balance value at the use case level to the domain versus more global value to the organization. At the end of the day, everything you do should add value to the organization but sometimes use cases are much more focused at the domain and that's perfectly expected and acceptable. Data mesh, to really change the organization in the right way, needs top level buy-in. You can't only be the data team trying to head down the data mesh path. Everything in data mesh is about iterating to better. You need the space and room to learn as you go along. You can - and must - deliver value before you've got everything figured out perfectly. Relatedly, you will learn how to better iterate towards value throughout your journey. It will be tough at the start as with any learning journey. Obviously, data mesh is a large cultural change. You need to have empathy and give people the chance to grow instead of trying to move too fast. Upskilling, especially around data literacy, is crucial. There are two very valuable aspects of data mesh: the value you deliver via use cases along the way and the value you get from learning to do data better across your organization. The first is from integrating data into far more of your decisions and the second means you can react more quickly to new opportunities and build scalable and reliable approaches to data management. Something like data mesh is a big change. 
But it shouldn't be a shock to people. You can do it gradually and incrementally while you deliver value. One of the best ways to lose people is to thrust disruptive change on them instead of working with them through the change to prevent large-scale negative disruptions. There are so many areas where data mesh helps organizations, whether it is getting away from silos, reducing redundancy, improving quality and reliability, etc. It's not just about doing data management itself better, which has been the focus of most data approaches historically. Again, data work is not the point. The point is to make your colleagues better at their job through being more informed. That comes down to the data but it's never the actual point, it's the vehicle to delivering value. Transparency and managing expectations - and communication in general - are crucial to doing data mesh well. You need to have that space to learn and iterate. Let people know what you are doing and especially why you are doing it. Data modeling in data mesh is of course a challenge. But it's important to have some level of common language between the domains or you will have data silos. It's a balance but it's crucial to give domains flexibility but also create easy paths for people to combine data across domains. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#305 Combining the Technical and Business Perspectives for Data Mesh - Interview w/ Alyona Galyeva 1:05:59

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Alyona's LinkedIn: https://www.linkedin.com/in/alyonagalyeva/ In this episode, Scott interviewed Alyona Galyeva, Principal Data Engineer at Thoughtworks. To be clear, she was only representing her own views on the episode. Some key takeaways/thoughts from Alyona's point of view: ?Controversial? People keep coming up with simple phrasing and a few sentences about where to focus in data mesh. But if you're headed in the right direction, data mesh will be hard, it's a big change. You might want things to be simple but simplistic answers aren't really going to lead to lasting, high-value change to the way your org does data. Be prepared to put in the effort to make mesh a success at your organization, not a few magic answers. !Controversial! Stop focusing so much on the data work as the point. It's a way to derive and deliver value but the data work isn't the value itself. Relatedly, ask what are the key decisions people need to make and what is currently preventing them from making those decisions. Those are likely to be your best use cases. When it comes to Zhamak's data mesh book, it needs to be used as a source of inspiration instead of trying to use it as a manual. Large concepts like data mesh cannot be copy/paste, they must be adapted to your organization. It's really important to understand your internal data flows. 
Many people inside organizations - especially the data people - think they know the way data flows across the organization, especially for key use cases. But when you dig in, they don't. Those are some key places to deeply investigate first to add value. On centralization versus decentralization, it's better to think of each decision as a slider rather than one or the other. You need to find your balances and also it's okay to take your time as you shift more towards decentralization for many aspects. Change management is best done incrementally. ?Controversial? A major misunderstanding of data mesh that some long-time data people have is that it is just sticking a better self-serve consumption layer on top and we can continue to do monolithic data work under the hood. Be prepared for lots of friction in convincing some data architects that this isn't just a reskin or another layer on top of the enterprise warehouse or data lake. For data mesh, it's crucial to understand necessary changes at the technical and the business level. You can't only work on one but you also don't have to 'solve' them at the start, make progress. It's like with the four principles, you need to thin slice change across the technical and business aspects rather than only focusing on one. You can sell data engineers on data mesh by making their work more meaningful and impactful. Instead of mostly firefighting - which is the case in many organizations - they can focus on shipping new features and adding incremental value. With data mesh, you want people focusing on more than just making data valuable - what is valuable will change so how do you make your data products evolvable and maintainable? You always want to be focused on addressing people's pain points in data mesh, driving towards value. That's how you can get data people bought in as well, not just business people. !Controversial! 
Doing aggregated data products across domains is usually the data mesh inflection point - basically answering can data mesh work in your organization. If you can't get that cross domain collaboration going well, you should consider another model like hub and spoke. Relatedly, aggregating data across multiple domains is where there is usually the most value for an organization. But it's very hard to find good champions there because you need more vision and more hard work to collaborate across domain boundaries. Identify the people with the vision early in your journey, even if it's often better to actually only start working with them once you have more momentum and data products. Too often, there is a rush to build _something_ instead of the right thing. Don't get fooled by the idea that data work always creates value. Even if the client or business partner asks you to build something of value, always circle back to the use cases. As much as we'd like to build universal data products, they just don't exist. Relatedly though, don’t get so focused only on trying to build the consumer-aligned data products for hyper-specific use cases that you miss the forest for the trees. Sometimes the use case is something like 'we need to understand what data we even have to be able to use it to address our current problems in XYZ business line.' To kind of sum it up: stop focusing on what you can build first. Focus on what you should build and then look at the realities. What matters to the business and why? Then focus on what's possible and what will deliver sustainable and maintainable value through data work. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. 

#304 Getting Your Data Mesh Journey Moving Forward - Interview w/ Chris Ford and Arne Lapõnin 1:01:50

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Arne's LinkedIn: https://www.linkedin.com/in/arnelaponin/ Chris' LinkedIn: https://www.linkedin.com/in/ctford/ Foundations of Data Mesh O'Reilly Course: https://www.oreilly.com/videos/foundations-of-data/0636920971191/ Data Mesh Accelerate workshop article: https://martinfowler.com/articles/data-mesh-accelerate-workshop.html In this episode, Scott interviewed Arne Lapõnin, Data Engineer and Chris Ford, Technology Director, both at Thoughtworks. From here forward in this write-up, I am combining Chris and Arne's points of view rather than trying to specifically call out who said which part. Some key takeaways/thoughts from Arne and Chris' point of view: Before you start a data mesh journey, you need an idea of what you want to achieve, a bet you are making on what will drive value. It doesn't have to be all-encompassing but doing data mesh can't be the point, it's an approach for delivering on the point 😅 Relatedly, there should be a business aspiration for doing data mesh rather than simply a change to the way of doing data aspiration. What does doing data better mean for your organization? What does a "data mesh nirvana" look like for the organization? Work backwards from that to figure where to head with your journey. A common early data mesh anti-pattern is trying to skip both ownership and data as a product. 
There are existing data assets that leverage spaghetti code and some just rename them to data products and pretend that's moved the needle. "A data product is a data set + love." The real difference between a data product and a data set is that true ownership and care. ?Controversial?: Another common mesh anti-pattern is trying to get too specific with definitions or prescriptive advice. There isn't a copy/paste approach that will work and getting a specific definition of a data product doesn't really change much. Mindset is far more important than definitions. It can be very helpful to have some simple checklists around your data products. While there is no prescriptive way to build, checklists remove a lot of the uncertainty for teams asking 'am I doing this right?' It gives some simple reassurances that you aren't missing out on key pieces of what they're building. ?Controversial?: Most organizations probably don't need to do a ton of pre-work before starting on a data mesh implementation. They need some achievable goals, a roadmap for how they plan to achieve those goals, and a lot of willpower to push things forward and keep going when the going gets tough. You also need an enticing vision for people to buy into. THIN SLICE! Don't try to take everything on at once but also don't try to skip over any of the four pillars. There's a reason they haven't changed from Zhamak's initial blog post. Scott note: don't try to argue the governance pillar wasn't in the first blog post, it just wasn't called out separately… Three key questions to answer if you are considering data mesh: A) Do you have sufficient scale? B) Do you have a strategy that depends on deriving value from data? C) Are you prepared to take advantage of the autonomy Data Mesh will afford to your product teams? If you don't have satisfactory answers to those three questions, data mesh is probably not right/overkill for you. 
If people don't see the strong need to transform your business through data, it's likely to lead to troubles 6-9 months into your data mesh journey. If you aren't addressing key organizational pain points or delivering value, you will likely lose support for your data mesh implementation initiative. Doing data better has to be valued to get more budget to keep going. Another anti-pattern is focusing too much on use cases at the expense of the platform and the journey. Data mesh is designed to work at scale and that only works by finding repeatable processes. You can't treat each data product like a one-off. In order to get buy-in from the data engineers - or whoever your data product developers are - you need to invest in changing hearts and minds through the platform. If creating and managing data products is significantly harder than the old way of dealing with data, you will lose people quickly. Read about the data mesh accelerate workshop 😅 When you think about first steps with data mesh, A) build buy-in at the strategic level that you want to actually start leveraging your data for high-value purposes; B) find use cases to support those strategic initiatives; and C) make sure you are ready to actually thin slice and not try to only tackle one pillar - you have to be ready to take on a LOT of challenging work. !Controversial!: None of the four data mesh principles are all that useful on their own. Scott note: there's a figure Zhamak has that explains why all four are necessary in conjunction that's very helpful here. It's easy to want to skip bringing all your key stakeholders into alignment early in your data mesh journey. But you need matching expectations and shared understandings of what you are trying to accomplish and why. Scott note: this doesn't mean everyone has to be bought in that they are first, there's a balance to be found here.
You need room to make mistakes and adjust your data mesh implementation because you will not get it all right at the start. Data mesh is as much about learning how to do data well as doing data well. It's crucial to not just ask if you are succeeding with your data mesh implementation but measure that. It can be hard to measure but consider what matters to your implementation's success and find things to measure whether you're succeeding in each of those areas. Otherwise, how do you know where to focus and optimize? Subsidiarity: "everything should be decided as locally as possible but no more so." Basically, there are many decisions that should be made in the domains but there are some that need to be made centrally. The challenge is figuring out which decisions should be made where 😅 There will be capability challenges in every organization when doing data mesh. That will impact initial decisions around how much to centralize or decentralize but as you upskill the teams, you may want to decentralize more. Find your equilibriums but equilibriums change. It's all about trade-offs! Many people are too focused on exactly if they are doing data mesh instead of are they delivering value in a scalable way through data. That happened in microservices too and it took them 10+ years to really get to best practices. Data mesh is only 5 years old and only ~3 with any number of organizations attempting it - focus on getting better instead of being worried if you're the perfect picture of data mesh yet. When talking about your data mesh success internally, you need to talk about the value from use cases AND the value of improving your data capabilities in general. You prove out you are delivering specific value along the way but also that you are getting more and more capable at doing the data work to make the organization better. Both are valuable and you should promote the value of both aspects: use case value and capability value.
When talking about data mesh, use the ADKAR method: create Awareness and Desire, give them the Knowledge about how you're doing it, upskill people so they have the Ability to do data mesh, and finally constantly Reinforce the value and that it's important. Without touting your mesh successes, you'll lose momentum. When looking for your first data mesh use case, look for something that has a customer impact - what can you do for them that you couldn't before. Personalization is a good example. Legal is potentially another place re reducing risk.

#303 Delivering What Matters - Value - Through Strong Business Collaboration - Interview w/ Saba Ishaq 1:10:37

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode (link) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Saba's LinkedIn: https://www.linkedin.com/in/sabaishaq/ Decide Data website: https://www.decidedata.com/ In this episode, Scott interviewed Saba Ishaq, CEO and Founder of her own data as a service consultancy, Decide Data, which also provides 3rd party DAaaS (Data Analytics as a Service) solutions. Some key takeaways/thoughts from Saba's point of view: "If you don't know what you want, you're going to end up with a lot of what you don't want." This is especially true in collaborating with business stakeholders when it comes to data 😅 Focus on delivering value through data instead of delivering data and assuming it has value. “Not all data is created equal.” As a data leader, it's your role to help people figure out what they actually want by asking great questions and being a strong partner when it comes to the data/data work. Don't only focus on the data work itself; it's very easy to do data work for the sake of it instead of something that is valuable. To deliver data work that actually moves the needle, we need to start from what are the key business processes and then understand the pain points and opportunities. Then, good data work is about how do we support and improve those business processes.
Relatedly, that's also the best way to drive exec alignment - talking about their business processes and how they can be improved first, data work second. They will feel seen and heard and are far more likely to lean in. At the end of the day addressing business and operational challenges is what data and analytics is all about. Deliver something valuable early in any data collaboration with a business stakeholder. You don't have to deliver an entire completed project but time to first insight is time to value and you build momentum and credibility with that stakeholder. At the beginning of a project - and delivering a data product is itself a project - you should work with stakeholders to not just define target outcomes but also define how are you going to collaborate and communicate. You can't just get requirements, go away and build. Working with data should be iterative and should have an element of continuous improvement to evolve what you deliver as you build value. Start any data work by asking someone about their business objectives, challenges, and target outcomes. You need your business stakeholders to have a clear vision of what they want to achieve, otherwise you are likely to be delivering only data work instead of business value that leverages your data work. By doing deep discovery work, you can find where are the key lynchpins and value drivers in a use case. There are points of criticality that are easy to lose in a sea of potential requirements that are really requests. Find those crucial value leverage points! Relatedly, you can use those value leverage points to keep your business execs engaged. They will - hopefully - see the importance and help you narrow in on what matters in their use case. Then it's no longer about the data work but the value to them. ?Controversial?: For data people, you have to balance career management and interesting project/technology work versus value delivery. 
That doesn't mean delivering value isn't interesting but it doesn't always mean getting to play with the latest and greatest. But if data people never get to have fun and play with cool tech, many will leave. It's a tough balance. Try to make the valuable work also interesting 😅. Relatedly, try understanding the data team’s learning areas of interest and see how you can build seeds to foster their skill growth while making data work valuable. Sometimes it turns out to be a win-win situation. Relatedly, be very transparent and communicate a lot to your data teams about what you are prioritizing and why. It's very easy to get lost in telling data people to do certain work rather than why they are doing that work. Keeping your data people in the loop of the why will keep them focused on what matters. For many organizations, the rate of change of their technology - application and data technologies - is growing faster than the rate of their people change management/transformation processes. You need parallel streams to modernize both or your people will fall further behind, leading to chaos. ?Controversial?: Relatedly, your overall org and/or digital transformation strategy should be tied to your data strategy. Otherwise, they will likely be heading in different directions, creating more challenges. Scott note: Benny Benford talked a lot about this in episode #244, going far together. Data management is a very crucial element of digital transformation but it’s not the same thing as change management. The data team shouldn't be the ones leading the overall digital transformation of the organization. That's too much on a team that specializes in data rather than change management. If you are in that situation, it's a very tough spot to do well. It's very important to focus on communication to stakeholders when you think about data governance and digital transformation.
For many execs, these are foreign topics so you have to work hard to engage them and keep them leaning forward on the necessary work. Data governance is beneficial for everyone, so if explained and defined well people will engage willingly after knowing what’s in it for them. As someone in the data team, you have to be well informed about digital transformation initiatives inside your organization. Otherwise, you will miss opportunities to align to those initiatives AND have all your data sources break when there is a migration you weren't told about 😅 It's easy to screw up the data steward/ownership conversation letting someone know they are responsible for the governance of their data. It's often a scary conversation for both parties. But it's necessary and you can show people why it makes sense and adds value to their work too! Relatedly, link people's pain points to current weaknesses in the data governance. Show them they are causing issues for themselves and give them an easier path to fix it without having to learn everything about data work. Data governance doesn't have to be some wholly - or holy 😅 - separate practice. It should just be part of normal work related to data. Make it less scary and more approachable for your business stakeholders. It's a team effort and it drives real, measurable benefits and value. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

No Episode This Week 1:31

1 year ago

The craziness of the overseas move (including a faulty office chair... long story) is to blame. Back to the normally scheduled one episode a week next week! Episode list and links to all available episode transcripts here.

#302 Finding and Delivering on a Good Initial Data Mesh Use Case - Interview w/ Basten Carmio 1:11:47

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Basten's LinkedIn: https://www.linkedin.com/in/basten-carmio-2585576/ In this episode, Scott interviewed Basten Carmio, Customer Delivery Architect of Data and Analytics at AWS Professional Services. To be clear, he was only representing his own views on the episode. Some key takeaways/thoughts from Basten's point of view: Your first use case - at the core - should A) deliver value in and of itself and B) improve your capabilities to deliver on incremental use cases. That's balancing value delivery, improving capabilities, and building momentum which are all key to a successful long-term mesh implementation. When thinking about data mesh - or really any tech initiative - it's crucial to understand your starting state, not just your target end state. You need to adjust any approach to your realities and make incremental progress. ?Controversial?: Relatedly, it's very important to define what success looks like. Doing data mesh cannot be the goal. You need to consider your maturity levels and where you want to focus and what will deliver value for your organization. That is different for each organization. Scott note: this shouldn't be controversial but many companies are not defining their mesh value bet… Even aligning everyone on your organization's definition of mesh success will probably be hard. But it's important to do. 
For a data mesh readiness assessment, consider where you can deliver incremental value and align it to your general business strategy. If you aren't ready to build incrementally, you aren't going to do well with data mesh. A common value theme for data mesh implementations is easier collaboration across the organization through data; that leads to faster reactions to changes and opportunities in your markets. Mesh done well means it's far faster and easier for lines of business to collaborate with each other - especially in a reliable and scalable way - and there are far better standard rules/policies/ways of working around that collaboration. But organizations have to see value in that or there may be mesh resistance. As many have said, you must approach data mesh in a thin slice. Trying to focus too much on any pillar at the expense of the others leads to challenges. Scott note: Zhamak literally has a figure on this she shares often. It's easy to get unbalanced if you ignore a principle and fixing that takes more effort than thin slicing. ?Controversial?: As you build out your mesh capabilities, especially your platform, think about what you need to actually deliver on the use case(s) at hand and deliver only that. Don't get ahead of yourself. It's fun to build capabilities but it's easy to build a monstrosity of a platform instead of proper abstractions to make building and managing data products simple. Doing data mesh well is all about managing trade-offs. There are trade-offs at the use case and implementation level. And it's okay to get those wrong, just look to fail fast rather than hold on to bad decisions. Don't be precious with your decisions and build in ways to make evolving and improving easier. MVP can be minimum viable product or minimum valuable product. You need to define what value actually means for each use case. It's easy to point to pain and start solutioning but not end up addressing the pain or creating value. 
This will help you prioritize and deliver the most value compared to effort early. Relatedly, having a clear perspective on value in your MVP means you are more easily able to change and adapt as you learn and deliver. If you thought value was going to come from X but early indications are Y is way more important or is much different than expectations, you can more easily pivot towards Y. Data products don't magically create value. They should be bets on value delivery. So you need to have ways to gracefully retire data products when the cost exceeds the incremental value. Often times a bet was a good bet even when it didn't pay off or it is no longer paying off. Continuous communication and driving buy-in, especially with C-level execs, is unfortunately crucial to your data mesh implementation success. If they don't value better data capabilities, few people will invest their time and effort to really reinvent the way your organization does data. Relatedly, you need to tie your data mesh implementation to the overall business strategy. Execs need to know why you are doing something like data mesh and how it will deliver value. It's not an overnight success but it also can't be a 3 year project before value. Speaking to that gets more people to lean in so you can build momentum. Finding and leveraging your data success champions is always hard but it's necessary to build that momentum. Think about building champions quite early. When you're trying to get an exec to buy in to something like data mesh, always put it in terms of what's in it for them. Don't try to get them bought in to some grand vision first, why would they invest their valuable time and resources here? Find their pain points and what something like data mesh will do to address those specifically. 
Relatedly, trying to get someone to just take on data ownership without the understanding of what the rest of the organization is going to do to make ownership much easier - mainly through the platform and federated governance - is probably going to be a … not fun conversation. When thinking about data mesh, it should all come back to value. If you can't specifically point to value that will exceed your effort for doing something like data mesh, it's probably not worth it for your organization. What is valuable to your organization and how will data mesh help you capture that value? Relatedly, as you are on your data mesh journey, you have to be honest in assessing if data mesh is really working for your organization. It's okay to stop if it's not. The data mesh principles sound really great and helpful in the abstract. But they just might not be aligned to value with the way your company does business. ?Controversial?: It's perfectly acceptable to have hybrid approaches in a data mesh journey. There is a perception that everything should be decentralized and that just isn't always the best approach, don't get caught up in dogma. Scott note: the "decentralize everything" is a misunderstanding of Zhamak. Also, as you are on a journey, your current state won't match an ideal end state. Focus on evolving while delivering value, not trying to be perfect today instead of _getting_ to good. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. 
If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#301 Learnings From 25+ Years in Data Quality - Interview w/ Olga Maydanchik 1:01:57

1 year ago

Olga's LinkedIn: https://www.linkedin.com/in/olga-maydanchik-23b3508/ Walter Shewhart - Father of Statistical Quality Control: https://en.wikipedia.org/wiki/Walter_A._Shewhart William Edwards Deming - Father of Quality Improvement/Control: https://en.wikipedia.org/wiki/W._Edwards_Deming Larry English - Information Quality Pioneer: https://www.cdomagazine.tech/opinion-analysis/article_da6de4b6-7127-11eb-970e-6bb1aee7a52f.html Tom Redman - 'The Data Doc': https://www.linkedin.com/in/tomredman/ In this episode, Scott interviewed Olga Maydanchik, an Information Management Practitioner, Educator, and Evangelist. Some key takeaways/thoughts from Olga's point of view: Learn your data quality history. There are people who have been fighting this good fight for 25+ years. Even for over a century if you look at statistical quality control. Don't needlessly reinvent some of it :) Data literacy is a very important aspect of data quality. If people don't understand the costs of bad quality, they are far less likely to care about quality. Data quality can be a tricky topic - if you let consumers know that the data quality isn't perfect, they can lose trust. But A) in general, that conversation is getting better/easier to have and B) we _have_ to be able to identify quality as a problem in order to fix it. 
Data quality is NOT a project - it's a continuous process. Even now, people are finding it hard to use the well-established data quality dimensions. It's a framework for considering/measuring/understanding data quality so it’s not very helpful to data stewards / data engineers in creating data quality rules. The majority of quality errors are not random, they come from faulty data mapping / bugs in pipelines. Having good quality rules will catch a large percentage of errors that can be fixed in bulk. When thinking about getting started around data quality, it doesn't have to be complex and with lots of tools. It can be people looking at the data for potential issues and talking to producers. Then you can build a business case for fixing the data to get funding. You have to roll up your sleeves and talk to people but you can get forward momentum. Data quality issues aren't inherently material to the business processes - they are only bad when they cause issues for the business. You have to find those actual business issues to get people to care and get funding for fixing it. Quality for the sake of quality is just extra cost. Do not create too many data quality rules that do not matter. Relatedly, being able to show someone a relatively basic quality indicator early is far better than asking for a lot of budget to figure out the quality levels. You can do that with something as simple as random sampling 100-200 records and an hour of 1-2 people's time. To understand which data quality challenges and use cases are the most important, data people simply have to learn more about the business. Good data quality is about fit for purpose and that means understanding the purposes :) To find your initial good data quality use cases, look to mission criticality. What dashboards or reports are actually important to the company and why? Then work backwards to see if quality is an issue for those dashboards and reports. 
That's how you find your early buy-in to work on a quality initiative that can scale. !Controversial!: Data contracts are not at all new, we just now have a good enough set of tools and technologies to be able to do them better at scale. ?Controversial?: Most are doing data contracts … not that well. For them, it's about the technology and not the process. There isn't a continuous approach. Scott note: Andrew Jones has said the same. It's about ensuring a process that results in quality data, not the tools. For data contracts, there MUST be a feedback loop or we aren't actually delivering to needs, especially as needs evolve. Look to the widely used customer supply model for insights into what we need to achieve and how when it comes to data contracts. Many companies are creating actual financial incentives tied to data quality in order to ensure people care about data quality. That's not right for every organization but it does send a clear message as to the importance of data quality. You have to consider your data supply chain - if your interface for data input is bad, your data is very likely to be bad. People will simply enter garbage to move forward. Doing data quality manually is not sustainable/scalable. But you don't need to start with expensive tools, you can get your arms around things initially pretty easily. It will help you identify your actual problems instead of spending time specifically on tools. ?Controversial?: Many vendors are selling their tools as the fix to data quality. But detecting data errors with the tools is only the start of the data quality improvements. Once errors are detected, root cause analysis for the errors needs to be performed and the processes / code need to be fixed. None of the data quality tools can do this. It is a human's job. Beware the snake oil. 
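As a rough illustration of the random-sampling takeaway above (checking 100-200 records to get a basic quality indicator), here is a minimal Python sketch. The record layout, the `has_email` rule, and the sample size of 150 are assumptions made up for illustration and were not discussed in the episode; in practice the check could just as well be a person reading each sampled record.

```python
import random


def sample_records(records, sample_size=150, seed=0):
    """Draw a simple random sample (without replacement) for review."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn later
    k = min(sample_size, len(records))  # never ask for more records than exist
    return rng.sample(records, k)


def error_rate(sample, is_valid):
    """Share of sampled records that fail the check (0.0 for an empty sample)."""
    if not sample:
        return 0.0
    failures = sum(1 for record in sample if not is_valid(record))
    return failures / len(sample)


# Hypothetical data: customer records where every tenth email is missing.
records = [
    {"id": i, "email": None if i % 10 == 0 else f"user{i}@example.com"}
    for i in range(5000)
]


def has_email(record):
    """Hypothetical quality rule: the email field must be populated."""
    return record["email"] is not None


sample = sample_records(records, sample_size=150)
rate = error_rate(sample, has_email)
print(f"{rate:.1%} of {len(sample)} sampled records are missing an email")
```

The resulting percentage is only an estimate from a small sample, but per the takeaway it may be enough to start a conversation with producers and to build a business case before asking for tooling budget.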

#300 Panel: How to Treat Your Data Platform as a Product - Led by Michael Toland w/ Sadie Martin, Marta Diaz, and Sean Gustafson 1:03:01

1 year ago

Michael's LinkedIn: https://www.linkedin.com/in/mjtoland/ Marta's LinkedIn: https://www.linkedin.com/in/diazmarta/ Sadie's LinkedIn: https://www.linkedin.com/in/sadie-martin-06404125/ Sean's LinkedIn: https://www.linkedin.com/in/seangustafson/ The Magic of Platforms by Gregor Hohpe: https://platformengineering.org/talks-library/the-magic-of-platforms Start with why -- how great leaders inspire action | Simon Sinek: https://www.youtube.com/watch?v=u4ZoJKF_VuA In this episode, guest host Michael Toland, Senior Product Manager at Pathfinder Product Labs/Testdouble and host of the upcoming Data Product Management in Action Podcast, facilitated a discussion with Sadie Martin, Product Manager at Fivetran (guest of episode #64), Sean Gustafson, Director of Engineering - Data Platform at Delivery Hero (guest of episode #274), and Marta Diaz, Product Manager Data Platform at Adevinta Spain. As per usual, all guests were only reflecting their own views. The topic for this panel was how to treat your data platform as a product. While many people in the data space are talking about data products, not nearly as many are treating the platform used for creating and managing those data products as a product itself. This is about moving beyond the IT services model for your data work. 
Platforms have life-cycles and need product management principles too! Also, in data mesh, it is crucial to understand that 'platform' can be plural, it doesn't have to be one monolithic platform, users don't care. Scott note: As per usual, I share my takeaways rather than trying to reflect the nuance of the panelists' views individually. Scott's Top Takeaways: You will hear "product mindset" a lot in this panel. It's important to embrace product management as a mindset and not an exact set of things to do and approaches to take. The whole point of a product mindset is to find what works and deliver reliably while focusing on value. Be ruthless. Ruthlessly prioritize and ruthlessly focus on user centricity. In data, we have a tendency to fall in love with the tools instead of the jobs to be done. But a good platform is about making it easy to get high-value work done and that's rarely about exposing tooling to users. Your platform isn't going to magically drive usage. There is change management required to get people to leverage your data platform more. Be prepared for that change management work and closely aligning with users, whether they are 'data people' or not. Platforms and products in general are about scalability. Through a platform, instead of reacting to tickets, you build services and capabilities to address the problems that caused the tickets - far more scalable than reacting to the individual tickets, addressing the disease not the symptom. By building your platform as a product, you focus on what actual capabilities are within scope so you can continue to manage and expand your data platform. As much as it would be amazing to build a data platform from scratch - think how amazing you could build it! - in many cases, you will have to build off of existing services and platforms. Don't be too dogmatic - what matters is continuing to get better not being perfect. 
Give yourself the space to improve the platform but products live in the real world and the real world of your organization has current/existing business needs and constraints. Products - especially in software - are as much or more about evolution as they are about their form/function at initial launch. The same should happen with your platform. And the same should happen with your vision. Don't get locked in, don't get tunnel vision. Generally speaking, most data platforms are still not serving the role a software platform - data or otherwise - should serve. They are still about the tech instead of the capabilities. You aren't alone if your platform doesn't meet the ideal vision. Your role is to make it better but it's still probably not going to be a sparkling beacon of perfection anytime soon 😅 Your data platform needs to be aligned to your company culture. So you have to meet people where they are and properly set down the easy path to where you want them to go. It's a long journey. Other Important Takeaways (many touch on similar points from different aspects): The product mindset is for the entire team, not just the platform leader and/or product manager. Your entire team will have to change their approach and mindset. Not overnight but it's hard to break old ticket-taker habits. Make sure to put your work in the context of the business. It's easy to get bogged down in the platform engineering aspects or the data tools but your data platform is there to serve business purposes. Focus on what delivers value for the business. Your platform doesn't have to address all your organization's data challenges at once, right at the start. Find where you are seeing the biggest challenges with delivering value - maybe listen back to episode #297 on the Data Value Chain - and look to focus there first. Build up to better. "Build with [your users], not for them." Don't treat your platform as a project. 
Yes, that's the subject of the panel but it's very important to always keep close at heart. User experience is such a crucial aspect of good software platforms. It's probably even more important when it comes to data platforms if you want to bring new users to producing and consuming data. But it's HARD and rarely discussed. What is the goal of your platform? That might sound like an obvious question but it's not really when you go to answer it. Maybe it's "make it easier to work with data" but easier for who? Really map out good outcomes of a well-made platform. Good products - at least good software products - aren't designed to be perfect at launch. Give yourself the space to not be perfect but set yourself up to understand where things can be improved and then iterate. Test, learn, iterate. What are your data platform KPIs? What will measure if you are being successful? Really consider what bets you are making and why. Usage isn't always good but it's a good first indicator as you are getting moving. To make your platform a product, you have to come back to the vision (or goal mentioned above). Otherwise, you have a collection of services. How is that vision tied to business value? If you need budget, do those holding the purse strings care about your tech decisions or business impact? A good platform helps people get what they need done with the agency to do it. It limits - and where possible eliminates - bottlenecks. A crucial aspect of product management is taking user needs and then translating that to build something to serve those needs. But when it comes to your data platform, the users and the builders (the data engineering team) usually don't speak the same language so you will probably have to spend more time translating between the users and the team building the platform than even in software. Be prepared for your data team to grumble about more meetings too 😅 Combine Simon Sinek's 'Start with why' and user centricity. 
The why should be about what you are enabling your users to actually accomplish and what value that drives. Really focus on building something that drives value for the business instead of leveraging the coolest tech. Relatedly, it's likely going to be very difficult to measure the return on investment on your data platform. Be prepared for those conversations but it can be pretty squishy. At their heart, good products are about creating good user experiences and delivering/capturing value. Products need to deliver value to users and producers need to capture value as well. That is a complex topic when the value captured is internal, but it is still an appropriate mindset. If you aren't able to prove value, will you get further funding? Consider your company needs relative to data maturity. An incredibly cutting edge platform for an organization that has low data maturity is not the right fit. Even if you have a cutting edge vision for their work, your 4 year old won't be able to out-sculpt Michelangelo. Be realistic and use the platform to drive towards better data capabilities but you have to meet your organization at least close to where they are. You might have difficulty getting people bought in on the vision of your data platform. If people view data as ticket-takers/a service instead of an integral part of the company's way of doing business, be prepared for the fun of lots of stakeholder alignment work. Where possible, don't only look to sell people on your vision. Try to change - or possibly nudge at most - people's behavior through your platform. It's a subtle art and hard to pull off but it will mean people do what you and they both need but without having to build a million PowerPoint decks 😅 It's important to separate your concept of a data platform and the approach of treating it as a product. 
Otherwise, your platform can easily fall into the trap of designing to user wants instead of what is a platform's function - to make it easier to do work in a scalable, reliable, and repeatable fashion. You need to consider the purpose of the platform instead of throwing product management at a collection of tools. Leaders/execs want change. Leaders/execs do not want _to change_. Be prepared for that. Users aren't necessarily ready for the data platform to be a product. They are used to the ticket-taker model. Be prepared to shift their mindset too, not just that of the (data) engineers building the platform. It won't be a simple switch over either - very deep topic… Relatedly, you will probably need a longer runway to transition your team to a product model for the platform. It is a mindset shift but it's also a big change in the ways of working. You will need to have some patience, you can't switch from a ticket model to only building your platform as a product overnight. A key potential area of value for the platform is making it fast to prototype and get data in consumers' hands. It can reduce a ton of back and forth and also help you quickly discover if further work on a potential data product is useful or if it should be scrapped. Research spikes are very important in data and enabling them can be very valuable - if not always valued 😅 Make sure to think about the actual scope of your data platform. Products have scope, don't try to be all things to all people and don't bite off more than you can chew. There's a reason SaaS offerings in the data space have post-sales and customer success engineers. Many of your users won't just simply be able to read the documentation and use your platform no matter how well you build and document it. Be prepared to ensure success through a bit of hand-holding. Ensuring usage that drives value is (usually) what makes software and data offerings successful. 

#299 Empowering Development with Actionable Data - Interview w/ Carol Assis and Eduardo Santos 1:13:01

1 year ago

Carol's LinkedIn: https://www.linkedin.com/in/carol-assis/ Eduardo's LinkedIn: https://www.linkedin.com/in/eduardosan/ Continuous Integration book: https://www.amazon.com/Continuous-Integration-Improving-Software-Reducing/dp/0321336380 Measure What Matters book: https://www.amazon.com/Measure-What-Matters-Google-Foundation/dp/0525536221 Inspired by Marty Cagan: https://www.amazon.com/INSPIRED-Create-Tech-Products-Customers/dp/1119387507 Empowered by Marty Cagan: https://www.amazon.com/EMPOWERED-Ordinary-Extraordinary-Products-Silicon/dp/111969129X In this episode, Scott interviewed Carol Assis, Data Analyst/Data Product Manager and Eduardo Santos, Professor and Consultant, both at Thoughtworks. To be clear, they were only representing their own views on the episode. From here forward in this write-up, I will be generally combining both Carol and Eduardo's views into one rather than trying to specifically call out who said which part. Some key takeaways/thoughts from Eduardo and Carol's point of view: At the end of the day, the team that produces the data will get the most use out of it 9/10 times. Getting teams used to developing with data in mind isn't just useful for the organization, it is for maximizing their own team's success. 
Continuous integration is a crucial concept in general for learning how to automate and focus on delivering more, which leads to focusing on value. Read the book :) ?Controversial?: Data mesh is an extension of the continuous integration book/concept because of the focus on delivering value quickly and building to scale reliably. There are many methodologies for understanding value delivery in software. We just have to adapt them better to data. Don't reinvent the wheel. Far more organizations need to think about the goals of their products and then how to measure success against those goals _at product inception_. Design data into your products from the start. Data people often make the data overly complicated for non-data people to grasp. What does the data tell us and what are some simple numbers? Then people can feel like they understand without going too deep into stochastic modeling or something. Relatedly, engage data consumers' curiosity - including the producers that will consume their own data. Try to meet them where they are to get them to engage with data more. Lower the perceived bar to leveraging data. Application development teams need convincing that working with data is 1) essential to understand their own success to further improve their products and 2) much easier than it has been historically. There is a LOT of scar tissue out there… A potential good hook to get people to build their applications with data in mind is being able to show metrics and measure success from day one. The business side can get a better idea and are more likely to engage; it gives a communication bridge between developers and the business people. Having data early in the application development cycle means you have more proof points for making your decisions - assuming you side with the data 😅 that makes it easier to justify decisions instead of people making guesses. Measuring what matters is a crucial concept for the entire team to adopt. 
It will help people understand what data they need and why. Pressing people on what to measure and then how to measure it crystalizes what bets they are making. How are they expecting users to interact with their applications? For many business people, you may need someone playing the data translator role, translating data to business and business to data. Most organizations' data literacy is still quite low and again, you need to lower the bar to business people leveraging data. ?Controversial?: You don't need to start with data products. Yes, they are great, but teaching your teams that data matters is groundwork to head in the direction of something more scalable. A spreadsheet is a fine place to start. Focus on delivering insights that deliver value, then work towards the productized aspects :) Ask people what action they will take once they have data. Get them in the mindset that data drives action and data that won’t drive action isn't where they should focus. 

#298 Effective Partnering With Business Execs - Learnings from Another Data Mesh Journey - Interview w/ Jessika Milhomem 1:07:39

1 year ago

Jessika's LinkedIn: https://www.linkedin.com/in/jmilhomem/ In this episode, Scott interviewed Jessika Milhomem, Analytics Engineering Manager and Global Fraud Data Squad Leader at Nubank. To be clear, she was only representing her own views on the episode. Some key takeaways/thoughts from Jessika's point of view: There are no silver bullets in data. Be prepared to make trade-offs. And make non data folks understand that too! Far too often, people are looking only at a target end-result of leveraging data. Many execs aren't leaning in to how to actually work with the data, set themselves up to succeed through data. Data isn't a magic wand, it takes effort to drive results. Relatedly, there is a disconnect between the impact of bad quality data and what business partners need to do to ensure data is high enough quality for them. Poor data quality results in 4 potential issues that cost the company: regulatory violations/fines, higher operational costs, loss of revenue, and negative reputational impact. There's a real lack of understanding by the business execs of how the data work ties directly into their strategy and day-to-day. It's not integrated. Good data work isn't simply an output, it needs to be integrated into your general business initiatives. 
More business execs really need to embrace data as a product and data product thinking. Instead of a focus on only the short-term impact of data - typically answering a single question - how can we integrate data into our work to drive short, mid, and long-term value? ?Controversial?: In data mesh, within larger domains like Marketing or Credit Cards in a bank, it is absolutely okay to have a centralized data team rather than trying to have smaller data product teams in each subdomain. Scott note: this is actually a common pattern and seems to work well. Relatedly, the pattern of centralized data teams in the domains leads to easier compliance with regulators because there is one team focused on reporting one view instead of trying to have multiple teams contribute to that view. When you really start to federate data ownership, business execs can now partner far easier with other business execs in other domains leveraging data. Instead of having the central data team trying to translate, there is a focus on what needs to get done and the data work flows from that instead of the data work being the focus. It's the engine that powers their collaboration but it's no longer 'the point'. Partnering with those who "are closer to the reality" of the business, it's easier and more likely to drive good outcomes. Meaning: not the senior execs. But the senior execs often have to be on board with the work and the target results. So work on communicating up but closely collaborating at lower levels. Data for regulators often has a LOT of potential reuse for your own organization. Lean into finding those areas where you can do the data work once and get value twice :) ?Controversial?: Really consider role titles in data mesh. Data product owner might be too nebulous and quickly accumulate too many responsibilities. Data product manager is easier to understand the scope of responsibilities and the specific areas of focus. 
Scott note: this comes up A LOT; teams generally start with data product owner and move to data product manager. ?Controversial?: Data leaders need to understand product management. To really scale data work, we have to start treating all aspects as a product practice. CTOs down to software engineers need to understand product management; it's time for the data org to do so as well. Data leaders need significant communication skills while maintaining their understanding of data best practices. It's all a delicate balance but the data work doesn't speak for itself. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used in this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#297 Panel: Understanding and Leveraging the Data Value Chain - Led by Marisa Fish w/ Tina Albrecht, Karolina Stosio, and Kinda El Maarry, PhD 58:10

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Marisa's LinkedIn: https://www.linkedin.com/in/marisafish/ Karolina's LinkedIn: https://www.linkedin.com/in/karolinastosio/ Tina's LinkedIn: https://www.linkedin.com/in/christina-albrecht-69a6833a/ Kinda's LinkedIn: https://www.linkedin.com/in/kindamaarry/ In this episode, guest host Marisa Fish (guest of episode #115), Senior Technical Architect at Salesforce facilitated a discussion with Kinda El Maarry, PhD, Director of Data Governance and Business Intelligence at Prima (guest of episode #246), Tina Albrecht, Senior Director Transformation at Exxeta (guest of episode #228), and Karolina Stosio, Senior Project Manager of AI at Munich Re. As per usual, all guests were only reflecting their own views. The topic for this panel was understanding and leveraging the data value chain. This is a complicated but crucial topic as so many companies struggle to understand the collection + storage, processing, and then specifically usage of data to drive value. There is way too much focus on the processing as if upstream of processing isn't a crucial aspect and as if value just happens by creating high-quality data. 
A note from Marisa: Our panel is comprised of a group of data professionals who study business, architecture, artificial intelligence, and data because we want to know how (direct) data adds value to the development of goods and services within a business; and how (indirect) data enables that development. Most importantly, we want to help stakeholders better understand why data is critical to their organization's business administration strategy and is a keystone in their value chain. Also, we lost Karolina for a bit there towards the end due to a spotty internet connection. Scott note: As per usual, I share my takeaways rather than trying to reflect the nuance of the panelists' views individually. Scott's Top Takeaways: If you want to dig deeper into the data value chain, consider looking into the value streams concept. What flows through your business in terms of process to generate value? Where are there points of value leakage? The same concepts are crucial in your value chain. Organizations need to really educate their entire organization on the data value chain. Part of why there are so many issues in data from upstream changes by developers breaking downstream data is they simply don't know what parts of their data are used and why. Communication is a much bigger aspect of doing data than people think. Even talking about the specific data value chain can cause people to focus too much on the data work instead of the business value delivered via data. The data value chain is crucial to understand but it's also crucial to understand data work doesn't inherently create value, it's about how it's used in the business. Dig into the value created and focus on working backwards from that to what data work needs to be done. The data value chain is crucial for companies of all sizes across all industries. At its heart, the concept is about focusing on ensuring you aren't leaking value in your business value streams/pipelines. 
You need to focus on what drives value and how to improve the processes there. Data value chains often cross line of business/domain boundaries. After all, a lot of the value of data is about combining information across those boundaries. That can mean cross-team handoffs, which make understanding and ensuring the success of those data value chains even harder. Who owns what isn't inherently understood/agreed to, you need to get specific. It's important to not get overly focused on a single end-point of value when it comes to data work, especially when it comes to a data product. If we want re-use, we have to focus on the processes of creating reusable value. Maintaining that larger picture focus while still ensuring each data consumer can still get value from a data product is a very hard balance. Focusing heavily on your data value chain is going to be hard. It means hard work and a lot of internal collaboration - and thus negotiation - across domain boundaries. You all have to be in it together to really get the best results - and some organizations aren't ready for that. But the hard work pays off because you are ensuring value actually gets created. As with anything in data, you have to make bets. That doesn't mean every bit of data work will create significant value or even exceed the investment. But an approach like data value chain is crucial to understand 1) what bets are you making and why and 2) who owns what aspects of the data work. That can help you really focus on the what and why rather than focusing on outputs. Other Important Takeaways (many touch on similar points from different aspects): As with many things in data, ownership is crucial to understanding your value chain. The weakest points in a value chain are the handoffs between teams. Strong ownership, including of those handoffs, prevents value leakage (from the value streams concept). To understand your data value chain, you will have to go deeper than many are willing to in the (dreaded?) 
operational plane. You have to understand what data you have, how it's collected, what data you can collect, etc. Some of it is working backward from what data you need/want but a lot of it is working from what data you have or can get. Relatedly, the value you can create from data is heavily reliant on what matters to the business. To think about value, you have to understand your business processes and what generates actual value. You really need to consider your approach to data collection and storage. How do you want to consider data that may have value but hasn't yet proven to have value? You don't want to have costs go out of control and most data is never back-cleaned/filled if it wasn't collected and stored for use. But you can't know all your data use cases at the launch of a new application or product. It's a balancing act. There is a question of how mature you need to be as an organization to really consider using the data value chain as a framework instead of merely some principles to guide your work. It can be hard to get people to understand the value and what drives that value in data when they don't understand data work in general. Relatedly, really digging into the data value chain can shine a light on underperforming activities inside and outside the data function. So you need to be prepared for some hard realizations and questions. Are you ready for transparency? What aspects of data value chains fall on the business? It's a hard question. At the end of the day, data value chains support the business value chains/streams but it depends on who has ownership over data work: the lines of business or a centralized team. Your data value chains should have explicit ownership, at least of the different 'links' of the chain. In data mesh especially but true in any data work, it's important to not see the data product as the end of the data value chain.
The data product is there to make it easy for producers to reliably and scalably deliver value through data. But there is only value if that data is consumed, the value happens when someone takes action. When launching new applications/products, you have to consider what data you might want to collect even if you don't need it right at the start. Especially if that is something like hardware where you can't augment many aspects of the devices once they've been deployed. Focusing on data value chains is a mindset shift for most organizations, much like data as a product thinking. You need to get people to stop handwaving about aspects of data work and focus specifically on value and understanding that all parts of the data creation and transformation process are crucial to driving rich and sustainable value from data. Even if you do a good job at understanding your data value chains, there will still need to be rework. But it can help you prioritize data rework - you aren't going to get your data preparation perfect, especially for multiple consumers, on the first try. You have to be realistic about your data value. Your company probably won't value data and analytics that are internal facing as much as they do external-facing interactions until you prove out the value of treating those internal users with as much care. Part of that is getting specific about how much value you are generating and how :) At some point in your value chain, you aren't dealing with raw data anymore. Think about who wants what and why. Most execs want aggregated information - again, that point of driving business value instead of data work. Make sure there is clear communication to drive outcomes instead of outputs. A data value chain isn't about getting everything perfect upfront. Everything is about incremental delivery and getting better. What is the cost/benefit of that improvement? Get something out that works and is supportable/stable and then improve. Iteration is your friend. 
When thinking about your data value chain, it's usually best to focus again on target business outcomes/objectives. After all, that is where the value is. You can get more business people interested in data work if you are constantly talking in their language about their key objectives. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#296 Patience in Product Thinking in Data - Building to Large-Scale Behavior Change - Interview w/ Darren Wood 1:02:58

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Darren's LinkedIn: https://www.linkedin.com/in/darrenjwoodagileheadofproduct/ Darren's Big Data LDN Presentation: https://youtu.be/vUjoJrl_MEs?si=WzB0sBStVIAyqDJs In this episode, Scott interviewed Darren Wood, Head of Data Product Strategy at UK media and broadcast company ITV. To be clear, he was only representing his own views on the episode. Scott note: I use "coalition of the willing" to refer to those willing to participate early in your data mesh implementation. I wasn't aware of the historical context here, especially when it came to being used in war, e.g. the Iraq war of the early 2000s. I apologize for using a phrase like this. Some key takeaways/thoughts from Darren's point of view: Overall, when thinking about moving to product thinking in data, it's as much about behavior change as action. You have to understand how humans react to change and support that. You can't expect change to happen overnight - patience, persistence, and empathy are all crucial aspects. Transformation takes time and teamwork. ?Controversial?: In data mesh, it's crucial to think about flexibility and adaptability of your approach. Things will change, your understanding of how you deliver value will change. Your key targets will change. Be prepared or you will miss the main point of product thinking in data. 
When choosing your initial domains and use cases in data mesh, think about big picture benefits. You aren't looking for exact value measurements for return on investment but you also want to target a tangible impact, e.g. if we do X, we think we can increase Y part of the business revenue Z%. Zhamak defines a data product quite well in her book on data mesh. But data as a product is a much broader definition of bringing product management best practices to data. That's harder to define but quite important to get right. When thinking about product discovery - what do data consumers actually need producers to create - there is often a big difference between consumers' initial suggested requirements and what they actually want. It's the role of data product management to bridge that gap and deliver what they need instead of what they requested. ?Controversial?: There is a big difference between data product management and regular product management: in regular product management, the ability to take requirements and go away and come back with something months later works. But data is about what's happening with the business now and needs to evolve as the understanding of requirements evolves. Relatedly, data products need to be even less rigid than regular products because they are a reaction to the real world as it changes; you must focus on building something flexible. Otherwise, you aren't going to be reflecting what matters. When building data products, get to an MVP fast. Get something in people's hands and evolve it to what they need/want. Don't try to get it perfect. When it comes to doing data products well, there is far more collaboration than most people are used to around data work. When considering what to build, data producers need to ask consumers what question they are actually trying to answer rather than what dashboard they want. Outcomes over outputs. It's about what you are trying to do.
Scott note: And then come to an agreement on the specifics - are you delivering data, the insights, or the 'so what'? ?Controversial?: Data products can - and probably should? - absolutely look to address multiple business questions. But when you are thinking about your MVP, focus on what is the most critical question and the minimum requirements to address that question. You can improve the product but getting to first valuable insights is a great initial milestone to build off. ?Controversial?: Similarly, in regular product management, you can go away for six months and come back with something new and shiny but in data, it might take that long just to change two metrics on a dashboard because it took that long to clean up the data. And at the end no one cares, no one understands what the value was, and worse still people are annoyed because they don't trust the new metrics. Relatedly, maybe consider MLP instead of MVP. No, not 'My Little Pony' but Minimum Lovable Product. What is the minimum you can deliver that someone will love - what are the need-to-haves instead of the nice-to-haves? What is at least one feature that users will love and can you deliver that early to maintain engagement as you deliver more aspects of value? Actually sit with users and see how they leverage your products. That's a crucial aspect of product management; it shouldn't be any different in data. It can be a bit harder sometimes to get to specifics but that doesn't let you off the hook. It's necessary work to do. ?Controversial?: Bring as many of the folks on the data product producer team as you can into the discussions with the initial data product consumer. That will give a more complete picture to the team. Don't treat the rest of the team as ticket takers internally. And you can probably find and address challenges earlier - e.g. flagging a low quality data source - if everyone is more informed. As with anything in product management, prioritization is crucial.
Focus on delivering value, not simply data products. Again, outcomes over outputs. Initial delivery of the data product to a consumer isn't a mark of 'done', it's a great place to focus on how the consumer actually uses and interacts with the data product. Get a sense of the friction points to find places to further improve it. No more throwing things - data projects - over the wall and treating them as done. Teams that understand product thinking in general are easier to teach product thinking around data. But for those teams that don't understand product thinking at all yet, you will need to spend more time with them. It can seem obvious but each team has different educational needs to bring them to product thinking for data. You can't try to win over everyone to something like treating data as a product yourself. You need to find your champions and advocates and then provide platforms internally for them to spread the messages of value and success. It's not only about delivering value but showing that off a bit too to get others excited. Use momentum levers. When choosing your first domains to work with and finding your first data products for data mesh, look for people that are enthusiasts. You want partners early on, not someone you have to constantly convince. Also look for data products that will be reusable to find additional users to provide additional value. It's as much about building momentum as it is about delivering value early. ?Controversial?: Behavior change is a crucial aspect of implementing data mesh. You have to understand behavior change takes time. And you have to know where you will be rigid and where you will be more flexible. If you say my way or the highway, most people are just going to head for the highway. You have to work out what is non-negotiable early. Relatedly, behavior change happens at different paces for different people. That will happen inside your organization too. 
Look to build up the critical mass, that momentum, over time so people feel like they are joining a successful movement. But you need to do your internal PR to make sure they know about it - data success isn't self-evident. During your transition to data mesh - and is it ever really done? -, you will have data sources that aren't part of the mesh. It will potentially be hard to integrate those with data from the mesh because your data in your data mesh has legal and governance at the core. Sources outside the mesh tend to have one-off policy approaches applied. Be prepared for some tension and consternation. In most cases, people will inherently trust existing data sources over new data sources. That will be a challenge to your data mesh adoption. Even if they objectively know the quality is better with the new source, it's human nature to trust what you already know more all else being equal. Involve consumers heavily in the conversations about what is changing and why. Pushing change on people is likely to cause pushback. No one wants change to happen to them instead of with them. When doing Domain Driven Design (DDD) in data mesh, do NOT try to start with all your domains at once. Look to find ones that are capable enough and that you can learn from to make it easier to partner with other domains in the future. Relatedly, your domain map will change. That's a part of DDD. Don't try to hold onto things rigidly. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. 
If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#295 Data Shouldn't be a Four-Letter Word - Making Data a Forethought - Interview w/ Wendy Turner-Williams 1:16:25

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Wendy's LinkedIn: https://www.linkedin.com/in/wendy-turner-williams-8b66039/ Culstrata website: https://www.culstrata-ai.com/ TheAssociation.AI website: https://www.theassociation.ai/ In this episode, Scott interviewed Wendy Turner-Williams, Managing Partner at both TheAssociation.AI and Culstrata and the former CDO of Tableau. TheAssociation.AI is "a global nonprofit business organization …focused on bridging the disciplines of AI, data, ethics, privacy, robotics, and security." It is focusing on things like networking and knowledge sharing to drive towards better outcomes including ethical AI. Some key takeaways/thoughts from Wendy's point of view: Right now, we try to break up the aspects of data into discrete disciplines - and then work on each completely separately - far too much. Privacy, security, compliance, performance, etc. Instead, we need to focus on the holistic picture of what we're trying to do and why. Communication is key to effective data work and driving value from data. Hire product managers and focus on the why. Break through the historical perceptions of data as a service organization. Drive to what matters - outcomes over outputs - and focus on delivering value. "What's the point of being focused on the data if you don't understand the business that the data is supposed to be used for?" 
?Controversial?: "There is no transformation without automation." If you want data to play a part in transforming the business, you need to focus on automation. Data related work can't be toil work or most won't even do it. "You will never be as successful as you can be as a data organization if you're not able to influence your IT partners, your product teams, your business teams." For far too many companies, data is just an afterthought. It's not the core around how they build out initiatives. When you bolt-on the data to any aspect of the business instead of integrate it from the start and build with data in mind, it's far less impactful. You're always playing catch-up. Make data a forethought. In many respects, data has become a 'four letter word' to lots of people - meaning it has a bad connotation. There are a lot of internal politics around data. Data can mean power and it can also give people perspective on your team's performance. Try to work towards removing the politics if possible but also good luck… 😅 There is so much data in many large organizations that execs can't make sense of it. They often don't understand what data they will need to support their decisions or how to get in place the data they do know they need. There's also often a disconnect between strategy and targets/feasibility when it comes to data. There may be a strategy of grow X product with a target of 'grow X product 15%' but there isn't a good reason why 15% is the target. It becomes a dartboard instead of data feeding into creating the goals. Execution and tactical decisions are powered by data far less often than they should be. There is far too little thought or process around strategy and tactics enabled by and about creating data. Many line of business or domain leaders are simply not great at data. They may be able to leverage insights but they don't get the information cycle, especially sourcing necessary data. 
Data teams need to partner with them effectively - that is definitely a two-way street. ?Controversial?: Relatedly, too many data people are focused on the data work itself instead of the impact of the work. There needs to be a better understanding of what teams are trying to accomplish with the data work. It's not about the pipeline, it's about the goal of the work and the impact. You need internal processes and clear delineation of ownership or you will have multiple teams measuring the same things and getting different answers. Far too often, people are myopic in focusing on their own job instead of how they fit into the bigger picture of the organization and delivering value to customers. That leads to teams not considering how they exchange information internally, only focusing on their own usage. ?Controversial?: Data teams need to spend more time creating their own data around the impact of data work and impact of issues like data downtime. Move past the service-only/cost-center perspective. Lack of data fluency, especially among execs, causes so many issues. If people don't understand data, they don't understand how much they can trust it and thus won't rely on it. Relatedly, there is a significant lack of understanding of upstream and downstream data and business processes and needs that could be fixed by better communication. What do you need from your upstream and how are your downstream users leveraging your data? Communicate! Automating data work enables business partners to identify "business choke-points" and address them. ?Controversial?: You can't have AI without really understanding your business processes and how data supports those, how people combine data and their degree of trust and understanding of the data. Wendy started out with her perspective that in some respects, data has become "a four letter word". There's so much data and everyone is trying to use it but everyone also feels inundated. 
Instead of being data-driven, we are data-flooded or data-dragged. And there is a major lack of tying the data work to the actual strategy and execution. Where do we need data to support our decisions? We need a strategy to get that data in place. Relatedly, Wendy sees the major breakdown between strategy and goals when it comes to data. There may be a strategy to grow a product but how much growth is feasible, a good target? Why is that growth feasible? What does the data say about growing that product, e.g. the market dynamics and your positioning in the market? So when goals are set, it is a 'finger in the air' type guess as to how much it could grow or worse, simply how much leaders want a product to grow. And then what data do we need in place to enable the team managing that product to actually be able to grow it that much? How do we enable them to make smart tactical decisions? Basically, it's a lot of things looping back on each other. We need data to set good strategic decisions. But we need a strategy to set up our ability to capture and analyze that data. We need data to make better tactical decisions. But most companies lack the ability to make good tactical decisions to get the necessary data in place. It's top-down driven but far too often the ones who understand what needs to be done for and with data are too far down in the organization and there's a communication gap. Thus, there is a significant lack of being data-driven. We have to admit the problem first. How to fix all of that is another fun process 😅 For Wendy, far too many organizations have data as an afterthought. And that leads to subpar understanding of what's actually happening with the business and lacking the information to fine-tune their strategic decisions. There isn't a strong strategic connection flowing from the business strategy to the data work. Execs aren't spending the time to really follow-through on exactly what the tactics should be to get the right data in place. 
Scott note: we literally have a panel on doing that, tying the data work to the business strategy and vice versa 😎 episode #251 In many parts of many organizations, e.g. Marketing, Wendy sees there being very competent people who just don't really understand how to do data well. They need a great partner. Should the marketing leader be focused on what data sources they need and why? Or should we be able to translate their needs into the work? But first, we need to actually be able to partner and they need to understand their needs. Data people can supercharge their efforts but the business partners need to lean in. Scott note: in data mesh, part of the role is enabling them to get better. We need people to up their fluency but doing data mesh or not, everyone starts somewhere and we need to help them level up. On the flip side, Wendy also sees how often data people are stuck focusing on the data work instead of the business aspects of what that data work is tied to. Without the business context, all you are doing is pushing 1s and 0s. What do business partners need and why?! There needs to be the ability and the courage to just hammer out the understanding differences or the problems will persist. Wendy also gave some specific examples of too many cooks in the kitchen relative to certain measurements. Instead of there being one official perspective or measurement for something like usage of a cloud product, in a previous role there were many measurements across engineering, finance, marketing, sales, etc. And every single one was different because they all used slightly different methodologies and even sources. So when they tried to look at success of the product, everything told a different story. And when they tried to have a simple bill the customers could understand, it was just not possible. While single source of truth is a complicated and overloaded term, one official source of truth for a question is something you should be able to rally around. 
A big problem in many organizations is people are only focused on their own job and lose sight of the bigger picture and especially how they play into that bigger picture of the organization's success according to Wendy. Even if your role isn't directly improving the customer experience, your work can have a positive impact on that if you drive towards that goal. Sometimes, politics around data also gets in the way of collaboration across teams and lines of business. Wendy talked about another persistent problem in data: the service model. If your data teams are only focused on supporting other teams, you can lose sight of your big picture impact as well as the impact of bad data. She believes data teams need to spend more time creating their own data around their impact and also quantifying the costs of data issues. What are the actual impacts to the organization? And do execs outside the data team understand data well enough to understand those impacts? If they don't understand data, can they even trust it enough to rely on it? Circling back to the bigger picture, Wendy believes that teams can drive significant process improvements if they just understand the impact of their work - especially through data - upstream and downstream. What do they actually need from others? Who is consuming their data and why? What impact will changes have? How are communications set up to prevent issues and create strong understanding and trust? And then of course, try to automate as much as possible to lower the burden on everyone involved in the data flowing around :) As part of that, please just hire good product managers 😅 Wendy said, "You will never be as successful as you can be as a data organization if you're not able to influence your IT partners, your product teams, your business teams." Data is a team sport, data is about making the organization better. You need others to play with you or it won't work. 
When thinking about actual business transformation around data, Wendy said, "There is no transformation without automation." Historically, doing data work has required a lot of effort. The business side just wants to leverage the data, so help them automate as much as possible. Otherwise many - most? - business partners won't want to engage with the data and leverage data to improve their work - it's too much effort. Also, removing the friction from data work helps people identify the friction in general business processes. So automating the data work allows them to more easily identify and then address "business choke-points". For Wendy, too many aspects of data work are treated as wholly separate disciplines instead of as parts of one whole: security, privacy, compliance/regulatory, performance, etc. We have to shift it left but also stop trying to treat them as discrete challenges to overcome instead of interoperating aspects of a working, scalable solution. Think data by design 😎 That's why she created TheAssociation.AI, "a global nonprofit business organization …focused on bridging the disciplines of AI, data, ethics, privacy, robotics, and security." She said, "there is no security, there is no privacy, there is no ethics, there is no AI without data," but we also need organizations to actually implement their policies into their data and data work. There isn't going to be ethical AI without someone leading that charge and TheAssociation.AI is looking to push that effort forward. In wrapping up, Wendy circled back to the start: what is the point of doing data work? She said, "What's the point of being focused on the data if you don't understand the business that the data is supposed to be used for?" Being a data leader, especially the CDAO, is VERY tough because you often don't own much of the infrastructure if at all and have to do your work essentially via influence. 
But if you build the right relationships and understanding of the business, you can still have a major impact and drive significant value for your organization. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#294 Panel: Product Discovery and Data Discoverability in a Data Mesh World - Led by Ecem Biyik w/ Frannie Helforoush, Marta Debska-Barcinska, and Ole Olesen-Bagneux 1:03:15

1 year ago

#293 Adapting Product Management to Data - Finding the Customer Pain and the Value - Interview w/ Amritha Arun Babu Mysore 1:05:31

1 year ago

Amritha's LinkedIn: https://www.linkedin.com/in/amritha-arun-babu-a2273729/ In this episode, Scott interviewed Amritha Arun Babu Mysore, Manager of Technical Product Management in ML at Amazon. To be clear, she was only representing her own views on the episode. In this episode, we use the phrase 'data product management' to mean 'product management around data' rather than specific to product management for data products. It can apply to data products but also something like an ML model or pipeline which will be called 'data elements' in this write-up. Some key takeaways/thoughts from Amritha's point of view: "As a product manager, it's just part of the job that you have to work backwards from a customer pain point." If you aren't building to a customer pain, if you don't have a customer, is it even a product? Always focus on who you are building a product for, why, and what is the impact. Data product management is different from software product management in a few key ways. In software, you are focused "on solving a particular user problem." In data, you have the same goal but there are often more complications like not owning the source of your data and potentially more related problems to solve across multiple users. 
In data product management, start from the user journey and the user problem then work back to not only what a solution looks like but also what data you need. What are the sources and then do they exist yet? Product management is about delivering business value. Data product management is no different. Always come back to the business value from addressing the user problem. Even your data cleaning methodology can impact your data. Make sure consumers that care - usually data scientists - are aware of the decisions you've made. Bring them in as early as possible to help you make decisions that work for all. ?Controversial?: Try not to over-customize your solutions but oftentimes you will still need to really consider the very specific needs of your consumers. Build for reuse but also build where your consumers are actually having their needs met. A mediocre solution for all is usually worse than a few specialized solutions. Prioritization is crucial in product management. That applies to features within the products but also the products themselves. There are many potential use cases that won't be met because there isn't enough value. That's the name of the game, return on investment; it's not about capturing all value possible. Communication and building relationships/trust are foundational in product management. It's an art as much as a science. If you can't have tough conversations and get alignment, it is FAR harder to build a product that meets customers' needs. Relatedly, establish regular communication with your customers. You shouldn't only be talking to them when things go wrong. Stay on top of what is driving value for them and look to augment your product proactively, not only reactively. Product management requires patience as much as diligence. Sometimes your data product/element violates its SLAs but it's an outlier, a one-off. Don't look to overreact and jump to changing things. 
But you obviously need to have serious conversations if elements aren't meeting expectations over a more extended time period. If you aren't sure what products you should create in a new area, talk to people and find the points of friction. What are the pain points and is there enough value in addressing them to justify doing the work? It's crucial to deeply converse with potential users of a data product/element to assess if it's really going to be worth the effort. There is always a chance you build something that isn't used/valuable but through deep investigation and ideation with potential customers, you can avoid that far more often. When you are building something, even before it hits 'GA', get validation. You can save yourself a ton of effort in rework as you find a better solution sooner. Product management is about collaborating to drive towards value. You are there to prioritize and coordinate. You don't have to know everything, but your job is to uncover as much understanding as possible to maximize your value creation and minimize wasted work. Always ask what value building something for your customer will drive. But also ask what happens if we don't build it. What is the cost of not acting? The only constant is change, especially in data. Leverage a "loosely dependent architecture" to be able to adapt to change. And be open and honest with customers that things will change. Emphasize you'll work with them to adapt to those changes. Amritha started the conversation on some key differences between software product management and product management around data - whether specific to 'data products' or not. One similarity is the focus on solving a particular user problem but in data, you might have to build something to address multiple users' problems. A much bigger difference is that in data, you often don't own the entire process as you might be reliant on others to source your data. 
In software, you are generally building the data sourcing because you own the interaction creating the data. How the data is stored and collected throughout the upstream process impacts what you can do. The user problem, the business value, and the user journey are some key guides to doing data product management well for Amritha. Keep coming back to those as you build out your solution. Focus on understanding what the user really needs and work backwards to the sources. And then of course focus on making sure you are actually addressing user needs when you deploy the solution. There are many reasons a data element may not be performing up to expectations so be prepared to deep dive; is there a problem with what you've built, what's feeding your data element - maybe sources have changed or there is a quality issue - or is it just not performing to expectations because the hypothesis was wrong? Amritha dug a bit more into some challenges specific to product management in machine learning and AI. While data scientists want clean data, when possible they want to even be part of the process of selecting the cleaning methodologies - even that can impact the data enough to change outcomes. So really start from the process of bringing them in as a stakeholder as soon as you can and don't throw data over the wall at them. And if you already have something developed, share your methodologies and help them figure out if it's the right fit for them or if something new needs to be developed. Again, we want reuse but we also want solutions that address their problems. Always a hard set of needles to thread. "As a product manager, it's just part of the job that you have to work backwards from a customer pain point." Amritha questions if you are even building a product if you don't have a customer. What is the business value of the work? For a product manager of a product without a customer, are you focused on your own thoughts and biases rather than the needs of consumers? 
"So the point here is that at any given point, you have to be cognizant of who are you building this for, why, and what that is the primary customer. And the secondary is: who else if I build this, what are the impacts it will have on my secondary customers, or other downstream or interacting applications?" Amritha talked about one crucial rule in product management: prioritize. There are many use cases you _could_ solve but are they actually worth the effort? Think about what will impact your organization the most. Don't try to solve every use case and don't try to make products that can serve every potential customer - focus on delivering value. Scott note: this can be a slippery slope in data mesh. You want to take on use cases you actually can tackle when you are learning. Don't only go for the biggest value but also tackle problems where the juice is worth the squeeze, where the outcome is worth the effort. In product management, Amritha believes it's absolutely crucial to understand the art and the science. The science is more about is this product specifically meeting the needs it was designed for. Basically, measuring the level of success and determining if that's good enough or especially is it _still_ good enough. But even that last bit can be a bit of art. The real art is all about communication and building relationships. If you build the world's objectively best product but no one trusts it or understands it enough to use it, it's not a valuable product. You must build strong relationships and have the tough conversations with stakeholders, earning their trust, to align on what needs to get built and why as well as when a product isn't meeting expectations. Establish regular lines of communication so it's not that the only time you talk to your customers, it's bad news or big changes. Continue to extract information from them to drive to business value. 
When it comes back to the science, that's when Amritha believes you should dig into why something isn't meeting expectations from the technical perspective :) And have some patience around that. Sometimes it's a blip on the radar, not anything more. When figuring out what products/data elements you might want to build in a specific area, Amritha recommends digging into the potential workflows and user journeys. Start to really think about what you think could exist and why. But, instead of trying to ideate only yourself, go and talk to people and listen for their pain and points of friction. They may not even realize they have pain but you can find the challenges that people will want to address. Again, work backwards from the user journeys to discover what products you should build 😅 Amritha talked about how to maximize the chance that what you're building will be used/valuable. A lot of it is simply digging in deep with potential customers in the ideation phase to make sure this will actually drive value. There are ways to do that but a lot of it is simply spending the time to really understand the likely impact of what you're building. As Alla Hale said in episode #122, "What would having this unlock for you?" Also, ask, "what if we don't do this, what is the impact of not doing this?" And make sure to get validation as you're building. It might be the value hypothesis was wrong or that you're building something that is the wrong or suboptimal way to address the challenge/opportunity. You can save yourself a lot of headaches and rework. It's all about that collaboration to drive to value. In wrapping up, Amritha talked about how changes, especially in data, are inevitable. Make sure to communicate with consumers so they have realistic expectations. Sometimes those are proactive changes but often, you don't have that much control over changes, especially coming from upstream in data. 
Look to build in a way that can adapt and leverage a "loosely dependent architecture".

#292 Aligning Your Data Transformation to the Business - Interview w/ Nailya Sabirzyanova 1:05:30

1 year ago

Nailya's LinkedIn: https://www.linkedin.com/in/nailya-sabirzyanova-5b724310b/ In this episode, Scott interviewed Nailya Sabirzyanova, Digitalization Manager at DHL and a PhD Candidate around data architecture and data-driven transformation. To be clear, she was only representing her own views on the episode. Some key takeaways/thoughts from Nailya's point of view: When it came to microservices and digital transformation, we aligned our application and business architectures. Now, we have to align our application, business, and data architectures if we want to really move towards being data-driven. To do data transformation well, you must align it to your application architecture transformation. Otherwise, you have two things transforming simultaneously but not in conjunction. It's crucial to involve business counterparts in your data architectural transformation. They know the business architecture best and the data architecture is there to best serve the business. That is a prerequisite to enable continuous business value-generation from the transformation. Re a transformation, ask two simple questions to your stakeholders: What should this transformation enable? How should we enable it? It will give them a chance to share their pain points and their ideas on how to address them. 
The business stakeholders know their business problems better than the data people 😅 Your approach to data mesh, at the start and throughout your journey, MUST be adapted to your organization's organizational model and ways of working. Everyone starts from completely different places. Data mesh won't work if you overly decentralize. You must find your balances between centralization and decentralization yourself. ?Controversial?: Historically, teams were charged for data work and resources but with something like data mesh, they can manage their data and data costs far more efficiently. Framework processes, tools, and skills help teams to identify which data is valuable for their own or other domains and requires investment, and which data or data processing operations are redundant, and thus, a source of savings. ?Controversial?: You should consider two phases of your early data mesh implementation: “foundation” - enabling teams to own their data by building corresponding teams, processes, and tools and “operationalization and scaling” - enabling/incentivizing them to share their data well with others. They have overlap but if you don't focus on enabling to own data for themselves, you may have trouble incentivizing them to even own their own data let alone share it. To drive incentivization and prioritization well to do something like data mesh - or really most large-scale transformations - you need top-down support from the highest levels in the organization. ?Controversial?: In highly regulated industries, you will have domains that already have very strong governance practices. Focus on enabling them to safely manage their data within a new framework, rather than trying to change their ways of working. Relatedly, focus on the frameworks and guidelines as well as the tooling to enable those domains that aren't nearly as advanced in their governance. 
If you are looking to federate/decentralize to all domains at the same time, consider how you leverage central committees. If you don't have someone helping guide people towards some degree of consistency, you can't find repeatable scaling patterns/best practices and are likely to create data silos. The level of importance your company places on your data transformation - mesh or otherwise - should determine how you combine it with your digital transformation. It might be under it as part of digital transformation, entirely separate but at a peer level, etc. But they should get aligned no matter what to have the best impact. Everyone must understand that data mesh is a journey. You will learn and adjust along the way. Getting budget for your data transformation / data mesh journey is probably more political than many expect. The easiest way to get budget is high-level management attention. Scott note: but of course, it can be hard to get that buy-in first 😅 Nailya started the conversation on the need for application, business, and data architectures to all be aligned. She gave some history about how application and business architecture were brought into alignment when we started with microservices and digital transformation. But now we have to add data into the mix which makes things even more difficult. We need both the application and data architectures to be designed to specifically support the business goals. When it comes to actually transforming your data architecture, Nailya believes the transformation should be led from the business side where possible. At the very least, the business side should be involved. They understand the business needs best so they can help direct the transformation to serve those needs. 
Great data work that doesn't support the business needs is often just a well-designed money pit 😅 You obviously need the strong data expertise but a transformation led exclusively by the data team is far less likely to align to business goals and priorities. In a successful past digital and data transformation, Nailya used two simple questions to the key business stakeholders: What should this transformation enable? And how should we enable it? It gave the business stakeholders a chance to fully lay out their pain points as well as some ideas how to address them. That way, the team had a very broad perspective and could come back to each of the stakeholders with solutions that worked for them somewhat tailored to their needs and thoughts. You need repeatable patterns/approaches but you also need people to feel seen and heard in order to drive buy-in where you address their specific pains and ideas. When asked about adapting data mesh to an organization's specific challenges, Nailya pointed to how every culture is so different and you need to take into account how people internally exchange information and work together as you design how you want to go forward. Overly decentralizing - so not doing any federation - or decentralizing too quickly won't work well. You have to find your balance between centralization and decentralization throughout your journey. One interesting buy-in point Nailya mentioned was cost control over data work. Because teams have traditionally been charged for data resources and work by central data teams, they were not as involved in managing costs. Data mesh empowers business teams with tools to control cost-effectiveness of their data, and thus they can identify easier which data is valuable for their business and requires investment, and which data or data processing operations are redundant, and thus, a source of savings. 
They can see it as a chance to do things better and align better on what work is worth doing - the central data team might have done work that the domain doesn't see as valuable when really considering it more deeply. At first, this might cover only their domain-internal data work, before we can get them bought in that they are now also responsible for sharing their data with other domains. Nailya talked about her experience with a large-scale data mesh implementation. They focused on first enabling teams to own their own data. So again, giving them the chance to gain transparency into their most valuable data as well as define, align, and prioritize their data initiatives. Then, they started to work to incentivize and better enable them to share their data with the rest of the organization, identifying new use cases and data customers. This may delay the biggest benefits of data mesh - high-quality, reusable data across domain boundaries - but it does mean that teams aren't struggling to own their data at the same time as learning to share it with others; this also helps with the incentivization challenge as they can take advantage of their data first for themselves before being asked to focus on sharing it with other domains. Data governance in data mesh will - unsurprisingly - be hard for every organization in Nailya's view. If there are domains that already know how to handle their data well, work to enable them to better share their information but also don't try to push them towards central ways of working. If they can safely secure and share their data in a way the rest of the organization can consume it, don't get in their way. But you should also look to create frameworks and standards for those that aren't as mature to help guide them along. Scott note: this is an adjustment for those that are already somewhat decentralized with their data work. Again, adjust for your circumstances! 
Nailya also recommends central committees to ensure teams are meeting some degree of conceptual consistency and also technical/architectural consistency. That way, you can really find your scalability patterns and best practices. Scott note: if you decentralize/federate all at once, this might be the best bet. But many - most? - are going domain by domain so this function is already embedded in an enabling team. In her experience, Nailya believes that aligning your data and digital transformation is very important to be able to succeed at data mesh. Again, you need to align the application and data architectures with the business architecture. Really take stock - is your data transformation part of your overall digital transformation, are they at the same level and should be partnered, etc.? Nailya again circled back to engaging with your business partners/stakeholders to have them help design your transformation efforts. They will lean in from feeling seen and heard but also, they know the most critical pain points best and can help direct where you should focus first. The conversation finished up around getting and maintaining a budget around your data mesh implementation. For Nailya, this is typically far more political than many might expect. But if you have the proper top-down support and management attention/buy-in, you should at least be able to get going. You have to show value along the way but that should be part of the prerequisite to start: what value are you trying to capture and how will you measure it?

#291 Panel: Data as a Product in Practice - Led by Jen Tedrow w/ Martina Ivaničová and Xavier Gumara Rigol 1:01:48

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts here. Provided as a free resource by Data Mesh Understanding. Get in touch with Scott on LinkedIn. Transcript for this episode (link) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info-gated) here. Jen's LinkedIn: https://www.linkedin.com/in/jentedrow/ Martina's LinkedIn: https://www.linkedin.com/in/martina-ivanicova/ Xavier's LinkedIn: https://www.linkedin.com/in/xgumara/ Xavier's blog post on data as a product versus data products: https://towardsdatascience.com/data-as-a-product-vs-data-products-what-are-the-differences-b43ddbb0f123 Results of Jen's survey 'The State of Data as a Product in the Real World' (NOT info-gated 😎👍): https://pathfinderproduct.com/wp-content/uploads/2023/12/2023-State-of-DaaP-Real-World-Study.pdf?mtm_campaign=daap-study&mtm_source=pp-blog&mtm_content=pdf-daap-study In this episode, guest host Jen Tedrow, Director, Product Management at Pathfinder Product, a Test Double Operation (guest of episode #98), facilitated a discussion with Martina Ivaničová, Data Engineering Manager and Tech Ambassador at Kiwi.com (guest of episode #112), and Xavier Gumara Rigol, Data Engineering Manager at Oda (guest of episode #40). As per usual, all guests were only reflecting their own views. The topic for this panel was data as a product generally and especially how we can actually apply it to data in the real world. This is Scott's #1 most important aspect to get right when it comes to doing data - especially data mesh - well. It's the holistic practice of applying product management approaches to data. 
It ends up shaping all the other data mesh principles and is a much broader topic than data mesh is in his view. But it can also be quite simple in concept when you really boil it down, it just takes patience and focus. Scott note: I wanted to share my takeaways rather than trying to reflect the nuance of the panelists' views individually. Scott's Top Takeaways: At its core, data as a product is more an organizational mindset approach than anything else. It is something you work towards. It's not an overnight change but getting the mindset right first - at least with a small core group - will help the organization figure out how to best move towards treating your data as a product. Data as a product and data product thinking must not fall into only thinking about data products - It's product thinking about data! In general product management, it isn't about creating a product, it's about creating an experience for the customer that generates value for them in some way. The best way to do that sustainably in data is a data product. But the value and experience are the point, that's your focus, the data product is merely a vehicle to deliver those. Think about good and bad product experiences. The incomprehensible manuals and unintuitive design of a bad product. The awesome tutorials and documentation around a good one. We need to create easy paths for people to not only discover our data products but discover the best ways to generate value from them. Data as a product includes considering your data sourcing strategy. If you think about a physical good, you don't just have the labor to put it together, you need to consider the materials that need to be combined to create the product. It's not just about what parts you have laying around, you manage the supply chain - hopefully reliably and scalably - to make your widgets. The same should be part of data product practices. It's not just what data you have but what data you will need. 
If we actually want our organizations to be data-driven (whatever that means to you 😅), people need to be able to rely on the data. If we want to embed leveraging data into the DNA, the day-to-day operations, of the company, we need to make our data creation and management processes reliable. The best way to do that is via data products because they make it easy and reliable for consumers to leverage data. The data isn't the point, it's the mechanism to drive business value. A first step on the road to your organization treating data as a product is getting away from the 'data team as a service' mindset, the idea that they are ticket takers. The data team should be a value-generating organization rather than a cost center or ticket-taking one. A key aspect of good products is usability. We need to focus more on usability in data. That is somewhat wrapped into user experience but there are other aspects that are even more often overlooked than UX. A lot of that usability falls on the platform as well so there isn't a different user experience for each data product. Moving to a data as a product approach, instilling that data as a product mindset in your organization, will be hard. And it is quite a bit of cognitive load for people who haven't focused on data historically. Look to introduce the concept and changes needed to implement data as a product over time. This isn't a switch you flip. Other Important Takeaways (many touch on similar points from different aspects): Data as a product is a lot about the mindset and approach you bring to data. What have we learned from product management - in software and elsewhere - that we can bring to data? Most good general product experiences - at least on the consumer side - don't need a ton of hand-holding. Can we actually get there on data? Do we want it to be that easy since it's still easy to misinterpret the data? This balance will be different for every organization and probably every data product. 
Relatedly, fully self-service is a real question. You want to lower the amount of information requests to data owners but some of those questions can be value generating. So you want to build out the experience - especially documentation - to answer the basics but there's a question of how far you go relative to trying to document everything. There's a major question about how much change management is involved in learning to treat your data as a product. Is it just a mindset shift? Probably not. But then how do you actually change the organization to start focusing on data as a product? Jen said, "… the primary purpose of data as a product is to maximize data as utility." There isn't a single solution or approach to 'solving' product management. There won't be one for data as a product either 😅 Prepare yourself and your teams for that. It's going to take sustained learning and evolution to get better and better. There is no 'done' but there is a ton of value to accrue along that learning journey. It's okay - if not ideal - to have multiple things in your organization called 'data products'. Not all have to meet the data mesh definition. But make it clear to people what you mean around data products - potentially call them mesh data products - or their thinking on data products will be, "okay, but just what the heck is a data product?" 😅 Software product management is to software products as data product management (the application of data as a product) is to data products. No one thinks software products and product management are the same. We shouldn't in data either. Data work and learning how to do data well both generally have a high cognitive load. Be prepared for that. Don't expect everyone to get it right away, whether that is the why of treating data as a product or the how do we actually do this :) Data as a product will be hard to instill even inside your data team. Again, this will take time and you have to let people know the why and the how. 
Similarly, you need to be prepared for sustained effort to communicate the benefits to those in the business. People aren't jumping up and down to own their own data and especially not to do it well 😅 Are people ready for the best product approaches in data? Doing product management well is about trying, fast failing, learning, then iterating. Are people ready for there to be more continual change in how data is served? Part of product management is stakeholder management and especially communication. In data, we need to move past requirement gathering but we also have to find better ways to communicate value and get - and then retain - buy-in. Product management, at the end of the day, is about capturing value through creating value for others. With data as a product, you need to understand what value is expected to be created from your work and where you might increase that value. But data doesn't have inherent value unless it is used so you need to stimulate data usage to create value. An important benefit of managing your data as a product is the improvements in data user experience. That means more time spent on creating value instead of wrangling data. Part of good product management is product marketing. That means discovering what data should exist but also finding champions, internally marketing successes, etc. You need people to see the value of the data work to get them to want to lean in. People aren't paying close enough attention to inherently know the impact of the data work, you have to tell them 😅 Products have owners. Treating your data as a product means your data has strong ownership. It's easy to say you are creating data products but really instilling that ownership is crucial to doing data at scale reliably. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. 

#290 Applying Platform Engineering Best Practices to Your Mesh Data Platform - Interview w/ Tom De Wolf 1:05:45

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts here. Provided as a free resource by Data Mesh Understanding. Get in touch with Scott on LinkedIn. Transcript for this episode (link) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info-gated) here. Tom's LinkedIn: https://www.linkedin.com/in/tomdw/ Data Mesh Belgium: https://www.meetup.com/data-mesh-belgium/ Video by Tom: 'Platform Building for Data Mesh - Show me how it is done!': https://www.youtube.com/watch?v=wG2g67RHYyo ACA Group Data Mesh Landing Page: https://acagroup.be/en/services/data-mesh/ In this episode, Scott interviewed Tom De Wolf, Senior Architect and Innovation Lead at ACA Group and Host of the Data Mesh Belgium Meetup. Some key takeaways/thoughts from Tom's point of view: Platform engineering, at its core, is about delivering a great and reliable self-service experience to developers. That's just as true in data as in software. Focus on automation, lowering cognitive load, hiding complexity, etc. If provisioning decision specifics don't matter, why make developers deal with them? The key to a good platform is that it is something your users _want_ to use, not simply something they must use. That's your user experience measuring stick. When building a platform, you want to hide a lot of the things that don't matter. But when you start, especially with a platform in data mesh, there will be many things where you aren't sure whether they matter. That's okay: automate the decisions that don't matter as you find them, but exposing them early is normal/fine. Relatedly, make it easy for developers to see behind the curtain if they care. 
Sometimes it matters to 5% of use cases but also often, engineers really want to understand the details just because they are engineers 😅 Make a platform where people can customize their experience where possible without going overboard. ?Controversial?: Few - if any - current tools in data are "aware" of the data product, they are still focused on their specific tasks instead of the target of creating an actual data product. Relatedly, the developers should be able to focus on creating and maintaining data products instead of focusing on leveraging specific tools. We need platforms that allow them to deliver value through creating and managing data products, not a focus on working with tools. ?Controversial?: Data mesh without technology is just theory. It can't only be about the people - if you focus on evangelizing without anything practical to show, it is too theoretical or abstract for people. You need a platform early to be able to show people what you mean. Scott note: you need a thin slice that has at least some aspect of all the 4 mesh principles early or your implementation becomes lopsided. Relatedly, get to something to show people in a demo as soon as possible with your mesh implementation so they can picture it and understand what you're trying to accomplish. In data mesh, you will still have data developer power users that really want to dig deep. But a key focus of your platform should be to make it easier for non-power users to still build and maintain great and valuable data products. Expand the potential number of people creating data products by lowering the bar. ?Controversial?: The platform team shouldn't be a blocker to new data products being developed. However, you should probably have certain cost guidelines/guardrails so someone doesn't develop a very expensive data product - it should only go to a central team for oversight when cost becomes an issue. That way, you prevent unnecessary friction and costs simultaneously. 
When there is an escalation because of a problem with a data product related to cost or governance, look to frame it as a collaborative deep-dive into an issue. Rather than a central 'you can't do this', it’s much more of a 'why is this made this way? Is this optimal or can we change it?' That collaborative discussion can keep people engaged and leaning in. You can get domains more bought in on something like data mesh/data products by showing them how this new approach won't directly tie data schema to their application schema. That way, they can still easily make changes to their application schemas and not break their data products. Good engineering is all about managing tradeoffs. Platform engineering is no different. You'll have to look at what should be specific versus generic. Orchestration is one area that should be very generic. In data mesh, you want to think about the holistic flow of data and data product work. That's why we need a platform. But the tools aren't really that well built to work together. Be ready for that frustration of having to build on top of the tools to get them to play nicely. ?Controversial?: Even if you aren't doing data mesh, your platform should focus on abstractions. What matters and why is a fundamental question. You want people solving challenges that add value. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#289 Building the Right Foundations for Generative AI - Interview w/ May Xu 51:26

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . May's LinkedIn: https://www.linkedin.com/in/may-xu-sydney/ In this episode, Scott interviewed May Xu, Head of Technology, APAC Digital Engineering at Thoughtworks. To be clear, she was only representing her own views on the episode. We will use the terms GenAI and LLMs to mean Generative AI and Large-Language Models in this write-up rather than use the entire phrase each time :) Some key takeaways/thoughts from May's point of view: Garbage-in, garbage-out: if you don't have good quality data - across many dimensions - and "solid data architecture", you won't get good results from trying to leverage LLMs on your data. Or really on most of your data initiatives 😅 There are 3 approaches to LLMs: train your own, start from pre-trained and tune them, or use existing pre-trained models. Many organizations should focus on the second. Relatedly, per a survey, most organizations understand they aren't capable of training their own LLMs from scratch at this point. It will likely take any organization around three months at least to train their own LLM from scratch. Parallel training and throwing money at the problem can only take you so far. And you need a LOT of high-quality data to train an LLM from scratch. There's a trend towards more people exploring and leveraging models that aren't so 'large', that have fewer parameters. 
They can often perform specific tasks better than general large parameter models. Similarly, there is a trend towards organizations exploring more domain-specific models instead of general purpose models like ChatGPT. ?Controversial?: Machines have given humanity scalability through predictability and reliability. But GenAI inherently lacks predictability. You have to treat GenAI like working with a person and that means less inherent trust in their responses. Generative AI is definitely not the right approach to all problems. As always, you have to understand your tradeoffs. If you don’t feed your GenAI the right information, it will give you bad answers. It only knows what it has been told. Always start from the problem you are trying to solve rather than the approach you are trying to use. Then evaluate if GenAI is the right approach for that problem. Simple, fundamental stuff but it's crucial to remember: start with the problem before the proposed solution. Many people are leaping to use GenAI because their past approaches to certain problems haven't worked. Dig into those pains. GenAI may or may not be the right approach but either way it can be great for surfacing persistent challenges. Leverage people's enthusiasm for GenAI to have deeper conversations about general business challenges. It can really start to highlight friction points across organizational boundaries and who is responsible for what. Scott note: But as the data team, be careful not to try to fix the entire organization, that's not what you are responsible for 😅 Right now, despite all the hype, most organizations are still at most in small-scale PoCs around GenAI. There is less of an initial focus on return on investment versus what capabilities GenAI might unlock but there is also a focus on what risks GenAI may introduce. Despite the hype, many to most organizations are doing their diligence. 
May started with three general approaches organizations are taking to generative AI (GenAI): 1) building their own LLMs from scratch, 2) fine-tuning specific, pre-trained existing LLMs, or 3) leveraging pre-trained LLMs as is. Many organizations may want to do the first but it is prohibitively expensive to train your own LLMs from scratch just for the compute and you also need (very expensive) people with very specific expertise to do so. Tuning pre-trained models will likely become the standard approach for many organizations. However, being able to leverage LLMs on internal data in general requires "existing good quality data and solid data architecture." When considering training a model from scratch, May also pointed to time as an issue. Typically, it takes at least three months to properly train an LLM from scratch. Parallel training is helpful but you need to fine-tune results and retrain so you can't just throw compute at it and make the process that much faster. So again, you need high quality data - and you need a LOT of it - plus a fair amount of time plus a ton of money. Once you are in production, it also takes a lot of money and effort to keep the models running and tuned properly 😅 Luckily, according to some surveys Thoughtworks did, most organizations recognize training LLMs from scratch isn't the right call for them just yet. May is seeing a trend of people moving away from the 'bigger is better' mentality. More people are starting to explore more targeted and specialized models that have fewer parameters. And often, for specific tasks, they perform better than the first L in LLMs. So we may see a trend towards more and more targeted LLMs/models. Scott note: Madhav Srinath really leaned into this in his episode, #264. Humanity in general has benefited greatly from machines through predictability and reliability according to May. If they are made well, you essentially know what you should/will get from machines. 
But GenAI is designed specifically to act like humans and humans are not predictable and often not that reliable. So people have to get used to interacting with machines that may give wrong answers and are designed - in a way - to do so 😅 We can't expect predictability and reliability from GenAI. Relatedly, when thinking about where GenAI is the right choice versus traditional machine learning/AI, May believes you really have to dig into the tradeoffs. If you really understand the problem set and what you are trying to accomplish, traditional ML/AI is probably the better approach for you. You need to really understand where the strengths of GenAI will play and feed it the data/information it needs to succeed, otherwise you'll be asking an uninformed and unpredictable entity to solve your most pressing business problems. That's probably not going to go well… May talked about going back to the basics of problem solving when it comes to Generative AI: ask what problem you are trying to solve instead of starting from the way you want to solve a problem and then finding your way back to the problem. It can sound obvious but really, many are in such a rush to leverage these tools, it's crucial to stop and consider. Start with the problem before the solution 😅 GenAI may also surface a number of internal business challenges that aren't spoken about or that people have essentially given up on tackling previously according to May. We have a new tool in the toolbox so people want to see if it will be useful to tackle something they haven't been able to address well previously. Lean into GenAI as a conversational lubricant. GenAI may not be the right tool for every one of these challenges but it means there is more internal conversation and sharing :) From what May is seeing, many to most organizations are still in the early experimenting and PoC phase with Generative AI. They are trying to figure out what opportunities GenAI brings and also what risks. 
Despite the hype, people are taking their time, but they are less focused on initial return on investment and more on validating whether they can actually leverage GenAI to create value. Also, there is a strong trend towards domain-specific LLMs rather than general purpose ones, e.g. financial sector or media specific models. May finished on the idea that data mesh and other data management paradigms are crucial to doing something like GenAI right. There is still a strong need for quality data that is accessible, interoperable, privacy-aware, secured, etc. to be able to leverage GenAI well. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

Major Programming Announcement 4:25

1 year ago

Announcing moving to one episode per week :)

#288 Panel: Master Data Management in a Data Mesh World - Led by Ole Olesen-Bagneux w/ Liz Henderson, Piethein Strengholt, and Samia Rahman 1:04:59

1 year ago

IRM UK Conference, March 11-14: https://irmuk.co.uk/dgmdm-2024-2-2/ use code DM10 for a 10% off discount! Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Ole's LinkedIn: https://www.linkedin.com/in/ole-olesen-bagneux-2b73449a/ Piethein's LinkedIn: https://www.linkedin.com/in/pietheinstrengholt/ Samia's LinkedIn: https://www.linkedin.com/in/samia-rahman-b7b65216/ Liz's LinkedIn: https://www.linkedin.com/in/lizhendersondata/ Ole's book The Enterprise Data Catalog: https://www.oreilly.com/library/view/the-enterprise-data/9781492098706/ Piethein's book Data Management at Scale (2nd Edition): https://www.oreilly.com/library/view/data-management-at/9781098138851/ Liz's blog: https://lizhendersondata.wordpress.com/ In this episode, guest host Ole Olesen-Bagneux, Chief Evangelist at Zeenea (guest of episode #82) facilitated a discussion with Piethein Strengholt, CDO at Microsoft Netherlands (guest of episode #20), Liz Henderson AKA The Data Queen, a board advisor, non-executive director, and mentor in digital and data at Capgemini (guest of episode #106), and Samia Rahman, Director of Enterprise Data Strategy, Architecture, and Governance at SeaGen/Pfizer (guest of episode #67). As per usual, all guests were only reflecting their own views. The topic for this panel was modernizing master data management (MDM) and applying that to data mesh. 
It's a very challenging topic to cover because even people's general definition of MDM can be pretty different and there is a question of simply mastering data versus managing master data, and of managing it globally versus locally. It's a very tricky topic in data mesh. I sometimes use the term 'mastered data' because I think it is far more applicable in that situation than 'master data' - audio goes through a mastering process, and so must the master data, to actually reach a certain quality level. 'Master data' is more the core linking data. But even that is still just one person's definition. Scott note: As per usual, I share my takeaways rather than trying to reflect the nuance of the panelists' views individually. Also, there was a bit of a misstep around intros if that gets a bit lost in there. Scott's Top Takeaways: The historical impression of MDM - striving for that single golden record - needs to change. Trying to head down that path in a federated/decentralized approach is even more difficult than in a centralized world. And the benefits keep proving not to be worth the costs. You need to consider your master data management strategy to do data mesh well. Do you want official sources of truth relative to specific questions? Who owns data quality for core linking data? Etc. You don't have to get it perfect at the start but if people don't trust the data on the mesh, your mesh implementation probably fails. Mastering data can improve the quality - especially the provability of that quality - and thus trust. And without strong linking data between data products, do you just have high quality data silos? Relatedly, there is a tipping point in an organization related to size and complexity where you need to consider data mesh. There is a tipping point in a mesh implementation where you need to really start to push MDM. It doesn't have to be on day one but you should plan from the start around MDM. 
MDM - at least done well - is not about getting to 'perfect' data, it's about understanding the needs of your organization and helping people get to the right quality level and providing core data when it's needed. Not all data needs mastering. Not all data is master data. There's a difference between 'single source of truth' and 'most trusted source of truth'. It's absolutely okay to designate 'sources of truth' for specific questions. Other data sources may provide different perspectives on the same topic - e.g. customer is different in sales, marketing, and finance - but there needs to be one right, repeatable answer for things like regulatory reporting or financial statements 😅 One reason MDM is such a risk to data mesh is that, to be effective, some part of master data essentially has to be centrally managed. You need ways for domains to adhere to central standards, guidelines, policies, etc. Otherwise you risk silos. BUT centrally managed often leads to inflexibility. It can be tough to thread this needle. Look to provide the value add from MDM but limit the overhead and rigidity. How that will apply to your organization will be quite specific but go talk to others in your space to very specifically understand their approaches. No one-size-fits-all, no copy/paste. This is going to be very hard but don't skip it. If you are going to do MDM in your data mesh implementation, much like with anything in data mesh: test, learn, and iterate. Don't do a huge upfront implementation or it will cost far too much and limit your agility and flexibility far too much. Plan some aspects of your MDM implementation out ahead of time but trying to do everything at the start is a massive anti-pattern. There is a massive push and pull in data mesh - one where you have to find the right balance specific to your organization - between master data and maintaining the domain-level meaning and understanding of data in domain-specific data products. 
This is where the enterprise data warehouse often goes wrong: focusing on fitting the data together (master data) at the expense of its actual meaning and uniqueness. It's a balance. Scott note: I often call this local versus global maximization, this time related to business context :) Other Important Takeaways (many touch on similar points from different aspects): Liz asked an interesting question: if we are creating true high-quality data products, have we by definition mastered the data inside? Is that good enough? The answer to the first is maybe? The answer to the second seems to be no, we need to look at how data in a data product fits in the broader scope of the organization's data needs or we're just creating high-quality data silos as data products. That is that master data versus mastered data. Do we only need to consider data that is broadly reusable across the organization for mastering? It can be quite complex - and political too 😅 - figuring out what data is deserving of being mastered and what data should be master data. Historical MDM approaches have been extremely costly often without the return on investment to justify them. Look to consider a point Khanh Chau made way back in episode #44: if everyone is doing similar cleaning work instead of that being in the data product itself, your total cost of ownership skyrockets. In MDM, when there isn't a clear owner for data because it's so broadly used for linking, that may fall back to the data team. NOT IDEAL but such is life/the real world. Be super clear internally as to your definitions around what MDM means and its application. It will look different in any organization. Much like the fun "what is a data product?" conversation. Dive into the difference between reference data - e.g. standardized country codes - and master data. Data that can be combined but isn't of a high enough quality still needs to be mastered. 
It's not only about creating interoperable data but data you would actually want to use upon combining it. As Piethein mentioned, MDM should not be managed at the domain levels. That creates a massive mess of contention. MDM is especially key in industries with lots of externally purchased data - e.g. financial services or life sciences - because many domains all leverage the same data. It saves time and money to have that managed once instead of 10s of times 😅 Should data consumers push insights and data improvements back to their data producers? How do business logic and transformations of the data play into MDM? Does it matter what we call it as long as we allow value-add work to easily flow and become scalable? Because of the data products approach with very clear ownership, data mesh may actually make MDM easier than the traditional centralized approach. There is - at least there is supposed to be - clear lineage, documentation, other metadata, ownership, etc. so we can all understand more easily what data we should use and for what purposes. Or at least ask the owners for more information to get a better understanding of that. MDM without strong metadata ownership and management is just like data mesh without them: a disaster waiting to happen. MDM and DDD: there is a complex interplay between how MDM and Domain Driven Design work together. Every domain has its own unique 'language', even a somewhat unique language for communicating externally, but you do need some more broad language of the organization. Getting that crisp so your data products fit a broader taxonomy or ontology is going to be a MASSIVE challenge. If you don't have strong documentation and metadata management, you are far less likely to see a valuable MDM implementation, data mesh or otherwise. 
People will discover data and use it whether they really understand it or not 😅 Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#287 Driving Data Value Through Creativity, Curiosity, Collaboration, and Communication - Interview w/ Tiankai Feng 56:16

1 year ago

IRM UK Conference, March 11-14: https://irmuk.co.uk/dgmdm-2024-2-2/ use code DM10 for a 10% off discount! Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here .

#286 Mastering Master Data Management in a Modern World - Interview w/ Sue Geuens 54:07

1 year ago

IRM UK Conference, March 11-14: https://irmuk.co.uk/dgmdm-2024-2-2/ use code DM10 for a 10% off discount! Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Sue's LinkedIn: https://www.linkedin.com/in/suegeuens/ In this episode, Scott interviewed Sue Geuens, Director of Data Governance and Product Data at Elsevier. To be clear, she was only representing her own views on the episode. We use the phrase MDM to mean master data management throughout the episode. Some key takeaways/thoughts from Sue's point of view: At the end of the day, if you want to do data governance well, it's about the people. Go talk to them, find out their specific needs and desires and work to tailor your language - and presumably your application of policies when possible - to their situations. People want good data, help them get there! Relatedly, get good at telling stories about data work. Get people to lean in and get them involved. Personalize your communication! While policies and standards are crucial, they are about creating better data for the organization. Try to leverage them as a carrot instead of a stick. ?Controversial?: Don't talk about someone owning data. That's scary for most. Find ways to get them excited about owning the data without making it scary by using different phrasing. The key to doing data governance well is getting people to care. 
We need them to care about the data because others have to use it. And that means the people are the most important focus. Data governance is too focused on 'governance' and that means oversight. The word governance has a bad connotation for a reason - it makes many potential allies uncomfortable. So governance folks have to really work to make it less scary. Don't focus so much on the data aspects of data work when talking with stakeholders. It's about achieving outcomes through data, not data work itself. Focus on what gets your business partners excited and that's (unfortunately) usually not the data itself. It's easy to fall into thinking about what you want from others in governance. But where you can add far more value is starting with what others want from you and working back towards solutions that accomplish both your and their goals. ?Controversial?: Prioritization in data governance work is crucial. A good method is looking for who shouts loudest for help. They are ready to lean in and are more likely to be leveraged as your governance advocates once you help them. The two big reasons MDM initiatives have failed historically are 1) not having the governance, quality, and metadata embedded into the data and 2) striving for "perfect" data, that single golden record. Relatedly, in data we don't need perfection, the juice isn't worth the squeeze. We need good enough and we need to reflect the realities that our world is ever changing and there are multiple perspectives on the same data that can all be right/correct. It's easy to lose people if you start talking the 1s and 0s of data. Focus on finding stories that resonate with them. If they see the value of the work, you have a much better chance of getting them to actually do the work 😅 MDM in data mesh should be all about "ensuring that you get the right data for the right purpose at the right time for the right person." 
Sue started the conversation as other data governance experts have - the word governance strikes at least discomfort if not fear into the hearts of many of our colleagues. We need to expect that discomfort and be active in dispelling the myths around data governance as it really is about achieving better outcomes for all. But that means more carrots than sticks, which can be a tall task when it comes to things like regulatory compliance. Basically, it's not easy 😅 Another aspect Sue pointed to is that many - most? - data people really like to talk data. So, instead of talking about outcomes, they talk about the data work, and data work for the sake of data work has been one of the big historical challenges of data - instead we need to focus on the value that comes from the data work. If your business partners are already made uncomfortable simply by the phrase data governance, not leaning into the value they get from the data work and their target outcomes is likely to lose them even further. Start the conversation with what they might want from you, not what you might want from them. Sue specifically said she starts partnering with people by focusing on those target outcomes and how she might be helpful to them. Especially, what are their expectations of her? By trying to walk in their shoes, she can come to better conclusions and find working solutions. It's about getting them to lean in. Scott note: and then she can trap them! In a virtuous bi-directional value trap of course… Relatedly, prioritization in data governance is key in Sue's view. What are the problems that really matter? While the "who shouts loudest" test may not point to the most valuable problems, it often points to the problems people value most and thus you can find willing partners. Trying to force others to care about their data is a hard road but if people are ready for your help, you can make a huge difference and they are willing to lean in. 
Those are also likely to be your biggest advocates once you help them, gaining your governance efforts more momentum by leveraging champions. There are many reasons why Sue believes people are skeptical of master data management (MDM). Historically, there were two big reasons MDM projects failed. The first is not really focusing on integrating MDM into the data so not having the governance, quality, and metadata embedded into the data and processes. The second is the drive towards perfection. Instead of focusing on what was good enough, there was this focus on the 'golden record'. That led to inflexibility, poor scaling, high costs, etc. Good data work isn't about being perfect, it's about being good enough. Sue circled back to her focus on working with people. Good governance isn't about perfect data, it's about getting people to care about the quality of the data. That means working to get them to understand what is good enough and why should they care. It's not all just empathy - there needs to be some oversight and making it part of their job - but with humans in the loop, your data quality will be much better if you get people to care about who else uses their data and why. When it comes to actually getting people to understand data governance work - whether MDM or anything else - Sue recommends personalizing your communication. While that may not scale perfectly, again, find your key stakeholders and partners. Stories about data work in a vacuum just don't resonate - Scott note: is there a physics/sound joke in there as there is no air for sound to resonate in a vacuum…? 😅 - Getting people to understand that the work has a purpose and it really is useful to specifically them is crucial. Don't talk to the 1s and 0s of data! When it comes to specifically data ownership, Sue has seen just how scary that ownership word can be. It's not an easy task but we need to find ways to instill people with the excitement around ownership without the fear. 
Again, easier said than done, but it's about getting things to the right place, not about doing something right now. It will take time but it's better to do it right. If you don't take care to implement data mesh well, Sue believes it will be a far bigger mess than if you hadn't tried data mesh at all. (Scott note: strong agree) You need to come back to what you are trying to accomplish and what data needs to be put in place to do that. MDM in data mesh should be about "ensuring that you get the right data for the right purpose at the right time for the right person." In wrapping up, Sue emphasized the need to personalize your communication so that people do data work with care and prioritize it. You need to be able to speak to them in their language and get them excited about the impact of the work.

#285 Getting Depth and Value From Generative AI - In Data Mesh and in General - Zhamak's Corner 33 19:15

1 year ago

Key points: Thus far, most of the generative AI stuff Zhamak has seen is not that much of a differentiator. They are doing far better chat bots but that hasn't really changed the game. When it comes to any ML work - and GenAI is just a subset of ML work - engineers need data products to make their data work easy. Reliable sources of data, ability to version, etc. Data mesh obviously plays well there. Relatedly, we need to continue to make things easier for people to leverage data products for GenAI. Engineers shouldn't have to spend all their time moving data around and using many systems. GenAI really could be game changing in data mesh but right now we don't have enough information to really do it well. We need far more metadata around things like data products. GenAI often gives extremely shallow answers that just aren't that helpful. If we can get better answers, amazing. But right now, it's not there. Sponsored by NextData , Zhamak's company that is helping ease data product creation. For more great content from Zhamak, check out her book on data mesh , a book she collaborated on , her LinkedIn , and her Twitter . Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Data Mesh Radio episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#284 Breaking Down the Monolith - Incentivizing Good Choices - Interview w/ Frederik Nielsen 1:03:13

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Frederik's LinkedIn: https://www.linkedin.com/in/frederikgnielsen/ In this episode, Scott interviewed Frederik Nielsen, Engineering Manager at Pandora (the jewelry one, not the music one 😅). Some key takeaways/thoughts from Frederik's point of view: Your data technology and architecture choices incentivize certain behaviors. Consider what behaviors you want before you lock yourself in to anything. Advice to past data mesh self: "construct a data architecture and platform that can adapt to the business requirements and wishes [which] will change over time." Build a composable platform as it's "easier to adapt to changing business requirements." Focus on decentralization features and make it decoupled and composable. Trying to go too wide with your data mesh implementation at the start with all your domains makes it harder to really find your groove and build momentum. Cost transparency can be a big driver for data mesh adoption. Teams want to understand their costs and many organizations are driving cost cutting initiatives. Decomposing the monolithic approach to data means better understanding the cost of individual pieces of data work. Relatedly, when teams are responsible for their own costs, it's easier to spot when someone is making tradeoffs related to cost. 
It's a more tangible decision and can be a conscious decision to take on tech debt. When taking a concept like data mesh to the highest levels in the organization, attach it to tangible use cases. Make it something that is worth their while, the 'juice must be worth the squeeze'. Focus on the strategic business goals and priorities. It's okay to leverage management consultants. But your data ownership should very clearly be internal - external parties should not own any aspects if you want long-term success. Regarding consultants: "you would rather be driving them than them driving you." It's absolutely normal for some teams to be more data mature than others. If teams raise their hands saying they need help with their data work, your culture is mature enough where teams ask for help and they should get it where possible. ?Controversial?: It's potentially better to focus on your more data mature teams first when going with data mesh so you can move faster early. If possible, create golden paths or pre-configured approaches for less data mature teams to be able to still create data products. It can be hard to show domains why they should move to data mesh. Focusing on use cases is probably the best approach but finding use cases enticing enough to each domain can be a challenge 😅 Tying your data initiatives to the company strategic priorities is crucial to get buy-in. E.g. personalization and omni-channel experience - how do you tie your use cases back to what is most important to the business? At the heart of it, data mesh should be about driving business outcomes - especially the ones people really care about. Focus on that and you will have a far higher chance of success and getting/maintaining buy-in. Make sure you build your data products in a scalable way. That means understanding when you need to put information into separate data products instead of trying to combine it all into one - that is just a mini enterprise data warehouse / microlith. 
If your data team is remaining quite productive but the backlog is ever increasing as is the time between request and delivery, then your central data team might be a bottleneck. Consider addressing that with something like data mesh. ?Controversial?: Less mature domains can get a more "watered down" version of data mesh as they learn to actually manage and own their data. You don't need to start with the most complicated aspects and use cases first. Scott note: this can be a slippery slope When mapping out potential use cases, ask the amount of effort - if it's even possible - to execute in your existing (non-data mesh) architecture. If it's not possible, data mesh can mean far more data capability for the organization, which can be a great selling point. A decentralized architecture can mean cost savings by getting far more fine grained, e.g. shutting off test environments over the weekend or at night. You can find places to be more efficient far more easily. Frederik started with a bit about how their initial data mesh journey started - and it wasn't great 😅 It was led by management consultants and was focused on real-time data with a very tangible use case. However, two things came from it: 1) a better understanding of what data mesh should actually be used for and 2) buy-in around a very specific use case at the highest levels. So while there was a misinterpretation of data mesh and the use case wasn't the best fit, there was still excitement about the term - and somewhat the actual meaning broadly internally. Making it tangible got people to see the potential benefits. Cost transparency has been a major driver for data mesh internally according to Frederik. Because the costs in a large monolithic stack are very opaque, decomposing the architecture has led to a far better understanding of the cost of individual pieces of work. Because inflation concerns were a big factor for retail in 2023, there was a bigger focus on cost reductions. 
Being able to give teams the freedom to take different approaches but making them responsible for the costs has led to better cost efficiency - teams can choose more costly methods but those decisions are more exposed. Also, because you have much finer-grained control, there are far more levers to pull when it comes to cost savings, e.g. shutting off test and dev environments at night or scaling up and down dynamically. Frederik talked about a common pattern when moving to data mesh: some teams are more data mature than others. There will be plenty of teams that need help when it comes to data mesh, especially building good data products. They are considering creating a sort of golden path or easy button approach for those teams that aren't as mature as well to make things relatively pre-configured instead of having to make many complex decisions. When driving buy-in at the wider level for data mesh, Frederik talked about pitching data mesh as an entire organization transformation versus pitching use case by use case. He believes it's probably better to focus on the use cases but it can be hard to focus on the complete picture of everything you need when you are also focused on specific use cases. It's always a balance between what is needed only for the use case and what is good for the overall company approach to data. For Frederik, there are two big company strategic priorities: personalization and omni-channel experience (experience across in-store and online). So much of what they have been focusing on is finding use cases that tie into at least one of the priorities because then there will be executive support. Constantly tying the data work back to what people care about shows an understanding of the business instead of doing data work for the sake of data work. However, these are very big challenges across many domains and teams. 
So making sure to do things in a scalable way and finding the right balance between data products while keeping interoperability high is crucial. When discussing bottlenecks, Frederik talked about how the measure for the centralized data team becoming a bottleneck was when the time between a data request and the actual delivery was expanding. The backlog was ballooning even though the data team was quite productive. Many people will feel the pain of the increasing time to delivery; leverage that while still showing a productive team. If you are executing well but aren't succeeding, you need a new strategy. Frederik talked about the fact that your data technology and architecture decisions will incentivize certain behaviors. A monolithic platform incentivizes monolithic ownership and handing off work, responsibilities, etc. When they introduced Kafka, it enabled them to push ownership upstream to data producers because the new technology allowed data producers to more easily own their data. It's of course difficult to incentivize your desired behaviors but always think about what you want to happen and try to make that the easy/happy path. When it comes to ownership of data, Frederik thinks maturity really matters. When you want to go down the path of data mesh, trying to get every domain to really be advanced with data is just not that realistic. Some teams just don't see data as their focus so if they won't leverage much data for analytical or ML/AI use cases, they are less likely to want to own their data. And less capable, quite frankly. Circling back to tangible use cases, Frederik talked about one key use case that saw a really big uptake and that they couldn't really accomplish before going the data mesh route. Being able to tie something to actual impact, whether that is a business capability or directly impacting a business metric, really helped people get more interested. 
Similarly, when trying to find new use cases, the team did a lot of user journey mapping. The data for that user journey lives in many systems so you need lots of teams participating to make the data available but it can have a big impact on business. Many companies probably can't do something that complex in their existing architecture. You can use the inability to do amazing things in your existing architecture as a potential selling point.

#283 Selling Data Mesh to Your C-Suite and Board - Mesh Musing 58 20:05

1 year ago

Quick Summary Points Talk to the business strategy importance - data is there to make things better for the business. What could being better informed mean for your execs? When people ask about the strategy, that is when you can mention data mesh. It isn't about doing data mesh but you also aren't inventing this whole-cloth. 100s to 1000s of organizations are already on the journey. But data mesh is not some magic phrase, it is merely a framing for doing data better at scale. Think of the first hidden data demon from my upcoming mini-book: this is about getting to data driven, not being data dragged. This is about better equipping the people you thought were good enough to hire for their expertise and making them even better. Think of the second hidden data demon: data isn't only about strategic decisions - this gets us into a place where we can make better day-to-day execution decisions too. We don't get to skip leg day. I originally typed 'leg data' and maybe that's what we call the foundations 😅 Please Rate and Review us on your podcast app of choice! Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#282 Not Sweating the Small Stuff in Data Mesh - Interview w/ Mandeep Kaur 1:15:54

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Mandeep's LinkedIn: https://www.linkedin.com/in/kaurmandeep80/ In this episode, Scott interviewed Mandeep Kaur, Enterprise Information Architect at Nordea Asset Management. To be clear, she was only representing her own views on the episode. Nordea has been on their data mesh journey for a while and Mandeep has been trying to figure out best practices for the hundreds - thousands - of micro decisions in a journey. So how do we get comfortable with making so many calls? Some key takeaways/thoughts from Mandeep's point of view: "1) don't overthink it; 2) bring value out as soon as possible; [and] 3) evolution before completion." The micro decisions in data mesh do matter, give them some thought. But it's important to simply get some perspective from the people who should know best and move forward. That can be from people inside or outside your organization but think about the blast radius of getting something wrong before you fix it. Most times it's smaller than you'd expect. Your first question when considering data mesh: what value am I trying to get out of it? Think about what are the target value propositions and what does it do for the business if this is successful. If you don't have good answers, should you do data mesh? 
The answers to the 'what value' question of your own mesh journey above should drive your strategy, where you should focus early and what will measure your success. And every organization will have different answers. ?Controversial?: There's a LOT of overthinking in most data mesh implementations 😅 come back to your anchoring points around ownership/accountability, product thinking, value proposition, etc. What's important? You can try something and see if it works and change it if it doesn't, don't get caught in analysis paralysis. Relatedly, always focus on the value proposition. If you are delivering value, you can improve the other aspects as you move along and learn to do aspects of your journey better. There's a major challenge in abstract communication, especially about something like data mesh: those doing the abstracting have so much more time and specific research that there will always be logic leaps and things that only map to their own mental model. Get specific in your data mesh examples and anecdotes while providing abstractions. And be prepared to dive deeper in 1:1 conversations when it makes sense. The internal communication of something like data mesh shouldn't only fall on the data team. Find/create your ambassadors/champions. They can communicate concepts and aspects well in the language of the business and have mental models closer to their peers. ?Controversial?: As the data team, our role is best served guiding people to the right ways to answer their own questions rather than answering things for them. That's part of data mesh but it's a good practice outside it too. With data mesh, you will probably feel it's about 'getting it right'. It's as much about learning how to 'get it right' as it is about actually getting it right. That's product thinking for you. When making your decisions, it's all about trade-offs. Think about what you aren't willing to trade-off. That will guide you more and more to your crucial decisions. 
Nothing is perfect and nothing is 100% sure. When making your data mesh journey plan, you must take into account your competency gaps - either by filling them before your journey or by planning how you will compensate for them and fill them as you go along. Scott note: ignore your competency gaps at your own - significant? - risk. Set milestones for your data mesh journey so you can measure your success to some degree. It also makes you feel better about the progress you've made. Celebrate the success even if there is far to go! We need to make business users understand data and technology are there to enable them to be better. Too many still see them as a threat. ?Controversial?: Something like data mesh might be seen as threatening to some data consumers by reducing their control and their value creation. Previously, many consumers were in charge of getting access to the data and doing the analysis but that is often pushed to the producer in data mesh. Scott note: this was a really interesting point that hasn't come up before. A product is only a product if it's providing value. Ingrain this into people doing data work. Really drive to a common understanding of terms like a data product or data as a product. It's very easy for many people across the organization to interpret/understand them differently, making communication extra challenging. Product thinking isn't only focusing on the end product but the supply chain and component parts. That's even more important when it relates to data; the inputs all matter, and so does the reliability/sustainability of those inputs. Transformation journeys are often quite disjointed at the start. You have to align people to start the transformation and people aren't moving together. There is a disjointed phase before the planning phase 😅 That's normal and to be expected. If you focus on the target instead of the journey to get to the target, you're as likely to make things worse.
You can't simply jump to the end target state, you have to learn how to get there and improve along the way. This is especially true with data mesh. Mandeep started by discussing one of the key challenges in talking about data mesh: there are so many areas to cover that we often discuss things abstractly. Those abstractions are based on a significant amount of research, discussions, and related work that create a mental model. When we communicate those abstractions, it's hard to communicate the mental model as well. The listeners just aren't as deep into it so much of it goes over their (our) heads. So, we need to get far more specific with anecdotes and examples. We also can't forget the value of 1:1 conversations to drive to deeper understanding. That might not always be the most scalable but it is the best way to prevent misunderstandings. Basically, communication around data mesh is hard! Go talk to people. Scott note: Data Mesh Understanding exists for this reason… When looking at how to get specific internally with data mesh communication, Mandeep is always on the lookout for her ambassadors or champions. Within a domain, they have strong domain knowledge to connect what you are trying to achieve with data mesh to what the domain is specifically focused on. And they can obviously communicate well in the language of the domain. Connecting the changes data mesh brings to real world problems helps people understand the what and the why. There is a lot of risk of analysis paralysis in any data mesh implementation according to Mandeep. There are hundreds of 'micro decisions' but if you focus on the core aspects of what you're trying to do, that should guide you to the ones that matter the most. A bit of don't sweat the small stuff. Always come back to the value proposition because you can change things as you learn more. 
That's not to say be sloppy or careless, there are important aspects like using the right architecture, having strong ownership/accountability, product thinking, etc. But data mesh is as much about learning to get it right as getting it right. And always return to your trade-offs. What aren't you willing to trade-off and why? Once you answer that, more and more solutions become tenable and you can weigh the pros and cons. Mandeep started to dig into the crucial first question for a data mesh implementation: what is the value you hope to get out of it? And there are different answers for each organization. Those answers will start to inform where you should focus and when in your mesh journey. That will help you set your plan because "a target without a plan is just a dream". And when you form your data mesh plan, think about what you have to adapt to your organization and why. This is not a copy/paste approach! You almost certainly will have competency gaps so how do you plan to fill those gaps and make progress while doing that? Or do you have to fill those gaps before starting the journey because they are journey blockers? Really consider the journey, not only the target outcome. Relatedly, set some milestones for your journey to help you measure your progress and celebrate the progress you've made. They might not be the best success measures once you're further along in your journey but that's okay, you can adjust. That's product thinking. Mandeep wanted to stress three quick points: "1) don't overthink it. 2) bring value out as soon as possible. 3) evolution before completion." Many business users still see technology as a threat rather than an enabler in Mandeep's view. They aren't even thinking about data yet, they are still on just tech 😅 So there is a lot of work in communication to get them to see data as a major innovation enabler, something to drive their part of the business to new heights.
Another interesting aspect Mandeep talked about was that self-serve might actually be seen as threatening to data consumers. Previously, they controlled - to some degree - their own ability to get access to data but now, it's on the producing team and consumers only get what producers are willing to share. The consumers created the business value by doing the analysis and transformation and now that is pushed much more onto the data producers. Will consumers feel their power and importance is diminished? If the value of data work is attributed to the producers, will data fluent consumers still lean in to leveraging data as much as they did previously? Mandeep returned to product thinking and her view that a product is only a product if it's providing value. You start from the value you are trying to deliver and work backwards. Build your KPIs around actually delivering value instead of simply creating data products with the hope they create value. When thinking about data as a product, Mandeep encourages everyone to have conversations about it in their organization and what it will actually mean and look like in their specific organization. Because it's easy to assume everyone is on the same page when they really aren't. And that confusion will bite you in the end with more friction than clearing it up early. Mandeep believes that in a transformation journey, it almost always starts as somewhat disjointed - a disruptive phase before the planning phase. Part of going on a journey is preparing for that journey. Once people are aligned, that is when you can really start all heading forward. You need pioneers or leaders, those front runners to show people it's safe. But it will still take some time before that alignment. Don't get concerned when that happens even if it feels like everyone should align after the first presentation 😅 Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. 

Weekly Episode Summaries and Programming Notes – Week of December 31, 2023 14:39

1 year ago

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#281 Panel: Data Contracts and Data Mesh - Led by Jean-Georges Perrin w/ Amy Raygada and Andrew Jones 1:06:28

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . JGP's LinkedIn: https://www.linkedin.com/in/jgperrin/ Amy's LinkedIn: https://www.linkedin.com/in/amy-raygada/ Andrew's LinkedIn: https://www.linkedin.com/in/andrewrhysjones/ Andrew's website: https://andrew-jones.com/daily/ Andrew's book: https://data-contracts.com/ Data contract standard project Bitol: https://lfaidata.foundation/projects/bitol/ JGP's blog: https://jgp.ai/ In this episode, guest host Jean-Georges Perrin, Data Innovation Consultant at ProfitOptics (guest of episode #130 and panelist in episode #227), facilitated a discussion with Amy Raygada, Senior Data Product Manager at Swiss Marketplace Group (guest of episode #165), and Andrew Jones, Principal Engineer and Author of the book on Data Contracts (guest of episode #29). As per usual, all guests were only reflecting their own views. The topic for this panel was all about data contracts and how do we go about getting them in place. Much of it was about the general concept but some of it was specifically about how do we think about data contracts applying to data mesh. This was the first topic I really did a deep dive into in early 2022 and it has evolved but is definitely still evolving. Scott note: As per usual, I share my takeaways rather than trying to reflect the nuance of the panelists' views individually. 
Scott's Top Takeaways: Data contracts are about trust and understanding. Trust that there is an owner and there are rules, there is a minder that knows this data matters. Trust that things aren't going to break - at least as often as many things in data have historically and they will be told if it breaks. And understanding that what you're getting isn't perfect and there are rules but also limitations. It's no longer buyer beware, consumers can understand what they should get. To do data products well, you almost certainly need some concept of a data contract. Otherwise, you are essentially just putting out a data asset and calling it a product. Products come with guarantees of some sort. Data contracts are about ensuring better outputs with less effort for all parties. They are a quality assurance mechanism but also a scaling mechanism. It's a printing press for data in a sense - reusability where you don't have to carve things in wood each time, you assemble the tiles to have it say what you want but it's more about arranging tiles than defining everything - carving from scratch in this analogy. Standardized aspects of contracts help both producers and consumers communicate about the aspects of a data product. Like with anything related to data mesh - or really any good data practices - you can roll out data contracts over time. It's not a switch you flip and suddenly everything is covered. Start small and find value. Start with one or two teams / data products, figure how this can work in your organization, and then scale from there. While many may see data contracts as additional overhead for data producers, it's quite often a safety mechanism for them. They (hopefully) don't want to break things for downstream consumers but they often don't know exactly how their data is used. Now we have a way for them to understand the impacts of their changes and easy mechanisms to get in touch with the users of their data. 
Far fewer emergency response tickets to data breakages. Data contracts are very useful - potentially necessary? - when we think about interoperability between data products in a larger context. The contract isn't only about what is in the specific data product but how it relates to the rest of your data products, mesh or not. If you have interoperability standards or linking keys, those are important aspects to mention in a contract. To realize the vision of data mesh, we have to be technology agnostic. There will be tons of vendors releasing their own versions and visions. But at the end of the day, to actually be able to let teams have the freedom to develop their data products to best serve users, we need approaches over tools. Scott note: If you can't tell, I am skeptical of tooling in this space… Other Important Takeaways (many touch on similar points from different aspects): Define your contracts where it's most likely to be updated. That's probably in the code for the data product, not having to go to some separate tool. Circling back to understanding, data contracts set expectations. Literally, they contain the expectations of what you should get with the data product. Expectations setting and boundaries are crucial to good human communication :) As always with data work, data contracts don't come for free. They take time for producers to engage with. Reduce the friction of dealing with contracts for producers but also incentivize them to actually leverage data contracts. Otherwise, it's just a request not a requirement. There are two different main approaches to data contracts when it comes to breaking changes - to collaborate on changes before they happen or to alert people a breaking change has occurred. It's better to be the first but you might start with only the second capability and that's okay. ?Controversial?: Using data contracts only as a blame mechanism when data breaks is missing the point. 
They can be a GREAT collaboration tool for negotiating between producers and consumers. They are a great starting point for those negotiations and then an agreement tracking and enforcement mechanism. ?Controversial?: As I've noted many times before, contracts can be a double-edged sword. If you have consumers that never meet with producers and share information, that can lead to someone leveraging data they don't fully understand. Contracts can give people trust in the data products they discover without digging deep enough. It's a very nuanced and hidden issue. Like any products practice, you will probably start out pretty raw and unsophisticated when doing data contracts. It's about getting to good, not starting there. Find value, find scale, find repeatability. Iterate to good, here's your permission to suck when you start. Data contracts can behave as great automated communication tools. Instead of trying to find all your users to update them about an upcoming change, it's automatic. Without automation, Amy said data contracts are "just a bit more paperwork." Data contract standards are important but must be extensible. Don't expect a standard to solve all your problems or fit all your needs, especially as they are just emerging. There are many choices you have to make for your organization around your data contract setup. Who owns the data contract is especially important. It should probably lie with the owner but if you don't have clear ownership of a data product/asset, then it's more likely a fact sheet about your data product/asset, not a contract. That said, consumer-driven testing is great in software, will we have some aspect of it in data? Circling back to communication, to get producers to lean into contracts, look to have real conversations with producers about the challenges the organization is having with data and things breaking. Work with them to find a better solution. They are typically software engineers, they like solving problems. 
Give them the KPIs that let them focus on data and solving things through contracts. It can't simply be more work, it needs prioritization. Data consumers need to be accountable to watching for changes to the data products they use. You need a good mechanism to alert them but if they aren't paying attention and something breaks, that's on them. Everyone has accountabilities. Should you have your security and privacy encoded in the data contract itself? I think it's early days there. It might live in the contract as the place of record for the platform or not. It's an interesting concept. ?Controversial?: Should we try to create the contract automatically and have a human change and validate or the other way around? Probably human with a template works best right now. Automation would be great but there's probably too much room for error. We want super clean implementations of something like data contracts - one standard contract across the organization. But it's just not realistic at the end of the day, especially early in your data contract journey. Every organization is messy in its own way especially with multi-cloud and many platforms, this is no different. How data quality plays into data contracts is a bit more complicated than people think. There are quality standards but checking if a data product actually complies with its SLAs and the standards is another interesting question that people are approaching differently, whether that quality enforcement is in the contract or not. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. 
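To make the panel's points about defining contracts in code and handling breaking changes a bit more concrete, here is a minimal Python sketch. It is not from the episode and does not follow Bitol or any other standard: the `ContractField` and `DataContract` classes, the `breaking_changes` helper, and the example `orders` product are all hypothetical names chosen for illustration, and the rules for what counts as "breaking" are assumptions an organization would need to decide for itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractField:
    # Hypothetical description of one column in a data product's output.
    name: str
    dtype: str
    required: bool = True


@dataclass(frozen=True)
class DataContract:
    # Hypothetical contract kept in the data product's own code repository.
    product: str
    owner: str
    version: str
    fields: tuple = ()  # tuple of ContractField


def breaking_changes(old: DataContract, new: DataContract) -> list:
    """List changes in `new` that could break consumers relying on `old`.

    Assumed rules: removing a field, changing its type, or relaxing a
    required field to optional is breaking; adding a field is not.
    """
    new_by_name = {f.name: f for f in new.fields}
    problems = []
    for f in old.fields:
        current = new_by_name.get(f.name)
        if current is None:
            problems.append(f"field removed: {f.name}")
        elif current.dtype != f.dtype:
            problems.append(f"type changed: {f.name} {f.dtype} -> {current.dtype}")
        elif f.required and not current.required:
            problems.append(f"field became optional: {f.name}")
    return problems


# Example: the producer proposes version 2.0.0 of a made-up "orders" product.
v1 = DataContract(
    product="orders",
    owner="sales-domain",
    version="1.0.0",
    fields=(ContractField("order_id", "string"), ContractField("amount", "decimal")),
)
v2 = DataContract(
    product="orders",
    owner="sales-domain",
    version="2.0.0",
    fields=(
        ContractField("order_id", "string"),
        ContractField("amount", "float"),
        ContractField("currency", "string"),
    ),
)

for problem in breaking_changes(v1, v2):
    print(problem)  # prints "type changed: amount decimal -> float"
```

Run before a change ships (for example in the producer's CI), a check like this supports the "collaborate before it happens" approach; run after the fact, the same list could feed an alert to consumers, which is the second approach the panel described.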

#280 Enabling Your Domains to Create Maintainable Data Products - Interview w/ Alexandra Diem, PhD 59:40

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Alexandra's LinkedIn: https://www.linkedin.com/in/dralexdiem/ In this episode, Scott interviewed Alexandra Diem, PhD, Head of Cloud Analytics and MLOps at Norwegian insurance company Gjensidige. Gjensidige's approach closely aligns with data mesh but they are starting with a focus on consumer-aligned data products as they have a well-functioning data warehouse and are not looking to replace what isn't broken. Some key takeaways/thoughts from Alexandra's point of view: Advice to past data mesh self: stop talking to people about data mesh, talk to the changes in the way of working. It can be very tiresome to try to explain data mesh instead of those changes. Data mesh isn't the point. There aren't really any reasons we can't apply many software engineering best practices to data, it's simply we haven't done it broadly in the data world. There is a push and pull between software best practices and data understanding. Consider which you see as more important and when. Do you bring data understanding to software engineers or software best practices to those with data understanding. When you leverage pair programming between enablement software engineers and data analysts that understand the domain, the software engineers learn more about data and the domain and the analysts learn good software engineering/product practices. 
It's a win-win. The people you enable to do work in a data mesh way should serve as ambassadors of your ways of working, especially within the domain. Both helping others learn and as champions. That provides organizational scale. You can't individually enable every person in a large company. "Too many cooks spoil the broth." Think about having that 'two pizza team' kind of approach so you have concentrated understanding by those involved in creating data products who then can again help others learn. This is good for those in the domain and also for an enablement team bringing learnings back to a platform team. Having a team with intimate knowledge of what data products/data product features have been built can speed time to market for other teams and improve reuse. Each time they sit with a team, that new team has far greater access to what's been built before, whether that is existing data sources, existing models or transformations, output ports, etc. ?Controversial?: With a central enablement team, your job is more to teach the domains how to do the work, get them to minimum viable data products. Otherwise, that central enablement just isn't scalable in a large organization. ?Controversial?: A perfectly filled data catalog still won't connect all the dots for consumers. Yes, good documentation is important but there still is a significant value in helping people connect the dots. Scott note: this shouldn't be controversial but is. It's also my 'data sherpa' pattern emerging yet again as highly valuable. If you can, make sure you have a shielding and prioritization mechanism for any central team or you can head back toward the 'overloaded central data team as a bottleneck' pattern/challenge. As anyone in the organization, your ultimate role is value generation. Consider how the data teams do that. If it's an enabling team, it's helping teams to do data work quicker and better. Those teams don't care that you're doing that via data mesh.
Relatedly, terms like data product and self-service platform resonate far more than data mesh. Lean in to what generates value, not the implementation details behind the scenes. Potentially read The Lean Startup to dig deeper into this philosophy. Data reuse is not actually that obvious of a concept to many. Probably because it's meant so much cleaning and manual work in the past from finding poorly owned data sources or processes. Train your domain teams to look for places to reuse what has come before them. !Controversial!: Potentially look to build out your data products from a source of already clean data. That may be an existing data warehouse or something centrally managed. Scott note: this is a data mesh end-state anti-pattern but is it an anti-pattern when in transition? If something isn't broken, do you need to 'fix' it? Relatedly, "[I] don't really see the point of having to destroy value before I should be able to generate new value. I can very happily just generate value on top of the value that I already have." Hypothesis testing and fast fail are great software engineering practices but in data 1) it can be hard to hypothesis test value and 2) it can be quite hard culturally to get people to learn to iterate and embrace fast fail. Alexandra started with a little about her background coming from academia into the commercial world and how that shaped her views of things. She was the first data scientist at a company so before she could really do data science, she essentially had to work as a software engineer; that meant learning many good software engineering practices. When she moved to data, she thought 'we should use these practices here too' and then she also came across Zhamak's posts on Martin Fowler's site and it all started to click. Specifically at Gjensidige, Alexandra was brought in to lead a team of software engineers acting as an enablement team plus a platform team. Their role was to focus on bringing these good software practices - e.g.
DevOps, automation, testing, etc. - to the data/business analysts to help them build data products. It has evolved to be more sophisticated but the team is still about enabling people to build data products. Gjensidige already had embedded analyst teams in many of their domains so when Alexandra and team started to roll out the data mesh implementation, there were already a number of data-capable folks who understood the actual business aspects of the domains. That meant there wasn't the typical pushback on the domain actually owning their data products, it was more about enabling them to do so and build a maintainable and scalable product. This process of pair programming between her team of software engineers and the domain data experts means her team becomes more and more data fluent while the domain learns how to write good software code. They specifically leverage a model of two of her team and two of the analysts in a data product creation team to provide enough information exchange but not too much overhead. That intimate understanding of what has been created also helps her team to help find reuse in other domains - they more deeply understand what has been built and can direct teams towards it quickly. That speeds time to market as well for the new team. Lots of wins all around! When looking at the central enablement team's strategy, Alexandra strongly believes in a minimum viable data product approach. Her team only has a handful of people and they have 25 analyst teams to work with. The team has to focus on getting each analyst team to capable via the first data product - again with only two analysts on the team - and then letting those two analysts propagate the knowledge to the rest of their own teams. Otherwise, the central team would be too overloaded. So again, the focus is on teaching the analyst teams how to build good data products and then moving on. 
Otherwise the central team just isn't scalable or you have so many people in the central team that it becomes far harder to find patterns and share information. The domains have to deliver value themselves so teaching them to do so and then moving on is a sustainable strategy. When communicating with the rest of the organization, Alexandra rarely uses the term data mesh. She points to data product and self-service platform as things that resonate with people and help communicate what she's actually focused on doing: generating value. Most people don't care that the way you are generating value is data mesh. It's simply a mechanism. 'Lean' into that. Scott note: lean is a bad pun here because she mentioned how helpful The Lean Startup is to focusing on value generation. One very interesting note Alexandra talked about was training the domains in reusing data. Historically, it's been very difficult to reuse data because you didn't have the information about how it was created and didn't really have a reliable source. Getting the spreadsheet from a colleague each month isn't that reliable 😅 so, you will likely need to train your domains on reuse, especially finding sources to reuse and how to see if something fits their purpose. That can be the producing team too, teaching them how to share what they've built to other parties that might want their data. Alexandra noted that most of the data products they are building use an existing clean and well understood data source: the cloud data warehouse. They are leveraging a hub and spoke pattern from that warehouse for their products. People already know and trust the warehouse so it made sense to them to start there. Essentially, everything ends up as consumer-aligned data products in a sense. Relatedly, for Alexandra and team, they don't see a need to adhere to every aspect of data mesh, especially at the start of their journey. 
She said, "[I] don't really see the point of having to destroy value before I should be able to generate new value. I can very happily just generate value on top of the value that I already have." They had some things that were working well already and breaking it all down to fit the paradigm didn't make sense to them. However, she is aware of the additional challenges this can bring and made the conscious trade-off. Scott note: this is an obvious data mesh anti-pattern because the upstream isn't directly from source systems - the teams building the data products don't control the source or their source-aligned data products. But if you don’t have an existing bottleneck from your cloud data warehouse, why fix something that isn't broken? This may become a bigger challenge later - Zhamak has written why not owning source data creates challenges - but if they are willing to take the tradeoff and understand those tradeoffs, is it a bad approach? I don't think so _in their case_ because the data warehouse is functioning well/isn't a bottleneck. Alexandra talked about how to really embrace a culture around 'minimum viable x' in data. In data science, at least there is a good understanding of hypothesis testing but even then, it's often hard to embrace the necessary 'fast fail' model touted by things like The Lean Startup. Trying to understand how to hypothesis test value is also difficult and people have historically seen the challenges in iterating on anything data related. So there is a learning curve but also generally a necessary cultural change to embrace hypothesis testing and fast fail around data. On advice to her past 'data mesh self', Alexandra gave a reasonably common response, circling back to an earlier point: stop talking about data mesh, at least early in the process. Data mesh is a set of guiding principles, not the answer. Talk to people about changes to their ways of working and target outcomes. Why are we taking on change? 
People hear data mesh and expect it to be some technology or technological approach. You can use the name when people ask for what you're calling the approach but selling it as doing data mesh doesn't help your business partners get it. It becomes a much more tiresome approach to specifically focus on data mesh instead of the ways things change and what matters to them. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

Weekly Episode Summaries and Programming Notes – Week of December 24, 2023 28:06

1 year ago

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#279 Driving Towards a Cohesive Developer Experience - At the Expense of Snowflake and Databricks? - Zhamak's Corner 32 18:40

1 year ago

Key Points: We need API-first technologies in data. Not just offering APIs but being able to integrate seamlessly with each other via API. We have that in software but it's been a long-time coming in data. If we want an actual modern data stack, we need to have tooling providers make a real change. Simple made easy: we need to make things simple for data product development and consumption. It's not simplistic but it removes unnecessary complexities. Overall, there is such a trend in data where people aren't building things that remove toil - there is this assumption of increasing complexity of use cases but so much of the work is not that complicated. We need to make it so most people can do most of the work relatively easily without making it overly simplistic - easier said than done of course. Sponsored by NextData , Zhamak's company that is helping ease data product creation. For more great content from Zhamak, check out her book on data mesh , a book she collaborated on , her LinkedIn , and her Twitter . Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Data Mesh Radio episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#278 Data Contracts for the Rest of Us - Approaching Contracts in Evolving Companies - Interview w/ Ryan Collingwood 1:19:26

1 year ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts here. Provided as a free resource by Data Mesh Understanding. Get in touch with Scott on LinkedIn. Transcript for this episode (link) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info-gated) here. Ryan's LinkedIn: https://www.linkedin.com/in/ryancollingwood/ In this episode, Scott interviewed Ryan Collingwood, Head of Data and Analytics at OrotonGroup. To be clear, he was only representing his own views on the episode. Some key takeaways/thoughts from Ryan's point of view: Have empathy for yourselves and others in all things you do around data. You won't always get it right the first time. Build the relationships, build the trust to continually drive and iterate towards better. In tech, far too often we hear what people need and then provide a solution that does a poor job of actually meeting those needs. It's focusing on the tech instead of the people. Far too many technical solutions/approaches - e.g. data mesh, data contracts, etc. - are really presented for tech-heavy/forward companies, e.g. startups. Most companies, large or small, are not able to leverage the approaches as presented so they must be adapted for 'the rest of us' companies. Scott note: data mesh is like this. Far too often, these tech approaches focus purely on the tech instead of the people. That's partially because every org has a different culture so you can't cover them all; but if you only follow the approach as presented instead of focusing on the people/ways of working in your org, it's far less likely to go well. 
You've implemented a great technical solution that no one wants to or can use. ?Controversial?: "What are the trade-offs that I can make, while still being true to the value and the benefits that I want to get out of this?" Scott note: SO important to consider when looking at any technical pattern/approach. What is true to the value of the approach? Data contracts really rely on 3 things: at least two parties, an agreement of some kind that is recorded, and access to data that conforms to that agreement. You can add value building beyond those 3 but you have to start somewhere and you can deliver value with something that only satisfies those 3. ?Controversial?: It's hard not to have a sense of imposter syndrome when you actually strip a concept down and implement something that doesn't look like the public examples. That's okay and to be expected. If you're delivering value reliably, you're probably doing something right 😅 The world changes all the time. Your systems will change. Your data sources will change. Your understanding of the world will change. Your processes will change. Create/use approaches that can handle change or you're just going to create more headaches down the road. With a centralized data team, the data team is often considered the data producer, at least to the consumer in a data contract. So with strong testing, the data team can be far more sure about meeting their contracts with consumers. Express data quality "in a way that engages people who are not you." Make it understandable. It doesn't have to be - and shouldn't be - rocket science. Data contracts should be the culmination of multiple conversations. Because then the producer isn't just posting data, they understand the needs of the consumer(s) and can best serve them. It can be incredibly helpful to just go talk to your business colleagues about data in their applications or that they use in the warehouse. 
They can explain the context of what is going on in the real world but you can show them how that is represented in data. Both people gain a lot of understanding. Similar to data mesh, most people in your organization won't care you are specifically doing data contracts. Talk to them about what changes and why you are doing it and how it will deliver value. Speak in the language of the business, meet them where they are. Also similar to data mesh, you don't need to convert the whole company upfront. Find an ally to test at a small scale and prove value. Use that to learn, get a champion, and show value to gain more converts. Some data isn't worth cleaning if it fails your contracts testing. Really consider what is of value and negotiate that with your consumers. They may want 6 years of data but really only 6 months or weeks is of considerable value. What is the risk, what is the fear of a piece of data being wrong? Really assess that. If it's relatively close, especially for now, will that be good enough? People need to consider that data is never 100% accurate so what is good enough and how comfortable are they with uncertainty. ?Controversial?: Relatedly, signal is what matters, not (usually) exact measurement. Get people used to finding the signal but also get them used to understanding how reliable that signal is and then acting on it to an appropriate degree. If something you are measuring, some data point, isn't going to cause action no matter the result, why measure it? Ryan started off with some framing of how he looks at tech approaches in general but especially how he started looking at data contracts. Most paradigms are presented as if every organization is very tech-y, like a tech startup. With data contracts, much of the content "…there was this assumption that you had multiple teams of people that had a fairly high degree of technical sophistication, … or maybe even data was their primary focus." 
So when a less tech-y company wants to leverage the paradigm, there are always some adjustments necessary 😅 and when it comes to those types of companies, it’s so much more about the people than the way most paradigms are presented. It makes some sense because every org's ways of working and culture are different but it still can feel very removed from reality for less tech-heavy companies. When focusing specifically on data contracts, Ryan's company is far more batch than streaming. So even when trying to leverage the best advice (Scott note: I highly recommend Andrew Jones for that), he had to adjust some aspects to a world where things were a bit more messy and with teams that aren't as data mature. When approaching how to tweak data contracts to still work, he asked the rhetorical but crucial question: "What are the trade-offs that I can make, while still being true to the value and the benefits that I want to get out of this?" Ryan moved into what he sees as the minimum viable value aspects of data contracts. You need two parties, you need an agreement of some kind that is recorded, and you need access to data that conforms to the agreement*. As to the parts of the agreement, Ryan focused on two factors at the start: semantics and data quality. If people can't understand the data, can they use it? If they don't understand the quality, can they really trust it enough to rely on it? So they worked to create a data dictionary and also provide people a better understanding of the different angles on data quality. * Scott note: this could somewhat disagree with the idea many have around data contracts of merely publishing data with SLAs because while there is a consuming party, they aren't really part of the agreement, they only choose to use the data based on the existing SLAs/contract around it. There's lots of nuance but I HIGHLY believe in the communication-heavy aspect Ryan and Andrew Jones both present. 
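To make the three minimum ingredients above a little more concrete (two parties, a recorded agreement, and data that conforms to it, starting with semantics and data quality), here is a minimal Python sketch. It is only an illustration under stated assumptions: the names `FieldSpec`, `DataContract`, and `check_conformance`, the example fields, and the completeness threshold are all hypothetical and do not come from the episode or from any particular data contract tool.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FieldSpec:
    """Semantics for one field: a data-dictionary entry plus a basic type."""
    name: str
    dtype: type
    description: str  # plain-language meaning, i.e. the data dictionary part
    required: bool = True


@dataclass(frozen=True)
class DataContract:
    """A recorded agreement between two parties about a dataset (hypothetical)."""
    producer: str
    consumer: str
    fields: Tuple[FieldSpec, ...]
    min_completeness: float = 0.95  # assumed quality threshold, for illustration


def check_conformance(contract, rows):
    """Return (share of fully conforming rows, list of readable problems)."""
    problems = []
    conforming = 0
    for i, row in enumerate(rows):
        row_ok = True
        for spec in contract.fields:
            value = row.get(spec.name)
            if value is None:
                if spec.required:
                    problems.append(f"row {i}: missing {spec.name}")
                    row_ok = False
            elif not isinstance(value, spec.dtype):
                problems.append(f"row {i}: {spec.name} is not {spec.dtype.__name__}")
                row_ok = False
        if row_ok:
            conforming += 1
    completeness = conforming / len(rows) if rows else 1.0
    return completeness, problems


# Example: a small batch checked against a hypothetical marketing-leads contract.
contract = DataContract(
    producer="data team",
    consumer="marketing",
    fields=(
        FieldSpec("lead_id", int, "Unique ID of a marketing lead"),
        FieldSpec("source", str, "Channel the lead came from"),
    ),
)
rows = [{"lead_id": 1, "source": "web"}, {"lead_id": 2, "source": None}]
completeness, problems = check_conformance(contract, rows)
meets_contract = completeness >= contract.min_completeness
```

The problems are reported as plain sentences rather than statistics, in the spirit of expressing data quality in a way a non-specialist can read. What threshold counts as "good enough" would be something the two parties negotiate, not something this sketch can decide.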
According to Ryan, comparing what was presented for a tech-heavy company with what is possible at a more typical organization can often be disheartening. The idea that the end picture at your organization should look like the one presented is pervasive. So it's not only hard to adapt the approach but then you wonder if you even captured the value 😅 Can you even call it 'data contracts' or whatever you are working on?! Imposter syndrome is very common here. Scott note: you could definitely call what Ryan and team are doing data contracts :) Ryan also talked about how in data contracts, you must build for change. Change is the only constant after all. So creating systems that don't handle change well is a great way to manufacture more headaches down the road. Much like in software testing, you can more easily tell when something no longer works and needs to be changed. And when the data team is the actual data producer - which is often the case if the data team are the ones transforming the data, or at least they are the only group of people consumers talk to with a centralized data team - they can be much more sure that what they are doing is correct. Another key learning Ryan had along the journey was that when displaying data quality, make the metrics easier for the layperson to understand. Historically, data quality has been measured with complex statistics. Most people can't easily read the charts from that to understand what's going on. Make the data quality metrics understandable so people can see progress but also get a sense of how well they can rely on data. It is a sad truth that you can deliver value but if you can't get others to see that value, it isn't valued. Showing that value gets people to lean in. Ryan dug a bit deeper into creating systems that act with empathy. If you approach data contracts as though consumers only get what the producer shares, that doesn't end up serving the end needs that well. 
But if you are treating the contracts as the culmination of multiple conversations, the producer can start to really understand the impact of bad data. How much work do data consumers have to do to actually use the data? This is where empathy and product thinking come in. "…data, as we know, it is merely a side effect of activity, of stuff happening." Ryan believes we need to move past the 1s and 0s thinking in data and focus on what it reflects and how that impacts the people in the organization. Conversations can be hard but they give you the context necessary to maximize the impact of your deep systems work. Talking with people can help both parties bridge the gap between understanding what is happening in the real world versus the data 😅 Internally in Ryan's org, they wanted to review their general processes. Part of that was the uncomfortable truth that change, especially to processes, impacts the data. So that review created a great opportunity to start to implement data contracts. It wasn't about telling people they were doing data contracts, it was about getting people bought in to what value could be delivered if they did data quality and trust better. It just happened to be via data contracts. When actually starting out, Ryan looked for one ally that was willing to take on some of the complexity of dealing with data contracts and saw the potential benefits. Instead of trying to convert the whole organization, it was contained and let Ryan learn how to implement data contracts well in his specific organization. That initial success gave him the confidence to move further and the success story to entice additional partners/allies. Ryan discussed the push and pull of data quality and value. While it might be valuable to have a long history of data, is the cleanup worth it? Really have conversations and make hard choices that align to return on investment instead of merely do consumers want it. 
Similarly, people need to confront the idea of data being right or wrong. They need to consider what is the cost of some data being wrong, especially slightly off. If that's for a regulator, potentially high. But if it's your weekly marketing leads report and it's off by 0.2%, how big of a deal is that? And how much trust is lost if it's wrong? Can we get people to understand data is never 100% clean/right? Getting people to act on signals will likely be somewhat challenging but it's a better way to navigate than trying to wait for exact measurement in many - most? - cases. Ryan wrapped up back on dealing with yourself and others with empathy. You might not get it right at first but if there's trust, you can iterate towards better together. That goes for your data, your processes, and your relationships. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

Weekly Episode Summaries and Programming Notes – Week of December 17, 2023 14:41

1 year ago

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#277 Mesh Momentum Versus Value - What to Choose When and Why - Mesh Musings 57 13:30

1 year ago

Please Rate and Review us on your podcast app of choice! Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#276 Making Self-Service Actually Work Well Safely - Interview w/ Kate Carruthers 1:02:55

2 years ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Kate's LinkedIn: https://www.linkedin.com/in/katecarruthers/ Kate's 'Data Revolution' Podcast: https://datarevolution.tech/ In this episode, Scott interviewed Kate Carruthers, Head Of Business Intelligence at the UNSW AI Institute and Chief Data & Insights Officer at UNSW (University of New South Wales). To be clear, she was only representing her own views on the episode. UNSW is not currently implementing data mesh but are preparing to be able to do so. This is a great lesson in building up the capabilities to move forward towards your goals but not rush. Some key takeaways/thoughts from Kate's point of view: Universities can teach us some really interesting perspectives on self-serve. Because universities are such complex organizations and so many departments are involved in deep investigations in very specific areas, they really are the only domain experts. So enabling them to even just own their own data can be very challenging, let alone helping them share with others safely. Relatedly, each academic researcher is essentially a micro-domain themselves with their own ways of working. That just adds to the need to enable freedom in ways of working but still "keep them safe." 
Scott note: safety was a key theme of the conversation. "At the end of the day, data mesh is about controlling the bits that you need to control, and giving people the freedom to do what they need to do, safely." "Technology is kind of the least of your problems." When it comes to data, be prepared to start with some people not even recognizing there is a problem with the current ways of working or a need to improve. Connect their pain to data immaturity to win them over. The best way to win people over is show, don't tell. Show them the power of self-service instead of pitching them on it. Get a PoC going and get people to tangibly see - and hopefully soon touch - your self-service capabilities early. Always look to anchor your data work - especially things like platform work - to a business need. How will doing the work impact the business? Why is it important to do and to do now? When tying your data work to the overall business strategy for your organization, do NOT forget the people aspect. The relationships matter. Your work on the data team definitely isn't only about technical execution. ?Controversial?: Build a culture around data that is as focused on building human relationships as it is on building data pipelines and platforms. ?Controversial?: To share personal/sensitive information - e.g. PII - a producer should justify why it's appropriate and a data controller should review that. Keep humans in the loop. Giving data owners (UNSW calls them data controllers) a say in how their data is actually used can get them more excited to share their data. It isn't a silver bullet to data sharing incentivization but it adds value to them. Good conversations about access to sensitive data shouldn't be yes/no. They are about getting to what is acceptable and maximizing value within that framing. Get people to share what they are trying to accomplish and partner to best achieve it! Invest in business analysts. 
They are your front-line to figuring out how to proceed around data and generate value. You need people who can speak business and data simultaneously to drive to great outcomes. Find ways to prevent data puddles, especially places where people are copying data and not securing it well. "People overestimate the power of making change really fast, and underestimate the power of … sustained incremental change." Give people a mental map for change. It removes the fear of the change and lets them lean in. You are creating change with and through them instead of pushing change on them. ?Controversial?: ChatGPT and other GenAI can actually be a great benefit to education. We have to lean in to it as it's not as though students won't have access to these tools in their work life. So getting them to still learn but leveraging better tools is essential to their progress. Kate started out with a bit about the catalyst for her current data journey towards data mesh. About 10 years ago, she saw that universities and especially UNSW were going to "undergo a very big digital transformation and that data would underpin it as an organization. [So] we would need to be on top of our data if we were going to be able to ride that wave." She also gave some color on what running a data office at a university entails. At UNSW, it's split into three general areas of administration, learning + teaching, and pure research. There are some major challenges when it comes to providing data capabilities - especially self-service - to the academic research arm of a university according to Kate. They all have their own ways of working and want - demand? - freedom to work the way they want. Yet, the data team's job is also to "keep them safe." That safety has many facets as well. And the research capabilities of a university can mean some truly world-changing interdisciplinary collaboration. 
But that only happens if the teams can actually, you know, collaborate 😅 When it comes to the non-research area, Kate believes data mesh is an even better fit. "At the end of the day, data mesh is about controlling the bits that you need to control, and giving people the freedom to do what they need to do, safely." As many guests have noted, Kate believes when it comes to your organization's data journey, "technology is kind of the least of your problems." It’s about people and often even getting them to recognize the problem with their ways of working and how better data maturity will help alleviate their problems. It's not just the data itself but their understanding and relationship to data. Kate and team built a quick cloud warehouse PoC that showed people the ability to onboard new data sources in weeks instead of taking up to six months. Showing them instead of simply telling them really won people over. People could connect moving to a cloud data warehouse to business benefits. They also anchored it all to business needs. Yes, rebuilding their architecture to move to the cloud was going to be work but it meant speed to new data use cases and easier management. When Kate was working to tie her team's work to the overall business strategy, she remained focused on the human relationships and people aspects of doing business. She really recommends building relationships with "customers" of your data work because then they feel comfortable to come to you with more types of problems and challenges. And sometimes that kind of culture/approach isn't for everyone and that's okay. If people aren't willing to treat customers as people, they aren't right for her team. When asked about her frequent use of the word "safe", Kate talked about keeping people from misusing data or even misusing the trust people who provided that data - e.g. the students at UNSW - gave the organization. 
Anytime someone wants to share sensitive information like PII, there is a data controller that needs to review the justification. Keeping that human in the loop means there is a real understanding and consideration of 'is this okay?' On the flip side, the team has been proactive in sharing information that someone should have access to, e.g. a professor being able to know who is in their class and being able to contact them. Kate mentioned that when they implemented the data controller review, the data producers were much happier. Previously, they had no real say in how their data was used but now, they are listened to. It also strengthened relationships because consumers had to actually collaborate with producers to get access to their data. It's creating interesting conversations and people can get more creative around data to achieve their goals with more data safety. And her investment in hiring a bunch of business analysts has created some great value leverage points. Going back to keeping people and data safe, Kate talked about their struggles with data puddles - where people are copying data into lots of areas instead of accessing the data where it is. And they aren't securing that data well, which leads to more challenges and potential issues. But it's still a process to give people all the access they need and make copying data less attractive. As in many areas, it's a work in progress 😅 Kate sees the attractiveness of moving fast but believes people need to focus more on sustained incremental change, that they overestimate the value of the former and underestimate the value of the latter. It's similar to a transformation versus a change that will revert. Fast changes are far less likely to stick or even work. And when progress is gradual and incremental, people feel less of the suddenness and fight against the change far less, if at all. Another point Kate emphasized was that people need a mental map for change. 
If they don't understand what is changing and why, they will inherently fight back, even if the change is good for them. It's simply human nature to not want change. So take away the fear of change to make it easier for people. Basically create change with and through people instead of pushing change on them whether they want it or not 😅 The conversation wrapped up around GenAI, especially because Kate is involved in the UNSW AI Institute. She is seeing the open source large-language models (LLMs) improving at a rapid pace, sometimes multiple times a day. And there is a lot of promise even if things are early days. At UNSW, they are figuring out good ways to leverage GenAI in education instead of trying to fight against it like some math teachers did against calculators. It's here to stay so they have to adapt and adopt. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

Weekly Episode Summaries and Programming Notes – Week of December 10, 2023 15:18

2 years ago

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#275 Panel: Why Data Mesh Needs Digital and Org Transformation - Led by Benny Benford w/ Nailya Sabirzyanova, Iulia Varvara, and Stefan Zima 1:05:41

2 years ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Benny's LinkedIn: https://www.linkedin.com/in/bennybenford/ Iulia's LinkedIn: https://www.linkedin.com/in/iuliavarvara/ Nailya's LinkedIn: https://www.linkedin.com/in/nailya-sabirzyanova-5b724310b/ Stefan's LinkedIn: https://www.linkedin.com/in/stefan-zima-650229b7/ In this episode, guest host Benny Benford, Founder and CEO at Datent - a data transformation focused consultancy/community - and guest of episode #244 facilitated a discussion with Iulia Varvara, Advisory Consultant in Digital and Organizational Transformation at Thoughtworks (guest of episode #268), Nailya Sabirzyanova, Digitalization Manager at DHL (guest of a soon-to-be-released episode), and Stefan Zima, Data Transformation Lead at Raiffeisen Bank International AG (guest of episode #270). As per usual, all guests were only reflecting their own views. The topic for this panel was transformation when it comes to data and data mesh in general but especially understanding how organizational transformation must play a large part in a data mesh implementation to be successful. And that transformation is not simply making changes, it is making _lasting_ changes. Organizational transformation is a crucial aspect of doing data mesh even if it's not spoken about all that often. 
Scott note: I wanted to share my takeaways rather than trying to reflect the nuance of the panelists' views individually. Scott's Top Takeaways: Transformation means changing something. We aren't starting from scratch. You have to consider the starting points, not only the target end points - and in data mesh, there isn't really an end. Every organization's transformation starting point, whether a data mesh transformation or otherwise, will be unique so adjust your transformation journey plan accordingly. There are so many reasons transformation initiatives, especially data transformations, can fail but a big one is not preparing for the long-term change necessary to make changes actually stick. It's easy to try to make changes but actually making them last for the long run is something else entirely. There needs to be a sense of urgency to drive forward a large-scale, top-down organizational transformation. If there isn't a real business reason and one where there is a need - or at least a strong desire - to be addressed in the near-term, you are far more likely to lose momentum/sponsorship. And you need lots of momentum and sponsorship for large-scale sustainable transformation. If you are trying to pitch something like data mesh, speak to real pain points. Just selling the potential benefits instead of solving real, painful existing challenges is not likely to win you as many converts. There's a reason painkillers are easier to sell than vitamins. Transformation and product thinking have a lot in common. Org transformation is treating the organization as something like a product to improve over time. That means prioritization. You can't take everything on at once. Work with your stakeholders to make progress on what matters most. Driving data transformation - data mesh or otherwise - will likely take a lot of education.
There is a general sense people should be using data for additional use cases but really, many aren't thinking of the great ways they could use data. Help them find the link between their business priorities/pains and data. Your business partners don't need to know the particulars of your transformation initiative. Sell them a story, give them an enticing vision. Why is this worth doing and what is the payoff? Stop taking them on the sausage factory tour against their will, give them - or promise them - a wonderful sausage tasting party instead. Organizational transformation - data or otherwise - only happens when things change, when they transform. Sounds obvious but you really have to get your business partners to engage or your transformation won't be as successful and is likely to stall/fail. Trying to change the entire organization from just the data team is daunting at best. Other Important Takeaways (many touch on similar points from different aspects): Transformation might not be the best word since transformation implies an end state, an end to transforming. And while you should have some kind of target future state in mind, that's likely to be a target state along the journey rather than an end state. The only constant is change as Iulia said. If you try to break your data mesh transformation down into very separate component parts and transform them separately, it's going to make it more difficult. You need to transform across technology, mindset, understanding, ownership, etc. simultaneously - pushing in the same cohesive direction. The organization has to be ready and capable for a change. Relatedly, you can't start with some massive change or only look to deliver value starting in year 3. Find ways to make progress and deliver value - especially provable/marketable value - along the way. Incremental value delivery is the key to maintaining exec attention and sponsorship, which are crucial to maintaining momentum. 
Also relatedly, this doesn't mean your progress on different aspects will all be at the same pace. Maturity, buy-in, capacity, etc. will determine how far you can transform different aspects and when. Don't try to wait for everything to move together. A data mesh transformation driven from the bottom up and mostly by the data team is possible but will likely be harder - far harder? - than top-down. You need to constantly win more support but that can also have its advantages over something with fanfare but not a lot of specific direction. Prioritization is key to doing org transformation well; so is measuring progress. If you aren't addressing the real pain points - or at least the pain points your exec sponsors care about 😅 - you will likely lose that sponsorship. Show them you are making progress against those pain points. Get your business partners to tell you about their actual pain points. Not just about data but about areas where data may be able to improve their work. They will often literally tell you how to sell doing something like data to them by making them feel seen and heard, actually creating a plan to address their specific pain. Relatedly, work with stakeholders to define success metrics around your progress. If you can continually show them incremental value and that you are addressing their needs, you are far more likely to be successful with your transformation initiative. But getting to clear metrics around the data work will be _hard_. Scott note: I'm writing a book on this for a reason 😅 ?Controversial?: If you want to 'prove' value from data work, create a way for other teams to measure the value created. A data team claiming they created value versus Finance claiming the data team created value is a world of difference when it comes to credibility. When it comes to prioritization of data transformation, should the data team really be setting the priorities? For certain aspects like the platform, probably.
But really, the business should tell you what are the highest priorities where you should focus your work. ?Controversial?: The head of the data org should be there to enable other parts of the business to derive value from data. It's about making everyone else better. Because of the central nature of many - most? - data teams in large organizations, too many people are used to them essentially being free - incremental data work doesn't typically cost the line of business or at least doesn't cost much. Transforming that mindset to get them to focus on extracting value from data work might be challenging. As constantly comes up in almost every data mesh conversation, incentivizing data producers is hard. You should try to create structural incentivization at the organizational level to push some of the value created back to the data producers. Data mesh isn't the point. It should never be the point. Zhamak has said this as well. We are looking for ways to achieve our goals and data mesh (hopefully) provides good framing to do that. At the end of the day, data work should be about impact. Focus on impact with your business partners and they will be far more likely to continue to engage. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#274 Your Data Platform is a Product, Treat it Like One! - Interview w/ Sean Gustafson 1:01:32

2 years ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Sean's LinkedIn: https://www.linkedin.com/in/seangustafson/ In this episode, Scott interviewed Sean Gustafson, Director of the Data Platform at Delivery Hero. Delivery Hero has been on the data mesh journey for longer than most organizations, at least over 3 years. Some key takeaways/thoughts from Sean's point of view: It's extremely hard but still important to try to impact your culture through things like your data platform. Who are you trying to make information available to? How do you make it accessible? How do you make data ownership easier? A key role of the data platform is that golden/easy path. Showing people easy ways to accomplish what they need with data products. Embed best practices into the platform when possible. You need a product manager in your data platform team. It's easy-ish to build cool things in data but understanding and building to user needs is harder and a must. Treat your data platform as a product! Relatedly, there isn't anything all that special about product management around the data platform. You can take what we've learned from other disciplines - especially software - and tweak it a bit for data. But it's not some arcane art. Focus on KPIs around what you are building and why, especially for your data platform. 
It's very hard to measure developer productivity but that doesn't mean you just don't measure it. ?Controversial?: Be prepared to deal with a lot of qualitative data when measuring success around your data platform. Surveys work far better than most might think. Good product managers balance the short and long-term. You don't want to make drastic and breaking changes to your data platform often but that doesn't mean you can't take bigger bets and shake things up. Just balance iterative improvements and the bigger picture. Scott note: Zhamak talks about Thomas Kuhn and cumulative progress versus paradigm shifts. In the same vein, make small bets where small bets will do but don't be afraid to make big bets when necessary. ?Controversial?: It will be hard to iteratively change a traditional centralized-focused data platform to do data mesh/decentralized ownership well. You want to at least consider a fresh start when looking at your mesh platform. Tools like dbt have given a much broader group of people the ability to model their data. There are inherent problems if they don't do it well but we still need to encourage more people to do data work so we can help them get better and produce great work. Data products are a lot like APIs. There are many best practices we can take from APIs and apply them to data products. Explaining data mesh to software engineers can be tough. They probably get the concepts given most are just software engineering concepts reconceptualized for data. But the biggest challenge is they will probably see data as a second class citizen to the underlying back-end systems. Scott note: Unfortunate but extremely common. In data incident management, e.g. data loss, you have to look at the prioritization but our general historical focus - how much money did we lose - just doesn't make sense in data. We have to take reliability engineering practices from software and tweak them to work with data but we can take incident management essentially as is.
We just have to understand prioritization far better. Sean started with a bit about how he sees his role as leading the data platform team. It is very challenging but still important in his view to try to shape culture even through the data platform. There are so many places in data mesh where there is friction, how do you make things easier as everyone transitions to product thinking and decentralized ownership? Just because you have mandates from the top, people need new ways to accomplish new goals. Make your platform reflect the type of data culture you want. Instill in people the understanding that they can and should participate in your data culture/work. Easier said than done of course. Relatedly, Sean believes the data platform should show people the right way to do things, give them that easy path where possible. But still give them the freedom to do some aspects … not so right 😅 Treat your data platform as a product is something Sean strongly believes in. And to do that, you need someone acting as a product manager. It's not rocket science, we know how product management works and it's not very different when it comes to building a data platform. But you need someone specifically focusing on user needs. And part of that role is also to advocate new features and using the platform. Just because you built it, that doesn't mean people will use it. When asked about iterating to good, Sean talked about how in product management, good practice is about making constant and small improvements but also balancing the bigger picture/big bets. It's not always about the big new platform but sometimes, it's okay to shake things up - make small bets when small bets are good enough but make big bets when necessary. But you have to do that by balancing the short-term and long-term picture. Fail fast and iterative improvements are crucial to good product thinking in software and we need to apply that to data. 
But again, big changes are okay if you properly build to them instead of trying to flip a switch. He specifically mentioned that it will be hard to iterate to a platform that does decentralized ownership well from one that was highly centralized. Not impossible but at least consider building that out more from scratch. Sean talked about Generative AI and how it's starting to change lots of people's views internally about data. While previously, many software teams were at best reluctant/hesitant to model their data, there is a big interest from the software engineers to directly interact with the large language models (LLMs). Tools like dbt previously brought many new people to the data party, making it easy to model data - at least structurally - so hopefully GenAI will mean more people learning to model their data. There are inherent challenges but the more the merrier when it comes to people working to produce good data. We just have to make sure they learn how to do it well 😅 Many who are new to data modeling do it… not so well… When it comes to product management, you need to measure how well you're doing. For Sean, that of course extends to the data platform. While KPIs can be somewhat hard around your data platform, that doesn't mean you get to slack off and not measure things. At Delivery Hero, right now they are using surveys to measure a number of things around their data platform rather than trying to measure things automatically without context. It also creates a lot of conversations in the data platform team about what are you trying to do and why, which prevents a lot of waste. It's not perfect but it's getting better. Scott Note: this is why I am writing a book on success factors then one on success metrics in data mesh 😅 this is HARD. Sean talked a bit about APIs and how much data products _should_ be treated like APIs. Not just versioning but tracking usage and having users register to use them.
There's a lot to learn from how APIs evolved so we don't have to make the same mistakes in data. Scott note: Zhamak comments on this VERY frequently that API approaches are crucial to data mesh When talking to software engineering people, Sean has found using data terminology, especially data mesh terminology, doesn't really resonate with them. We probably need to come up with new terms - or potentially use the terms Zhamak took from software and just make them about data too instead of inventing new terms. But be prepared for it all to fall back to that most software people will see the back-end systems as more important than the data. If you get them over that hump, it's far easier to get them bought in on data mesh. You may be able to win them over by showing them how the data is used internally. Incident management in data is still pretty nascent in Sean's view. While on the software engineering side, there are very well established processes, often in data it has been more slapdash at best. No escalation, no prioritization, no formal process, no post mortem + shared learning, etc. The traditional measure around data issues - how much money did we lose - often isn't applicable to data. So we have to rethink what matters and why because our prioritization is often skewed. Sean wrapped back to the start about how important culture is. Not just getting your organization to be data driven but setting up more and more people for success in your organization through their work with data. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. 

Weekly Episode Summaries and Programming Notes – Week of December 3, 2023 23:41

2 years ago

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#273 An API-First World in Data Integration - An Actual Modern Data Stack - Zhamak's Corner 31 22:52

2 years ago

Key Points: The rush to categorize all of our tooling in data has caused many issues - we will see a big shake-up coming in the future much like happened in application development tooling. So much of data people's time is spent on things that don't add value themselves, it's work that should be automated. We need to fix that so the data work is about delivering value. We can learn a lot from virtualization but data virtualization is not where things should go in general. Containerization is merely an implementation detail. Much like software developers don't really care much about process containers, the same will happen in data product containers - it's all about the experience and containers significantly improve the experience. The pendulum swung towards decoupled data tech instead of monolithic offerings with 'The Modern Data Stack' but most of the technologies were not that easy to stitch together. Going forward, we want to keep the decoupled strategy but we need a better way to integrate - APIs is how it worked in software, why not in data? Sponsored by NextData , Zhamak's company that is helping ease data product creation. For more great content from Zhamak, check out her book on data mesh , a book she collaborated on , her LinkedIn , and her Twitter . Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Data Mesh Radio episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh. 

#272 Understanding and Valuing Your Organization's Data - Interview w/ Lauren Cascio and Chris Ensey 55:46

2 years ago

Please Rate and Review us on your podcast app of choice! Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn . Transcript for this episode ( link ) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info gated) here . Contact email: Swimwith[at]gulpdata.com Lauren's LinkedIn: https://www.linkedin.com/in/laurencascio/ Chris' LinkedIn: https://www.linkedin.com/in/censey/ In this episode, Scott interviewed Lauren Cascio, Chief Fish Wrangler, and Chris Ensey, CTO at Gulp Data. From here forward in this write-up, L&C will refer to the combination of Lauren and Chris rather than trying to specifically call out who said which part. Some key takeaways/thoughts from L&C's point of view: ?Controversial?: Many organizations have an incorrect perspective that they mostly have a single type of data that's useful for each use case or need. Typically, their data is useful for many more internal use cases and also to organizations in far different industries. Often, there is a lack of a data sharing culture in many organizations. There isn't anyone that really understands how data flows throughout the organization or especially how it _could_ flow to serve many untapped use cases. There are many people emotionally attached to owning their own data but not in the product sense, they are focused on maintaining control rather than structuring it to be shared. So there are organizational challenges to data sharing in addition to technology. 
Many organizations have a tough time justifying updating their data infrastructure, leading to more and more challenges with progressing their data journey. It's often hard to point to a tangible ROI on updating the data platform for instance. Far too often, companies and LOBs know they want to analyze some information but they don't really know what they are analyzing it for. Instead of shaping data to make specific decisions, there is a focus on the visualization without a clear action in mind once the data tells them something. Drive towards what you care about and use data to answer those questions, the data doesn't speak for itself. Your upper management has limited patience and a limited attention span. Focus on what matters to them and be crisp on delivering an outcome with data, not outputs. A dashboard is just a pretty picture unless it drives action/creates insights. ?Controversial?: It's often easier to get funding to prepare your data for external sale than investing in internal use cases. The simple reason is a tangible ROI. Look to frame your internal investments in data in the same way. Find ways to open more communication about what data you have internally. You will be surprised by the number of new use cases emerging. There's so much untapped data internally that people would use if they only knew about it and could easily use it. Many organizations aren't really thinking of the value of their data and how they protect it. If the data is so valuable to your organization, what kind of investment are you making in security and compliance to protect it? !Controversial!: People's personal data getting shared, at least with some modicum of regulatory oversight, is for the greater good - e.g. more patient data to help fight disease or more financial information to help unbanked people get access to credit/capital. 
If an organization wants to understand their overall data landscape, the best way to start is simply by starting and also having an end purpose in mind. Essentially, get conversations going and know why you are trying to understand your data. Is it to unlock new use cases, save costs, sell your data, etc.? L&C started with discussing how many organizations view their internal data landscape/estate and how it's not a complete picture. There tends to be a perspective that an organization's data is only useful for their internal use cases and often that each set of data is only useful for one type of use case. And L&C just haven't seen that be true - internally, most orgs have data that could be useful to existing use cases . How that typically manifests is data silos where data that should be shared isn't because people aren't aware it exists. Or the other side is that data producers have no real idea of how their data is being used downstream by other parts of the company. Externally, most companies' data is often very useful to other organizations in entirely different sectors. When asked about why lines of business have such a hard time understanding what data other LOBs have, L&C talked a bit about the technical challenges but much more about the organizational. In many - most? - organizations, lines of business have treated their internal data as overly precious, making sure it was structured specifically so they could use it. Trying to get them to structure it so others can use it is an emotional hurdle because it can feel like giving up that control. Thus, it's been hard for other LOBs to even know about the data across the organization, let alone use it. Add to that the challenge of businesses treating the data team and especially their infrastructure as a cost center and that further impedes their data journey. When it's hard to make the tangible business case for updating your data infrastructure, it's easy to fall further behind. 
If much of this sounds familiar, that's because these are frequent hurdles to implementing data mesh. L&C talked about how typical it is for organizations to know they want to visualize their data but without a specific goal in mind. Just visualizing without an expectation of what it will be used for is not product thinking. What information do you need to make your external facing products better? What will cause you to act? Instead, the focus is too often on "what does the data tell us", which is rarely aligned with taking actual action. That leads to wasted cycles and money; it also often leads to wasted buy-in from upper management - they really only have limited patience, spend it on what matters. Be crisp on what goals you are going after then develop the data and analysis to help you actually go after those goals. It's far easier to get exec buy-in on selling your data externally than investing further for internal use in L&C's experience. That's because there is a tangible outcome at the end of the road. Look to try to shape your asks for additional funding based on that principle: a tangible ROI makes decision-making easier. For L&C, there are many use cases that could be unlocked in most organizations if only people knew what data was available. Finding ways to discover and share more about what data you have internally is very helpful. Yes, a data catalog is great but finding better ways to make people aware of the available data will unlock new valuable use cases. An audit for what data to sell externally is one way to spark these conversations but there are many others :) L&C pointed to two differing types of companies regarding selling their data. The first is low margin businesses. Because they are so reliant on volume, they end up with a considerable amount of data that they could potentially monetize. The other type of company is early stage companies that have yet to reach product market fit, especially B2B.
They often think their data will be very valuable but selling data becomes a distraction far too easily. Focus on your core business, not small external monetization streams. On the somewhat controversial topic of data monetization, how people's information is protected versus leveraged, L&C believe there is a greater good in general to your information being shared. While something like GDPR gives the perception of your data being protected, it's not really all that true - everyone's data is out there already 😅. Meanwhile, there is lots of potential good that can come out of more comprehensive data sharing, e.g. better information to fight diseases from more patient information or lower cost of items in retail stores from data generated in loyalty programs. Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

Weekly Episode Summaries and Programming Notes – Week of November 26, 2023 14:30

2 years ago

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…

#271 The Importance of Repeatability of Language to Scalability - Mesh Musings 56 11:22

2 years ago

Important points: There are places where nuance adds value. Many times, explicit definitions around data aspects like quality or even SRE metrics like uptime and query performance are not one of them. Provide a simple way for producers to apply these scalable approaches - the platform should measure data quality metrics for example. Data producers are having a hard enough time in general learning how to leverage data better. Find places to make it about learning about the information encapsulated in the data product, not learning a new set of SLAs for each data product. Consumers will thank you too since it makes their lives easier. With that, you should see more of an uptick in data usage. Please Rate and Review us on your podcast app of choice! Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here Episode list and links to all available episode transcripts here . Provided as a free resource by Data Mesh Understanding . Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm , MondayHopes , SergeQuadrado , ItsWatR , Lexin_Music , and/or nevesf…


Archived series ("Inactive feed" status)