TWIL: November 13, 2022
This week I’m recommending two awesome podcasts by Scott Hanselman with two very interesting guests. Also, a set of articles on data architecture topics such as Data Mesh, event-driven data architecture, Azure Data Explorer and modern data warehouses with Snowflake versus Databricks. Finally, take a look at Microsoft’s Responsible AI toolbox. Have fun!
Episode 864: The Work Ahead with Blind Engineer Sameer Doshi
When engineer and author Sameer Doshi interviewed remotely for his current job at Microsoft, he was nervous about only one thing: Telling his future employer that he’s blind. He was offered the position, and even after starting he didn’t mention that he’s blind until he needed to put in a request for the special software that helps him do his job. Four years later, he has moved up the ranks to a management position and is thriving. He talks to Scott about tech accommodation, and his new sci-fi book The Work Ahead!
Episode 865: What If? 2 with Randall Munroe
In association with Outside In, we are thrilled to share this conversation with Randall Munroe and Scott Hanselman. The #1 New York Times bestselling author of What If? and How To answers more of the weirdest questions you never thought to ask!
Episode 1818: Making Open Source Work for Everyone with David Whitney
How do we make open source work for everyone? While at NDC in Oslo, Carl and Richard talked to David Whitney about his experiences working on open-source projects, and the challenges of making them sustainable. David talks about how many projects start with an individual making something for themselves, which then evolves into many people utilizing the project, but not contributing to it. And when companies depend on that software, the pressure on the creators gets serious – but without compensation. How do we make open source better? And how do the tech giants make the situation better or worse?
The Azure Podcast
Episode 445: SAP for Azure Landing Zone Accelerator
Cynthia, Sujit and Russell discuss the SAP for Azure Landing Zone Accelerator with Pankaj Meshram from the Microsoft Product team, and Matt Ely from Microsoft’s Global Partner Solutions CSA team.
The Principles of the Data Mesh Architecture
Zhamak Dehghani first described the current incarnation of the Data Mesh concept as a set of four principles: domain ownership, domain data as a product, federated computational governance, and self-serve data platform. From our perspective, however, the key to the success of Data Mesh implementation lies in understanding that it is a socio-technical architecture, not a technical solution. In this article, we will present Dehghani’s 4 principles and explain the socio-technological aspects that are key to understanding the Data Mesh architecture.
Snowflake or Databricks
Should I go with Snowflake or Databricks? I hear this almost every day. It is a very important question. It is like the truth between any two foes: There is one version of the truth from one perspective, there is another version from the opposing perspective; then there is the truth. Let’s dig in and decompose each organization and I will finish by giving my version of the truth.
Event-Driven Data Architecture
Having worked with data for the last 8 years, I believe I have seen some structural changes in the data landscape, not only in architecture (which nowadays is mostly cloud-based) but also in the roles of the personas involved in teams and implementations. Looking at the actual implementation in terms of architecture, we can take several approaches, but more and more we see event-driven architectures arise! These architectures are based on microservices and API-driven implementations, which suit real-time or near-real-time ingestion and processing.
At the core, Apache Arrow is a standardized, language-independent in-memory columnar format specification to represent tabular datasets in memory. It also includes libraries that expose Arrow representation as APIs.
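To make "columnar" a bit more concrete, here is a minimal sketch in plain Python that contrasts a row-oriented layout with a column-oriented one. It does not use the Arrow libraries or follow the Arrow specification; the sample data and the `to_columns` helper are made up purely for illustration.

```python
from array import array

# Row-oriented layout: one record per row, fields interleaved.
# (Hypothetical sample data, not taken from any of the linked articles.)
rows = [
    {"city": "Oslo", "temp_c": 4.5},
    {"city": "Lisbon", "temp_c": 17.0},
    {"city": "Cairo", "temp_c": 26.5},
]


def to_columns(records):
    """Pivot a list of row dicts into one sequence per column."""
    return {
        "city": [r["city"] for r in records],
        # array("d") stores the doubles back to back in one typed buffer,
        # which is the general idea behind a columnar in-memory format.
        "temp_c": array("d", (r["temp_c"] for r in records)),
    }


columns = to_columns(rows)

# An aggregate over one column only touches that column's buffer.
mean_temp = sum(columns["temp_c"]) / len(columns["temp_c"])
print(mean_temp)
```

Analytical queries usually scan a few columns across many rows, so keeping each column contiguous tends to suit them better than a record-by-record layout.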
Azure Data Explorer
Five Reasons to Dive into Azure Data Explorer
This e-book focuses on the five reasons that make ADX a good fit for businesses: Scale from gigabytes to petabytes, Interactive queries on billions of records in seconds, Ingest data and query it within seconds, A query language specialized in time series and telemetry, and Efficiently query semi-structured and unstructured data.
Responsible AI Toolbox
Responsible AI is an approach to assessing, developing, and deploying AI systems in a safe, trustworthy, and ethical manner, and to taking responsible decisions and actions. Responsible AI Toolbox is a suite of tools providing a collection of model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.
Have a great week!
Photo by Aaron Burden on Unsplash