TWIL: September 25, 2022

This week I dove a little into Azure Synapse Analytics and found a few interesting articles on data architecture, plus a podcast comparing deep learning and the human brain. I hope you find them useful.


Podcasts

Towards Data Science

Episode 126: Does the brain run on deep learning?
Deep learning models, transformers in particular, are defining the cutting edge of AI today. They’re based on an architecture called the artificial neural network which, as the name suggests, was inspired by the structure and function of biological neural networks, like those that handle information processing in our brains. So it’s a natural question to ask: how far does that analogy go? Today, deep neural networks can master an increasingly wide range of skills that were historically unique to humans: creating images, using language, planning, playing video games, and so on. Could that mean that these systems are processing information like the human brain, too?


Azure Synapse Analytics

Source control in Synapse Studio
By default, Synapse Studio authors directly against the Synapse service. If you need to collaborate using Git for source control, Synapse Studio allows you to associate your workspace with a Git repository, Azure DevOps, or GitHub. This article outlines how to configure and work in a Synapse workspace with a Git repository enabled. It also highlights some best practices and includes a troubleshooting guide.

Continuous integration and delivery for an Azure Synapse Analytics workspace
In an Azure Synapse Analytics workspace, CI/CD moves all entities from one environment (development, test, production) to another environment. Promoting your workspace to another workspace is a two-part process. First, use an Azure Resource Manager template (ARM template) to create or update workspace resources (pools and workspace). Then, migrate artifacts like SQL scripts and notebooks, Spark job definitions, pipelines, datasets, and other artifacts by using Synapse Workspace Deployment tools in Azure DevOps or on GitHub.
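To make the two-part promotion concrete, here is a minimal sketch in Python. It only models the ordering described above and does not call any Azure API. The function name `plan_promotion`, the kind labels, and the entity names are all hypothetical, made up for illustration.

```python
from typing import Dict, List, Tuple

# Assumption based on the summary above: pools and the workspace itself are
# "workspace resources" (stage 1, ARM template); everything else is an
# artifact (stage 2, Synapse Workspace Deployment tools).
RESOURCE_KINDS = {"workspace", "pool"}


def plan_promotion(entities: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Split entity names into (stage 1 resources, stage 2 artifacts)."""
    resources = sorted(n for n, kind in entities.items() if kind in RESOURCE_KINDS)
    artifacts = sorted(n for n, kind in entities.items() if kind not in RESOURCE_KINDS)
    return resources, artifacts


# Hypothetical contents of a development workspace (names are made up).
dev_entities = {
    "synapse-dev": "workspace",
    "sparkpool1": "pool",
    "load_sales": "pipeline",
    "clean_sales": "notebook",
    "daily_report": "sql_script",
}

stage_one, stage_two = plan_promotion(dev_entities)
print("Stage 1 (ARM template):", stage_one)
print("Stage 2 (artifact deployment):", stage_two)
```

The point of the split is only that resources have to exist in the target environment before the artifacts that depend on them are migrated.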

How to use CI/CD integration to automate the deploy of a Synapse Workspace to multiple environments
You’re about to kick off another exciting project. This time, you and your team will be working for the first time on a cloud-based data solution using Azure Synapse Analytics. You want to benefit from your team’s past experience deploying data solutions on-premises: using DevOps as the preferred software development methodology, and using three distinct environments. This article demonstrates how you can use Azure Synapse Analytics integrated with an Azure DevOps Git repository to achieve these goals.

Analytics end-to-end with Azure Synapse
This example scenario demonstrates how to use Azure Synapse Analytics with the extensive family of Azure Data Services to build a modern data platform that’s capable of handling the most common data challenges in an organization. The solution described in this article combines a range of Azure services that will ingest, store, process, enrich, and serve data and insights from different sources (structured, semi-structured, unstructured, and streaming).


Data Science

Let’s Move Fast And Get Rid Of Data Engineers
We live in a data world that wants to move fast. Just shove your data into a data lake and we’ll figure it out later. Create this table for this one dashboard that we will only look at once, and then it will join the hundreds (if not thousands) of other dashboards that are ignored. With all this pressure to move fast coming from all sides, an interesting solution I have come across a few times is… let’s just get rid of data engineering and governance.


Data Architecture

μ Architecture
μ Architecture is a recursively repeating pattern of self-similar data micro processors, each consisting of a triplet: source connect, processor, and sink connect. We will call these data μ-products. These μ-products then form a mesh. This pattern allows us to spread the complexity of processing across the mesh through distribution and scalability.
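As a rough illustration of the triplet idea, here is a minimal sketch in Python. It is not the article's implementation: the `MicroProduct` class, the `run_mesh` helper, and the in-memory lists standing in for connectors are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List


@dataclass
class MicroProduct:
    """One hypothetical data μ-product: source connect, processor, sink connect."""
    source: Callable[[], Iterable[Any]]   # source connect: yields input records
    processor: Callable[[Any], Any]       # processor: transforms one record
    sink: Callable[[Any], None]           # sink connect: writes one output record

    def run(self) -> None:
        for record in self.source():
            self.sink(self.processor(record))


def run_mesh(products: List[MicroProduct]) -> None:
    """Run products in order; a real mesh would be distributed, not sequential."""
    for product in products:
        product.run()


# In-memory lists stand in for real storage or streams (an assumption).
raw = [1, 2, 3]
doubled: List[int] = []
result: List[int] = []

# The sink of the first μ-product is the source of the second, which is
# how individual μ-products chain into a (tiny) mesh.
double = MicroProduct(source=lambda: raw, processor=lambda x: x * 2, sink=doubled.append)
increment = MicroProduct(source=lambda: doubled, processor=lambda x: x + 1, sink=result.append)

run_mesh([double, increment])
print(result)
```

Because every μ-product has the same shape, one's sink can become another's source, which is what makes the pattern self-similar.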

Datalake, Datawarehouse and Datamart
In this article we discuss the things we generally build on big data platforms to store, process, and analyse data: the data lake, the data warehouse, and the data mart. Though they appear similar and the terms are often used interchangeably, they are built for different purposes and have a lot of differences between them.


Have an awesome week!

Photo by Milad Fakurian on Unsplash