TWIL: August 27, 2023
This week I finished a bunch of episodes from Lex Fridman’s podcast, and I recommend listening to the three episodes about Palestine and Israel, even if just to better understand the situation. I’ve also been looking into data refresh and real-time data in Power BI, architectures for LLM-based solutions, and how to manage costs when using Azure OpenAI Service. Finally, there’s a lot of news on AI, new Python support in Excel, and hyperscale elastic pools in Azure SQL. Have fun!
Episode 389: Benjamin Netanyahu: Israel, Palestine, Power, Corruption, Hate, and Peace
Benjamin Netanyahu is the Prime Minister of Israel.
Episode 390: Yuval Noah Harari: Human Nature, Intelligence, Power, and Conspiracies
Yuval Noah Harari is a historian, philosopher, and author of Sapiens, Homo Deus, 21 Lessons for the 21st Century, and Unstoppable Us.
Episode 391: Mohammed El-Kurd: Palestine
Mohammed El-Kurd is a Palestinian writer and poet.
Episode 392: Joscha Bach: Life, Intelligence, Consciousness, AI & the Future of Humans
Joscha Bach is a cognitive scientist, AI researcher, and philosopher.
Episode 393: Andrew Huberman: Relationships, Drama, Betrayal, Sex, and Love
Andrew Huberman is a neuroscientist at Stanford and host of the Huberman Lab Podcast.
Episode 469: Microsoft Fabric
Azure and Data technical specialist Ian Pike joins us on the podcast to give us a primer on Fabric and what it means for customers who use various data-related services on Azure.
Data refresh in Power BI
This article describes the data refresh features of Power BI and their dependencies at a conceptual level. It also provides best practices and tips to avoid common refresh issues. The content lays a foundation to help you understand how data refresh works. For targeted step-by-step instructions to configure data refresh, refer to the tutorials and how-to guides listed in the Next steps section at the end of this article.
Automatic page refresh in Power BI
When you monitor critical events, you want data to be refreshed as soon as the source data is updated. For example, in the manufacturing industry, you need to know when a machine is malfunctioning or is close to malfunctioning. If you’re monitoring signals like social media sentiment, you want to know about sudden changes as soon as they happen. Automatic page refresh in Power BI enables your active report page to query for new data, at a predefined cadence, for DirectQuery sources. Automatic page refresh also supports proxy models.
Incremental refresh and real-time data for datasets
Incremental refresh extends scheduled refresh operations by providing automated partition creation and management for dataset tables that frequently load new and updated data. For most datasets, one or more tables contain transaction data that changes often and can grow exponentially, like a fact table in a relational or star database schema. An incremental refresh policy to partition the table, refreshing only the most recent import partitions, and optionally using another DirectQuery partition for real-time data can significantly reduce the amount of data that has to be refreshed. At the same time, this policy ensures that the latest changes at the data source are included in the query results.
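The partitioning idea behind incremental refresh can be sketched in a few lines of Python. This is only a conceptual illustration of the policy described above (monthly partitions, a rolling refresh window) — the function names and the two-month window are my own placeholders, not Power BI’s actual engine, which works through date-range parameters in the dataset.

```python
from datetime import date

# Conceptual sketch of incremental refresh: partition a fact table by month
# and refresh only the most recent import partitions, instead of reloading
# the entire table on every scheduled refresh. Names are illustrative.

def partition_key(row_date: date) -> str:
    """Assign a row to a monthly partition, e.g. '2023-08'."""
    return f"{row_date.year:04d}-{row_date.month:02d}"

def partitions_to_refresh(all_partitions, today: date, refresh_months: int = 2):
    """Select only the partitions inside the rolling refresh window."""
    # Build the set of month keys in the window, current month backwards.
    keys = set()
    year, month = today.year, today.month
    for _ in range(refresh_months):
        keys.add(f"{year:04d}-{month:02d}")
        month -= 1
        if month == 0:
            month, year = 12, year - 1
    return sorted(p for p in all_partitions if p in keys)

parts = ["2023-05", "2023-06", "2023-07", "2023-08"]
print(partitions_to_refresh(parts, today=date(2023, 8, 27)))
# Only the two most recent monthly partitions are re-imported.
```

Historical partitions stay untouched, which is exactly why refresh time stops growing with the size of the fact table.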
Power BI Automatic aggregations
Automatic aggregations use state-of-the-art machine learning (ML) to continuously optimize DirectQuery datasets for maximum report query performance. Automatic aggregations are built on top of existing user-defined aggregations infrastructure first introduced with composite models for Power BI. Unlike user-defined aggregations, automatic aggregations don’t require extensive data modeling and query-optimization skills to configure and maintain. Automatic aggregations are both self-training and self-optimizing. They enable dataset owners of any skill level to improve query performance, providing faster report visualizations for large datasets.
Large Language Models
Emerging Architectures for LLM Applications
Large language models are a powerful new primitive for building software. But since they are so new—and behave so differently from normal computing resources—it’s not always obvious how to use them. In this post, we’re sharing a reference architecture for the emerging LLM app stack. It shows the most common systems, tools, and design patterns we’ve seen used by AI startups and sophisticated tech companies. This stack is still very early and may change substantially as the underlying technology advances, but we hope it will be a useful reference for developers working with LLMs now.
GPT-3.5 Turbo fine-tuning and API updates
Fine-tuning for GPT-3.5 Turbo is now available, with fine-tuning for GPT-4 coming this fall. This update gives developers the ability to customize models that perform better for their use cases and run these custom models at scale. Early tests have shown a fine-tuned version of GPT-3.5 Turbo can match, or even outperform, base GPT-4-level capabilities on certain narrow tasks. As with all our APIs, data sent in and out of the fine-tuning API is owned by the customer and is not used by OpenAI, or any other organization, to train other models.
Best practices for your ChatGPT ‘on your data’ solution
Today I will walk you through the architectural decisions and approaches you can take to further improve your solution to chat with your documents. RAG systems often fail because of bad retrieval, not because of the LLM. A system is only as good as the data you provide.
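The point about retrieval quality can be made concrete with a toy retriever. This sketch scores chunks by simple keyword overlap instead of embeddings — a deliberate simplification, and the documents are made up — but it shows the structural issue: the LLM only ever sees what the retriever returns, so retrieval quality caps answer quality.

```python
# Minimal sketch of the retrieval step in a RAG pipeline, using keyword
# overlap as the relevance score. Real systems use embeddings and a vector
# index, but the principle is the same: the LLM only sees the chunks the
# retriever returns.

def score(query: str, chunk: str) -> int:
    """Count how many distinct query words appear in the chunk."""
    q_words = set(query.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words)

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks ranked by overlap with the query."""
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    return ranked[:top_k]

docs = [
    "Azure OpenAI pricing is based on tokens consumed per request.",
    "Power BI supports incremental refresh for large datasets.",
    "Fine-tuning customizes a base model for a narrow task.",
]
print(retrieve("how is azure openai pricing calculated", docs, top_k=1))
```

Swapping the scoring function for embedding similarity changes the quality of the ranking, not the shape of the pipeline — which is why retrieval is where most of the tuning effort goes.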
Azure OpenAI Service
Azure Budgets and Azure OpenAI Cost Management
To give everyone a bit of context, the pricing for Azure OpenAI Service is based on a pay-as-you-go consumption model, which means you only pay for what you use. The price per unit depends on the type and size of the model you choose. It also depends on the number of tokens being used in the prompt and the response. Tokens are the units of measurement that OpenAI uses to charge for its API services. Each request to the API consumes a certain number of tokens, depending on the model, the input length, and the output length.
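The pricing model above boils down to simple arithmetic: prompt tokens and completion tokens each multiplied by a per-1,000-token rate. Here is a back-of-the-envelope estimator — note the model names and prices are placeholders I made up, not real Azure OpenAI rates, so check the current pricing page for your model and region.

```python
# Back-of-the-envelope cost estimator for a pay-as-you-go token model.
# The per-1,000-token prices are placeholders, NOT real Azure OpenAI rates.

PRICE_PER_1K = {
    # model: (prompt price, completion price) in USD per 1,000 tokens
    "hypothetical-small": (0.0015, 0.002),
    "hypothetical-large": (0.03, 0.06),
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of a single request: tokens in each direction times its rate."""
    p_rate, c_rate = PRICE_PER_1K[model]
    return (prompt_tokens / 1000) * p_rate + (completion_tokens / 1000) * c_rate

# 1,000 prompt tokens and 500 completion tokens on the small model:
print(round(request_cost("hypothetical-small", 1000, 500), 4))  # 0.0025
```

Because prompts often dominate token counts (system messages, retrieved context, chat history), trimming the prompt is usually the first lever for cost control.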
Calculating Chargebacks for Business Units/Projects Utilizing a Shared Azure OpenAI Instance
Large organizations frequently provision a singular instance of Azure OpenAI Service that is shared across multiple internal departments. This shared use necessitates an efficient mechanism for allocating costs to each business unit or consumer, based on the number of tokens consumed. This article delves into how chargeback is calculated for each business unit based on their token usage.
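The chargeback mechanism the article describes can be sketched as an aggregation over request logs: sum each business unit's tokens, then split the period's bill by token share. The log fields and figures below are illustrative, not the article's actual implementation.

```python
from collections import defaultdict

# Sketch of token-based chargeback for a shared Azure OpenAI instance:
# aggregate each business unit's token usage from request logs, then split
# the period's bill proportionally. Field names and numbers are illustrative.

def chargeback(usage_log, total_bill: float) -> dict[str, float]:
    """Allocate total_bill across business units by token share."""
    tokens_by_unit: dict[str, int] = defaultdict(int)
    for record in usage_log:
        tokens_by_unit[record["unit"]] += record["tokens"]
    total_tokens = sum(tokens_by_unit.values())
    return {
        unit: round(total_bill * tokens / total_tokens, 2)
        for unit, tokens in tokens_by_unit.items()
    }

log = [
    {"unit": "marketing", "tokens": 600_000},
    {"unit": "finance", "tokens": 300_000},
    {"unit": "hr", "tokens": 100_000},
]
print(chargeback(log, total_bill=500.0))
# {'marketing': 300.0, 'finance': 150.0, 'hr': 50.0}
```

In practice the per-request token counts would come from diagnostic logs or an API gateway in front of the shared instance, keyed by something like a subscription or app ID per business unit.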
All Microsoft Fabric icons for diagramming
Do you often have to draw up diagrams of data flows or data architectures and would you like to use Microsoft Fabric icons in your diagrams? Then this post is for you! I always found it hard to find the right icons I needed, so I’ve put them all on this single page for easy access, both as PNG and SVG.
Announcing Python in Excel: Combining the power of Python and the flexibility of Excel
Since its inception, Microsoft Excel has changed how people organize, analyze, and visualize their data, providing a basis for decision-making for the millions of people who use it each day. Today we’re announcing a significant evolution in the analytical capabilities available within Excel by releasing a Public Preview of Python in Excel. Python in Excel makes it possible to natively combine Python and Excel analytics within the same workbook – with no setup required. With Python in Excel, you can type Python directly into a cell, the Python calculations run in the Microsoft Cloud, and your results are returned to the worksheet, including plots and visualizations.
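To give a flavor of the kind of analysis this enables: inside Excel you reference worksheet data with the built-in xl() function (e.g. xl("A1:A6")) and the result spills back into the grid. Since xl() only exists inside Excel, the snippet below uses a stand-in list of made-up sales figures so it runs anywhere.

```python
import statistics

# Illustration of the kind of analysis Python in Excel enables. Inside Excel
# you would pull this data from the grid with xl("A1:A6"); here a stand-in
# list of made-up sales figures plays that role so the snippet runs anywhere.

sales = [120, 135, 128, 150, 142, 160]  # stand-in for a worksheet range

summary = {
    "mean": round(statistics.mean(sales), 2),
    "stdev": round(statistics.stdev(sales), 2),
    "max": max(sales),
}
print(summary)
```

In the real feature the calculation runs in the Microsoft Cloud rather than locally, and the returned value (including matplotlib plots) lands back in the worksheet.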
Hyperscale elastic pools overview in Azure SQL Database
This article provides an overview of Hyperscale elastic pools in Azure SQL Database. An Azure SQL Database elastic pool enables software as a service (SaaS) developers to optimize the price-performance ratio for a group of databases within a prescribed budget, while delivering performance elasticity for each database. Azure SQL Database Hyperscale elastic pools introduce a shared resource model for Hyperscale databases.
Microsoft and Epic expand AI collaboration to accelerate generative AI’s impact in healthcare, addressing the industry’s most pressing needs
Today, the promise of technology to help us solve some of the biggest challenges we face has never been more tangible, and nowhere is generative AI more needed, and possibly more impactful, than in healthcare. Epic and Microsoft have been paving the way to bring generative AI to the forefront of the healthcare industry. Together, we are working to help clinicians better serve their patients and are addressing some of the most urgent needs, from workforce burnout to staffing shortages.
New IBM study reveals how AI is changing work and what HR leaders should do about it
The rise of generative AI has surfaced many new questions about how the technology will impact the workforce. Even as AI becomes more pervasive in business, people are still a core competitive advantage. But business leaders are facing a host of talent-related challenges, as a new global study from the IBM Institute for Business Value (IBV) reveals, from the skills gap to shifting employee expectations to the need for new operating models.
Augmented work for an automated, AI-driven world
AI and automation are creating a new division of labor between humans and machines. The World Economic Forum (WEF) predicts this evolution will disrupt 85 million jobs globally between 2020 and 2025—and create 97 million new job roles. This radical shift is ushering in a new age. We call it the age of the augmented workforce—an era when human-machine partnerships boost productivity and deliver exponential business value.
Our principles for partnering with the music industry on AI technology
We’re working closely with our music partners, including Universal Music Group, to develop an AI framework to help us work toward our common goals. These three fundamental AI principles serve to enhance music’s unique creative expression while also protecting music artists and the integrity of their work.
Microsoft’s Satya Nadella is winning Big Tech’s AI war. Here’s how
Microsoft has been at the forefront of the tech world’s AI race because of the landmark partnership Nadella struck with ChatGPT creator OpenAI, which—in return for a reported $13 billion investment—gives the software giant first dibs at the startup’s current and upcoming technologies. As the results have begun showing up in new and upcoming versions of Microsoft products, from GitHub to Bing to Excel to Azure, they’ve greatly boosted the company’s standing in relation to peers such as Amazon and Google. For the first time since its 1990s heyday, the company is widely regarded as the pacemaker in technology’s next historic wave of change.
Have a wonderful week!