TWIL: April 8, 2023

AI research continues to make headlines, with new papers coming out every day. This week I’m highlighting the HuggingGPT paper and a video explaining how GPT-4 can improve itself. Also included are repos for JARVIS (the implementation described in the HuggingGPT paper) and Semantic Kernel, and an article on how to use LangChain with Azure OpenAI Service. Finally, a set of AI-focused newsletters and the awesome challenge of 30 days of Azure AI. Enjoy!

Large Language Models

HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
Solving complicated AI tasks with different domains and modalities is a key step toward advanced artificial intelligence. While there are abundant AI models available for different domains and modalities, they cannot handle complicated AI tasks. Considering large language models (LLMs) have exhibited exceptional ability in language understanding, generation, interaction, and reasoning, we advocate that LLMs could act as a controller to manage existing AI models to solve complicated AI tasks and language could be a generic interface to empower this. Based on this philosophy, we present HuggingGPT, a framework that leverages LLMs (e.g., ChatGPT) to connect various AI models in machine learning communities (e.g., Hugging Face) to solve AI tasks. Specifically, we use ChatGPT to conduct task planning when receiving a user request, select models according to their function descriptions available in Hugging Face, execute each subtask with the selected AI model, and summarize the response according to the execution results. By leveraging the strong language capability of ChatGPT and abundant AI models in Hugging Face, HuggingGPT is able to cover numerous sophisticated AI tasks in different modalities and domains and achieve impressive results in language, vision, speech, and other challenging tasks, which paves a new way towards advanced artificial intelligence.

Language serves as an interface for LLMs to connect numerous AI models for solving complicated AI tasks! We introduce a collaborative system that consists of an LLM as the controller and numerous expert models as collaborative executors (from HuggingFace Hub). The workflow of our system consists of four stages: Task Planning, Model Selection, Task Execution, and Response Generation.
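
To make the four stages concrete, here is a minimal toy sketch in Python. It is not the HuggingGPT or JARVIS code: every name in it (`Task`, `MODEL_HUB`, `plan_tasks`, `select_model`, `execute_tasks`, `generate_response`, `run_pipeline`) is hypothetical, the "models" are plain functions, and a keyword rule stands in for the LLM controller, which in the real system is prompted to do the planning, selection, and summarizing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Everything here is a hypothetical stand-in, not the HuggingGPT/JARVIS API.
Model = Callable[[str], str]


@dataclass
class Task:
    kind: str                # e.g. "image-captioning"
    payload: Optional[str]   # None means "use the previous task's output"


# Toy "model hub": task kind -> candidate (model name, callable) pairs.
MODEL_HUB: Dict[str, List[Tuple[str, Model]]] = {
    "image-captioning": [("toy-captioner", lambda image: f"a caption for {image}")],
    "text-to-speech": [("toy-tts", lambda text: f"audio of '{text}'")],
}


def plan_tasks(request: str) -> List[Task]:
    """Stage 1, Task Planning: a keyword rule stands in for the LLM."""
    words = request.lower().split()
    tasks: List[Task] = []
    if "describe" in words:
        tasks.append(Task("image-captioning", "photo.jpg"))
    if "read" in words:
        tasks.append(Task("text-to-speech", None))
    return tasks


def select_model(task: Task) -> Tuple[str, Model]:
    """Stage 2, Model Selection: take the first candidate for the task kind."""
    return MODEL_HUB[task.kind][0]


def execute_tasks(tasks: List[Task]) -> List[Tuple[str, str, str]]:
    """Stage 3, Task Execution: run tasks in order, chaining outputs."""
    results: List[Tuple[str, str, str]] = []
    previous = ""
    for task in tasks:
        name, model = select_model(task)
        data = task.payload if task.payload is not None else previous
        previous = model(data)
        results.append((task.kind, name, previous))
    return results


def generate_response(request: str, results: List[Tuple[str, str, str]]) -> str:
    """Stage 4, Response Generation: summarize what each model returned."""
    lines = [f"Request: {request}"]
    for kind, name, output in results:
        lines.append(f"- {kind} via {name}: {output}")
    return "\n".join(lines)


def run_pipeline(request: str) -> str:
    return generate_response(request, execute_tasks(plan_tasks(request)))


if __name__ == "__main__":
    print(run_pipeline("Describe photo.jpg and read it aloud"))
```

The point of the sketch is the shape of the loop: the controller only routes text between stages, so swapping a toy function for a real hosted model would not change the control flow.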

GPT 4 Can Improve Itself
GPT-4 can self-correct and improve itself. With exclusive discussions with the lead author of the Reflexion paper, I show how significant this will be across a variety of tasks, and how you can benefit. I go on to lay out an accelerating trend of self-improvement and tool use, as described by Karpathy, and cover papers such as DERA, Language Models Can Solve Computer Tasks, and TaskMatrix, all released in the last few days. I also showcase HuggingGPT, a model that harnesses Hugging Face and which I argue could be as significant a breakthrough as Reflexion. I show examples of multi-model use, and even how it might soon be applied to text-to-video and CGI editing (guest-starring Wonder Studio). I discuss how language models are now generating their own data and feedback, needing far fewer human expert demonstrations. Ilya Sutskever weighs in, and I end by discussing how AI is even improving its own hardware and facilitating commercial pressure that has driven Google to upgrade Bard using PaLM.

Unboxing Google Bard and GPT-4
Hi! I’m Cassie Kozyrkov and today I’m going to show you GPT-4 via ChatGPT and LaMDA via Google Bard. During this interface demo, the right half of the screen contains the paid version of ChatGPT with GPT-4 and the left half of the screen shows last Tuesday’s release of Google Bard which is powered by the LaMDA model.

Semantic Kernel
Semantic Kernel (SK) is a lightweight SDK enabling integration of AI Large Language Models (LLMs) with conventional programming languages. The SK extensible programming model combines natural language semantic functions, traditional code native functions, and embeddings-based memory, unlocking new potential and adding value to applications with AI.

Using LangChain with Azure OpenAI Service
In this post we briefly discuss how LangChain can be used with Azure OpenAI Service. LangChain is a powerful tool for building applications on top of language models, from personal assistants to question answering and chatbots. Its modules provide support for different model types, prompt management, memory, indexes, chains, and agents, making it easy to customize and create unique language model applications.

The GPT-x Revolution in Medicine
A new book by Peter Lee, Carey Goldberg, and Isaac Kohane will be released as an e-book April 15th and as a paperback May 3rd, and I’ve had the chance to read it. With my keen interest in how AI can transform medicine (as written about in Deep Medicine and multiple recent review papers), I couldn’t put it down. It’s outstanding for a number of reasons that I’ll elaborate on.

AI-Focused Newsletters

Learn how to leverage AI to boost your productivity and accelerate your career

The Rundown
Get the rundown on the latest developments in AI before everyone else.

Ben’s Bites
Your daily dose of what’s going on in AI. In 5 minutes or less, with a touch of humour.

Cloud Architecture

Scaling Kubernetes to 7,500 nodes
We’ve scaled Kubernetes clusters to 7,500 nodes, producing a scalable infrastructure for large models like GPT-3, CLIP, and DALL·E, but also for rapid small-scale iterative research such as Scaling Laws for Neural Language Models.

Scaling Kubernetes to 2,500 nodes
We’ve been running Kubernetes for deep learning research for over two years. While our largest-scale workloads manage bare cloud VMs directly, Kubernetes provides a fast iteration cycle, reasonable scalability, and a lack of boilerplate which makes it ideal for most of our experiments. We now operate several Kubernetes clusters (some in the cloud and some on physical hardware), the largest of which we’ve pushed to over 2,500 nodes. This cluster runs in Azure on a combination of D15v2 and NC24 VMs.

Interesting Stuff

The AI Show
The AI Show Live showcases the amazing work happening in AI at Microsoft. Developers learn what’s new in AI in a short amount of time and are directed to assets helping them get started and on the road to success right away. Seth Juarez and friends work on cool projects and highlight what’s new in Azure AI and Machine Learning. Tune in every other Friday at 11 AM Pacific.

30 days of Azure AI
Azure AI #30DaysOfAzureAI is a series of daily posts throughout April. Hear from our experts in the product teams, cloud advocacy, community and follow along at your own pace! Where relevant, the daily posts have accompanying Open Source repositories, code samples, and other resources.

Have an awesome week!