Large Language Model Agents

MOOC, Fall 2024

Announcement:

Sign up and learn more about the LLM Agents Hackathon here!

Prospective Students

  • This course has concluded. Video lectures can still be found in the syllabus below. Please sign up for the Spring 2025 iteration today!
  • All certificates have been released! Thank you for a great semester.

Course Staff

Instructor: Dawn Song, Professor, UC Berkeley

Co-instructor: Xinyun Chen, Research Scientist, Google DeepMind

Guest Speakers

Denny Zhou

Shunyu Yao

Chi Wang

Jerry Liu

Burak Gokturk

Omar Khattab

Graham Neubig

Nicolas Chapados

Yuandong Tian

Jim Fan

Percy Liang

Ben Mann

Course Description

Large language models (LLMs) have revolutionized a wide range of domains. In particular, LLMs have been developed as agents to interact with the world and handle various tasks. With the continuous advancement of LLM techniques, LLM agents are set to be the upcoming breakthrough in AI, and they are going to transform the future of our daily life with the support of intelligent task automation and personalization. In this course, we will first discuss fundamental concepts that are essential for LLM agents, including the foundation of LLMs, essential LLM abilities required for task automation, as well as infrastructures for agent development. We will also cover representative agent applications, including code generation, robotics, web automation, medical applications, and scientific discovery. Meanwhile, we will discuss limitations and potential risks of current LLM agents, and share insights into directions for further improvement. Specifically, this course will include the following topics:

  • Foundation of LLMs
  • Reasoning
  • Planning, tool use
  • LLM agent infrastructure
  • Retrieval-augmented generation
  • Code generation, data science
  • Multimodal agents, robotics
  • Evaluation and benchmarking on agent applications
  • Privacy, safety and ethics
  • Human-agent interaction, personalization, alignment
  • Multi-agent collaboration

Syllabus

Date

Guest Lecture
(3:00PM-5:00PM PST)

Supplemental Readings

Sept 9

LLM Reasoning
Denny Zhou, Google DeepMind
Livestream Intro Slides Quiz 1

- Chain-of-Thought Reasoning Without Prompting
- Large Language Models Cannot Self-Correct Reasoning Yet
- Premise Order Matters in Reasoning with Large Language Models
- Chain-of-Thought Empowers Transformers to Solve Inherently Serial Problems

Sept 16

LLM agents: brief history and overview
Shunyu Yao, OpenAI
Livestream Slides Quiz 2

- WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
- ReAct: Synergizing Reasoning and Acting in Language Models

Sept 23

Agentic AI Frameworks & AutoGen
Chi Wang, AutoGen-AI
Building a Multimodal Knowledge Assistant
Jerry Liu, LlamaIndex
Livestream Chi’s Slides Jerry’s Slides Quiz 3

- AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
- StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows

Sept 30

Enterprise trends for generative AI, and key components of building successful agents/applications
Burak Gokturk, Google
Livestream Slides Quiz 4

- Google Cloud expands grounding capabilities on Vertex AI
- The Needle In a Haystack Test: Evaluating the performance of RAG systems
- The AI detective: The Needle in a Haystack test and how Gemini 1.5 Pro solves it

Oct 7

Compound AI Systems & the DSPy Framework
Omar Khattab, Databricks
Livestream Slides Quiz 5

- Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs
- Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together

Oct 14

Agents for Software Development
Graham Neubig, Carnegie Mellon University
Livestream Slides Quiz 6

- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
- OpenHands: An Open Platform for AI Software Developers as Generalist Agents

Oct 21

AI Agents for Enterprise Workflows
Nicolas Chapados, ServiceNow
Livestream Slides Quiz 7

- WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?
- WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks
- TapeAgents: a Holistic Framework for Agent Development and Optimization

Oct 28

Towards a unified framework of Neural and Symbolic Decision Making
Yuandong Tian, Meta AI (FAIR)
Livestream Slides Quiz 8

- Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
- Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
- Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets
- SurCo: Learning Linear Surrogates For Combinatorial Nonlinear Optimization Problems

Nov 4

Project GR00T: A Blueprint for Generalist Robotics
Jim Fan, NVIDIA
Livestream Slides Quiz 9

- Voyager: An Open-Ended Embodied Agent with Large Language Models
- Eureka: Human-Level Reward Design via Coding Large Language Models
- DrEureka: Language Model Guided Sim-To-Real Transfer

Nov 11

No Class - Veterans Day

Nov 18

Open-Source and Science in the Era of Foundation Models
Percy Liang, Stanford University
Livestream Slides Quiz 10

- Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models

Nov 25

Measuring Agent capabilities and Anthropic’s RSP
Ben Mann, Anthropic
Livestream Slides Quiz 11

- Announcing our updated Responsible Scaling Policy
- Developing a computer use model

Dec 2

Towards Building Safe & Trustworthy AI Agents and A Path for Science‑ and Evidence‑based AI Policy
Dawn Song, UC Berkeley
Livestream Slides Quiz 12

- A Path for Science‑ and Evidence‑based AI Policy
- DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
- Representation Engineering: A Top-Down Approach to AI Transparency
- Extracting Training Data from Large Language Models
- The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks

Completion Certificate

LLM Agent course completion certificates will be awarded to students according to the tier rules below. All assignments are due December 12th, 2024 at 11:59PM PST. To receive your certificate, please complete the Certificate Declaration Form by December 17th, 2024 at 11:59PM PST.

Trailblazer Tier:

  • Complete all 12 quizzes associated with each lecture
  • Pass the written article assignment

Mastery Tier:

  • Complete all 12 quizzes associated with each lecture
  • Pass the written article assignment
  • Pass all 3 lab assignments

Ninja Tier:

  • Complete all 12 quizzes associated with each lecture
  • Pass the written article assignment
  • Submit a project to the LLM Agents Hackathon

Legendary Tier:

  • Complete all 12 quizzes associated with each lecture
  • Pass the written article assignment
  • Become a prize winner or finalist at the LLM Agents Hackathon

Honorary Tier:

  • For the most helpful/supportive students in Discord!
  • Meets coursework requirements of Ninja OR Mastery Tier

NOTE: completing the assignments associated with this course in order to earn a Completion Certificate is completely optional. You are more than welcome to just watch the lectures and audit the course!
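The tier rules above can be read as a small set of checks. The following is a minimal sketch in Python, not an official grading tool: the `Progress` record, its field names, and the `eligible_tiers` function are all hypothetical, and the Honorary Tier's "most helpful/supportive" criterion is a staff decision, so it is modeled here only as a flag.

```python
from dataclasses import dataclass


@dataclass
class Progress:
    """Hypothetical record of one student's coursework (illustrative field names)."""
    quizzes_completed: int = 0                   # out of 12 quizzes
    article_passed: bool = False                 # written article graded P/NP
    labs_passed: int = 0                         # out of 3 lab assignments
    hackathon_submitted: bool = False
    hackathon_finalist_or_winner: bool = False
    recognized_on_discord: bool = False          # assumption: decided by course staff


def eligible_tiers(p: Progress) -> list[str]:
    """Return every tier whose listed requirements are met."""
    # Every tier except Honorary lists these two requirements explicitly.
    base = p.quizzes_completed == 12 and p.article_passed
    mastery = base and p.labs_passed == 3
    ninja = base and p.hackathon_submitted

    tiers = []
    if base:
        tiers.append("Trailblazer")
    if mastery:
        tiers.append("Mastery")
    if ninja:
        tiers.append("Ninja")
    if base and p.hackathon_finalist_or_winner:
        tiers.append("Legendary")
    # Honorary requires the coursework of Ninja OR Mastery plus staff recognition.
    if p.recognized_on_discord and (mastery or ninja):
        tiers.append("Honorary")
    return tiers


student = Progress(quizzes_completed=12, article_passed=True, labs_passed=3)
print(eligible_tiers(student))  # ['Trailblazer', 'Mastery']
```

The sketch returns all tiers a student qualifies for rather than a single one, because the rules above do not state how overlapping tiers are resolved.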

Coursework

All coursework will be released and submitted through the course website.

Quizzes

All quizzes are released in parallel with (or shortly after) the corresponding lecture. Please remember to complete the quiz each week. Although it’s graded on completion, we encourage you to do your best. The questions are all multiple-choice and there are usually at most 5 per quiz. The quizzes will be posted in the Syllabus section.

An archive of all of the quizzes can be found here.

Written Article

Create a Twitter post, LinkedIn post, or Medium article of roughly 500 words, and share it on Twitter. Include the link to our MOOC website in the article and tweet.

  • Students in the Trailblazer or Mastery Tier should either summarize information from one or more of the lectures or write a postmortem on their learning experience during our MOOC
  • Students in the Ninja or Legendary Tier should write about their hackathon submission

The written article is an effort-based assignment that will be graded as pass or no pass (P/NP). Submit your written article assignment HERE.

Labs

There will be 3 lab assignments to give students some hands-on experience with building agents. Students must pass all 3 lab assignments. All labs are due December 12th, 2024 at 11:59PM PST. Please read the instructions carefully here. Please read the FAQs here before asking questions in Discord.

Assignment | Submission Form
Lab 1 | Submission 1
Lab 2 | Submission 2
Lab 3 | Submission 3

Hackathon

Check out our hackathon website here. Sign up for the hackathon here; every member of the team should sign up individually. Then, complete the team creation form here. There is no limit on team size.

For any questions, please visit our Hackathon FAQ here. You can also ask questions and find potential team members in our LLM Agents Discord.

Submit your final hackathon project here before December 17th, 2024 at 11:59PM PST.
