Topic 5: Generative AI APIs
⏱️ Estimated time: 4-5 days
Generative AI and Large Language Models (LLMs) are transforming how we build applications. In this topic, you'll learn how to integrate LLM APIs into your Python applications. These skills are essential for modern cloud engineering as AI services are becoming core components of cloud platforms.
📚 Learning Path
Understanding LLM API Basics
Before coding, understand these core concepts:
- Messages format: LLMs work with conversation-style inputs
  - System message: Sets behavior/personality
  - User message: Your prompt/question
  - Assistant message: The AI response
- Completions: The API generates text based on your input
- Parameters:
  - `temperature`: Controls randomness (0 = deterministic, 1 = creative)
  - `max_tokens`: Limits response length
  - `model`: Which LLM version to use
- Structured outputs: Getting JSON instead of free text
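To make the messages format and these parameters concrete, here is a minimal sketch using the OpenAI Python SDK (the same SDK the demos below use). The model name and the `OPENAI_API_KEY` environment variable are assumptions; GitHub Models and Azure OpenAI use different endpoints and credentials.

```python
# Minimal chat completion sketch (pip install openai). Assumes OPENAI_API_KEY
# is set in your environment; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment by default

response = client.chat.completions.create(
    model="gpt-4o-mini",   # which LLM version to use
    temperature=0.2,       # low randomness -> more deterministic answers
    max_tokens=200,        # cap the length of the response
    messages=[
        {"role": "system", "content": "You are a concise coding tutor."},  # behavior
        {"role": "user", "content": "Explain what a token is in one sentence."},  # prompt
    ],
)

print(response.choices[0].message.content)  # the assistant's reply
```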
Hands-On Learning: Python OpenAI Demos
Before setting up cloud resources, start with this free hands-on practice using GitHub Models:
Resource: Python OpenAI Demos (Video Walkthrough)
This repository teaches the OpenAI Python SDK (the same SDK used by Azure OpenAI) through progressively more complex examples. You can run it completely free using GitHub Models in GitHub Codespaces.
Action: Work through these examples in order:
- Chat Completions - Start with `chat.py`, then try `chat_stream.py` and `chat_history.py`
- Structured Outputs - Learn to get JSON responses with `structured_outputs_basic.py` (a sketch of this pattern follows this list)
- Function Calling - See how LLMs can call your code with `function_calling_basic.py`
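As a preview of the pattern `structured_outputs_basic.py` teaches, here is a minimal sketch using the SDK's Pydantic parsing helper; the model name is a placeholder, and the repo's exact code may differ:

```python
# Sketch: structured output via the OpenAI SDK's Pydantic parsing helper.
# The model name is a placeholder; check the demo repo for its exact approach.
from openai import OpenAI
from pydantic import BaseModel

class Sentiment(BaseModel):
    sentiment: str  # expected: "positive", "negative", or "neutral"
    summary: str    # short summary of the analyzed text

client = OpenAI()
completion = client.beta.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": 'Analyze: "I learned so much about Python today!"'}],
    response_format=Sentiment,  # ask for JSON matching this schema
)

result = completion.choices[0].message.parsed  # a Sentiment instance, not raw text
print(result.sentiment, "-", result.summary)
```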
Why start here?
- ✅ Free (uses GitHub Models, no credit card needed)
- ✅ Works in browser (GitHub Codespaces)
- ✅ Same SDK you'll use with Azure OpenAI
- ✅ Builds skills progressively
Video Series: Python + AI
For deeper learning, check out these videos from the Python + AI livestream series (All Resources):
| Topic | Slides | Video |
|---|---|---|
| LLMs | Slides | Watch |
| Structured Outputs | Slides | Watch |
Optional: The full series covers 9 topics including RAG, AI Agents, and more. Watch them all if you want a deep understanding of Python + AI.
Choosing Your Cloud Provider
Once you've completed the demos above, apply your skills to your cloud provider's AI service. This teaches you cloud-specific skills like IAM, resource management, and billing.
- Azure OpenAI - If you're focusing on Azure (accessed via Azure AI Foundry)
- AWS Bedrock - If you're focusing on AWS (supports Claude, Llama, and other models)
- GCP Vertex AI - If you're focusing on Google Cloud (supports Gemini and other models)
Action: Choose the provider that matches your cloud focus.
Provider Playground Practice
IMPORTANT: Test in the playground BEFORE writing code.
Azure OpenAI
- Study: Azure OpenAI Chat Completions Quickstart
- Action: Create an Azure OpenAI resource
- Action: Use the Azure AI Foundry Chat playground
AWS Bedrock
- Study: AWS Bedrock Getting Started
- Action: Enable model access for Claude or Llama models in your region
- Action: Use the AWS Bedrock Playground
GCP Vertex AI
- Study: Vertex AI Generative AI Overview
- Action: Use Vertex AI Studio
- Action: Test prompts in the Vertex AI Studio text prompt interface
Playground Exercises
In your chosen provider's playground, test these prompts:
- Simple completion:

  ```
  Analyze the sentiment of this text: "I learned so much about Python today!"
  ```

- Structured output:

  ```
  Analyze the sentiment of this journal entry and respond in JSON format with fields: sentiment (positive/negative/neutral) and summary (2 sentences max).

  Journal entry: "Today I struggled with async Python but finally got it working after 3 hours."
  ```

- System message test: Add a system message:

  ```
  System: You are a helpful learning coach who analyzes student journal entries.
  User: Analyze this entry: "I'm frustrated with databases but making progress."
  ```
Take screenshots of successful responses. You'll replicate these in code next.
Python SDK Integration
Now implement the same prompts in Python.
Azure OpenAI SDK
- Action: Azure OpenAI Python Quickstart
- Install: `pip install openai`
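As a quick orientation before the quickstart, here is one possible minimal call; the environment variable names, API version, and deployment name are placeholders you'll replace with your resource's values:

```python
# Minimal Azure OpenAI sketch. Environment variable names, API version, and
# deployment name are placeholders — use the values from your Azure resource.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # https://<resource>.openai.azure.com
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2024-06-01",  # check the docs for a current version
)

response = client.chat.completions.create(
    model="my-gpt4o-mini",  # your *deployment* name, not the base model name
    messages=[{"role": "user", "content": "Say hello from Azure OpenAI."}],
)
print(response.choices[0].message.content)
```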
AWS Bedrock SDK
- Action: AWS Bedrock Python SDK Examples
- Install: `pip install boto3`
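For orientation, here is one possible minimal call using Bedrock's Converse API; the model ID and region are placeholders, and credentials come from your usual AWS configuration (IAM role, profile, or environment variables):

```python
# Minimal Bedrock sketch via the Converse API. Model ID and region are
# placeholders; boto3 picks up credentials from your AWS config or IAM role.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # enable model access first
    messages=[{"role": "user", "content": [{"text": "Say hello from Bedrock."}]}],
    inferenceConfig={"temperature": 0.2, "maxTokens": 200},
)
print(response["output"]["message"]["content"][0]["text"])
```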
GCP Vertex AI SDK
- Action: Vertex AI Python SDK Quickstart
- Install: `pip install google-cloud-aiplatform`
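And for Vertex AI, one possible minimal call; the project ID, region, and model name are placeholders, and authentication typically uses Application Default Credentials (`gcloud auth application-default login`):

```python
# Minimal Vertex AI sketch. Project, region, and model name are placeholders;
# auth comes from Application Default Credentials.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")

model = GenerativeModel("gemini-1.5-flash")  # placeholder model name
response = model.generate_content("Say hello from Vertex AI.")
print(response.text)
```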
Key Concepts to Learn
Work through your chosen provider's Python documentation and ensure you understand:
- Authentication: API keys, service principals, or IAM roles
- Making requests: Sending messages to the LLM
- Handling responses: Parsing the completion text
- Error handling: Rate limits, timeouts, invalid requests
- Environment variables: Storing API keys securely (NEVER commit keys to git!)
- Async support: Using async/await with LLM APIs
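To tie several of these together (environment variables, error handling, rate limits), here is one possible sketch using the OpenAI SDK; the backoff values are arbitrary and the model name is a placeholder:

```python
# Sketch: key loaded from the environment, plus exponential backoff on rate
# limits. Backoff values are arbitrary; tune retries and delays for your app.
import os
import time
from openai import OpenAI, RateLimitError

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])  # never hard-code keys

def complete_with_retry(prompt: str, retries: int = 3) -> str:
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except RateLimitError:
            time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s
    raise RuntimeError("Still rate-limited after retries")

print(complete_with_retry("Give me one tip for learning Python."))
```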
Practice Exercise
Create a simple Python script `llm_test.py` that:
- Loads API credentials from environment variables
- Sends a journal entry text to your chosen LLM
- Requests sentiment analysis (positive/negative/neutral)
- Requests a 2-sentence summary
- Prints the results in a clean format
Example journal entry to test:
"Today I learned about FastAPI and built my first endpoint. The automatic documentation is amazing! I struggled a bit with async functions but the official tutorial helped. Tomorrow I'll tackle database integration."
Cost Awareness
LLM APIs are pay-per-use. Typical costs for this phase:
- ~$0.50 - $3.00 for testing and completing the capstone
- Tokens are charged for both input (prompt) and output (response)
- Longer prompts = higher cost
- Larger models (GPT-4o, Claude Sonnet) = higher cost than smaller models (GPT-4o-mini, Claude Haiku)
Tip: Use smaller, faster models for development and testing. Switch to larger models only when needed.
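As a back-of-the-envelope worked example (the per-token prices below are made-up placeholders; always check your provider's current pricing page):

```python
# Hypothetical cost estimate for one call. Prices are placeholders, not real.
input_tokens, output_tokens = 500, 200
price_per_1m_input, price_per_1m_output = 0.15, 0.60  # USD per 1M tokens (placeholder)

cost = (input_tokens / 1_000_000) * price_per_1m_input \
     + (output_tokens / 1_000_000) * price_per_1m_output
print(f"~${cost:.6f} per call")  # fractions of a cent per call at these prices
```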
🧪 Test Your Knowledge
Once you are done with the tutorials, test your knowledge with an AI assistant. Here are some example prompts:
- Can you explain what an LLM API is and how it differs from a traditional REST API?
- Can you explain the role of system messages, user messages, and assistant messages?
- Can you quiz me on what the temperature parameter controls in LLM APIs?
- Can you explain how to securely store API keys in a Python application?
- Can you ask me to explain the difference between synchronous and asynchronous LLM API calls?
- Can you quiz me on how to handle errors and rate limits when calling LLM APIs?
- Can you explain how to get structured JSON output from an LLM instead of plain text?
✅ Topic Checklist
Before moving on, make sure you have:
- Completed the Python OpenAI Demos exercises
- Tested prompts in your cloud provider's playground (Azure AI Foundry, AWS Bedrock, or Vertex AI)
- Understood the messages format (system, user, assistant)
- Practiced with structured outputs (JSON responses)
- Created a Python script that calls an LLM API
- Stored API keys securely in environment variables
- Understood cost awareness and token pricing