The Practical .NET Guide to AI & LLM: Introduction

The era of AI-enhanced applications has arrived. .NET developers, particularly those proficient in C#, are in a prime position to leverage its potential. Large Language Models (LLMs) are no longer mysterious black boxes reserved for data scientists. They have become practical instruments that are accessible and valuable for those developing critical business applications on Microsoft’s robust platform.

In this series of articles, I’ll provide you with clear explanations, up-to-date diagrams, and real-world patterns for integrating LLMs with .NET. We’ll dive into model-agnostic approaches, especially focusing on Microsoft.Extensions.AI. Together, we’ll tackle the what and why of LLM integration, as well as the how. I’ll share insights on common pitfalls, security considerations, and detailed technical guidance tailored specifically for you, the modern .NET engineer. Let’s get started!

Fundamentals of Large Language Models

What Is an LLM?

Large Language Models (LLMs) are deep learning models designed to understand, interpret, and generate human-like language. Typical examples include OpenAI’s GPT series, Google’s Gemini, and open-source models like Meta’s Llama or Mistral’s Mixtral. At their core, these models use the Transformer architecture, introduced in 2017, known for its scalability and attention-based learning that enables efficient processing of long textual contexts.

Key architectural components of LLMs include:

Tokenization: Splits text into smaller units called tokens (words, subwords, or characters).
Embedding Layers: Translate each token into a dense vector that captures semantic meaning.
Positional Embeddings: Add information about token order, addressing the transformer’s lack of inherent sequence awareness.
Stacked Transformer Layers: Multiple layers, each with multi-head self-attention mechanisms and feedforward networks, allow the model to build deep, context-rich representations.
Output Decoding: Typically uses a softmax layer to generate probability distributions over vocabulary, supporting language modeling and text generation.
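To make the final decoding step concrete, here is a toy sketch of softmax over a handful of made-up scores (logits), followed by greedy selection of the most probable token. The class and method names are invented for illustration. A real model scores tens of thousands of tokens and usually samples from the distribution rather than always taking the maximum.

```csharp
using System;
using System.Linq;

public static class Decoding
{
    // Converts raw scores (logits) into probabilities that sum to 1.
    public static double[] Softmax(double[] logits)
    {
        // Subtract the maximum before exponentiating for numerical stability.
        double max = logits.Max();
        double[] exps = logits.Select(x => Math.Exp(x - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }

    // Greedy decoding: pick the index of the most probable token.
    public static int ArgMax(double[] probabilities)
    {
        int best = 0;
        for (int i = 1; i < probabilities.Length; i++)
            if (probabilities[i] > probabilities[best]) best = i;
        return best;
    }
}
```

For example, with a hypothetical four-token vocabulary { "cat", "dog", "code", "the" } and logits { 1.0, 0.5, 3.0, 0.1 }, Decoding.ArgMax(Decoding.Softmax(logits)) selects index 2, the token "code".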

Training a state-of-the-art LLM involves exposing it to vast, diverse datasets including books, documentation, internet text, and, in some domains, code repositories. The process is both data- and compute-intensive, often requiring sophisticated distributed training setups.

After pre-training, LLMs can be fine-tuned on domain-specific data (e.g., legal, finance, medical) for higher accuracy on specialized tasks.

Why LLMs matter for .NET developers

Your apps already orchestrate workflows, data, and APIs. LLMs add language understanding and generation so users can interact naturally and so machines can interpret messy text. With the arrival of LLMs, it’s now possible for .NET developers to:

Automate content generation and document summarization.
Build robust conversational agents for internal or external support.
Enrich search and knowledge management with semantic features.
Integrate advanced automation and intelligent decision-making into business workflows.

Crucially, LLMs can be accessed using C# and the .NET ecosystem—empowering traditional backend and full stack developers to participate in the AI revolution.

Basic LLM Example for C#/.NET

// Using OpenAI’s API with Microsoft.Extensions.AI

using Microsoft.Extensions.AI;
using OpenAI;

IChatClient chatClient = new OpenAIClient("your_api_key").GetChatClient("gpt-5").AsIChatClient();
string question = "Explain retrieval-augmented generation in one paragraph.";
var response = await chatClient.GetResponseAsync(question);

// ChatResponse exposes the combined response text through its Text property.
Console.WriteLine(response.Text);

(The same pattern applies to Azure OpenAI, Ollama, and other providers.)

Problem overview

You build reliable .NET apps that talk to databases, services, and users. LLMs add a new capability: natural language understanding and generation. However, the AI landscape moves fast, providers differ, and vendor lock-in can become expensive. You need a clean, testable way to add AI without tying your architecture to one SDK.

Typical pain points:

Vendor/API lock-in: Changing providers requires significant code refactoring.
Divergent APIs and payload shapes: Each provider exposes its own endpoints and request formats for chat, completion, embeddings, function calling, etc.
Testing friction: Tuning, A/B testing, and benchmarking across models become expensive, error-prone efforts.

The Model-Agnostic Solution

Model-agnosticism in AI development means building systems where the application’s core logic interacts with a unified abstraction layer, rather than being tied to any specific model or service. This is achieved through:

Provider Abstraction: Use interfaces or common contracts, such as IChatClient for chat or IEmbeddingGenerator<TIn,TOut> for embeddings.
Dependency Injection: Inject your concrete model/provider at startup, with support for dynamic selection, configuration, and chaining of middleware.
Unified Middleware/Fallbacks: Ability to add logging, telemetry, caching, rate limiting, or fallback to alternate providers—all at the abstraction layer.
Separation of Concerns: Business workflows, reasoning, and orchestration are decoupled from model selection and invocation.

.NET Abstraction	Purpose	Example Providers
IChatClient	Generic chat/completion interface	OpenAI, Azure OpenAI, Ollama, Mistral
IEmbeddingGenerator<TIn,TOut>	Embedding / text-to-vector generator	OpenAI, Azure, Hugging Face, Ollama
Middleware interfaces	Add caching, tracing, limits, tools	All above + custom business logic
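As a rough illustration of middleware living at the abstraction layer, the sketch below wraps one client in another that caches answers. ITextClient, CachingTextClient, and EchoTextClient are hypothetical names used so the example runs without any packages. Microsoft.Extensions.AI ships its own middleware pipeline, which this does not reproduce.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a provider abstraction such as IChatClient.
public interface ITextClient
{
    Task<string> AskAsync(string prompt, CancellationToken ct = default);
}

// Middleware as a decorator: wraps any ITextClient and caches answers by prompt.
public sealed class CachingTextClient : ITextClient
{
    private readonly ITextClient _inner;
    private readonly ConcurrentDictionary<string, string> _cache = new();

    public CachingTextClient(ITextClient inner) => _inner = inner;

    public async Task<string> AskAsync(string prompt, CancellationToken ct = default)
    {
        if (_cache.TryGetValue(prompt, out var cached))
            return cached; // cache hit: no provider call, no token cost

        var answer = await _inner.AskAsync(prompt, ct);
        _cache[prompt] = answer;
        return answer;
    }
}

// Hypothetical provider used only to make the sketch runnable.
public sealed class EchoTextClient : ITextClient
{
    public int Calls { get; private set; }

    public Task<string> AskAsync(string prompt, CancellationToken ct = default)
    {
        Calls++;
        return Task.FromResult($"echo: {prompt}");
    }
}
```

Because the decorator implements the same interface it wraps, business code cannot tell whether it is talking to the provider directly or to a stack of wrappers. That is what makes logging, caching, or fallback swappable through configuration.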

Prerequisites

.NET 8 or later installed
Familiarity with ASP.NET Core minimal APIs or Web API controllers
A model provider (pick any): Hosted (Azure OpenAI, OpenAI, Mistral, Anthropic) or Local (Ollama, LM Studio)
NuGet packages (choose based on provider and abstraction): Microsoft.Extensions.AI, provider adapters, and optional Semantic Kernel

Relevant .NET packages

Microsoft.Extensions.AI: Abstractions like IChatClient, messages, and results.
Microsoft.Extensions.AI.OpenAI: Adapter for OpenAI and Azure OpenAI.
Microsoft.Extensions.AI.Ollama: Adapter for local models via Ollama (a preview package; the OllamaSharp library also implements IChatClient).
Microsoft.SemanticKernel: Optional orchestration for tools, memory, and templates.

Step-by-step guide

This guide provides a simple, model-agnostic setup using dependency injection and a thin abstraction that can be easily mocked in tests. The goal is to swap providers via configuration without touching business logic. In the next posts I will go deeper into Microsoft.Extensions.AI.

Below is an architecture diagram of a typical .NET app that uses Microsoft.Extensions.AI.

[Figure: The Practical .NET Guide to AI & LLM: Architectural Pattern]

Define a small app-facing contract

Create a thin interface your app depends on. Internally, you’ll adapt it to the provider using Microsoft.Extensions.AI.

public interface ITextGen
{
    Task<string> CompleteAsync(string prompt, CancellationToken ct = default);
}

This keeps your controllers/services independent from any specific SDK.
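One practical payoff of the thin contract is that tests need no network or API key. The sketch below is one way to do that, assuming a hand-written test double. FakeTextGen and TicketSummarizer are hypothetical names, not part of any library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Same contract as above, repeated so the sketch compiles on its own.
public interface ITextGen
{
    Task<string> CompleteAsync(string prompt, CancellationToken ct = default);
}

// Test double: returns a canned answer and records the prompt it received.
public sealed class FakeTextGen : ITextGen
{
    private readonly string _answer;
    public string? LastPrompt { get; private set; }

    public FakeTextGen(string answer) => _answer = answer;

    public Task<string> CompleteAsync(string prompt, CancellationToken ct = default)
    {
        LastPrompt = prompt;
        return Task.FromResult(_answer);
    }
}

// Hypothetical application service that depends only on ITextGen.
public sealed class TicketSummarizer
{
    private readonly ITextGen _ai;

    public TicketSummarizer(ITextGen ai) => _ai = ai;

    public Task<string> SummarizeAsync(string ticketBody, CancellationToken ct = default) =>
        _ai.CompleteAsync($"Summarize this support ticket in one sentence:\n{ticketBody}", ct);
}
```

A unit test can construct new TicketSummarizer(new FakeTextGen("Printer is offline.")), call SummarizeAsync, and assert on both the returned text and FakeTextGen.LastPrompt.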

Wire up a provider behind the abstraction

using Microsoft.Extensions.AI;

public sealed class TextGen : ITextGen
{
    private readonly IChatClient _chat;

    public TextGen(IChatClient chat) => _chat = chat;

    public async Task<string> CompleteAsync(string prompt, CancellationToken ct = default)
    {
        // Minimal prompt -> single-turn response.
        // GetResponseAsync matches the call used in the basic example earlier in this article.
        var result = await _chat.GetResponseAsync(prompt, cancellationToken: ct);
        return result.Text;
    }
}

The class depends on IChatClient from Microsoft.Extensions.AI. You’ll provide a concrete implementation via an adapter (OpenAI, Azure OpenAI, or Ollama).

Configure DI and choose a model via appsettings

In Program.cs, bind settings and switch providers without touching business code.

builder.Services.AddOptions<AiOptions>().BindConfiguration("AI");

builder.Services.AddSingleton<IChatClient>(sp =>
{
    var cfg = sp.GetRequiredService<Microsoft.Extensions.Options.IOptions<AiOptions>>().Value;
    return CreateChatClientFrom(cfg);
});

builder.Services.AddSingleton<ITextGen, TextGen>();

And a simple options type plus configuration:

public sealed class AiOptions
{
    public string? Provider { get; set; } // AzureOpenAI | OpenAI | Ollama
    public string? Endpoint { get; set; } // e.g., https://my-aoai.openai.azure.com or http://localhost:11434
    public string? ApiKey { get; set; }
    public string? Model { get; set; }  // e.g., gpt-4o, gpt-4o-mini, llama3, phi3
}

Example appsettings.json (switch without code changes):

{
  "AI": {
    "Provider": "Ollama",
    "Endpoint": "http://localhost:11434",
    "Model": "phi3"
  }
}

This pattern lets you run locally on Ollama for development, then flip to Azure OpenAI in production by changing configuration.
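Since the provider is now chosen by configuration, a typo or missing key only shows up when the app runs. One possible safeguard is to check the options at startup. The sketch below is a plain validator under stated assumptions: the rules about which providers need a key or an endpoint are guesses based on the comments in AiOptions, and AiOptionsValidator is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Same shape as the AiOptions type above, repeated so the sketch compiles on its own.
public sealed class AiOptions
{
    public string? Provider { get; set; }
    public string? Endpoint { get; set; }
    public string? ApiKey { get; set; }
    public string? Model { get; set; }
}

public static class AiOptionsValidator
{
    private static readonly HashSet<string> KnownProviders =
        new(StringComparer.Ordinal) { "AzureOpenAI", "OpenAI", "Ollama" };

    // Returns human-readable problems; an empty list means the options look usable.
    public static IReadOnlyList<string> Validate(AiOptions o)
    {
        var errors = new List<string>();

        if (o.Provider is null || !KnownProviders.Contains(o.Provider))
            errors.Add($"Unknown AI provider '{o.Provider}'.");

        if (string.IsNullOrWhiteSpace(o.Model))
            errors.Add("Model is required.");

        // Assumption: hosted providers need a key, a local Ollama instance does not.
        if ((o.Provider is "AzureOpenAI" or "OpenAI") && string.IsNullOrWhiteSpace(o.ApiKey))
            errors.Add("ApiKey is required for hosted providers.");

        // Assumption: AzureOpenAI and Ollama need an explicit endpoint.
        if ((o.Provider is "AzureOpenAI" or "Ollama") &&
            !Uri.TryCreate(o.Endpoint, UriKind.Absolute, out _))
            errors.Add("Endpoint must be an absolute URI.");

        return errors;
    }
}
```

The options validation hooks in Microsoft.Extensions.Options (Validate and ValidateOnStart on the options builder) are one place such a check could be called, so a bad configuration fails at startup instead of on the first request.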

Helper to centralize provider-specific setup (replace comments with the actual client types/methods from the adapter packages you install):

// Written as a local function in Program.cs (top-level statements), so it takes no access modifier.
// Requires: using Microsoft.Extensions.AI; using OpenAI;
static IChatClient CreateChatClientFrom(AiOptions cfg)
{
    return cfg.Provider switch
    {
        // Placeholder: build the client from your endpoint, deployment, and API key.
        "AzureOpenAI" => throw new NotImplementedException("Wire up the Azure OpenAI adapter here."),
        // Same construction as the basic example earlier in this article.
        "OpenAI" => new OpenAIClient(cfg.ApiKey).GetChatClient(cfg.Model).AsIChatClient(),
        // Placeholder: build the client from your Ollama endpoint and model.
        "Ollama" => throw new NotImplementedException("Wire up the Ollama adapter here."),
        _ => throw new InvalidOperationException($"Unknown AI provider '{cfg.Provider}'.")
    };
}

Use it from a controller or minimal API

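// Bind the text from the JSON request body; a bare string parameter would bind from the query string.
app.MapPost("/summarize", async (SummarizeRequest req, ITextGen ai, CancellationToken ct) =>
{
    var summary = await ai.CompleteAsync($"Summarize the following for a busy engineer:\n{req.Text}", ct);
    return Results.Ok(summary);
});

// In Program.cs, type declarations go after the top-level statements.
public sealed record SummarizeRequest(string Text);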

Keep prompts deterministic and concise. Add input validation and logging.
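As a minimal sketch of what input validation and a deterministic prompt could look like for the summarize endpoint, the helper below rejects empty or oversized input before any tokens are spent and always builds the prompt from the same fixed template. SummarizePrompt and the 8,000-character limit are assumptions made for illustration.

```csharp
using System;

public static class SummarizePrompt
{
    // Hypothetical limit; tune it to your model's context window and budget.
    public const int MaxInputChars = 8_000;

    // Returns false with an error message when the input should be rejected.
    public static bool TryBuild(string? text, out string prompt, out string? error)
    {
        prompt = string.Empty;
        error = null;

        if (string.IsNullOrWhiteSpace(text))
        {
            error = "Text is required.";
            return false;
        }

        var trimmed = text.Trim();
        if (trimmed.Length > MaxInputChars)
        {
            error = $"Text exceeds {MaxInputChars} characters.";
            return false;
        }

        // Fixed template: the same input always produces the same prompt.
        prompt = $"Summarize the following for a busy engineer:\n{trimmed}";
        return true;
    }
}
```

In an endpoint, a failed TryBuild would typically map to a 400 response carrying the error text, and a successful one would pass prompt to ITextGen.CompleteAsync.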

Optional: add simple system prompts and messages

For richer control, construct chat messages and a system instruction.

var messages = new List<ChatMessage>
{
    new(ChatRole.System, "You are a concise senior .NET engineer. Answer with clear bullet points."),
    new(ChatRole.User, "Explain CancellationToken best practices in ASP.NET Core.")
};

// GetResponseAsync matches the call used in the basic example earlier in this article.
var result = await _chat.GetResponseAsync(messages);
var answer = result.Text; // send back to caller

System and user roles guide the model’s behavior. Keep prompt text as short as the task allows: shorter prompts are easier to reproduce and cheaper to run.
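Because the model sees only the messages sent with each request, a multi-turn conversation means resending earlier turns. The sketch below shows one simple way to keep that history bounded. It uses a hypothetical Turn record instead of the library's ChatMessage so it runs without packages, and a fixed turn count where a real app might budget by tokens.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a chat message: a role name plus text.
public sealed record Turn(string Role, string Text);

public sealed class ConversationHistory
{
    private readonly Turn _system;
    private readonly List<Turn> _turns = new();
    private readonly int _maxTurns;

    public ConversationHistory(string systemPrompt, int maxTurns)
    {
        if (maxTurns < 1) throw new ArgumentOutOfRangeException(nameof(maxTurns));
        _system = new Turn("system", systemPrompt);
        _maxTurns = maxTurns;
    }

    public void Add(string role, string text)
    {
        _turns.Add(new Turn(role, text));

        // Drop the oldest turns once the window is full; the system prompt is kept separately.
        if (_turns.Count > _maxTurns)
            _turns.RemoveRange(0, _turns.Count - _maxTurns);
    }

    // Messages to send on the next request: system prompt first, then the recent turns.
    public IReadOnlyList<Turn> ToRequest() => _turns.Prepend(_system).ToList();
}
```

Keeping the system prompt outside the sliding window means the instructions survive even when old user and assistant turns are discarded.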

Best practices and guardrails

Prompt hygiene: Use system prompts and provide canonical examples. Keep inputs short and specific.
Validation: Post-process outputs with regex/JSON schema. Reject or re-ask when invalid.
Observability: Log prompts, token counts, latency, and provider responses (scrub PII).
Cost control: Cache results, batch requests, and prefer smaller models by default.
Security and privacy: Avoid sending secrets/PII. Consider customer-managed keys and private network access on Azure OpenAI.
Failover: Implement timeouts, retries, and fallback models.
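The failover item can be sketched with nothing but delegates. The helper below gives the primary call a time budget and switches to a fallback when it fails or times out. Retries are left out to keep it short. Resilience and WithFallbackAsync are hypothetical names, and the timeout only takes effect if the primary call honors the cancellation token it is given.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Resilience
{
    // Tries the primary call with a timeout; on failure or timeout, uses the fallback.
    // If the caller cancels, the exception propagates instead of triggering the fallback.
    public static async Task<string> WithFallbackAsync(
        Func<CancellationToken, Task<string>> primary,
        Func<CancellationToken, Task<string>> fallback,
        TimeSpan timeout,
        CancellationToken ct = default)
    {
        // Linked source: cancels when the caller cancels or when the timeout elapses.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        cts.CancelAfter(timeout);

        try
        {
            return await primary(cts.Token);
        }
        catch (Exception) when (!ct.IsCancellationRequested)
        {
            // Primary failed or timed out (the caller did not cancel): use the fallback.
            return await fallback(ct);
        }
    }
}
```

With the ITextGen contract from this guide, primary and fallback could be lambdas such as token => azureGen.CompleteAsync(prompt, token) and token => localGen.CompleteAsync(prompt, token), where azureGen and localGen are two differently configured instances.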

Common pitfalls to avoid

Even experienced C# developers new to AI frequently stumble on the following issues:

Overfitting to Provider: Writing logic directly against a single vendor SDK locks you in and increases future migration pain.
Loss of Context/Missing State: LLMs are stateless; maintaining and reconstructing conversational context is a developer’s responsibility (via chat history, memory abstractions).
Weak Prompt Hygiene: Vague or ambiguous prompts cause unreliable outputs. Strong prompt design (templates, roles, explicit instructions, constraints) is essential.
Blind Trust in Output: LLMs “hallucinate” (fabricate answers) or produce outdated/biased/unsafe content by default. Always add validation, evaluation, and fallback mechanisms.
Insecure Plugins/Tool Invocations: LLMs are not sandboxed—guard function access, handle errors and invalid inputs defensively.
Neglecting Observability: Lack of telemetry and logs makes debugging and monitoring LLM behavior impossible in production.

Avoid these traps to ensure long-term maintainability and safety.
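To guard against blind trust in output, one option is to require a machine-checkable shape and re-ask when the model does not comply. The sketch below assumes the prompt asked for a JSON object with a non-empty "summary" string. That contract, along with the OutputValidation and AskUntilValidAsync names, is hypothetical.

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;

public static class OutputValidation
{
    // True when the text is a JSON object with a non-empty string "summary" property.
    public static bool IsValidSummary(string output)
    {
        try
        {
            using var doc = JsonDocument.Parse(output);
            return doc.RootElement.ValueKind == JsonValueKind.Object
                && doc.RootElement.TryGetProperty("summary", out var s)
                && s.ValueKind == JsonValueKind.String
                && !string.IsNullOrWhiteSpace(s.GetString());
        }
        catch (JsonException)
        {
            return false; // not JSON at all
        }
    }

    // Calls the model up to maxAttempts times and returns the first valid output, or null.
    public static async Task<string?> AskUntilValidAsync(Func<Task<string>> ask, int maxAttempts = 2)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            var output = await ask();
            if (IsValidSummary(output)) return output;
        }

        return null; // caller decides: show an error, use a fallback, or escalate
    }
}
```

Capping the number of attempts keeps a misbehaving model from turning one request into an unbounded bill.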

Security, Compliance, and Ethics in LLM Integration

Given the risk of prompt injection, data leakage, and attack surface expansion, the security and governance for LLMs go beyond the basics. When integrating LLMs, consider the following security and compliance issues:

Prompt Injection: Crafting malicious prompts to subvert model guardrails, steal data, or execute unintended actions.
PII Leakage: LLMs leaking private/internal data, inadvertently or via crafted prompts.
Training Data Poisoning: Malicious data inserted to corrupt model outputs.
DoS and Rate Limiting: Excessive or abusive usage (high request volume or deliberately expensive prompts) can exhaust quotas and budgets; enforce rate limits and input size caps.
Plugin/Tool Vulnerabilities: Plugins letting LLMs invoke insecure code.
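As one small, concrete mitigation for PII leakage, text can be scrubbed before it is logged or sent to a provider. The sketch below redacts email addresses and long digit runs with two deliberately simple patterns. LogScrubber is a hypothetical name, and real PII detection needs considerably more than regular expressions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LogScrubber
{
    // Simple illustrative patterns: email addresses and runs of nine or more digits.
    private static readonly Regex Email =
        new(@"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}", RegexOptions.Compiled);

    private static readonly Regex LongDigits =
        new(@"\b\d{9,}\b", RegexOptions.Compiled);

    // Replaces matches with placeholders so the surrounding text stays readable.
    public static string Scrub(string text)
    {
        var result = Email.Replace(text, "[email]");
        return LongDigits.Replace(result, "[number]");
    }
}
```

For example, LogScrubber.Scrub("Contact jane.doe@example.com, account 1234567890.") returns "Contact [email], account [number].".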

Summary

As an experienced developer, you know how quickly technology evolves. LLMs and model-agnostic abstractions like Microsoft.Extensions.AI are becoming essential tools in your toolkit. They make advanced AI accessible to you and your business application teams, not just to machine learning specialists. This means you can create intelligent, adaptive, and future-proof solutions more efficiently than ever before. Together, we can embrace these advancements to enhance our projects and drive innovation.

Adopt these patterns and practices:

Embrace provider abstraction for agility and future-readiness.
Build with robust architectures and middleware for resilience, auditability, and compliance.
Monitor, evaluate, and secure your LLM-powered apps as you would any enterprise system.
Start locally (Ollama), scale to cloud (OpenAI/Azure), and always keep your user, your data, and your business logic front-and-center.

By utilizing these model-agnostic tools and best practices, you will create AI features that are innovative, maintainable, secure, and enterprise-ready. This approach will ensure your solutions remain effective, regardless of future changes in the landscape.