Building LLM Application using Azure Open AI and HA, DR

Balamurugan Balakreshnan

2 min readOct 28, 2023

--

Introduction

Design HA and DR architecture for LLM application
Inside same region
Across regions
Across globally
Using Azure Open AI
Using App service for semantic kernel or web application
Using Azure Cognitive Search for Vector Index.
Implementing RAG — Retrieval Augumented Generation
Architecture is high level and not detailed

Architecture

Architecture — Explained

Azure Open AI in Mutiple regions
Put a Application Gateway in front of it, for load balancing and HA
You can also keep the application local to the open AI region
Search and web application locally in the region for the application
All the web application and search, use APIM for HA and DR
APIM can be deployed in multiple regions.
Cosmos DB can be replicated across regions.
To have best latency between the application keep all components in same region.
Networking is up to you, you can use VNET peering or VPN or Express Route
Other security controls is up to enterprise and their standards and process.
Also use Azure Monitor to monitor the application and the infrastructure.
Most of the inter application communication can be managed identity or other authentication mechanism.
Most of them use HTTPS as their protocol for communication.

original article — Samples2023/AOAI/aoaillm.md at main · balakreshnan/Samples2023 (github.com)

Sign up to discover human stories that deepen your understanding of the world.

Free

Distraction-free reading. No ads.

Organize your knowledge with lists and highlights.

Tell your story. Find your audience.

Membership

Read member-only stories

Support writers you read most

Earn money for your writing

Listen to audio narrations

Read offline with the Medium app

Large Language Models

High Availability

Disaster Recovery

Written by Balamurugan Balakreshnan

https://balakreshnan.github.io/

No responses yet

Write a response

What are your thoughts?

Also publish to my profile

Help
Status
About
Careers
Press
Blog
Privacy
Rules
Terms
Text to speech