Use the open-source Titanic dataset to run automated machine learning

Prerequisites

  • Azure Account
  • Create a resource group
  • Create an Azure Storage account
  • Create an Azure Machine Learning workspace

Data

  • Data is available in the repo
  • Filename: Titanic.csv
  • Download the file to your local hard drive for later upload

Build Model

  • Log into Azure Portal
  • Go to the Azure Machine Learning service resource
  • Open Azure Machine Learning Studio

Create Dataset

  • Go to Datasets
  • Click Create dataset
  • Give the dataset a name: TitanicTraining (an SDK sketch of the same step follows)
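
The same registration can also be scripted. Below is a minimal sketch using the Azure Machine Learning Python SDK, assuming Titanic.csv sits next to the script and the workspace has a config.json; the target path and description are illustrative.

from azureml.core import Workspace, Dataset

# Connect to the workspace and grab its default datastore.
ws = Workspace.from_config()
datastore = ws.get_default_datastore()

# Upload the local Titanic.csv and register it as a tabular dataset named TitanicTraining.
datastore.upload_files(files=["./Titanic.csv"], target_path="titanic/", overwrite=True)
titanic_ds = Dataset.Tabular.from_delimited_files(path=(datastore, "titanic/Titanic.csv"))
titanic_ds.register(workspace=ws, name="TitanicTraining",
                    description="Titanic training data", create_new_version=True)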


Using Azure Cognitive Services Speech to Text and Logic apps

No Code — Workflow style

Please use this as a reference, since it may not match your exact use case, but it shows how AI can help drive insights and automate processing of large audio datasets.

Prerequisites

  • Azure Account
  • Azure Storage account
  • Azure Cognitive Services
  • Azure Logic apps
  • Get connection string for storage
  • Get the primary key to be used as the subscription key for Cognitive Services
  • Audio file should be in WAV format
  • Audio file cannot be too big
  • Audio length: up to 10 minutes

Logic apps

  • First, create a trigger from Blob storage
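
The Logic App itself is no-code; for reference, here is a minimal Python sketch of one way to invoke the same service programmatically, using the Speech to Text v3.0 batch transcription REST API. The region, key, and blob SAS URL are placeholders, and the job name is illustrative.

import requests

region = "<your-region>"                      # e.g. eastus
key = "<cognitive-services-primary-key>"      # primary key from the Cognitive Services resource
audio_sas_url = "https://<storage-account>.blob.core.windows.net/audio/sample.wav?<sas-token>"

# Create a batch transcription job for the WAV file sitting in Blob storage.
resp = requests.post(
    f"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions",
    headers={"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"},
    json={
        "displayName": "blob-triggered-transcription",
        "locale": "en-US",
        "contentUrls": [audio_sas_url],
    },
)
resp.raise_for_status()

# Poll the returned "self" URL until the job completes; its "files" link
# then points at the transcription result JSON.
print(resp.json()["self"])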

Using Azure Cognitive Services Forms Recognizer and Logic apps

No Code — Workflow style

Prerequisites

  • Azure Account
  • Azure Storage account
  • Azure Cognitive Services — Form Recognizer
  • Azure Logic apps
  • Get connection string for storage
  • Get the primary key to be used as the subscription key for Cognitive Services
  • Document file should be in PDF format
  • Document file cannot be too big

Full Flow

Logic apps

  • First, create a trigger from Blob storage
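
For reference, here is a minimal Python sketch of one way to call the same service programmatically, using the Form Recognizer v2.1 layout analyze REST API; the endpoint, key, and document SAS URL are placeholders.

import time
import requests

endpoint = "https://<your-form-recognizer>.cognitiveservices.azure.com"
key = "<form-recognizer-primary-key>"
doc_url = "https://<storage-account>.blob.core.windows.net/forms/sample.pdf?<sas-token>"

# Submit the PDF for layout analysis; the service replies with 202 Accepted
# and an Operation-Location header to poll.
resp = requests.post(
    f"{endpoint}/formrecognizer/v2.1/layout/analyze",
    headers={"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"},
    json={"source": doc_url},
)
resp.raise_for_status()
operation_url = resp.headers["Operation-Location"]

# Poll until the analysis succeeds or fails, then print the extracted layout JSON.
while True:
    result = requests.get(operation_url, headers={"Ocp-Apim-Subscription-Key": key}).json()
    if result.get("status") in ("succeeded", "failed"):
        break
    time.sleep(2)
print(result)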

Using Azure Cognitive Services Speech to Text and Logic apps

No Code — Workflow style

We can reuse the same pattern for other Azure Cognitive Services as well.

Prerequisites

  • Azure Account
  • Azure Storage account
  • Azure Cognitive Services
  • Azure Logic apps
  • Get connection string for storage
  • Get the primary key to be used as subscription key for cognitive services
  • Audio file should be in WAV format
  • Audio file cannot be too big
  • Audio length: up to 10 minutes

Logic apps

  • First, create a trigger from Blob storage

Using the OpenCensus library to push error logs to Azure Monitor

Prerequisite

  • Azure account
  • Azure Synapse workspace
  • Azure Storage

Steps

  • Create a Spark pool
  • Install the library
  • Create a conda environment file and upload it
  • Create an environment.yml:
name: example-environment
channels:
  - conda-forge
dependencies:
  - python
  - numpy
  - pip
  - pip:
    - opencensus-ext-azure
  • Create a notebook

Code

  • Choose Python as the language
import logging
from opencensus.ext.azure.log_exporter import AzureLogHandler

logger = logging.getLogger(__name__)

# TODO: replace the placeholder with your instrumentation key.
logger.addHandler(AzureLogHandler(
    connection_string='InstrumentationKey=xxxxx-xxxxxx-xxxxxx-xxxxxxx')
)

logger.warning("Sample from open census test 01")
logger.error("Sample from open census test 02")

from opencensus.ext.azure.trace_exporter import AzureExporter
from opencensus.trace.samplers import ProbabilitySampler
from…
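
The snippet above is truncated; for completeness, here is a minimal sketch of how the OpenCensus trace exporter is typically wired up, with the connection string again a placeholder and the span name illustrative.

from opencensus.ext.azure.trace_exporter import AzureExporter
from opencensus.trace.samplers import ProbabilitySampler
from opencensus.trace.tracer import Tracer

# Export traces to Azure Monitor; sample 100% of spans.
tracer = Tracer(
    exporter=AzureExporter(connection_string='InstrumentationKey=xxxxx-xxxxxx-xxxxxx-xxxxxxx'),
    sampler=ProbabilitySampler(1.0),
)

# Any code executed inside the span is timed and exported to Azure Monitor.
with tracer.span(name='sample-span'):
    print('doing work inside the traced span')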

Using the OpenCensus library to push error logs to Azure Monitor from Azure Databricks

Prerequisite

  • Azure account
  • Azure Databricks
  • Azure Storage

Steps

  • Create a Databricks cluster
  • Install the opencensus-ext-azure library
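
One way to install it, assuming a notebook-scoped library is acceptable, is directly from a notebook cell; alternatively, attach the PyPI package opencensus-ext-azure through the cluster's Libraries tab.

# Notebook-scoped install on the attached Databricks cluster.
%pip install opencensus-ext-azure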

Spark code to predict classification for a practical dataset: the Titanic survivor dataset

Use Case

  • Use the Titanic dataset
  • The dataset is available in this folder as Titanic.csv
  • Upload the dataset to Azure Storage

Prerequisite

  • Azure account
  • Azure storage account
  • Create a container called titanic and upload the Titanic.csv file to the container
  • Create an Azure Databricks resource
  • Create an Azure Key Vault
  • Store the storage account primary key in a secret
  • Configure the Key Vault as a secret scope in Azure Databricks

Code

  • First, let's include the libraries and read the storage key from the secret scope
from pyspark.ml import Pipeline
from pyspark.ml.classification import RandomForestClassifier
from pyspark.ml.feature import IndexToString, StringIndexer, VectorIndexer
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
from pyspark.sql import functions as f
storagekey = dbutils.secrets.get(scope = "allsecrects", key = "storagekey")
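
Continuing from the imports and the secret lookup above, here is a minimal sketch of reading the file from Blob storage and fitting a random forest; the storage account name mystorageaccount, the chosen feature columns, and the split are assumptions for illustration.

from pyspark.ml.feature import VectorAssembler

# Authenticate Spark against the storage account (account name is a placeholder).
spark.conf.set("fs.azure.account.key.mystorageaccount.blob.core.windows.net", storagekey)

# Read Titanic.csv from the titanic container.
df = spark.read.csv("wasbs://titanic@mystorageaccount.blob.core.windows.net/Titanic.csv",
                    header=True, inferSchema=True)

# Index the label and a categorical feature, assemble features, and train a random forest.
labelIndexer = StringIndexer(inputCol="Survived", outputCol="label")
sexIndexer = StringIndexer(inputCol="Sex", outputCol="SexIndexed")
assembler = VectorAssembler(inputCols=["Pclass", "SexIndexed", "Fare"],
                            outputCol="features", handleInvalid="skip")
rf = RandomForestClassifier(labelCol="label", featuresCol="features", numTrees=20)

train, test = df.randomSplit([0.8, 0.2], seed=42)
model = Pipeline(stages=[labelIndexer, sexIndexer, assembler, rf]).fit(train)

# Evaluate accuracy on the held-out split.
predictions = model.transform(test)
evaluator = MulticlassClassificationEvaluator(labelCol="label", predictionCol="prediction",
                                              metricName="accuracy")
print("accuracy:", evaluator.evaluate(predictions))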

Build an end-to-end data science pipeline lab using Azure Data Factory and Azure Databricks

Prerequisites

  • Azure account
  • Azure data factory
  • Azure databricks
  • Azure Storage ADLS Gen2 (data lake) to store all the Parquet files
  • Azure Key Vault for storing secrets

End to End Pipeline Architecture

Steps

  • For the notebooks I am using existing notebooks from the Microsoft documentation web site
  • The business logic here has no real business value
  • The tasks are not related to each other; the goal is to show the flow
  • The data used are public datasets

Components

  • Data flow to showcase a data warehouse facts/dimensions model
  • Notebook PassingParameters: to showcase how to pass pipeline parameters into notebooks (a sketch of the widget pattern follows this list)
  • Notebook DataEngineering: to showcase data engineering…
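
Inside the PassingParameters notebook, the Data Factory Notebook activity's base parameters surface as widgets. A minimal sketch, with the parameter name process_date and its default value chosen purely for illustration:

# Declare a widget so Data Factory can pass a value through "Base parameters".
dbutils.widgets.text("process_date", "2021-01-01")
process_date = dbutils.widgets.get("process_date")
print("processing data for", process_date)

# Optionally return a value to the Data Factory pipeline for downstream activities.
dbutils.notebook.exit(process_date)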

When code is checked in, trigger CI/CD using Azure DevOps

Prerequisite

  • Azure Account
  • Azure Machine Learning
  • Create a compute instance
  • Create a compute cluster named cpu-cluster
  • Select a Standard D-series VM size
  • Create a train file to train the model
  • Create a pipeline file to run the training as a pipeline (a sketch appears after the training script below)

Steps

Create Train file as train.py

  • Create a directory ./train_src
  • Create a train.py
  • It should be a Python file, not a notebook
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

import argparse
import os

import numpy as np
import pandas as pd
import sklearn as sk
# import seaborn as sn
# import matplotlib.pyplot as plt

from azureml.core import Workspace, Dataset
from azureml.data.dataset_factory import DataType
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.preprocessing import…
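
The pipeline file mentioned in the prerequisites submits this script to the cpu-cluster. A minimal sketch of what such a driver file can look like, using the Azure ML SDK; the experiment name and the environment's package list are illustrative assumptions.

from azureml.core import Workspace, Experiment, Environment, ScriptRunConfig
from azureml.core.conda_dependencies import CondaDependencies

ws = Workspace.from_config()

# Environment with the packages train.py needs (package list is illustrative).
env = Environment("titanic-train-env")
env.python.conda_dependencies = CondaDependencies.create(
    pip_packages=["azureml-sdk", "scikit-learn", "pandas", "numpy"])

# Run ./train_src/train.py on the cpu-cluster compute target.
src = ScriptRunConfig(source_directory="./train_src",
                      script="train.py",
                      compute_target="cpu-cluster",
                      environment=env)

run = Experiment(ws, "titanic-ci-training").submit(src)
run.wait_for_completion(show_output=True)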


Create Train pipeline in Azure DevOps for Automated ML Vision

Prerequisite

  • Azure Account
  • Azure Storage
  • Azure Machine learning Service
  • Azure DevOps
  • Github account and repository
  • Azure Service Principal Account
  • Provide the service principal Contributor access to the Machine Learning resource
  • Azure Key Vault to store secrets
  • Update the Key Vault with the service principal secrets
  • The train script is designed to take 3 parameters (see the sketch after this list)
  • Tenant id
  • Service principal client id
  • Service principal secret
  • The above details will be passed from Azure DevOps as secret variables
  • This automates the training code and registers the model
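
A minimal sketch of how a train script can consume those three parameters and authenticate to the workspace; the argument names, subscription, resource group, and workspace values are placeholders.

import argparse
from azureml.core import Workspace
from azureml.core.authentication import ServicePrincipalAuthentication

# The three values arrive from Azure DevOps secret variables.
parser = argparse.ArgumentParser()
parser.add_argument("--tenant_id", required=True)
parser.add_argument("--sp_client_id", required=True)
parser.add_argument("--sp_secret", required=True)
args = parser.parse_args()

sp_auth = ServicePrincipalAuthentication(
    tenant_id=args.tenant_id,
    service_principal_id=args.sp_client_id,
    service_principal_password=args.sp_secret,
)

# Placeholders: replace with your subscription, resource group, and workspace names.
ws = Workspace.get(name="<workspace-name>",
                   subscription_id="<subscription-id>",
                   resource_group="<resource-group>",
                   auth=sp_auth)
print("Connected to workspace:", ws.name)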

Steps

  • Create a train script
  • Create agent dependencies scripts
  • Create Azure DevOps Pipeline

Train script

  • Open Visual Studio Code
  • Create a New project called visionautoml
  • Create…

Balamurugan Balakreshnan
