Spark Scala: process Base64-encoded messages with stream processing

Process Event Hub messages whose body is Base64 encoded.

Use case

  • IoT Hub and Event Hub use the Avro format and store the message body Base64 encoded
  • Parse the data using Structured Streaming
  • Decode and parse the body column, which is Base64 encoded
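
To make the Base64 step concrete, here is a minimal sketch in plain Scala (no Spark required) that encodes a message and decodes it again with `java.util.Base64`. The object name, helper names, and the JSON payload are hypothetical and serve only as illustration; they are not taken from the original notebook.

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

object Base64BodyExample {
  // Encode UTF-8 text as Base64, the form in which the message body is stored.
  def encodeBody(text: String): String =
    Base64.getEncoder.encodeToString(text.getBytes(StandardCharsets.UTF_8))

  // Decode a Base64 string back into UTF-8 text.
  def decodeBody(encoded: String): String =
    new String(Base64.getDecoder.decode(encoded), StandardCharsets.UTF_8)

  def main(args: Array[String]): Unit = {
    // Hypothetical payload, used only for illustration.
    val json = """{"deviceId":"sensor-1","temperature":21.5}"""
    val encoded = encodeBody(json)
    println(encoded)             // Base64 text
    println(decodeBody(encoded)) // the original JSON again
  }
}
```

The same round trip is what the streaming job has to undo for every row of the body column.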

Prerequisites

  • Azure subscription
  • Create an Event Hub namespace
  • Select the Standard tier, since Schema Registry is not available in Basic
  • Create an event hub with 1 partition
  • Create a consumer group called sample1
  • Create an Azure Databricks workspace
  • Spark 3.0
  • Create an Event Hub cluster
  • Install the Event Hub library JAR from Maven: com.microsoft.azure:azure-eventhubs-spark_2.12:2.3.17
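
On Databricks the library is attached to the cluster through its Maven coordinate. If the same code were built outside a notebook, the coordinate above could be declared in an sbt build roughly as follows. This is a sketch that assumes an sbt project compiled for Scala 2.12; the build file itself is not part of the original walkthrough.

```scala
// build.sbt fragment (assumed project layout, Scala 2.12 build)
// Same Maven coordinate as listed in the prerequisites.
libraryDependencies += "com.microsoft.azure" % "azure-eventhubs-spark_2.12" % "2.3.17"
```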

Simulator to create and send data to the event hub

  • https://eventhubdatagenerator.azurewebsites.net/
  • Copy the Event Hub connection string
  • Copy the name of the event hub and paste it here
  • Leave the JSON message as it is
  • Change the number of messages to 500
  • Click Submit
  • Wait a few seconds for the data to load

Azure Databricks code

  • Set up the Event Hub config
  • Configure it to read the stream from the beginning
  • Create a connectionString variable and store the connection string in it
  • Set up the configuration for the read stream
  • Set up the stream
  • Parse the JSON properties
  • Display the variables
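
The steps above can be sketched as a single Databricks notebook cell. This is a hedged reconstruction, not the original notebook code. It assumes the `azure-eventhubs-spark` library from the prerequisites is installed, that `spark` and `display` are the objects a Databricks Scala notebook provides, and that the body holds Base64 text wrapping a JSON document. The connection string and event hub name are placeholders, and the `deviceId` property and the `parseBody` helper are hypothetical names chosen for illustration.

```scala
import org.apache.spark.eventhubs.{ConnectionStringBuilder, EventHubsConf, EventPosition}
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, get_json_object, unbase64}

// Placeholders: substitute your own namespace connection string and event hub name.
val connectionString = ConnectionStringBuilder("<event-hub-namespace-connection-string>")
  .setEventHubName("<event-hub-name>")
  .build

// Use the sample1 consumer group and read from the beginning of the stream.
val ehConf = EventHubsConf(connectionString)
  .setConsumerGroup("sample1")
  .setStartingPosition(EventPosition.fromStartOfStream)

// `spark` is the SparkSession supplied by the Databricks notebook.
val rawStream = spark.readStream
  .format("eventhubs")
  .options(ehConf.toMap)
  .load()

// Decode the binary body column and extract a JSON property from it.
def parseBody(raw: DataFrame): DataFrame =
  raw
    // The connector exposes body as binary; cast it to text first.
    .withColumn("bodyText", col("body").cast("string"))
    // Assumption: bodyText is Base64 text. Skip this step if it is already JSON.
    .withColumn("json", unbase64(col("bodyText")).cast("string"))
    // Hypothetical property name; replace with a field from your own message.
    .withColumn("deviceId", get_json_object(col("json"), "$.deviceId"))
    .select("enqueuedTime", "json", "deviceId")

val parsed = parseBody(rawStream)

// Databricks-specific helper that renders the streaming DataFrame in the notebook.
display(parsed)
```

Keeping the decoding logic in a function that takes a DataFrame means it can also be tried on a small static DataFrame before it is attached to the live stream.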

Original story on GitHub