Azure Synapse Integrate dataflow with Serverless SQL as source
3 min readApr 30, 2022
Move and process data from server less SQL and Save as delta
requirements
- Azure Account
- Azure Storage
- Azure Synapse Analytics workspace
- Load the NYC taxi data
Steps
- Log into Azure synapse analytics workspace
- Go to Develop on the left menu
- create a new data flow
Data flow
- Overall data flow
- First connect to serverless sql using Azure sql database connector
- Get the ondemand sql name from azure synapse workspace overview page
- I am using sql authentication to test Managed identity authentication for serverless sql to storage
- You can also use passthrough authentication, but make sure proper permission is set in the storage account to read the data
- Select the table name as vwYellowTaxi, in my case it’s a view to call
- Click browse and see if you can see the data
- Next drag and drop select
- Select all columns
- Next we are going to create parition columns as year and month
- Drag and drop derived column1 task
- create new column for year and month
year(tpepPickupDateTime)
month(tpepPickupDateTime)
- Next is save the file as delta into Synapse workspace storage
- drag sink and select inline
- Select delta and see the config below
- Specify the storage folder to store the file
- Now set the partition columns
- Commit and save the data flow
Pipeline to run the dataflow
- Now lets create a pipeline
- Go to Pipeline and create new pipeline
- Then drag and drop dataflow activity and select the above dataflow created
- Set the compute options to 32 + 16 cores
- Select memory optimized
- Commit and publish the pipeline
- Click Add Triger and select Run now
- Wait for the pipeline to run and complete
- Click details
Original article at — Samples2022/serverlesssqldf.md at main · balakreshnan/Samples2022 (github.com)