Recently I was talking to a colleague, Devin Prevost, about a project where he is developing a solution for a client on the Power Platform. One of the challenges he was trying to solve was how to ingest data from the client’s data warehouse into the Power App solution so that it could do a lookup and pre-fill fields with default data. Due to cost, the client wanted to keep this under the free licenses for Power Apps and Power Automate, so no premium features, and the data had to live in SharePoint. In doing so, we started looking at how we could feed data into a SharePoint list via a CSV file.

The scenario we will be using starts with a CSV file being created and uploaded to a library in SharePoint Online by a 3rd-party application. From there we will:

  • Fetch the data from the SP List and parse the contents of the CSV (~30,000 rows each)
  • Compare the data in the SP List to the data in the CSV
  • Delete items from the SP List where they do not exist in the CSV or have been updated
  • Create items that appear in the CSV but not the SP List or have been updated

Now ideally we would handle items that need to be updated separately and avoid deleting and recreating them. However, due to performance concerns at large volumes of data, we opted to stick with the create and delete scenarios as the MVP candidate for this solution.

[Screenshot: High-Level outline of the flow]

At a very high level our flow is composed of a trigger and 4 main scopes:

  • Fetch Data – Where we will get the data from the SP List as well as format the data from the CSV file
  • Comparison – Here we will be formatting the data to get it ready for a comparison check; ultimately the output here is 2 arrays: 1 for all items that need to be deleted and another for items that need to be created (Note: I have a separate blog post on comparing items.)
  • Process Items to Delete – Here we will loop through the items to delete in batches of 1000 and use the SP batch API to delete them
  • Process Items to Create – Same as the delete but process the items to create

The Setup

Our trigger for this Power Automate Flow is going to be a SharePoint one called When a file is created in a folder. Our folder will be the root of a super creative document library we created for testing called "Update SharePoint List from CSV."

[Screenshot: When a file is created in a folder]

To ensure that our Flow only runs on CSV files we are going to put a trigger condition on the trigger that looks at the Content-Type property in the header and ensures it is “text/csv.” That way if someone accidentally uploads a file to this document library it doesn’t trigger our flow to run, unless it is a CSV of course.

[Screenshot: Trigger Conditions]

```javascript
@equals(triggerOutputs()?['headers']?['Content-Type'],'text/csv')
```

In addition to the trigger setup, we will be initializing all the variables we will use in our Flow. To save you from all those screenshots, here is a list of them:

  • siteUrl (Makes it easy to update the URL for when switching from dev to prod site collections)
    • Type: String
    • Value: URL of the site that houses the SP list with items
  • listGuid (Makes it easy to update the list for when switching from dev list to prod list)
    • Type: String
    • Value: GUID to the SP list with items
  • terminateCall (Used to break our Do until loop when fetching SP items)
    • Type: Boolean
    • Value: False
  • endPoint
    • Type: String
    • Value: Initial REST API to fetch items from SP List
  • spItems (Used to store the current items in the SP list)
    • Type: Array
    • Value: []
  • createItems (Used to store the items that need to be created in SP)
    • Type: Array
    • Value: []
  • deleteItems (Used to store the items that need to be deleted from SP)
    • Type: Array
    • Value: []
  • remainingItems (Used to store remaining items to loop through in the Delete/Create items scopes)
    • Type: Array
    • Value: []
  • arrayChunkSize (Used to break up the createItems / deleteItems arrays in batches for the batch API)
    • Type: Integer
    • Value: 1000

Now that the Flow is set up we will move into our first Scope, Fetch Data.

Fetch Data

Our fetch data scope has 2 main objectives:

  1. Fetch all items from the designated SP List
  2. Extract the CSV data from the triggerBody and filter out any blank rows

[Screenshot: Scope - Fetch Data]

Our Do until – getSPItems loop runs until the terminateCall variable is true. The first action in the loop is our call to SharePoint; notice that we are using odata=nometadata in the Accept header to help limit what SharePoint sends back.

[Screenshot: Send an HTTP Request to SharePoint - fetch items]

You can see that we are referencing variables defined in the setup here as well. The reason for the endPoint variable is that we will be updating its value with the value passed back in the nextLink provided by SharePoint.

Additionally we are fetching data in this manner for 2 reasons:

  1. We are limited to 5000 items with the Get Items action in Flow
  2. Performance: using this method we were able to pull back 30,000 items in about a minute.

To help with performance, make sure you use the $select query parameter as well as $top on your REST API endpoint.
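
As a rough sketch, the initial value of our endPoint variable looks something like the following; the $select columns here are placeholders for your own schema, and $top=5000 is the most the REST API will hand back per call:

```
_api/web/lists(guid'@{variables('listGuid')}')/items?$select=Id,Title&$top=5000
```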

When SharePoint returns from the first call we are going to combine the items in the spItems array with the items that were just returned, using the union function and a Compose action, as pictured below.

[Screenshot: Compose - concat spItems and value]

```javascript
union(variables('spItems'),outputs('Send_an_HTTP_request_to_SharePoint_-_fetch_items')?['body']['value'])
```

Then we will update the spItems variable with the Outputs of our Compose action that combined both arrays together.

[Screenshot: Set variables - spItems]

Next we need to check the body of the SharePoint call to see if an odata.nextLink property is present. If it is not present then it means there are no more items in the list and we can update the terminateCall variable to be true to break out of our Do until loop. If the property is present we update our endPoint variable with the value of the property and run through the loop again.

[Screenshot: Condition check to see if there are more items]

If no nextLink is present, this condition returns true and we set terminateCall:

```javascript
empty(outputs('Send_an_HTTP_request_to_SharePoint_-_fetch_items')?['body/odata.nextLink'])
```

Otherwise we strip the site URL off the nextLink and store the remainder in endPoint:

```javascript
replace(outputs('Send_an_HTTP_request_to_SharePoint_-_fetch_items')?['body']['odata.nextLink'],concat(variables('siteUrl'),'/'),'')
```

Once we are done fetching all the items from the SharePoint List we move on to formatting the file contents in the triggerBody and removing any blank rows from the CSV. Paul Murana did an awesome blog post on parsing CSV content that we used: Power Automate: How to parse a CSV File to create a JSON array.

At a high level we are using the split function on the content within the body and splitting on the \r\n characters present in the CSV. Then we skip the first item in the array because that is our headers.

Then we are using the Filter array action to remove all the empty rows from the file.

[Screenshot: Compose - format the file content]

```javascript
skip(split(triggerOutputs()?['body'],decodeUriComponent('%0D%0A')),1)
```

Notice how we can use the decodeUriComponent function to get the \r\n delimiter for our split function.
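
If you need each row as an object downstream (as in Paul's post), a Select action over the filtered rows can map every comma-separated column to a named property. A minimal sketch of one mapping's value, assuming a simple comma-delimited file with Title as the first column:

```javascript
split(item(), ',')?[0]
```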

Comparison

The bulk of the logic in our comparison scope is covered in my post on how to Find the differences between two arrays of objects. The main concept here is that we will use the intersection function to find the common items in the two arrays, then filter those common elements out of each. The driving force behind this approach is performance. One catch with the intersection function is that when you compare objects, the schema or properties of the objects in both arrays need to match, so we start off by formatting both the SP List items and the CSV items.

[Screenshot: Select - format spItems for Comparison]

After we format both arrays using the Select action we can perform the intersection on their outputs. This will return us an array of common items between the two.

[Screenshot: Compose - intersection of spItems and csv]
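
The expression behind that Compose is simply the intersection of the two Select outputs; the action names below follow our naming and the second one is an assumption, so adjust them to match your own Flow:

```javascript
intersection(body('Select_-_format_spItems_for_Comparison'), body('Select_-_format_csv_for_Comparison'))
```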

Now that we have an array of all the common elements we need to filter them out. Before we can do this we need to make a small adjustment to our array of common elements: select only the Title property. This property is the primary key between both datasets, so it can be used to correctly identify the items that need to be removed. And, you guessed it, doing so makes the Filter array action more performant because it doesn’t need to check every property on each item.

[Screenshot: Select - intersection (Titles only)]
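
The Map in this Select has a single Title entry whose value plucks the Title off each common item, along the lines of:

```javascript
item()?['Title']
```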

After we run the Select action we add a parallel branch to increase performance and run our Filter array actions at the same time: one to filter common items out of spItems and one to filter them out of the CSV.

[Screenshot: Filter arrays]

We use a simple contains function to see if the Title exists in our array of common Titles, and if it does we filter the item out.

```javascript
contains(body('Select_-_intersection_(Titles_Only)'), item()?['Title'])
```

We then add the results to their respective variables. The unique spItems will be the ones we need to delete and the unique CSV items are the ones we need to create. Before we add those items to the createItems variable, though, we format the results into the object needed by our REST endpoint, AddValidateUpdateItemUsingPath().
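
For reference, AddValidateUpdateItemUsingPath() expects the fields as a formValues array of FieldName/FieldValue pairs, so each CSV item ends up shaped roughly like this (the column names are placeholders):

```javascript
{
  "formValues": [
    { "FieldName": "Title", "FieldValue": "Value from the CSV" },
    { "FieldName": "Department", "FieldValue": "Another value" }
  ]
}
```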

Process Items to Delete

We start off this scope by copying the items in the deleteItems variable to the remainingItems variable. We do this because Power Automate doesn’t allow you to reference a variable while you are updating that same variable. After setting the variable we check that it is not empty, to avoid making a call to SharePoint and running through additional actions when it isn’t necessary. This check is needed because a Do until loop will run through the actions in the loop at least once before performing the condition check.
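
The condition itself is just an empty check on the variable we set:

```javascript
empty(variables('remainingItems'))
```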

[Screenshot: Scope - Process Items to Delete]

We are using a Do until action because SharePoint batch calls can only process 1000 commands per call, and since our array could theoretically have more than 1000 we loop through the array in chunks of 1000. The setup for the actions may look familiar as it was modelled after how John Liu calls the batch API (see shout-outs below for a link), and we used it because of how elegant and simple it is. We start by creating 2 guids, one for the batch and one for the changeSet, using the guid function.

[Screenshot: Compose - batchGuid]

After that we set up the body of our changeSet that we will use to make our Delete calls. Formatting is key here; make sure you:

  • Use spaces, not carriage returns, between the method and the call, and between the call and the protocol
  • For the DELETE method, ensure you have 2 carriage returns after the If-Match: *

[Screenshot: Compose - changeSetChunk]

Notice how the variables help to simplify the call, and that the Outputs reference at the top is a reference to the changeSet guid. Additionally, take note of the ID text in our call, as we will be replacing it with the ID of each item we need to delete in the next action.
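
To give you a feel for the formatting rules above, here is a sketch of the changeSetChunk text for the delete calls; the @{...} tokens stand in for our variables and the changeSet guid Compose (whose name here is an assumption):

```
--changeset_@{outputs('Compose_-_changeSetGuid')}
Content-Type: application/http
Content-Transfer-Encoding: binary

DELETE @{variables('siteUrl')}/_api/web/lists(guid'@{variables('listGuid')}')/items(ID) HTTP/1.1
If-Match: *

```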

Next we use the Select action to loop through either all the items in the remainingItems array (when there are fewer than 1000) or just the first 1000 items in the array. The output is the changeSet chunk text with ID replaced by the ID of the item in the loop. It looks like this:

[Screenshot: Select - replace ID]

From (take at most arrayChunkSize items from remainingItems):

```javascript
if(less(length(variables('remainingItems')),variables('arrayChunkSize')),variables('remainingItems'),take(variables('remainingItems'),variables('arrayChunkSize')))
```

Map (swap the ID placeholder for the item's actual ID):

```javascript
replace(outputs('Compose_-_changeSetChunk'), 'ID', string(item()?['Id']))
```

Now that we have our individual calls scaffolded up we need to configure our call to SharePoint’s batch API endpoint.

[Screenshot: Send an HTTP request to SharePoint - batch call delete]

Take note of where we are referencing the batch guid versus the changeSet guid to ensure that your call is formatted correctly. We then use the join function to take the output of the Select action above and turn it into a single string.

```javascript
join(body('Select_-_replace_ID'), '')
```
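
Putting it together, the request is a POST to _api/$batch on the site, the Content-Type header is multipart/mixed; boundary="batch_@{outputs('Compose_-_batchGuid')}", and the body wraps the joined chunks in the batch and changeSet boundaries, roughly like this (again a sketch; the Compose names are assumptions):

```
--batch_@{outputs('Compose_-_batchGuid')}
Content-Type: multipart/mixed; boundary="changeset_@{outputs('Compose_-_changeSetGuid')}"

@{join(body('Select_-_replace_ID'), '')}
--changeset_@{outputs('Compose_-_changeSetGuid')}--
--batch_@{outputs('Compose_-_batchGuid')}--
```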

After we make our call we update the remainingItems variable using the skip function, skipping the first 1000 items in the deleteItems array. To ensure it scales properly we multiply the 1000 by the current iteration index + 1, since the index is 0-based; this means on the second iteration we skip 2000 items. If we skip more items than are in the array we simply get back an empty array.

[Screenshot: Set variable - update remainingItems]
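
A sketch of the value for that Set variable action; iterationIndexes gives us the current 0-based index of the loop, and the loop name here is an assumption:

```javascript
skip(variables('deleteItems'), mul(variables('arrayChunkSize'), add(iterationIndexes('Do_until_-_delete_items'), 1)))
```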

The Do until loop will then check to see if the length of the remainingItems array is 0 and if not it will run through the actions again.

Process Items to Create

The Process Items to Create scope and actions are nearly identical to the steps in the Delete scope; the only differences, besides setting the remainingItems variable to createItems, are in the changeSetChunk and the Select – replace actions. The changeSetChunk is modified to support creating items and the Select simply changes the replace string from ID to INSERT ITEM.

[Screenshot: Compose - changeSetChunk 2]

Above you will see the format for the create items call. Notice our method is now POST and we have updated the endpoint as well. Additionally, notice that the body for this particular call will be dynamically inserted by the Select action, replacing the INSERT ITEM text.
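
A sketch of the create version of the chunk: same boundary and headers as the delete version, but now a POST to AddValidateUpdateItemUsingPath() with the INSERT ITEM placeholder where the Select drops in each item's formValues body:

```
--changeset_@{outputs('Compose_-_changeSetGuid')}
Content-Type: application/http
Content-Transfer-Encoding: binary

POST @{variables('siteUrl')}/_api/web/lists(guid'@{variables('listGuid')}')/AddValidateUpdateItemUsingPath HTTP/1.1
Content-Type: application/json;odata=verbose

INSERT ITEM

```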

Performance

Before we talk about performance it is important to note the size of the data set we are working with, as the larger the data set the longer certain actions will take to complete. For demo and testing purposes our SharePoint list had 30,000 items in it and our CSV had 30,000 rows. Each item consisted of 7 properties, and for the calls to fetch items from SharePoint we requested the ID column as well. With these numbers in mind we can talk about performance:

  • Our fetch items from SharePoint consistently took around 1 minute to complete
  • The intersection was extremely fast and would complete in about 2 seconds (main reason we used it)
  • The filter array actions would consistently take about 1 minute each and on occasion 2 minutes

Now the delete and create scopes would vary based upon how many items needed to be deleted and how many needed to be created. We tested it 2 ways:

  • 1000 items
    • Deleting: ~1 minute
    • Creating: ~50 seconds
  • 10000 items
    • Deleting: ~12 minutes
    • Creating: ~9 minutes

The average total run of the Flow when 1000 items needed to be deleted and created was under 5 minutes, whereas a larger 10000 item run took around 24 minutes to complete.

Gotchas & Takeaways

There were some gotchas and takeaways we discovered along the way:

  • Be aware of the default limits on the Do until loop. The timeout is set to 1 hour and the loop count maxes out at 60; these may need to be increased based on the amount of data you need to process.
  • If your call size is too large you will receive an error message and will need to back down from 1000 commands per batch. This all depends on the number of columns in your data set and the number of items that need to be created.
  • Be careful when copying and pasting into the Select action, as sometimes Flow will stringify your array, which ultimately messes up the format for the batch calls. You won’t see the action fail, but you will notice no items being created.

Known Limitations

In an ideal world we would have had a Per-Flow plan license that we could have used for this solution, but since we needed to keep it free, here are the limitations we had to contend with:

  • 5000 item limit for the Apply to each action (doesn’t bode well when you need to loop through ~30,000 items)
  • Max 5000 item limit on the Get All Items SharePoint action (Paginated Items limit)
  • Size of the body being passed to the batch endpoint for SharePoint (Since we knew there was a lot of data to move we wanted to limit the amount of calls going to and from SharePoint to make the Flow as performant as possible.)
  • 1000 commands per batch request

Obligatory shout-outs

Like anything else in life, where would we be without Google and others in the community blogging about the different solutions and scenarios we drew on in crafting our own. In no particular order:

  • Paul Murana – Power Automate: How to parse a CSV File to create a JSON array
  • John Liu – his post on calling the SharePoint batch API from Flow

Co-Developer Spotlight – Devin Prevost

Devin Prevost has been a consultant for 6+ years, primarily in the Microsoft 365 and Application Development space. Devin’s passion for solving complicated business problems with technical solutions that his clients can maintain has recently found him architecting and developing solutions within the Power Platform. Connect with Devin on LinkedIn.
