Back it up, Back it in, let me begin – Automated backup of Microsoft SharePoint

Even in the modern age of cloud computing, where companies routinely synchronise their data to cloud storage, backups still matter. Anyone who has been (un)fortunate enough to be involved with ISO 27001 knows it asks plenty of questions about backups, and the answer ‘we’re assuming AWS doesn’t ever break or go out of business, and that none of our employees do anything silly’ results in tooth-sucking from auditors.

Most cloud providers have their own backup offerings, providing redundancy for any data you have stored on their systems, but those of you who want to don the tin-foil hat of a paranoid risk owner may wish to have another option.

Cloud providers often offer migration services – ways to get your on-premises or otherwise-stored data ON to their systems – but they rarely make it as easy to get it OFF, which can leave you with some repetitive clicking to do a monthly backup.

But computers were designed to do repetitive tasks to take the pain away from humans… so here is how we can use them to do an otherwise tedious backup from Microsoft SharePoint to a local disk.

Taking an automated Offline Backup of Multiple SharePoint Sites

This blog focuses on how to use PowerShell to do an automated backup of multiple SharePoint sites.

Backing up one SharePoint site is pretty simple, but if your organisation uses multiple sites to restrict access to project and shared files to those who need to know, following the principle of least privilege, then here is a way to overcome the repetitive task of logging into each site.

Requirements

To follow the steps in this blog, you will need ‘SharePoint Administrator’ rights available to you. You will also need PowerShell 7 installed – not the same as the ‘Windows PowerShell’ that ships with Windows, which is actually version 5.1 – plus the PnP.PowerShell module, which the script imports later on.
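If you want to confirm both prerequisites up front, something like the following will do it (a minimal sketch – these are the standard cmdlets, but adjust the install scope to suit your environment):

# Check which PowerShell version this session is running - we need 7 or later
$PSVersionTable.PSVersion

# Install the PnP.PowerShell module for the current user if it isn't already there
Install-Module -Name PnP.PowerShell -Scope CurrentUser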

The Script

Let’s start scripting.

Assuming you aren’t the only person who can make SharePoint sites in your organisation, and assuming you can’t remember them all, we are going to need to do some checks and loops to find them all.

First, let’s set up some basic variables and confirm that people are using the right version of PowerShell:

Initialisation

# Set up some variables to get our script pointed in the right direction
# This is all going to be written using Microsoft's favourite fake company, Contoso - change the relevant bits out to match your organisation's configuration.

$SiteURL = "https://contoso.sharepoint.com" # Contoso's SharePoint Top Level Domain URL
$LibraryURL = "/Shared Documents" # This is the common relative URL for the documents in SharePoint - just check your organisation's config matches this
$DestinationFolder = "D:\Backup_Folder" # Local download folder TLD, this script will go on to generate subfolders when it finds them

# Let's also do a PowerShell version check, using the $PSVersionTable.PSVersion automatic variable:
Write-Host "You are running PowerShell Version " $PSVersionTable.PSVersion ". This script only runs on PS7 or later."

# Let's do a version test on the first number of the version only (Major), or PowerShell gets a bit confused when doing a comparison. We need to make sure it is greater than or equal to 7:
if ($psVersionTable.PSVersion.Major -ge 7) {
    # If it is 7+, we can crack on
    Write-Host "Continuing Script..."

    # Bring in PnP module for PowerShell - we need this for some of the script later on
    Import-Module -Name PnP.PowerShell

    # Connect to Contoso's main area initially to get master site list - this will be an interactive logon with a popup
    Connect-PnPOnline $SiteURL -Interactive

    #Call our main function - which we will write next
    BackupDownload
}

# If the system this is running on doesn't have PowerShell 7+, let's stop it now to prevent confusion and delay
else {
    Write-Host "Exiting Script"
    exit
}

The Main Backup Script

We’ve checked we have the right version of PowerShell, and pointed the script to our Top Level Domain and logged in – now let’s do some backing up.

Let’s focus on the files and folders in the Top Level Domain and the ‘top’ of the Sites in your organisation first of all, not any subfolders:

# Main function starts here
Function BackupDownload() {
    # Get Collection of all relevant SharePoint sites
    $SiteList = Get-PnPTenantSite -Template GROUP#0 # This line returns a collection with a lot of info, we just need the URL
    $URLList = $SiteList.Url # This line creates an array of current site URLs in the Contoso SharePoint
    $TestSiteURL = $URLList # This line can be changed to add in just one URL if you want to troubleshoot or just back up one site. Leaving it set to $URLList passes all sites to be backed up.

    # We are going to use a for loop to iterate over all the sites in our Top Level Domain (TLD)
    foreach ($Site in $TestSiteURL) {
        Write-Host -f Magenta "Current site is '$Site'" # I Like a lot of output in a terminal so I can see what is happening, delete this if you don't!

        Connect-PnPOnline $Site -Interactive # Connect to that specific site

        $LocalFolder = $DestinationFolder + ($Site.Replace($SiteURL, "") -replace "/", "\") # Set local download folder to match current cloud directory name for ease

        $LocalSDFolder = $LocalFolder + $LibraryURL # This line adds the generic folder name for 'Shared Documents' to your local folder.

        # Find all Files and Folders in current directory
        $CurDirFiles = Get-PnPFolderItem -FolderSiteRelativeUrl $LibraryURL -ItemType File # List Files in Shared Documents of the current Site
        $CurDirFolders = Get-PnPFolderItem -FolderSiteRelativeUrl $LibraryURL -ItemType Folder # List Folders in Shared Documents of the current Site

        # Once again, I like output so I can see what is going on - these lines print the files and folders in the current area
        Write-Host -f Yellow "Files in the current directory:"
        foreach ($file in $CurDirFiles) { Write-Host "`t$($file.Name)" }
        Write-Host -f Yellow "Folders in the current directory:"
        foreach ($folder in $CurDirFolders) { Write-Host "`t$($folder.Name)" }

        # Confirm folder exists in local area to download into - matches Shared Documents level of Site
        # This if loop checks if we already have a folder in our local backup folder, and makes one if not
        If ($CurDirFiles) {
            # only a relevant step at this stage if there are files in the top level, else will create in next step
            If (!(Test-Path -Path $LocalSDFolder)) {
                New-Item -ItemType Directory -Path $LocalSDFolder | Out-Null
                Write-Host -f Yellow "Created a New Folder '$LocalSDFolder'"
            }
        }

        # Download Files in the current site - we need a for loop to get them all
        foreach ($File in $CurDirFiles) {
            Get-PnPFile -ServerRelativeUrl $File.ServerRelativeUrl -Path $LocalSDFolder -FileName $File.Name -AsFile # LocalFolder works as directory only for the TLD
            Write-Host -f Green "`tDownloaded File from '$($File.ServerRelativeUrl)' to $LocalSDFolder"
        }

        # Set the current folder as the 'relative folder' we are about to pass to the subfolder part of the script
        $RelativeFolder = $LibraryURL

        # Call 'subfolder hunter' function to find files and folders in the subfolders of your sites
        # NB in PowerShell you cannot pass an array of objects to a function, so loops must exist before the function
        foreach ($folder in $CurDirFolders) {
            subfolderhunter $folder.Name $RelativeFolder $folder.ServerRelativeUrl # Remember you pass PowerShell arguments with a space, not comma, and no brackets
        }
    }
}

Subfolder Hunter

The code above will download the files in the top levels of the sites in your organisation, and tell you what folders are in them. However, we are going to need to dig deeper to get the files and folders in the subfolders of your sites.

Thanks to a quirk of PowerShell (it doesn’t let you pass an array of objects into a function), we have to do this in two steps. On to the next step, the subfolder hunter, which essentially does what we have just done – but down a level, and onwards to the bottom of your file structure!

# Function to look in subfolders of top levels of sites

# We are passing in to the function the name of the folder we want to backup, its relative location from the top level (probably /Shared Documents/folder), and its server relative URL
function subfolderhunter($folder, $RelFolder, $SRU) {

    # Set working subdirectory
    $WorkingSubdir = ($RelFolder + "/" + $folder)
    Write-Host -f Cyan "Current Working Folder is " $WorkingSubdir

    # Create subfolder in our backup location if it doesn't already exist
    $CurSaveFold = $DestinationFolder + ($SRU -replace "/", "\")

    If (!(Test-Path -Path $CurSaveFold)) {
        New-Item -ItemType Directory -Path $CurSaveFold | Out-Null
        Write-Host -f Yellow "Created a New Folder '$CurSaveFold'"
    }

    # Get list of files in current subfolder
    $SubDirFiles = Get-PnPFolderItem -FolderSiteRelativeUrl $WorkingSubdir -ItemType File
    $SubDirFolders = Get-PnPFolderItem -FolderSiteRelativeUrl $WorkingSubdir -ItemType Folder

    # Print to the terminal the names of the files and folders, for those who like output
    Write-Host -f Yellow "Files in the current subdirectory:"
    foreach ($file in $SubDirFiles) { Write-Host "`t$($file.Name)" }
    Write-Host -f Yellow "Folders in the current subdirectory:"
    foreach ($folder in $SubDirFolders) { Write-Host "`t$($folder.Name)" }

    # Download Files in Current directory
    foreach ($File in $SubDirFiles) {
        Get-PnPFile -ServerRelativeUrl $File.ServerRelativeUrl -Path $CurSaveFold -FileName $File.Name -AsFile
        Write-Host -f Green "`tDownloaded File from '$($File.ServerRelativeUrl)' to $CurSaveFold"
    }

    # Recursively call this subfolder hunter on found folders in the subfolders
    foreach ($folder in $SubDirFolders) {
        subfolderhunter $folder.Name $WorkingSubdir $folder.ServerRelativeUrl
    }
}

Conclusion

That’s it. One little bit of script to initialise your locations, and check PowerShell versions. One bit of script to list the files and folders in the top levels of the sites in your organisation. One bit of script to then dig around in the subfolders and download all the files in there.

Bolt these parts together in one script (functions first, so they already exist when the initialisation block calls them), set it running in the background, and you can cheerfully answer emails while preparing for the cloud crash apocalypse.
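If it helps, here is one possible skeleton for the combined script. The file name and numbering are just illustrative, not part of the original code:

# Backup-SharePointSites.ps1 (illustrative name) - one way to lay out the combined script

# 1. The 'subfolder hunter' function, defined first so it exists when BackupDownload calls it
function subfolderhunter($folder, $RelFolder, $SRU) {
    # ... body from the 'Subfolder Hunter' section above ...
}

# 2. The main backup function, which loops over the sites and calls subfolderhunter
Function BackupDownload() {
    # ... body from 'The Main Backup Script' section above ...
}

# 3. Finally, the initialisation block: variables, PowerShell version check, Connect-PnPOnline, and the call to BackupDownload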

Hopefully this will reduce the time it takes you to back things up, and mean you can willingly do it more frequently!

By James Hawkes
