
My dear Sitecore friends, I hope Halloween was wonderful and full of treats! 🎃
With November upon us, Christmas is just around the corner, and it’s the perfect time to tidy up our Sitecore environments before the holiday season.
In today’s post, I’ll show you how to create a reliable backup for your Sitecore XM Cloud content using a GitHub Action workflow. By bundling everything into a single item package, you can keep your backups organized without tracking thousands of individual items in the repository.
Let’s dive into the details!
Why Use Item Packages?
Sitecore item packages bundle serialized content into a single file, which makes them a good fit for backups. A package captures every serialized item your module configuration includes (content, layouts, and media assets), so you have a snapshot of that content at a specific moment. This keeps the repository clean and manageable, especially when your site has thousands of items: instead of tracking each file individually, you get a single package that’s easy to restore and deploy across environments.
Why Use Git LFS?
Since the backup process includes media items, the resulting .itempackage file can be large. Committing large files directly to a Git repository can quickly bloat its size, making it slower to clone, pull, or push changes.
This is where Git LFS (Large File Storage) comes in. Git LFS is designed specifically for handling large files in Git by storing them outside the main repository. Instead of storing the entire file in Git, Git LFS replaces it with a lightweight pointer that references the file in LFS storage. This keeps the repository streamlined and ensures it stays fast and manageable, even with large media backups.
By tracking the .itempackage file with Git LFS, we get a single, comprehensive backup file without compromising repository performance, and the backup process stays efficient as content grows.
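To make the "lightweight pointer" idea concrete, here is a small sketch that prints the kind of pointer text Git LFS stores in the repository in place of the real file. The lfs_pointer function is a hypothetical helper written for illustration only (Git LFS generates pointers for you), and it assumes sha256sum from GNU coreutils is available, as it is on ubuntu-latest.

```shell
#!/bin/sh
# Sketch: print the pointer text Git LFS would store instead of the file itself.
# lfs_pointer is a hypothetical helper; Git LFS normally does this for you.
lfs_pointer() {
  file="$1"
  # SHA-256 of the file content identifies the object in LFS storage.
  oid=$(sha256sum "$file" | cut -d ' ' -f 1)
  # Size in bytes (tr strips padding that some wc implementations add).
  size=$(wc -c < "$file" | tr -d ' ')
  printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' "$oid" "$size"
}

# Demo with a throwaway file standing in for the .itempackage.
demo=$(mktemp)
printf 'hello\n' > "$demo"
lfs_pointer "$demo"
rm -f "$demo"
```

Whatever the size of the package, the pointer stays three short lines, which is why the repository remains quick to clone.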
Automating with a Cron Schedule
To make backups even easier, we can automate this workflow using a cron schedule. Automating the backup ensures it runs consistently, even if you forget to trigger it manually. Here are the benefits of setting a daily backup schedule:
- Reliability: Automated backups reduce human error by consistently running on schedule, ensuring you always have a recent backup.
- Consistency: Regular backups provide a reliable restore point, minimizing potential data loss and making recovery fast.
- Peace of Mind: With automated daily backups, you don’t need to worry about missing a backup if schedules get busy.
For this workflow, I’ve set up a cron schedule to run at midnight UTC every day ('0 0 * * *'). You can adjust the frequency to fit your needs—for example, weekly or monthly may be more appropriate if content changes less frequently.
Setting Up the Folder Structure
To streamline this GitHub Action workflow, it’s important to set up the right folder structure in your repository. Here’s how the structure should look:
backup content
├── serialized content
│ └── sandboxsite.module.json
In this setup:
- The backup content folder holds all files related to the backup process.
- The serialized content folder is where the serialized items and the module configuration (sandboxsite.module.json) are stored.
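If you are starting from an empty repository, the layout above can be created with a couple of commands. This is only a sketch: it writes an empty placeholder for sandboxsite.module.json (the real content is covered later in this post), and the quotes matter because both folder names contain spaces.

```shell
#!/bin/sh
# Create the folder layout; quotes are required because the names contain spaces.
mkdir -p "backup content/serialized content"

# Placeholder only: create an empty module file if none exists yet.
module_file="backup content/serialized content/sandboxsite.module.json"
[ -f "$module_file" ] || : > "$module_file"

# Show the resulting structure.
ls -R "backup content"
```

The same quoting applies anywhere these paths appear in the workflow, for example in working-directory values and git commands.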
Installing Sitecore CLI and Required Plugins
Since we’re using dotnet tool restore in the GitHub Action workflow, install the Sitecore CLI and the Serialization Plugin in the backup content folder. Here’s the process:
- Install the Sitecore CLI:
dotnet new tool-manifest
dotnet tool install Sitecore.CLI --version 6.0.18 --add-source https://sitecore.myget.org/F/sc-packages/api/v3/index.json
- Add the Serialization Plugin:
dotnet sitecore plugin add -n Sitecore.DevEx.Extensibility.Serialization --version 6.0.18
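This plugin is essential for handling serialized content and creating .itempackage files. The dotnet sitecore cloud commands used later in the workflow come from a separate plugin, Sitecore.DevEx.Extensibility.XMCloud, so add it the same way if your setup doesn’t already include it.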
Understanding the sandboxsite.module.json File
The sandboxsite.module.json file defines the specific items, media, and rules for serialization, controlling which items are included in the backup. Here’s a breakdown of the configuration options in this example:
{
  "$schema": "../.sitecore/schemas/ModuleFile.schema.json",
  "namespace": "Content.Sandbox.SandboxSite",
  "items": {
    "includes": [
      {
        "name": "sandboxsite-media",
        "path": "/sitecore/media library/Project/Sandbox/SandboxSite",
        "scope": "DescendantsOnly",
        "allowedPushOperations": "CreateAndUpdate",
        "rules": [
          {
            "path": "/system",
            "scope": "ignored"
          },
          {
            "path": "/Sitemaps",
            "scope": "ignored"
          }
        ]
      },
      {
        "name": "sandboxsite-media-shared",
        "path": "/sitecore/media library/Project/Sandbox/shared",
        "scope": "DescendantsOnly",
        "allowedPushOperations": "CreateAndUpdate"
      },
      {
        "name": "sandboxsite-content",
        "path": "/sitecore/content/Sandbox/SandboxSite",
        "database": "master",
        "allowedPushOperations": "CreateAndUpdate",
        "rules": [
          {
            "path": "/Home",
            "scope": "ItemAndDescendants",
            "allowedPushOperations": "CreateUpdateAndDelete"
          },
          {
            "path": "/Data",
            "scope": "ItemAndDescendants",
            "allowedPushOperations": "CreateUpdateAndDelete"
          },
          {
            "path": "/Dictionary",
            "scope": "ItemAndDescendants",
            "allowedPushOperations": "CreateUpdateAndDelete"
          },
          {
            "path": "/Presentation",
            "scope": "ItemAndDescendants",
            "allowedPushOperations": "CreateUpdateAndDelete"
          },
          {
            "path": "*",
            "scope": "Ignored"
          }
        ]
      }
    ]
  }
}
Key Configuration Options in sandboxsite.module.json
- namespace: Names the module. This is the value passed to -i in the pull command later on (Content.Sandbox.SandboxSite).
- includes: Lists the item and media paths included in the package. Each include can specify:
  - name: Describes the set of items or media included.
  - path: Defines the root path in Sitecore for serialization.
  - scope: Determines the level of content included (e.g., DescendantsOnly to include everything beneath the path but not the path item itself).
  - allowedPushOperations: Controls what operations can be performed on these items during deployment (e.g., CreateAndUpdate or CreateUpdateAndDelete).
  - rules: Specifies exceptions or specific sub-paths to ignore or process differently. For example, the rule for /system under sandboxsite-media uses "scope": "ignored", so this part of the media library is excluded.
This file customizes what content and media items are backed up, making sure only relevant items are included while excluding any unwanted sub-paths.
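Rule order matters in the sandboxsite-content include. As I understand the Sitecore Content Serialization documentation, the first rule that matches an item wins, which is why the catch-all * rule sits last. The sketch below mimics that behavior for the four included sub-paths. Note that resolve_scope is a hypothetical helper for illustration, not part of the Sitecore CLI, and it simplifies matching to plain path prefixes relative to the include root.

```shell
#!/bin/sh
# Hypothetical helper: mimics first-match-wins rule evaluation for the
# "sandboxsite-content" include. Paths are relative to the include root.
resolve_scope() {
  item_path="$1"
  # Rules in the same order as in sandboxsite.module.json.
  for rule in /Home /Data /Dictionary /Presentation; do
    case "$item_path" in
      # Match the rule's own item or anything beneath it.
      "$rule"|"$rule"/*)
        echo "ItemAndDescendants"
        return 0
        ;;
    esac
  done
  # Nothing matched earlier, so the trailing "*" rule applies.
  echo "Ignored"
}

resolve_scope "/Home/About"              # prints ItemAndDescendants
resolve_scope "/Settings/Site Grouping"  # prints Ignored
```

If the * rule were moved to the top of the list, it would match everything first and the include would end up empty, so keep catch-all rules at the bottom.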
GitHub Action Workflow Overview
Here’s the GitHub Action workflow that automates the backup process. It pulls content from your Sitecore XM Cloud production environment, packages it as an item package, and commits it to your repository using Git LFS. By running on a daily schedule, this workflow ensures your Sitecore content is consistently backed up and organized without manual effort.
name: Backup PROD content - XM Cloud
on:
  workflow_dispatch: # Allows manual triggering of the workflow
  schedule:
    - cron: '0 0 * * *' # Runs at midnight UTC every day
jobs:
  backup-prod-content:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4
      - name: Install .NET Core
        uses: actions/setup-dotnet@v2
        with:
          dotnet-version: '8.0.x'
      - name: Restore Sitecore CLI
        run: dotnet tool restore
        working-directory: "backup content"
      - name: Authenticate with Production Environment
        run: |
          dotnet sitecore cloud login --client-credentials --client-id ${{ secrets.XM_CLOUD_CLIENT_ID }} --client-secret ${{ secrets.XM_CLOUD_CLIENT_SECRET }} --allow-write
          dotnet sitecore cloud environment connect --environment-id ${{ secrets.XM_CLOUD_PROD_ENVIRONMENT_ID }} --allow-write true
        working-directory: "backup content"
      - name: Pull Serialized Items from Production
        run: dotnet sitecore ser pull -i Content.Sandbox.SandboxSite --environment-name prod
        working-directory: "backup content"
      - name: Create SCS Package
        run: dotnet sitecore ser pkg create -o "SandboxSite_Serialized_Content"
        working-directory: "backup content"
      - name: Install Git LFS
        run: git lfs install
      - name: Track large files with Git LFS
        run: git lfs track "backup content/SandboxSite_Serialized_Content.itempackage"
      - name: Add files from 'backup content'
        run: |
          git config user.name "GitHub Actions"
          git config user.email "actions@github.com"
          git add -f "backup content/SandboxSite_Serialized_Content.itempackage"
          git commit -m "Added serialized content package"
      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v7
        with:
          add-paths: |
            backup content/*.itempackage
Workflow Step-by-Step
- Checkout Repository: This clones the repository so the workflow can access files and prepare them for backup.
- Install .NET Core: Ensures that the GitHub runner has the required .NET environment, specifically for Sitecore CLI tools.
- Restore Sitecore CLI: Restores the necessary CLI tools specified in your project, so you’re ready to interact with the Sitecore environment.
- Authenticate with Production Environment: Authenticates securely using GitHub secrets, allowing access to your production environment.
- Pull Serialized Items: Runs the dotnet sitecore ser pull command, pulling content for the specified module (Content.Sandbox.SandboxSite) from the production environment and serializing it for backup.
- Create the SCS Package: Packages the serialized items into a .itempackage file, bundling everything into a single file that’s easy to manage and restore.
- Install Git LFS: Sets up Git LFS on the GitHub runner to manage large files within the repository. The ubuntu-latest image already ships with Git LFS; git lfs install enables its filters and hooks.
- Track and Commit the Package: Configures Git LFS to track the item package file, adds it to the repository, and commits it. This keeps the repository lightweight and handles large media files effectively.
- Create Pull Request: Automatically generates a pull request with the new backup, making it easy to review and merge changes.
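One thing to watch in the commit step: git commit exits with an error when nothing is staged, which would fail the run on a day when the package comes out identical to the previous one. A minimal guard, assuming you would rather skip the commit than fail the job, could look like this. The commit_if_staged function is a hypothetical helper of my own, not something the workflow above defines.

```shell
#!/bin/sh
# Hypothetical helper: commit only when something is actually staged,
# otherwise report it and return successfully.
commit_if_staged() {
  message="$1"
  # "git diff --cached --quiet" exits 0 when the index has no staged changes.
  if git diff --cached --quiet; then
    echo "No changes staged; skipping commit."
  else
    git commit -q -m "$message"
    echo "Committed: $message"
  fi
}

# Usage inside the workflow's run block, after the git add -f line:
#   commit_if_staged "Added serialized content package"
```

With this in place, a day without content changes simply produces no new commit instead of a red workflow run.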
Conclusion
This GitHub Action workflow streamlines the process of creating and storing backups for your Sitecore XM Cloud content in a single .itempackage file. By packaging the content, you avoid managing thousands of individual serialized files, and Git LFS keeps the large package from bloating the repository. With the holiday season approaching, it’s the perfect time to implement reliable backups and ensure your Sitecore content is safe and sound.
That’s all for now folks 😊
If you have questions or would like help setting this up, feel free to reach out—I’m always happy to connect! Whether it’s troubleshooting or bringing Sitecore expertise to your next project, let’s discuss how I can support your Sitecore goals.