Troubleshooting Experience Editor Glitches in Your NextJS Sitecore Setup on Kubernetes: A New Challenge for the Sitecore Dinosaurs

Dear Sitecorians, guess what? Spring is almost here, and you know what that means: it’s time for this year’s SUGCON Europe! And the best part? It’s happening in Dublin this time! Check out all the details: SUGCON Europe 2024, April 11-12

Today I want to talk about a problem we had been dealing with for a while. We run Sitecore XP headless with NextJS on AKS (Azure Kubernetes Service). Everything was going smoothly until one day the editors started having issues with the Experience Editor: outages and snail-paced performance. The weird thing is, it wasn’t happening on all the sites, just two or three of them. And to top it off, the logs were no help at all.

After much troubleshooting and several failed attempts, we uncovered the root of the issue: our editors and visitors were sharing the same NextJS web applications (or pods). The increased visitor traffic was bogging down the Experience Editor, leading to the slowdowns and outages we observed.

Our initial solution was to increase the number of pods serving the web applications, hoping more replicas would alleviate the problem. Although there was some improvement, the “slow issues” persisted, and outages continued to plague us.

Ultimately, we found success by deploying dedicated web apps (pods) specifically for our editors. This approach eliminated the outages and significantly improved the Experience Editor’s performance. To achieve this, we adjusted our deployment manifest to include a separate Deployment for editing purposes, named nextjsproxy-cm-sandboxsite. This change allowed us to distinguish between the regular NextJS pods that serve visitors and the NextJS-CM pods dedicated to content management.

Here is an example of our Deployment manifest. Notice nextjsproxy-cm-sandboxsite 🙂

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nextjsproxy-sandboxsite
  labels:
    app: nextjsproxy-sandboxsite
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nextjsproxy-sandboxsite
  template:
    metadata:
      labels:
        app: nextjsproxy-sandboxsite
    spec:
      nodeSelector:
        agentpool: north-pool
      containers:
      - name: nextjsproxy-sandboxsite
        image: regsitecoreprod.azurecr.io/hedin-nextjsproxy-sandboxsite-prod
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
        env:
        - name: NEXTJS_DIST_DIR
          value: ".next-container"
        - name: SITECORE_API_HOST
          value: "http://sandbox.com"
        - name: PORT
          value: "3000"
        - name: SITECORE_API_KEY
          value: "{xxx-yyy-ZZZ-LLL-MMM}"
        - name: PUBLIC_URL
          value: "https://sandbox.com"
        - name: SITECORE_APP_NAME
          value: "SandboxSite"
        - name: "JSS_EDITING_SECRET"
          value: "my-secret-garden" 
        - name: FETCH_WITH
          value: "rest"
        startupProbe:
          httpGet:
            path: /api/healthcheck
            port: 3000
            httpHeaders:
            - name: X-Kubernetes-Probe
              value: Startup
          timeoutSeconds: 300
          periodSeconds: 30
          failureThreshold: 10
          ...

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nextjsproxy-cm-sandboxsite
  labels:
    app: nextjsproxy-cm-sandboxsite
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nextjsproxy-cm-sandboxsite
  template:
    metadata:
      labels:
        app: nextjsproxy-cm-sandboxsite
    spec:
      nodeSelector:
        agentpool: north-pool
      containers:
      - name: nextjsproxy-sandboxsite
        image: regsitecoreprod.azurecr.io/hedin-nextjsproxy-sandboxsite-prod
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
        env:
        - name: NEXTJS_DIST_DIR
          value: ".next-container"
        - name: SITECORE_API_HOST
          value: "http://sandbox.com"
        - name: PORT
          value: "3000"
        - name: SITECORE_API_KEY
          value: "{xxx-yyy-ZZZ-LLL-MMM}"
        - name: PUBLIC_URL
          value: "https://sandbox.com"
        - name: SITECORE_APP_NAME
          value: "SandboxSite"
        - name: "JSS_EDITING_SECRET"
          value: "my-secret-garden" 
        - name: FETCH_WITH
          value: "rest"
        startupProbe:
          httpGet:
            path: /api/healthcheck
            port: 3000
            httpHeaders:
            - name: X-Kubernetes-Probe
              value: Startup
          timeoutSeconds: 300
          periodSeconds: 30
          failureThreshold: 10
          ...

Here is an example of our Service manifest. Notice the nextjsproxy-cm-sandboxsite

apiVersion: v1
kind: Service
metadata:
  name: nextjsproxy-sandboxsite
spec:
  selector:
    app: nextjsproxy-sandboxsite
  ports:
  - protocol: TCP
    port: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: nextjsproxy-cm-sandboxsite
spec:
  selector:
    app: nextjsproxy-cm-sandboxsite
  ports:
  - protocol: TCP
    port: 3000

This configuration ensures a dedicated environment for our editors, isolating their activities from visitor traffic and enhancing the stability and performance of the Experience Editor.

One more thing to do, and that is to update Sitecore and the specific site. Let’s open up the Sitecore dashboard and go to the Settings for the site. Find the “Server side rendering engine endpoint URL” in the App Settings section. Enter the name of the service and remember to add the port number 🙂

This means that when the editors open the Experience Editor, it will be rendered by the dedicated pod we have set up. Here is the endpoint URL: http://nextjsproxy-cm-sandboxsite:3000/api/editing/render
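
To make the parts of that URL explicit, here is a minimal sketch of how it is put together: the host is the Kubernetes Service name, the port is the Service port, and the path is the editing render route. The helper editingEndpointUrl and its options are hypothetical names used only for illustration; they are not part of Sitecore JSS or of our setup. The namespace option and the "cluster.local" suffix are also assumptions: they would only matter if the Sitecore CM pod runs in a different namespace than the NextJS-CM service, using the default cluster domain.

```typescript
// Hypothetical helper, for illustration only: builds the value entered in
// "Server side rendering engine endpoint URL".
interface EditingEndpointOptions {
  service: string;    // Kubernetes Service name from the Service manifest
  port: number;       // Service port from the Service manifest
  namespace?: string; // assumption: only needed across namespaces
}

function editingEndpointUrl(opts: EditingEndpointOptions): string {
  // Inside the same namespace the bare Service name resolves through
  // cluster DNS. Across namespaces the fully qualified name is needed;
  // "cluster.local" is the default cluster domain (an assumption here).
  const host = opts.namespace
    ? `${opts.service}.${opts.namespace}.svc.cluster.local`
    : opts.service;
  return `http://${host}:${opts.port}/api/editing/render`;
}

// Same namespace: matches the URL used in this post.
console.log(editingEndpointUrl({ service: "nextjsproxy-cm-sandboxsite", port: 3000 }));
```

If the port is left out of the URL, the request goes to port 80, where this Service is not listening, which is why the port number matters.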

Deploying these changes and observing them through tools like Lens confirmed our success. The dedicated pods for content editing have markedly improved the editing experience, showcasing the importance of adapting our infrastructure to meet the evolving needs of our users and editors alike.

I hope this insight can help others facing similar challenges, demonstrating the value of dedicated resources for different user roles in a headless CMS environment powered by Sitecore and NextJS on Kubernetes.

That’s all for now folks 😊

