Revolutionizing Sitecore Personalize: Unleashing the Power of Generative AI Function Calling


Welcome back, friends! In today's blog post, we're building upon my previous article where I discussed using the SDK to create a flow (also known as an Experience). We're taking it a step further by combining this topic with Generative AI. Specifically, we'll explore Function Calling—a concept that allows you to inform Generative AI about available functions and have it select the appropriate function and build its argument object based on the given requirements. This powerful approach enables AI to dynamically interact with predefined functionalities.

I'm excited to announce a new repository I've been working on that leverages this capability. It's called the Sitecore Assistant. While others may have created similar tools in the past, I wanted to build something tailored for the new composable SaaS landscape—a tool you can run locally or host remotely, allowing users to interact with their Sitecore Cloud instance.

I have big dreams for this app, combining AI with the Personalize SDK and potentially more SDKs in the future. You can find the repository on GitHub at dylanyoung-dev/sitecore-assistant: a local (or hosted) app that users can configure for their Cloud Portal environment to handle advanced use cases.

It's a straightforward Next.js application. If you prefer not to use the AI capabilities, simply don't add the OpenAI API key, and it'll disable the chat feature. This project is still a work in progress, so stay tuned for future announcements.

I'm considering building a Nextron shell (since it's using Next.js) in the future, which would allow it to run as an executable on your PC or Mac. Sometimes I feel like I have too many ideas! If you're interested in contributing, reach out to me, and we can discuss the vision further.

This project represents a topic tangential to today's discussion, but one I hope to explore more in the future: how to extend Sitecore XM Cloud or other composable products with new functionality. In the past, you'd build modules and install them directly into Sitecore XM/XP, but this approach required modifying the core code—a practice that doesn't align well with our current SaaS landscape. Today's post offers a glimpse into this new project, and I plan to delve deeper into this topic in future articles. Stay tuned!

What is Generative AI Function Calling?

Today, we'll utilize the chat interface I've built into that application, incorporating OpenAI Function Calling. This powerful concept works as follows: when a user asks a question like "Can you help me create an experience?", it sends the request to an API endpoint I've configured in Next.js. Behind the scenes, it executes a series of tasks that unfold like this:

  • Pass the current message and all available function specifications to the LLM.
  • Based on the input and available functions, it determines if any functions can be used and if it has the required input to call them. If not, it returns details to the user, prompting for more information or guiding them towards the correct function (if multiple are defined).
  • If it has all the necessary information, it returns the function and required parameters to the code running in the API request. You can then parse and run the function, or—as we'll see—use OpenAI's beta functionality to define a function structure and the function to run, allowing it to handle the function call for you.
  • Once the function returns data (such as from an API call or web scraping), it passes this back to the LLM, which formulates a response based on your original request.
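
Before we get to the beta tool runner used in the repository, here's a minimal sketch of what that loop looks like if you handle the function call yourself with the standard chat completions API. The tool spec and the getExperience helper are hypothetical placeholders, not the real functions from the Sitecore Assistant:

import OpenAI from "openai";
import type { ChatCompletionMessageParam } from "openai/resources/chat/completions";

const client = new OpenAI({ apiKey: process.env.OpenAI_API_KEY });

// Hypothetical spec: the only function the model is allowed to call in this sketch.
const tools = [
  {
    type: "function" as const,
    function: {
      name: "get_experience",
      description: "Returns details about a Personalize experience by its ref.",
      parameters: {
        type: "object",
        properties: { ref: { type: "string", description: "The experience reference id." } },
        required: ["ref"],
      },
    },
  },
];

// Placeholder for whatever actually fetches the data (SDK call, REST request, etc.).
async function getExperience(ref: string) {
  return { ref, name: "Example Experience" };
}

export async function ask(messages: ChatCompletionMessageParam[]) {
  // 1. Pass the conversation plus the available function specifications to the LLM.
  const first = await client.chat.completions.create({ model: "gpt-4o-mini", messages, tools });
  const reply = first.choices[0].message;

  // 2. No tool call means the model is answering directly or asking for more information.
  if (!reply.tool_calls?.length) return reply.content;

  // 3. The model picked a function and extracted its parameters; run it ourselves.
  const call = reply.tool_calls[0];
  const args = JSON.parse(call.function.arguments);
  const result = await getExperience(args.ref);

  // 4. Feed the result back so the model can phrase the final response.
  const second = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      ...messages,
      reply,
      { role: "tool", tool_call_id: call.id, content: JSON.stringify(result) },
    ],
    tools,
  });
  return second.choices[0].message.content;
}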

This concept becomes incredibly powerful when you consider its potential. You could chain functions together, combine it with a Retrieval-Augmented Generation (RAG) approach, or even integrate it with a custom or fine-tuned Language Model (LLM) to achieve highly refined, actionable results. In today's examples, we'll leverage the Sitecore Personalize SDK—which I've been discussing over the past few months—and combine it with our Chat application. This integration will enable our Chat application to create experiences on our behalf.

Creating an Experience in Personalize

Let's dive into our code to see how we can create an experience using a chat application. I've already shared the repository in the first part of this article, so refer there for the full source. Our application is currently built with Next.js, as I plan to host it on the web and need to keep my OpenAI key secure. This setup is crucial to understand because the chat application features a chat box (running inside a modal) where users input their commands. These commands are then sent to an API route hosted in Next.js, which handles all the OpenAI magic with Function Calling. This API route is responsible for sending code to OpenAI, managing system messages, defining and using Function Calling, and providing the functions that the code will execute. We also pass in all currently configured clients, which are used to authenticate with various services for asset creation.
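
For context, the chat modal's side of this is just a plain fetch to that API route. Here's a minimal sketch of what that call might look like, assuming a /api/chat route path and mirroring the ChatRequest shape shown below:

// Hypothetical client-side helper used by the chat modal; the route path is an assumption.
async function sendChatMessage(
  message: string,
  history: { sender: "user" | "assistant"; content: string }[],
  clients: unknown[]
): Promise<string> {
  const res = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Matches the ChatRequest interface the API route expects.
    body: JSON.stringify({ message, messages: history, clients }),
  });

  if (!res.ok) throw new Error("Chat request failed");

  const data = await res.json();
  return data.message; // the finalContent string returned by the route
}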

Now, let's examine the API route code:

import OpenAI from "openai";
import { NextRequest, NextResponse } from "next/server";
import { ChatCompletionMessageParam } from "openai/resources/chat/completions";
// ClientData and the tool definitions below come from the project's own modules.

const client = new OpenAI({ apiKey: process.env.OpenAI_API_KEY });

interface Message {
  sender: "user" | "assistant";
  content: string;
}

interface ChatRequest {
  message: string;
  messages: Message[];
  clients: ClientData[];
}

export async function POST(req: NextRequest) {
  try {
    const { message, messages, clients }: ChatRequest = await req.json();

    const combinedMessages: ChatCompletionMessageParam[] = [
      {
        role: "system",
        content:
          "You are the Sitecore Assistant which will help users create Sitecore assets in the Sitecore SaaS products. When running Sitecore Personalize apis that require code, always use EcmaScript 5.0 javascript that would work with Server Side Nashorn Javascript Engine.",
      },
      ...messages.map((msg) => ({
        role: msg.sender,
        content: msg.content,
      })),
      { role: "user", content: message },
    ];

    console.log({ clients });

    const runner = await client.beta.chat.completions
      .runTools({
        model: "gpt-4o-mini",
        messages: combinedMessages,
        tools: [
          await CreateExperienceTool(clients),
          await GetFlowsTool(clients),
          await ListOfExperiencesTool(clients),
        ],
      })
      .on("message", (message) => {
        console.log(message);
      })
      .on("functionCallResult", (result) => {
        console.log("result", result);
      });

    const finalContent = await runner.finalContent();

    return NextResponse.json({ message: finalContent });
  } catch (error) {
    console.error("Error generating AI Response:", error);
    return NextResponse.json(
      { error: "Error generating AI Response" },
      { status: 500 }
    );
  }
}

Let's dissect this code. Within our API route, which processes a POST request from our chat application, we find the following crucial segment:

const combinedMessages: ChatCompletionMessageParam[] = [
  {
    role: "system",
    content:
      "You are the Sitecore Assistant which will help users create Sitecore assets in the Sitecore SaaS products. When running Sitecore Personalize apis that require code, always use EcmaScript 5.0 javascript that would work with Server Side Nashorn Javascript Engine.",
  },
  ...messages.map((msg) => ({
    role: msg.sender,
    content: msg.content,
  })),
  { role: "user", content: message },
];

This code serves two primary functions. First, it provides initial context to our Language Model (LLM), instructing it on its identity and the services it's designed to offer. It also supplies context regarding the JavaScript format required for Sitecore Personalize. Second, the code manages short-term memory considerations. This is crucial as we aim for a conversational interaction with our LLM, allowing us to craft our Personalize experience through multiple prompts.
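
To make the short-term memory point concrete, after one exchange the array sent to the model might look like this (illustrative values only):

const combinedMessages = [
  // Initial context: identity and output constraints for the assistant.
  { role: "system", content: "You are the Sitecore Assistant which will help users create Sitecore assets..." },
  // Everything said so far is replayed on every request, which is what gives us "memory".
  { role: "user", content: "Can you help me create an experience?" },
  { role: "assistant", content: "Sure. What should it be called, and is it a Web, API, or Triggered experience?" },
  // The newest prompt always goes last.
  { role: "user", content: "Call it Summer Sale Popup and make it a Web experience." },
];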

Next, let's examine a crucial part of the code:

const runner = await client.beta.chat.completions
  .runTools({
    model: "gpt-4o-mini",
    messages: combinedMessages,
    tools: [
      await CreateExperienceTool(clients),
      await GetFlowsTool(clients),
      await ListOfExperiencesTool(clients),
    ],
  })
  .on("message", (message) => {
    console.log(message);
  })
  .on("functionCallResult", (result) => {
    console.log("result", result);
  });

This code snippet is crucial for enabling AI Function Calling. We're utilizing the OpenAI Node SDK (openai/openai-node on GitHub, the official Node.js/TypeScript library for the OpenAI API), specifically its new beta feature for running Tools in chat completions. We're also employing OpenAI's gpt-4o-mini model. The real magic happens when we define our tools—in this case, three examples that perform these key actions:

  • Creating a Web Experience
  • Getting Back a Specific Experience/Experiment based on a ref
  • Getting a List of Experiences

One fascinating aspect I discovered while working with the LLM is its ability to extract required values from unexpected sources. This capability stems from its understanding of function requirements. For instance, after creating an experience, you can prompt the LLM to retrieve additional details about it. It accomplishes this by running the GetFlowsTool, knowing that the reference ID—obtained from the experience creation—is sufficient for this task.
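
I won't reproduce GetFlowsTool here, but its definition follows the same shape as the CreateExperienceTool we'll look at next. A rough, hypothetical sketch (the function name, parameter, and getExperienceByRef helper are assumptions; check the repository for the real version):

export const GetFlowsTool = async (clients: ClientData[]) => ({
  type: 'function' as const,
  product: ProductOptions.PersonalizeCDP,
  function: {
    name: 'get_personalization_experience',
    description: 'Retrieves details for an existing experience or experiment using its reference id.',
    parameters: {
      type: 'object',
      properties: {
        ref: {
          type: 'string',
          description: 'The reference id of the experience, e.g. the one returned when it was created.',
        },
      },
      required: ['ref'],
    },
    function: async (args: { ref: string }) => {
      // Hypothetical helper that would call the Personalize SDK using the configured clients.
      return await getExperienceByRef(args.ref, clients);
    },
    parse: JSON.parse,
  },
});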

Now, let's examine the final crucial component: the function definition and the function that executes when all the correct parameters for creating an experience are provided. Let's take a look at the code for CreateExperienceTool:

export const CreateExperienceTool = async (clients: ClientData[]) => ({
  type: 'function' as const,
  product: ProductOptions.PersonalizeCDP,
  function: {
    name: 'create_personalization_experience',
    type: 'function',
    description: 'Creates a new personalization experience in Sitecore Personalize.',
    parameters: {
      type: 'object',
      properties: {
        name: {
          type: 'string',
          description: 'The name of the personalization experience.',
        },
        type: {
          type: 'string',
          enum: ['Web', 'API', 'Triggered'],
          description: 'The type of the experience.',
        },
        channels: {
          type: 'array',
          items: {
            type: 'string',
            enum: ['Call Center', 'Email', 'Mobile App', 'Mobile Web', 'Web', 'SMS'],
          },
          description: 'The channels for the experience.',
        },
        assets: {
          type: 'object',
          properties: {
            html: {
              type: 'string',
              description: 'The HTML content for the experience, use pure HTML only.',
            },
            css: {
              type: 'string',
              description: 'The CSS content for the experience, do not use precompiled CSS, only pure CSS.',
            },
            js: {
              type: 'string',
              description:
                'The JS content for the experience which needs to use Nashorn Engine compatible ES5 Javascript.',
            },
            freemarker: {
              type: 'string',
              description:
                'This is used to define the API response information using free marker syntax for the experience.',
            },
          },
        },
      },
      required: ['name', 'type', 'channels'],
    },
    function: async (args: { params: any }, runner: any) => {
      return await createPersonalizationExperience(args, clients);
    },
    parse: JSON.parse,
  },
});

There's a substantial amount of code here, so let's focus on the key points rather than delving into every detail. First, we define the overall description of our function. This description informs our LLM about the function's purpose and use case. Next, we have the parameters section, which describes all the fields or parameters that need to be provided. There's also a separate required property, as some fields aren't necessary in this example. The assets—such as JavaScript, HTML, and CSS—are only needed if we have a variant. As we've discussed previously, these aren't always required to create an experience, though they're certainly valuable to have (especially when provided by our LLM). We can specifically instruct the LLM to generate the HTML, CSS, and JavaScript for our variant by requesting it to create these values.
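
Based on that schema, the argument object the LLM hands back for a web popup request might look something like this (values are purely illustrative):

// Illustrative arguments the model might produce once it has everything it needs.
const exampleArgs = {
  name: 'Discount Popup',
  type: 'Web',
  channels: ['Web'],
  assets: {
    html: '<div class="promo">You have a discount available!</div>',
    css: '.promo { background: #ffffff; padding: 16px; text-align: center; }',
    js: "var promo = document.querySelector('.promo'); // plain ES5, per the schema description",
  },
};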

Finally, at the bottom of the function definition, we have:

function: async (args: { params: any }, runner: any) => {
  return await createPersonalizationExperience(args, clients);
},
parse: JSON.parse

Here's how it works: Once the LLM determines all the necessary parameters, it executes the function using those parameters. The parse function simply parses the parameters passed through the args parameter. After the function runs, its result is fed back into the LLM, which then generates a response based on the function's output. Let's take a quick look at the code for the createPersonalizationExperience(args, clients) function.

export const createPersonalizationExperience = async (args: any, clients: ClientData[]) => {
  let personalizeClient;
  if (args !== undefined && clients !== undefined) {
    const clientDetails = clients.find((client) => client.product === ProductOptions.PersonalizeCDP);

    if (!clientDetails) {
      return {
        status: 'error',
        message: 'You must have a client configured for Personalize/CDP to create an experience.',
      };
    }

    personalizeClient = new Client({
      clientId: clientDetails.clientId,
      clientSecret: clientDetails.clientSecret,
      region: mapRegion(clientDetails.region),
    } as IClientInitOptions);
    const flowTypeMapping = mapFlowType(args.type);

    console.log(args);

    if (!flowTypeMapping) {
      console.log(flowTypeMapping);
      return {
        status: 'error',
        message: `Invalid flow type: ${args.type}; it should match one of the following: Web, API, Triggered`,
      };
    }

    const experience: IFlowDefinition = {
      name: args.name,
      friendlyId: args.name.toLowerCase().replace(/\s+/g, '_'),
      type: flowTypeMapping,
      channels: args.channels.map((channel: string) => FlowChannel[channel as keyof typeof FlowChannel]),
      status: FlowStatus.Draft,
      schedule: {
        type: FlowScheduleType.Simple,
        startDate: new Date().toISOString(),
      },
    };

    // Check if any of the fields in assets are provided
    if (args.assets && (args.assets.html || args.assets.js || args.assets.freemarker || args.assets.css)) {
      experience.variants = [
        {
          name: 'Default Variant',
          assets: {
            html: args.assets.html || '',
            js: args.assets.js || '',
            css: args.assets.css || '',
          },
          templateVariables: {},
          ...(args.assets.freemarker && {
            tasks: [personalizeClient.Flows.CreateTemplateRenderTaskInput(args.assets.freemarker)],
          }),
        },
      ];
    }

    try {
      console.log('Creating personalization experience:', experience);
      let response = await personalizeClient.Flows.CreateExperience(experience);

      return {
        status: 'success',
        message: 'Personalization experience created successfully.',
      };
    } catch (error: any) {
      return {
        status: 'error',
        message: `Failed to create personalization experience: ${error.message}`,
      };
    }
  }

  console.log('Creating personalization experience:', args?.params);
  return {
    status: 'fail',
    message: 'Parameters are missing for creating a personalization experience.',
  };
};

I won't go through the code line by line, but I'll explain the logic to help you understand what's happening. First, we perform some type checking to ensure we have valid clients and arguments. We also verify that the available clients are for Sitecore Personalize—a specific requirement for our application that might differ depending on your use case.

After confirming the passed data, we create a Personalize Client instance, initializing the Sitecore Personalize SDK. We then create a new IFlowDefinition object to build our experience. We check for any asset information for our variant and include it if available. With a valid Flow Definition, we call personalizeClient.Flows.CreateExperience(experience);, passing in our newly created definition. The response is then used to inform the app of successful completion. We could enhance this by returning the flow definition details, which the LLM could then display. This represents the current state of our proof of concept, with more iterations and improvements on the horizon. Keep an eye on the repository for future updates—the potential use cases are only going to expand from here.
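
As one example of that enhancement, the success branch could hand the created flow back to the LLM so it has something concrete to summarize. A small sketch, assuming the SDK's CreateExperience call resolves with the created flow definition:

const response = await personalizeClient.Flows.CreateExperience(experience);

return {
  status: 'success',
  message: 'Personalization experience created successfully.',
  // Assumption: the SDK returns the created flow, so the LLM can echo the ref, friendlyId, etc.
  flow: response,
};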

The Flow

We've explored how to create an experience using AI Function Calling with an LLM (in our case, the OpenAI GPT model). But what does the application flow look like today? It's actually quite simple to create our asset. If you're new to running this app, you can simply ask, "What are my options?" This will return all the Function Calling definitions available to you. You could also say, "I'd love to create a web experience today." This helps the system understand that you're trying to create an experience, specifically a "Web" experience, which it can use for some of the parameters.

Now, with these basic instructions, the system would likely create an experience without a variant, which isn't particularly helpful. To enhance this, you could ask something like: "It would be great if the experience used a web popup to display that I have a discount available. Could you help me generate this?" This prompt signals to the LLM that you want help generating code for your experience, which will become your default variant. Typically, after running this, it'll generate a sample and ask if you like the output and want to proceed with creating the experience. It's quite impressive that once you confirm, it will create your experience using the function code we previously discussed.

Conclusion

AI Function Calling is undoubtedly impressive, but there are some obvious challenges with this approach. What if we wanted to incorporate all the different actions available in the Sitecore Personalize SDK? That would require defining even more functions, which would consume additional tokens to store this context. However, Function Calling offers even more possibilities. I'll be back soon to explore more topics related to this tool as I develop a highly effective LLM tool for Sitecore Personalize, and eventually for other products in the Sitecore Composable stack.
