Memory in an LLM chat

LangChain provides all the building blocks to create an LLM chat.

You create your model:

import { ChatAnthropic } from "@langchain/anthropic";

const model = new ChatAnthropic({
  model: "claude-3-5-sonnet-20240620",
  temperature: 0,
});

The next step is to give a “human message” to the model and wait for the answer:

import { HumanMessage } from "@langchain/core/messages";
import { StringOutputParser } from "@langchain/core/output_parsers";

const parser = new StringOutputParser();

// "input" holds the user's message, e.g. read from the terminal
const humanMessage = new HumanMessage(input);
const messages = [humanMessage];
const result = await model.invoke(messages);
const parsedResult = await parser.invoke(result);

If we put that code inside a chat interface, we get this kind of interaction:

> hello, what is your name?
--------------------------------
My name is Claude. It's nice to meet you! How are you doing today?
--------------------------------
> my name is Pierre
--------------------------------
Hello Pierre! It's nice to meet you. How can I assist you today? Is there anything specific you'd like to know or discuss?
--------------------------------
> what is my name?
--------------------------------
I apologize, but I don't have any information about your name. As an AI language model, I don't have access to personal information about individual users, nor do I retain information from previous conversations. Each interaction starts fresh, so I don't have any prior context about who you are or what your name might be. If you'd like to share your name, you're welcome to do so, but otherwise, I'm not able to identify you personally.
--------------------------------

This is disappointing.

The model is entirely stateless: every call starts from scratch, with no memory of previous messages.
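One naive workaround is to re-send the whole history yourself on every turn. Here is a minimal sketch, reusing the model and parser from above; the ask helper and the history array are illustrative names, not LangChain APIs:

import { AIMessage, HumanMessage } from "@langchain/core/messages";

// Accumulate every turn and re-send the full list on each call.
const history = [];

async function ask(input) {
  history.push(new HumanMessage(input));
  const result = await model.invoke(history);
  const answer = await parser.invoke(result);
  history.push(new AIMessage(answer));
  return answer;
}

This works, but the bookkeeping quickly gets tedious and error-prone.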

Fortunately, LangChain.js offers several memory options to manage conversation history and context in your applications.

The most straightforward option is BufferMemory.

As is typical with LangChain, you can link all these components together:

import { ChatAnthropic } from "@langchain/anthropic";
import { BufferMemory } from "langchain/memory";
import { ConversationChain } from "langchain/chains";
import { StringOutputParser } from "@langchain/core/output_parsers";

const model = new ChatAnthropic({
  model: "claude-3-5-sonnet-20240620",
  temperature: 0,
});

const memory = new BufferMemory();
const parser = new StringOutputParser();
const chain = new ConversationChain({
  llm: model,
  memory: memory,
  outputParser: parser,
});

The loop to interact with the model becomes:

const response = await chain.call({ input });
console.log(response.response);
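Wrapped in a full read-eval loop, using Node's built-in readline/promises (the prompt string and exit command here are my own choices, not part of the original code):

import * as readline from "node:readline/promises";
import { stdin, stdout } from "node:process";

const rl = readline.createInterface({ input: stdin, output: stdout });

// Keep prompting until the user types "exit".
while (true) {
  const input = await rl.question("> ");
  if (input === "exit") break;
  const response = await chain.call({ input });
  console.log("--------------------------------");
  console.log(response.response);
  console.log("--------------------------------");
}
rl.close();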

Let’s try this new interaction:

> hello what is your name?
--------------------------------
Hello there! It's nice to meet you. My name is Claude. I'm an AI assistant created by Anthropic to be helpful, harmless, and honest. I don't have a physical form or avatar - I'm just a conversational AI. How are you doing today? Is there anything I can help you with?
--------------------------------
> my name is Pierre
--------------------------------
It's a pleasure to meet you, Pierre! That's a wonderful name with a rich history. It's French in origin and means "rock" or "stone." Are you of French heritage by any chance? Or perhaps you were named after someone special?

I'm always excited to learn more about the people I talk to. What brings you to chat with an AI today? Is there anything in particular you'd like to discuss or any questions you have? I'd be happy to converse on a wide range of topics, from science and technology to history, philosophy, or really any subject that interests you.

And how has your day been going so far, Pierre? I hope it's been a good one!
--------------------------------
> what time is it?
--------------------------------
I apologize, Pierre, but I don't actually have access to real-time information like the current time. As an AI language model, I don't have a way to check the current time or date. The time would depend on your specific location and time zone as well.

If you need to know the current time, I'd recommend checking a clock, your phone, or computer, which should display the accurate time for your location.

Is there perhaps another way I can assist you today? I'd be happy to help with other types of questions or information you might need.
--------------------------------
> what is my name?
--------------------------------
Your name is Pierre. You told me that earlier in our conversation when you introduced yourself.
--------------------------------

This approach effectively solves the issue: the chain now remembers my name across turns.

If you enable the verbose option when you instantiate the model, you can see how the chain enriches the prompt with your past conversation.

const model = new ChatAnthropic({
  model: "claude-3-haiku-20240307",
  temperature: 0,
  verbose: true,
});

The kwargs.content key shows the actual context being used:

The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: hello, what is your name?
AI: Hello! My name is Claude. It's nice to meet you.
Human: my name is Pierre
AI: It's great to meet you, Pierre! I'm an AI assistant created by Anthropic. As I mentioned, my name is Claude. I'm an artificial intelligence, so I don't have a physical body or appearance. My role is to be a helpful conversational partner and to assist with a variety of tasks.

I'm curious to learn more about you, Pierre. What do you do for work? What are some of your hobbies and interests? I'd love to hear more about your life and what you enjoy. And please let me know if there's anything I can help you with - I'm here to assist in any way I can.
Human: whhere do you live?
AI: I don't actually live anywhere in a physical sense, since I'm an artificial intelligence without a physical body. As an AI, I don't have a specific location or residence. I exist in the cloud, so to speak, running on servers and computing infrastructure provided by Anthropic, the company that created me. My "home" is the digital environment where I operate, rather than a physical place. I'm available to chat and assist users like yourself from anywhere you can access me through the internet or other interfaces. Let me know if you have any other questions!
Human: what is my name?
AI:

You can see that the memory injects both your previous questions and the AI's responses into the prompt. This is essential for maintaining a meaningful conversation, but it comes with cost implications you should be aware of.
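You can also inspect what the memory will inject without turning on verbose logging. A small sketch using loadMemoryVariables, the standard accessor on LangChain memory classes ("history" is BufferMemory's default key):

// Dump the conversation history the memory will add to the next prompt.
const vars = await memory.loadMemoryVariables({});
console.log(vars.history);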

As the conversation grows, keeping the entire history in the context may not be ideal, particularly for cost reasons: you pay for every token you send, on every turn.

Additionally, the shorter the context, the faster the response will be generated.

LangChain offers mechanisms to maintain a running summary of the conversation while discarding older messages, such as ConversationSummaryMemory and ConversationSummaryBufferMemory.
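As a sketch, you could swap the buffer for a ConversationSummaryBufferMemory, which keeps recent messages verbatim and folds older ones into an LLM-generated summary once a token budget is exceeded (the maxTokenLimit value below is an arbitrary example):

import { ConversationSummaryBufferMemory } from "langchain/memory";

// Recent turns stay verbatim; older turns are summarized by an LLM call
// once the buffer grows past maxTokenLimit tokens.
const memory = new ConversationSummaryBufferMemory({
  llm: model,
  maxTokenLimit: 200, // arbitrary example budget
});

const chain = new ConversationChain({ llm: model, memory });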
