trim_messages
- langchain_core.messages.utils.trim_messages(messages: Sequence[MessageLikeRepresentation] | None = None, **kwargs: Any) -> List[BaseMessage] | Runnable[Sequence[MessageLikeRepresentation], List[BaseMessage]]
Trim messages to be below a token count.
- Parameters:
messages (Optional[Sequence[MessageLikeRepresentation]]) – Sequence of Message-like objects to trim.
max_tokens – Maximum token count of the trimmed messages.
token_counter – Function or LLM for counting tokens in a BaseMessage or a list of BaseMessage. If a BaseLanguageModel is passed in, then BaseLanguageModel.get_num_tokens_from_messages() will be used.
strategy – Strategy for trimming. “first”: keep the first <= max_tokens tokens of the messages. “last”: keep the last <= max_tokens tokens of the messages. Default is “last”.
allow_partial – Whether to split a message if only part of the message can be included. If strategy="last", then the last partial contents of a message are included. If strategy="first", then the first partial contents of a message are included. Default is False.
end_on – The message type to end on. If specified, then every message after the last occurrence of this type is ignored. If strategy="last", this is done before we attempt to get the last max_tokens. If strategy="first", this is done after we get the first max_tokens. Can be specified as string names (e.g. “system”, “human”, “ai”, …) or as BaseMessage classes (e.g. SystemMessage, HumanMessage, AIMessage, …). Can be a single type or a list of types. Default is None.
start_on – The message type to start on. Should only be specified if strategy="last". If specified, then every message before the first occurrence of this type is ignored. This is done after we trim the initial messages to the last max_tokens. Does not apply to a SystemMessage at index 0 if include_system=True. Can be specified as string names (e.g. “system”, “human”, “ai”, …) or as BaseMessage classes (e.g. SystemMessage, HumanMessage, AIMessage, …). Can be a single type or a list of types. Default is None.
include_system – Whether to keep the SystemMessage if there is one at index 0. Should only be specified if strategy="last". Default is False.
text_splitter – Function or langchain_text_splitters.TextSplitter for splitting the string contents of a message. Only used if allow_partial=True. If strategy="last", then the last split tokens from a partial message will be included. If strategy="first", then the first split tokens from a partial message will be included. The text splitter is assumed to keep separators, so that the split contents can be directly concatenated to recreate the original text. Defaults to splitting on newlines.
kwargs (Any) –
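As a rough mental model of how strategy and include_system interact when allow_partial=False, the sketch below trims a list of plain tuples rather than real messages. It is a simplified illustration under stated assumptions, not the library's implementation: Msg, count_tokens, and trim_whole_messages are hypothetical names, every message is assumed to cost a flat 10 tokens, and end_on, start_on, and partial splitting are ignored.

```python
from typing import List, NamedTuple


class Msg(NamedTuple):
    """Hypothetical stand-in for a chat message (not a LangChain class)."""
    role: str
    text: str


def count_tokens(msgs: List[Msg]) -> int:
    # Assumption for illustration: every message costs a flat 10 tokens.
    return 10 * len(msgs)


def trim_whole_messages(
    msgs: List[Msg],
    max_tokens: int,
    strategy: str = "last",
    include_system: bool = False,
) -> List[Msg]:
    """Keep the longest prefix ("first") or suffix ("last") within the budget."""
    if strategy not in ("first", "last"):
        raise ValueError(f"Unrecognized strategy: {strategy!r}")
    if include_system and strategy != "last":
        raise ValueError("include_system requires strategy='last'")

    if strategy == "first":
        kept: List[Msg] = []
        for msg in msgs:
            if count_tokens(kept + [msg]) > max_tokens:
                break
            kept.append(msg)
        return kept

    # strategy == "last": optionally reserve room for a leading system message.
    system: List[Msg] = []
    rest = list(msgs)
    if include_system and rest and rest[0].role == "system":
        system, rest = [rest[0]], rest[1:]

    kept = []
    for msg in reversed(rest):
        if count_tokens(system + [msg] + kept) > max_tokens:
            break
        kept.insert(0, msg)
    return system + kept


history = [
    Msg("system", "s"),
    Msg("human", "first"),
    Msg("ai", "second"),
    Msg("human", "third"),
    Msg("ai", "fourth"),
]

first_kept = trim_whole_messages(history, max_tokens=30, strategy="first")
last_kept = trim_whole_messages(history, max_tokens=30, include_system=True)
print([m.text for m in first_kept])  # ['s', 'first', 'second']
print([m.text for m in last_kept])   # ['s', 'third', 'fourth']
```

With include_system=True the system message is charged against the budget first, so only 20 of the 30 tokens remain for the most recent messages.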
- Returns:
List of trimmed BaseMessages.
- Raises:
ValueError – if two incompatible arguments are specified or an unrecognized strategy is specified.
- Return type:
Union[List[BaseMessage], Runnable[Sequence[MessageLikeRepresentation], List[BaseMessage]]]
Example
from typing import List

from langchain_core.messages import (
    trim_messages,
    AIMessage,
    BaseMessage,
    HumanMessage,
    SystemMessage,
)

messages = [
    SystemMessage("This is a 4 token text. The full message is 10 tokens."),
    HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="first"),
    AIMessage(
        [
            {"type": "text", "text": "This is the FIRST 4 token block."},
            {"type": "text", "text": "This is the SECOND 4 token block."},
        ],
        id="second",
    ),
    HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
    AIMessage("This is a 4 token text. The full message is 10 tokens.", id="fourth"),
]


def dummy_token_counter(messages: List[BaseMessage]) -> int:
    # treat each message like it adds 3 default tokens at the beginning
    # of the message and at the end of the message. 3 + 4 + 3 = 10 tokens
    # per message.
    default_content_len = 4
    default_msg_prefix_len = 3
    default_msg_suffix_len = 3

    count = 0
    for msg in messages:
        if isinstance(msg.content, str):
            count += default_msg_prefix_len + default_content_len + default_msg_suffix_len
        if isinstance(msg.content, list):
            count += default_msg_prefix_len + len(msg.content) * default_content_len + default_msg_suffix_len
    return count
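To make the arithmetic of dummy_token_counter concrete, the following sketch applies the same counting rule to stand-in objects so that it runs without LangChain installed. FakeMessage and count are hypothetical names; only the rule itself (3 prefix tokens, 4 per string or per content block, 3 suffix tokens) comes from the example.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class FakeMessage:
    """Hypothetical stand-in exposing only a `content` attribute."""
    content: Union[str, List[dict]]


def count(messages: List[FakeMessage]) -> int:
    # Same rule as dummy_token_counter: 3 prefix tokens, 4 tokens per string
    # (or per content block), and 3 suffix tokens for each message.
    total = 0
    for msg in messages:
        blocks = 1 if isinstance(msg.content, str) else len(msg.content)
        total += 3 + 4 * blocks + 3
    return total


text_msg = FakeMessage("This is a 4 token text. The full message is 10 tokens.")
block_msg = FakeMessage(
    [
        {"type": "text", "text": "This is the FIRST 4 token block."},
        {"type": "text", "text": "This is the SECOND 4 token block."},
    ]
)

print(count([text_msg]))                    # 10
print(count([block_msg]))                   # 14
print(count([text_msg] * 4 + [block_msg]))  # 54
```

Under this counter the five example messages total 54 tokens. The first two messages cost 20 tokens, and adding the 14-token AIMessage would reach 34, so a 30-token budget only admits it when one of its two blocks is dropped (3 + 4 + 3 = 10).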
- First 30 tokens, not allowing partial messages:
trim_messages(messages, max_tokens=30, token_counter=dummy_token_counter, strategy="first")

[
    SystemMessage("This is a 4 token text. The full message is 10 tokens."),
    HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="first"),
]
- First 30 tokens, allowing partial messages:
trim_messages(
    messages,
    max_tokens=30,
    token_counter=dummy_token_counter,
    strategy="first",
    allow_partial=True,
)

[
    SystemMessage("This is a 4 token text. The full message is 10 tokens."),
    HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="first"),
    AIMessage([{"type": "text", "text": "This is the FIRST 4 token block."}], id="second"),
]
- First 30 tokens, allowing partial messages, have to end on HumanMessage:
trim_messages(
    messages,
    max_tokens=30,
    token_counter=dummy_token_counter,
    strategy="first",
    allow_partial=True,
    end_on="human",
)

[
    SystemMessage("This is a 4 token text. The full message is 10 tokens."),
    HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="first"),
]
- Last 30 tokens, including system message, not allowing partial messages:
trim_messages(messages, max_tokens=30, include_system=True, token_counter=dummy_token_counter, strategy="last")

[
    SystemMessage("This is a 4 token text. The full message is 10 tokens."),
    HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
    AIMessage("This is a 4 token text. The full message is 10 tokens.", id="fourth"),
]
- Last 40 tokens, including system message, allowing partial messages:
trim_messages(
    messages,
    max_tokens=40,
    token_counter=dummy_token_counter,
    strategy="last",
    allow_partial=True,
    include_system=True,
)

[
    SystemMessage("This is a 4 token text. The full message is 10 tokens."),
    AIMessage(
        [{"type": "text", "text": "This is the FIRST 4 token block."}],
        id="second",
    ),
    HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
    AIMessage("This is a 4 token text. The full message is 10 tokens.", id="fourth"),
]
- Last 30 tokens, including system message, allowing partial messages, end on HumanMessage:
trim_messages(
    messages,
    max_tokens=30,
    token_counter=dummy_token_counter,
    strategy="last",
    end_on="human",
    include_system=True,
    allow_partial=True,
)

[
    SystemMessage("This is a 4 token text. The full message is 10 tokens."),
    AIMessage(
        [{"type": "text", "text": "This is the FIRST 4 token block."}],
        id="second",
    ),
    HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
]
- Last 40 tokens, including system message, allowing partial messages, start on HumanMessage:
trim_messages(
    messages,
    max_tokens=40,
    token_counter=dummy_token_counter,
    strategy="last",
    include_system=True,
    allow_partial=True,
    start_on="human",
)

[
    SystemMessage("This is a 4 token text. The full message is 10 tokens."),
    HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
    AIMessage("This is a 4 token text. The full message is 10 tokens.", id="fourth"),
]
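The end_on and start_on filters used in the last few examples can be pictured as slicing a list by message type. The sketch below works on bare role strings and is only an approximation of the documented behavior: apply_end_on and apply_start_on are hypothetical helpers, and returning an empty list when the requested type never occurs is an assumption of this sketch.

```python
from typing import List, Sequence


def apply_end_on(roles: List[str], end_on: Sequence[str]) -> List[str]:
    """Drop everything after the last message whose type is in `end_on`."""
    kept = list(roles)
    while kept and kept[-1] not in end_on:
        kept.pop()
    return kept


def apply_start_on(
    roles: List[str], start_on: Sequence[str], include_system: bool = False
) -> List[str]:
    """Drop everything before the first message whose type is in `start_on`.

    A leading "system" entry is exempt when include_system is True.
    """
    head: List[str] = []
    rest = list(roles)
    if include_system and rest and rest[0] == "system":
        head, rest = rest[:1], rest[1:]
    while rest and rest[0] not in start_on:
        rest.pop(0)
    return head + rest


# Roles left after a hypothetical "last" trim that kept a partial AI message.
trimmed = ["system", "ai", "human", "ai"]

ended = apply_end_on(["system", "human", "ai", "human", "ai"], ["human"])
started = apply_start_on(trimmed, ["human"], include_system=True)
print(ended)    # ['system', 'human', 'ai', 'human']
print(started)  # ['system', 'human', 'ai']
```

The second result mirrors the start_on="human" example: the partial AIMessage that survived token trimming is discarded because it precedes the first HumanMessage, while the SystemMessage at index 0 is left alone.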