Other Utils¶
- oscopilot.utils.utils.send_chat_prompts(sys_prompt, user_prompt, llm)[source]¶
Sends a sequence of chat prompts to a language learning model (LLM) and returns the model’s response.
- Parameters:
sys_prompt (str) – The system prompt that sets the context or provides instructions for the language learning model.
user_prompt (str) – The user prompt that contains the specific query or command intended for the language learning model.
llm (object) – The language learning model to which the prompts are sent. This model is expected to have a chat method that accepts structured prompts.
- Returns:
The response from the language learning model, which is typically a string containing the model’s answer or generated content based on the provided prompts.
The function is a utility for simplifying the process of sending structured chat prompts to a language learning model and parsing its response, useful in scenarios where dynamic interaction with the model is required.
- oscopilot.utils.utils.random_string(length)[source]¶
Generates a random string of a specified length.
- Parameters:
length (int) – The desired length of the random string.
- Returns:
A string of random characters and digits of the specified length.
- Return type:
str
- oscopilot.utils.utils.num_tokens_from_string(string: str) int[source]¶
Calculates the number of tokens in a given text string according to a specific encoding.
- Parameters:
text (str) – The text string to be tokenized.
- Returns:
The number of tokens the string is encoded into according to the model’s tokenizer.
- Return type:
int
- oscopilot.utils.utils.parse_content(content, html_type='html.parser')[source]¶
Parses and cleans the given HTML content, removing specified tags, ids, and classes.
- Parameters:
content (str) – The HTML content to be parsed and cleaned.
type (str, optional) – The type of parser to be used by BeautifulSoup. Defaults to “html.parser”. Supported types include “html.parser”, “lxml”, “lxml-xml”, “xml”, and “html5lib”.
- Raises:
ValueError – If an unsupported parser type is specified.
- Returns:
The cleaned text extracted from the HTML content.
- Return type:
str
- oscopilot.utils.utils.clean_string(text)[source]¶
Cleans a given string by performing various operations such as whitespace normalization, removal of backslashes, and replacement of hash characters with spaces. It also reduces consecutive non-alphanumeric characters to a single occurrence.
- Parameters:
text (str) – The text to be cleaned.
- Returns:
The cleaned text after applying all the specified cleaning operations.
- Return type:
str
- oscopilot.utils.utils.chunks(iterable, batch_size=100, desc='Processing chunks')[source]¶
Breaks an iterable into smaller chunks of a specified size, yielding each chunk in sequence.
- Parameters:
iterable (iterable) – The iterable to be chunked.
batch_size (int, optional) – The size of each chunk. Defaults to 100.
desc (str, optional) – Description text to be displayed alongside the progress bar. Defaults to “Processing chunks”.
- Yields:
tuple – A chunk of the iterable, with a maximum length of batch_size.
- oscopilot.utils.utils.generate_prompt(template: str, replace_dict: dict)[source]¶
Generates a string by replacing placeholders in a template with values from a dictionary.
- Parameters:
template (str) – The template string containing placeholders to be replaced.
replace_dict (dict) – A dictionary where each key corresponds to a placeholder in the template and each value is the replacement for that placeholder.
- Returns:
The resulting string after all placeholders have been replaced with their corresponding values.
- Return type:
str