Chat Templates - Deep Learning Language Models

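A chat template is the format used to structure the inputs and outputs of a conversational model. It marks the roles and turn boundaries in the text, which helps the model follow the context of the conversation and generate an appropriate response.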

Base model

import torch
import transformers
from packaging import version

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = 'google/gemma-3-1b-pt'
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Requires a C compiler
base_model = AutoModelForCausalLM.from_pretrained(model_name, device_map='auto', torch_dtype='auto')
# Check the dtype
print(f"Model dtype: {base_model.dtype}")

Model dtype: torch.bfloat16

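input_texts = [
    # Opening of a sentence to be continued (base-style prompt):
    # "The base model of a language model is "
    '언어 모델의 기반 모델은 ',
    # Instruction to carry out (instruct-style prompt):
    # "Translate the following sentence into English: '언어 모델'"
    "다음 문장을 영어로 번역: '언어 모델'",
]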

model_inputs = tokenizer(input_texts, padding=True, return_tensors='pt').to(base_model.device)

with torch.inference_mode():
    outputs = base_model.generate(
        **model_inputs, 
        max_new_tokens=50, 
        do_sample=False,
        repetition_penalty=2.0,)

The following generation flags are not valid and may be ignored: ['top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.

decoded_texts = tokenizer.batch_decode(outputs, skip_special_tokens=True)
for i, (text, output) in enumerate(zip(input_texts, decoded_texts)):
    print(i+1)
    print(f'{text}', end=' -> ')
    print(f'{output[len(text):]}')
    print()

1
언어 모델의 기반 모델은  -> 텍스트를 읽고, 그에 따라 학습된 언어를 사용하여 새로운 문장을 생성합니다.

이러한 기계 번역 시스템을 사용하면 다양한 분야에서 중요한 역할을 합니다:
- 문서 및 웹 페이지 자동번역시스템 :

2
다음 문장을 영어로 번역: '언어 모델' -> 은 인간의 언어를 학습하는 데 사용되는 컴퓨터 프로그램입니다.

이 질문에 대한 답변을 찾으려면 다음 링크를 클릭하십시오 : https://www .quora - com/What-is the difference between a language model and an

Instruction-following model

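An instruction-following model understands the user's instructions and carries out the requested task. This is one of the core capabilities of conversational AI, and such models are designed to perform the task the user wants accurately. They are generally built on a pretrained language model and then fine-tuned for the target tasks.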

import torch
import transformers
from packaging import version

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = 'google/gemma-3-1b-it'
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Requires a C compiler
instruct_model = AutoModelForCausalLM.from_pretrained(model_name, device_map='auto', torch_dtype='auto')
# Check the dtype
print(f"Model dtype: {instruct_model.dtype}")

Model dtype: torch.bfloat16

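input_texts = [
    # Opening of a sentence to be continued (base-style prompt):
    # "The base model of a language model is "
    '언어 모델의 기반 모델은 ',
    # Instruction to carry out (instruct-style prompt):
    # "Translate the following sentence into English: '언어 모델'"
    "다음 문장을 영어로 번역: '언어 모델'",
]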

# Move the inputs to the model's device only. Token ids must stay integers,
# so the BatchEncoding is not cast to the model's dtype.
model_inputs = tokenizer(
    input_texts, padding=True, return_tensors='pt').to(instruct_model.device)

with torch.inference_mode():
    outputs = instruct_model.generate(
        **model_inputs,
        max_new_tokens=50,
        do_sample=False,
        repetition_penalty=2.0,)

The following generation flags are not valid and may be ignored: ['top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.

decoded_texts = tokenizer.batch_decode(outputs, skip_special_tokens=True)
for i, (text, output) in enumerate(zip(input_texts, decoded_texts)):
    print(i+1)
    print(f'{text}', end=' -> ')
    print(f'{output[len(text):]}')
    print()

1
언어 모델의 기반 모델은  -> 텍스트 데이터에 대한 학습을 통해 생성된 언어를 이해하고, 그 지식을 활용하여 새로운 문장을 만들거나 질문과 답변하는 능력을 갖추게 됩니다. 이러한 능력으로 인해 다양한 분야에서 사용될 수 있습니다.**

**

2
다음 문장을 영어로 번역: '언어 모델' -> 은 텍스트 데이터의 다양한 패턴을 학습하고,
**이러한 특징들을 바탕으로 특정 분야에 대한 지식을 생성하는 능력을 갖추고 있습니다.**

번환문장입니다. 좀 더 자연스럽게 표현하기

Applying a chat template

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help you today?"},
    {"role": "user", "content": "What's the weather?"},
]

from transformers import AutoTokenizer

model_names = [
    "HuggingFaceTB/SmolLM2-135M-Instruct",
    "Qwen/Qwen3-0.6B",  # requires tiktoken
    "kakaocorp/kanana-1.5-2.1b-instruct-2505",
]

for model_name in model_names:
    print(f"Processing model: {model_name}\n")
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    result = tokenizer.apply_chat_template(messages, tokenize=False)
    print(result)
    print('-----' * 10)

Processing model: HuggingFaceTB/SmolLM2-135M-Instruct

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
Hi! How can I help you today?<|im_end|>
<|im_start|>user
What's the weather?<|im_end|>

--------------------------------------------------
Processing model: Qwen/Qwen3-0.6B

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
Hi! How can I help you today?<|im_end|>
<|im_start|>user
What's the weather?<|im_end|>

--------------------------------------------------
Processing model: kakaocorp/kanana-1.5-2.1b-instruct-2505

<|begin_of_text|><|start_header_id|>system<|end_header_id|>


<|eot_id|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

Hello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Hi! How can I help you today?<|eot_id|><|start_header_id|>user<|end_header_id|>

What's the weather?<|eot_id|>
--------------------------------------------------
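The outputs above show that the same list of messages is rendered into different strings depending on the model. As a rough illustration of what a template does, here is a minimal pure-Python sketch of the two layouts seen above. It is not the Jinja template shipped with any of these models; `render_chatml` and `render_header_style` are hypothetical helper names, and the header-style version leaves out the extra empty system block that appears in the kanana output.

```python
from typing import Dict, List

Message = Dict[str, str]


def render_chatml(messages: List[Message]) -> str:
    """Render messages in a ChatML-like layout (hypothetical helper)."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


def render_header_style(messages: List[Message]) -> str:
    """Render messages with role headers and <|eot_id|> terminators (hypothetical helper)."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help you today?"},
    {"role": "user", "content": "What's the weather?"},
]

print(render_chatml(messages))
print(render_header_style(messages))
```

In practice `tokenizer.apply_chat_template` should be used instead of hand-written formatting, because each model was trained on its own exact layout.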

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = 'Qwen/Qwen3-0.6B'
tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side='left')
model = AutoModelForCausalLM.from_pretrained(model_name, device_map='auto', torch_dtype='auto')
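The tokenizer above is loaded with padding_side='left'. For batched generation with a decoder-only model this matters because new tokens are appended after the last position of each row, so padding on the right would place pad tokens between the prompt and its continuation. The following is a minimal pure-Python sketch of the two layouts, not the tokenizer's implementation; `PAD`, `pad_batch`, and the token ids are made up for illustration.

```python
from typing import List, Tuple

PAD = 0  # hypothetical pad token id


def pad_batch(
    seqs: List[List[int]], side: str = "left"
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad every sequence to the longest one and build the matching attention mask."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    width = max(len(s) for s in seqs)
    input_ids, attention_mask = [], []
    for s in seqs:
        n_pad = width - len(s)
        pads, zeros, ones = [PAD] * n_pad, [0] * n_pad, [1] * len(s)
        if side == "left":
            input_ids.append(pads + s)
            attention_mask.append(zeros + ones)
        else:
            input_ids.append(s + pads)
            attention_mask.append(ones + zeros)
    return input_ids, attention_mask


batch = [[11, 12, 13, 14], [21, 22]]
left_ids, left_mask = pad_batch(batch, "left")
right_ids, right_mask = pad_batch(batch, "right")

print(left_ids)    # [[11, 12, 13, 14], [0, 0, 21, 22]]
print(left_mask)   # [[1, 1, 1, 1], [0, 0, 1, 1]]
print(right_ids)   # [[11, 12, 13, 14], [21, 22, 0, 0]]
```

With left padding every row ends in a real token, and all rows share the same prompt length, which is what allows the generated part to be sliced off at a single index later.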

add_generation_prompt

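add_generation_prompt=True is an option that, when the chat template is applied, automatically appends a generation prompt so that the model can produce a response.

True: appends the prompt that opens the assistant's turn at the end of the conversation (e.g. <|im_start|>assistant\n)
False: applies the chat template only, without appending a generation prompt

Without this option the prompt ends right after the user's turn, so the model has no signal for where its response should begin and may fail to generate a proper reply.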

# Compare the output with and without add_generation_prompt
messages = [
    # system: "You are a helpful AI assistant."
    {"role": "system", "content": "당신은 도움이 되는 AI 어시스턴트입니다."},
    # user: "Hello!"
    {"role": "user", "content": "안녕하세요!"}
]

print("add_generation_prompt=False:")
template_false = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)
print(repr(template_false))
print("\n" + "="*50 + "\n")

print("add_generation_prompt=True:")
template_true = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(repr(template_true))

print("\n" + "="*50)
print("Difference:")
print(f"Length with False: {len(template_false)}")
print(f"Length with True: {len(template_true)}")
print(f"Added part: {repr(template_true[len(template_false):])}")

add_generation_prompt=False:
'<|im_start|>system\n당신은 도움이 되는 AI 어시스턴트입니다.<|im_end|>\n<|im_start|>user\n안녕하세요!<|im_end|>\n'

==================================================

add_generation_prompt=True:
'<|im_start|>system\n당신은 도움이 되는 AI 어시스턴트입니다.<|im_end|>\n<|im_start|>user\n안녕하세요!<|im_end|>\n<|im_start|>assistant\n'

==================================================
Difference:
Length with False: 87
Length with True: 109
Added part: '<|im_start|>assistant\n'

import torch

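messages = [
    # 1) A bare string instead of a {"role": ..., "content": ...} dict. Judging by
    #    response 1 below, the template finds no role here, so the text is dropped
    #    and only the generation prompt reaches the model.
    ["다음 문장을 영어로 번역: '언어 모델'",],
    # 2) A single user turn: "Translate the following sentence into English: '언어 모델'"
    [
        {"role": "user", "content": "다음 문장을 영어로 번역: '언어 모델'"}
    ],
    # 3) A system prompt ("a translation model that explains in detail") plus the same user turn
    [
        {"role": "system", "content": "자세하게 설명해 주는 번역 모델"},
        {"role": "user", "content": "다음 문장을 영어로 번역: '언어 모델'"}
    ]
]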

model_inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, 
    padding=True, 
    return_tensors='pt',
    return_dict=True,
    add_generation_prompt=True,  # append the generation prompt
    )

# Avoid the dtype-cast problem: move the individual tensors from the BatchEncoding to the model's device
model_inputs = {
    'input_ids': model_inputs['input_ids'].to(model.device),
    'attention_mask': model_inputs['attention_mask'].to(model.device)
}

with torch.inference_mode():
    outputs = model.generate(**model_inputs, max_new_tokens=200, do_sample=False)

The following generation flags are not valid and may be ignored: ['temperature', 'top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.

Processing responses

# Decode the full sequences, prompt included ("응답" means "response")
decoded_response = tokenizer.batch_decode(outputs, skip_special_tokens=True)
for i, response in enumerate(decoded_response):
    print(f'[응답 {i+1}]\n{response}')

[응답 1]
assistant
Okay, so I need to figure out how to solve this problem. The problem is: "A man is walking along a straight line, and he is moving at a constant speed. The distance he covers in 10 seconds is 100 meters. What is the speed of the man?" Alright, let's start by recalling what speed means. Speed is generally calculated by dividing the distance traveled by the time taken to cover that distance. The formula for speed is:

$$ \text{Speed} = \frac{\text{Distance}}{\text{Time}} $$

So, in this problem, the man is moving at a constant speed. The distance he covers in 10 seconds is 100 meters. Therefore, I need to plug in the values into the formula. Let me write that down.

First, the distance (D) is 100 meters, and the time (T) is 10 seconds. Therefore, the speed (v)
[응답 2]
user
다음 문장을 영어로 번역: '언어 모델'
assistant
<think>
Okay, the user wants me to translate the Korean sentence "언어 모델" into English. Let me start by breaking down the Korean word. "언어" means language, and "모델" translates to model. So putting it together, it should be "Language Model". I need to make sure the translation is accurate and natural. Sometimes in technical contexts, "Language Model" is a common term, so that's probably the right choice here. Let me double-check if there's any nuance I'm missing. No, it's straightforward. The user probably needs this for a technical or academic purpose, so the translation should be precise.
</think>

"Language Model"
[응답 3]
system
자세하게 설명해 주는 번역 모델
user
다음 문장을 영어로 번역: '언어 모델'
assistant
<think>
Okay, the user wants to translate the Korean sentence "언어 모델" into English. Let me start by breaking down the Korean word. "언어" means language, and "모델" translates to model. So putting it together, it should be "Language Model". I need to make sure the translation is accurate and natural in English. Sometimes people might use "Language Model" as a common term, so that's probably the best option here. Let me double-check if there's any nuance I'm missing, but I don't think "Language Model" is too different from "Model". Yep, that seems right.
</think>

"Language Model"

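The Qwen3 responses above wrap the model's reasoning in <think>...</think> and place the final answer after the closing tag. If only the answer is needed, the two parts can be separated with a small helper. This is a minimal sketch, assuming the tags appear literally in the decoded text as they do in the output above; `split_thinking` is a hypothetical name, not part of transformers, and the sample string is shortened for illustration.

```python
import re
from typing import Tuple

# Non-greedy match across newlines for the first <think>...</think> block
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no complete think block exists."""
    match = THINK_RE.search(text)
    if match is None:
        # No complete block, e.g. generation stopped before </think> was emitted
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


sample = (
    "<think>\n"
    'Okay, "언어" means language, and "모델" translates to model.\n'
    "</think>\n\n"
    '"Language Model"'
)

reasoning, answer = split_thinking(sample)
print(answer)  # prints "Language Model" (including the quotes)
```

When max_new_tokens cuts the output off before </think>, the helper returns the whole text as the answer, so callers may want to check for an empty reasoning string.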
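for i, output_ids in enumerate(outputs):
    # All rows share the same padded prompt length, so slicing there keeps only the new tokens
    input_length = len(model_inputs['input_ids'][i])
    generated_tokens = output_ids[input_length:]  # drop the prompt tokens
    decoded_text = tokenizer.decode(generated_tokens, skip_special_tokens=True)
    print(f'[응답 {i+1} - 입력 제외]\n{decoded_text}')  # "입력 제외" means "input excluded"
    print()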

[응답 1 - 입력 제외]
Okay, so I need to figure out how to solve this problem. The problem is: "A man is walking along a straight line, and he is moving at a constant speed. The distance he covers in 10 seconds is 100 meters. What is the speed of the man?" Alright, let's start by recalling what speed means. Speed is generally calculated by dividing the distance traveled by the time taken to cover that distance. The formula for speed is:

$$ \text{Speed} = \frac{\text{Distance}}{\text{Time}} $$

So, in this problem, the man is moving at a constant speed. The distance he covers in 10 seconds is 100 meters. Therefore, I need to plug in the values into the formula. Let me write that down.

First, the distance (D) is 100 meters, and the time (T) is 10 seconds. Therefore, the speed (v)

[응답 2 - 입력 제외]
<think>
Okay, the user wants me to translate the Korean sentence "언어 모델" into English. Let me start by breaking down the Korean word. "언어" means language, and "모델" translates to model. So putting it together, it should be "Language Model". I need to make sure the translation is accurate and natural. Sometimes in technical contexts, "Language Model" is a common term, so that's probably the right choice here. Let me double-check if there's any nuance I'm missing. No, it's straightforward. The user probably needs this for a technical or academic purpose, so the translation should be precise.
</think>

"Language Model"

[응답 3 - 입력 제외]
<think>
Okay, the user wants to translate the Korean sentence "언어 모델" into English. Let me start by breaking down the Korean word. "언어" means language, and "모델" translates to model. So putting it together, it should be "Language Model". I need to make sure the translation is accurate and natural in English. Sometimes people might use "Language Model" as a common term, so that's probably the best option here. Let me double-check if there's any nuance I'm missing, but I don't think "Language Model" is too different from "Model". Yep, that seems right.
</think>

"Language Model"