

Paper, domestic, domestic academic journal (KCI-level): Development of an Automatic Customs HS Code Classification System Using an LLM Based on an Optimized RAG Framework

Abstract

Purpose

This study analyzes how RAG data processing methods and the sentence comprehension capabilities of large language models (LLMs) affect HS code classification performance in customs classification, with the goal of providing practical insights that private-sector companies in the customs domain can use.


Design/methodology/approach

This study developed THE-RAG (Two-stage-Hybrid-rEranking-RAG), a two-stage hybrid reranking RAG framework that integrates dense retrieval, sparse retrieval (BM25), and reranking to improve HS code classification. It measures the impact of two key factors on final answer quality, (1) the format of RAG data preprocessing and (2) the sentence comprehension capability of the LLM, and analyzes their combined influence on overall classification performance in customs HS code determination.
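The two-stage flow described above (hybrid first-stage retrieval, then reranking) can be sketched roughly as follows. This is a minimal illustration under several assumptions and is not the authors' implementation: the abstract does not say how the dense and sparse results are combined, so reciprocal rank fusion is swapped in here; the "dense" scorer is a character-trigram cosine similarity standing in for a real embedding model; and the reranker is a simple token-overlap score standing in for a learned reranking model. All function names and the sample chunks are hypothetical.

```python
import math
from collections import Counter

# Hypothetical product-description chunks, for illustration only.
SAMPLE_CHUNKS = [
    "live horses and other equine animals",
    "coffee roasted or not roasted",
    "portable computers weighing under ten kilograms",
    "frozen fish fillets",
]

def tokenize(text):
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for the query (sparse retrieval)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def dense_scores(query, docs):
    """Assumed stand-in for dense retrieval: cosine over character trigrams."""
    def embed(text):
        t = text.lower()
        return Counter(t[i:i + 3] for i in range(len(t) - 2))

    def cosine(a, b):
        dot = sum(a[k] * b[k] for k in a if k in b)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = embed(query)
    return [cosine(q, embed(d)) for d in docs]

def ranking(scores):
    """Document indices sorted by descending score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion (an assumed fusion method, not from the paper)."""
    fused = Counter()
    for r in rankings:
        for rank, idx in enumerate(r, start=1):
            fused[idx] += 1.0 / (k + rank)
    return sorted(fused, key=lambda i: fused[i], reverse=True)

def rerank(query, docs, candidates, top_n=2):
    """Assumed stand-in for a reranker: share of query tokens found in the doc."""
    q = set(tokenize(query))

    def score(i):
        return len(q & set(tokenize(docs[i]))) / max(len(q), 1)

    return sorted(candidates, key=score, reverse=True)[:top_n]

def retrieve(query, docs, first_stage_k=3, top_n=2):
    # Stage 1: hybrid retrieval (sparse + dense), fused into one candidate list.
    sparse = ranking(bm25_scores(query, docs))
    dense = ranking(dense_scores(query, docs))
    candidates = rrf_fuse([sparse, dense])[:first_stage_k]
    # Stage 2: rerank the shortlisted candidates and keep the best top_n.
    return [docs[i] for i in rerank(query, docs, candidates, top_n)]
```

Calling `retrieve("roasted coffee beans", SAMPLE_CHUNKS)` returns the shortlisted chunks that would then be passed to the LLM as context. In a real system the two stand-in scorers would be replaced by an embedding model and a trained reranker.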


Findings

The proposed THE-RAG system achieved up to 60% accuracy in HS code classification on queries (product descriptions) with highly complex explanatory structures; accuracy varied with chunk size and the choice of LLM. The findings confirm that both the chunk size of the input data and the LLM's Korean sentence comprehension capability significantly influence HS code classification performance. This study is the first case in Korea of applying state-of-the-art RAG technology to HS code determination in the customs domain, which suggests its potential as a technical resource for practical applications in the field.
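Because chunk size is one of the factors reported to affect accuracy, a small sketch of fixed-size chunking may help make the idea concrete. The abstract does not state the chunk sizes tested, the unit (characters or tokens), or whether chunks overlap, so the character-based splitting and the `overlap` parameter below are assumptions, and `chunk_text` is a hypothetical helper name.

```python
def chunk_text(text, chunk_size, overlap=0):
    """Split text into character chunks of at most chunk_size.

    Consecutive chunks share `overlap` characters (an assumed option).
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks
```

Indexing the same source documents with several `chunk_size` values and comparing classification accuracy for each is one way such a chunk-size comparison could be set up.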