Paper: Development of an Automatic Customs Tariff Classification Code Determination System Using an LLM Based on an Optimized RAG Framework
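- Journal category: Domestic specialized journal (KCI-level)
- Publication date: 2025-09
- Authors: 김호경, 김건우, 최근호
- Journal: 정보시스템연구
- Publisher: 한국정보시스템학회
- Country of publication: Korea (domestic)
- Language: Korean
- Total number of authors: 3
- Research field: Social sciences > Business administration
- Keywords: #Retrieval-Augmented Generation #LLM #Harmonized System Code #Artificial Intelligence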
Abstract
Purpose
This study analyzes how RAG data processing methods and the sentence comprehension capabilities of large language models (LLMs) affect HS code classification performance in the customs domain, with the goal of providing practical insights that private-sector companies in that domain can use.
Design/methodology/approach
This study developed THE-RAG (Two-stage-Hybrid-rEranking-RAG), a two-stage hybrid reranking RAG framework that integrates dense retrieval, sparse retrieval (BM25), and reranking to improve HS code classification performance. It measures the impact of two key factors on final answer quality, (1) the format of RAG data preprocessing and (2) the sentence comprehension capability of the LLM, and analyzes their combined influence on overall HS code classification performance.
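A rough sketch of how a two-stage hybrid retrieval pipeline of this kind can be put together is shown here. It is not the authors' implementation. The abstract does not say how the dense and sparse results are combined or which reranker is used, so reciprocal rank fusion, the term-count cosine similarity (standing in for a neural embedding model), and the term-coverage reranker (standing in for a learned reranking model) are all assumptions made for illustration, as are the function names and the three sample chunks.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Sparse stage: Okapi BM25 score of every document for the query."""
    toks = [tokenize(d) for d in docs]
    avgdl = sum(len(t) for t in toks) / len(toks)
    df = Counter(term for t in toks for term in set(t))
    n = len(docs)
    scores = []
    for t in toks:
        tf = Counter(t)
        s = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(t) / avgdl)
            s += idf * tf[term] * (k1 + 1) / norm
        scores.append(s)
    return scores


def dense_scores(query, docs):
    """Dense stage stand-in (assumption): cosine similarity of term counts.
    A real system would compare neural embedding vectors instead."""
    def cosine(a, b):
        dot = sum(a[k] * b[k] for k in a if k in b)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0
    q = Counter(tokenize(query))
    return [cosine(q, Counter(tokenize(d))) for d in docs]


def rank(scores):
    """Document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def rrf(rankings, k=60):
    """Reciprocal rank fusion (assumed fusion rule, not from the paper)."""
    fused = Counter()
    for ranking in rankings:
        for pos, idx in enumerate(ranking):
            fused[idx] += 1.0 / (k + pos + 1)
    return [i for i, _ in sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))]


def rerank(query, docs, candidates):
    """Second stage stand-in (assumption): order candidates by how many
    distinct query terms they contain. A real system would use a reranker model."""
    q = set(tokenize(query))
    return sorted(candidates, key=lambda i: -len(q & set(tokenize(docs[i]))))


def retrieve(query, docs, top_k=3, final_k=1):
    # Stage 1: hybrid retrieval (sparse + dense), fused into one candidate list.
    fused = rrf([rank(bm25_scores(query, docs)), rank(dense_scores(query, docs))])
    # Stage 2: rerank only the top_k fused candidates.
    return rerank(query, docs, fused[:top_k])[:final_k]


# Hypothetical chunks: an HS heading number followed by a short description.
docs = [
    "0901 coffee roasted or not roasted",
    "6109 t-shirts singlets and other vests knitted or crocheted",
    "8517 telephone sets including smartphones",
]
best = retrieve("knitted cotton t-shirts", docs)
print(docs[best[0]])
```

In a full RAG setup, the chunks returned by `retrieve` would be placed in the LLM prompt together with the product description, and the LLM would produce the final HS code. The chunk size studied in the paper corresponds here to how the source text is split into the entries of `docs`.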
Findings
The proposed THE-RAG system achieved up to 60% accuracy in HS code classification for queries (product descriptions) with highly complex explanatory structures, depending on the chunk size and the choice of LLM. The findings confirm that both the chunk size of the input data and the LLM's Korean sentence comprehension capability significantly influence HS code classification performance. This study is the first case in Korea of applying state-of-the-art RAG technology to HS code determination in the customs domain, which suggests it could serve as a technical resource for practical applications in the field.