Hanbat National University Microsite

College of Business and Economics, Department of Convergence Management

BTS(Big data Technology and Strategy)

Professor Keunho Choi, Department of Convergence Management

Research result

International Papers

Enhanced Helicopter Vibration Prediction with Hybrid Sampling and Cost Mining Techniques

Helicopter vibrations increase pilot workload and accelerate fatigue and wear in structural and mechanical components, potentially resulting in higher maintenance costs and reduced operational safety. To address these challenges, this study develops a machine learning-based prediction model using vibration test data from the cockpit of a Korean utility helicopter. To mitigate the issue of class imbalance in the dataset, two hybrid sampling techniques are proposed and analyzed: first oversampling and last undersampling (FOLU) and first undersampling and last oversampling (FULO). In addition to conventional evaluation based on prediction accuracy, this study adopts a cost-aware perspective by applying both cost-incentive and cost-sensitive learning frameworks. The models are compared in terms of misclassification-related cost losses under realistic operational conditions. Experimental results confirm that the proposed hybrid sampling methods outperform traditional oversampling and undersampling techniques in prediction performance. Among all configurations, the FULO-based models using multilayer perceptron (MLP) and random forest (RF) achieved the highest prediction accuracy. Moreover, cost-sensitive learning generally reduced misclassification losses compared to cost-incentive learning; however, in certain cases, the cost-incentive model yielded lower total costs. These findings indicate that predictive model selection should not be based solely on accuracy metrics, but also on economic efficiency within operational contexts. This study contributes to the literature by demonstrating the practical effectiveness of hybrid sampling in helicopter vibration prediction as well as introducing a cost-aware model evaluation framework suitable for prognostics and health management (PHM) applications in military and civilian rotorcraft operations.
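The abstract above does not give the resampling details, so the following is only a minimal sketch of the two orderings. It assumes plain random duplication for oversampling, random removal for undersampling, and a target class size halfway between the two class counts. The function names and the target rule are hypothetical and are not the authors' implementation.

```python
import random


def oversample(rows, target, rng):
    """Randomly duplicate rows until there are `target` of them."""
    rows = list(rows)
    extra = [rng.choice(rows) for _ in range(max(0, target - len(rows)))]
    return rows + extra


def undersample(rows, target, rng):
    """Randomly keep `target` rows, without replacement."""
    rows = list(rows)
    if len(rows) <= target:
        return rows
    return rng.sample(rows, target)


def hybrid_sample(majority, minority, order="FOLU", seed=0):
    """Balance two classes by combining over- and undersampling.

    order="FOLU": oversample the minority first, then undersample the majority.
    order="FULO": undersample the majority first, then oversample the minority.
    The meeting point (midway between the class sizes) is an assumption.
    """
    rng = random.Random(seed)
    target = (len(majority) + len(minority)) // 2
    if order == "FOLU":
        minority = oversample(minority, target, rng)
        majority = undersample(majority, target, rng)
    elif order == "FULO":
        majority = undersample(majority, target, rng)
        minority = oversample(minority, target, rng)
    else:
        raise ValueError("order must be 'FOLU' or 'FULO'")
    return majority, minority


if __name__ == "__main__":
    normal = list(range(90))          # hypothetical majority-class sample ids
    abnormal = list(range(100, 110))  # hypothetical minority-class sample ids
    maj, mino = hybrid_sample(normal, abnormal, order="FULO")
    print(len(maj), len(mino))
```

With plain random resampling, as here, both orderings end with the same class sizes; the sketch only illustrates the order of operations.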

2025-06-23 12:01
Does Economic Stability Influence Family Development? Insights from Women in Korea with the Lowest Childbirth Rates Worldwide

The aim of this study is to explore the multidimensional relationships among factors influencing decision-making processes regarding women’s willingness to marry and childbirth in South Korea, while recognizing the context of family development in East Asian cultures. To this end, we employed three different analytical approaches, including classification tree modeling, Cox proportional hazard modeling, and permutation feature importance evaluation. Leveraging longitudinal data specific to Korean women, we highlighted the significance of socio-economic factors in family development dynamics. Our findings revealed that financial stability played a crucial role. Unmarried women’s willingness to marry was influenced by their perspectives on economic stability, while households’ consumption capacity and financial capability determined childbirth decisions and timing. We observed a trend of postponed marriage among women in their marriageable age range, particularly those with stable economic situations, reflecting a prevalent trend of skepticism of marriage in Korean society. Additional findings related to values, cultural factors, and personal happiness also suggested the challenges that discourage younger generations from entering into marriage and starting families in South Korea. By offering insights into these dynamics, our study provides practical implications for addressing the obstacles faced, contributing to a better understanding of family development dynamics.

2024-03-26 11:06
Double ensemble technique for improving the weight defect prediction of injection molding in smart factories

The growing move toward smart factories can leverage industrial big data to enhance productivity. In particular, research is being conducted on injection molding and utilizing machine learning techniques to analyze molding process data, discover optimal molding conditions, and predict and improve product quality. This study aims to identify the key factors influencing the weight defects of injection-molded products and demonstrate the potential use of the double ensemble technique for better prediction accuracy of weight defects. We obtain the key factors influencing weight defects prediction, barrel H2 temp real, metering time, and fill time using gain ratio analysis. Subsequently, we develop single models using machine learning algorithms, including decision tree, random forest, logistic regression, the Bayesian network, and the artificial neural network. Ensemble models, including bagging and boosting and double ensemble models are developed to compare their performance with that of single models. The findings indicate that ensemble models outperform the prediction accuracy of the single models. The double ensemble technique demonstrates the greatest improvements in prediction accuracy over the single models. These results showcase the potential of applying the double ensemble technique to other injection molding areas and suggest that adopting this technique will contribute to establishing other smart factories that will enhance both productivity and cost competitiveness.

2023-11-30 17:12
Building a core rule-based decision tree to explain the causes of insolvency in small and medium-sized enterprises more easily

This study proposes a harmonic average of support and confidence method (HSC), which is a new way to select important rules from the many rules in the decision tree and thereby build a core rule-based decision tree (CorDT) that more easily explains the insolvency factors related to small and medium-sized enterprises (SMEs) using the HSC. To this end, an insolvency prediction model for SMEs was developed using a decision tree algorithm and technological feasibility assessment data as non-financial datasets. We divided these datasets into three types, a general type, a technology development type and a toll processing type applying characteristics of SMEs. We also applied a cost-sensitive approach and several data balancing techniques to construct the same proportion of healthy and insolvent company samples in the datasets. As a result, the insolvency prediction model applied using the synthetic minority over-sampling technique (SMOTE), an over-sampling technique, showed the highest performance with an average hit ratio of 77.6%. Next, we selected important rules by applying HSC to the decision trees with the highest performance and built CorDTs for three types of SMEs using the selected rules. Finally, using the developed CorDTs, we explained the causes of insolvency by type of SME and presented insolvency prevention strategies customized to the three types of SMEs.
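As a rough illustration of rule selection by a harmonic average of support and confidence, the sketch below scores each decision-tree rule with the harmonic mean of the two values and keeps the top-ranked rules. The rule records, the `top_k` cutoff, and the function names are hypothetical; the abstract does not state how the paper sets the selection threshold.

```python
def hsc(support, confidence):
    """Harmonic mean of a rule's support and confidence (both in [0, 1])."""
    if support + confidence == 0:
        return 0.0
    return 2 * support * confidence / (support + confidence)


def select_core_rules(rules, top_k):
    """Return the `top_k` rules with the highest HSC score (assumed cutoff)."""
    ranked = sorted(
        rules,
        key=lambda r: hsc(r["support"], r["confidence"]),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical leaf rules extracted from a decision tree.
    rules = [
        {"name": "rule_a", "support": 0.30, "confidence": 0.60},
        {"name": "rule_b", "support": 0.02, "confidence": 0.95},
        {"name": "rule_c", "support": 0.20, "confidence": 0.80},
    ]
    for rule in select_core_rules(rules, top_k=2):
        print(rule["name"], round(hsc(rule["support"], rule["confidence"]), 3))
```

Because a harmonic mean is pulled toward the smaller of its two inputs, a rule with very high confidence but negligible support (such as `rule_b` here) ranks low.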

2023-11-30 17:05
Recommending Valuable Ideas in an Open Innovation Community: A Text Mining Approach to Information Overload Problem

Purpose - Open innovation communities are a growing trend across diverse industries because they provide opportunities of collaborating with customers and exploiting their knowledge effectively. Although open innovation communities can be strategic assets that can help firms innovate, firms nonetheless face the challenge of information overload incurred due to the characteristic of the community. The purpose of this paper is to mitigate the problem of information overload in an open innovation environment. Design/methodology/approach - This study chose MyStarbucksIdea.com (MSI) as a target open innovation community in which customers share their ideas. The authors analyzed a large data set collected from MSI utilizing text mining techniques including TF-IDF and sentiment analysis, while considering both term and non-term features of the data set. Those features were used to develop classification models to calculate the adoption probability of each idea. Findings - The results showed that term and non-term features play important roles in predicting the adoptability of ideas and the best classification accuracy was achieved by the hybrid classification models. In most cases, the precisions of classification models decreased as the number of recommendations increased, while the models' recalls and F1s increased. Originality/value - This research dealt with the problem of information overload in an open innovation context. A large amount of customer opinions from an innovation community were examined and a recommendation system to mitigate the problem was proposed. Using the proposed system, the firm can get recommendations for ideas that could be valuable for its business innovation in the idea generation phase, thereby resolving the information overload and enhancing the effectiveness of open innovation.

2023-07-26 22:52
Assignment of Collaborators to Multiple Business Problems using Genetic Algorithm

As firms encounter new problems in the fast-changing business environment, they have to find collaborators with problem-solving expertise. Since this optimization problem takes place in a firm as the business environment changes, genetic algorithm (GA), which has shown outstanding performance in obtaining a sub-optimal solution relatively quickly, seems to be the right solution, one that is superior to goal-programming, multi-attribute decision making, and branch and bound. We therefore propose a GA-based approach to solving the problem of assigning collaborators to multiple business problems. Our solution worked well in several experiments.

2023-07-26 22:49
Extended Collaborative Filtering Technique for Mitigating the Sparsity Problem

Many online shopping malls have implemented personalized recommendation systems to improve customer retention in the age of high competition and information overload. Sellers make use of these recommendation systems to survive high competition and buyers utilize them to find proper product information for their own needs. However, transaction data of most online shopping malls prevent us from using collaborative filtering (CF) technique to recommend products, for the following two reasons: 1) explicit rating information is rarely available in the transaction data; 2) the sparsity problem usually occurs in the data, which makes it difficult to identify reliable neighbors, resulting in less effective recommendations. Therefore, this paper first suggests a means to derive implicit rating information from the transaction data of an online shopping mall and then proposes a new user similarity function to mitigate the sparsity problem. The new user similarity function computes the user similarity of two users if they rated similar items, while the user similarity function of traditional CF technique computes it only if they rated common items. Results from several experiments using an online shopping mall dataset in Korea demonstrate that our approach significantly outperforms the traditional CF technique.

2023-07-26 22:47
An Ontology-Based Co-Creation Enhancing System for Idea Recommendation in an Online Community

Companies have been collecting innovative ideas that can help them to develop new products and services through co-creation with their customers. As more customers participate in suggesting ideas, companies are likely to acquire more valuable ones. At the same time, however, some fundamental problems occur such as managing and selecting useful ideas from a large number of collected ideas. Semantic web mining techniques allow us to manage a large number of customers' ideas effectively, extract meaningful information from the ideas, and provide useful information for idea selection. In order to cope with such problems and enhance the value of co-creation, we propose an ontology-based co-creation enhancing system (OnCES) developed using semantic web mining techniques. To this end, we 1) defined a co-creation idea ontology (CCIO) that includes common concepts related to customers' ideas from MyStarbucksIdea.com, their attributes, and relationships between them; 2) transformed the customers' ideas into semantic data in RDF format according to the CCIO; 3) conducted text mining to extract new knowledge from the ideas such as keywords, the number of positive words, the number of negative words, and the sentiment score; and 4) built prediction models using keywords and other features such as those about customer and idea in order to predict the adoptability of each idea. The results of text mining and prediction analysis were also added to the semantic data. We implemented the OnCES system, which provides useful services such as idea navigation, idea recommendation, semantic information retrieval, and idea clustering, utilizing the stored semantic data while saving the time and effort required to process a huge number of customers' ideas.

2023-07-26 22:39
Building and Evaluating a Collaboratively Built Structured Folksonomy

Flat folksonomy uses simple tags and has emerged as a powerful instrument for classifying and sharing a huge amount of knowledge on Web 2.0. However, it has semantic problems, such as ambiguous and misunderstood tags. To alleviate such problems, researchers have built structured folksonomies with a hierarchical structure or relationships among tags. Structured folksonomies, however, also have some fundamental problems, such as limited tagging of pre-defined vocabulary and time-consuming manual effort required to select tags. To resolve these problems, we suggested a new method of attaching a tag with its category, which we call a categorized tag (CT), to web content. CTs entered by users are automatically and immediately integrated into a collaboratively built structured folksonomy (CSF), reflecting the tag-and-category relationships supported by the majority of users. Then, we developed a CT-based knowledge organization system (CTKOS), which builds upon the CSF to classify organizational knowledge and enables us to locate appropriate knowledge. In addition, the results of the evaluation, which we conducted to compare our proposed system with the flat folksonomy system, indicate that users perceive CTKOS to be more useful than the flat folksonomy system in terms of knowledge sharing (i.e. the tagging mechanism) and retrieval (i.e. the searching mechanism).

2023-07-26 22:36
CRM Strategies for a Small-Sized Online Shopping Mall Based on Association Rules and Sequential Patterns

Since the dot-com bubble burst in 2002, countless small-sized online shopping malls have emerged every day owing to the advantages of the online marketplace, including significantly reduced search and menu costs for products or services and easy access to products or services worldwide. However, not all online shopping malls have continued to flourish. Many of them even vanished because of the lack of customer relationship management (CRM) strategies that fit them. The objective of this paper is to propose CRM strategies for a small-sized online shopping mall based on association rules and sequential patterns obtained by analyzing the transaction data of the shop. We first defined the VIP customers in terms of recency, frequency and monetary (RFM) value. Then, we developed a model which classifies customers into VIP or non-VIP, using various data mining techniques such as decision tree, artificial neural network, logistic regression and bagging with each of these as a base classifier. Last, we identified association rules and sequential patterns from the transactions of VIPs, and then these rules and patterns were utilized to propose CRM strategies for the online shopping mall.

2023-07-26 22:32
Classification Cost: An Empirical Comparison Among Traditional Classifier, Cost-Sensitive Classifier, and MetaCost

Loan fraud is a critical factor in the insolvency of financial institutions, so companies make an effort to reduce the loss from fraud by building a model for proactive fraud prediction. However, there are still two critical problems to be resolved for the fraud detection: (1) the lack of cost sensitivity between type I error and type II error in most prediction models, and (2) highly skewed distribution of class in the dataset used for fraud detection because of sparse fraud-related data. The objective of this paper is to examine whether classification cost is affected both by the cost-sensitive approach and by skewed distribution of class. To that end, we compare the classification cost incurred by a traditional cost-insensitive classification approach and two cost-sensitive classification approaches, Cost-Sensitive Classifier (CSC) and MetaCost. Experiments were conducted with a credit loan dataset from a major financial institution in Korea, while varying the distribution of class in the dataset and the number of input variables. The experiments showed that the lowest classification cost was incurred when the MetaCost approach was used and when non-fraud data and fraud data were balanced. In addition, the dataset that includes all delinquency variables was shown to be most effective on reducing the classification cost. (C) 2011 Elsevier Ltd. All rights reserved.
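To make the notion of classification cost concrete, the sketch below totals the cost of false positives and false negatives and compares a plain 0.5 probability threshold with a minimum-expected-cost decision. This is a generic illustration and is neither the CSC nor the MetaCost procedure evaluated in the paper; the cost values, probabilities, and function names are hypothetical.

```python
def classification_cost(y_true, y_pred, cost_fp, cost_fn):
    """Total misclassification cost; label 1 = fraud, label 0 = non-fraud."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return fp * cost_fp + fn * cost_fn


def min_cost_prediction(prob_fraud, cost_fp, cost_fn):
    """Predict fraud when the expected cost of missing it exceeds the
    expected cost of a false alarm."""
    expected_cost_if_predict_0 = prob_fraud * cost_fn
    expected_cost_if_predict_1 = (1 - prob_fraud) * cost_fp
    return 1 if expected_cost_if_predict_0 > expected_cost_if_predict_1 else 0


if __name__ == "__main__":
    y_true = [1, 1, 0, 0]
    prob_fraud = [0.3, 0.8, 0.1, 0.4]   # hypothetical model outputs
    cost_fp, cost_fn = 1, 5             # hypothetical: a missed fraud costs 5x

    naive = [1 if p >= 0.5 else 0 for p in prob_fraud]
    aware = [min_cost_prediction(p, cost_fp, cost_fn) for p in prob_fraud]

    print(classification_cost(y_true, naive, cost_fp, cost_fn))
    print(classification_cost(y_true, aware, cost_fp, cost_fn))
```

In this toy data the cost-aware rule accepts one extra false alarm in exchange for catching a fraud case the plain threshold misses, which lowers the total cost.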

2023-07-26 14:23
A Hybrid Online-Product Recommendation System: Combining Implicit Rating-Based Collaborative Filtering and Sequential Pattern Analysis

Many online shopping malls in which explicit rating information is not available still have difficulty in providing recommendation services using collaborative filtering (CF) techniques for their users. Applying temporal purchase patterns derived from sequential pattern analysis (SPA) for recommendation services also often makes users unhappy with the inaccurate and biased results obtained by not considering individual preferences. The objective of this research is twofold. One is to derive implicit ratings so that CF can be applied to online transaction data even when no explicit rating information is available, and the other is to integrate CF and SPA for improving recommendation quality. Based on the results of several experiments that we conducted to compare the performance between ours and others, we contend that implicit rating can successfully replace explicit rating in CF and that the hybrid approach of CF and SPA is better than the individual ones. (C) 2012 Elsevier B. V. All rights reserved.

2023-07-26 14:23
A New Similarity Function for Selecting Neighbors for Each Target Item in Collaborative Filtering

As one of the collaborative filtering (CF) techniques, the memory-based CF technique, which recommends items to users based on rating information of like-minded users (called neighbors), has been widely used and has proven useful in practice in the age of information overload. However, there is still considerable room for improving the quality of recommendation. In short, similarity functions in traditional CF compute a similarity between a target user and another user without considering a target item. More specifically, they give an equal weight to each of the co-rated items rated by both users. Neighbors of a target user, therefore, are identical for all target items. However, a reasonable assumption is that the similarity between a target item and each of the co-rated items should be considered when finding neighbors of a target user. Additionally, a different set of neighbors should be selected for each different target item. Thus, the objective of this paper is to propose a new similarity function in order to select different neighbors for each different target item. In the new similarity function, the rating of a user on an item is weighted by the item similarity between the item and the target item. Experimental results from the MovieLens and Netflix datasets provide evidence that our recommender model considerably outperforms the traditional CF-based recommender model. (C) 2012 Elsevier B.V. All rights reserved.
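The idea of weighting each co-rated item by its similarity to the target item can be sketched as a weighted cosine similarity between two users. The choice of cosine, the data, and the function name are assumptions for illustration; the abstract does not give the paper's exact formula.

```python
import math


def weighted_user_similarity(ratings_a, ratings_b, item_sim_to_target):
    """Cosine similarity over co-rated items, where each rating is first
    scaled by the item's similarity to the target item (assumed form)."""
    common = set(ratings_a) & set(ratings_b)
    num = den_a = den_b = 0.0
    for item in common:
        w = item_sim_to_target.get(item, 0.0)
        ra = w * ratings_a[item]
        rb = w * ratings_b[item]
        num += ra * rb
        den_a += ra * ra
        den_b += rb * rb
    if den_a == 0 or den_b == 0:
        return 0.0
    return num / math.sqrt(den_a * den_b)


if __name__ == "__main__":
    # Hypothetical ratings: the users agree on x and z, disagree on y.
    user_a = {"x": 5, "y": 1, "z": 3}
    user_b = {"x": 5, "y": 5, "z": 3}

    # Hypothetical item-to-target similarities for two different target items.
    sim_to_target_1 = {"x": 1.0, "y": 0.0, "z": 1.0}  # y is irrelevant
    sim_to_target_2 = {"x": 0.0, "y": 1.0, "z": 1.0}  # x is irrelevant

    print(weighted_user_similarity(user_a, user_b, sim_to_target_1))
    print(weighted_user_similarity(user_a, user_b, sim_to_target_2))
```

The same pair of users receives a different similarity for each target item, so the neighbor set can differ per target item, which is the behavior the abstract describes.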

2023-07-26 14:22
A Personalized Trustworthy Seller Recommendation in an Open Market

Although more and more customers are buying products in online stores, they have difficulty selecting a seller who is both trustworthy and suitable for the product they want to buy, since plenty of sellers offer the same product with different options. Therefore, the objective of this research is to propose a personalized trustworthy seller recommendation system for the customers of an open market in Korea. To that end, we first developed a module which classifies sellers as trustworthy or not using a classification technique such as decision tree, and then developed another module which makes use of the content-based filtering method to find the best-matching top-k sellers among the selected trustworthy sellers. Experimental results show that our approach is worth adopting. This study makes a contribution at least in that, to our knowledge, it is the first attempt to recommend sellers, not products as done in most other studies, to customers. Crown Copyright (C) 2012 Published by Elsevier Ltd. All rights reserved.

2023-07-26 14:22
Classification Model for Detecting and Managing Credit Loan Fraud Based on Individual-Level Utility Concept

As credit loan products significantly increase in most financial institutions, the number of fraudulent transactions is also growing rapidly. Therefore, to manage the financial risks successfully, the financial institutions should reinforce the qualifications for a loan and augment the ability to detect and manage a credit loan fraud proactively. In the process of building a classification model to detect credit loan frauds, utility from classification results (i.e., benefits from correct prediction and costs from incorrect prediction) is more important than the accuracy rate of classification. The objective of this paper is two-fold: (1) to propose a new approach to building a classification model for detecting credit loan fraud based on an individual-level utility, and (2) to suggest customized interest rate for each customer - from both opportunity utility and cash flow perspectives. Experimental results show that our proposed model comes up with higher utility than the fraud detection models which do not take into account the individual-level utility concept. Also, it is shown that the individual-level utility from our model is more accurate than the mean-level utility used in previous researches, from both opportunity utility and cash flow perspectives. Implications of the experimental results from both perspectives are provided.

2023-07-26 14:22
Predicting Agricultural and Livestock Products Purchases Using the Internet Search Index and Data Mining Techniques

Purpose - This study identifies whether the Internet search index can be used as effective enough data to identify agricultural and livestock product demand and compare the accuracy of the prediction of major agricultural and livestock products purchases between these prediction models using artificial neural network, linear regression and a decision tree. Design/methodology/approach - Artificial neural network, linear regression and decision tree algorithms were used in this study to compare the accuracy of the prediction of major agricultural and livestock products purchases. The analysis data were studied using 10-fold cross validation. Findings - First, the importance of the Internet search index among the 20 explanatory variables was found to be high for most items, so the Internet search index can be used as a variable to explain agricultural and livestock products purchases. Second, as a result of comparing the accuracy of the prediction of six agricultural and livestock purchases using three models, beef was the most predictable, followed by radishes, chicken, Chinese cabbage, garlic and dried peppers, and by model, a decision tree shows the highest accuracy of prediction, followed by linear regression and an artificial neural network. Originality/value - This study is meaningful in that it analyzes the purchase of agricultural and livestock products using data from actual consumers' purchases of agricultural and livestock products. In addition, the use of data mining techniques and Internet search index in the analysis of agricultural and livestock purchases contributes to improving the accuracy and efficiency of agricultural and livestock purchase predictions.

2023-07-26 14:18
Predicting the Insolvency of SMEs Using Technological Feasibility Assessment Information and Data Mining Techniques

The government makes great efforts to maintain the soundness of policy funds raised by the national budget and lent to corporate. In general, previous research on the prediction of company insolvency has dealt with large and listed companies using financial information with conventional statistical techniques. However, small- and medium-sized enterprises (SMEs) do not have to undergo mandatory external audits, and the quality of accounting information is low due to weak internal control. To overcome this problem, we developed an insolvency prediction model for SMEs using data mining techniques and technological feasibility assessment information as non-financial information. We divided the dataset into two types of data based on three years of corporate age. The synthetic minority over-sampling technique (SMOTE) was used to solve the data imbalance that occurred at this time. Six insolvency prediction models were created using logistic regression, a decision tree, an artificial neural network, and an ensemble (i.e., boosting) of each algorithm. By applying a boosted decision tree, the best accuracies of 69.1% and 82.7% were derived, and by applying a decision tree, nine and seven influential factors affected the insolvency of SMEs established for fewer than three years and more than three years, respectively. In addition, we derived several insolvency rules for the two types of SMEs from the decision tree-based prediction model and proposed ways to enhance the health of loans given to potentially insolvent companies using these derived rules. The results of this study show that it is possible to predict SMEs insolvency using data mining techniques with technological feasibility assessment information and find meaningful rules related to insolvency.

2021-10-28 11:15

Domestic Papers

News

No items have been registered.

Research

Information about our lab's research.
