Indonesian Journal of Data and Science https://jurnal.yoctobrain.org/index.php/ijodas <p align="justify">Indonesian Journal of Data and Science (IJODAS) is an electronic periodical published by Yocto Brain (YB), a non-commercial company that focuses on education and training. IJODAS provides an online medium for publishing scientific articles from research in the fields of Data Science, Data Mining, Data Communication, and Data Security. IJODAS is registered with the National Library under the online International Standard Serial Number (ISSN) <a title="SK ISSN" href="https://portal.issn.org/resource/ISSN/2715-9930" target="_blank" rel="noopener"><strong>2715-9930</strong></a>.</p> en-US <p><strong>License and Copyright Agreement</strong></p> <p>In submitting the manuscript to the journal, the authors certify that:</p> <ul> <li class="show">They are authorized by their co-authors to enter into these arrangements.</li> <li class="show">The work described has not been formally published before, except in the form of an abstract or as part of a published lecture, review, thesis, or overlay journal.</li> <li class="show">The work is not under consideration for publication elsewhere.</li> <li class="show">The work has been approved by all the author(s) and by the responsible authorities – tacitly or explicitly – of the institutes where the work has been carried out.</li> <li class="show">They secure the right to reproduce any material that has already been published or copyrighted elsewhere.</li> <li class="show">They agree to the following license and copyright agreement.</li> </ul> <p><strong>Copyright</strong></p> <p>Authors who publish with Indonesian Journal of Data and Science agree to the following terms:</p> <ol> <li class="show">Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a&nbsp;<a href="https://creativecommons.org/licenses/by-nc/4.0/" rel="license">Creative Commons 
Attribution-NonCommercial 4.0 International License</a><a href="https://creativecommons.org/licenses/by-nc/4.0/" target="_blank" rel="noopener">&nbsp;(CC BY-NC 4.0)</a>&nbsp;that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.</li> <li class="show">Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.</li> <li class="show">Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.</li> </ol> huzain.azis@umi.ac.id (Huzain Azis) ijodas.journal@gmail.com (Ijodas admin) Tue, 31 Dec 2024 00:00:00 +0000 Comparative Analysis of Fuzzy Logic Models for Depression Prediction: Python and LabVIEW Approaches https://jurnal.yoctobrain.org/index.php/ijodas/article/view/189 <p>Depression is one of the mental disorders with a significant impact on individuals' quality of life and productivity. The diagnostic process for depression, which typically relies on subjective assessment, often encounters challenges of uncertainty and variability in symptoms. This study aims to develop a fuzzy model for predicting depression levels based on five primary symptom variables: worthlessness, concentration, suicidal ideation, sleep disturbance, and hopelessness. The model is implemented on two platforms, Python and LabVIEW, to evaluate the accuracy and consistency of prediction results between these platforms. 
The analysis process comprises data preprocessing, input variable fuzzification, inference using 243 fuzzy rules, and defuzzification to generate a crisp output value classified into four depression levels: No Depression, Mild, Moderate, and Severe. The study results indicate a very small error margin between the two platforms, with error values below 0.01 in each trial. These findings suggest that both Python and LabVIEW can produce nearly identical and consistent predictions. This conclusion supports the effectiveness of fuzzy logic in addressing uncertainty in clinical data, especially for cases of depression with varying symptoms. Nonetheless, there are limitations related to the subjectivity in selecting membership functions and rules, as well as limitations in the number of variables used. Therefore, this study recommends expanding the developed fuzzy model with additional variables or integrating it with machine learning approaches to improve prediction accuracy. These findings are expected to serve as a foundation for the development of fuzzy-based systems in future mental health diagnostics.</p> Nurul Rismayanti, Gilberth Valentino Titaley, Anik Nur Handayani Copyright (c) 2024 Indonesian Journal of Data and Science https://creativecommons.org/licenses/by-nc/4.0 https://jurnal.yoctobrain.org/index.php/ijodas/article/view/189 Tue, 31 Dec 2024 00:00:00 +0000 Bayesian Analysis of Two Parameter Weibull Distribution Using Different Loss Functions https://jurnal.yoctobrain.org/index.php/ijodas/article/view/179 <p>This paper focuses on the Bayesian technique to estimate the parameters of the Weibull distribution. Here, we use both informative and non-informative priors. We calculate the estimators and their posterior risks using different asymmetric and symmetric loss functions. Bayes estimators do not have a closed form under these loss functions. Therefore, we use an approximation approach established by Lindley to get the Bayes estimates. 
A comparative analysis is conducted to compare the suggested estimators using Monte Carlo simulation based on the related posterior risk. We also analyze the impact of distinct loss functions when using various priors.</p> Dler Najmaldin, Mahmut Kara, Yıldırım Demir, Sakir İşleyen Copyright (c) 2024 Indonesian Journal of Data and Science https://creativecommons.org/licenses/by-nc/4.0 https://jurnal.yoctobrain.org/index.php/ijodas/article/view/179 Tue, 31 Dec 2024 00:00:00 +0000 Grid Search Hyperparameter Analysis in Optimizing The Decision Tree Method for Diabetes Prediction https://jurnal.yoctobrain.org/index.php/ijodas/article/view/190 <p>Diabetes is a global health issue that continues to rise, especially in Indonesia, caused by unhealthy lifestyles, poor diets, and genetic factors. Early detection of diabetes risk is crucial to prevent serious complications, and machine learning offers innovative predictive solutions. This research focuses on the development of a diabetes risk prediction model using the Decision Tree algorithm with hyperparameter optimization through the Grid Search technique. The research methodology includes the collection of patient medical data with key attributes such as glucose levels, blood pressure, skin health, insulin, body mass index (BMI), diabetes pedigree, age, and health history. The hyperparameter tuning process is carried out by varying key parameters such as the maximum tree depth (max_depth), the minimum number of samples required to split a node (min_samples_split), and the minimum number of samples required at a leaf node (min_samples_leaf). Grid Search is used to systematically explore hyperparameter combinations in order to find the optimal configuration that can improve the model's performance. The research process includes data preprocessing, splitting the dataset into training and testing sets, model training, and evaluation using accuracy metrics, confusion matrix, and ROC AUC curve. 
The initial results show a model accuracy of 76%, which was then improved to 81% after hyperparameter optimization using Grid Search. The visualization of the decision tree reveals that glucose levels and BMI contribute most to predicting diabetes risk. This research demonstrates the potential of machine learning in supporting the early detection of diabetes, with the Decision Tree algorithm showing promising predictive capabilities. Nevertheless, further research with larger datasets and the integration of other algorithms is highly recommended to improve the accuracy and generalization of the model. The main contribution of this research is the development of a machine learning-based approach that can assist medical personnel in screening for diabetes risk more efficiently and accurately.</p> Desi Anggreani, Hamdani, Nurmisba, Lukman Copyright (c) 2024 Indonesian Journal of Data and Science https://creativecommons.org/licenses/by-nc/4.0 https://jurnal.yoctobrain.org/index.php/ijodas/article/view/190 Tue, 31 Dec 2024 00:00:00 +0000 Performance Comparison of CNN and ResNet50 for Skin Cancer Classification Using U-Net Segmented Images https://jurnal.yoctobrain.org/index.php/ijodas/article/view/200 <p>Skin cancer is a significant global health issue, with melanoma, basal cell carcinoma, and actinic keratosis being the most common types. Early and accurate detection is critical to improve survival rates and treatment outcomes. This study evaluates the performance of Convolutional Neural Networks (CNN) and ResNet50 in classifying segmented images of skin lesions. The dataset, sourced from Kaggle, was pre-processed using U-Net for lesion segmentation to enhance the quality of input data. Both models were trained and evaluated using accuracy, precision, recall, and F1-score metrics. 
The CNN model demonstrated a balanced performance across classes, with a weighted F1-score of 47%, but suffered from overfitting, as indicated by the divergence between training and validation losses. ResNet50 achieved better recall for basal cell carcinoma (100%) but failed to classify actinic keratosis and melanoma, resulting in a macro F1-score of 23%. The findings reveal that U-Net segmentation improved classification focus but was insufficient to address dataset imbalance and model-specific limitations. This study highlights the challenges of skin cancer classification using deep learning and underscores the importance of addressing data imbalance and overfitting. Future research should explore advanced techniques, such as ensemble methods, data augmentation, and transfer learning, to improve the generalization and clinical applicability of these models. The proposed framework serves as a foundation for further investigation into automated skin cancer detection systems.</p> Aris Wahyu Murdiyanto, Dian Hafidh Zulfikar, Bagus Satrio Waluyo Poetro, Alda Cendekia Siregar Copyright (c) 2024 Indonesian Journal of Data and Science https://creativecommons.org/licenses/by-nc/4.0 https://jurnal.yoctobrain.org/index.php/ijodas/article/view/200 Tue, 31 Dec 2024 00:00:00 +0000 Classification of Noni Fruit Ripeness Using Support Vector Machine (SVM) Method https://jurnal.yoctobrain.org/index.php/ijodas/article/view/180 <p>The classification of Noni fruit (Morinda citrifolia) ripeness is essential for maximizing its medicinal benefits and ensuring product quality. This research aimed to classify Noni fruit ripeness using the Support Vector Machine (SVM) method, comparing three kernel functions: linear, Radial Basis Function (RBF), and polynomial. A dataset consisting of images of ripe and unripe Noni fruits was utilized, with preprocessing steps including the extraction of color and texture features. 
Performance evaluation revealed that the RBF kernel achieved the highest accuracy at 86.18%, followed by the polynomial kernel with 84.55%, and the linear kernel with 81.30%. These results suggest that the RBF kernel is the most effective for this classification task, showing superior capability in capturing non-linear patterns and complexities within the dataset.</p> Yudha Islami Sulistya, Maie Istighosah, Maryona Septiara, Abednego Dwi Septiadi, Arif Amrullah Copyright (c) 2024 Indonesian Journal of Data and Science https://creativecommons.org/licenses/by-nc/4.0 https://jurnal.yoctobrain.org/index.php/ijodas/article/view/180 Tue, 31 Dec 2024 00:00:00 +0000 Sugeno Fuzzy Personality Prediction System: An Approach to Overcoming Psychological Measurement Uncertainty https://jurnal.yoctobrain.org/index.php/ijodas/article/view/192 <p>Personality prediction is a significant field in psychological measurement, yet it faces challenges due to psychological data's ambiguous and uncertain nature. This study aims to develop a Sugeno-based fuzzy logic system for predicting personality types according to the Myers-Briggs Type Indicator (MBTI). The dataset includes synthetic personality data, incorporating age, introversion, sensing, thinking, and judging. The fuzzification process converts crisp input values into fuzzy variables, which are then processed using predefined fuzzy rules to generate personality predictions. The defuzzification step yields crisp outputs corresponding to MBTI types, demonstrating the system's ability to handle uncertainty and ambiguity effectively. Implementation and evaluation were conducted using Python and LabVIEW, revealing a satisfactory performance with a low error rate of 0.445. 
This study highlights the potential of fuzzy logic, particularly the Sugeno method, in enhancing accuracy and adaptability in personality prediction, contributing to applications in education, human resource management, and personalized digital services.</p> Nadindra Dwi Ariyanta, Anik Nur Handayani Copyright (c) 2024 Indonesian Journal of Data and Science https://creativecommons.org/licenses/by-nc/4.0 https://jurnal.yoctobrain.org/index.php/ijodas/article/view/192 Tue, 31 Dec 2024 00:00:00 +0000 Development of a Decision Tree Classifier for Breast Cancer Diagnosis Using Fine Needle Aspirate Data https://jurnal.yoctobrain.org/index.php/ijodas/article/view/202 <p>Breast cancer is one of the leading causes of mortality among women globally, necessitating early and accurate detection to improve survival rates. This study leverages machine learning to develop a decision tree classifier for distinguishing between benign and malignant breast masses using the Kaggle Breast Cancer FNA dataset. The dataset underwent rigorous pre-processing, including the removal of irrelevant columns, data cleaning, label encoding, and feature scaling. The model was evaluated using 5-fold cross-validation, achieving an average accuracy of 84.0%, with a test set accuracy of 83.72%. Performance metrics such as precision, recall, and F1-score further validated the model's robustness, with an overall accuracy of 90.24% on the test set. The decision tree classifier demonstrated high interpretability, making it a practical tool for aiding clinical decision-making. While the results are promising, the study highlights opportunities for improvement, including the use of ensemble methods and larger datasets to enhance generalizability. 
This research contributes to the growing body of evidence supporting machine learning applications in medical diagnostics, particularly in breast cancer detection.</p> Agus Halid, I Gusti Ngurah Wikranta Arsa, Rezania Agramanisti Azdy, Agus Aan Jiwa Permana Copyright (c) 2024 Indonesian Journal of Data and Science https://creativecommons.org/licenses/by-nc/4.0 https://jurnal.yoctobrain.org/index.php/ijodas/article/view/202 Tue, 31 Dec 2024 00:00:00 +0000 Probabilistic Graphical Models for Predicting Properties of New Materials Based on Their Composition and Structure https://jurnal.yoctobrain.org/index.php/ijodas/article/view/177 <p>Probabilistic graphical models (PGMs) offer a powerful framework for modeling complex relationships between different components. By integrating information on elemental composition and structural features, these models enable the inference of material properties from a probabilistic perspective. This approach holds promise for accelerating materials discovery and design, as it facilitates the prediction of diverse material characteristics, ranging from electronic and mechanical properties to thermal and optical behavior. The use of PGMs in materials science represents a sophisticated methodology for harnessing data-driven insights to guide the exploration of innovative materials with tailored functionalities. The purpose of this paper is to survey the literature on how data science concepts, big data, and machine learning are exploited to yield computational intelligence. A literature review approach is used to understand the exploitation and use of computational intelligence in leading-edge research and innovation in materials science. The findings illustrate that machine learning can be applied to intricate chemical problems that would otherwise not be tractable. 
Leveraging PGMs presents a promising avenue for predicting the properties of new materials based on their composition and structure.</p> Vusumuzi Malele, Ashley Phala Copyright (c) 2024 Indonesian Journal of Data and Science https://creativecommons.org/licenses/by-nc/4.0 https://jurnal.yoctobrain.org/index.php/ijodas/article/view/177 Tue, 31 Dec 2024 00:00:00 +0000 Classification of Mushroom Edibility Using K-Nearest Neighbors: A Machine Learning Approach https://jurnal.yoctobrain.org/index.php/ijodas/article/view/199 <p>This study investigates the use of the K-Nearest Neighbors (KNN) algorithm for the binary classification of mushroom edibility using a cleaned version of the UCI Mushroom Dataset. The dataset underwent pre-processing techniques such as modal imputation, one-hot encoding, z-score normalization, and feature selection to ensure data quality. The model was trained on 80% of the dataset and evaluated on the remaining 20%, achieving an overall accuracy of 99%. Evaluation metrics, including precision, recall, and F1-score, confirmed the model's effectiveness in distinguishing between edible and poisonous mushrooms, with minimal misclassification errors. Despite its high performance, the study identified scalability as a limitation due to the computational complexity of KNN, suggesting that future research should explore alternative algorithms for enhanced efficiency. 
This research underscores the importance of pre-processing and hyperparameter optimization in building reliable classification models for food safety applications.</p> Fadhila Tangguh Admojo, Made Leo Radhitya, Hamada Zein, Ahmad Naswin Copyright (c) 2024 Indonesian Journal of Data and Science https://creativecommons.org/licenses/by-nc/4.0 https://jurnal.yoctobrain.org/index.php/ijodas/article/view/199 Tue, 31 Dec 2024 00:00:00 +0000 Predictive Modeling of Air Quality Levels Using Decision Tree Classification: Insights from Environmental and Demographic Factors https://jurnal.yoctobrain.org/index.php/ijodas/article/view/201 <p>Air pollution poses a significant global challenge, adversely impacting public health and environmental sustainability. Understanding the factors influencing air quality is essential for developing effective mitigation strategies. This study aims to analyse key environmental and demographic factors, such as PM2.5 concentration, population density, and proximity to industrial areas, to predict air quality levels using a Decision Tree model. The dataset, comprising 5000 samples, was pre-processed by encoding the target variable and applying Z-score normalization to numerical features. The model was trained on 80% of the data and evaluated on the remaining 20%, achieving an accuracy of 93%. Evaluation metrics, including a classification report and confusion matrix, demonstrated the model's effectiveness in distinguishing between four air quality categories: Good, Moderate, Poor, and Hazardous. PM2.5 emerged as the most critical predictor, followed by demographic and industrial factors. These findings underscore the potential of machine learning models in providing actionable insights for air quality management. The results contribute to public policy by highlighting the need for targeted interventions in high-risk areas and the importance of incorporating environmental data into urban planning. 
Future work should focus on expanding the feature set and exploring ensemble techniques to further enhance predictive accuracy and robustness.</p> I Gede Iwan Sudipa, Muhammad Habibi, Ery Setiyawan Jullev Atmadji, Ika Arfiani Copyright (c) 2024 Indonesian Journal of Data and Science https://creativecommons.org/licenses/by-nc/4.0 https://jurnal.yoctobrain.org/index.php/ijodas/article/view/201 Tue, 31 Dec 2024 00:00:00 +0000