Implementation of Preprocessing and Synonym Expansion for an Indonesian-Language News Retrieval System

Authors

  • Adrian Suparto, Universitas Multi Data Palembang
  • Michael Joy Clement
  • Abdul Rahman
  • Hafiz Irsyad

DOI:

https://doi.org/10.36982/jseci.v3i01.5405

Keywords:

cosine similarity, Indonesian language, information retrieval, query expansion, TF-IDF

Abstract

In Indonesian information retrieval (IR) systems, vocabulary mismatch between user queries and target documents is often a major obstacle to obtaining relevant search results. This research examines the effectiveness of synonym-based query expansion in improving search relevance. The system uses TF-IDF weighting and cosine similarity to measure the closeness between a query and each document. A total of 10 queries were tested against a collection of news documents, with query keywords expanded manually using synonyms drawn from the KBBI (Kamus Besar Bahasa Indonesia). Evaluation used Precision@20 as the main metric. Average precision rose from 0.51 without query expansion to 0.725 after synonyms were added to the query. This indicates that expanding the query with terms of equivalent meaning can improve search accuracy in a lexically rich natural language such as Indonesian, and that semantic expansion techniques have considerable potential for improving IR system performance. In future work, automated approaches such as semantic embeddings or digital synonym mapping could offer a more extensive and efficient alternative to manual expansion.
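The pipeline described in the abstract (preprocessing, synonym expansion, TF-IDF weighting, cosine-similarity ranking, and Precision@k) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the `SYNONYMS` table is a hypothetical stand-in for the manual KBBI lookup, the tokenizer omits stemming and stopword removal, and the smoothed IDF formula `log(N/df) + 1` is one common variant chosen here for simplicity.

```python
import math
from collections import Counter

# Hypothetical synonym table (assumption); the study expanded queries
# manually with synonyms taken from the KBBI.
SYNONYMS = {"mobil": ["kendaraan", "otomobil"]}

def tokenize(text):
    # Minimal preprocessing: lowercase, replace non-letters with spaces, split.
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()

def expand_query(tokens, synonyms):
    # Append the listed synonyms of every query token to the query.
    expanded = list(tokens)
    for token in tokens:
        expanded.extend(synonyms.get(token, []))
    return expanded

def build_idf(docs_tokens):
    # Assumed IDF variant: log(N / df) + 1.
    n = len(docs_tokens)
    df = Counter(term for doc in docs_tokens for term in set(doc))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}

def tfidf_vector(tokens, idf):
    # Sparse TF-IDF vector; terms unseen in the collection are dropped.
    tf = Counter(t for t in tokens if t in idf)
    return {term: count * idf[term] for term, count in tf.items()}

def cosine(a, b):
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query, doc_vectors, idf, synonyms=None):
    # Rank documents by cosine similarity; return indices with a nonzero score.
    tokens = tokenize(query)
    if synonyms:
        tokens = expand_query(tokens, synonyms)
    query_vector = tfidf_vector(tokens, idf)
    scored = [(cosine(query_vector, dv), i) for i, dv in enumerate(doc_vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in scored if score > 0.0]

def precision_at_k(ranked, relevant, k=20):
    # Fraction of the top-k positions occupied by relevant documents.
    return sum(1 for i in ranked[:k] if i in relevant) / k

# Toy collection (invented for illustration only).
docs = [
    "Harga mobil naik tahun ini",
    "Penjualan kendaraan listrik meningkat",
    "Cuaca hujan di Palembang",
]
docs_tokens = [tokenize(d) for d in docs]
idf = build_idf(docs_tokens)
doc_vectors = [tfidf_vector(t, idf) for t in docs_tokens]

baseline = search("mobil", doc_vectors, idf)
expanded = search("mobil", doc_vectors, idf, synonyms=SYNONYMS)
```

On this toy collection, the unexpanded query matches only the document that literally contains "mobil", while the expanded query also retrieves the document about "kendaraan". That is the vocabulary-mismatch effect the abstract describes, although the toy data says nothing about the size of the improvement on a real news collection.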

Published

2025-06-30

Issue

Section

Articles
Abstract views: 38 / PDF downloads: 17