An Entropy Analysis of the Cirebon Language Script Using the Ternary Huffman Code Algorithm
Abdul Kodir, Ray Fajar, Asep Solih Awalluddin, Uus Ruswandi, Nanang Ismail, Deni Miharja

UIN Sunan Gunung Djati Bandung


Abstract

Entropy is a statistical parameter that measures the average amount of information carried by each symbol in a text. Every language has statistical regularities and a degree of redundancy that are not obvious from the text itself. These properties can be exploited to build text compression tools that make efficient use of resources. This study analyzes the entropy of Cirebon-language text for the purpose of text compression with the ternary Huffman code algorithm. The Cirebon language was chosen because of its distinctive characteristics. The probability of each symbol in the Cirebon regional text is used to calculate the entropy, which then serves as the reference against which the compression of the Cirebon script is assessed. The results show an entropy of 2.508 bits per symbol for the Cirebon language script, with an expected code length of 2.565 bits per symbol. The estimated compression efficiency (the ratio of entropy to expected code length) with the ternary Huffman code is 97.77%, and the compression rate is 0.51308.
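The procedure summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the sample string are hypothetical, and the sample is a placeholder rather than Cirebon text. The sketch also assumes that entropy is measured in base 3, so that it is directly comparable to the average length of a ternary codeword; the abstract does not state how the two figures were put on a common scale.

```python
import heapq
import math
from collections import Counter
from itertools import count


def symbol_probabilities(text):
    """Relative frequency of each symbol in the text."""
    counts = Counter(text)
    total = sum(counts.values())
    return {sym: n / total for sym, n in counts.items()}


def entropy(probs, base=3):
    """Shannon entropy in units of the given base (base 3: ternary digits)."""
    return -sum(p * math.log(p, base) for p in probs.values() if p > 0)


def ternary_huffman(probs):
    """Return {symbol: codeword over '0', '1', '2'} for a ternary Huffman code."""
    tie = count()  # tie-breaker so the heap never has to compare the dicts
    heap = [(p, next(tie), {sym: ""}) for sym, p in probs.items()]
    # Each merge replaces 3 nodes with 1, so the leaf count must be odd.
    # Pad with one zero-probability dummy leaf when it is even.
    if len(heap) % 2 == 0:
        heap.append((0.0, next(tie), {}))
    heapq.heapify(heap)
    while len(heap) > 1:
        merged, total = {}, 0.0
        for digit in "012":  # merge the three least probable nodes
            p, _, codes = heapq.heappop(heap)
            total += p
            for sym, code in codes.items():
                merged[sym] = digit + code  # prepend this branch's digit
        heapq.heappush(heap, (total, next(tie), merged))
    return heap[0][2]


def expected_length(probs, codes):
    """Average codeword length in ternary digits per symbol."""
    return sum(p * len(codes[sym]) for sym, p in probs.items())


sample = "abracadabra"  # placeholder input, NOT the Cirebon corpus
probs = symbol_probabilities(sample)
codes = ternary_huffman(probs)
H = entropy(probs)
L = expected_length(probs, codes)
print(f"entropy = {H:.3f}, expected length = {L:.3f}, efficiency = {H / L:.2%}")
```

Applied to the reported figures, the same efficiency ratio is 2.508 / 2.565, or about 0.978, which agrees with the stated efficiency to within rounding.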

Keywords: entropy, Cirebon language, ternary Huffman code, efficiency, compression rate

Topic: Computer and Communication Engineering

AASEC 2020 Conference | Conference Management System