Selection of Variables in Generalized Linear Mixed Model for Smoker in Jambi Province
R Warti1, K A Notodiputro2, B Sartono2

1 Department of Mathematics Education, UIN Sulthan Thaha Saifuddin, Jambi, 36363, Indonesia.
2 Department of Statistics, IPB University, Bogor, 16680, Indonesia.


Abstract

Indonesia currently has more than 67 million adult smokers, or about 39% of the adult population. Data from the Ministry of Health's 2018 Basic Health Research (Riskesdas) showed that smoking prevalence among the population aged 18 years increased from 7.2% to 9.1%. Many factors lead a person to smoke, both personal and environmental. The statistical problem that arises is how to select the factors that influence people to smoke. This study aims to select variables in a generalized linear mixed model, using smokers in Jambi Province as the case. The respondents were 160 people taken from the 2017 Indonesia Demographic and Health Survey (SDKI) data. The variables were selected using the Lasso penalty and the Boosting function, with the EM and REML algorithms used for estimation. The analysis shows that the variables selected by the two methods are occupation, level of welfare, and family members who smoke. Based on the AIC value, the best model was obtained from variable selection with the Boosting function and the REML algorithm.
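To make the two selection ideas in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It shows the soft-thresholding step that lets a Lasso penalty set small coefficients exactly to zero, and the AIC formula used to compare fitted models. It assumes a simplified setting with no random effects and no EM or REML estimation. The covariate names `x1` to `x3`, the coefficient values, and the penalty `lam` are hypothetical and chosen only for illustration.

```python
def soft_threshold(z, lam):
    """Lasso shrinkage step: pull z toward zero by lam.

    Values with |z| <= lam become exactly 0.0, which is what
    removes a variable from the model.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical unpenalized coefficient estimates (illustration only,
# not results from the study).
estimates = {"x1": 0.9, "x2": -0.05, "x3": -0.6}
lam = 0.1  # hypothetical penalty strength

# Apply the Lasso shrinkage to each coefficient.
shrunk = {name: soft_threshold(b, lam) for name, b in estimates.items()}

# A variable is "selected" when its shrunken coefficient is nonzero.
selected = [name for name, b in shrunk.items() if b != 0.0]
print(selected)
```

In this toy setting `x2` is dropped because its estimate is smaller in magnitude than `lam`. Competing fitted models would then be ranked by calling `aic` with each model's log-likelihood and parameter count, preferring the smallest value.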

Keywords: EM Algorithm, REML Algorithm, Boosting, Lasso, Smoker, Variable Selection

Topic: Mathematics

AASEC 2020 Conference | Conference Management System