Handling Multicollinearity in a Regression of GOTO Stock Prices Using PCA, Ridge, LASSO, and PLS (Penanganan Multikolinieritas dalam Regresi Saham GOTO Menggunakan PCA, Ridge, LASSO, dan PLS)

  • Fachri Faisal, Universitas Bengkulu
  • Ratna Widayati, Department of Mathematics, University of Bengkulu
  • Zulfia Memi Mayasari
  • Siska Dwi Kumala
  • Aisyah Nooravieta Setiawan, Department of Mathematics, University of Bengkulu
  • Nur El Hasanah, Mathematics student, FMIPA, Universitas Bengkulu, Bengkulu
  • Revika Putri Asharia, Mathematics student, FMIPA, Universitas Bengkulu, Bengkulu
Keywords: stock prices, multicollinearity, LASSO regression, ridge regression, PLS regression

Abstract

This study addresses multicollinearity in a multiple regression model of the daily closing stock price of PT GoTo Gojek Tokopedia Tbk (GOTO) from 2022 to early 2025. Multicollinearity occurs when independent variables are highly correlated, which inflates the variance of the parameter estimates and makes them unreliable. GOTO’s stock price has been highly volatile since its Initial Public Offering (IPO) in April 2022, so an appropriate analytical approach is needed to identify the factors that influence its price movements. The closing price is the dependent variable; the opening price, high price, low price, and trading volume are the independent variables. The methods are multiple regression and five approaches to handling multicollinearity: variable elimination, Principal Component Analysis (PCA), Ridge Regression, LASSO Regression, and Partial Least Squares (PLS) Regression. The initial multiple regression model achieved an R² of 0.9990 and an RMSE of 2.88, but Variance Inflation Factor (VIF) analysis indicated severe multicollinearity. Among the alternative methods, PLS Regression performed best, with an R² of 0.9990 and an RMSE of 0.0318. The study concludes that PLS Regression is the most stable and accurate of the methods compared for addressing multicollinearity and improving the prediction of GOTO’s stock price.
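
The VIF diagnostic mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual definition VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the remaining predictors. The data are synthetic stand-ins, not the GOTO series, and the function name and column labels are hypothetical. A commonly quoted rule of thumb treats VIF values above about 10 as a sign of serious multicollinearity; the abstract does not state which threshold the authors applied.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for each column j of X.

    R_j^2 is the R-squared from regressing column j on all other
    columns plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    vifs = np.empty(n_cols)
    for j in range(n_cols):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n_rows), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        ss_res = float(residuals @ residuals)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        # 1 / (1 - R^2) simplifies to SS_total / SS_residual.
        vifs[j] = ss_tot / ss_res
    return vifs


# Synthetic stand-in data (NOT the GOTO series): three price columns that
# track a shared level, plus an unrelated volume column.
rng = np.random.default_rng(0)
n = 300
level = np.linspace(50.0, 150.0, n) + rng.normal(0.0, 3.0, n)
open_price = level + rng.normal(0.0, 0.5, n)
high_price = level + np.abs(rng.normal(0.0, 1.0, n))
low_price = level - np.abs(rng.normal(0.0, 1.0, n))
volume = rng.lognormal(mean=10.0, sigma=0.5, size=n)

names = ["open", "high", "low", "volume"]
X = np.column_stack([open_price, high_price, low_price, volume])
vif = dict(zip(names, variance_inflation_factors(X)))
for name, value in vif.items():
    print(f"{name:>6}: VIF = {value:.1f}")
```

In this synthetic setup the three price columns share almost all of their variation, so their VIF values come out very large, while the unrelated volume column stays close to 1. Daily open, high, and low prices of a single stock typically move together in the same way, which is the situation the study sets out to handle.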

Abstrak (translated from Indonesian)

This study aims to handle multicollinearity in a multiple regression model of the daily closing stock price of PT GoTo Gojek Tokopedia Tbk (GOTO) from 2022 to early 2025. Multicollinearity occurs when the independent variables are strongly correlated with one another, which makes the parameter estimates inefficient and less accurate. GOTO's stock price has been highly volatile since the IPO in April 2022, so an appropriate analytical approach is needed to identify the factors that influence price movements. The data use the variable Terakhir (closing price) as the dependent variable and Pembukaan (opening price), Tertinggi (high), Terendah (low), and Volume (trading volume) as the independent variables. The methods used are multiple regression and several approaches to handling multicollinearity: variable elimination, Principal Component Analysis (PCA), Ridge Regression, LASSO Regression, and Partial Least Squares (PLS) Regression. The initial model produced an R² of 0.9990 and an RMSE of 2.88, but the VIF values indicated high multicollinearity. After the alternative methods were applied, PLS Regression gave the best performance, with R² = 0.9990 and RMSE = 0.0318. PLS Regression is therefore judged the most stable and accurate method for overcoming multicollinearity and improving the accuracy of GOTO stock price prediction.

Published: 2026-03-11
Section: Articles