COVID-19 Inpatients in Southern Iran: A Time Series Forecasting for 2020-2021


people. The province has 14 cities, 22 hospitals with 2514 active beds, supervised by Hormozgan University of Medical Sciences (HUMS). During the spread of the pandemic, the university has equipped around 405 hospital beds and 100 ICU beds specifically for admitting patients with COVID-19 (10).

Objective
This study aimed to forecast the COVID-19 inpatients in Hormozgan province hospitals until 20 March 2021 (end of the Iranian calendar year).

Methods
This time-series study was conducted in Hormozgan province, southern Iran, in 2020. The methods are described below.

Dataset
This time-series study was conducted in Hormozgan province in southern Iran from 20 February to 13 November 2020. The data were obtained from the time series reported daily to the statistics center of the HUMS vice-chancellor for Treatment Affairs. The data included the daily new counts of 1) Confirmed inpatients (PCR positive), 2) Suspected inpatients (PCR negative), 3) Deaths, 4) Recovered patients, 5) Admitted cases to ICUs, 6) ICU discharged cases, and 7) ICU inpatient service day. These variables were chosen based on earlier studies (5, 7-9).

Unit-Root Test
In this study, two models were applied for forecasting. For each variable, the model with the lowest root mean square error (RMSE) was chosen and reported. Before estimating the models, it was necessary to test each series for a unit root.
Stationarity is a characteristic of a time series indicating that its statistical properties do not depend on time. In a stationary series, the mean and variance stay constant over time, or the series is stationary around a deterministic trend, and a unit-root test checks whether this holds. This makes a time series predictable out of sample. There are several unit-root tests. The two most common are the augmented Dickey-Fuller (ADF) and Kwiatkowski-Phillips-Schmidt-Shin (KPSS) tests. They differ in their null hypotheses: ADF takes the presence of a unit root as the null, whereas KPSS takes stationarity as the null. KPSS is only applicable to series with a constant level or a deterministic upward or downward trend. Sometimes, in series with deterministic trends, KPSS does not necessarily reject the null hypothesis. So, for ascending trends, it is necessary to include the trend in the model settings.
On the other hand, in such situations, the ADF test may indicate that the series is non-stationary, in contrast to KPSS (11). Since the frequency of COVID-19 cases might be affected by health policies, holidays, and other interfering factors, KPSS would not be a good option. Therefore, in this study, we used the ADF unit-root test.
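The idea behind the test can be illustrated with a minimal sketch. It is not the study's code: it implements the simple (non-augmented) Dickey-Fuller regression with a constant using only the Python standard library, whereas the study used the augmented test. The synthetic random walk, the function name `dickey_fuller_t`, and the use of the asymptotic 5% critical value of -2.86 are assumptions made for illustration.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic of rho in: (y_t - y_{t-1}) = alpha + rho * y_{t-1} + e_t.

    Simplified Dickey-Fuller regression with a constant and no lagged
    differences (the augmented test adds lagged differences as regressors).
    """
    x = series[:-1]                                          # y_{t-1}
    dy = [b - a for a, b in zip(series[:-1], series[1:])]    # y_t - y_{t-1}
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    rho = sxy / sxx
    alpha = mean_dy - rho * mean_x
    rss = sum((di - alpha - rho * xi) ** 2 for xi, di in zip(x, dy))
    se_rho = math.sqrt(rss / (n - 2) / sxx)
    return rho / se_rho


# Asymptotic 5% Dickey-Fuller critical value for the constant-only case.
CRITICAL_5PCT = -2.86

random.seed(0)
# Synthetic random walk standing in for a daily count series (illustrative only).
level = [0.0]
for _ in range(499):
    level.append(level[-1] + random.gauss(0, 1))
first_diff = [b - a for a, b in zip(level[:-1], level[1:])]

t_level = dickey_fuller_t(level)
t_diff = dickey_fuller_t(first_diff)
print(f"level: t = {t_level:.2f}; first difference: t = {t_diff:.2f}")
print("unit root rejected for first difference:", t_diff < CRITICAL_5PCT)
```

A t-statistic below the critical value rejects the unit-root null. A series that needs one round of differencing before the null is rejected is described as integrated of order one, I(1).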

Forecasting Model
The autoregressive integrated moving average model with explanatory variables (ARIMAX) and the FbProphet method were applied to forecast health resources for the 127 days remaining to the end of the Iranian official calendar year. We used Python for data analysis. For each variable, the model with the lowest RMSE was chosen and reported.
Based on the ADF unit-root test results, all the study variables were forecasted from their first-differenced series by the ARIMAX and FbProphet methods. Then, 80% of the data were used as training data and the rest as test data.
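The two preprocessing steps mentioned here, first differencing and an 80/20 split, might look roughly as follows. This is a hedged sketch with made-up counts. The helper names are hypothetical, and the split is assumed to be chronological (earliest 80% for training), which the text does not state explicitly.

```python
def first_difference(series):
    """Return y_t - y_{t-1} for t = 1..n-1 (one value shorter than the input)."""
    return [b - a for a, b in zip(series[:-1], series[1:])]


def undifference(first_level, diffs):
    """Rebuild the level series from its first value and its differences."""
    levels = [first_level]
    for d in diffs:
        levels.append(levels[-1] + d)
    return levels


def chronological_split(series, train_fraction=0.8):
    """Earliest observations go to train, the most recent ones to test."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Made-up daily counts, for illustration only.
daily_counts = [12, 15, 14, 20, 26, 25, 31, 30, 38, 41]

diffed = first_difference(daily_counts)
train, test = chronological_split(diffed)
print("differenced:", diffed)
print("train:", train, "test:", test)
```

Forecasts made on the differenced scale have to be cumulated back, as in `undifference`, before they can be read as daily counts.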
ARIMAX is an extension of the ARIMA model that adds explanatory variables. It includes three parameters: p (the order of the autoregressive term), d (the number of times the series is differenced to make it stationary), and q (the order of the moving average term).
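For reference, one common way of writing an ARIMAX(p, d, q) model is given below. The notation is introduced here for illustration and does not come from the study: c is a constant, the phi and theta terms are the autoregressive and moving average coefficients, x_{k,t} are the m explanatory variables with coefficients beta_k, and epsilon_t is the error term. Software packages differ in exactly how the explanatory variables enter the model.

```latex
\Delta^{d} y_t = c
  + \sum_{i=1}^{p} \phi_i \, \Delta^{d} y_{t-i}
  + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
  + \sum_{k=1}^{m} \beta_k \, x_{k,t}
  + \varepsilon_t ,
\qquad \Delta y_t = y_t - y_{t-1}
```

With d = 1, the model is fitted to the day-to-day changes of the series rather than to its levels.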
After checking the autocorrelation function and the partial autocorrelation function, we found that both moving average (MA) and autoregressive (AR) terms, q and p, were needed. Then, using the "statsmodels" package in Python, we determined the optimum values of p and q with the Akaike information criterion (AIC), based on the Box-Jenkins methodology (12). It should be noted that, from the ADF results, the value of d, the minimum differencing needed to make the series stationary, was one.
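AIC-based order selection can be sketched in a few lines. This is a deliberately reduced illustration, not the study's procedure: it compares only p = 0 and p = 1 with q fixed at 0, fits by ordinary least squares with the standard library instead of "statsmodels", and uses the Gaussian form of the AIC up to an additive constant. The simulated series and all function names are assumptions.

```python
import math
import random


def ar_residuals(y, p):
    """Least-squares residuals of an AR(p) model with intercept, p in {0, 1}.

    Both orders are fitted on y[1:] so that their AIC values are comparable.
    """
    target = y[1:]
    n = len(target)
    mean_t = sum(target) / n
    if p == 0:
        return [v - mean_t for v in target]
    if p != 1:
        raise ValueError("this sketch only handles p = 0 or p = 1")
    lagged = y[:-1]
    mean_l = sum(lagged) / n
    sxx = sum((l - mean_l) ** 2 for l in lagged)
    sxy = sum((l - mean_l) * (v - mean_t) for l, v in zip(lagged, target))
    phi = sxy / sxx
    const = mean_t - phi * mean_l
    return [v - const - phi * l for l, v in zip(lagged, target)]


def gaussian_aic(residuals, n_params):
    """AIC for Gaussian errors, up to an additive constant: n*ln(RSS/n) + 2k."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params


random.seed(1)
# Simulated AR(1) series with coefficient 0.7 (illustrative data only).
series = [0.0]
for _ in range(399):
    series.append(0.7 * series[-1] + random.gauss(0, 1))

# Count the intercept plus the AR coefficients as parameters.
aic_by_order = {p: gaussian_aic(ar_residuals(series, p), n_params=p + 1) for p in (0, 1)}
best_p = min(aic_by_order, key=aic_by_order.get)
print(aic_by_order, "-> selected p =", best_p)
```

A full search would loop over a grid of (p, q) pairs in the same way and keep the pair with the smallest AIC.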
ARIMAX models with each set of explanatory variables, namely official holidays, weekend holidays, and all holidays (official and weekend holidays combined), were fitted separately. Moreover, ARIMAX was estimated for all the variables of the study. Ultimately, the model with the minimum mean squared error (MSE) and RMSE was chosen.
In contrast, the FbProphet model uses machine learning algorithms to forecast. In such models, which learn from the data history, the training model is selected according to the characteristics of that history and then used for out-of-sample forecasting (13). The Prophet model, published by Facebook in 2017, is an open-source framework for forecasting time series based on an additive model (13)(14)(15). It is widely used in Python. The model treats a time series as the sum of trend, seasonal, and holiday components:
$$y(t) = g(t) + s(t) + h(t) + \varepsilon_t$$
In this formula, g(t), s(t), and h(t) represent the trend, seasonal, and holiday components, respectively. The holiday component h(t) captures potentially irregular changes occurring over one or two days. $\varepsilon_t$ is the error term and is assumed to be normally distributed.
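The additive structure can be made concrete with a toy example. The sketch below does not call the Prophet library and does not reproduce its fitting procedure. It only builds the noise-free part of a signal from three hand-written components, with a linear trend, a weekly sine wave, and a fixed holiday bump chosen as assumptions for illustration.

```python
import math


def trend(t):
    """g(t): a linear trend (assumed shape, for illustration)."""
    return 50.0 + 0.5 * t


def weekly_seasonality(t):
    """s(t): a sine wave with a 7-day period (assumed shape)."""
    return 5.0 * math.sin(2 * math.pi * t / 7)


def holiday_effect(t, holiday_days, bump=-8.0):
    """h(t): a fixed shift on days listed as holidays, zero otherwise."""
    return bump if t in holiday_days else 0.0


def additive_signal(t, holiday_days):
    """Noise-free part of y(t) = g(t) + s(t) + h(t) + error."""
    return trend(t) + weekly_seasonality(t) + holiday_effect(t, holiday_days)


holidays = {3, 10}  # hypothetical day indices
for day in range(14):
    print(day, round(additive_signal(day, holidays), 2))
```

Because the components simply add up, each one can be inspected or changed on its own, which is the interpretability argument usually made for this family of models.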
The specification of Prophet is similar to the generalized additive model (GAM) proposed by Hastie and Tibshirani in 1987 (16). Although this formulation gives up some of the inferential advantages of ARIMAX, GAM has several other advantages. First, its flexibility makes it easy to fit seasonality with different periods and permits the analyst to make different assumptions about the trends. Secondly, unlike ARIMAX models, it does not require the variables to be measured at regular intervals or missing data to be interpolated, i.e., there is no need for interpolation when outliers are deleted. Thirdly, quick fitting enables the analyst to explore and visualize different specifications. The fourth advantage is that the forecasting models have simple, interpretable parameters, and hence, the analyst can conveniently change his/her assumptions (15).

Results
The results of the unit-root test revealed that the variables were not stationary in levels. However, they became stationary after first differencing, i.e., they were I(1) (Table 1).
The best degrees for MA and AR components were chosen based on the results obtained from seasonal and non-seasonal ARIMAX models using the optimal algorithm of Box and Jenkins in Python. Moreover, Table 2 shows the selected models which offer the best forecasting based on AIC among seasonal and non-seasonal models. Each of the ARIMAX seasonal and non-seasonal models was separately estimated with explanatory variables of weekend holidays (fwholid), official holidays (holidays), and all the holidays (allholi). Later, given the RMSE and MSE criteria, the final model was selected (Table 2).
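The three explanatory variables could be built as 0/1 indicator columns, one row per calendar day. The sketch below is an assumption about how this might be done. The official holiday dates are illustrative picks from the holiday periods mentioned in the text, not the study's holiday list. Friday is assumed to be the weekly holiday, and "allholi" is assumed to flag a day that is either kind of holiday.

```python
from datetime import date, timedelta

# Illustrative dates only; not the holiday calendar used in the study.
OFFICIAL_HOLIDAYS = {date(2020, 5, 24), date(2020, 6, 4)}
# Assumption: Friday is the weekly holiday (date.weekday() == 4).
WEEKEND_WEEKDAYS = {4}


def holiday_regressors(start, n_days):
    """Build one row of 0/1 holiday indicators per calendar day."""
    rows = []
    for offset in range(n_days):
        day = start + timedelta(days=offset)
        official = int(day in OFFICIAL_HOLIDAYS)
        weekend = int(day.weekday() in WEEKEND_WEEKDAYS)
        rows.append({
            "date": day,
            "holidays": official,                  # official holidays
            "fwholid": weekend,                    # weekend holidays
            "allholi": int(official or weekend),   # either kind of holiday
        })
    return rows


rows = holiday_regressors(date(2020, 5, 20), 7)
for row in rows:
    print(row["date"], row["holidays"], row["fwholid"], row["allholi"])
```

Each column would then be supplied, one at a time, as the explanatory variable of a separate ARIMAX fit, and the fits compared by RMSE and MSE.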
In this way, it can be implied that the best explanatory variable that can help forecast the frequency of inpatients (both confirmed and suspected), ICU admission, and discharge is weekend holidays. In other words, the aforementioned variables can be explained more clearly by the use of this explanatory variable (weekend holidays). However, official holidays are the best explanatory variable for the frequency of deaths and ICU-inpatient service day. All official holidays and weekends (allholi) are the best explanatory variables for the recovered (discharged) patients.
Each variable was forecasted separately in Python with the optimized ARIMAX models described above and with the Prophet model. Both models produced similar forecasts for the time series, although the forecasted values differed. Table 3 compares the validity of the models using the mean absolute error (MAE) and MSE criteria. Based on this table, the choice between the linear model (ARIMAX) and the nonlinear model (Prophet) was made using the MSE and MAE criteria (highlighted cells). Figure 1 forecasts the trend of "Confirmed inpatients", "Suspected inpatients", "Deaths", and "Recovered patients" according to the ARIMAX and Prophet models in Hormozgan during 2020-2021. It indicates that the highest rate of inpatients and deaths occurred after the May and June holidays (May 20 to 25 and June 3 to 6). Figure 2 forecasts the trend of "Admitted cases to ICUs", "ICU discharged cases", and "ICU Inpatients service day" according to the ARIMAX and Prophet models in Hormozgan in 2020. According to the time series, the figure forecasts upward trends for the ICU variables in the upcoming months to the end of the Iranian calendar year (March 21, 2021).
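The error criteria used for this comparison are simple to compute. The following sketch uses made-up observed and forecasted values, not the figures in Table 3, and the dictionary keys are only labels.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))


# Made-up test-set values, for illustration only.
actual = [10, 12, 9, 15]
forecasts = {
    "ARIMAX": [11, 10, 9, 18],
    "Prophet": [10, 13, 10, 14],
}

scores = {name: {"MAE": mae(actual, f), "MSE": mse(actual, f), "RMSE": rmse(actual, f)}
          for name, f in forecasts.items()}
best_model = min(scores, key=lambda name: scores[name]["MSE"])
print(scores)
print("lowest MSE:", best_model)
```

MSE and RMSE penalize large errors more heavily than MAE, so the two criteria can rank models differently when one model makes a few large misses.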

Discussion
This paper forecasted COVID-19 inpatient time series in Hormozgan province (Iran) using the ARIMAX and Prophet models based on data from 20 February 2020 to 13 November 2020. This analysis was carried out for a better understanding of COVID-19 incidence trends in the province. The findings may inform policy-making and help estimate the budget required for maintaining and extending COVID-19 services by healthcare providers. Unlike earlier studies (7), our study showed that ARIMAX was not valid for forecasting confirmed and suspected cases. The values forecasted by ARIMAX were negative and did not match the observed statistics. In other words, ARIMAX forecasting is not dependable based on the evidence in our study. Therefore, we forecasted the time series data using the Prophet model. Appendix 1 reports the mean weekly forecasts of the variables for both models. Researchers have reported a direct correlation between gatherings/holidays and the incidence of COVID-19 (17). As Figure 1 shows, Hormozgan experienced the highest rate of inpatients and deaths following the May and June holidays (May 20 to 25 and June 3 to 6). To avoid repeating this experience, Hormozgan COVID-19 policy makers made appropriate decisions prior to the next holidays (26 to 30 August), relying on the experience gained from the June holidays and on the forecasted trend of inpatients. This successful management was achieved even though Iran as a whole reported an upward trend after the August holidays for all the COVID-19 indicators (Figure 1).
The measures taken based on the decisions could be classified into four main sections including: 1) Prevention measures such as mandatory use of masks, social distancing, use of IT infrastructure potentials for online services (telecommuting, online education, online shopping, etc.), screening, obligating public and private organizations and centers for implementation of COVID-19 protocols, and also limiting travelling between cities and even forbidding familial ceremonies and gatherings, 2) Diagnostic and treatment measures such as increasing the centers for PCR testing, dedicating beds and equipment for patients with COVID-19 in wards and ICUs, etc., 3) Educational measures including online and offline classes/workshops for the healthcare staff and the public, creating contents to be shared on social media, preparation of print materials for distribution all around the province, etc., and 4) Supportive measures including facilitating administrative rules and regulations such as telecommuting, online education, and financing for all the measures taken.
Based on time series forecasting, the number of confirmed cases, recovered cases, deaths, and ICU-inpatient service days will have a downward trend, while the number of ICU inpatients and ICU discharges will show a mild upward trend (Figures 1 and 2). Given the downward trend of inpatients and deaths, it could be concluded that the decisions made by COVID-19 policy makers as well as people's compliance with health protocols effectively controlled the pandemic in Hormozgan province in October 2020. This was achieved amid unprecedented records of deaths (400 cases per day) reported in Iran, during which most news agencies relayed the condition of hospitals and the lack of facilities for the new wave of COVID-19 by publishing "NO EMPTY BEDS" headlines (18). Moreover, the death time series, which is one of the most important epidemiological indicators in pandemics, is forecasted to follow a downward trend in Hormozgan until 20 March 2021 (the end of the current Iranian calendar year). It is even predicted to reach zero by the onset of the Iranian new year.
Time series forecasted upward trends for the ICU variables (Figure 2) for the upcoming months to the end of the Iranian calendar year (March 21, 2021). Increased admission could be explained by the HUMS' decision for increasing ICU beds for COVID-19 cases after the July peak. This decision facilitated the transfer of patients to ICU beds in the upcoming months. In other words, there were strict requirements for transferring patients to ICUs during the July peak due to a high number of cases and limited number of ICU beds. However, in upcoming months with a downward trend of inpatients, patients with moderate conditions might also be transferred to ICUs to be provided with a higher quality of healthcare in case of availability of ICU beds.
According to the observed data, the number of suspected admissions began to decline in mid-April, while the number of confirmed admissions continued to rise until it reached over 400 per day on July 11, 2020, and then declined until August 21, when it reached 168. This discrepancy between the trends of confirmed and suspected cases over the same period could be explained by 1) overdiagnosis and 2) the limited capacity of HUMS laboratories for performing COVID-19 PCR tests during the early months of the pandemic. Moreover, the reduction of suspected inpatients in June and July could be explained by the shortage of hospital beds dedicated to confirmed cases of COVID-19 infection, which, in turn, led to suspected cases not being admitted.
In the present work, Prophet outperformed ARIMAX. However, earlier studies using different forecasting models showed that confirmed cases of COVID-19 would have an upward trend in the world (7). An earlier study comparing forecasting methods reported that Prophet-like models, which are a kind of machine-learning-based forecasting model, forecasted future time series more accurately (8). The results of our study also supported this, because four variables were more accurately estimated using the Prophet model and three variables were better forecasted with ARIMAX (Table 3). However, since the values forecasted by ARIMAX were negative, we only considered the forecasted trend from ARIMAX and relied on the Prophet model results for forecasting future events.
In this study, the two forecasting models presented conflicting results regarding the number of suspected inpatients. However, the model that was optimal by MSE forecasted negative values and was therefore not reported; the Prophet model was selected to interpret the results instead. In Prophet modelling, forecasting is based on the analysis of all the available data; however, ARIMAX modelling carries out the initial forecasting based on train data and then the main forecasting based on the test data. In the ARIMAX model, it is natural for more recent trends to have a greater effect. In contrast, the general long-term effect of the trend is used in Prophet modelling. In other words, the forecasting results of the two models are likely to be closer if more data are available for analysis. In this way, seasonal effects are expected to become extractable as more data accumulate and the variables' behavior is simulated.

Limitation
The main limitation of our study was the unavailability of case-level report data such as age, sex, and comorbidity. Such data would have enabled us to include more variables in our forecasting model.

Conclusion
Based on the findings of this study, in which Prophet outperformed ARIMAX, it can be concluded that the time series of suspected inpatients, confirmed inpatients, recovered cases, deaths, and ICU-inpatient service days are following a downward trend, while the ICU admission and discharge time series are following an upward trend in Hormozgan until the end of the current Iranian calendar year.