Abstract
Linear Regression (LR) forecasts the value of a dependent random variable for a given value of an associated independent variable, and the regression equation provides the formula for that calculation. Simple Exponential Smoothing (SES) bases its forecasts on a weighted average of past data, with more weight on the more recent periods. A smoothing factor determines those weights. SES is generally used for short-term forecasts. In the present case, we forecast the year-2 sales data for ABC Furniture Company using both methods. For SES, we use two smoothing factors, α = 0.15 and α = 0.90, to find out which one performs better. Once the forecasts are complete, we calculate the average difference between the forecast and actual sales (Mean Error, ME), the average percentage error (MPE), and the mean absolute percentage error (MAPE) for each forecast. The report compares these values. Since the aim of any forecast is to reduce the errors, the report then recommends the best forecast method for the present case.
Keywords: Linear Regression, Simple Exponential Smoothing, ME, MPE, MAPE, Smoothing factor
Introduction
This paper compares the results of two forecasting methods: linear regression (LR) and simple exponential smoothing (SES). For the comparison, the report uses ABC Furniture Company’s year-two actual sales data and the forecasts obtained from both methods. The report then recommends the better of the two methods, SES or LR, for the given data.
Background
The two types of quantitative forecasting methods are 1) time series methods, which are reactive or one-dimensional, and 2) causal methods, which are proactive or multi-dimensional. Time series methods rely on identifying patterns such as trend, seasonality, or cyclicality in past data. They include the naïve model, moving averages, exponential smoothing (single or simple, Holt’s and Winters’ two-parameter, Brown’s double, and Winters’ three-parameter), decomposition (additive and multiplicative), and Box-Jenkins Autoregressive Integrated Moving Average (ARIMA).
Forecasting for Time Series Data
In the linear regression method, we find trend equations that fit the data. Using these equations, one can forecast by varying the time variable. With a linear trend, the value of the variable changes by a constant amount each time period; with an exponential trend, it changes by a constant percentage. The moving average model is a simple and frequently used forecasting model. One chooses a span, a fixed number of time periods, and the average of the observations in that span provides the forecast for the next period. As time advances, the span stays the same length, so the average moves along with it. A larger span gives smoother forecasts, while a smaller span tracks recent changes more closely. When the span is one, the model reduces to the naïve model.
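The moving average forecast described above can be sketched in a few lines of Python. This is a minimal illustration rather than the method used in the ABC Furniture Company analysis; the function name moving_average_forecast and the sales figures are hypothetical, invented only to show the mechanics.

```python
def moving_average_forecast(series, span):
    """Forecast each period as the mean of the `span` observations before it.

    The last value returned is the forecast for the first period
    beyond the end of `series`.
    """
    if span < 1 or span > len(series):
        raise ValueError("span must be between 1 and len(series)")
    forecasts = []
    for t in range(span, len(series) + 1):
        window = series[t - span:t]  # the `span` most recent observations
        forecasts.append(sum(window) / span)
    return forecasts


# Hypothetical monthly sales, invented for illustration only.
sales = [100, 120, 90, 110, 130, 105]

print(moving_average_forecast(sales, span=3))  # smoother, slower to react
print(moving_average_forecast(sales, span=1))  # each forecast repeats the last actual
```

With a span of one, every forecast simply repeats the most recent observation, which is the naïve model.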
Exponential smoothing dampens the fluctuations in a time series; each “smoothed” value is a weighted average of current and past values observed in the series. Time series methods suit situations where demand must be forecast for a large number of products with a stable history. They are simple, smooth out small random fluctuations, work for short-term forecasts, and are readily available in automated software. However, they require a large amount of data and adjust slowly to actual sales. Finding the smoothing weights can take many iterations. They are not suitable for long forecast horizons and perform poorly when fluctuations are large.
Regression analysis studies the relationship between variables and is, therefore, a good tool for a business analyst. Simple linear regression determines how a single dependent variable depends on one independent variable. One can use regression analysis to understand relationships or to forecast. We can generate time series data by observing a particular variable over equally spaced periods. If one can describe the relationship between the dependent and independent variables using a straight line, then it is a linear relationship, and one can perform linear regression, using suitable mathematical transformations where needed.
Figure 1: Formula for simple regression line
Source:
In the ABC Furniture Company case, we drew a scatter plot to determine whether the relationship between the two variables was linear. The correlation, which expresses the strength of the linear relationship, is 0.8473. This indicates a strong positive correlation between the two variables, and the trendline accordingly has a positive slope. Using the regression equation in Figure 1, one can determine the values of the slope (b1) and the intercept (b0). With those values, we complete the year-2 forecast for each month using the same regression equation. From these forecasts, we find the mean error (ME), mean percentage error (MPE), and mean absolute percentage error (MAPE) values.
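As a rough sketch of these steps, the Python below fits a least-squares line, reports the correlation, and extends the line over the next twelve months. It assumes the regression line has the usual form y = b0 + b1 * x and that the independent variable is a month index; neither the actual ABC Furniture Company data nor its independent variable is reproduced here, so the function name fit_line and the year-1 figures are hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Return (b0, b1, r): intercept, slope, and correlation of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    r = sxy / sqrt(sxx * syy)    # correlation coefficient
    return b0, b1, r


# Hypothetical year-1 monthly sales, invented for illustration only.
months = list(range(1, 13))
year1_sales = [210, 225, 198, 240, 255, 230, 262, 275, 250, 290, 301, 284]

b0, b1, r = fit_line(months, year1_sales)
print(f"intercept b0 = {b0:.2f}, slope b1 = {b1:.2f}, correlation r = {r:.4f}")

# Assumed setup: year-2 months continue the index as 13..24.
year2_forecast = [b0 + b1 * t for t in range(13, 25)]
print([round(f, 1) for f in year2_forecast])
```

A positive slope and a correlation near 1 on the fitted data correspond to the strong positive relationship described above.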
Exponential smoothing bases its forecasts on a weighted average of past data, with more weight on the more recent periods, and it requires very little data storage. In addition, it is simple for most business people to understand and hence is widely used in the business world, particularly when frequent and automatic forecasts of many items are required. There are different variations of exponential smoothing. The simplest is simple (or single) exponential smoothing. It is relevant when there is no pronounced trend or seasonality in the series. If there is a trend but no seasonality, Holt’s method is applicable. If both are there, then Winters’ method can be used.
SES requires an initial value to seed the forecasts, so we use the January sales as the starting point. It also requires a smoothing constant (alpha, α); we use α = 0.15 and α = 0.9 and calculate the ME, MPE, and MAPE for each. The equation to estimate the sales of a particular period is:
F(t) = F(t-1) + α * E(t), where the forecast error E(t) = Y(t-1) - F(t-1)
This equation derives the next sales forecast from the previous one by adding a multiple of the most recent forecast error. If the previous forecast was too high, then E(t) is negative, and the equation adjusts the forecast for the current period downward. Similarly, if the previous forecast was too low, then E(t) is positive, and the equation adjusts the forecast for the current period upward. However, the equation does not adjust by the entire magnitude of the error, only by a fraction of it. If α is small, say α = 0.1, the adjustment is minor; if α is close to 1, the adjustment is large. One should choose a large α if the forecast has to adjust quickly to movements in the series; otherwise, one should choose a small α.
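The update rule can be traced in a short Python sketch. It implements F(t) = F(t-1) + α * (Y(t-1) - F(t-1)) and, as an assumption that mirrors the January seeding described earlier, uses the first observation as the initial forecast. The function name ses_forecasts and the sales figures are hypothetical.

```python
def ses_forecasts(actuals, alpha):
    """One-step-ahead SES forecasts aligned with `actuals`.

    forecasts[0] is the seed (assumed here to be the first observation);
    forecasts[t] = forecasts[t-1] + alpha * (actuals[t-1] - forecasts[t-1]).
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie between 0 and 1")
    forecasts = [actuals[0]]
    for t in range(1, len(actuals)):
        error = actuals[t - 1] - forecasts[t - 1]  # negative if the forecast was too high
        forecasts.append(forecasts[t - 1] + alpha * error)
    return forecasts


# Hypothetical monthly sales, invented for illustration only.
sales = [100, 120, 90, 110, 130, 105]

for alpha in (0.15, 0.9):
    rounded = [round(f, 2) for f in ses_forecasts(sales, alpha)]
    print(f"alpha = {alpha}: {rounded}")
```

With α = 0.9 each forecast moves most of the way toward the latest actual value, while with α = 0.15 it moves only slightly, which is the trade-off between responsiveness and smoothness described above.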
The exponentially smoothed forecast is a weighted average of the previous data. If α is small or close to 0, then even data from the oldest periods have a substantial influence on the next forecast. When α is closer to 1, only the most recent data influence the next forecast, so forecasts react quickly to sudden changes in the series. No universally accepted value of α exists. Some experts recommend a value between 0.1 and 0.2, while others suggest trying different α values until a measure such as Root Mean Square Error (RMSE) or MAPE is minimized.
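To see why α controls how far back the forecast "remembers", one can unroll the SES recursion: the observation k periods before the most recent one receives weight α * (1 - α)^k, and the seed value keeps the remaining weight (1 - α)^n after n periods. The sketch below computes these weights; the function names ses_weights and seed_weight are hypothetical helpers introduced for illustration.

```python
def ses_weights(alpha, n_periods):
    """Weights on the last `n_periods` observations, most recent first."""
    return [alpha * (1 - alpha) ** k for k in range(n_periods)]


def seed_weight(alpha, n_periods):
    """Weight still carried by the initial (seed) forecast after n_periods."""
    return (1 - alpha) ** n_periods


for alpha in (0.15, 0.9):
    weights = ses_weights(alpha, 12)
    print(f"alpha = {alpha}")
    print("  three most recent weights:", [round(w, 4) for w in weights[:3]])
    print("  weight on the 12th most recent observation:", weights[-1])
    print("  weight left on the seed:", seed_weight(alpha, 12))
```

The weights decay geometrically, so a small α spreads the weight over many past periods while a large α concentrates almost all of it on the latest observation.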
For ABC Furniture Company, Table 2 shows the ME, MPE, and MAPE using first α = 0.15 and then α = 0.9. In the first case, the MAPE is 0.148; in the second, it is 0.135. Note that the value of α that tracks the historical series most closely does not guarantee the most accurate future forecasts. With a larger α, the historical forecasts may appear to track the actual series closely when they are merely tracking random noise.
Recommendations
The forecast using LR has ME = -16.95 and MAPE = 0.12.
The forecast using SES and α = 0.15 has ME = 16.884 and MAPE = 0.148.
The forecast using SES and α = 0.9 has ME = 2.322 and MAPE = 0.135.
An ME closer to zero suggests that, on average, the forecast follows the peaks and lows of the sales data more closely, although positive and negative errors can offset one another in this measure. The target for any forecast, however, is to reduce the MAPE toward zero, at which point the actual values would agree completely with the forecast values. On that basis, we recommend linear regression as the better forecasting method for the given data, as its MAPE of 0.12 is lower than the 0.148 and 0.135 that the SES forecasts generated.
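The three error measures used in this comparison can be computed as in the sketch below. Two conventions are assumed rather than taken from the report: the error is actual minus forecast (matching the forecast error Y - F in the SES equation), and MPE and MAPE are expressed as fractions, consistent with values such as 0.12. The function name forecast_errors and the small data set are hypothetical; the MAPE values in reported_mape are the ones stated above.

```python
def forecast_errors(actuals, forecasts):
    """Return (ME, MPE, MAPE) with error defined as actual - forecast (assumed)."""
    errors = [a - f for a, f in zip(actuals, forecasts)]
    n = len(errors)
    me = sum(errors) / n                                          # mean error
    mpe = sum(e / a for e, a in zip(errors, actuals)) / n         # mean percentage error
    mape = sum(abs(e / a) for e, a in zip(errors, actuals)) / n   # mean absolute percentage error
    return me, mpe, mape


# Hypothetical actual and forecast values, invented for illustration only.
me, mpe, mape = forecast_errors([100, 200], [110, 180])
print(f"ME = {me:.3f}, MPE = {mpe:.3f}, MAPE = {mape:.3f}")

# MAPE values reported in this paper for the three forecasts.
reported_mape = {"LR": 0.12, "SES alpha=0.15": 0.148, "SES alpha=0.9": 0.135}
best_method = min(reported_mape, key=reported_mape.get)
print("Lowest MAPE:", best_method)
```

In the small example the two percentage errors cancel, so MPE is zero while MAPE is not, which illustrates why MAPE is the measure used for the recommendation.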
References
Albright, S. C., Winston, W. L., & Broadie, M. (2015). Business analytics: data analysis and decision making (5th ed.). Stamford, CT: Cengage Learning.
Chase, C. (2013). Demand-driven forecasting: a structured approach to forecasting (2nd ed.). Hoboken, NJ: John Wiley & Sons, Inc.
Kazmier, L. J. (2004). Schaum's outline of theory and problems of business statistics (4th ed.). New York, NY: The McGraw-Hill Companies, Inc.
Weiers, R. M., Gray, J. B., & Peters, L. H. (2011). Introduction to business statistics (7th ed.). Mason, OH: South-Western Cengage Learning.