Dirichlet Regression

`"Dirichlet regression can be used for modelling 'compositional data', when the dependent-Y variable is practically a sum total of contribution from multiple Y components."`

Dirichlet regression can be used to predict the ratios in which the sum total X (demand/forecast/estimate) is distributed among the component Y's. It is practically a case where there are multiple dependent 'Y' variables and one predictor 'X' variable, whose total is distributed among the Y's.

A few possible real-world examples:

1. The dependent Y variable to be predicted, the total demand for a product of a multinational organization, is actually the sum of the demand for that product from the organization's multiple factories. We are interested in both the total demand and the factory-wise split.

2. The demand for a product is actually the sum of the demand for 4 different variants of the same product.

In either case, the dependent Y variables, which are the contributions from each component, should be converted to fractions summing up to 1. It is the job of DirichReg() to predict these fractions when the sum total (X) is known.
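As a minimal sketch of that conversion, the snippet below turns component amounts into shares using base R only, then maps a share vector back to amounts for a known total. The `demand` data frame, its column names, and `new_total` are made up for illustration and do not come from any real data set.

```r
# Hypothetical demand (units) for three factories over four periods
demand <- data.frame(
  factory_a = c(120, 150, 90, 200),
  factory_b = c(60, 50, 110, 100),
  factory_c = c(20, 50, 100, 100)
)

# Row totals play the role of the overall demand
total <- rowSums(demand)

# Divide each row by its total so the components become fractions
shares <- demand / total

# Every row of shares should now sum to 1
rowSums(shares)

# Given a share vector and a known total, recover the component amounts
new_total <- 500
new_total * unlist(shares[1, ])
```

If the rows passed to `DR_data()` do not already sum to 1, it should normalize them itself and warn about it, so a manual step like this is mainly useful for inspecting the shares beforehand.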

The code shown below can model, predict and visualize multiple Y variables.

Step 1: Prepare the data

Prepare the training and test samples, then build the Dirichlet regression data object for the Y's with DR_data().

```
library(DirichletReg)
inputData <- ArcticLake  # plug in your data here
set.seed(100)
train <- sample(1:nrow(inputData), round(0.7 * nrow(inputData)))  # 70% training sample
inputData_train <- inputData[train, ]  # training data
inputData_test <- inputData[-train, ]  # test data
inputData$Y <- DR_data(inputData[, 1:3])  # prepare the Y's
inputData_train$Y <- DR_data(inputData_train[, 1:3])
inputData_test$Y <- DR_data(inputData_test[, 1:3])
```

Step 2: Train the model

```
# Train the model. Modify the predictors as needed.
res1 <- DirichReg(Y ~ depth + I(depth^2), inputData_train)  # modify the predictors and input data here
res2 <- DirichReg(Y ~ depth + I(depth^2) | depth, inputData_train, model = "alternative")
```
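For orientation, the two calls differ in how the Dirichlet parameters are linked to the predictors. The notation below ($\alpha_c$, $\mu_c$, $\phi$, $\beta_c$, $\gamma$, and the predictor vectors $x$ and $z$) is introduced here for illustration. It follows the usual description of the "common" and "alternative" parametrizations in DirichletReg and is a summary sketch, not a full derivation.

```latex
% Common parametrization (res1): one log-linear model per component
\log \alpha_c = x^\top \beta_c, \qquad c = 1, \dots, C

% Alternative parametrization (res2): means via multinomial logit,
% precision via a log link, with one reference component fixed at zero
\mu_c = \frac{\exp(x^\top \beta_c)}{\sum_{k=1}^{C} \exp(x^\top \beta_k)}, \qquad
\beta_{\text{ref}} = 0, \qquad
\log \phi = z^\top \gamma, \qquad
\alpha_c = \mu_c \, \phi
```

Read this way, in `Y ~ depth + I(depth^2) | depth` with `model = "alternative"` the part before `|` describes the expected shares $\mu_c$ and the part after it describes the precision $\phi$.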

Step 3: Fit the training data and forecast

```
# Predict on training data: fitted values
predict(res1)  # Model 1 fit
predict(res2)  # Model 2 fit
resid(res1)    # residuals

# Predict on test data, or forecast
predicted_res1 <- predict(res1, inputData_test)  # Model 1
predicted_res2 <- predict(res2, inputData_test)  # Model 2
```
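One way to sanity-check the forecasts is to compare the predicted shares with the observed shares in the test sample. The sketch below repeats the split and the first model so that it runs on its own. The mean absolute error per component is an illustrative choice of metric made here, not something the package prescribes, and `observed` is a helper name introduced for this example.

```r
library(DirichletReg)

# Same split as in Step 1
inputData <- ArcticLake
set.seed(100)
train <- sample(1:nrow(inputData), round(0.7 * nrow(inputData)))
inputData_train <- inputData[train, ]
inputData_test <- inputData[-train, ]
inputData_train$Y <- DR_data(inputData_train[, 1:3])
inputData_test$Y <- DR_data(inputData_test[, 1:3])

# Refit model 1 and predict expected shares for the test rows
res1 <- DirichReg(Y ~ depth + I(depth^2), inputData_train)
predicted_res1 <- predict(res1, inputData_test)

# Observed shares, normalized row-wise so that each row sums to 1
observed <- as.matrix(inputData_test[, 1:3])
observed <- observed / rowSums(observed)

# Predicted shares should also sum to 1 in every row
range(rowSums(predicted_res1))

# Mean absolute error per component (sand, silt, clay)
colMeans(abs(predicted_res1 - observed))
```

The same comparison can be run with `res2` to see whether modelling the precision separately changes the out-of-sample error.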

Step 4: Visualize results

```
# Plot
plot(DR_data(predicted_res2))  # plot model 2 predictions for the test data
plot(inputData_test$Y)         # plot actual test data (already a DR_data object)

# Additional plots
plot(inputData$Y)
```

A Dirichlet Plot
