"Dirichlet regression can be used for modelling 'compositional data', when the dependent-Y variable is practically a sum total of contribution from multiple Y components."

Dirichlet regression can be used to predict the ratios in which a sum total X (demand/forecast/estimate) is distributed among the component Y's. In practice, this is a case with multiple dependent 'Y' variables and one predictor 'X' variable whose total is distributed among the Y's.

A few possible real-world examples:

1. The dependent Y variable to be predicted is the total demand for a product of a multinational organization, which is actually the sum of the demand for that product from each of the organization's factories. We are interested in both the total demand and the factory-wise split.
2. The demand for a product is actually the sum total of the demand for 4 different variants of the same product.

In either case, the dependent Y variables, which are the contributions from each component, should be converted to fractions summing up to 1. It is the job of DirichReg() to predict these fractions when the sum total (X) is known.
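As a small illustration of that conversion, the sketch below turns raw component values into fractions that sum to 1 per row. The `demand` figures and factory names are made up for the example; only base R is used.

```r
# Hypothetical demand per factory (rows = time periods); not real data.
demand <- matrix(
  c(120, 60, 20,
    150, 50, 50,
     90, 30, 30),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("factory_a", "factory_b", "factory_c"))
)

total  <- rowSums(demand)                  # the sum total per period
shares <- prop.table(demand, margin = 1)   # divide each row by its own total

print(round(shares, 3))
print(rowSums(shares))                     # every row sums to 1
```

`DR_data()` can also normalize rows that do not sum to 1, but doing the conversion explicitly makes it easy to keep the totals around for later use.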

The code shown below can model, predict and visualize multiple Y variables.

### Step 1: Prepare the data

Prepare the training and test samples, then create the Dirichlet regression data (`DR_data`) from the Y's.

```r
library(DirichletReg)

inputData <- ArcticLake  # plug in your data here

set.seed(100)
train <- sample(1:nrow(inputData), round(0.7 * nrow(inputData)))  # 70% training sample
inputData_train <- inputData[train, ]   # training data
inputData_test  <- inputData[-train, ]  # test data

# Prepare the Y's: columns 1 to 3 hold the components
inputData$Y       <- DR_data(inputData[, 1:3])
inputData_train$Y <- DR_data(inputData_train[, 1:3])
inputData_test$Y  <- DR_data(inputData_test[, 1:3])
```

### Step 2: Train the model

```r
# Train the model. Modify the predictors and input data as needed.
res1 <- DirichReg(Y ~ depth + I(depth^2), inputData_train)

# Alternative parametrization: terms after `|` model the precision
res2 <- DirichReg(Y ~ depth + I(depth^2) | depth, inputData_train, model = "alternative")
```
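For readers wondering how the two models differ, the summary below sketches the two parametrizations as they are usually written. The notation ($x_i$, $z_i$, $\beta_c$, $\gamma$) is my own shorthand rather than anything taken from the code above, so treat it as an outline and check the package documentation for the exact definitions.

```latex
% Common parametrization (res1): one log-linear model per component
y_i \sim \mathrm{Dirichlet}(\alpha_{i1}, \dots, \alpha_{iC}), \qquad
\log \alpha_{ic} = x_i^\top \beta_c, \qquad
\mathrm{E}[y_{ic}] = \frac{\alpha_{ic}}{\sum_{d=1}^{C} \alpha_{id}}

% Alternative parametrization (res2): means and precision modelled separately
\alpha_{ic} = \mu_{ic}\,\phi_i, \qquad
\mu_{ic} = \frac{\exp(x_i^\top \beta_c)}{\sum_{d=1}^{C} \exp(x_i^\top \beta_d)}, \qquad
\log \phi_i = z_i^\top \gamma
```

In `res2`, the terms before the `|` (`depth + I(depth^2)`) would play the role of $x_i$ for the means, and the term after it (`depth`) the role of $z_i$ for the precision. One component serves as the base category, with its $\beta$ fixed at zero.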

### Step 3: Fit the training data and forecast

```r
# Predict on training data: fitted values
predict(res1)  # model 1 fit
predict(res2)  # model 2 fit
resid(res1)    # residuals

# Predict on test data, or forecast
predicted_res1 <- predict(res1, inputData_test)  # model 1
predicted_res2 <- predict(res2, inputData_test)  # model 2
```
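Once predicted fractions are available, two natural follow-ups are to split a known total across the components and to compare the predictions against the actual fractions. The sketch below uses small made-up matrices in place of the `predict()` output and the test Y's, so it runs without the package; the `total` values are hypothetical as well, and I am assuming the predictions come back as a plain matrix of fractions.

```r
# Hypothetical predicted and actual fractions (rows = test observations).
# In the workflow above these would come from predict() and the test data.
comps  <- c("sand", "silt", "clay")
pred   <- matrix(c(0.50, 0.30, 0.20,
                   0.10, 0.60, 0.30),
                 nrow = 2, byrow = TRUE, dimnames = list(NULL, comps))
actual <- matrix(c(0.45, 0.35, 0.20,
                   0.15, 0.55, 0.30),
                 nrow = 2, byrow = TRUE, dimnames = list(NULL, comps))

total <- c(1000, 400)       # hypothetical known sum totals (X), one per row

# Split each total according to the predicted fractions.
# The vector recycles down the columns, so row i is scaled by total[i].
allocated <- pred * total

# Root mean squared error of the predicted fractions, per component
rmse <- sqrt(colMeans((pred - actual)^2))

print(allocated)
print(round(rmse, 3))
```

RMSE per component is only one possible yardstick; any error measure that suits compositional data could be swapped in.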

### Step 4: Visualize results

```r
# Plot
plot(DR_data(predicted_res2))  # predicted compositions for the test data (model 2)
plot(inputData_test$Y)         # actual test data (already a DR_data object)

# Additional plots
plot(inputData$Y)
```

A Dirichlet Plot