Homework 4


Important Note 1:

As mentioned before, please do not submit a zip folder. Submit two files: a Word report and the SAS code. The Word report must meet the requirements stated before.

Important Note 2:

Please be specific when you write your answers. There are 6 parts, so you need to specify exactly which part you are answering.

 

The SAS dataset HeinzHunts has data on grocery store purchases of Hunts and Heinz ketchup.

Each observation corresponds to one purchase occasion (of one of these brands) and consists of the following variables:

  1. Heinz: =1 if Heinz was purchased, =0 if Hunts was purchased
  2. PriceHeinz: Price of Heinz
  3. PriceHunts: Price of Hunts
  4. DisplHeinz: = 1 if Heinz had a store display, =0 if Heinz did not have a store display
  5. DisplHunts: = 1 if Hunts had a store display, =0 if Hunts did not have a store display
  6. FeatureHeinz: = 1 if Heinz had a store feature, =0 if Heinz did not have a store feature
  7. FeatureHunts: = 1 if Hunts had a store feature, =0 if Hunts did not have a store feature

1. Create a variable LogPriceRatio = log(PriceHeinz/PriceHunts).

/* Add the log price ratio to a working copy of the data. */
data data;
     set da.Heinzhunts;
     LogPriceRatio = log(PriceHeinz/PriceHunts);
run;

2. Randomly select 80% of the data set as the training sample and the remaining 20% as the test sample.

/* Simple random sample of 80%; OUTALL keeps every row and adds a Selected flag. */
proc surveyselect data=data method=srs seed=123 outall
                  samprate=0.8 out=splitdata;
run;



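/* Training sample: the 80% of rows with Selected=1. */
data training;
   set splitdata;
   if Selected=1;
   drop Selected;
run;

/* Test sample: the remaining 20% of rows. */
data test;
   set splitdata;
   if Selected=0;
   drop Selected;
run;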
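
As a quick sanity check on the split, the Selected flag in splitdata can be tabulated; about 80% of the observations should have Selected=1. This is a minimal sketch added for illustration, not part of the original submission, and it assumes splitdata was created with the OUTALL option as above.

```sas
/* Tabulate the Selected flag: 1 = training sample, 0 = test sample. */
proc freq data=splitdata;
   tables Selected;
run;
```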

3. Estimate a logit probability model for the probability that Heinz is purchased, using LogPriceRatio, DisplHeinz, FeatureHeinz, DisplHunts, FeatureHunts as the explanatory variables. Include interaction terms between display and feature for a particular brand (e.g., DisplHeinz * FeatureHeinz).

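/* Display-by-feature interaction terms for each brand. */
/* The feature variables are referenced here as FeatHeinz and FeatHunts */
/* (listed above as FeatureHeinz and FeatureHunts); use the names in the actual dataset. */
data training;
   set training;
   interHeinz = DisplHeinz*FeatHeinz;
   interHunts = DisplHunts*FeatHunts;
run;

/* Logit model for P(Heinz=1); DESCENDING makes Heinz=1 the modeled event. */
proc logistic data=training;
   model Heinz(descending) = LogPriceRatio DisplHeinz FeatHeinz DisplHunts FeatHunts interHeinz interHunts;
run;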

4. Interpret the results.

What promotional methods (feature / display) are effective for Hunts? For Heinz? How would you interpret the results for the interaction effects?


 

Based on the estimation results, the coefficients on DisplHeinz and FeatHeinz are both positive and significant at the 10% level, so both display and feature are effective for Heinz. Similarly, the coefficients on DisplHunts and FeatHunts are negative and significant at the 10% level. Because a lower probability of buying Heinz means a higher probability of buying Hunts, display and feature are also effective for Hunts.

The coefficients on the interaction terms are negative, meaning that the effect of using both promotional methods together is less than the sum of their individual effects. However, the p-values of the interaction terms are higher than 0.1, so the interaction effects are not significant.
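
The reading of the interaction terms follows from the log-odds form of the model. The expression below is a sketch of that form; the symbols $\hat{\beta}_0, \dots, \hat{\beta}_7$ are notation introduced here for the estimated coefficients and are not labels from the SAS output.

```latex
\begin{aligned}
\log\frac{P(\text{Heinz}=1)}{1-P(\text{Heinz}=1)}
  ={}& \hat{\beta}_0 + \hat{\beta}_1\,\text{LogPriceRatio}
     + \hat{\beta}_2\,\text{DisplHeinz} + \hat{\beta}_3\,\text{FeatHeinz} \\
     &+ \hat{\beta}_4\,\text{DisplHunts} + \hat{\beta}_5\,\text{FeatHunts} \\
     &+ \hat{\beta}_6\,(\text{DisplHeinz}\times\text{FeatHeinz})
      + \hat{\beta}_7\,(\text{DisplHunts}\times\text{FeatHunts})
\end{aligned}
```

When Heinz uses both a display and a feature, the log-odds shift by $\hat{\beta}_2 + \hat{\beta}_3 + \hat{\beta}_6$, so a negative $\hat{\beta}_6$ makes the combined effect smaller than $\hat{\beta}_2 + \hat{\beta}_3$. The same logic applies to Hunts with $\hat{\beta}_4$, $\hat{\beta}_5$, and $\hat{\beta}_7$.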

 

5. Based on the estimated model, and using the logit probability formula, calculate the change in predicted probability that Heinz is purchased if LogPriceRatio changes from 0.5 to 0.6 and Heinz does not use a feature or display, while Hunts uses a feature and a display.

Recall that in the logit model: $P(Y = 1 \mid X) = \dfrac{\exp(X\hat{\beta})}{1 + \exp(X\hat{\beta})}$, where Y is the outcome variable, X are the predictor variables, and $\hat{\beta}$ are the estimated model coefficients.

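/* Two scenarios: LogPriceRatio = 0.5 and 0.6, no Heinz promotion, Hunts display and feature. */
data pred;
   input LogPriceRatio DisplHeinz FeatHeinz DisplHunts FeatHunts interHeinz interHunts;
   cards;
0.5 0 0 1 1 0 1
0.6 0 0 1 1 0 1
;
run;

/* Refit the model on the training sample and score the two scenarios. */
proc logistic data=training;
   model Heinz(descending) = LogPriceRatio DisplHeinz FeatHeinz DisplHunts
                             FeatHunts interHeinz interHunts;
   score data=pred out=estimates;
run;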

Based on the scored output, the predicted probability that Heinz is purchased falls from 0.1559905324 to 0.0903286148 when LogPriceRatio rises from 0.5 to 0.6, a decrease of 0.0656619176.
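
As a consistency check, the two reported probabilities can be converted back to log-odds. The arithmetic below uses only the two probabilities stated above; the rounded log-odds values are approximations computed here, not figures from the SAS output.

```latex
\begin{aligned}
\Delta P &= 0.0903286148 - 0.1559905324 = -0.0656619176 \\
\Delta \log\text{-odds} &= \ln\frac{0.0903286148}{1 - 0.0903286148} - \ln\frac{0.1559905324}{1 - 0.1559905324}
  \approx -2.31 - (-1.69) = -0.62
\end{aligned}
```

Since only LogPriceRatio changes between the two scenarios (by 0.1), the implied coefficient on LogPriceRatio is roughly $-0.62 / 0.1 \approx -6.2$, which should match the estimate in the model output if the scoring was done with the same model.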

 

6. The estimated model is to be used for targeting customers for Hunts coupons to build loyalty for the brand.

Coupons are to be sent to customers who are likely to buy Hunts, and not to customers who are likely to buy Heinz. Therefore, the coupons should be sent to customers whose predicted probability of buying Heinz is below a certain threshold level that needs to be determined based on the costs of misclassifications (incorrectly sending / not sending a coupon).

The following information about the costs of incorrect classification is available: The cost of incorrectly sending a coupon to a customer who would have bought Heinz is $1 per customer, and the cost of incorrectly failing to send a coupon to a customer who would have bought Hunts is $0.25 per customer.

Based on these costs, what is the optimal threshold probability level that should be used with the estimated model to decide which consumers should receive coupons?

(HINT: Step 1: Using the appropriate SAS command, create an ROC table for the test data from the estimated model. The ROC table provides the number of false positive and false negative classifications for each possible probability threshold.

Step 2: Using the cost information, calculate the total cost of misclassification for each probability threshold.

Total Cost = # of False Positives * False Positive Cost + # of False Negatives * False Negative Cost

Think carefully about what counts as a false positive and a false negative in this context.

Step 3: Choose the probability threshold that leads to the lowest total cost.)

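/* Add the interaction terms to the test sample. */
/* Hunts = 1 - Heinz is an assumed definition: the model below uses Hunts as the */
/* response, but the variable list above does not include such a variable. */
data test;
   set test;
   interHeinz = DisplHeinz*FeatHeinz;
   interHunts = DisplHunts*FeatHunts;
   Hunts = 1 - Heinz;
run;

/* Model a Hunts purchase as the event, so that "positive" means "send a coupon". */
/* OUTROC= writes the false positive and false negative counts for each threshold. */
proc logistic data=test;
   model Hunts(descending) = LogPriceRatio DisplHeinz FeatHeinz DisplHunts
                             FeatHunts interHeinz interHunts / outroc=rocscore ctable;
run;

/* Total misclassification cost: $1 per false positive, $0.25 per false negative. */
data rocscore;
   set rocscore;
   cost = _falpos_*1 + _falneg_*0.25;
run;

/* Sort so that the lowest-cost threshold is the first row. */
proc sort data=rocscore;
   by cost;
run;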

 

 

Here, a false positive is a coupon incorrectly sent to a customer who would have bought Heinz, and a false negative is a coupon incorrectly withheld from a customer who would have bought Hunts. The probability threshold that leads to the lowest total cost, which is $15, is 0.7860743617. Because the model in this step uses Hunts as the event, this threshold applies to the predicted probability of buying Hunts.
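
The PROC LOGISTIC step above fits the model on the test sample itself. The hint asks for an ROC table for the test data from the estimated model, so an alternative is to keep the model fit on the training sample and score the test sample with it. The sketch below shows one way to do that with the SCORE statement and its OUTROC= option; the dataset name roc_test is hypothetical, and the resulting threshold and cost would generally differ from the values reported above. It assumes test already contains interHeinz and interHunts.

```sas
/* Fit on the training sample, then build the ROC table on the test sample. */
proc logistic data=training;
   model Heinz(descending) = LogPriceRatio DisplHeinz FeatHeinz DisplHunts
                             FeatHunts interHeinz interHunts;
   score data=test outroc=roc_test;   /* roc_test is a hypothetical name */
run;

/* The event here is Heinz, so the roles of the two error types flip:       */
/* false negative = Heinz buyer predicted as Hunts, coupon sent  ($1)       */
/* false positive = Hunts buyer predicted as Heinz, no coupon    ($0.25)    */
data roc_test;
   set roc_test;
   cost = _falneg_*1 + _falpos_*0.25;
run;

/* Lowest-cost threshold first; _prob_ is the cutoff on P(Heinz=1). */
proc sort data=roc_test;
   by cost;
run;
```

With Heinz as the event, the _prob_ value in the first row of the sorted table can be read directly as the cutoff on the predicted probability of buying Heinz, which is the quantity the question asks about.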

 
