Different Types of Facebook Page Posts
Social media marketing continues to grow as more consumers and businesses become active on social media sites, and various studies have confirmed the utility of sites such as Facebook in building brand image and equity for business owners (Kim, Sung, Lee, Choi, & Sung, 2016). However, the growing amount of information available on the Internet also means that users are overloaded with content and need to be selective about where they direct their attention. For Facebook page owners, this means their posts are less likely to appear in the News Feeds of their fans as Facebook implements new algorithms that selectively display information in users’ News Feeds. To increase their visibility and engagement with their audience, brands need to invest in paid advertising or post engaging content. The purpose of this study is to determine how payment (free vs. paid) and post category (action vs. product vs. inspiration) affect the magnitude of user interaction with Facebook page posts, which is measured by the number of likes, shares, and comments a post receives.
Methods
Data
The dataset “Facebook metrics” was created by Moro, Rita, and Vala (2016), who made it publicly available via the University of California, Irvine (UCI) Machine Learning Repository. The dataset contains 500 observations of 19 variables. Five variables of interest were selected from the dataset and analyzed in this study. The names of those variables, their descriptions, and their respective data types are shown in Table 1.
Table 1. Variables of interest from the “Facebook metrics” dataset
Note. Adapted from “Predicting social media performance metrics and evaluation of the impact on brand building: A data mining approach,” by S. Moro, P. Rita, and B. Vala, 2016, Journal of Business Research, 69(9), 3341-3351. Copyright 2015 Elsevier Ltd.
“Paid” and “Category” were selected as the independent variables because it was expected that both the financial investment in a post and the intention behind it could affect user engagement with the post. “Paid” is a dichotomous variable that characterizes a post as either a paid post or a free post. “Category” is a nominal variable: Moro et al. (2016) recorded the intention of each post, which was to motivate the user to take some sort of action (i.e., action), to directly advertise a product (i.e., product), or to provide non-commercial brand-related content to the users (i.e., inspiration).
“Comments,” “Likes,” and “Shares” were the dependent variables because those are the key metrics used to assess how the audience interacts with a Facebook post. Although conversion rates and profits are the best measures for evaluating the success of marketing strategies, the primary purpose of marketing on social media is to build brand reputation and trust, as well as develop and maintain a relationship between brands and consumers (Kim et al., 2016). Those engagement metrics are therefore used in this study to compare the success of different groups of posts.
Hypotheses
H1: The results of the analysis will not be affected by the selection of different techniques used to deal with missing values.
Rationale: Because of the large size of the dataset (N = 500) and small number of incomplete cases (n = 5), the selection of different techniques for handling missing data should have no profound effect on the results of the analysis.
H2: The category of the posts will have a significant effect on the number of comments, likes and shares a post receives.
Rationale: Different types of posts are expected to produce a different response from the audience because each type of post will have a different purpose, which determines how people perceive and react to it. For example, some posts will be used to build brand reputation and reach, whereas other posts will be used in an attempt to sell a product.
H3: Payment for advertising a post will have a significant effect on the number of comments, likes, and shares a post receives.
Rationale: Paid posts are a form of advertising used on Facebook to extend the reach of the post within their fan base, as well as show the content to a target audience that does not follow the advertised page.
H4: The interaction between post category and payment for advertising the post will have a significant effect on the number of comments, likes, and shares a post receives.
Rationale: It is expected that the category of a post determines whether a Facebook page owner will pay for advertising. For example, a post advertising a product will more likely be paid for to extend its reach than a post designed to share interesting brand-related content with the audience. Therefore, a significant interaction effect between the two variables is expected.
Data Analysis
The data analysis in this study was conducted using R version 3.3.2 “Sincere Pumpkin Patch” in RStudio version 1.0.44. Two packages were used in addition to the core packages. The package “MVN” was used to assess univariate and multivariate normality, which are important assumptions of multivariate analyses such as the MANOVA (Korkmaz, Goksuluk, & Zararsiz, 2014). The package “mi” was used to perform multiple imputations and create multiple imputed values for each missing value (Gelman & Hill, 2006). The R code used to conduct the analysis is attached in Appendix A.
There are numerous approaches to dealing with missing data, and choosing the correct approach depends on the limitations of each approach and the characteristics of the dataset analyzed. R deals with missing values according to the instructions stored in the “na.action” option, which is set to “na.omit” by default. Therefore, R modeling functions will perform a listwise deletion of missing data unless otherwise specified. That approach is generally not recommended because it reduces the amount of data used in the analysis and could produce biased parameter estimates (Gelman & Hill, 2006; Kang, 2013). The loss of data is not an issue in this case considering the number of observations in the dataset (N = 500), but the risk of bias remains. Other possibilities for dealing with missing data include pairwise deletion, mean substitution, regression imputation, last observation carried forward, maximum likelihood, expectation-maximization, multiple imputation, and sensitivity analysis (Kang, 2013).
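As a minimal sketch of the default behavior described above, the following base R example uses a small made-up data frame (`toy`, not the Facebook data) to show the default value of the “na.action” option and how a model fitted with `lm()` drops incomplete rows. All variable names and values are illustrative assumptions.

```r
# Toy data (hypothetical values, not taken from the Facebook dataset)
toy <- data.frame(
  paid  = factor(c("Free", "Paid", "Free", "Paid", "Free", "Paid")),
  share = c(10, 25, NA, 31, 12, 28),
  like  = c(80, 150, 95, NA, 70, 160)
)

# The global default that modeling functions consult
default_action <- getOption("na.action")

# Listwise deletion done by hand: keep only rows without any NA
toy_listwise <- toy[complete.cases(toy), ]

# lm() applies the same deletion internally through na.action
fit    <- lm(cbind(share, like) ~ paid, data = toy)
n_used <- nrow(model.frame(fit))

default_action       # value of the option in the current session
nrow(toy_listwise)   # rows remaining after listwise deletion
n_used               # rows actually used by the model
```

The two row counts agree because the model silently discards the same incomplete rows that `complete.cases()` identifies.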
Each of the aforementioned approaches to dealing with missing data has certain limitations and adverse effects on the outcome of the calculations. According to Kang (2013), maximum likelihood and multiple imputation are the best techniques for handling missing data. Therefore, this study analyzed the data twice, once with multiple imputation and once with simple listwise deletion, so that the outcomes of a recommended approach and the default option used in R could be compared. The multiple imputation procedure was set to produce four chains with 30 iterations, and two of those chains were stored as complete data frames. Using multiple datasets in the analysis when the multiple imputation technique is used is a common practice because it allows researchers to combine inferences across multiple datasets (Gelman & Hill, 2006).
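One common way to combine inferences across completed datasets is Rubin's rules, sketched below in base R. The estimates and standard errors are made-up placeholder numbers, not results from this study, and the variable names are hypothetical. The study itself reported each chain separately instead of pooling.

```r
# Hypothetical estimates of the same quantity from m completed datasets
est <- c(27.1, 27.4)   # point estimates (illustrative numbers only)
se  <- c(1.90, 1.95)   # their standard errors (illustrative numbers only)
m   <- length(est)

pooled_est  <- mean(est)      # pooled point estimate
within_var  <- mean(se^2)     # average within-imputation variance
between_var <- var(est)       # variance of the estimates across imputations

# Total variance adds the between-imputation part, inflated by (1 + 1/m)
total_var <- within_var + (1 + 1 / m) * between_var
pooled_se <- sqrt(total_var)

c(estimate = pooled_est, se = pooled_se)
```

The pooled standard error is never smaller than the average within-imputation standard error, which reflects the extra uncertainty introduced by the missing values.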
MANOVA was used to test the second, third, and fourth hypotheses. The Shapiro-Wilk test was used to test the univariate normality assumption, whereas Mardia’s test was used to test the multivariate normality assumption. The results of the tests are included as comments in the R code in Appendix A. As the entire dataset did not satisfy the univariate and multivariate normality assumptions, bootstrapping was used to draw a random sample with replacement from the dataset (n = 60, with 20 posts from each category). The procedure was repeated until a near-normal distribution was achieved with no significant outliers. The dataset obtained by bootstrapping is attached in comma-separated values (CSV) format in Appendix B.
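A single resampling pass of the kind described above can be sketched in base R as follows. The data are simulated right-skewed counts standing in for one post category (an assumption, not the Facebook data), the seed and the names `sim_counts` and `boot_sample` are hypothetical, and `shapiro.test()` from the stats package performs the Shapiro-Wilk test.

```r
set.seed(1)  # arbitrary seed, used only to make this sketch reproducible

# Simulated right-skewed counts standing in for one post category
sim_counts <- data.frame(comment = rnbinom(150, size = 1, mu = 7))

# Draw 20 rows with replacement, as in one resampling pass
boot_rows   <- sample(nrow(sim_counts), 20, replace = TRUE)
boot_sample <- sim_counts[boot_rows, , drop = FALSE]

# Shapiro-Wilk test on the full simulated data and on the resample
sw_full <- shapiro.test(sim_counts$comment)
sw_boot <- shapiro.test(boot_sample$comment)

c(full = sw_full$p.value, resample = sw_boot$p.value)
```

A sample of 20 gives the test less power than the full data, so the two p values are not directly comparable as evidence about the underlying distribution.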
Results
Listwise deletion was used to remove incomplete cases from the dataset, and the MANOVA model produced the following univariate results for the three response variables:
Response comment :
Df Sum Sq Mean Sq F value Pr(>F)
paid_listwise 1 1229 1228.88 2.7845 0.09582 .
category_listwise 2 2531 1265.47 2.8674 0.05780 .
paid_listwise:category_listwise 2 4016 2008.15 4.5503 0.01102 *
Residuals 489 215808 441.33
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response like :
Df Sum Sq Mean Sq F value Pr(>F)
paid_listwise 1 617018 617018 6.1394 0.013557 *
category_listwise 2 1091359 545680 5.4295 0.004653 **
paid_listwise:category_listwise 2 1136349 568175 5.6534 0.003739 **
Residuals 489 49145439 100502
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response share :
Df Sum Sq Mean Sq F value Pr(>F)
paid_listwise 1 5305 5304.7 3.0862 0.0795859 .
category_listwise 2 28112 14055.8 8.1775 0.0003211 ***
paid_listwise:category_listwise 2 24933 12466.7 7.2529 0.0007869 ***
Residuals 489 840517 1718.8
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Using MANOVA to analyze the first chain from the multiple imputation dataset produced the following results:
Response comment :
Df Sum Sq Mean Sq F value Pr(>F)
paid_imputed_1 1 1299 1298.58 2.9701 0.08544 .
category_imputed_1 2 2551 1275.28 2.9168 0.05504 .
paid_imp_1:category_imp_1 2 4030 2015.06 4.6088 0.01040 *
Residuals 494 215988 437.22
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response like :
Df Sum Sq Mean Sq F value Pr(>F)
paid_imputed_1 1 648924 648924 6.5162 0.010989 *
category_imputed_1 2 1138402 569201 5.7156 0.003516 **
paid_imp_1:category_imp_1 2 1132751 566375 5.6873 0.003615 **
Residuals 494 49195724 99586
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response share :
Df Sum Sq Mean Sq F value Pr(>F)
paid_imputed_1 1 4350 4350.3 2.3413 0.1266211
category_imputed_1 2 23585 11792.7 6.3468 0.0018986 **
paid_imp_1:category_imp_1 2 26373 13186.4 7.0969 0.0009148 ***
Residuals 494 917877 1858.1
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Using MANOVA to analyze the second chain from the multiple imputation dataset produced the following results:
Response comment :
Df Sum Sq Mean Sq F value Pr(>F)
paid_imputed_2 1 1299 1298.58 2.9701 0.08544 .
category_imputed_2 2 2551 1275.28 2.9168 0.05504 .
paid_imp_2:category_imp_2 2 4030 2015.06 4.6088 0.01040 *
Residuals 494 215988 437.22
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response like :
Df Sum Sq Mean Sq F value Pr(>F)
paid_imputed_2 1 648924 648924 6.5162 0.010989 *
category_imputed_2 2 1138402 569201 5.7156 0.003516 **
paid_imp_2:category_imp_2 2 1132751 566375 5.6873 0.003615 **
Residuals 494 49195724 99586
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response share :
Df Sum Sq Mean Sq F value Pr(>F)
paid_imputed_2 1 5233 5233.0 3.0579 0.0809641 .
category_imputed_2 2 27437 13718.4 8.0164 0.0003748 ***
paid_imp_2:category_imp_2 2 25192 12596.2 7.3606 0.0007080 ***
Residuals 494 845375 1711.3
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Although some values differ across datasets, the significance levels mostly remained consistent. The largest difference appeared in the “share” response of the first chain created with multiple imputation, where the p value for the payment effect was .127 compared to .080 with listwise deletion and .081 in the second chain, but no effect changed its status at the .05 level. Based on the MANOVA results, H1 (“The results of the analysis will not be affected by the selection of different techniques used to deal with missing values”) was supported, but it is important to note that the differences between the techniques may have been more pronounced if the dataset had contained more missing values or fewer observations. Nevertheless, the comparison of results obtained with different techniques does support the view that listwise deletion is a feasible method for dealing with missing values when there are few missing values and the sample size is large (Gelman & Hill, 2006; Kang, 2013).
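The comparison across techniques can be checked with a short base R sketch. The p values are copied from the three sets of tables above, whereas the object names and the .05 threshold are assumptions made for illustration.

```r
# p values copied from the univariate tables above
# (rows: effect within response; columns: missing-data technique)
p_values <- rbind(
  comment_paid        = c(0.09582,   0.08544,   0.08544),
  comment_category    = c(0.05780,   0.05504,   0.05504),
  comment_interaction = c(0.01102,   0.01040,   0.01040),
  like_paid           = c(0.013557,  0.010989,  0.010989),
  like_category       = c(0.004653,  0.003516,  0.003516),
  like_interaction    = c(0.003739,  0.003615,  0.003615),
  share_paid          = c(0.0795859, 0.1266211, 0.0809641),
  share_category      = c(0.0003211, 0.0018986, 0.0003748),
  share_interaction   = c(0.0007869, 0.0009148, 0.0007080)
)
colnames(p_values) <- c("listwise", "imputed_1", "imputed_2")

alpha       <- 0.05               # conventional threshold (assumption)
significant <- p_values < alpha   # logical matrix of decisions

# TRUE for every row where all three techniques reach the same decision
same_decision <- apply(significant, 1, function(x) length(unique(x)) == 1)
all(same_decision)
```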
As the technique used to deal with missing values did not change which effects were significant at the .05 level, the dataset obtained with listwise deletion was used to test H2, H3, and H4. However, this analysis was preceded by normality tests. According to the results of the Shapiro-Wilk test and Mardia’s test, the dependent variables did not satisfy the univariate and multivariate normality assumptions. Even though MANOVA is considered robust to moderate deviations from both normality and homogeneity of variance, the observed deviations from normality were too large in all cases. The output of the normality tests is included with the R code in Appendix A.
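The test statistics in the Mardia output can be reproduced from the reported skewness and kurtosis coefficients. The sketch below uses the standard large-sample formulas for Mardia's test with the values of g1p and g2p reported in Appendix A for the listwise dataset; the object names are hypothetical.

```r
n <- 495   # complete cases after listwise deletion
p <- 3     # dependent variables: comment, like, share

g1p <- 204.3648   # multivariate skewness reported in Appendix A
g2p <- 296.9679   # multivariate kurtosis reported in Appendix A

# Skewness statistic: n * g1p / 6, chi-square with p(p + 1)(p + 2) / 6 df
chi_skew <- n * g1p / 6
df_skew  <- p * (p + 1) * (p + 2) / 6
p_skew   <- pchisq(chi_skew, df = df_skew, lower.tail = FALSE)

# Kurtosis statistic: standardized against its expected value p(p + 2)
z_kurt <- (g2p - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)
p_kurt <- 2 * pnorm(abs(z_kurt), lower.tail = FALSE)

round(c(chi.skew = chi_skew, z.kurtosis = z_kurt), 2)
```

Both statistics agree with the reported chi.skew and z.kurtosis values up to rounding, and both p values are effectively zero.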
In order to approximate the normality assumptions, bootstrapping was used to draw a random sample with replacement from the original sample, and the draw was repeated until the sample approximated a normal distribution. The bootstrapped sample dataset is appended in Appendix B in CSV format, and the results of the univariate and multivariate normality tests are included in the R code in Appendix A. Although some deviations from univariate and multivariate normality were observed in the response variables of each group, they were not as strong as the deviations observed in the entire dataset, so the MANOVA was repeated using the bootstrapped sample to test H2, H3, and H4. The results of the analysis were as follows:
Response comment :
Df Sum Sq Mean Sq F value Pr(>F)
paid_normal 1 49.57 49.572 2.3914 0.12785
category_normal 2 133.09 66.543 3.2101 0.04816 *
paid_normal:category_normal 2 63.70 31.848 1.5364 0.22442
Residuals 54 1119.38 20.729
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response like :
Df Sum Sq Mean Sq F value Pr(>F)
paid_normal 1 4 4.4 0.0009 0.9764
category_normal 2 6224 3112.2 0.6237 0.5398
paid_normal:category_normal 2 13457 6728.4 1.3484 0.2683
Residuals 54 269464 4990.1
Response share :
Df Sum Sq Mean Sq F value Pr(>F)
paid_normal 1 419.5 419.48 2.4949 0.12006
category_normal 2 1346.8 673.40 4.0051 0.02389 *
paid_normal:category_normal 2 224.1 112.03 0.6663 0.51778
Residuals 54 9079.3 168.14
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The results of the analysis partially support H2 (“The category of the posts will have a significant effect on the number of comments, likes and shares a post receives”) as the category effect was significant for comments, F(2, 54) = 3.21, p = .048, and for shares, F(2, 54) = 4.01, p = .024. However, the number of likes the posts received was not associated with their categories. H3 (“Payment for advertising a post will have a significant effect on the number of comments, likes, and shares a post receives”) and H4 (“The interaction between post category and payment for advertising the post will have a significant effect on the number of comments, likes, and shares a post receives”) were not supported as there is no evidence that a financial investment in a post or an interaction between the two independent variables has a significant effect on user interaction with the posts.
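The reported F statistics and p values for the category effect follow directly from the mean squares in the tables above. A minimal base R check, with hypothetical object names, is:

```r
# Mean squares copied from the bootstrapped-sample tables above
ms_category <- c(comment = 66.543, share = 673.40)
ms_residual <- c(comment = 20.729, share = 168.14)
df_effect   <- 2    # three categories minus one
df_residual <- 54   # residual degrees of freedom

# F is the ratio of the effect mean square to the residual mean square
f_value <- ms_category / ms_residual

# Upper-tail probability of the F distribution gives the p value
p_value <- pf(f_value, df_effect, df_residual, lower.tail = FALSE)

round(f_value, 2)
round(p_value, 3)
```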
The number of comments by post category is displayed in Figure 1, which shows that the highest median number of comments was observed for “action” posts, followed by “product” posts and “inspiration” posts in descending order. According to Moro et al. (2016), action posts were those that asked users to participate in contests or presented special offers, so those types of posts could be useful to Facebook page owners whose marketing strategy emphasizes building rapport with their audience.
Figure 1. The median number and range of comments for action, product, and inspiration posts.
Figure 2. The median number and range of shares for action, product, and inspiration posts.
When it comes to sharing content posted on Facebook pages, inspiration posts are the most successful type of post, followed by product posts and action posts in descending order. Even though inspiration posts are aimed at distributing brand-related content without the intent to directly and explicitly promote a brand’s products or services, they can provide more exposure to Facebook pages as users tend to share them more often than other types of posts.
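The ordering of the medians can be verified from the bootstrapped sample itself. The sketch below copies the comment and share counts from Appendix B (20 posts per category, in the order listed there) and computes the median by category with base R; the vector names are hypothetical.

```r
# Comment and share counts copied from Appendix B, 20 posts per category
category <- factor(rep(c("Action", "Product", "Inspiration"), each = 20),
                   levels = c("Action", "Product", "Inspiration"))
comments <- c(0, 7, 2, 2, 6, 16, 3, 2, 11, 8, 18, 6, 4, 23, 0, 8, 0, 10, 7, 2,
              6, 10, 9, 6, 4, 11, 3, 2, 2, 7, 1, 13, 1, 2, 11, 1, 10, 4, 2, 3,
              0, 3, 4, 0, 6, 1, 2, 3, 1, 8, 3, 0, 2, 2, 11, 1, 5, 1, 7, 1)
shares <- c(0, 9, 6, 25, 34, 8, 24, 40, 4, 3,
            13, 11, 10, 44, 1, 10, 2, 21, 28, 26,
            21, 26, 17, 40, 18, 11, 28, 8, 9, 24,
            26, 55, 33, 22, 11, 17, 31, 18, 32, 1,
            20, 44, 47, 19, 42, 30, 47, 41, 13, 34,
            22, 14, 9, 16, 34, 31, 27, 3, 38, 31)

# Median number of comments and shares within each category
median_comments <- tapply(comments, category, median)
median_shares   <- tapply(shares, category, median)

rbind(comments = median_comments, shares = median_shares)
```

The medians match the descriptive statistics in Appendix A: comments fall from action to product to inspiration posts, whereas shares rise in the same order.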
Conclusion
The purpose of this study was to determine how financial investments in Facebook page posts and different types of posts affect the magnitude of user interaction with Facebook page posts, which was expressed by the number of likes, shares, and comments that the posts in the dataset received. Although the dataset had missing values, the MANOVA results were similar across different datasets, so it is suggested that listwise deletion is a feasible technique when the sample size is large and the number of incomplete cases is low. However, multiple imputation is still recommended for handling datasets with a larger number of missing values.
The findings of this study have important practical implications for marketing Facebook pages. The results suggest that the type of post used by Facebook page owners has a significant effect on audience interactions with the post. Posts that contain brand-related content without the intent to sell have the potential to reach a wider audience than other types of content as they tend to get shared more than other types of posts. In contrast, posts that ask users to take some sort of action tend to receive more user comments, so they facilitate on-page discussions and relationships between the audience and brands. Paying for posts was not associated with improved audience engagement. Although that finding was not expected, it is reasonable to assume that the quality of a post explains user engagement better than a payment to display the post to more people.
It is important to note that the dataset used in this study did not include business outcomes of the posts based on the Facebook page owners’ investments in the posts or the types of posts used. Several important variables that are used in business, such as seasonality and industry, were also not reported in that dataset, so it is not possible to estimate the transferability of the data to different real-world settings. Therefore, the findings of this study can be used to increase the chance of eliciting a specific type of response from the users, but results in practice may be subject to various external influences.
References
Gelman, A., & Hill, J. (2006). Data analysis using regression and multilevel/hierarchical models. New York, NY: Cambridge University Press.
Kang, H. (2013). The prevention and handling of the missing data. Korean Journal of Anesthesiology, 64(5), 402–406. http://doi.org/10.4097/kjae.2013.64.5.402
Kim, D. H., Sung, Y. H., Lee, S. Y., Choi, D., & Sung, Y. (2016). Are you on Timeline or News Feed? The roles of Facebook pages and construal level in increasing ad effectiveness. Computers in Human Behavior, 57, 312–320.
Korkmaz, S., Goksuluk, D., & Zararsiz, G. (2014). MVN: An R package for assessing multivariate normality. The R Journal, 6(2), 151–162.
Moro, S., Rita, P., & Vala, B. (2016). Predicting social media performance metrics and evaluation of the impact on brand building: A data mining approach. Journal of Business Research, 69(9), 3341–3351.
Appendix A: R Code
library(MVN)
library(mi)
fb_metrics <- read.csv("dataset_Facebook.csv", sep = ";",
stringsAsFactors = FALSE)
# Select only relevant variables
fb_metrics <- subset(fb_metrics,
                     select = c(Category, Paid,                      # IVs
                                comment:share, Total.Interactions))  # DVs
fb_metrics$Paid <- factor(fb_metrics$Paid,
levels = c(0, 1),
labels = c("Free", "Paid"))
fb_metrics$Category <- factor(fb_metrics$Category,
levels = 1:3,
labels = c("Action", "Product", "Inspiration"))
# Row indices of incomplete cases (rows with at least one missing value)
missing_selector <- which(!complete.cases(fb_metrics))
# Listwise deletion
fb_listwise <- fb_metrics[complete.cases(fb_metrics), ]
# Multiple imputation
fb_imputation <- missing_data.frame(fb_metrics)
# Defining DVs as non-negative integers
fb_imputation <- change(fb_imputation, y = "like",
what = "type", to = "count")
fb_imputation <- change(fb_imputation, y = "share",
what = "type", to = "count")
fb_imputation <- change(fb_imputation, y = "comment",
what = "type", to = "count")
# Estimating missing values
imputation <- mi(fb_imputation, seed = 2359,
n.iter = 30, n.chains = 4)
# Verification: Are there enough iterations?
# (Means should converge across chains.)
round(mipply(imputation, mean, to.matrix = TRUE), 3)
Rhats(imputation)
# Yes, close enough
fb_imputation <- complete(imputation, m = 2) # export first two chains
fb_imputation_1 <- fb_imputation[[1]][, 1:6] # select IVs and DVs only
fb_imputation_2 <- fb_imputation[[2]][, 1:6]
# Listwise MANOVA -------------------------------------------------------------
# Independent categorical variables, as factors
paid_listwise <- fb_listwise$Paid
category_listwise <- fb_listwise$Category
# Dependent variables, as matrix
post_interaction <- as.matrix(subset(fb_listwise,
                                     select = comment:share)
                              )
# Analysis
listwise_manova <- manova(post_interaction ~
paid_listwise * category_listwise)
summary.aov(listwise_manova)
# Imputed MANOVA --------------------------------------------------------------
# Independent categorical variables, as factors
paid_imputed_1 <- fb_imputation_1$Paid
category_imputed_1 <- fb_imputation_1$Category
paid_imputed_2 <- fb_imputation_2$Paid
category_imputed_2 <- fb_imputation_2$Category
# Dependent variables, as matrix
post_interaction_1 <- as.matrix(subset(fb_imputation_1,
                                       select = comment:share)
                                )
post_interaction_2 <- as.matrix(subset(fb_imputation_2,
                                       select = comment:share)
                                )
# Analysis: 1st chain
imputed_manova_1 <- manova(post_interaction_1 ~
paid_imputed_1 * category_imputed_1)
summary.aov(imputed_manova_1)
# Analysis: 2nd chain
imputed_manova_2 <- manova(post_interaction_2 ~
paid_imputed_2 * category_imputed_2)
summary.aov(imputed_manova_2)
# Normality tests -------------------------------------------------------------
# Univariate normality
uniNorm(fb_listwise[, 3:5], type = "SW")
# Output:
#
# $`Descriptive Statistics`
# n Mean Std.Dev Median Min Max 25th 75th Skew Kurtosis
# comment 495 7.558 21.274 3 0 372 1 7.0 11.650 179.310
# like 495 179.145 324.412 101 0 5172 57 188.0 8.879 116.826
# share 495 27.265 42.656 19 0 790 10 32.5 12.076 205.171
#
# $`Shapiro-Wilk's Normality Test`
# Variable Statistic p-value Normality
# 1 comment 0.2826 0 NO
# 2 like 0.3941 0 NO
# 3 share 0.3887 0 NO
# Multivariate normality
mardiaTest(fb_listwise[, 3:5], qqplot = T)
# Output:
#
# Mardia's Multivariate Normality Test
#
# data : fb_listwise[, 3:5]
#
# g1p : 204.3648
# chi.skew : 16860.09
# p.value.skew : 0
#
# g2p : 296.9679
# z.kurtosis : 572.6795
# p.value.kurt : 0
#
# chi.small.skew : 17013.73
# p.value.small : 0
#
# Result : Data are not multivariate normal.
#
# MANOVA with Bootstrap -------------------------------------------------------
# Dataset cleaned with listwise technique used --------------------------------
set.seed(9530)
iterations <- 0
while (iterations < 5) {
action_boot <- subset(fb_listwise,
Category == "Action")
action_boot <- action_boot[sample(nrow(action_boot), 20, replace = T), ]
iterations <- iterations + 1
} # Five iterations produced a (relatively) normal distribution
uniNorm(action_boot[, 3:5], type ="SW")
# $`Descriptive Statistics`
# n Mean Std.Dev Median Min Max 25th 75th Skew Kurtosis
# comment 20 6.75 6.315 6.0 0 23 2.00 8.50 1.031 0.207
# like 20 106.70 66.350 98.0 3 234 71.75 144.50 0.219 -0.983
# share 20 15.95 13.391 10.5 0 44 5.50 25.25 0.617 -0.943
#
# $`Shapiro-Wilk's Normality Test`
# Variable Statistic p-value Normality
# 1 comment 0.8795 0.0173 NO
# 2 like 0.9560 0.4674 YES
# 3 share 0.9084 0.0595 YES
mardiaTest(action_boot[, 3:5], qqplot = T)
# Mardia's Multivariate Normality Test
#
# data : action_boot[, 3:5]
#
# g1p : 4.524284
# chi.skew : 15.08095
# p.value.skew : 0.1291384
#
# g2p : 14.5373
# z.kurtosis : -0.1888962
# p.value.kurt : 0.8501741
#
# chi.small.skew : 18.67717
# p.value.small : 0.04455991
#
# Result : Data are multivariate normal.
iterations <- 0
while (iterations < 8) {
product_boot <- subset(fb_listwise,
Category == "Product")
product_boot <- product_boot[sample(nrow(product_boot), 20, replace = T), ]
iterations <- iterations + 1
} # Satisfactory normality achieved after 8 iterations
uniNorm(product_boot[, 3:5], type ="SW")
# $`Descriptive Statistics`
# n Mean Std.Dev Median Min Max 25th 75th Skew Kurtosis
# comment 20 5.4 3.966 4.0 1 13 2.00 9.25 0.473 -1.361
# like 20 124.9 68.166 129.5 13 319 82.25 165.25 0.722 1.111
# share 20 22.4 12.386 21.5 1 55 15.50 28.75 0.643 0.345
#
# $`Shapiro-Wilk's Normality Test`
# Variable Statistic p-value Normality
# 1 comment 0.8841 0.0210 NO
# 2 like 0.9207 0.1023 YES
# 3 share 0.9623 0.5909 YES
mardiaTest(product_boot[, 3:5], qqplot = T)
# data : product_boot[, 3:5]
#
# g1p : 2.415462
# chi.skew : 8.051539
# p.value.skew : 0.6238026
#
# g2p : 13.24137
# z.kurtosis : -0.7179563
# p.value.kurt : 0.4727842
#
# chi.small.skew : 9.971521
# p.value.small : 0.4429954
#
# Result : Data are multivariate normal.
iterations <- 0
while (iterations < 4) {
inspiration_boot <- subset(fb_listwise,
Category == "Inspiration")
inspiration_boot <- inspiration_boot[sample(nrow(inspiration_boot), 20, replace = T), ]
iterations <- iterations + 1
} # Satisfactory normality achieved after 4 iterations
uniNorm(inspiration_boot[, 3:5], type ="SW")
# $`Descriptive Statistics`
# n Mean Std.Dev Median Min Max 25th 75th Skew Kurtosis
# comment 20 3.05 2.982 2.0 0 11 1.00 4.25 1.108 0.339
# like 20 130.30 76.470 104.0 11 304 73.50 183.75 0.575 -0.754
# share 20 28.10 13.118 30.5 3 47 18.25 38.75 -0.189 -1.224
#
# $`Shapiro-Wilk's Normality Test`
# Variable Statistic p-value Normality
# 1 comment 0.8631 0.0089 NO
# 2 like 0.9311 0.1622 YES
# 3 share 0.9600 0.5432 YES
mardiaTest(inspiration_boot[, 3:5], qqplot = T)
# data : inspiration_boot[, 3:5]
#
# g1p : 3.791635
# chi.skew : 12.63878
# p.value.skew : 0.2445748
#
# g2p : 14.118
# z.kurtosis : -0.3600733
# p.value.kurt : 0.7187923
#
# chi.small.skew : 15.65265
# p.value.small : 0.1100169
#
# Result : Data are multivariate normal.
# Resampled dataset with relatively normal distribution by group
fb_metrics_normal <- rbind(action_boot,
product_boot,
inspiration_boot)
# Check normality for Paid variable
uniNorm(subset(fb_metrics_normal, Paid == "Paid")[, 3:5], type = "SW")
# $`Descriptive Statistics`
# n Mean Std.Dev Median Min Max 25th 75th Skew Kurtosis
# comment 14 6.714 5.497 6.5 0 18 2.25 9.50 0.628 -0.747
# like 14 120.143 51.088 138.5 11 197 81.25 145.75 -0.470 -0.761
# share 14 17.357 9.660 16.0 3 34 10.00 22.00 0.311 -1.352
#
# $`Shapiro-Wilk's Normality Test`
# Variable Statistic p-value Normality
# 1 comment 0.9208 0.2259 YES
# 2 like 0.9542 0.6280 YES
# 3 share 0.9303 0.3081 YES
uniNorm(subset(fb_metrics_normal, Paid == "Free")[, 3:5], type = "SW")
# $`Descriptive Statistics`
# n Mean Std.Dev Median Min Max 25th 75th Skew Kurtosis
# comment 46 4.565 4.530 3.0 0 23 2.0 6.75 1.716 3.844
# like 46 120.783 75.309 98.0 3 319 72.0 166.75 0.643 -0.130
# share 46 23.609 14.481 24.5 0 55 11.5 33.75 0.113 -0.994
#
# $`Shapiro-Wilk's Normality Test`
# Variable Statistic p-value Normality
# 1 comment 0.8213 0.0000 NO
# 2 like 0.9481 0.0395 NO
# 3 share 0.9708 0.2975 YES
mardiaTest(subset(fb_metrics_normal, Paid == "Paid")[, 3:5], qqplot = T)
# data : subset(fb_metrics_normal, Paid == "Paid")[, 3:5]
#
# g1p : 2.084401
# chi.skew : 4.863602
# p.value.skew : 0.9001012
#
# g2p : 11.16925
# z.kurtosis : -1.308449
# p.value.kurt : 0.1907211
#
# chi.small.skew : 6.562003
# p.value.small : 0.7660459
#
# Result : Data are multivariate normal.
mardiaTest(subset(fb_metrics_normal, Paid == "Free")[, 3:5], qqplot = T)
# data : subset(fb_metrics_normal, Paid == "Free")[, 3:5]
#
# g1p : 5.383752
# chi.skew : 41.27543
# p.value.skew : 1.008496e-05
#
# g2p : 19.54716
# z.kurtosis : 2.815324
# p.value.kurt : 0.004872804
#
# chi.small.skew : 45.41678
# p.value.small : 1.828729e-06
#
# Result : Data are not multivariate normal.
# Deviations from normality in "Free" group, but MANOVA is robust and resistant
# to deviations from normality
# Export dataset of the bootstrapped sample
write.csv(fb_metrics_normal, file = "dataset_Facebook_boot.csv")
# Analysis of fb_metrics_normal dataset
paid_normal <- fb_metrics_normal$Paid # dichotomous IV
category_normal <- fb_metrics_normal$Category # categorical IV
post_interaction_normal <- as.matrix(subset(fb_metrics_normal,
                                            select = comment:share)
                                     ) # Discrete numerical DV
# MANOVA results
manova_results <- manova(post_interaction_normal ~
paid_normal * category_normal)
summary.aov(manova_results)
aov(comment ~ Category, data = fb_metrics_normal)
# Visuals ---------------------------------------------------------------------
# Number of comments by Category
png("commentsByCategory.png")
boxplot(fb_metrics_normal$comment ~ fb_metrics_normal$Category,
main = "Number of Comments by Category of Post",
sub = "Bootstrapped Sample Data")
dev.off()
# Number of shares by Category
png("sharesByCategory.png")
boxplot(fb_metrics_normal$share ~ fb_metrics_normal$Category,
main = "Number of Shares by Post Category",
sub = "Bootstrapped Sample Data",
outline = T)
dev.off()
Appendix B: Dataset with Bootstrapped Sample in CSV format
"","Category","Paid","comment","like","share","Total.Interactions"
"430","Action","Free",0,3,0,3
"71","Action","Paid",7,146,9,162
"172","Action","Free",2,30,6,38
"239","Action","Free",2,101,25,128
"252","Action","Free",6,194,34,234
"38","Action","Paid",16,76,8,100
"361","Action","Free",3,72,24,99
"178","Action","Free",2,234,40,276
"436","Action","Free",11,95,4,110
"424","Action","Free",8,109,3,120
"30","Action","Paid",18,143,13,174
"228","Action","Paid",6,109,11,126
"251","Action","Paid",4,71,10,85
"278","Action","Free",23,204,44,271
"300","Action","Free",0,14,1,15
"62","Action","Paid",8,144,10,162
"115","Action","Free",0,15,2,17
"447","Action","Paid",10,197,21,228
"70","Action","Free",7,84,28,119
"374","Action","Free",2,93,26,121
"36","Product","Free",6,172,21,199
"259","Product","Free",10,167,26,203
"226","Product","Free",9,95,17,121
"122","Product","Free",6,186,40,232
"16","Product","Free",4,86,18,108
"185","Product","Free",11,68,11,90
"149","Product","Free",3,148,28,179
"221","Product","Free",2,86,8,96
"217","Product","Free",2,17,9,28
"411","Product","Free",7,142,24,173
"98","Product","Free",1,115,26,142
"155","Product","Free",13,319,55,387
"6","Product","Free",1,152,33,186
"176","Product","Paid",2,165,22,189
"185.1","Product","Free",11,68,11,90
"282","Product","Free",1,71,17,89
"240","Product","Paid",10,145,31,186
"113","Product","Free",4,117,18,139
"152","Product","Free",2,166,32,200
"236","Product","Free",3,13,1,17
"334","Inspiration","Free",0,64,20,84
"73","Inspiration","Free",3,226,44,273
"367","Inspiration","Free",4,304,47,355
"476","Inspiration","Paid",0,65,19,84
"253","Inspiration","Free",6,226,42,274
"364","Inspiration","Paid",1,179,30,210
"33","Inspiration","Free",2,155,47,204
"127","Inspiration","Free",3,198,41,242
"394","Inspiration","Free",1,57,13,71
"464","Inspiration","Paid",8,134,34,176
"328","Inspiration","Paid",3,97,22,122
"148","Inspiration","Free",0,80,14,94
"337","Inspiration","Free",2,72,9,83
"94","Inspiration","Free",2,111,16,129
"382","Inspiration","Free",11,235,34,280
"479","Inspiration","Free",1,74,31,106
"96","Inspiration","Free",5,153,27,185
"214","Inspiration","Paid",1,11,3,15
"499","Inspiration","Free",7,91,38,136
"479.1","Inspiration","Free",1,74,31,106