Exploring the relationship between number_of_events in extended reality experiences and time_spent by users
Here I report on the analysis of user metrics for extended reality experiences developed by Bitgeyser and Ketama Collective throughout 2023. Because no data are available on the number of QR codes printed per experience, or on their circulation in a given market, Total_Views and Unique_Users per experience are not informative on their own and can only be interpreted in relative terms. For this reason, I focused on the relationship between the number_of_events in an extended reality experience and the time_spent by users on it. Understanding this relationship can help identify the number_of_events associated with longer time spent by users.
I accessed each of the projects available in Zapworks during 2023, downloaded their associated .csv files, and recorded the Number_of_Events for each experience together with its Average_Dwell_Time. With this information in hand, I assembled my own spreadsheet for custom analysis (Table 1).
Using R (a programming language for data science and data visualization), I computed summary statistics (Figure 1): the mean average_dwell_time was 94.7 seconds and the mean number_of_events was 7.9.
# load libraries
library(tidyverse)
library(ggrepel)
# load .csv files containing data on extended reality experiences
# extended reality experiences as "ere"
ere <- read.csv(file.choose())
summary(ere)
The distributions of values for these two variables are shown in Figure 2 and Figure 3. The experience Baggio_Late_Shake is an outlier for average_dwell_time, with 234 seconds (Figure 2A), whereas the experience Chile is an outlier for number_of_events, with 75 events (Figure 3A).
# analyze the distribution of values for average_dwell_time and number_of_events using histograms
# Average Dwell Time
ggplot(data = ere, aes(x = Average_Dwell_Time)) +
  geom_histogram(binwidth = 3, fill = "white", colour = "black")
# Number of Events
ggplot(data = ere, aes(x = Number_of_Events)) +
  geom_histogram(binwidth = 3, fill = "white", colour = "black")
# arrange experiences in descending order for the variable average_dwell_time
ere %>%
  arrange(desc(Average_Dwell_Time))
# arrange experiences in descending order for the variable number_of_events
ere[, -c(2,3)] %>%
  arrange(desc(Number_of_Events))
Based on the above, I removed the experiences Baggio_Late_Shake and Chile from the analysis, since they appear to represent extreme cases. This left 106 experiences for downstream analysis.
# filter out Baggio_Late_Shake and Chile experiences
ere_filtered <- filter(ere, Average_Dwell_Time < 180 & Number_of_Events < 40)
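The thresholds above remove the two experiences indirectly, by value. A minimal sketch of the same step done by name is shown here, assuming the spreadsheet has a name column, called `Experience` in this example (a hypothetical name; the actual column name is not shown in this report). The toy data frame stands in for `ere`: only the 234 seconds and the 75 events come from the text, and every other value is made up for illustration.

```r
library(dplyr)

# Toy stand-in for `ere`. Only 234 (seconds) and 75 (events) come from the
# report; all other values and the `Experience` column name are assumptions.
ere_toy <- data.frame(
  Experience = c("Baggio_Late_Shake", "Chile", "Toy_A", "Toy_B"),
  Average_Dwell_Time = c(234, 60, 90, 120),
  Number_of_Events = c(10, 75, 5, 12)
)

# remove the two outliers explicitly by name
outliers <- c("Baggio_Late_Shake", "Chile")
by_name <- filter(ere_toy, !Experience %in% outliers)

# same result using the value thresholds from the report
by_value <- filter(ere_toy, Average_Dwell_Time < 180 & Number_of_Events < 40)

# check that both approaches keep the same rows
all.equal(by_name, by_value)
```

Filtering by name makes the intent explicit and guards against the thresholds silently dropping other experiences if the data change.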
I then visually explored the relationship between number_of_events and average_dwell_time by creating a scatter plot (Figure 4). For experiences with between 7 and 27 events, the relationship with average_dwell_time appears to be roughly linear: as the number_of_events in the experience increases, the average_dwell_time spent by the user increases as well.
I then recreated the scatter plot using only experiences with at least 3 events, which left 83 experiences in total (Figure 5). The upward trend appears clearer: as the number of events per experience increases (up to 27 events), so does the average time users spend on the experience (up to 135 seconds). However, experiences with more than 27 events appeared to have lower average times spent by users (see the Watts and Andina experiences).
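The plotting code for this step is not included in the report. The following is a minimal sketch of how such a subset and scatter plot could be produced, not necessarily how Figure 5 was made. The toy data frame and its values are made up for illustration, and the choice of a loess smoother for the red trend line is an assumption.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the filtered data; all values are made up for illustration.
toy <- data.frame(
  Number_of_Events = c(1, 2, 3, 4, 6, 8, 10, 13, 17, 21, 27, 33),
  Average_Dwell_Time = c(45, 50, 60, 58, 72, 80, 85, 98, 110, 122, 135, 100)
)

# keep only experiences with at least 3 events
toy_3 <- filter(toy, Number_of_Events >= 3)

# scatter plot with a smoothed trend line drawn in red
# (loess is an assumed smoother; the report does not name the method)
p <- ggplot(toy_3, aes(x = Number_of_Events, y = Average_Dwell_Time)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE, colour = "red") +
  labs(x = "Number of events", y = "Average dwell time (seconds)")

print(p)
```

A local smoother such as loess can bend downward at high event counts, which a straight regression line (`method = "lm"`) cannot show.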
To assess the degree of association between number_of_events and average_dwell_time, I calculated the Pearson correlation coefficient, which measures the strength of the linear relationship between two quantitative variables. The coefficient was 0.31, below 0.5, which indicates a somewhat weak association. However, the association was statistically significant (Figure 6).
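# keep only experiences with at least 3 events (the subset shown in Figure 5)
ere_filtered_3 <- filter(ere_filtered, Number_of_Events >= 3)
# assess the degree of association among variables
cor(ere_filtered_3$Average_Dwell_Time, ere_filtered_3$Number_of_Events) # Pearson correlation of 0.31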
# is the correlation significant?
cor.test(ere_filtered_3$Average_Dwell_Time, ere_filtered_3$Number_of_Events, alternative = "two.sided", method = "pearson")
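As a hedged illustration of how to read the output of cor.test, the sketch below stores the result and extracts its main components. The two vectors are made-up toy values, not the real data, so the numbers it prints will differ from those in the report.

```r
# Toy vectors standing in for the two columns; values are made up for illustration.
events <- c(3, 5, 7, 9, 12, 15, 20, 27)
dwell <- c(60, 75, 70, 90, 85, 110, 120, 135)

res <- cor.test(dwell, events, alternative = "two.sided", method = "pearson")

res$estimate             # Pearson correlation coefficient r
res$p.value              # p-value for the null hypothesis that the true correlation is 0
res$conf.int             # 95% confidence interval for r
unname(res$estimate)^2   # r squared: share of variance captured by a linear fit
```

Applied to the reported coefficient, r = 0.31 gives r squared of about 0.096, so a straight line accounts for roughly 10% of the variation in average_dwell_time. This is consistent with an association that is statistically significant but weak.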
In conclusion, my analysis showed that the average time spent by users on extended reality experiences was positively, though weakly, associated with the number of events in the experience. The trend (red line in Figure 5) suggests that as the number of events increases, so does the time spent by users. However, the trend also suggests that the average time spent by users decreases for experiences with more than 27 events. This finding may help optimize the number of events per experience in order to maximize the time spent by users.