1 Introduction to the data

To analyze the impact of average annual income on life expectancy as seen globally in 2015, this report uses the single merged data set of longitudinal and categorical data (named LifeExp_Clean) that was cleaned as part of “The Race to Immortality: An analysis of variables impacting average life expectancy”. The merged data frame includes the following information for each country and year:

  • Average Life Expectancy
  • Average Annual Income
  • Annual Country Population Data
  • Country Region Data
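library(openxlsx)  # assumed source of read.xlsx(); it can read directly from a URL
life.raw <- read.xlsx("https://nlepera.github.io/sta553/w05_ggplot/LifeExp_Clean.xlsx")
str(life.raw)  # inspect column names and types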
'data.frame':   37590 obs. of  6 variables:
 $ Country   : chr  "Afghanistan" "Albania" "Algeria" "Angola" ...
 $ Year      : chr  "X1800" "X1800" "X1800" "X1800" ...
 $ Life_Exp  : num  28.2 35.4 28.8 27 33.5 33.2 34 34 34.4 29.2 ...
 $ Income    : num  603 667 715 618 757 1510 514 814 1850 775 ...
 $ Population: num  3280000 410000 2500000 1570000 37000 534000 413000 351000 3210000 880000 ...
 $ Continent : chr  "Asia" "Europe" "Africa" "Africa" ...
DT::datatable(life.raw, fillContainer = FALSE, options = list(pageLength = 10))




2 Pruning the Data for Visualization Purposes

Because the full data set holds 37,590 country-year observations, a subset for the year 2015 (named data2015) was taken for visualization.

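library(dplyr)  # provides filter()
# Keep only the 2015 rows; Year is stored as text with an "X" prefix
data2015 <- filter(life.raw, Year == "X2015")
str(data2015)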
'data.frame':   174 obs. of  6 variables:
 $ Country   : chr  "Afghanistan" "Albania" "Algeria" "Andorra" ...
 $ Year      : chr  "X2015" "X2015" "X2015" "X2015" ...
 $ Life_Exp  : num  57.9 77.6 77.3 82.5 64 77.2 76.5 75.4 82.6 81.4 ...
 $ Income    : num  1750 11000 13700 46600 6230 20100 19100 8180 43800 44100 ...
 $ Population: num  33700000 2920000 39900000 78000 27900000 99900 43400000 2920000 23800000 8680000 ...
 $ Continent : chr  "Asia" "Europe" "Africa" "Europe" ...
DT::datatable(data2015, fillContainer = FALSE, options = list(pageLength = 10))




3 Plotting the Data

3.1 2015 Data

The prepared and pruned data in data2015 are plotted below. Because the subset contains 174 countries, points are color coded by continent rather than by country to keep the plot readable.

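library(plotly)  # provides plot_ly(), layout() and the %>% pipe

plot_ly(
    data = data2015,
    x = ~Income,
    y = ~Life_Exp,
    size = ~Population,            # marker size scales with population
    color = ~factor(Continent),    # one color per continent
    colors = c('#332288','#117733', '#0072B2', '#D55E00', '#882255'),
    # Extra fields for the hover box, referenced as %{text} below
    text = ~paste("Continent: ", Continent,
                  "<br>Country: ", Country,
                  "<br>Population: ", Population),
    hovertemplate = paste('<i><b>Life Expectancy</b></i>: %{y}',
                          '<br><b>Income</b>:  %{x}',
                          '<br><b>%{text}</b>'),
    alpha = 0.8,
    marker = list(line = list(color = "black")),  # black marker outline
    type = "scatter",
    mode = "markers") %>%          # set explicitly so plotly does not have to guess the mode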
     layout(
      title = list(text="Impact of Income on Global Life Expectancy as Seen in 2015 <br> Controlled for both region (Continent) and population size (Population)", y = 1.2),
      xaxis = list(
        title = "Average Annual Income ($/year)",
        linecolor = "black"
      ),
      yaxis = list(
        title = "Average Life Expectancy (years)"
      ),
      legend = list(title = list(text="Continent"), y= 0.8, x = 1),
   annotations = list(  
                     x = 0.8,
                     y = 0.2, 
                  font = list(size = 12, color = "darkred"),   
                  text = "The point size is proportional to the country population size",   
                  xref = "paper",    
                  yref = "paper", 
               xanchor = "center", 
               yanchor = "bottom", 
             showarrow = FALSE))

The 2015 data show trends similar to those seen in the 2000 data, with a positive correlation between average annual income and average life expectancy across all continents. Continents with a wider range of average income per person show a stronger positive correlation between income and life expectancy. The effect of population size is harder to read from the plot above, but for Asia there appears to be a minor negative correlation between population size and life expectancy, as well as between population size and income per person. This is suggested by the clustering of larger points toward the bottom left of the plot (lower values for both income and life expectancy).
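The per-continent pattern described above could also be checked numerically. The following is a minimal sketch and not part of the original analysis: cor_by_continent is a hypothetical helper, and the use of Spearman rank correlation is an assumption made here because it does not require the relationship between income and life expectancy to be linear. It expects a data frame with the Continent, Income, and Life_Exp columns shown in the str() output.

```r
# Hypothetical helper (not from the original report): Spearman rank
# correlation between Income and Life_Exp, computed per continent.
cor_by_continent <- function(df, min_n = 3) {
  groups <- split(df, df$Continent)  # one data frame per continent
  rho <- vapply(groups, function(g) {
    ok <- complete.cases(g$Income, g$Life_Exp)  # drop rows with missing values
    if (sum(ok) < min_n) return(NA_real_)       # too few points to be meaningful
    cor(g$Income[ok], g$Life_Exp[ok], method = "spearman")
  }, numeric(1))
  data.frame(
    Continent = names(groups),
    n = unname(vapply(groups, nrow, integer(1))),  # countries per continent
    spearman = unname(rho),
    stringsAsFactors = FALSE
  )
}

# Usage, assuming data2015 has been created as in section 2:
# cor_by_continent(data2015)
```

Swapping Income for Population in the helper would give a rough check of the population-size pattern suggested for Asia.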

3.2 All Data over Time

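plot_ly(
    data = life.raw,
    x = ~Income,
    y = ~Life_Exp,
    size = ~Population,            # marker size scales with population
    color = ~factor(Continent),    # one color per continent
    colors = c('#332288','#117733', '#0072B2', '#D55E00', '#882255'),
    frame = ~Year,                 # one animation frame per year
    # Extra fields for the hover box, referenced as %{text} below
    text = ~paste("Continent: ", Continent,
                  "<br>Country: ", Country,
                  "<br>Population: ", Population),
    hovertemplate = paste('<i><b>Life Expectancy</b></i>: %{y}',
                          '<br><b>Income</b>:  %{x}',
                          '<br><b>%{text}</b>'),
    alpha = 0.8,
    marker = list(line = list(color = "black")),  # black marker outline
    type = "scatter",
    mode = "markers") %>%          # set explicitly so plotly does not have to guess the mode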
     layout(
      title = list(text="Impact of Income on Global Life Expectancy for All Years <br> Controlled for both region (Continent) and population size (Population)", y = 1.2),
      xaxis = list(
        title = "Average Annual Income ($/year)"
      ),
      yaxis = list(
        title = "Average Life Expectancy (years)"
      ),
      legend = list(title = list(text="Continent"), y= 0.8, x = 1),
   annotations = list(  
                     x = 0.8,
                     y = 0.2, 
                  font = list(size = 12, color = "darkred"),   
                  text = "The point size is proportional to the country population size",   
                  xref = "paper",    
                  yref = "paper", 
               xanchor = "center", 
               yanchor = "bottom", 
             showarrow = FALSE))

Comparing Average Life Expectancy with Average Annual Income on a global scale shows that the correlations seen in the year 2000 and year 2015 analyses are not present throughout the entire range of years for which data are available. Until approximately 1950 there was a significant increase in Average Life Expectancy globally, with little increase in Average Annual Income. Barring temporary decreases in Average Life Expectancy that correspond to the years of both World Wars, Average Life Expectancy in this period appears to be nearly independent of Average Annual Income. After 1950 the positive correlation between Average Life Expectancy and Average Annual Income becomes more pronounced each year.
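One way to examine how the relationship changes over time is to compute a correlation for each year. The sketch below is illustrative only and was not part of the original analysis: cor_by_year is a hypothetical helper, Spearman rank correlation is an assumed choice, and the code assumes the Year column uses the "X1800" style labels shown in the str() output.

```r
# Hypothetical helper (not from the original report): Spearman rank
# correlation between Income and Life_Exp, computed separately per year.
cor_by_year <- function(df, min_n = 3) {
  year <- as.integer(sub("^X", "", df$Year))  # "X1800" -> 1800
  groups <- split(df, year)                   # groups are ordered by year
  rho <- vapply(groups, function(g) {
    ok <- complete.cases(g$Income, g$Life_Exp)  # drop rows with missing values
    if (sum(ok) < min_n) return(NA_real_)       # too few points to be meaningful
    cor(g$Income[ok], g$Life_Exp[ok], method = "spearman")
  }, numeric(1))
  data.frame(
    Year = as.integer(names(groups)),
    spearman = unname(rho)
  )
}

# Usage, assuming life.raw has been loaded as in section 1:
# by_year <- cor_by_year(life.raw)
# plot(by_year$Year, by_year$spearman, type = "l")
```

A line plot of the yearly values would show whether the correlation strengthens after roughly 1950, as the animated plot suggests.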


