Exploring the structure of national consumption
This entry is a direct continuation of my first exploration of the structure of national resource consumption.
library(ggplot2)
# READING IN DATA
## SETTING DIRECTORY FOR EORA DATA ON LOCAL HARD DRIVE
wd<-"G:/Documents/PostDocKVA/Data/Eora" ### data directory
setwd(wd)
dir()
## [1] "countries.csv" "country_lookup.csv"
## [3] "Eora26_2011_bp.zip" "Eora26Structure.xlsx"
## [5] "gdppop.csv" "regionmembership.csv"
## [7] "TradeBalance_I-ENERGY.csv" "TradeBalance_I-ENERGY.xlsx"
## [9] "TradeBalance_I-VA.csv" "TradeBalance_I-VA.xlsx"
## [11] "Wiedmann"
## READING IN DATA
### MATERIAL USE DATA - ENERGY DATASET
energy.df<-read.csv("TradeBalance_I-ENERGY.csv",header=TRUE)
### Reading in .csv file with annual gdp and population sizes
gdppop.df<-read.csv("gdppop.csv",header=TRUE,skip=1) #skipping the first line which includes a description of the file
## REMOVING NEGATIVE AND ZERO CONSUMPTION ENTRIES
energy.df<-energy.df[which(energy.df[,"Consumption"]>0),]
## REMOVING THE ENTRIES FOR THE FORMER USSR
energy.df<-energy.df[-which(as.character(energy.df$Country)=="Former USSR"),]
## merging the gdp and population size data onto the energy consumption data frame
energy.df<-merge(energy.df,gdppop.df,by=c("CountryA3","y","Country"),all.x=TRUE)
## To make consumption comparable across countries, let's relate it to population size and GDP
### calculate per capita consumption (val = population size) and the GDP intensity of consumption
energy.df[,"Consum.pop.int"]<-energy.df[,"Consumption"]/energy.df[,"val"]
energy.df[,"Consum.gdp.int"]<-energy.df[,"Consumption"]/energy.df[,"GDP"]
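Because the merge keeps every row of energy.df (all.x=TRUE), country-years without a match in gdppop.df end up with NA for population and GDP, and therefore for both intensities. A minimal sketch of how one might count these per year, using made-up toy data (the names toy.energy, toy.gdppop and missing.by.year, and all values, are hypothetical and not taken from the Eora files):

```r
# Toy stand-ins for energy.df and gdppop.df (hypothetical values)
toy.energy <- data.frame(CountryA3 = c("AAA", "AAA", "BBB", "BBB"),
                         y = c(1990, 1991, 1990, 1991),
                         Consumption = c(100, 110, 50, 55))
toy.gdppop <- data.frame(CountryA3 = c("AAA", "AAA", "BBB"),
                         y = c(1990, 1991, 1990),
                         GDP = c(1000, 1100, 400),
                         val = c(10, 11, 5))  # val = population size

# Left join: rows of toy.energy without a match get NA for GDP and val
toy.merged <- merge(toy.energy, toy.gdppop, by = c("CountryA3", "y"), all.x = TRUE)
toy.merged$Consum.pop.int <- toy.merged$Consumption / toy.merged$val

# Number of country-years with missing per capita consumption, by year
missing.by.year <- table(toy.merged$y[is.na(toy.merged$Consum.pop.int)])
print(missing.by.year)
```

Run against the real data frame, a table like this would show which years account for the rows that ggplot later drops as missing.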
Picking up where we left off.
## visualizing per capita consumption and the GDP efficiency of consumption
### per capita consumption
ggplot(energy.df,aes(y=Consum.pop.int,x=y,group=CountryA3)) + geom_line()
## Warning: Removed 320 rows containing missing values (geom_path).
### GDP intensity of consumption
ggplot(energy.df,aes(y=Consum.gdp.int,x=y,group=CountryA3)) + geom_line()
We immediately see that some time series contain one or more years with abnormal fluctuations. These anomalies are unlikely to reflect actual changes in the structure of consumption, but could instead be due to sudden changes in accounting methods. This is one of the main limitations of using accounting statistics to estimate consumption. Now let’s take a look at the countries and years that exhibit large anomalies.
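One way to make "abnormal fluctuation" concrete, sketched here on made-up numbers rather than the Eora data, is to compute the year-on-year ratio within each country and rank the largest jumps. The data frame toy.df and the column yoy.ratio are hypothetical names, and the sketch assumes each country has one row per consecutive year.

```r
# Toy time series for two countries (hypothetical values); BBB spikes in 1991
toy.df <- data.frame(Country = rep(c("AAA", "BBB"), each = 4),
                     y = rep(1989:1992, times = 2),
                     Consum.pop.int = c(1.0, 1.1, 1.2, 1.3,
                                        2.0, 2.1, 8.4, 2.2))
toy.df <- toy.df[order(toy.df$Country, toy.df$y), ]

# Ratio of each year's value to the previous year's, computed within country;
# the first year of each country has no predecessor and gets NA
toy.df$yoy.ratio <- ave(toy.df$Consum.pop.int, toy.df$Country,
                        FUN = function(x) c(NA, x[-1] / x[-length(x)]))

# Largest upward jumps first (NA ratios are sorted last)
head(toy.df[order(toy.df$yoy.ratio, decreasing = TRUE), ], 3)
```

A ratio far from 1 in a single year, followed by a return to the previous level, is the kind of pattern that suggests an accounting artefact rather than a real change.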
Just by looking at the plots above we see that many of the per capita consumption anomalies occur in 1991.
energy.df[order(energy.df$Consum.pop.int, decreasing=TRUE)[1:20],c("Country","y","Consum.pop.int")]
## Country y Consum.pop.int
## 6392 San Marino 1991 5.676499
## 6261 Singapore 1975 5.469364
## 7568 British Virgin Islands 1991 3.648615
## 4431 Monaco 1991 3.485693
## 1870 Cayman Islands 1991 3.416776
## 1030 Bermuda 1991 3.220180
## 4095 Liechtenstein 1991 3.176298
## 6260 Singapore 1974 2.641860
## 6476 Serbia 1991 2.545683
## 2980 Guyana 2010 2.480164
## 5962 Qatar 1970 2.327962
## 253 UAE 1970 2.189188
## 2978 Guyana 2008 2.189014
## 5963 Qatar 1971 2.177633
## 2979 Guyana 2009 2.133004
## 254 UAE 1971 2.059360
## 255 UAE 1972 2.020770
## 5964 Qatar 1972 2.003712
## 2977 Guyana 2007 1.942043
## 22 Aruba 1991 1.937733
It looks like many of the countries that exhibit per capita consumption anomalies are characterized by having a small area and population and small territorial emissions (i.e. domestic extraction of resources).
energy.df[order(energy.df$Consum.gdp.int, decreasing=TRUE)[1:20],c("Country","y","Consum.gdp.int")]
## Country y Consum.gdp.int
## 6560 Sudan 1991 53.764571
## 948 Belarus 1993 32.356232
## 4475 Moldova 1993 18.520411
## 4482 Moldova 2000 13.564694
## 6193 South Sudan 1991 12.848089
## 4481 Moldova 1999 11.373931
## 4474 Moldova 1992 10.666067
## 4484 Moldova 2002 10.644795
## 4480 Moldova 1998 10.634448
## 4485 Moldova 2003 10.523747
## 4486 Moldova 2004 10.270041
## 4479 Moldova 1997 10.089988
## 4483 Moldova 2001 9.791289
## 4487 Moldova 2005 9.641692
## 6539 Sudan 1970 9.358367
## 4478 Moldova 1996 8.784872
## 955 Belarus 2000 8.776421
## 4477 Moldova 1995 8.628466
## 4488 Moldova 2006 8.510303
## 949 Belarus 1994 8.374650
High GDP consumption intensity, on the other hand, seems to be limited to a smaller number of countries, notably Moldova but also Sudan, South Sudan and Belarus.
To investigate aberrant consumption values more properly, I will scale each national time series to mean 0 and standard deviation 1.
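For concreteness, scaling here means replacing each value x by z = (x - mean) / sd, with mean and sd taken over that country's own time series. The small sketch below uses ave(), which returns results in the original row order and so does not depend on how the data frame is sorted. The data frame toy.df and its values are made up for illustration, and this is an alternative formulation, not the code used in this entry.

```r
# Toy data, deliberately not sorted by country (hypothetical values)
toy.df <- data.frame(Country = c("BBB", "AAA", "BBB", "AAA", "AAA", "BBB"),
                     y = c(1990, 1990, 1991, 1991, 1992, 1992),
                     Consum.pop.int = c(2, 10, 4, 20, 30, 6))

# z-score within each country; as.numeric() drops the matrix attributes
# that scale() attaches to its result
toy.df$Consum.pop.int.scale <- ave(toy.df$Consum.pop.int, toy.df$Country,
                                   FUN = function(x) as.numeric(scale(x)))
print(toy.df)
```

After scaling, a value of 6 means the same thing for every country: six of that country's own standard deviations above its own mean.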
energy.df<-energy.df[order(energy.df[,"Country"],energy.df[,"y"]),]
head(energy.df)
## CountryA3 y Country TerritorialEmissions Imports Exports
## 43 AFG 1970 Afghanistan 115043 15343 1343
## 44 AFG 1971 Afghanistan 115043 13345 1262
## 45 AFG 1972 Afghanistan 115043 11725 1465
## 46 AFG 1973 Afghanistan 115043 9633 1392
## 47 AFG 1974 Afghanistan 115043 8265 1213
## 48 AFG 1975 Afghanistan 115043 7828 1150
## DirectEmissions Consumption GDP val Consum.pop.int
## 43 43301 129043 1277935 11839729 0.010899151
## 44 43301 127126 1362663 12138578 0.010472891
## 45 43301 125303 1168728 12449180 0.010065161
## 46 43301 123284 1266842 12760486 0.009661388
## 47 43301 122094 1567124 13058067 0.009350082
## 48 43301 121720 1722525 13328589 0.009132249
## Consum.gdp.int
## 43 0.10097775
## 44 0.09329233
## 45 0.10721314
## 46 0.09731600
## 47 0.07790960
## 48 0.07066371
energy.df[,"Consum.pop.int.scale"]<-unlist(by(energy.df,energy.df[,"Country"], function(x) scale(x[,"Consum.pop.int"],center=TRUE,scale=TRUE)))
energy.df[,"Consum.gdp.int.scale"]<-unlist(by(energy.df,energy.df[,"Country"], function(x) scale(x[,"Consum.gdp.int"],center=TRUE,scale=TRUE)))
ggplot(energy.df,aes(y=Consum.pop.int.scale,x=y,group=CountryA3)) + geom_line()
## Warning: Removed 320 rows containing missing values (geom_path).
energy.df[order(energy.df$Consum.pop.int.scale,decreasing=TRUE)[1:20],c("Country","y","Consum.pop.int.scale")]
## Country y Consum.pop.int.scale
## 4179 Lesotho 1991 6.198724
## 22 Aruba 1991 6.159096
## 694 Burkina Faso 1991 6.140783
## 1744 Cape Verde 1991 6.136408
## 7694 Samoa 1991 6.136137
## 4431 Monaco 1991 6.105693
## 6812 Seychelles 1991 6.101615
## 6518 Sao Tome and Principe 1991 6.086875
## 400 Antigua 1991 6.081820
## 4095 Liechtenstein 1991 6.080496
## 4808 Montenegro 1991 6.063593
## 6109 Rwanda 1991 6.057820
## 4682 Mali 1991 6.055243
## 1030 Bermuda 1991 6.051178
## 4011 Liberia 1991 6.028471
## 988 Belize 1991 6.009964
## 2877 Greenland 1991 5.906676
## 3647 Japan 2005 5.887532
## 7652 Vanuatu 1991 5.823841
## 4557 Maldives 1991 5.796995
ggplot(energy.df,aes(x=Consum.pop.int.scale))+geom_histogram()+facet_wrap(~y)
The annual histograms of scaled per capita consumption show that 1991 is indeed a weird year compared to its neighbouring years. Similarly, the year 2000 appears to have a long right-skewed tail in its frequency distribution.
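A complementary check to eyeballing the histograms, sketched on made-up scaled values, is to count per year how many country values exceed some threshold in absolute scaled value. The threshold of 3 and the names toy.df, z and n.extreme are arbitrary choices for illustration.

```r
# Toy scaled values for three years (hypothetical); NA mimics missing data
toy.df <- data.frame(y = rep(c(1990, 1991, 1992), each = 4),
                     z = c(0.1, -0.5, 0.3, NA,
                           4.2, 5.9, -0.2, 3.5,
                           0.0, 0.8, -3.1, 0.4))

threshold <- 3
# Count of values beyond the threshold per year; na.rm = TRUE ignores missing values
n.extreme <- tapply(abs(toy.df$z) > threshold, toy.df$y, sum, na.rm = TRUE)
print(n.extreme)
```

A year that stands out in such a table is a candidate for the kind of accounting anomaly discussed above.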
How about economic intensity of consumption?
ggplot(energy.df,aes(y=Consum.gdp.int.scale,x=y,group=CountryA3)) + geom_line()
energy.df[order(energy.df$Consum.gdp.int.scale,decreasing=TRUE)[1:20],c("Country","y","Consum.gdp.int.scale")]
## Country y Consum.gdp.int.scale
## 6193 South Sudan 1991 6.162223
## 6560 Sudan 1991 6.091356
## 4011 Liberia 1991 6.066484
## 6518 Sao Tome and Principe 1991 5.811309
## 948 Belarus 1993 5.561503
## 7694 Samoa 1991 5.496875
## 6434 Somalia 1991 5.412538
## 4179 Lesotho 1991 5.225550
## 6602 Suriname 1991 4.830266
## 2982 Hong Kong 1970 4.536611
## 7736 Yemen 1991 4.416178
## 694 Burkina Faso 1991 4.402034
## 2857 Greenland 1970 4.311507
## 1744 Cape Verde 1991 4.292522
## 4032 Libya 1970 4.222122
## 6261 Singapore 1975 4.195786
## 1849 Cayman Islands 1970 4.152875
## 967 Belize 1970 4.132067
## 3402 Iceland 1970 4.117356
## 6896 Chad 1991 4.098158
ggplot(energy.df,aes(x=Consum.gdp.int.scale))+geom_histogram()+facet_wrap(~y)
Again, 1991 is abnormal compared to the years immediately before and after it. It also looks like very few countries have population data for 2011.
For now, to avoid the abnormal fluctuations in the consumption intensities in 1991 and 2000, I will remove these two years from the dataset. I will also remove 2011.
I will definitely need to return to 1991 and 2000 to better understand how energy consumption relates to the size of the population and the economy in these years.
energy.df<-energy.df[!(energy.df[,"y"] %in% c(1991,2000,2011)),] ### logical indexing stays safe even if no rows match
ggplot(energy.df,aes(y=Consum.pop.int.scale,x=y)) + geom_line(aes(group=CountryA3))
## Warning: Removed 135 rows containing missing values (geom_path).
ggplot(energy.df,aes(y=Consum.gdp.int.scale,x=y)) + geom_line(aes(group=CountryA3))
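One caveat: the scaled columns plotted here were computed before the three years were dropped, so within each country they no longer have exactly mean 0 and sd 1. If that matters, the scaling could be redone on the reduced data. A sketch on made-up numbers (toy.df, toy.sub and zscore are hypothetical names):

```r
# One toy country with an outlier in 1991 (hypothetical values)
toy.df <- data.frame(Country = rep("AAA", 5),
                     y = 1989:1993,
                     x = c(1, 2, 50, 3, 4))
zscore <- function(v) as.numeric(scale(v))

toy.df$x.scale <- ave(toy.df$x, toy.df$Country, FUN = zscore)

# Drop the anomalous year, as done for 1991, 2000 and 2011 in this entry
toy.sub <- toy.df[!(toy.df$y %in% 1991), ]

# The old scaled values no longer have mean 0 and sd 1 ...
print(c(mean = mean(toy.sub$x.scale), sd = sd(toy.sub$x.scale)))

# ... so recompute them on the remaining years
toy.sub$x.scale <- ave(toy.sub$x, toy.sub$Country, FUN = zscore)
print(c(mean = mean(toy.sub$x.scale), sd = sd(toy.sub$x.scale)))
```

Whether to rescale depends on the question: keeping the original scaling preserves comparability with the earlier plots, while rescaling stops the removed outliers from compressing the remaining values.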
To summarize the overall pattern in the intensity of consumption in relation to GDP and population: there looks to be a rather monotonic decline in energy/GDP from 1970 to 2010. In contrast, energy/population undergoes a general decrease from 1970 to the mid-1990s, after which it starts to increase again in some countries.
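One tentative way to back this visual impression with numbers would be to summarise each year by the cross-country median of the scaled intensities. The sketch below uses made-up values; toy.df and annual.median are hypothetical names.

```r
# Toy scaled GDP intensities for three countries and three years (hypothetical)
toy.df <- data.frame(CountryA3 = rep(c("AAA", "BBB", "CCC"), each = 3),
                     y = rep(c(1970, 1990, 2010), times = 3),
                     Consum.gdp.int.scale = c(1.2, 0.1, -1.1,
                                              0.9, 0.0, -1.0,
                                              1.0, -0.2, -0.9))

# Median across countries for each year; the formula interface drops NA rows
annual.median <- aggregate(Consum.gdp.int.scale ~ y, data = toy.df, FUN = median)
print(annual.median)

# TRUE if the annual median never increases from one year to the next
print(all(diff(annual.median$Consum.gdp.int.scale) <= 0))
```

Applied to the real data, the same check on Consum.pop.int.scale would be expected to fail if the rebound after the mid-1990s is widespread enough to move the median.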
I will continue exploring the relative contribution of domestic extraction and imports to energy consumption in the next notebook entry.