Data Science for Economics and Finance: Tidy Programming in R

N.F. Katzke

It is safe to say that the tidyverse, or tidy coding, requires a complete rethink on how you approach data analytics.

Much of how we go about data analysis is tainted by our use of excel, and our propensity to think in terms of many columns, fewer rows and many (oh so many) excel tabs.

Bonus practical (more examples)

For the code for the second part, first try it - and then look at the solution that will be posted here in time:

library(tidyverse)

# Cap-weight index return per country:
df_use <- 

  fmxdat::Fin_Data_SA_US_NKY %>% group_by(Universe) %>% mutate(YM = format(date, "%Y%B")) %>% 
  # For easier naming:
  rename(Sector = BICS_LEVEL_1_SECTOR_NAME)
  

# Cap-weighted index returns per month, universe and sector:

df_rets <- 

df_use %>% group_by(Universe, Sector, YM) %>% 
  
  mutate(weight = Market.Cap / sum(Market.Cap, na.rm=T)) %>% 
  
  group_by(Tickers, YM) %>% 
  
  filter(date == last(date)) %>% 
  
  # Date gap issue resolved:
  arrange(date) %>% group_by(Tickers) %>% 
  
  mutate(Returns = ifelse(date - lag(date) >= 35, NA, TRI / lag(TRI)-1)) %>% 
  
  filter(date > first(date)) %>% ungroup() %>% 
  
  # Drop some redundant columns:
  select(-Short.Name, -contains("BICS"))
  

# So for the first question:
  
Rets_per_sector <- df_rets %>% group_by(YM, Universe, Sector) %>% summarise(Returns = sum(weight*Returns, na.rm=T))


# Question 2:

Rets_per_Universe <- 
  
  df_rets %>% 
  # Drop the Sector grouping:
  group_by(YM, Universe) %>% mutate(weight = Market.Cap / sum(Market.Cap, na.rm=T)) %>% 
  
  summarise(Index_Returns = sum(weight*Returns, na.rm=T), EW_Returns = sum(Returns*1/n(), na.rm=T)) %>% ungroup()


Rets_per_Universe %>% group_by(Universe) %>% 
  
  # Chaining the returns set, i.e. (1+Ret_1) * (1+Ret_2) * .... (1+Ret_20)
  summarise(Full_Ret = prod(1+Index_Returns)-1, Full_SD = sd(Index_Returns),
            Full_Ret_EW = prod(1+EW_Returns)-1, Full_SD_EW = sd(EW_Returns),
            Sharpe = Full_Ret / Full_SD, Sharpe_EW = Full_Ret_EW / Full_SD_EW) %>% 
  
  # Rearrange: (notice everything for 'everything else')
  
  select(Universe, contains("Ret"), contains("SD"), everything())



# Question 3:


df_rets %>% filter(Universe == "JALSHAll") %>% 
  
  # Again, drop the Sector grouping:
  group_by(YM, Universe) %>% mutate(weight = Market.Cap / sum(Market.Cap, na.rm=T)) %>% 
  
  group_by(YM) %>% arrange(date, desc(weight)) %>% 
  
  mutate(CumsumW = cumsum(weight)) %>% filter(CumsumW <= 0.95) %>% 
  
  group_by(YM) %>% summarise(Top95_Index_Return = sum(weight * Returns, na.rm=T)) %>% 
  
  summarise(Full_Ret = prod(1+Top95_Index_Return)-1, Full_SD = sd(Top95_Index_Return),Sharpe = Full_Ret / Full_SD)
  
  
# DIY: see if you can annualise these numbers to boot....

Tidy Programming in R

Author

Affiliation

Published

Citation

In this practical we introduce the tidyverse

Bonus practical (more examples)

Footnotes

Citation

Course Content Fully Copyright protected: Nico Katzke

nfkatzke@gmail.com