As we come close to the end of 2017, its time to look back. This has been a great year for me in many ways. This blog started as a way to write short pieces about using R for finance and promote my book in an organic way. Today, I’m very happy with my decision. Discovering and trying new writing styles keeps my interest very alive. Academic research is very strict on what you can write and publish. It is satisfying to see that I can promote my work and have an impact in different ways, not only through the publication of academic papers.

My blog is build using a Jekyll template, meaning the whole site, including individual posts, is built and controlled with editable text files and Github. All files related to posts follow the same structure, meaning I can easily gather the textual data and organize it in a nice tibble. Let’s first have a look in all post files:

post.folder <- '~/GitRepo/msperlin.github.io/_posts/'

my.f.posts <- list.files(post.folder, full.names = TRUE)
my.f.posts

##  [1] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-15-First-post.md"                  
##  [2] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-16-BatchGetSymbols.md"             
##  [3] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-17-predatory.md"                   
##  [4] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-18-GetHFData.md"                   
##  [5] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-19-CalculatingBetas.md"            
##  [6] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-30-Exams-with-dynamic-content.md"  
##  [7] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-05-R-and-Tennis.md"                
##  [8] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-06-My-Book-is-out.md"              
##  [9] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-10-Shiny_Exams.md"                 
## [10] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-13-R-and-Tennis-Players.md"        
## [11] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-16-Writing-a-book.md"              
## [12] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-03-05-Prophet-and_stock-market.md"    
## [13] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-03-26-pmdR-exercises.md"              
## [14] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-05-04-pafdR-is-out.md"                
## [15] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-05-09-Studying-Pkg-Names.md"          
## [16] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-05-15-R-Finance.md"                   
## [17] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-05-29-Update-GetHFData-1-3.md"        
## [18] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-06-01-Instaling-R-in-Linux.md"        
## [19] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-08-24-Reinstalling_R_Packages.md"     
## [20] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-08-24-Switching_to_Linux.md"          
## [21] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-09-04-Package-GetLattesData.md"       
## [22] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-09-10-Update-GetHFData-1-4.md"        
## [23] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-09-14-Brazilian-Yield-Curve.md"       
## [24] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-09-29-_Package-GetITRData.md"         
## [25] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-12-06-_Package-GetDFPData.md"         
## [26] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-12-13-_Serving-shiny-apps-internet.md"

I posted 26 posts during 2017. Notice how all dates are in the beginning of the file name. I can easily convert that to a Date object using as.Date. Let’s organize it all in a nice tibble.

library(tidyverse)

## ── Attaching packages ─────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──

## ✔ ggplot2 2.2.1     ✔ purrr   0.2.4
## ✔ tibble  1.4.1     ✔ dplyr   0.7.4
## ✔ tidyr   0.7.2     ✔ stringr 1.2.0
## ✔ readr   1.1.1     ✔ forcats 0.2.0

## ── Conflicts ────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

df.posts <- tibble(ref.date = as.Date(basename(my.f.posts)),
                   ref.month = format(ref.date, '%m'), 
                   content = sapply(my.f.posts, function(x) paste0(readLines(x), collapse = '\n') ),
                   char.length = nchar(content)) %>%  # includes output code in length calculation..
  filter(ref.date > as.Date('2017-01-01') | ref.date < as.Date('2018-01-01') ) # not really necessary but keep it for future

glimpse(df.posts)

## Observations: 26
## Variables: 4
## $ ref.date    <date> 2017-01-15, 2017-01-16, 2017-01-17, 2017-01-18, 2...
## $ ref.month   <chr> "01", "01", "01", "01", "01", "01", "02", "02", "0...
## $ content     <chr> "---\nlayout: post\ntitle: \"My first post!\"\nsub...
## $ char.length <int> 1734, 5833, 6632, 17265, 23414, 12974, 18899, 1779...

Fist, let’s look at the frequency of posts by month:

print( ggplot(df.posts, aes(x = ref.month)) + geom_histogram(stat='count')) 

## Warning: Ignoring unknown parameters: binwidth, bins, pad

It is not accidental that january was the month with the highest number of posts. This is when I had material reserved for the book. June and July (0!) were the worst months as I traveled a lot. In June I attended R and Finance in Chicago, SER in Rio de Janeiro and in July I was visiting Goethe University in Germany for the whole month. On average, I created 2.1666667 posts per month overall, which fells quite alright. I hope I can keep that pace for the upcoming years.

As for the length of posts, below we can see a nice pattern for its distribution conditional on the months of the year.

print(ggplot(df.posts, aes(x=ref.month, y = char.length)) + geom_boxplot())

I was not very productive from may to august, writing a few and short posts, when comparing to other months. This was probably due to my travels.

Plans for 2018

Along the usual effort in research and teaching, my plans for 2018 are:

  • Work on the second edition of the portuguese book. It significantly lags the english version in content and this need to be fixed. I already have some ideas laid out for new chapters and new packages to cover. I’ll write more about this update as soon as I have it figured out.

  • Start a portal for financial data in Brazil. I want to make it easy for people to visualize and download organized financial data, specially those without programming experience. It will include the usual datasets such as prices in equity/bond/derivative markets for various frequencies, historical yield curves, financial statements of companies, and so on. The idea is to offer the datasets in various file formats, facilitating its use in research.

Thats it. If you got this far, happy new year! Enjoy your family and the holidays!