Thursday, March 9, 2017

Normalize Historical EPS data from S&P.


The S&P 500 EPS data comes from here: http://us.spindices.com/documents/additional-material/sp-500-eps-est.xlsx?force_download=true

Suppose the input looks like this:


09/30/2016,$28.69,$25.39
06/30/2016,$25.70,$23.28
  <SKIP>
06/30/1988,$6.05,$6.22
03/31/1988,$5.48,$5.53

The awk command to remove the "$" mark from the 2nd and 3rd columns is:

~$ awk  -F, '{gsub("\\$","",$2);gsub("\\$","",$3);print $1","$2","$3}' < inputfile 

The result is:

09/30/2016,28.69,25.39
06/30/2016,25.70,23.28
      <SKIP>
06/30/1988,6.05,6.22
03/31/1988,5.48,5.53

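Don't forget to use "\\", not "\". Inside an awk string literal the backslash is itself an escape character, so it takes "\\$" for the regex to receive \$ (a literal dollar sign). For the case above, "^." would work the same way, because it matches the first character of the field, which is the "$".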
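As a small sketch of the escaping point, the same substitution can also be written with a regex literal (/.../) instead of a string. A regex literal is parsed only once, so a single backslash is enough. The function name strip_dollar is made up for this example, and the sample rows are the ones from the input above.

```shell
# strip_dollar: hypothetical helper name; reads CSV rows on stdin.
# /\$/ is a regex literal, so one backslash is enough to match a literal "$".
strip_dollar() {
  awk -F, '{ gsub(/\$/, "", $2); gsub(/\$/, "", $3); print $1 "," $2 "," $3 }'
}

# Try it on two rows from the input above.
printf '%s\n' '09/30/2016,$28.69,$25.39' '06/30/2016,$25.70,$23.28' | strip_dollar
```

This should print the same two rows without the "$" marks.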

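Furthermore, the command below removes the 1st field, reverses the row order, and prints a header on the first line for later use. Note that tail -r is the BSD/macOS form; GNU systems provide tac for the same job.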

awk  -F, '{gsub("\\$","",$2);gsub("\\$","",$3);print $2","$3}' < inputfile | tail -r | awk 'BEGIN{print "ope,rep"}{print $0}'

ope,rep
5.48,5.53
6.05,6.22
6.22,6.38
6.37,5.62
6.41,6.74
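Where neither tail -r nor tac is at hand, the reversal can also be done inside awk itself. The following is only a sketch under that assumption: it buffers every row in an array and prints them back to front in the END block. The function name normalize_eps is made up for this example.

```shell
# normalize_eps: hypothetical helper name; reads the raw CSV on stdin.
# Strips "$", drops the date column, and buffers the rows so they can be
# printed oldest-first under an "ope,rep" header without tail -r or tac.
normalize_eps() {
  awk -F, '
    { gsub(/\$/, "", $2); gsub(/\$/, "", $3); row[NR] = $2 "," $3 }
    END {
      print "ope,rep"
      for (i = NR; i >= 1; i--) print row[i]
    }'
}

# Try it on the last two rows of the input above.
printf '%s\n' '06/30/1988,$6.05,$6.22' '03/31/1988,$5.48,$5.53' | normalize_eps
```

The whole file is held in memory this way, which is fine for a few hundred quarterly rows but worth keeping in mind for larger inputs.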
