Showing posts with the label system.

Tuesday, February 7, 2017

Get data and construct R statements to load daily data - perl


  1. The data arrives as "[[2017-01-01,54],[2017-01-02,79],<SKIP>,[2017-02-01,160]]".
  2. It starts and ends with two brackets.
  3. Each day's data is a pair of a date and the number of incidents; the pairs are separated by "],[".
  4. Run the wget command.
  5. Open the retrieved file.
  6. Remove the unnecessary brackets and separators, and store the data in the array @data.
  7. Split each @data entry into its date and its count.
  8. Construct the R statements to load, which are built from the start date, the end date, and the number of incidents for each day.



#! /usr/bin/perl

# $file="./testdata";
system("wget http://10.251.66.58/kljstatistics/php/dashboard/maindashboard/dailynew2016.php");
$file="./dailynew2016.php";
$outfile="> ./dataget.r";

open(IN,$file) or die "$!";
open(OUT,$outfile) or die "$!";
print("# start\n");
while(<IN>){
 # data is expected to come in a single line.
 # drop a trailing newline, if any, so that the last 2 characters really are "]]".
 chomp;
 # remove 2 brackets at the start and the end of the input.
 $buff = substr($_,2,length($_));
 $buff = substr($buff,0,length($buff)-2);
 # split data at the sequence of "],["
 @data = split(/\],\[/, $buff);
}
print "# file read end. the size of data  is ";
# get the length of the list.
$size = @data;
print "# ";
print $size;
print "\n\n\n";

$count = 0;
# prepare start of the output statement.
$output = "w <- c(); w <- append(w,xts(c(";
@xts = split(/,/,$data[0]); # take 1st element
$startdate = $xts[0];  # and store its date part
# print "startdate is ",$startdate; # debug purpose
foreach $element(@data)  # disassemble @data from start again.
{
  @xts = split(/,/,$element); #split at ','
  $output =$output.$xts[1].","; # store # of incident and concatenate
}

$output = substr($output,0,length($output)-1);
# add date sequence with start date and end date. end date comes from the last data element.
$output = $output."),seq(as.Date(\"$startdate\"),as.Date(\"$xts[0]\"),1)))";
print OUT $output;
print OUT "\n######### end date is ",$xts[0]," #####################\n";
print OUT "inc_daily_xts <- w\n";
print OUT "inc_daily <- as.numeric(w)\n";
print "\n\n\n";


$len = scalar(@data);
print "# $len th entry is $data[$len-1]\n";
# remove input file. don't forget this otherwise wget stores data in a different filename.
system("rm dailynew2016.php");
print("# from $startdate til $xts[0].\n");
print "# the end of the process.\n";

close(IN);
close(OUT);
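
For comparison, the same parsing steps can be sketched directly in R, without generating an intermediate file. This is only a sketch under assumptions: the sample string is made up in the format described above (three consecutive days), parse_daily is a hypothetical helper name, and base R is used instead of xts so that the example runs without extra packages.

```r
# Hypothetical helper: parse "[[date,count],[date,count],...]" into a data frame.
parse_daily <- function(raw) {
  raw <- trimws(raw)                      # drop surrounding whitespace or a newline
  body <- substr(raw, 3, nchar(raw) - 2)  # remove the leading "[[" and trailing "]]"
  pairs <- strsplit(body, "],[", fixed = TRUE)[[1]]  # one "date,count" string per day
  fields <- strsplit(pairs, ",", fixed = TRUE)       # split each pair at ","
  data.frame(
    date  = as.Date(vapply(fields, `[`, character(1), 1)),
    count = as.numeric(vapply(fields, `[`, character(1), 2))
  )
}

# Made-up sample in the same format as the real feed (assumption).
sample_raw <- "[[2017-01-01,54],[2017-01-02,79],[2017-01-03,160]]"
daily <- parse_daily(sample_raw)
print(daily)
```

Here each date is parsed from the data itself rather than generated with seq(), so the counts stay attached to their own dates even if the feed were to skip a day.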

Friday, June 10, 2016

Calculate day-dependent data with less workload - 1 set language locale


To calculate day-dependent data with less workload, I would like to automate the process. The first step is to determine which day of the week the first day of each period falls on. The weekdays function provides this.

Currently, R's language setting is Japanese, so the result comes back in Japanese:

weekdays(as.Date("2016-01-01"),abbreviate = TRUE)
[1] "金"
> Sys.getlocale("LC_MESSAGES")
[1] "ja_JP.UTF-8"

Change LC_MESSAGES to US English. The weekdays function still returns the value in Japanese.

> Sys.setlocale("LC_MESSAGES",'en_US')
[1] "en_US"
> weekdays(as.Date("2016-01-01"),abbreviate = TRUE)
[1] "金"

Inspecting Sys.setlocale shows that there are categories other than LC_MESSAGES. I have no problem with switching everything to US English.

> Sys.setlocale
function (category = "LC_ALL", locale = "") 
{
    category <- match(category, c("LC_ALL", "LC_COLLATE", "LC_CTYPE", 
        "LC_MONETARY", "LC_NUMERIC", "LC_TIME", "LC_MESSAGES", 
        "LC_PAPER", "LC_MEASUREMENT"))
    if (is.na(category)) 
        stop("invalid 'category' argument")
    .Internal(Sys.setlocale(category, locale))
}
<bytecode: 0x102cfdee0>
<environment: namespace:base>

Set LC_ALL, instead of LC_MESSAGES, to US English.

> Sys.setlocale("LC_ALL",'en_US')
[1] "en_US/en_US/en_US/C/en_US/en_US"

Now the result is returned in US English.

> weekdays(as.Date("2016-01-01"),abbreviate = TRUE)
[1] "Fri"
>
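
The weekdays function formats the date as text, and date formatting is governed by the LC_TIME category, which is why changing LC_MESSAGES alone had no effect. If switching every category is more than needed, a narrower change could look like the sketch below. It assumes that the "C" locale, which uses English day names, is acceptable; the variable names are only illustrative.

```r
d <- as.Date("2016-01-01")

old_time <- Sys.getlocale("LC_TIME")          # remember the current setting
invisible(Sys.setlocale("LC_TIME", "C"))      # the "C" locale uses English day names
day_name <- weekdays(d, abbreviate = TRUE)
invisible(Sys.setlocale("LC_TIME", old_time)) # restore the previous setting

# Locale-independent alternative: weekday as a number, 0 = Sunday ... 6 = Saturday.
day_number <- as.POSIXlt(d)$wday

print(day_name)
print(day_number)
```

The numeric form may be the more convenient one for automation, because it does not depend on any locale setting at all.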

Thursday, May 19, 2016

How to find the current working directory



Type getwd() in the console.

> getwd()
[1] "/Users/<directory_name>/R_proj/tmp1"


Note that typing just "getwd", without the parentheses, does not work. R prints the function definition instead of calling it.

> getwd
function () 
.Internal(getwd())
<bytecode: 0x101fb2000>
<environment: namespace:base>

Note that, when given a file name with no directory information, functions that read or write files look for the file in this directory. "read.csv" and "write.csv" are among them.
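
A small sketch of this behavior follows. It works in R's temporary directory so that nothing is left in a project folder; the file name example.csv is only an illustration.

```r
old_wd <- setwd(tempdir())   # setwd() returns the previous working directory

# A bare file name is resolved against the current working directory.
write.csv(data.frame(x = 1:3), "example.csv", row.names = FALSE)
resolved <- file.path(getwd(), "example.csv")  # full path of the file just written
found <- file.exists(resolved)
back <- read.csv("example.csv")                # read it back via the same bare name

unlink(resolved)             # remove the example file
setwd(old_wd)                # return to the original working directory

print(found)
print(back)
```

Building the full path with file.path(getwd(), ...) is one way to check where a bare file name actually ends up.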