Being a Fedora Linux user, I started by limiting my (admittedly quick) trial to what was supposed to be the Linux/Unix-readable format on the CLIWOC main database release page, namely the CLIWOC15.Z file. That trial was not very successful. I managed to read the data into R after getting rid of some "unwanted" characters that R did not "like". However, the fields (columns that are supposed to be semicolon-delimited) were more or less messed up, and the number of records was lower than specified on the page. Most importantly, the trial was based on non-reproducible code within the R environment.
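For what it is worth, a mangled delimited file like this can be diagnosed by counting the fields on each line. The snippet below is only a minimal sketch on a small made-up file (the column names and values are invented for illustration, not taken from CLIWOC15); it is not the code used in the trial above.

```r
# Write a tiny semicolon-delimited example file; the third line is short one field.
# The contents are made up for illustration only.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id;lat;lon",
             "1;52.1;4.3",
             "2;51.9",
             "3;50.0;3.1"), tmp)

# Count the fields on every line, with quote and comment handling switched off
n_fields <- count.fields(tmp, sep = ";", quote = "", comment.char = "")

# Tabulate how many lines have how many fields
field_table <- table(n_fields)
print(field_table)

# Lines whose field count differs from the header line
bad_lines <- which(n_fields != n_fields[1])
print(bad_lines)
```

Lines listed in bad_lines are the ones worth inspecting by hand before trying a full import.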
So, reluctantly, I decided to have a go at the MS Access database source, CLIWOC15_2000.zip, albeit without high hopes of finding a solution that would work within the Linux environment. But after some searching on the web for how to read the mdb format directly into R under Linux, I stumbled across this post. In particular: "Use mdb.get() from Hmisc package to import entire tables from the database into dataframes." Just what the doctor ordered. I already had the Hmisc library installed, but at first I had no success with the mdb.get() function. Reading the help file (?mdb.get), one "gets": "Uses the mdbtools package executables mdb-tables, mdb-schema, and mdb-export. In Debian/Ubuntu Linux run apt get install mdbtools." On Fedora, the equivalent command to install mdbtools is: yum install mdbtools
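Since mdb.get() relies on those three executables, it can help to confirm from within R that they are actually on the search path before calling it. A minimal sketch using base R's Sys.which() (this check is my own addition, not something the Hmisc help file prescribes):

```r
# The executables that ?mdb.get says it depends on
tools <- c("mdb-tables", "mdb-schema", "mdb-export")

# Sys.which() returns the full path of each executable, or "" if it is not found
found <- Sys.which(tools)

missing_tools <- tools[found == ""]
if (length(missing_tools) > 0) {
  message("Not found on the PATH: ", paste(missing_tools, collapse = ", "))
} else {
  message("All mdbtools executables were found.")
}
```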
With that I was ready to go. With the following code I managed to achieve my objective of getting the CLIWOC MS Access data into the R environment under Linux (as well as making a very crude initial ggplot2 plot):
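require(Hmisc)    # also needs mdbtools; on Fedora do: yum install mdbtools
require(ggplot2)  # coord_map() additionally needs the mapproj package

path <- "yourworkingdirectory"   # replace with your own working directory
URL  <- "http://www.knmi.nl/"
PATH <- "cliwoc/download/"
FILE <- "CLIWOC15_2000.zip"

# download the zipped MS Access database; file.path() supplies the path separator
zipfile <- file.path(path, FILE)
download.file(paste(URL, PATH, FILE, sep = ""), zipfile)

# unzip() extracts into the current directory and returns the extracted path, "./<name>"
dir  <- unzip(zipfile)
file <- substr(dir, 3, nchar(dir))   # drop the leading "./"

# read all tables of the database into a list of data frames
dat <- mdb.get(file)
tmp <- dat$CLIWOC15[, c("Lon3", "Lat3")]

# very crude map of the recorded positions
ggplot(tmp, aes(Lon3, Lat3)) +
  geom_point(alpha = 0.01, size = 1) +
  coord_map() +
  ylim(-90, 90)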
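If only the CLIWOC15 table is needed, the tables argument of mdb.get() can be used to list the available tables first and then import just that one. The helper below is a hypothetical sketch (the function name read_cliwoc_table is my own, and the .mdb file name in the usage comment is an assumption about what the zip archive contains), not part of the code above:

```r
# Hypothetical helper: import a single table from an MS Access file via Hmisc::mdb.get()
read_cliwoc_table <- function(mdb_file, table = "CLIWOC15") {
  if (!file.exists(mdb_file)) {
    stop("file not found: ", mdb_file)
  }
  if (!requireNamespace("Hmisc", quietly = TRUE)) {
    stop("the Hmisc package is required")
  }
  # tables = TRUE returns the table names instead of importing any data
  available <- Hmisc::mdb.get(mdb_file, tables = TRUE)
  if (!(table %in% available)) {
    stop("table '", table, "' not found; available tables: ",
         paste(available, collapse = ", "))
  }
  # Import only the requested table
  Hmisc::mdb.get(mdb_file, tables = table)
}

# Usage (assumed file name, adjust to whatever unzip() actually extracted):
# cliwoc <- read_cliwoc_table("CLIWOC15_2000.mdb")
# tmp <- cliwoc[, c("Lon3", "Lat3")]
```

Importing one table rather than the whole database should save some time and memory.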