RHadoop : Reading CSV using rhdfs


Here is a small code snippet showing how to read CSV data from HDFS using rhdfs (RHadoop).

rhdfs uses rJava, so the buffer size is limited by the JVM heap size. By default, the buffer size in rhdfs is set to 5 MB. The source code for rhdfs can be found here.
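Because the buffer lives on the Java heap, a larger buffer may need a larger heap. The sketch below shows one common rJava convention for requesting more heap, plus the arithmetic behind the byte count used later in this post. The 1 GB heap value is only an illustrative assumption, and the option has an effect only if it is set before the JVM is started in the R session.

```r
# Assumption: the JVM has not been started yet in this session.
# rJava reads this option when it initialises the JVM; "-Xmx1g" (1 GB)
# is an illustrative value, not a recommendation.
options(java.parameters = "-Xmx1g")

# Buffer sizes are given in bytes: 100 MB = 100 * 1024 * 1024 bytes.
buffer_mb    <- 100
buffer_bytes <- buffer_mb * 1024^2
print(buffer_bytes)
```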

The HADOOP_CMD environment variable should point to the hadoop executable.

Sys.setenv(HADOOP_CMD="/bin/hadoop")
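The path above is specific to one installation. As a minimal sketch, the check below keeps an existing HADOOP_CMD if it points at something that exists, and otherwise falls back to whatever `hadoop` is found on the PATH. The helper name `is_valid_hadoop_cmd` is hypothetical and not part of rhdfs.

```r
# Hypothetical helper: TRUE when `path` is non-empty and exists on disk.
is_valid_hadoop_cmd <- function(path) {
  nzchar(path) && file.exists(path)
}

hadoop_cmd <- Sys.getenv("HADOOP_CMD")
if (!is_valid_hadoop_cmd(hadoop_cmd)) {
  # Sys.which() returns "" when the command is not on the PATH.
  hadoop_cmd <- unname(Sys.which("hadoop"))
}

if (is_valid_hadoop_cmd(hadoop_cmd)) {
  Sys.setenv(HADOOP_CMD = hadoop_cmd)
} else {
  message("hadoop executable not found; set HADOOP_CMD manually")
}
```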

library(rhdfs)
hdfs.init()

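# Open the file for reading with a 100 MB buffer (104857600 bytes)
f = hdfs.file("fulldata.csv","r",buffersize=104857600)
m = hdfs.read(f)      # file contents as a raw vector
hdfs.close(f)         # release the HDFS file handle
txt = rawToChar(m)    # convert the raw bytes to one character string

# Parse the text as comma-separated data
data = read.table(textConnection(txt), sep = ",")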
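When a file does not fit into a single buffer, the bytes have to be collected over several reads before parsing. The sketch below shows that pattern in base R only, using a simulated source instead of HDFS. `read_all_raw` and `make_fake_reader` are hypothetical helpers; with rhdfs one might pass `function() hdfs.read(f)` as the chunk reader, on the assumption (not verified here) that it returns NULL at the end of the file.

```r
# Collect raw chunks until `read_chunk` returns NULL or an empty vector.
# `read_chunk` is a hypothetical zero-argument function returning raw bytes.
read_all_raw <- function(read_chunk) {
  chunks <- list()
  repeat {
    chunk <- read_chunk()
    if (is.null(chunk) || length(chunk) == 0) break
    chunks[[length(chunks) + 1]] <- chunk
  }
  if (length(chunks) == 0) return(raw(0))
  do.call(c, chunks)
}

# Simulated source: hands out `bytes` in pieces of at most `n` bytes.
make_fake_reader <- function(bytes, n) {
  pos <- 0
  function() {
    if (pos >= length(bytes)) return(NULL)
    end <- min(pos + n, length(bytes))
    chunk <- bytes[(pos + 1):end]
    pos <<- end
    chunk
  }
}

csv_bytes <- charToRaw("1,a\n2,b\n3,c\n")
all_bytes <- read_all_raw(make_fake_reader(csv_bytes, 5))

# Convert once, after all chunks are joined, then parse as CSV.
data <- read.csv(text = rawToChar(all_bytes), header = FALSE)
```

Joining the raw chunks before calling `rawToChar()` avoids splitting a line or a multi-byte character at a chunk boundary.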

## Alternatively, you can use hdfs.line.reader()

reader = hdfs.line.reader("fulldata.csv")
 
x = reader$read()
typeof(x)
## [1] "character"
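Since the reader returns a character vector, each batch can be parsed into a data frame and the batches combined afterwards. The sketch below uses hard-coded vectors in place of `reader$read()` and assumes each element holds one CSV line. The helper `parse_lines` and the column names are made up for illustration.

```r
# Hypothetical helper: parse a character vector of CSV lines.
parse_lines <- function(lines) {
  read.csv(text = lines, header = FALSE, stringsAsFactors = FALSE,
           col.names = c("id", "name", "score"))
}

# Stand-ins for two successive batches from reader$read().
batch1 <- c("1,alice,3.5", "2,bob,4.0")
batch2 <- c("3,carol,2.5")

# Parse each batch, then stack the resulting data frames.
parsed <- lapply(list(batch1, batch2), parse_lines)
data <- do.call(rbind, parsed)
```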

  1. Could you please give me a hint? The following is my code snippet:
    ==========================================================
    library(rmr2);
    library(rhdfs);
    library(lubridate);
    hdfs.init();
    f = hdfs.file("/bigdata/rawdata/201312.csv","r",buffersize=104857600);
    m = hdfs.read(f);
    c = rawToChar(m);
    data = read.table(textConnection(c), sep = ",");
    ==========================================================

    Thanks in advance.

Comments:

  1. How do I read an .mp3 file from HDFS?