hadoop - Writing lots of online files on HDFS
I have an HDFS cluster, and Spark Streaming handles logs from thousands of tenants.
I want to write each tenant's logs to a different (Parquet) file so I can access the data by tenant ID (and be able to query a log within 5 seconds of its arrival).
Assuming each tenant sends a few logs per second, I'll need to append a small amount of data to each tenant's log file all day long.
Since the HDFS block size is 64 MB, I assume it is inefficient to have thousands of files, each appended a few bytes every few seconds.
Is there a technique to handle such a scenario efficiently on HDFS?
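To make the concern concrete, one common mitigation for the small-files/small-appends problem is to buffer records per tenant in memory and flush them in batches, so each file sees one append per batch instead of one per record. Below is a minimal, hypothetical sketch of that idea in plain Python (writing local text files rather than Parquet on HDFS; the class name, threshold, and file layout are illustrative, not part of the question):

```python
import os

class BufferedTenantWriter:
    """Buffer log records per tenant and flush them in batches,
    instead of appending a few bytes to a file every few seconds.
    Illustrative sketch only: a real streaming job would write
    Parquet files on HDFS, not local text files."""

    def __init__(self, out_dir, flush_threshold=64):
        self.out_dir = out_dir
        self.flush_threshold = flush_threshold  # records buffered per tenant before flushing
        self.buffers = {}  # tenant_id -> list of pending log lines

    def write(self, tenant_id, record):
        buf = self.buffers.setdefault(tenant_id, [])
        buf.append(record)
        if len(buf) >= self.flush_threshold:
            self.flush(tenant_id)

    def flush(self, tenant_id):
        buf = self.buffers.pop(tenant_id, [])
        if not buf:
            return
        # One append per batch of records, not one per record.
        path = os.path.join(self.out_dir, f"tenant={tenant_id}.log")
        with open(path, "a") as f:
            f.write("\n".join(buf) + "\n")

    def flush_all(self):
        # Flush remaining buffers, e.g. at the end of a micro-batch.
        for tenant_id in list(self.buffers):
            self.flush(tenant_id)
```

With Spark itself, a similar effect is usually achieved by writing each micro-batch partitioned by tenant (e.g. `df.write.partitionBy("tenant_id")`) and periodically compacting the resulting small files, trading a bounded delay for fewer, larger writes.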