hadoop - Writing lots of online files on HDFS


I have an HDFS cluster, with Spark Streaming handling logs from thousands of tenants.
I want to write each tenant's logs to a separate (Parquet) file, so that data can be accessed by tenant ID, and so that a log entry is queryable within about 5 seconds of arriving.
Assuming each tenant sends a few log lines per second, I would need to append small amounts of data to each tenant's log file all day long.
Since the HDFS block size is 64 MB, I assume it is inefficient to have thousands of files each being appended a few bytes every few seconds.
Is there a technique to handle such a scenario efficiently on HDFS?
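One common way to avoid thousands of tiny appends is to buffer log lines per tenant in memory and flush them in micro-batches, so each write lands as one larger chunk. Below is a minimal sketch of that idea in Python, using the local filesystem as a stand-in for HDFS; `TenantLogBuffer`, the `flush_threshold` parameter, and the `tenant=<id>` directory layout are illustrative assumptions, not part of any HDFS or Spark API, and a real pipeline would write Parquet via a proper writer rather than plain text.

```python
import os
from collections import defaultdict


class TenantLogBuffer:
    """Buffer log lines per tenant and flush them in micro-batches,
    so each flush appends one larger chunk instead of many tiny writes.

    Illustrative sketch only: writes plain text to the local filesystem
    in place of Parquet files on HDFS.
    """

    def __init__(self, base_dir, flush_threshold=1000):
        self.base_dir = base_dir
        self.flush_threshold = flush_threshold
        self.buffers = defaultdict(list)  # tenant_id -> pending lines

    def add(self, tenant_id, line):
        """Queue a line; flush this tenant once its buffer is full."""
        buf = self.buffers[tenant_id]
        buf.append(line)
        if len(buf) >= self.flush_threshold:
            self.flush(tenant_id)

    def flush(self, tenant_id):
        """Append all pending lines for one tenant in a single write."""
        buf = self.buffers.pop(tenant_id, [])
        if not buf:
            return
        path = os.path.join(self.base_dir, f"tenant={tenant_id}", "log.txt")
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "a") as f:
            f.write("\n".join(buf) + "\n")

    def flush_all(self):
        """Flush every tenant, e.g. on a timer to bound query latency."""
        for tenant_id in list(self.buffers):
            self.flush(tenant_id)
```

To meet a latency target (such as the 5 seconds mentioned above), `flush_all` would be driven by a timer as well as by the size threshold, trading slightly larger files against freshness.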

