hadoop - Best way to automate getting data from CSV files into a data lake
I need to get data from CSV files (daily extractions from different business databases) into HDFS, then move it to HBase, and finally load aggregations of that data into a datamart (SQL Server).
I would like to know the best way to automate this process (using Java or Hadoop tools).
With little to no coding required? In no particular order:
- Talend Open Studio
- StreamSets Data Collector
- Apache NiFi
Assuming you can set up a Kafka cluster, you can try Kafka Connect.
If you want to program something, use Spark. Otherwise, pick your favorite language. Either way, schedule the job via Oozie.
If you don't need the raw data in HDFS, you can load it directly into HBase.
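If you do go the "pick your favorite language" route, the aggregation step itself can be quite small. Here is a minimal sketch in plain Python, assuming each daily extract has a header row; the function name `aggregate_daily_extract` and the columns `region` and `amount` are made up for illustration and are not from the question. Reading from HDFS and writing to SQL Server are left out, since they depend on the client libraries you choose.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def aggregate_daily_extract(csv_file, key_field, value_field):
    """Sum value_field per distinct key_field for one daily CSV extract.

    csv_file is any open text file object whose first line is a header row.
    """
    totals = defaultdict(Decimal)  # missing keys start at Decimal('0')
    for row in csv.DictReader(csv_file):
        totals[row[key_field]] += Decimal(row[value_field])
    return dict(totals)


# Hypothetical sample standing in for one day's extract.
sample = io.StringIO(
    "region,amount\n"
    "north,10.50\n"
    "south,4.25\n"
    "north,1.50\n"
)
totals = aggregate_daily_extract(sample, "region", "amount")
for region, total in sorted(totals.items()):
    print(region, total)
```

Using `Decimal` instead of `float` avoids rounding drift when summing monetary values. A script like this could be what the scheduler runs once per day, with the resulting totals inserted into the datamart table.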