Category Archives: Hadoop

datadotzweekly

DataDotz Bigdata Weekly

Data ingestion into Splunk
=====

Splunk and Amazon Web Services (AWS) jointly announced that Amazon Kinesis Data Firehose now supports Splunk Enterprise and Splunk Cloud as a delivery destination. This native integration between Splunk Enterprise, Splunk Cloud, and Amazon Kinesis Data Firehose is designed to make AWS data ingestion setup seamless, while offering a secure and fault-tolerant delivery mechanism.

Using Amazon S3 with Cloudera BDR
=================

More of you are moving to public cloud services for backup and disaster recovery purposes, and Cloudera has been enhancing the capabilities of Cloudera Manager and CDH to help you do that. Specifically, Cloudera Backup and Disaster Recovery (BDR) now supports backup to and restore from Amazon S3. BDR lets you replicate HDFS data between your on-premises cluster and Amazon S3 with full fidelity (all file and directory metadata is replicated along with the data).

Serverless Delivery with Databricks and AWS CodePipeline
=====================================

The Databricks interactive workspace serves as an ideal environment for collaborative development and interactive analysis. The platform supports all the features needed to make creating a continuous delivery pipeline not only possible but simple.

Server-Side Encryption for Amazon Kinesis Streams
==========

Customers use Amazon Kinesis Streams to ingest, process, and deliver data in real time from millions of devices or applications. Use cases for Kinesis Streams vary, but a few common ones include IoT data ingestion and analytics, log processing, clickstream analytics, and enterprise data bus architectures. Within milliseconds of data arrival, applications attached to a stream are continuously mining value or delivering data to downstream destinations.
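
As a rough illustration of what turning on server-side encryption for a stream can look like, here is a minimal Python sketch. It assumes a boto3 Kinesis client is passed in and uses the `describe_stream_summary` and `start_stream_encryption` calls. The helper name `enable_stream_encryption`, the stream name in the usage note, and the default of the AWS-managed key alias `alias/aws/kinesis` are illustrative assumptions, not details taken from the original post.

```python
def enable_stream_encryption(kinesis_client, stream_name, key_id="alias/aws/kinesis"):
    """Enable KMS server-side encryption on a stream unless it is already on.

    kinesis_client is expected to behave like boto3.client("kinesis").
    Returns True if encryption was requested, False if already enabled.
    """
    summary = kinesis_client.describe_stream_summary(StreamName=stream_name)
    # EncryptionType is "NONE" or "KMS"; treat a missing field as "NONE".
    current = summary["StreamDescriptionSummary"].get("EncryptionType", "NONE")
    if current == "KMS":
        return False

    # key_id is an assumption here: the AWS-managed key alias for Kinesis.
    # A customer-managed KMS key ARN or alias could be passed instead.
    kinesis_client.start_stream_encryption(
        StreamName=stream_name,
        EncryptionType="KMS",
        KeyId=key_id,
    )
    return True
```

With boto3 installed and credentials configured, a call might look like `enable_stream_encryption(boto3.client("kinesis"), "my-stream")`, where `my-stream` is a placeholder. Producers and consumers keep using the same API calls, but they do need permission to use the chosen KMS key.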

Amazon Kinesis vs Apache Kafka for Big Data Analysis
==========

Data processing today is done in the form of pipelines that include steps such as aggregation, sanitization, and filtering, and that finally generate insights by applying statistical models. Amazon Kinesis is a platform for building pipelines for streaming data at the scale of terabytes per hour.

Reading data securely from Apache Kafka
==========

The Cloudera Distribution of Apache Kafka 2.0.0 (based on Apache Kafka 0.9.0) introduced a new Kafka consumer API that allows consumers to read data from a secure Kafka cluster. This lets administrators lock down their Kafka clusters and require clients to authenticate via Kerberos.
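
To make the client side of this more concrete, the sketch below renders a `consumer.properties` file for a Kerberos-authenticated consumer. `security.protocol` and `sasl.kerberos.service.name` are standard Kafka client settings. The helper name `secure_consumer_properties`, the broker address, and the group id are hypothetical, and the service name `kafka` is an assumption that must match the principal the brokers actually run under. The consumer JVM also needs a JAAS login configuration (supplied through `-Djava.security.auth.login.config`), which this sketch does not generate.

```python
def secure_consumer_properties(bootstrap_servers, group_id,
                               service_name="kafka", use_tls=False):
    """Return consumer.properties text for a Kerberos (SASL) Kafka consumer."""
    props = {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        # SASL_PLAINTEXT authenticates via Kerberos without encrypting
        # traffic; SASL_SSL adds TLS on top of the same authentication.
        "security.protocol": "SASL_SSL" if use_tls else "SASL_PLAINTEXT",
        # Must match the service part of the brokers' Kerberos principal.
        "sasl.kerberos.service.name": service_name,
    }
    # Java properties format: one key=value pair per line.
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


if __name__ == "__main__":
    # Placeholder broker and group names, for illustration only.
    print(secure_consumer_properties("broker1.example.com:9092", "demo-group"))
```

The resulting file can be loaded by any client built on the new consumer API. Choosing between `SASL_PLAINTEXT` and `SASL_SSL` depends on whether the brokers expose a TLS listener.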
