Category Archives: DataDotz Weekly

DataDotz BigData Weekly

Confluent released first preview version of Confluent Platform
=====
This first preview release introduces new capabilities for KSQL (streaming SQL for Apache Kafka®) and Confluent Control Center. Control Center now provides a UI for KSQL, along with broker configuration, topic inspection, and consumer lag views. The release also brings several improvements to the KSQL REST API, plus additional KSQL features such as flexible timestamp handling and non-windowed aggregate functions (SUM, COUNT) on tables. The preview release also includes protection on both tables and streams.
https://www.confluent.io/blog/introducing-confluent-platform-preview-releases/
 

Apache Spark
=====

Since Apache Spark 2.0, Structured Streaming has supported joins (inner joins and some types of outer joins) between a streaming and a static DataFrame/Dataset. With the release of Apache Spark 2.3.0, now available in Databricks Runtime 4.0 as part of the Databricks Unified Analytics Platform, stream-stream joins are supported as well.
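
To make the idea of a stream-stream join concrete, here is a minimal conceptual sketch in plain Python. It is not Spark's API or Spark's implementation: the class `StreamStreamJoin`, its methods, and the sample keys and values are all hypothetical, chosen only for illustration. The point it shows is that, unlike a stream-static join, either side can arrive first, so both sides have to be buffered as state. This sketch keeps that state forever; a real engine has to bound it somehow, for example with time constraints.

```python
from collections import defaultdict


class StreamStreamJoin:
    """Conceptual inner join of two event streams on a shared key.

    Hypothetical illustration only (not Spark code): every event seen so far
    is buffered per key, and each new event is matched against the buffer of
    the opposite stream.
    """

    def __init__(self):
        self.left_state = defaultdict(list)   # key -> left values seen so far
        self.right_state = defaultdict(list)  # key -> right values seen so far

    def on_left(self, key, value):
        """Buffer a left-side event and emit joins with buffered right events."""
        self.left_state[key].append(value)
        return [(key, value, right) for right in self.right_state.get(key, ())]

    def on_right(self, key, value):
        """Buffer a right-side event and emit joins with buffered left events."""
        self.right_state[key].append(value)
        return [(key, left, value) for left in self.left_state.get(key, ())]


# Usage: events from both streams arrive interleaved, in any order.
join = StreamStreamJoin()
results = []
results += join.on_left("ad-1", "impression")   # no match yet, buffered
results += join.on_right("ad-2", "click")       # no matching left event
results += join.on_right("ad-1", "click")       # matches the buffered left event
print(results)
```

Only the `ad-1` pair is emitted, because it is the only key that appears on both streams.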


Data ingestion into Splunk
=====

Splunk and Amazon Web Services (AWS) jointly announced that Amazon Kinesis Data Firehose now supports Splunk Enterprise and Splunk Cloud as a delivery destination. This native integration between Splunk Enterprise, Splunk Cloud, and Amazon Kinesis Data Firehose is designed to make AWS data ingestion setup seamless, while offering a secure and fault-tolerant delivery mechanism.


Using Amazon S3 with Cloudera BDR
=================

More of you are moving to public cloud services for backup and disaster recovery purposes, and Cloudera has been enhancing the capabilities of Cloudera Manager and CDH to help you do that. Specifically, Cloudera Backup and Disaster Recovery (BDR) now supports backup to and restore from Amazon S3. BDR lets you replicate HDFS data between your on-premises cluster and Amazon S3 with full fidelity (all file and directory metadata is replicated along with the data).


Serverless Delivery with Databricks and AWS CodePipeline
=====================================

The Databricks interactive workspace serves as an ideal environment for collaborative development and interactive analysis. The platform supports all the features needed to make creating a continuous delivery pipeline not only possible but simple.
