DataDotz BigData Weekly

Data Management Strategies for Computer Vision
============================================

Computer vision (CV) developers often find the biggest barrier to success relates to data management, and yet so much of what you’ll find about CV is about the algorithms, not the data. In this blog, I’ll describe three separate data management strategies I’ve used with applications that process images. Through the anecdotes of my experiences, you’ll learn about several functions that data platforms provide for CV. The main event here is a discussion about how video can be transported through MapR-ES (which is MapR’s reimplementation of Apache Kafka) and how Docker can be used to elastically scale video processors for face detection.
https://mapr.com/blog/data-management-strategies-for-computer-vision/

Big Data Analytics and Machine Learning with PTC and Hortonworks
============================================================

Today, PTC and Hortonworks announce a strategic partnership to “fast-forward” the realization of Industry 4.0 benefits, including improved manufacturing quality and yield, enhanced asset and plant uptime, and optimized production flexibility and throughput. This collaboration is directed at a state-of-the-art solution composed of complementary offerings from Hortonworks.

https://hortonworks.com/blog/fast-forward-industry-4-0-enterprise-big-data-analytics-machine-learning-ptc-hortonworks/

Embed interactive dashboards in your application with Amazon QuickSight
=================================================================

Embedded Amazon QuickSight dashboards allow you to utilize Amazon QuickSight’s serverless architecture and easily scale your insights with your growing user base, while ensuring you only pay for usage with Amazon QuickSight’s unique pay-per-session pricing model.

https://aws.amazon.com/blogs/big-data/embed-interactive-dashboards-in-your-application-with-amazon-quicksight/

Scale your Amazon Redshift clusters
================================

Amazon Redshift is the cloud data warehouse of choice for organizations of all sizes, from fast-growing technology companies such as Turo and Yelp to Fortune 500 companies such as 21st Century Fox and Johnson & Johnson. With quickly expanding use cases, data sizes, and analyst populations, these customers have a critical need for scalable data warehouses. Since we launched Amazon Redshift, our customers have grown with us.

https://aws.amazon.com/blogs/big-data/scale-your-amazon-redshift-clusters-up-and-down-in-minutes-to-get-the-performance-you-need-when-you-need-it/

Stitch & Mobile Webinar Questions & Replay
========================================

How do you test MongoDB Stitch functions, how do you store Stitch triggers, and what services can you integrate Stitch with? These were some of the great questions that were asked and answered in my recent webinar. You can watch the replay of “MongoDB Mobile and MongoDB Stitch – I. For those new to MongoDB Stitch, it’s the serverless platform from MongoDB that isolates complexity and ‘plumbing’ so you can build applications faster.

https://www.mongodb.com/blog/post/stitch–mobile-webinar-questions–replay

Apache Avro as a Built-in Data Source in Apache Spark 2.4
=======================

Apache Avro is a popular data serialization format. It is widely used in the Apache Spark and Apache Hadoop ecosystem, especially for Kafka-based data pipelines. Starting with the Apache Spark 2.4 release, Spark provides built-in support for reading and writing Avro data. The new built-in spark-avro module originated from Databricks’ open source project Avro Data Source for Apache Spark (referred to as spark-avro from now on).

https://databricks.com/blog/2018/11/30/apache-avro-as-a-built-in-data-source-in-apache-spark-2-4.html
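
As a rough illustration of what reading and writing Avro looks like from PySpark 2.4 or later, here is a minimal sketch. The helper names (`read_avro`, `write_avro`, `run_example`) and the file paths are hypothetical and not taken from the linked post; only the `"avro"` format name and the standard DataFrame reader/writer calls come from Spark itself.

```python
# Sketch only: helper names and paths below are hypothetical.
AVRO_FORMAT = "avro"  # short name of the Avro data source in Spark 2.4+


def read_avro(spark, path):
    """Load the Avro files at `path` into a DataFrame."""
    return spark.read.format(AVRO_FORMAT).load(path)


def write_avro(df, path):
    """Write the DataFrame `df` to `path` as Avro files."""
    df.write.format(AVRO_FORMAT).save(path)


def run_example(input_path="/tmp/events.avro", output_path="/tmp/events_copy"):
    # Imported here so the helpers above do not require Spark at import time.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("avro-roundtrip").getOrCreate()
    df = read_avro(spark, input_path)
    df.printSchema()  # schema is derived from the Avro files
    write_avro(df, output_path)
    spark.stop()
```

Calling `run_example()` assumes a working Spark installation and an existing Avro file at the input path. Although the module ships with Spark, it is packaged as an external module, so it typically has to be added when launching the job, for example with `--packages org.apache.spark:spark-avro_2.11:2.4.0`; the Scala suffix and version must match your Spark build.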