DataDotz BigData Weekly

An Overview of Kafka Distributed Message System
===============================================

Kafka is a messaging system. To understand what a messaging system is and which problems it solves, take the currently popular microservice architecture as an example. Assume there are three terminal-facing web services (for a WeChat official account, a mobile app, and a browser) that speak HTTP at the web end, namely Web1, Web2, and Web3, and three internal application services, App1, App2, and App3.
https://www.alibabacloud.com/blog/an-overview-of-kafka-distributed-message-system_594218

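The decoupling this scenario points at can be sketched with a toy in-memory broker. This is only an illustration under stated assumptions, not the Kafka client API: the `InMemoryBroker` class, the `orders` topic, and the message fields are hypothetical names introduced here. It shows the web services publishing to a topic without knowing who consumes it, and each application service tracking its own read position.

```python
from collections import defaultdict


class InMemoryBroker:
    """Toy stand-in for a message system; this is NOT the Kafka API."""

    def __init__(self):
        self._logs = defaultdict(list)    # topic -> append-only list of messages
        self._offsets = defaultdict(int)  # (topic, group) -> next index to read

    def publish(self, topic, message):
        """Append a message to the topic log and return its offset."""
        self._logs[topic].append(message)
        return len(self._logs[topic]) - 1

    def poll(self, topic, group):
        """Return the messages this group has not seen yet and advance its offset."""
        log = self._logs[topic]
        start = self._offsets[(topic, group)]
        self._offsets[(topic, group)] = len(log)
        return log[start:]


broker = InMemoryBroker()

# The web tier only knows the (hypothetical) topic name, not which apps consume it.
for web in ("Web1", "Web2", "Web3"):
    broker.publish("orders", {"source": web, "action": "create"})

# Each application service reads the full stream independently of the others.
app1_batch = broker.poll("orders", group="App1")
app2_batch = broker.poll("orders", group="App2")

print([m["source"] for m in app1_batch])
print(len(app2_batch), len(broker.poll("orders", group="App1")))
```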
Cache warming: Agility for a stateful service
======================================

EVCache has been a fundamental part of the Netflix platform (we call it Tier-1), holding petabytes of data. Our caching layer serves multiple use cases, including signup, personalization, search, playback, and more. It comprises thousands of nodes in production and hundreds of clusters, all of which must routinely scale up as our membership grows.

https://medium.com/netflix-techblog/cache-warming-agility-for-a-stateful-service-2d3b1da82642

Time Series at ShiftLeft
====================

Time series are a major component of the ShiftLeft runtime experience. This is true for many other products and organizations too, but each case involves different characteristics and requirements. This post describes the requirements that we have to work with, how we use TimescaleDB to store and retrieve time series data, and the tooling we’ve developed to manage our infrastructure.

https://blog.shiftleft.io/time-series-at-shiftleft-e1f98196909b

Manage centralized Microsoft Exchange Server logs using Amazon Kinesis
================================================================

Microsoft Exchange servers store different types of logs. These log types include message tracking, Exchange Web Services (EWS), Internet Information Services (IIS), and application/system event logs. With Exchange servers deployed on a global scale, logs are often scattered in multiple directories that are local to these servers.

https://aws.amazon.com/blogs/big-data/manage-centralized-microsoft-exchange-server-logs-using-amazon-kinesis-agent-for-windows/

Building Secure and Governed Microservices with Kafka Streams
========================================================

With Hortonworks DataFlow (HDF) 3.3 now supporting Kafka Streams, we are truly excited about the applications you can build when it is combined with the rest of our platform. In this post, we demonstrate how Kafka Streams can be integrated with Schema Registry, Atlas, and Ranger to build a set of microservice apps for a fictitious use case.

https://hortonworks.com/blog/building-secure-and-governed-microservices-with-kafka-streams/

Spark from the Trenches
======================

Spark includes a configurable metrics system based on the Dropwizard Metrics library. It is set up via the Spark configuration. As we are already heavy users of Graphite and Grafana, we use the provided Graphite sink.

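As a rough sketch of what such a setup might look like, the helper below builds the `spark.metrics.conf.*` properties that enable Spark's built-in `GraphiteSink` and renders them as `spark-submit` arguments. The host name, prefix, and helper function names are assumptions made for illustration and are not taken from the linked post; the reporting period is an arbitrary choice. When the arguments are pasted into a shell, the `*` in each key needs quoting.

```python
def graphite_sink_conf(host, port, period_seconds=10, prefix=None):
    """Build Spark properties enabling the built-in Graphite sink for all instances."""
    base = "spark.metrics.conf.*.sink.graphite."
    conf = {
        base + "class": "org.apache.spark.metrics.sink.GraphiteSink",
        base + "host": host,
        base + "port": str(port),
        base + "period": str(period_seconds),  # how often metrics are reported
        base + "unit": "seconds",              # unit of the period above
    }
    if prefix:
        # Optional string prepended to every metric name sent to Graphite.
        conf[base + "prefix"] = prefix
    return conf


def to_submit_args(conf):
    """Render the properties as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# Hypothetical Graphite endpoint and metric prefix, for illustration only.
conf = graphite_sink_conf("graphite.example.internal", 2003, prefix="my_spark_app")
print(" ".join(to_submit_args(conf)))
```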
https://medium.com/teads-engineering/spark-from-the-trenches-part-2-f2ff9ab67ea1