Continually updated Data Science Notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), Spark, Hadoop MapReduce, Kaggle, scikit-learn, matplotlib, pandas, NumPy, AWS, Python essentials…

Importing JSON into Hadoop via Kafka

Command-line tools can be 235x faster than your Hadoop cluster

Apache Hadoop Explained: Kafka, ZooKeeper, HDFS and Cassandra.

Pachyderm (YC W15) Challenges Hadoop with Containerized Data Lakes

Do GPU-optimized databases threaten Oracle, Splunk and Hadoop?

Anaconda (Python) and Hadoop

A Decade of Hadoop: Doug Cutting on the Right Place for the Right Time

Glow is an easy-to-use distributed computation system written in Go, similar to Hadoop Map Reduce, Spark, Flink, Samza, etc

Data Sketches – Fast, Approximate Analysis of Big Data for Hadoop and Druid

Hadoop creator Doug Cutting on evolving and succeeding in open source

The Hadoop Ecosystem Table

Apache Kylin is an open source Distributed Analytics Engine designed to provide SQL interface and multi-dimensional analysis (OLAP) on Hadoop supporting extremely large datasets

Continually updated Data Science IPython Notebooks: Deep learning, Spark, Hadoop MapReduce, Kaggle, scikit-learn, matplotlib, pandas, NumPy, AWS, Python essentials, and various command lines

MI6 (SIS) Is Developing a Node.js, Angular, NoSQL, Hadoop System on Cloudera

Eagle: Secure Hadoop in Real Time

Large Scale Distributed Deep Learning on Hadoop Clusters

Ibis: Scaling Python Analytics on Hadoop and Impala

Hadoop filesystem at Twitter

The Way to Hadoop Native SQL

Pinterest open-sources Terrapin, a tool for serving data from Hadoop

Complementing Hadoop at Yahoo: Interactive Analytics with Druid

Spinning Up a Free Hadoop Cluster: Step by Step

The Improved Job Scheduling Algorithm of Hadoop Platform

Continually updated Data Science Python Notebooks: Spark, Hadoop MapReduce, HDFS, AWS, Kaggle, scikit-learn, matplotlib, pandas, NumPy, and various command lines.

No one ever got fired for using Hadoop on a cluster (2012)

Don't use Hadoop – your data isn't that big (2013)

What I learned working with Hadoop, HBase and HyperLogLog

Hadoop Corporate Adoption Remains Low: Gartner
