Technology

Big Data News Roundup: July 2015

Hortonworks
HDP 2.3 released:
HDP 2.3 adds the new components Apache Atlas and Apache Calcite
http://hortonworks.com/blog/available-now-hdp-2-3/
http://hortonworks.com/blog/introducing-availability-of-hdp-2-3-part-2/
http://hortonworks.com/blog/introducing-availability-of-hdp-2-3-part-3/
ORC (columnar format) support since Spark 1.2
http://hortonworks.com/blog/bringing-orc-support-into-apache-spark/
Overview of new features in Spark in HDInsight
http://hortonworks.com/blog/spark-in-hdinsight/

Cloudera
HBase 1.0 adds Thrift client authentication support
http://blog.cloudera.com/blog/2015/07/thrift-client-authentication-support-in-apache-hbase-1-0/
Tuning Pig on MapReduce
http://blog.cloudera.com/blog/2015/07/how-to-tune-mapreduce-parallelism-in-apache-pig-jobs/
Apache Zeppelin on CDH
http://blog.cloudera.com/blog/2015/07/how-to-install-apache-zeppelin-on-cdh/
Big data fraud detection architecture
http://blog.cloudera.com/blog/2015/07/designing-fraud-detection-architecture-that-works-like-your-brain-does/

MapR
YARN resource management best practices
https://www.mapr.com/blog/best-practices-yarn-resource-management
Transaction support in Hive 1.0
https://www.mapr.com/blog/hive-transaction-feature-hive-10

Databricks
The Spark Streaming execution model
https://databricks.com/blog/2015/07/30/diving-into-spark-streamings-execution-model.html
New ML Pipelines features in Spark 1.4
https://databricks.com/blog/2015/07/29/new-features-in-machine-learning-pipelines-in-spark-1-4.html
ORC support since Spark 1.2
https://databricks.com/blog/2015/07/16/joint-blog-post-bringing-orc-support-into-apache-spark.html
Window functions supported since Spark 1.4
https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html
New web UI since Spark 1.4
https://databricks.com/blog/2015/07/08/new-visualizations-for-understanding-spark-streaming-applications.html

Join support in Phoenix: TPC in Apache Phoenix
https://blogs.apache.org/phoenix/entry/tpc_in_apache_phoenix

Cassandra
http://cassandra.apache.org/

mongoDB
https://www.mongodb.org/

Confluent
Real-time stream processing built on Kafka
http://www.confluent.io/
The value of Kafka in the big data ecosystem
http://www.confluent.io/blog/the-value-of-apache-kafka-in-big-data-ecosystem/