[134] Briefing: Big Data Technology Updates - 20170307






ML & DL & AI & RL

Voice from Facebook: Using Apache Spark for Large-Scale Language Model Training
DATA LAKE 3.0 PART 3 – DISTRIBUTED TENSORFLOW ASSEMBLY ON APACHE HADOOP YARN
Machine Learning in the Age of Big Data

Spark

Working with Complex Data Formats with Structured Streaming in Apache Spark 2.1
Processing a Trillion Rows Per Second on a Single Machine: How Can Nested Loop Joins be this Fast?
Achieving a 300% speedup in ETL with Apache Spark

SQL & Real-Time Analytics on Hadoop

Performance comparison of different file formats and storage engines in the Apache Hadoop ecosystem
Apache Kudu: Top Use Cases for Real-Time Analytics
How To Set Up a Shared Amazon RDS as Your Hive Metastore
Latest Impala Cookbook

Hadoop

DATA LAKE 3.0 PART 2 – A MULTI-COLORED YARN
DATA LAKE 3.0: THE EZ BUTTON TO DEPLOY IN MINUTES AND CUT TCO BY HALF
How-to: Use the New HDFS Intra-DataNode Disk Balancer in Apache Hadoop
Apache Hadoop 3.0.0-alpha2 Released
Untangling Apache Hadoop YARN, Part 5: Using FairScheduler queue properties
HDFS DataNode Scanners and Disk Checker Explained
How-to: Deploy a Secure Enterprise Data Hub on Microsoft Azure – Part 1
How-to: Deploy a Secure Enterprise Data Hub on Microsoft Azure – Part 2

Data Ingestion

New in Cloudera Enterprise 5.8: Flafka Improvements for Real-Time Data Ingest

Testing & Evaluating

YCSB 0.10.0 Now in Cloudera Labs
Quality Assurance at Cloudera: Highly-Controlled Disk Injection

[133] Upgrading Hexo to V1.0.2

Step 1: check the current version.

Command: hexo version

C:\Windows\System32>hexo version
hexo-cli: 0.1.7
os: Windows_NT 6.1.7601 win32 ia32
http_parser: 2.3
node: 0.12.7
v8: 3.28.71.19
uv: 1.6.1
zlib: 1.2.8
modules: 14
openssl: 1.0.1p

Step 2: reinstall hexo-cli.

Command: npm install -g hexo-cli

E:\xDoc\hexo2017>npm install -g hexo-cli
|

> hexo-util@0.6.0 postinstall C:\Users\stevenxu\AppData\Roaming\npm\node_modules
\hexo-cli\node_modules\hexo-util
> npm run build:highlight

\
> hexo-util@0.6.0 build:highlight C:\Users\stevenxu\AppData\Roaming\npm\node_mod
ules\hexo-cli\node_modules\hexo-util
> node scripts/build_highlight_alias.js > highlight_alias.json

npm WARN optional dep failed, continuing fsevents@1.0.17


> dtrace-provider@0.8.0 install C:\Users\stevenxu\AppData\Roaming\npm\node_modul
es\hexo-cli\node_modules\hexo-log\node_modules\bunyan\node_modules\dtrace-provid
er
> node scripts/install.js

C:\Users\stevenxu\AppData\Roaming\npm\hexo -> C:\Users\stevenxu\AppData\Roaming\
npm\node_modules\hexo-cli\bin\hexo
hexo-cli@1.0.2 C:\Users\stevenxu\AppData\Roaming\npm\node_modules\hexo-cli
├── abbrev@1.0.9
├── object-assign@4.1.0
├── minimist@1.2.0
├── bluebird@3.4.7
├── tildify@1.2.0 (os-homedir@1.0.2)
├── chalk@1.1.3 (supports-color@2.0.0, ansi-styles@2.2.1, escape-string-regex
p@1.0.5, strip-ansi@3.0.1, has-ansi@2.0.0)
├── hexo-util@0.6.0 (striptags@2.1.1, html-entities@1.2.0, highlight.js@9.9.0
, camel-case@3.0.0, cross-spawn@4.0.2)
├── hexo-log@0.1.2 (bunyan@1.8.5)
└── hexo-fs@0.1.6 (escape-string-regexp@1.0.5, graceful-fs@4.1.11, chokidar@1
.6.1)

Check the version again after the upgrade:

E:\xDoc\hexo2017>	
E:\xDoc\hexo2017>hexo version
hexo-cli: 1.0.2
os: Windows_NT 6.1.7601 win32 ia32
http_parser: 2.3
node: 0.12.7
v8: 3.28.71.19
uv: 1.6.1
zlib: 1.2.8
modules: 14
openssl: 1.0.1p

E:\xDoc\hexo2017>

Step 3: create the blog.

Create the blog directory navigating.github.io, then run this command inside it:
hexo init


E:\xDoc\hexo2017\navigating.github.io>hexo init
INFO Cloning hexo-starter to E:\xDoc\hexo2017\navigating.github.io
Cloning into 'E:\xDoc\hexo2017\navigating.github.io'...
fatal: unable to access 'https://github.com/hexojs/hexo-starter.git/': Failed co
nnect to github.com:8080; No error
Deletion of directory 'E:\xDoc\hexo2017\navigating.github.io' failed. Should I try
again? (y/n) n
WARN git clone failed. Copying data instead
INFO Install dependencies
npm WARN deprecated swig@1.4.2: This package is no longer maintained
npm WARN deprecated minimatch@0.3.0: Please update to minimatch 3.0.2 or higher
to avoid a RegExp DoS issue
/


> hexo-util@0.6.0 postinstall E:\xDoc\hexo2017\navigating.github.io\node_modules\h
exo-renderer-marked\node_modules\hexo-util
> npm run build:highlight

/

> hexo-util@0.6.0 build:highlight E:\xDoc\hexo2017\navigating.github.io\node_modul
es\hexo-renderer-marked\node_modules\hexo-util
> node scripts/build_highlight_alias.js > highlight_alias.json

npm WARN optional dep failed, continuing fsevents@1.0.17
npm WARN optional dep failed, continuing fsevents@1.0.17
npm WARN engine request@2.79.0: wanted: {"node":">= 4"} (current: {"node":"0.12.
7","npm":"2.11.3"})


> dtrace-provider@0.8.0 install E:\xDoc\hexo2017\navigating.github.io\node_modules
\hexo\node_modules\hexo-log\node_modules\bunyan\node_modules\dtrace-provider
> node scripts/install.js


> hexo-util@0.6.0 postinstall E:\xDoc\hexo2017\navigating.github.io\node_modules\h
exo\node_modules\hexo-util
> npm run build:highlight


> hexo-util@0.6.0 build:highlight E:\xDoc\hexo2017\navigating.github.io\node_modul
es\hexo\node_modules\hexo-util
> node scripts/build_highlight_alias.js > highlight_alias.json

hexo-renderer-ejs@0.2.0 node_modules\hexo-renderer-ejs
├── object-assign@4.1.0
└── ejs@1.0.0

hexo-generator-category@0.1.3 node_modules\hexo-generator-category
├── object-assign@2.1.1
└── hexo-pagination@0.0.2 (utils-merge@1.0.0)

hexo-generator-archive@0.1.4 node_modules\hexo-generator-archive
├── object-assign@2.1.1
└── hexo-pagination@0.0.2 (utils-merge@1.0.0)

hexo-generator-index@0.2.0 node_modules\hexo-generator-index
├── object-assign@4.1.0
└── hexo-pagination@0.0.2 (utils-merge@1.0.0)

hexo-generator-tag@0.2.0 node_modules\hexo-generator-tag
├── object-assign@4.1.0
└── hexo-pagination@0.0.2 (utils-merge@1.0.0)

hexo-renderer-stylus@0.3.1 node_modules\hexo-renderer-stylus
├── stylus@0.53.0 (css-parse@1.7.0, mkdirp@0.5.1, debug@2.6.0, glob@3.2.11, s
ax@0.5.8, source-map@0.1.43)
└── nib@1.1.2 (stylus@0.54.5)

hexo-server@0.2.0 node_modules\hexo-server
├── object-assign@4.1.0
├── mime@1.3.4
├── chalk@1.1.3 (supports-color@2.0.0, ansi-styles@2.2.1, escape-string-regex
p@1.0.5, strip-ansi@3.0.1, has-ansi@2.0.0)
├── opn@4.0.2 (pinkie-promise@2.0.1)
├── morgan@1.7.0 (on-headers@1.0.1, basic-auth@1.0.4, depd@1.1.0, debug@2.2.0
, on-finished@2.3.0)
├── connect@3.5.0 (utils-merge@1.0.0, parseurl@1.3.1, debug@2.2.0, finalhandl
er@0.5.0)
├── compression@1.6.2 (on-headers@1.0.1, vary@1.1.0, bytes@2.3.0, debug@2.2.0
, compressible@2.0.9, accepts@1.3.3)
├── bluebird@3.4.7
└── serve-static@1.11.1 (escape-html@1.0.3, encodeurl@1.0.1, parseurl@1.3.1,
send@0.14.1)

hexo-renderer-marked@0.2.11 node_modules\hexo-renderer-marked
├── object-assign@4.1.0
├── marked@0.3.6
├── strip-indent@1.0.1 (get-stdin@4.0.1)
└── hexo-util@0.6.0 (striptags@2.1.1, html-entities@1.2.0, camel-case@3.0.0,
cross-spawn@4.0.2, bluebird@3.4.7, highlight.js@9.9.0)

hexo@3.2.2 node_modules\hexo
├── abbrev@1.0.9
├── pretty-hrtime@1.0.3
├── archy@1.0.0
├── hexo-front-matter@0.2.3
├── titlecase@1.1.2
├── text-table@0.2.0
├── strip-indent@1.0.1 (get-stdin@4.0.1)
├── tildify@1.2.0 (os-homedir@1.0.2)
├── chalk@1.1.3 (supports-color@2.0.0, escape-string-regexp@1.0.5, ansi-style
s@2.2.1, has-ansi@2.0.0, strip-ansi@3.0.1)
├── hexo-i18n@0.2.1 (sprintf-js@1.0.3)
├── minimatch@3.0.3 (brace-expansion@1.1.6)
├── swig-extras@0.0.1 (markdown@0.5.0)
├── bluebird@3.4.7
├── js-yaml@3.7.0 (esprima@2.7.3, argparse@1.0.9)
├── hexo-fs@0.1.6 (escape-string-regexp@1.0.5, graceful-fs@4.1.11, chokidar@1
.6.1)
├── hexo-cli@1.0.2 (object-assign@4.1.0, minimist@1.2.0)
├── swig@1.4.2 (optimist@0.6.1, uglify-js@2.4.24)
├── nunjucks@2.5.2 (asap@2.0.5, yargs@3.32.0, chokidar@1.6.1)
├── moment-timezone@0.5.11
├── moment@2.13.0
├── cheerio@0.20.0 (entities@1.1.1, dom-serializer@0.1.0, css-select@1.2.0, h
tmlparser2@3.8.3, jsdom@7.2.2)
├── lodash@4.17.4
├── warehouse@2.2.0 (graceful-fs@4.1.11, is-plain-object@2.0.1, JSONStream@1.
3.0, cuid@1.3.8)
├── hexo-log@0.1.2 (bunyan@1.8.5)
└── hexo-util@0.6.0 (striptags@2.1.1, html-entities@1.2.0, camel-case@3.0.0,
cross-spawn@4.0.2, highlight.js@9.9.0)
INFO Start blogging with Hexo!

E:\xDoc\hexo2017\navigating.github.io>

Step 4: install the dependencies with npm install.


E:\xDoc\hexo2017\navigating.github.io>npm install

E:\xDoc\hexo2017\navigating.github.io>

Step 5: test the site locally with hexo server.


E:\xDoc\hexo2017\navigating.github.io>hexo server
INFO Start processing
INFO Hexo is running at http://localhost:4000/. Press Ctrl+C to stop.

Step 6: copy the contents of the old blog's source directory into the new source directory.

Step 7: publish to the remote repository with hexo deploy.


1. Problem encountered:
E:\xDoc\hexo2017\navigating.github.io>hexo d
ERROR Deployer not found: git

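Reference: https://www.v2ex.com/t/175940

The git deployer is a separate plugin, hexo-deployer-git, and hexo init does not install it. Install it in the blog directory: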


npm install hexo-deployer-git --save

E:\xDoc\hexo2017\navigating.github.io>npm install hexo-deployer-git --save
npm WARN deprecated swig@1.4.2: This package is no longer maintained
\


> hexo-util@0.6.0 postinstall E:\xDoc\hexo2017\navigating.github.io\node_modules\h
exo-deployer-git\node_modules\hexo-util
> npm run build:highlight

|
> hexo-util@0.6.0 build:highlight E:\xDoc\hexo2017\navigating.github.io\node_modul
es\hexo-deployer-git\node_modules\hexo-util
> node scripts/build_highlight_alias.js > highlight_alias.json

npm WARN optional dep failed, continuing fsevents@1.0.17
hexo-deployer-git@0.2.0 node_modules\hexo-deployer-git
├── chalk@1.1.3 (escape-string-regexp@1.0.5, supports-color@2.0.0, ansi-style
s@2.2.1, strip-ansi@3.0.1, has-ansi@2.0.0)
├── moment@2.17.1
├── swig@1.4.2 (optimist@0.6.1, uglify-js@2.4.24)
├── hexo-util@0.6.0 (striptags@2.1.1, html-entities@1.2.0, bluebird@3.4.7, ca
mel-case@3.0.0, cross-spawn@4.0.2, highlight.js@9.9.0)
└── hexo-fs@0.1.6 (escape-string-regexp@1.0.5, graceful-fs@4.1.11, bluebird@3
.4.7, chokidar@1.6.1)
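
The "Deployer not found: git" error shown earlier appears when _config.yml asks for a git deployer that is not installed. With hexo-deployer-git installed, the deploy section of _config.yml would look roughly like the sketch below. The repository URL and branch are assumptions for illustration only, since the original post does not show them; substitute your own values.

```yaml
# _config.yml (site configuration), deploy section only.
# Assumption: the URL is a placeholder, not the author's actual repository.
deploy:
  type: git        # the deployer name that hexo-deployer-git provides
  repo: https://github.com/<username>/<username>.github.io.git
  branch: master   # assumption: the branch the user page is served from
```

With this in place, running hexo generate followed by hexo deploy should push the generated site to that repository.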

Reference:
http://www.tuicool.com/articles/IfIRVv2

[131] Briefing: Big Data Hadoop Updates - 2016Q3

HDP/HDF

THREE THINGS TO KNOW ABOUT HDF 2.0
HDF installation on EC2
Hortonworks Connected Data Cloud Overview
Hortonworks Data Cloud
QUICKLY LAUNCH HORTONWORKS DATA PLATFORM IN AMAZON WEB SERVICES

NiFi

Using Apache NiFi for Slowly Changing Dimensions on Hadoop Part 1
Using NiFi to ingest and transform RSS feeds to HDFS using an external

Kafka

Kafka 0.9 Configuration Best Practices

Spark

Stream Processing: NiFi and Spark

Storm

MICROBENCHMARKING APACHE STORM 1.0 PERFORMANCE
Why does HDF come with Storm and not Spark?

Hive

INTERACTIVE SQL ON HADOOP WITH HIVE LLAP
Implementing a real-time Hive Streaming example
ANNOUNCING APACHE HIVE 2.1: 25X FASTER QUERIES AND MUCH MORE

HBase

Phoenix HBase Tuning - Quick Hits
HBase Replication and comparison with popular online backup programs

Hadoop

Data transfer between two clusters
Monitor Hadoop JVMs with jVisualVM
HOW file storage in HDFS is Done? please go through details.
How much actual space required to store 10GB to HDFS? And HBase ?
Heterogeneous Storage in HDFS(Part-1)

[131] Briefing: Big Data Technology Updates - 2016Q4

Hortonworks









HDP/HDF

THREE THINGS TO KNOW ABOUT HDF 2.0
HDF installation on EC2
Hortonworks Connected Data Cloud Overview
Hortonworks Data Cloud
QUICKLY LAUNCH HORTONWORKS DATA PLATFORM IN AMAZON WEB SERVICES

NiFi

Using Apache NiFi for Slowly Changing Dimensions on Hadoop Part 1
Using NiFi to ingest and transform RSS feeds to HDFS using an external

Kafka

Kafka 0.9 Configuration Best Practices

Spark

Stream Processing: NiFi and Spark

Storm

MICROBENCHMARKING APACHE STORM 1.0 PERFORMANCE
Why does HDF come with Storm and not Spark?

Hive

INTERACTIVE SQL ON HADOOP WITH HIVE LLAP
Implementing a real-time Hive Streaming example
ANNOUNCING APACHE HIVE 2.1: 25X FASTER QUERIES AND MUCH MORE

HBase

Phoenix HBase Tuning - Quick Hits
HBase Replication and comparison with popular online backup programs

Hadoop

Data transfer between two clusters
Monitor Hadoop JVMs with jVisualVM
HOW file storage in HDFS is Done? please go through details.
How much actual space required to store 10GB to HDFS? And HBase ?
Heterogeneous Storage in HDFS(Part-1)

[127] Briefing: Big Data Product Analysis - Location-Based User Behavior

Tencent Location Big Data (腾讯位置大数据)

Website: http://heat.qq.com/
Features:
Location traffic trends
Regional heat maps
Population migration maps

Baidu Map Renqi (百度地图人气)

Website: http://renqi.map.baidu.com/
Features:
Baidu Migration (百度迁徙)
Scenic-area heat maps
Baidu Huiyan (百度慧眼)
Commute maps

TalkingData Mobile Observatory (TalkingData移动观象台)

Website: http://www.talkingdata.com/index/#/mobileIndex/zh_CN
Features:
App insights
Device index
Data reports
User trends
Human footprint map (人迹地图)

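References:

  1. Tencent Location Big Data (腾讯位置大数据)
  2. Tencent Big Data (腾讯大数据)
  3. Tencent Big Data Developer Center (腾讯大数据 开发者中心)
  4. Tencent Cloud Analytics (腾讯云分析)
  5. Baidu Map | Renqi (百度地图|人气)
     Baidu Map | Scenic-area heat maps (百度地图|景区热力图)
  6. Baidu Forecast (百度预测)
  7. Baidu Big Data (百度大数据)
  8. Baidu Tongji (百度统计)
  9. Baidu Marketing Center (百度营销中心)
  10. Shujia (数加)
  11. TalkingData | Mobile Observatory | Human footprint map (TalkingData|移动观象台|人迹地图)
  12. Amap Guanjingtai (高德观景台)
  13. Amap Index (高德指数)
  14. North China Urban Smart Travel Big Data Report (华北城市智能出行大数据报告)
  15. http://www.talkingdata.com/index/#/datareport/-1/zh_CN
  16. http://www.talkingdata.com/index/#/profile/usertrend/zh_CN
  17. http://www.talkingdata.com/index/behaviormap/heatMapOverlay.jsp
  18. http://www.talkingdata.com/index/files/2016-04/1461050046352.pdf
  19. http://www.domarketing.org/html/2012/yd_0508/4311.html
  20. Ultrapower | Big Data Applications (神州泰岳|大数据应用)
  21. Dingfu Semantic Cloud (鼎富语义云)
  22. Tencent Big Data Applications (腾讯大数据应用)
  23. Tencent Big Data | Mobile Page User Behavior Report (腾讯大数据|移动页面用户行为报告)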

<End>