Learning Hadoop: An Overview of Big Data

Published: 2021-07-08 00:00

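I suddenly wanted to learn about big data. On top of that, my lecturer made big data sound amazing in class, but I still didn't follow it 😟, so I had to study on my own after class. This post is a rough tour of some basic big data concepts and terms.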

1. What Is Big Data

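Big data: data sets that cannot be captured, managed, and processed with conventional software tools within an acceptable time frame. They are massive, fast-growing, and diverse information assets that need new processing models before they can deliver stronger decision-making, insight discovery, and process optimization. In practice, the term usually means data stored at the TB, PB, or EB scale.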

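  • It mainly addresses the collection, storage, and analysis and computation of massive data.
  • Large volume
  • Generated at high speed
  • Varied data types: structured data (databases, text) and unstructured data (web logs, audio, video, images, geolocation, and so on)
  • Low value density: the value density of the data is inversely related to its total volume.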
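To get a feel for the TB, PB, and EB scales mentioned above, here is a small sketch that converts between them. It assumes binary units (each unit is 1024 times the previous one), and the helper names `to_bytes` and `convert` are made up for illustration.

```python
# Binary storage units, each 1024 times the previous one (an assumption;
# some vendors use powers of 1000 instead).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]


def to_bytes(value, unit):
    """Convert a value expressed in `unit` to bytes."""
    return value * 1024 ** UNITS.index(unit)


def convert(value, from_unit, to_unit):
    """Convert a value between two units listed in UNITS."""
    return to_bytes(value, from_unit) / to_bytes(1, to_unit)


if __name__ == "__main__":
    print(convert(1, "PB", "TB"))  # 1024.0
    print(convert(1, "EB", "TB"))  # 1048576.0
```

So a single EB is roughly a million TB, which is far beyond what one ordinary machine can store.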

2. Introduction to Hadoop

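  • Hadoop is a distributed systems infrastructure developed by the Apache Foundation.
  • It mainly addresses the storage of massive data and the analysis and computation of that data.
  • In a broader sense, Hadoop usually refers to a wider concept: the Hadoop ecosystem.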

3. Hadoop Distributions

The three major Hadoop distributions: Apache, Cloudera, and Hortonworks.

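  • The Apache release is the original (most basic) one and is the best choice for beginners.
  • Cloudera bundles many big data frameworks; its product is CDH.
  • Hortonworks has better documentation; its product is HDP.
  • Hortonworks and Cloudera have merged.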

3. Features of Hadoop

  • High reliability: Hadoop keeps multiple replicas of the data under the hood, so even if a compute or storage unit fails, no data is lost.
    [Diagram: three nodes labeled Hadoop101, Hadoop102, and Hadoop103]
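  • High scalability: tasks and data are distributed across the cluster, which can readily scale out to thousands of nodes or more.
  • High efficiency: following the MapReduce model, Hadoop works in parallel, which speeds up task processing.
  • High fault tolerance: failed tasks are automatically reassigned.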
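The high-reliability point can be illustrated with a toy model: each block is copied to several nodes, so losing a node still leaves every block readable. This is only a sketch. The round-robin placement, the replica count of 3, the fourth node, and the names `place_replicas` and `readable_blocks` are assumptions for illustration, not how HDFS actually chooses nodes.

```python
def place_replicas(blocks, nodes, replicas=3):
    """Assign each block to `replicas` distinct nodes, round-robin."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, block in enumerate(blocks):
        placement[block] = [nodes[(i + k) % len(nodes)] for k in range(replicas)]
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one live replica."""
    failed = set(failed_nodes)
    return [b for b, holders in placement.items() if set(holders) - failed]


if __name__ == "__main__":
    nodes = ["hadoop101", "hadoop102", "hadoop103", "hadoop104"]
    placement = place_replicas(["blk_1", "blk_2", "blk_3"], nodes)
    # Even with two of the four nodes down, every block keeps a live copy.
    print(readable_blocks(placement, ["hadoop101", "hadoop102"]))
```

With a single replica per block, the same failure would lose whatever lived on the failed nodes, which is why the replica count matters.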

4. Hadoop Components

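  • Components in Hadoop 1.x
    MapReduce handles both the computation and the scheduling of the resources it needs, such as CPU and memory.
    [Diagram: Hadoop 1.x = HDFS (data storage) + Common (utilities) + MapReduce (computation and resource scheduling)]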
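  • Components in Hadoop 2.x
    YARN was added to handle resource scheduling, leaving MapReduce responsible for computation only.
    [Diagram: Hadoop 2.x = HDFS (data storage) + Common (utilities) + YARN (resource scheduling) + MapReduce (computation)]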
  • Hadoop 3.x has essentially the same components as 2.x, although it differs in the details.

5. HDFS

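HDFS (Hadoop Distributed File System) is a distributed file system.
Roughly, it works like this: a very large file is split into many blocks, which are stored across individual DataNodes. The NameNode stores only metadata, such as which DataNodes hold each block. The 2NN backs up the NameNode's metadata, in case the NameNode goes down and all of that information is lost.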

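  • NameNode (nn): stores file metadata, such as file names, the directory structure, and file attributes, plus each file's block list and the DataNodes that hold each block.
  • DataNode (dn): stores the file block data in its local file system, along with checksums for the blocks.
  • Secondary NameNode (2nn): backs up the NameNode's metadata at regular intervals by checkpointing it; it is not a hot standby for the NameNode.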
[Diagram: the NameNode records where the data is stored; DataNode1, DataNode2, DataNode3, and many more store the data; the 2NN holds the backup]
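The division of labor described above can be sketched in a few lines: the file is cut into blocks, the blocks go to DataNodes, and the NameNode keeps only the map of which node holds which block. Everything here is a toy model under stated assumptions: the classes `ToyNameNode` and `ToyDataNode`, the round-robin placement, and the 4-byte block size are invented for illustration (real HDFS blocks are far larger, 128 MB by default in Hadoop 2.x and 3.x).

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


class ToyNameNode:
    """Holds only metadata: which DataNode stores each block of a file."""

    def __init__(self):
        self.block_map = {}  # filename -> list of (block_index, datanode_name)


class ToyDataNode:
    """Holds the actual block contents."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # (filename, block_index) -> bytes


def write_file(namenode, datanodes, filename, data, block_size=4):
    """Spread the blocks over the DataNodes and record their locations."""
    locations = []
    for index, block in enumerate(split_into_blocks(data, block_size)):
        node = datanodes[index % len(datanodes)]  # round-robin (assumption)
        node.blocks[(filename, index)] = block
        locations.append((index, node.name))
    namenode.block_map[filename] = locations


def read_file(namenode, datanodes, filename):
    """Ask the NameNode where the blocks are, then fetch and join them."""
    by_name = {node.name: node for node in datanodes}
    return b"".join(
        by_name[node_name].blocks[(filename, index)]
        for index, node_name in namenode.block_map[filename]
    )


if __name__ == "__main__":
    nn = ToyNameNode()
    dns = [ToyDataNode(f"DataNode{i}") for i in (1, 2, 3)]
    write_file(nn, dns, "demo.txt", b"hello hadoop!")
    print(nn.block_map["demo.txt"])
    print(read_file(nn, dns, "demo.txt"))
```

Note that the toy NameNode never holds file contents, which is also why losing its metadata would make the blocks on the DataNodes impossible to reassemble.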

6. YARN

YARN (Yet Another Resource Negotiator) is a resource coordinator and serves as Hadoop's resource manager.

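  • ResourceManager (RM): the boss of all cluster resources (memory, CPU).
  • NodeManager (NM): the boss of a single node server's resources.
  • ApplicationMaster (AM): the boss of a single running job.
  • client: the client
  • Container: a container, comparable to a standalone server; it encapsulates the resources a task needs to run, such as memory, CPU, disk, and network.
  • There can be multiple clients, multiple ApplicationMasters can run on the cluster, and each NodeManager can host multiple Containers.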

[Diagram: two clients; a ResourceManager with 16 GB memory and 8 CPUs; four NodeManagers with 4 GB memory and 2 CPUs each, hosting Container1, Container4, and two Containers that each contain an App Mstr]
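To make the roles above concrete, here is a toy sketch of a ResourceManager handing out Containers on NodeManagers that each offer 4 GB of memory and 2 CPUs, as in the diagram. The class names and the first-fit placement rule are assumptions for illustration; real YARN schedulers (for example the Capacity and Fair schedulers) are considerably more involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNodeManager:
    """Tracks the free resources and containers of one node."""
    name: str
    free_memory_gb: int = 4
    free_cpus: int = 2
    containers: list = field(default_factory=list)


class ToyResourceManager:
    """Knows every node and decides where each container goes."""

    def __init__(self, node_managers):
        self.node_managers = node_managers

    def allocate(self, container_id, memory_gb, cpus):
        """Place the container on the first node with enough free resources."""
        for nm in self.node_managers:
            if nm.free_memory_gb >= memory_gb and nm.free_cpus >= cpus:
                nm.free_memory_gb -= memory_gb
                nm.free_cpus -= cpus
                nm.containers.append(container_id)
                return nm.name
        return None  # no single node can host this request right now


if __name__ == "__main__":
    rm = ToyResourceManager([ToyNodeManager(f"nm{i}") for i in range(1, 5)])
    print(rm.allocate("Container1", 2, 1))  # nm1
    print(rm.allocate("Container2", 2, 1))  # nm1 (now full)
    print(rm.allocate("Container3", 4, 2))  # nm2
    print(rm.allocate("Container4", 8, 1))  # None: larger than any one node
```

The last request fails even though the cluster has enough memory in total, because a Container lives on a single NodeManager.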

7. MapReduce

MapReduce splits a computation into two phases: Map and Reduce.

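  1. The Map phase processes the input data in parallel.
  2. The Reduce phase aggregates the Map results.
    Suppose 100 TB of data has already been spread across many servers. To find a particular item, we can ask every server to search its own local data in parallel and then report its result to an aggregating server.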
[Diagram: a 100 TB data set is spread across hadoop101, hadoop102, hadoop103, and many more servers (map); their results go to an aggregating server (Reduce)]
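The search example above can be sketched on a single machine. In this toy version, threads stand in for the separate servers, each "server" scans only its own records (Map), and the partial results are merged at the end (Reduce). The function names and the sample records are invented for illustration, and this is plain Python rather than the Hadoop MapReduce API.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_search(shard):
    """Map step: one server scans only its own records for the keyword."""
    server, records, keyword = shard
    return [(server, record) for record in records if keyword in record]


def reduce_merge(left, right):
    """Reduce step: combine two partial result lists into one."""
    return left + right


def search(shards_by_server, keyword):
    """Run the map step on every shard in parallel, then merge the results."""
    shards = [(s, recs, keyword) for s, recs in sorted(shards_by_server.items())]
    with ThreadPoolExecutor() as pool:
        partial_results = list(pool.map(map_search, shards))  # keeps shard order
    return reduce(reduce_merge, partial_results, [])


if __name__ == "__main__":
    data = {
        "hadoop101": ["apple pie", "hdfs notes"],
        "hadoop102": ["yarn notes"],
        "hadoop103": ["banana", "mapreduce notes"],
    }
    print(search(data, "notes"))
```

The key point is that no server ever needs to see another server's data; only the small per-server results travel to the aggregator.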

8. How HDFS, YARN, and MapReduce Relate

[Figure: the relationship between HDFS, YARN, and MapReduce; image not available]

9. The Big Data Processing Workflow

[Figure: the big data processing workflow; image not available]

References:

Big data course 《Hadoop入门》 (Introduction to Hadoop)

Original link: https://blog.csdn.net/weixin_48077303/article/details/115673027