Ten Powerful Facts About Big Data

2023-10-09 08:50

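However you define it, big data has been controversial since its birth, drawing both praise and vilification. It means a great deal to many people, especially scientists and retailers. But the technology has also raised a host of privacy concerns and security threats.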


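Savior, scam, or a mixture of the two? Either way, big data remains a hot topic among technology pundits, trend forecasters, marketers, and security practitioners. In fact, it still has no universally accepted official definition. So what is it? Wikipedia's description is a good start: "any collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications."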


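Managing data sets that are massive in volume, varied in form, and arriving at high velocity (the classic 3V definition) is challenging enough, but the number of data-sharing devices is growing exponentially, and that changes the problem further. This hardware, collectively known as the Internet of Things (IoT), includes machine sensors and consumer-oriented devices such as connected thermostats, light bulbs, refrigerators, and wearable health monitors. IDC predicts the IoT market will soar in the coming years, from 9.1 billion installed units at the end of 2013 to 28.1 billion by 2020.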


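Organizations see a potential boon in actionable insights derived from big data, not only because such insights can help sell more widgets and services, but also because they can help manage healthcare, stop the flow of counterfeit drugs, track terrorists, and even monitor phone calls. Big data is therefore neither inherently good nor evil; what counts is how it is used.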



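Ironically, despite big data's potential to enhance the human experience, the information is often difficult to collect, filter, analyze, and finally interpret. This article examines the challenges and opportunities of big data, and the facts and figures may surprise you. What to expect? The future looks bright for Hadoop, the leading big data platform, and data scientists and related big data professionals should be well paid for years to come.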


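Industry insiders have predicted that the buzz term "big data" will fade away. "It is all just data, after all. Big data and all the predictions for this space will collapse into 'data management' by the analysts and all those following, including a lot of the 'big' vendors," wrote Hortonworks president Herb Cunitz in a December 2012 blog post.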


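Cunitz may have predicted the demise of "big data" prematurely, but his central point stands: it is all just data. Only the tools needed to manage it will change. Read on for some revealing statistics and research about big data.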



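1. How much data is ignored?

Most companies estimate that they analyze only about 12% of the data they hold, according to a recent Forrester Research study. Is that good or bad? The 88% they ignore may well contain data-driven insights. On the other hand, they may be wisely avoiding a resource-gobbling "boil the ocean" strategy. Forrester cites two main reasons companies ignore most of their own data: a lack of analytics tools and "repressive" data silos, and the difficulty of knowing which information is valuable and which is best left ignored.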


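2. Big data job growth

The big data craze is a boon for workers with the right skills. According to Dice, a career site for tech and engineering professionals, demand for data experts is soaring. Job postings for NoSQL experts were up 54% year over year, and those for "big data talent" rose 46%, the site reported in April. Postings for Hadoop and Python professionals were up 43% and 16%, respectively. Impressive as those gains are, they are small potatoes next to postings for cyber-security specialists, which soared 162% year over year.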


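3. How big will big data get?

Over the next six years, the digital universe will grow from 3.2 ZB (zettabytes) today to 40 ZB. (One ZB is roughly a billion TB.) "When we look at the data volumes coming at us, it's mind-blowing," said Hortonworks CEO Rob Bearden in his keynote at Hadoop Summit 2014 in San Jose, California. "The data volume in the enterprise is going to grow 50x year-over-year between now and 2020. I think the most important thing to recognize is that 85% of that data is coming from net-new data sources." Those sources, including mobile, social media, and web- and machine-generated data, present both a major challenge and an opportunity for enterprises worldwide, Bearden noted.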


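4. Big data means big money

Big data jobs pay well. According to Burtch Works' April 2014 study of data scientist salaries, the 2014 mean base salary for a staff data scientist is $120,000 a year, and $160,000 for a manager. The estimates are based on interviews with more than 170 data scientists drawn from a Burtch Works employment database. Pay is almost as good for the broader category of big data professionals, meaning those who "apply sophisticated quantitative skills to data-describing transactions, interactions, or other behaviors of people to derive insights and prescribe actions." In this category the 2013 median base salary was about $90,000 for staff and $145,000 for managers.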


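5. Are big data professionals ready for the IoT?

Most IT professionals say they have not started preparing for the Internet of Things, even if in practice they have. Spiceworks polled 440 IT professionals in April 2014 to learn how they view the IoT and how they are preparing for it. Sixty-two percent of respondents were in North America and 38% in EMEA (Europe, the Middle East, and Africa). More than half (59%) said they are not taking specific steps to address the expected data deluge from sensors, cameras, and other IoT devices. However, the survey also found that many IT professionals are in fact preparing, by investing in infrastructure, security, applications, and analytics, and by expanding bandwidth.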


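6. Data scientists: still sexy

In October 2012 the Harvard Business Review ran an article with an eye-grabbing headline that called data science the "Sexiest Job of the 21st Century." That is debatable, but if "sexy" is read as a synonym for "in demand," data scientists have lost none of their appeal. According to Modis, a global IT staffing services provider, data scientists remain in "high demand but short supply," which translates into generous six-figure salaries for some PhDs with relevant big data experience.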


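7. Be afraid, data warehouse: Hadoop is coming for you

Should the data warehouse industry fear the rapid rise of Hadoop, or embrace it? That question was put to two Hadoop pioneers, Doug Cutting of Cloudera and Arun Murthy of Hortonworks, during a Q&A session at Hadoop Summit 2014. Many enterprises are moving workloads from data warehouses to Hadoop, but this is not yet happening en masse. Will it? "If you've got a lot of people no longer increasing the size of their data warehouse, but rather capping the size or potentially even decreasing their investment because they find they can do much of the processing as effectively and much more affordably in a Hadoop-based system, I think that's a threat," said Cutting.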


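8. Privacy fears won't stop big data

The chorus of concern rising from a seemingly endless series of privacy and security breaches is unlikely to halt big data's advance. The Economist reported in its June 2014 issue that "there is scant evidence that concern about privacy is causing a fundamental change in the way data are used and stored." Gartner analyst Carsten Casper told the magazine that no "big privacy revolution" is brewing in the IT world. And while companies are asking more privacy-related questions, nine out of 10 of those queries concern the location of data centers, Casper added.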

  

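9. Big data drives software growth

The compound annual growth rate (CAGR) of the worldwide software market from 2013 to 2018 will hover near 6%, research firm IDC predicts. But big data related categories, including collaborative applications, data access, analysis and delivery solutions, and structured data management software, will show a higher CAGR (around 9%) over that five-year period, IDC says.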


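A heightened interest in social media will help drive this growth. "This is complementary to the increased attention to big data and analytics solutions, which help enterprises understand and act on anticipated customer behavior and new insights into product reliability and maintenance," IDC analyst Henry Morris said in a statement.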


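10. Almost everything will be connected

The Internet of Things will include many strange and wondrous devices, many of them new to the world of big data. Analysts at ABI Research accordingly predict that more than 30 billion devices will be wirelessly connected by 2020. Health-related data collection will play a large role in the IoT.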


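Here is a distinctive example: Microsoft, together with researchers from the University of Rochester (New York) and the University of Southampton (UK), has designed a smart bra whose sensors monitor the wearer's heart and skin activity to estimate stress levels, the BBC reported. The bra collects data and sends it to a smartphone app, with the aim of seeing whether wearable technology can help users curb stress-related overeating and keep healthy eating habits.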


【10 Powerful Facts About Big Data】

More than a buzzword
Big data, however you define it, has been praised and vilified. It's many things to many people: a boon to scientists and retailers, but also an enabling technology for a host of privacy and security threats.


Whether savior or scam -- or maybe even a mixture of the two -- big data remains a popular topic among pundits, prognosticators, marketers, and security buffs. Its unofficial definition is evolving as well. So what is it? Wikipedia's description is a good start: "any collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications."


But the challenges of managing massive volumes of varied data sets arriving at high velocities -- the classic 3V's definition -- are changing as the number of data-sharing devices grows exponentially. This hardware, collectively known as the Internet of Things (IoT), includes machine sensors and consumer-oriented devices such as connected thermostats, light bulbs, refrigerators, and wearable health monitors. IDC predicts the IoT market will soar in the coming years -- from 9.1 billion installed units at the end of 2013 to 28.1 billion by 2020.
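To put the IDC forecast in yearly terms, the two endpoints can be turned into an implied constant growth rate. The sketch below is only an illustration: the function name `implied_annual_growth` is made up for this example, and it assumes seven full years separate "the end of 2013" from "2020", which the forecast does not state precisely.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


# IDC figures quoted in the text, in billions of installed units.
# Assumption: seven years between the end of 2013 and 2020.
iot_rate = implied_annual_growth(9.1, 28.1, years=7)
print(f"Implied IoT growth: {iot_rate:.1%} per year")
```

Under that assumption the installed base would need to grow by a little over 17% every year to reach the 2020 figure.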


Organizations see a potential boon in actionable insights derived from big data, not only to sell more widgets and services, but also to better manage healthcare, stop the flow of counterfeit drugs, track terrorists, and maybe even track your phone calls. Hence it's a given that big data isn't inherently good or evil. It's how you use it that counts.


The irony of big data is that despite its potential to enhance the human experience, it's often difficult to collect, filter, analyze, and interpret to gain those cherished insights. This slideshow examines the challenges and capabilities of big data. The facts and figures may surprise you. What to expect? Well, the future appears bright for Hadoop, the leading big data platform. And data scientists and related big data gurus should be gainfully (and lucratively) employed for years to come.


Industry insiders have predicted the buzz term "big data" will fade away. "It is all just data, after all. Big data and all the predictions for this space will collapse into 'data management' by the analysts and all those following, including a lot of the 'big' vendors," wrote Hortonworks president Herb Cunitz in a December 2012 blog.


Cunitz may have prematurely predicted the demise of "big data," but he's spot on: It's all just data. Only the tools needed to manage it will change. Now dig into our slideshow and get a look at some revealing statistics and research.


Jeff Bertolucci is a technology journalist in Los Angeles who writes mostly for Kiplinger's Personal Finance, The Saturday Evening Post, and InformationWeek.

How much data is ignored?
Most companies estimate they're analyzing a mere 12% of the data they have, according to a recent study by Forrester Research. Is this good or bad? Well, these firms might be missing out on data-driven insights hidden inside the 88% of data they're ignoring. Or perhaps they're wisely avoiding a resource-gobbling, boil-the-ocean strategy. A lack of analytics tools and "repressive" data silos are two reasons companies ignore a vast majority of their own data, says Forrester, as well as the simple fact that often it's hard to know which information is valuable and which is best left ignored.

Big data job growth
The big data craze is a boon for tech workers with a particular set of skills. According to Dice, a career site for tech and engineering professionals, demand is soaring for data mavens. Job postings for NoSQL experts were up 54% year over year, and those for "big data talent" rose 46%, the site reported in April. Similarly, postings for Hadoop and Python pros were up 43% and 16%, respectively. Impressive stats, certainly, but small potatoes compared with job postings for cyber-security specialists, which soared 162% year-over-year.

How big will big data get?
The digital universe will grow from 3.2 zettabytes today to 40 zettabytes in only six years. (One zettabyte is roughly a billion terabytes.) "When we look at the data volumes coming at us, it's mind-blowing," said Hortonworks CEO Rob Bearden in his keynote address at Hadoop Summit 2014 in San Jose, Calif. "The data volume in the enterprise is going to grow 50x year-over-year between now and 2020. I think the most important thing to recognize is that 85% of that data is coming from net-new data sources." And these sources, including mobile, social media, and web- and machine-generated data, present both a challenge and an opportunity for enterprises globally, Bearden noted.
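The unit conversion and the growth implied by these figures are easy to check with a few lines of arithmetic. The following is a rough sketch, assuming decimal (SI) units where one zettabyte is 10^21 bytes and one terabyte is 10^12 bytes, and assuming steady growth across the six years; the variable names are illustrative only.

```python
# Assumption: decimal (SI) units, not binary zebibytes/tebibytes.
ZB_IN_BYTES = 10**21
TB_IN_BYTES = 10**12

# How many terabytes fit in one zettabyte.
tb_per_zb = ZB_IN_BYTES // TB_IN_BYTES
print(f"1 ZB = {tb_per_zb:,} TB")

# Figures quoted in the text: 3.2 ZB growing to 40 ZB in six years.
growth_factor = 40 / 3.2
annual_rate = growth_factor ** (1 / 6) - 1
print(f"Total growth: {growth_factor:.1f}x, or about {annual_rate:.0%} per year")
```

On these assumptions the digital universe grows roughly 12.5-fold in total, which works out to about 52% a year. That is far from a literal 50-fold increase every year, so the quoted "50x year-over-year" figure and the 3.2-to-40 ZB endpoints do not describe the same growth path; the source does not say how the two are meant to be reconciled.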

Big data = big bucks
Big data jobs pay quite well. According to Salaries of Data Scientists, an April 2014 study from Burtch Works, the 2014 mean base salary for a staff data scientist is $120,000, and $160,000 for a manager. The estimates are based on interviews with more than 170 data scientists from a Burtch Works employment database. The pay scale is almost as good for the broader category of big data professionals, meaning those who "apply sophisticated quantitative skills to data-describing transactions, interactions, or other behaviors of people to derive insights and prescribe actions." In this category the 2013 median base salary for staff is $90,000; for managers, it's a cool $145,000.

Are big data pros ready for the IoT?
Most IT pros say they haven't started preparing for the Internet of Things -- even if they have. Spiceworks polled 440 IT professionals in April 2014 to get their take on the IoT and how they're preparing for it. Sixty-two percent of respondents were in North America and 38% in EMEA (Europe, the Middle East, and Africa). More than half (59%) of respondents said they're not taking specific steps to address the expected data deluge from sensors, cameras, and numerous other IoT devices. However, the survey also found that many IT pros are, in fact, preparing for the IoT by investing in infrastructure, security, applications, and analytics, and by expanding bandwidth.

Data scientists: still sexy
The eye-grabbing headline of an October 2012 article in the Harvard Business Review called the data science profession the "Sexiest Job of the 21st Century." That's debatable, but if "sexy" is synonymous with "in demand," data scientists haven't lost any of their mojo. According to Modis, a global IT staffing services provider, data scientists remain in "high demand but short supply," which translates into generous six-figure salaries for some PhDs with relevant big data experience.

Be afraid, data warehouse: Hadoop's in town
Should the data warehouse industry fear the rise of Hadoop? Embrace it? That question was posed to two Hadoop pioneers -- Doug Cutting of Cloudera and Arun Murthy of Hortonworks -- during a Q&A at Hadoop Summit 2014. While many enterprises are moving workloads from data warehouses to Hadoop, that's not happening en masse. But will it? "If you've got a lot of people no longer increasing the size of their data warehouse, but rather capping the size or potentially even decreasing their investment because they find they can do much of the processing as effectively and much more affordably in a Hadoop-based system, I think that's a threat," said Cutting.

Privacy fears won't stop big data
The cacophony of concerns rising from a seemingly endless series of privacy and security breaches isn't likely to thwart big data's advancement. The Economist reports in its June 2014 issue that "there is scant evidence that concern about privacy is causing a fundamental change in the way data are used and stored." Gartner analyst Carsten Casper tells the magazine that no "big privacy revolution" is brewing in the IT world. And while companies are asking more privacy-related questions, nine of 10 of those queries have to do with the location of data centers, Casper adds.

Big data drives software growth
The compound annual growth rate (CAGR) for the 2013-2018 worldwide software market will hover near 6%, research firm IDC predicts. But big data related categories, including collaborative applications and data access, analysis and delivery solutions, and structured data management software, will show a higher CAGR (around 9%) over that five-year period, says IDC.


A heightened interest in social media will help drive this growth. "This is complementary to the increased attention to big data and analytics solutions, which help enterprises understand and act on anticipated customer behavior and new insights into product reliability and maintenance," said IDC analyst Henry Morris in a statement.
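A three-point gap in CAGR looks small, but it compounds. As a minimal sketch, the snippet below applies both rates to a hypothetical market indexed at 100 in 2013. The base of 100 is an arbitrary illustration rather than an IDC figure, and the rates "near 6%" and "around 9%" are treated as exact.

```python
def compound(base: float, rate: float, years: int) -> float:
    """Value of `base` after growing at `rate` per year for `years` years."""
    return base * (1 + rate) ** years


# Hypothetical index: both markets start at 100 in 2013 (not an IDC figure).
overall_2018 = compound(100.0, 0.06, 5)
big_data_2018 = compound(100.0, 0.09, 5)

print(f"Overall software market index in 2018: {overall_2018:.1f}")
print(f"Big data categories index in 2018:     {big_data_2018:.1f}")
```

Over five years the 6% path ends about 34% above its starting point and the 9% path about 54% above, so the big data categories would outgrow the wider market by roughly 20 index points on these assumptions.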

Almost everything will be connected
The Internet of Things will include many strange and wondrous devices, many of which are new to the world of big data. That's why analysts at ABI Research predict more than 30 billion devices will be wirelessly connected by 2020. Health-related data collection will play a large role in the IoT, of course.


Here's a unique example: Microsoft, in conjunction with researchers from the University of Rochester (New York) and University of Southampton (UK), has designed a bra with sensors that detects the wearer's stress level by monitoring heart and skin activity, the BBC reported. Designed to see if wearable tech can help control stress-related overeating, the bra collects and sends data to a smartphone app to help the user control eating habits.


Originally published: 2014-07-08

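This article comes from Yunqi Community partner "大数据文摘" (Big Data Digest); for related information, follow the "BigDataDigest" WeChat official account.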

http://www.chinasem.cn/article/171711
