Introduction to Basic Concepts and Techniques of Big Data (in Chinese) Ye (Julia) Li
This PowerPoint introduces the basic concepts and techniques of big data. Its target readers are general readers who are starting to take an interest in "Big Data" and data analytics.
From http://www.csdn.net/article/2015-12-17/2826501
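"Fang Liu, Head of Technology at Xiaomi Finance: Applications of Big Data in Internet Finance"
In his keynote, Fang Liu focused on the business architecture and development tools used to build the data warehouse (DW): Scribe for log collection, Hadoop/HDFS for ETL, HBase for the warehouse itself, Hive/Sentry for data analysis, Impala for OLAP, Sqoop for data migration, and Spark for machine learning. He also analyzed user financial profiling and shared his own exploratory practice in big-data anti-fraud: abnormal-environment monitoring and phone verification to prevent account theft, real-name authentication to prevent identity forgery, and cross-validation to detect falsified information.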
1. The document discusses privacy challenges in the big data era and introduces differential privacy as a promising framework. It provides examples of how privacy can be attacked using published medical, web search, and genomic data.
2. Older privacy models such as k-anonymity and l-diversity are shown to have limitations and to be open to attack.
3. Differential privacy is defined as ensuring that the result of an analysis is essentially the same whether or not any one individual's record is included in the dataset, so that individual's privacy is not compromised by participating. Adding calibrated random noise to query outputs provides this property.
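The idea of randomizing query outputs with noise can be illustrated with the Laplace mechanism, a standard way to make a counting query differentially private. The sketch below is not taken from the summarized document; the function names `laplace_noise` and `private_count` and the sample records are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw zero-mean Laplace noise with the given scale.

    The difference of two independent exponential samples with
    mean `scale` follows a Laplace(0, scale) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records matching `predicate`.

    Adding or removing one record changes a count by at most 1,
    so the sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    # Hypothetical records, invented for this example.
    records = [{"age": a} for a in range(100)]
    noisy = private_count(records, lambda r: r["age"] >= 50, epsilon=0.5, rng=rng)
    print(f"noisy count: {noisy:.2f}")
```

A smaller `epsilon` gives stronger privacy but noisier answers. A production system would also need to track the total privacy budget spent across queries, which this sketch omits.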
This training covers User Experience design fundamentals from the psychological and scientific side. We all know that UI is part of UX, so what is the rest? You'll find the answer here!
[2015 e-Government Program] City Paper Presentation: Wuhan (China) shrdcinfo
- Wuhan aims to embrace the big data era and intensify smart city construction by establishing a unified big data platform, innovating big data applications, developing the big data industry, and strengthening big data management.
- Key challenges include the need for an improved information sharing mechanism, a shortage of data analysis and management talents, and escalating information security risks.
- The plan is supported by strengthening organization and leadership, implementing supportive policies, cultivating data professionals, establishing dedicated funding, and integrating resources from science, education, and human capital.
This document discusses smart city solutions and enterprise-grade IoT frameworks. It begins with an overview of the growth of IoT spending and adoption globally. It then discusses challenges of IoT at enterprise scale, including data orchestration, security, connectivity, and device management. The presentation introduces VMware's IoT platform and solutions to address these challenges, including tools for data orchestration, operational analytics, security, and device management. It emphasizes the need for IT and OT to converge at the edge to securely manage diverse IoT systems and simplify deployment and scaling of IoT use cases.
Critical insight about smart government initiatives in the GCC countries Saeed Al Dhaheri
The document discusses the evolution of e-government to smart government. It defines smart government and compares it to smart cities. Several GCC countries' efforts toward mobile/smart government are reviewed, highlighting the UAE's comprehensive approach through initiatives like Dubai Smart City. Recommendations include establishing smart government policies, frameworks, and awards to drive adoption and regional cooperation. The key takeaway is that while mobile access is widespread, GCC countries need formal smart government programs and must embrace new technologies to fully realize smart governance.
Social media, one source of big data, is shaping customer behavior in China, and analyzing social data is a fundamental task for future marketing. Find customer insights based on social data with inter3i, a leading SaaS company in China.
From http://www.csdn.net/article/2015-12-17/2826501
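"Liang Kun, Co-founder and CTO of Shumei: The Sentry Real-Time Financial Risk Control System"
In his keynote, Liang Kun, co-founder and CTO of Shumei, introduced the Sentry real-time financial risk control system. He said that real-time risk control is increasingly important for the banking industry to sustain its rapid growth. Sentry is a real-time transaction risk assessment system built on big data technology. For every transaction, as it happens, the following steps run in real time: (1) the business system sends the transaction information to the risk control system; (2) the risk control system identifies abnormal behavior and suspicious scenarios in the transaction; (3) it computes a risk score for the transaction from the "evidence" found; (4) it returns the risk score and related information to the business system.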
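The four-step flow above can be sketched as a small rule-based scorer. This is only an illustration of the general pattern, not Sentry's actual design: the `Transaction` fields, the two rules, their weights, and the `assess` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Step 1: the information the business system sends (hypothetical fields)."""
    user_id: str
    amount: float
    device_id: str
    known_devices: frozenset


# Step 2: hypothetical rules. Each returns (evidence_name, weight) or None.
def large_amount(tx):
    return ("large_amount", 0.4) if tx.amount > 10_000 else None


def unknown_device(tx):
    return ("unknown_device", 0.5) if tx.device_id not in tx.known_devices else None


RULES = [large_amount, unknown_device]


def assess(tx):
    """Run every rule, score the evidence, and build the reply."""
    hits = [rule(tx) for rule in RULES]
    evidence = [hit for hit in hits if hit is not None]
    # Step 3: combine evidence weights into a risk score capped at 1.0.
    score = min(1.0, sum(weight for _, weight in evidence))
    # Step 4: the response returned to the business system.
    return {"risk_score": score, "evidence": [name for name, _ in evidence]}


if __name__ == "__main__":
    tx = Transaction("u1", 20_000.0, "new-phone", frozenset({"old-phone"}))
    print(assess(tx))
```

Summing fixed weights is the simplest possible scoring choice; a real system would more likely use a trained model and streaming features, which are outside the scope of this sketch.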
From http://www.csdn.net/article/2015-12-17/2826501
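"Zheng Bin, Director of the Ali Data Security Group, Alibaba Data Security Department: Data Security in the Age of Big Data"
In his keynote "Data Security in the Age of Big Data", Zheng Bin, Director of the Ali Data Security Group in Alibaba's Data Security Department, said that the IT era, centered on controlling data flows, is giving way to the DT (data technology) era, which is built on data sharing and aims to unlock productivity. Big data is a new factor of production, and the new "cloud, network, device" infrastructure of Internet Plus (cloud: cloud computing and big data; network: the Internet and the Internet of Things; device: terminals and apps) is activating big data.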
From http://www.csdn.net/article/2015-12-17/2826501
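"Jiang Guibin, Director of Algorithm Technology at Sina Weibo: Big-Data-Driven Social Recommendation on Weibo"
Jiang Guibin, Director of Algorithm Technology at Sina Weibo, gave a talk titled "Big-Data-Driven Social Recommendation on Weibo". He covered the role and positioning of recommendation, the relationship between big data and recommendation, data-driven recommendation at Weibo, and commercial recommendation. In his view, recommendation acts as both an accelerator and a regulator. As an accelerator, it speeds up the spread of high-quality information, the building of high-value relationships, and user growth. As a regulator, it optimizes the structure of the user relationship network and steers and triggers the targeted spread of information.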
From http://www.csdn.net/article/2015-12-17/2826501
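"Huang Yihua, Professor at the PASA Big Data Lab, Department of Computer Science, Nanjing University: Octopus, a Cross-Platform Big Data Machine Learning and Data Analysis System Based on R"
Huang Yihua argues that big data plus machine learning is the core driver of Internet companies worldwide. Big data machine learning is an interdisciplinary research topic that spans both machine learning and big data processing. For complex analysis and mining of big data, existing serial machine learning and data mining algorithms must be rewritten with parallel designs, and on each big data parallel processing platform the algorithms need platform-specific parallel designs. These problems create an urgent need for a unified, easy-to-use platform to support big data machine learning systems.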
1) The document discusses using big data and financial innovation from research to practice. It identifies challenges that traditional financial services face and opportunities that big data presents.
2) It analyzes the three main values of big data: insights from scale, knowledge from enrichment, and agility from real-time responsiveness. It also compares internal enterprise data and external social media big data.
3) The document provides examples of using big data for precision marketing and relationship marketing/risk management. It also discusses research topics like mining offline relationships from online social networks.
BDTC2015 Hulu - Liang Yuming - Voidbox - Docker on YARN Jerry Wen
From http://www.csdn.net/article/2015-12-17/2826501
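"Liang Yuming, Senior R&D Lead at Hulu: Voidbox - Docker On YARN in Practice at Hulu"
Docker is increasingly popular with developers, while YARN is still a relatively new platform for most enthusiasts. What happens when the two are combined? Liang Yuming, Senior R&D Lead at Hulu, described that experience in his talk, "Voidbox - Docker On YARN in Practice at Hulu". The motivation is that a YARN-based big data computing platform lets different computing frameworks be deployed side by side in the same cluster, which improves cluster resource utilization.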
From http://www.csdn.net/article/2015-12-17/2826501
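"Liu Haifeng, Chief Architect of the JD Cloud Platform and Head of the Systems Technology Department: From 2014 to 2016, the Evolution of a Large-Scale In-Memory Database"
Liu Haifeng gave a keynote titled "JIMDB, a Large-Scale In-Memory Database: From 2014 to 2016". JIMDB is a memory-centric data store based on Redis. Its underlying technology development covers three parts: the storage engine (Dict, LSM with RAM-SSD hybrid, B+Tree), the replication protocol (async, sync, etc.), and the sharding strategy (Hash, Range). Built out continuously over the past two years, JIMDB now runs on thousands of large-memory machines across multiple data centers, with 1000+ online clusters supporting almost all of JD's business.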
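The two sharding strategies named above, Hash and Range, can be contrasted with a minimal sketch. It shows the general techniques only and makes no claim about how JIMDB implements them; the function names, the CRC32 hash, and the split keys are assumptions for illustration.

```python
import bisect
import zlib


def hash_shard(key, num_shards):
    """Hash sharding: spread keys evenly, at the cost of losing key order."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def range_shard(key, boundaries):
    """Range sharding: `boundaries` is a sorted list of split keys.

    Shard 0 holds keys below boundaries[0], shard i holds keys in
    [boundaries[i-1], boundaries[i]), and the last shard holds the rest.
    Neighbouring keys stay together, which makes range scans cheap.
    """
    return bisect.bisect_right(boundaries, key)


if __name__ == "__main__":
    splits = ["g", "n", "t"]  # hypothetical split points, giving 4 shards
    for key in ["apple", "hello", "orange", "zebra"]:
        print(key, hash_shard(key, 4), range_shard(key, splits))
```

Hash sharding tends to balance load well but scatters adjacent keys, while range sharding preserves locality but can create hot shards when the key distribution is skewed.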
This document summarizes Spark's growth and development in 2015 and outlines its future direction. It discusses how Spark has become the most active open source big data project, with growing community and contributor numbers. Spark is now used across diverse industries for applications like log processing, recommendations, and business intelligence. The document highlights how Spark supports diverse runtime environments beyond Hadoop and how its user base has expanded beyond data engineers. It outlines upcoming features like the Dataset API and streaming DataFrames that will provide more optimized and easier to use APIs. The goal is for Spark to serve as a unified engine for all data workloads through continued optimization and support for new technologies like 3D XPoint memory.