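4. About Big Data
• Offline big data vs. online big data
• Data mining vs. online services
• Persistent big data vs. in-memory big data
• Structured big data vs. semi-structured big data
• Personal definition: the data requirements exceed a single machine's capacity by an order of magnitude
DTCC2012
Sunday, April 15, 2012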
5. Intro to Redis
• REmote DIctionary Server
• NoSQL store by @antirez, sponsored by VMware
• redis.io github.com/antirez/redis
• Started in 2009; latest stable release is 2.4.10
• Key - String, Hash, List, (Sorted) Set; Pub/Sub
• Great Performance
6. Intro to Redis
• Written in C, single-threaded, event-driven
• Fork: copy-on-write provided by the OS
• Replication
• Persistence
• AOF (append-only file)
• RDB (snapshot)
• All data in memory
11. Redis for Big Data: Notifications
• Storage: Redis
• Index: key - list
• uid - notice id list
• public notice id list
• Content: key - value
• notice id - notice content
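The index/content split above can be sketched as follows. This is a minimal illustration, not the speaker's implementation: `MiniStore` is a hypothetical in-memory stand-in that mimics a few Redis commands (SET, MGET, LPUSH, LRANGE), and the key names (`notice:user:<uid>`, `notice:public`, `notice:content:<id>`) are assumptions made for the example.

```python
class MiniStore:
    """Hypothetical in-memory stand-in for a few Redis commands (illustration only)."""

    def __init__(self):
        self.strings = {}
        self.lists = {}

    def set(self, key, value):
        self.strings[key] = value

    def mget(self, keys):
        # Like MGET: missing keys come back as None
        return [self.strings.get(k) for k in keys]

    def lpush(self, key, value):
        # Like LPUSH: the newest element goes to the head of the list
        self.lists.setdefault(key, []).insert(0, value)

    def lrange(self, key, start, stop):
        # Like LRANGE: stop is inclusive; only non-negative indexes are handled here
        return self.lists.get(key, [])[start:stop + 1]


def send_notice(store, notice_id, content, uid=None):
    """Store the content once, then add the id to a per-user or public index list."""
    store.set(f"notice:content:{notice_id}", content)
    index_key = "notice:public" if uid is None else f"notice:user:{uid}"
    store.lpush(index_key, notice_id)


def latest_notices(store, uid, limit=10):
    """Read the newest ids from the user's index, then fetch their contents in one call."""
    ids = store.lrange(f"notice:user:{uid}", 0, limit - 1)
    return store.mget([f"notice:content:{i}" for i in ids])


store = MiniStore()
send_notice(store, 1, "first private notice", uid=42)
send_notice(store, 2, "second private notice", uid=42)
send_notice(store, 3, "public notice")
print(latest_notices(store, 42))
```

Keeping ids in lists and contents in separate keys means a public notice is stored once, however many users read it.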
12. Redis for Big Data: Notifications
• Storage: Redis
• Reminder: key - value
• uid - since public notice id
• uid - since notice id ?
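One way to read the "since id" reminder is as a per-user marker for the newest public notice already seen. The sketch below assumes notice ids increase over time and that the public list is ordered newest first; neither assumption is stated in the slides. A plain dict stands in for the uid-to-since-id key-value store, and the function names are hypothetical.

```python
def unread_public(public_ids, since_id):
    """Return public notice ids newer than since_id (public_ids is newest first)."""
    unread = []
    for notice_id in public_ids:
        if notice_id <= since_id:
            break  # everything after this point has already been seen
        unread.append(notice_id)
    return unread


def mark_read(since_by_uid, uid, public_ids):
    """Advance the user's marker to the newest public notice id."""
    if public_ids:
        since_by_uid[uid] = public_ids[0]


public_ids = [9, 7, 4, 2]   # newest first
since_by_uid = {}           # uid -> since public notice id

first_check = unread_public(public_ids, since_by_uid.get(1, 0))
mark_read(since_by_uid, 1, public_ids)
second_check = unread_public(public_ids, since_by_uid.get(1, 0))
```

With this layout, only one small value is stored per user, and no per-user copy of the public list is needed.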
32. Redis for Big Data: Counters
• Technical implementation
• mc (memcached) + MySQL (original list data)
• Redis: key - value
• key: uid or mid
• value: count
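The key-to-count layout can be sketched with a small in-memory stand-in that follows the semantics of the Redis INCR and MGET commands (a missing key counts as 0 when incremented, and reads back as None). The class name and the key formats `followers:<uid>` and `reposts:<mid>` are assumptions for illustration only.

```python
class CounterStore:
    """Hypothetical in-memory stand-in for Redis INCR/INCRBY and MGET."""

    def __init__(self):
        self._counts = {}

    def incr(self, key, amount=1):
        # Like INCR/INCRBY: a missing key starts from 0; the new value is returned
        self._counts[key] = self._counts.get(key, 0) + amount
        return self._counts[key]

    def mget(self, keys):
        # Like MGET: missing keys come back as None
        return [self._counts.get(k) for k in keys]


counters = CounterStore()
counters.incr("followers:42")        # key built from a uid
counters.incr("followers:42")
counters.incr("reposts:1001", 5)     # key built from a mid
print(counters.mget(["followers:42", "reposts:1001", "reposts:9999"]))
```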
33. Redis for Big Data: Counters
• Problems
• Consistency
• count vs list
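The "count vs list" problem arises because the counter and the underlying list are updated separately and can drift apart. The slides do not say how this was handled; the sketch below shows one common approach, assuming the list is the source of truth, and uses plain dicts with hypothetical names.

```python
def reconcile(counts, items_by_key, key):
    """Reset counts[key] to the length of the authoritative list; return the drift."""
    actual = len(items_by_key.get(key, []))
    drift = counts.get(key, 0) - actual
    counts[key] = actual
    return drift


counts = {"followers:42": 5}               # counter that has drifted
items_by_key = {"followers:42": [7, 8, 9]}  # original list data
drift = reconcile(counts, items_by_key, "followers:42")
```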
34. Redis for Big Data: Counters
• Problems
• TCO
• Redis needs 100+ bytes to store one count
• Hash: store multiple counts in one hash
• rediscounter: use an array instead of a hash table
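The two space-saving ideas can be illustrated roughly as follows. The slides give no detail on either, so this is one possible reading: the first part groups several counters for one post under a single hash (HINCRBY-like semantics, modeled with nested dicts), and the second packs fixed-width integers into a flat array. The counter names, the key format, and the way ids map to array slots are all assumptions; the internals of rediscounter are not described in the source.

```python
from array import array

# Reading 1: one hash per post, one field per counter (HINCRBY-like semantics).
hashes = {}

def hincrby(key, field, amount=1):
    fields = hashes.setdefault(key, {})
    fields[field] = fields.get(field, 0) + amount
    return fields[field]

# Reading 2: a flat array of fixed-width unsigned integers.
# Each slot holds one value per counter type, stored side by side.
COUNT_TYPES = ("reposts", "comments")   # hypothetical counter names
SLOTS = 1000                            # hypothetical capacity
counts = array("I", [0] * (SLOTS * len(COUNT_TYPES)))

def array_incr(slot, count_type, amount=1):
    # How a uid or mid maps to a slot is not covered here; the slot is passed in.
    pos = slot * len(COUNT_TYPES) + COUNT_TYPES.index(count_type)
    counts[pos] += amount
    return counts[pos]

hincrby("count:1001", "reposts")
hincrby("count:1001", "comments", 3)
array_incr(5, "comments", 4)
print(counts.itemsize, "bytes per stored count in the array")
```

The array stores only the raw integers, with no per-key object overhead, which is the kind of saving the "100+ bytes per count" bullet is concerned with.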
35. Redis for Big Data: Counters
• Problems
• Long tail (per-post counts on Weibo)
• 10+ billion counts
• 1% hot: only hot data in memory
• mget <= 10 ms
• No solution yet
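A rough back-of-the-envelope calculation shows why the long tail is a cost problem. It uses the lower bounds quoted in the slides (10 billion counts, 100 bytes per count, 1% hot), so actual figures would be higher.

```python
TOTAL_COUNTS = 10 * 10**9   # "10+ billion counts" (lower bound)
BYTES_PER_COUNT = 100       # "100+ bytes to store a count" (lower bound)
HOT_PERCENT = 1             # "1% hot"

all_in_memory = TOTAL_COUNTS * BYTES_PER_COUNT
hot_only = TOTAL_COUNTS * HOT_PERCENT // 100 * BYTES_PER_COUNT

print(f"all counts in memory: {all_in_memory / 10**12:.1f} TB")
print(f"hot counts only:      {hot_only / 10**9:.1f} GB")
```

Keeping everything in memory needs on the order of 1 TB or more, while the hot 1% fits in roughly 10 GB; the open question on the slide is how to serve the remaining 99% within the mget latency target.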