3. [Figure: chart labeled "1 Billion US$" and "Mobile Internet"]
• Five phases of computing growth since the 1960s:
1. Mainframe, 2. Minicomputer, 3. PC, 4. Internet, 5. Mobile Internet.
• In every phase, total user time increased 10 times.
The combined market value of the top 5 companies also increased 10 times in every phase.
• With the mobile internet, the huge amount of user time generates big data.
The technical challenge is how to deal with big data.
• The solution to the big data challenge is cloud computing.
4. Intel Pentium 4 CPU's power is 10,000 MIPS.
MIPS: Million Instructions Per Second.
• 1965, Moore's Law:
The number of transistors in an IC doubles every 2 years, or even every 18 months.
• Still, the power of a single CPU cannot match the power of the human brain.
Solution: use many computers.
• Challenge: orchestrating many computers to work together.
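The two doubling periods quoted for Moore's Law lead to very different growth over a decade. The sketch below only works through that arithmetic; the function name and the ten-year horizon are illustrative choices, not part of the slides.

```python
def transistor_growth(years: float, doubling_period_years: float) -> float:
    """Multiplicative growth after `years` if the count doubles every period."""
    return 2.0 ** (years / doubling_period_years)


# Ten years of growth under each doubling period mentioned on the slide.
growth_every_2_years = transistor_growth(10, 2.0)     # 2**5 = 32x
growth_every_18_months = transistor_growth(10, 1.5)   # 2**(10/1.5), roughly 100x
```

Even at the faster rate, a single chip gains only about two orders of magnitude per decade, which is why the slide turns to many computers working together.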
5. Google's initial cloud
• A cloud can be built
with commodity PC servers.
• The most successful cloud so far was started by two graduate students at Stanford University.
Larry Page, from the University of Michigan (the "Beihang" of the US).
Sergey Brin, from the University of Maryland (the "BUPT" of the US).
6. [Photos: Sergey Brin & Larry Page; Andy Bechtolsheim]
• Sergey and Larry wanted to build a search engine.
They needed the power of a supercomputer
to store every historic version of every web page of every website in the world,
and to process that big data to build a search index.
• They raised funds from Andy Bechtolsheim in 1998.
Andy, a CMU alumnus and cofounder of Sun Microsystems, was very rich.
• But Andy gave them only 100K US$.
It was the most successful investment, but also the most stupid one.
7. [Photo: Andy Bechtolsheim]
• Why was Andy not optimistic about Google?
There were 4 technical difficulties,
and the two boys might not have had the skill set.
• Scalability:
Big storage space for big data, at googol (10^100) scale!
Big parallel computing to process it.
This had never succeeded in human history,
and the data grows every second.
• Reliability:
When using commodity machines,
a single machine's failure should not break down the entire system.
• Elasticity:
The load on different modules fluctuates differently,
so schedule the same machines to work for different modules at different times.
• Security:
Dynamically separate the machines into clusters that are mutually inaccessible.
8. • Sergey and Larry's answer was:
"Oh, yeah, our company's name is Google!
We deal with big data."
• Google has run the world's largest cloud
continuously and reliably for 15 years.
10. • Scalability: add more machines without modifying the current system.
• Twitter was launched in May 2006.
Dec 2007: Twitter users increased to 66K.
Dec 2008: Twitter users grew to 5 million.
April 2009: over 100 million.
• Weibo was launched in Sept 2009.
Nov 2009: Weibo users increased to 1 million.
April 2010: over 10 million.
Aug 2010: over 30 million.
Oct 2010: over 50 million.
• China's population
makes it the best test bed
for cloud computing technology.
11. • Reliability: a single machine's failure must not break down the entire system.
• On Oct 29, 2009, Tmall kicked off a 50% discount sale.
• Half an hour after the event started,
Alipay (支付宝) slowed down significantly.
Another half hour later, the service shut down.
One hour later, the service recovered.
• During the hour that the service was down,
business on the scale of a billion yuan was lost.
12. • Elasticity: use the same machines for different businesses at different times.
• Does Alipay (支付宝) need to keep a huge number of machines
only to prepare for the annual sale? NO!
• The Super Bowl is the most popular sporting event in the US.
During the game, Twitter's load is 40% higher than usual.
During the most exciting moments,
Twitter's load is 150% higher than usual.
• But unlike Alipay,
Twitter doesn't keep a lot of machines.
Twitter borrowed machines
temporarily from a third party.
• A lesson learned from Twitter:
dynamically allocate machines
among different businesses,
automatically,
in real time.
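The Super Bowl figures above translate directly into a capacity calculation. The following is a minimal sketch, assuming a hypothetical baseline of 100,000 requests per second and a hypothetical capacity of 1,000 requests per second per machine; neither number comes from the slides, and the function names are illustrative.

```python
def machines_needed(load: int, capacity_per_machine: int) -> int:
    """Smallest number of machines whose combined capacity covers the load."""
    return -(-load // capacity_per_machine)  # ceiling division on integers


def extra_machines(baseline_load: int, increase_percent: int,
                   capacity_per_machine: int) -> int:
    """Machines to borrow when load rises by increase_percent over baseline."""
    peak_load = baseline_load * (100 + increase_percent) // 100
    return (machines_needed(peak_load, capacity_per_machine)
            - machines_needed(baseline_load, capacity_per_machine))


# Hypothetical numbers: 100,000 requests/s baseline, 1,000 requests/s per machine.
borrow_during_game = extra_machines(100_000, 40, 1_000)    # load 40% above usual
borrow_at_peak = extra_machines(100_000, 150, 1_000)       # load 150% above usual
```

Under these assumptions the service runs on 100 machines normally and borrows 40 during the game and 150 at the most exciting moments, instead of owning 250 machines all year.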
13. • Security: prevent data leaks.
• A cloud can host multiple businesses.
• Each business runs in its own LAN,
mutually inaccessible.
15. • Data flow and control:
push the cloud to run faster.
• Anatomy of Twitter.
• Cache for fast read.
• Queue for async tasks.
• Pub/Sub for messaging.
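The three building blocks on this slide (cache, queue, pub/sub) can be shown together in a toy, single-process model. This is only a sketch of the ideas, not Twitter's actual architecture; the class `TinyFeed` and all of its methods are hypothetical names introduced for illustration.

```python
from collections import defaultdict, deque


class TinyFeed:
    """Toy model: posts are queued, fanned out to followers, and read from a cache."""

    def __init__(self):
        self.followers = defaultdict(set)        # pub/sub: author -> subscribers
        self.timeline_cache = defaultdict(list)  # cache: user -> ready-made timeline
        self.queue = deque()                     # queue: pending fan-out tasks

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def post(self, author, text):
        # The write path only enqueues a task, so posting stays fast.
        self.queue.append((author, text))

    def run_worker(self):
        # A background worker would do this; here it is called explicitly.
        while self.queue:
            author, text = self.queue.popleft()
            for follower in self.followers[author]:
                self.timeline_cache[follower].append((author, text))

    def read_timeline(self, user):
        # The read path touches only the cache, never the queue.
        return list(self.timeline_cache[user])


feed = TinyFeed()
feed.follow("bob", "alice")
feed.post("alice", "hello cloud")
feed.run_worker()
timeline = feed.read_timeline("bob")
```

Separating the fast write (enqueue) from the slow fan-out is what lets reads be served from a precomputed cache.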
16. • Distributed File System:
Scalable file storage.
• Google File System.
(Hadoop HDFS)
• Master and Namespace.
• Chunk vs. File
• Replicas vs. Fragments
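The "chunk vs. file" and "replica" ideas can be sketched in a few lines: a file is cut into fixed-size chunks, and each chunk is stored on several servers. This is a simplified illustration, not the Google File System or HDFS placement policy; the round-robin placement, the tiny chunk size, and the server names are assumptions made for the example.

```python
from typing import Dict, List


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Cut a file's bytes into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_replicas(num_chunks: int, servers: List[str],
                   replicas: int) -> Dict[int, List[str]]:
    """Assign each chunk to `replicas` distinct servers, round-robin (assumed policy)."""
    if replicas > len(servers):
        raise ValueError("not enough servers for the requested replica count")
    return {
        chunk_id: [servers[(chunk_id + r) % len(servers)] for r in range(replicas)]
        for chunk_id in range(num_chunks)
    }


chunks = split_into_chunks(b"abcdefghij", 4)
placement = place_replicas(len(chunks), ["s0", "s1", "s2", "s3"], replicas=3)
```

A master only needs to remember the namespace and this chunk-to-server map; the chunk bytes themselves live on the servers, so losing one server leaves two other copies of every chunk it held.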
18. • Distributed Lock:
Guarantee multiple readers, single writer
in a distributed system.
• With replicas,
one lock per data item, or several?
• How to deal with inconsistency?
• How to elect a master
with the Paxos protocol?
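The "multiple readers, single writer" rule is easiest to see inside one process first. The sketch below is a local readers-writer lock built on Python's `threading.Condition`; it is not a distributed lock and does not implement Paxos. In a real cluster the same rule has to be enforced across machines by a coordination service. The class name and the timeout-based interface are illustrative choices.

```python
import threading


class ReadWriteLock:
    """Many concurrent readers, or exactly one writer, never both."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0       # number of readers currently holding the lock
        self._writer = False    # True while a writer holds the lock

    def acquire_read(self, timeout=None) -> bool:
        with self._cond:
            ok = self._cond.wait_for(lambda: not self._writer, timeout)
            if ok:
                self._readers += 1
            return ok

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self, timeout=None) -> bool:
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not self._writer and self._readers == 0, timeout)
            if ok:
                self._writer = True
            return ok

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

With `timeout=0` the acquire methods act as non-blocking attempts, which makes the rule easy to check: two readers can hold the lock together, but a writer is refused until both have released it.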
19. • NoSQL Database:
Make the database more efficient.
• No relational model, only key-value.
• No index, but an algorithm.
• No SQL language.
• Easier to add machines.
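One common way to realize "no index, but an algorithm" and "easier to add machines" is consistent hashing: the machine that owns a key is computed from the key's hash, and adding a machine moves only some of the keys. The slides do not name this technique, so treat the following as one possible illustration; `HashRing`, the node names, and the choice of 100 virtual nodes per machine are assumptions.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring: each key belongs to the next node point clockwise."""

    def __init__(self, nodes, vnodes: int = 100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node) points
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Several virtual points per node spread its share around the ring.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        # First point strictly after the key's position, wrapping around.
        idx = bisect.bisect_right(self._ring, (_hash(key), "")) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user:42")
```

When a fourth node is added, every key that changes owner moves to the new node and all other keys stay where they were, so the existing machines need no reshuffling among themselves.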
20. • Parallel computing:
Process big data by divide and conquer.
• Google's MapReduce.
• Not a panacea: case study.
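The divide-and-conquer pattern behind MapReduce can be shown with the classic word count. The sketch below runs the three phases (map, shuffle, reduce) in a single process to make the data flow visible; it does not use Hadoop or any real MapReduce API, and the function names are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: combine the values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    # Each document could be mapped on a different machine; here it is a loop.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))


counts = word_count(["big data", "Big cloud"])
```

Because each map call sees only its own document and each reduce call sees only one key, both phases can be spread over many machines without coordination beyond the shuffle.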
21. • Virtual Machine:
Run multiple OSes on a single machine.
• Separate modules
to prevent bugs and viruses from spreading.
• Dynamically allocate resources.
22. • VLAN:
Regardless of physical location,
multiple machines operate as if in the same network domain.
• VLAN vs. VPN
• Group machines in different regions
as if in one LAN.
• Separate machines in the same LAN
into different groups,
mutually inaccessible.
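The grouping rule on this slide can be modeled very simply: two machines can reach each other when they carry the same VLAN ID, wherever they are physically. The sketch below is only a model of that rule, not network configuration; it ignores routing between VLANs, and the `Machine` type, the locations, and the VLAN IDs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    name: str
    location: str   # physical site or switch, irrelevant to reachability here
    vlan_id: int    # logical group the machine belongs to


def can_communicate(a: Machine, b: Machine) -> bool:
    """Same VLAN means same logical LAN, regardless of physical location."""
    return a.vlan_id == b.vlan_id


web_beijing = Machine("web-1", "Beijing", vlan_id=10)
web_shanghai = Machine("web-2", "Shanghai", vlan_id=10)
db_beijing = Machine("db-1", "Beijing", vlan_id=20)

same_group_far_apart = can_communicate(web_beijing, web_shanghai)   # True
same_site_other_group = can_communicate(web_beijing, db_beijing)    # False
```

The two results mirror the two bullets above: machines in different regions behave as one LAN, while machines on the same physical LAN are kept mutually inaccessible.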
23. • Traffic Monitoring and Network Topology.
Construct the entire cloud system.
24. • Future trends:
smaller, bigger, faster, easier.
• One chip with 48 CPUs.
• Data-center TCP.
• Cloud in RAM.
• Erlang, PigLatin:
languages for cloud computing.
26. • Syllabus.
• Core cloud techniques.
Understand principles.
• Learn how to use,
but not re-implement
(that is for advanced courses).
1. 2/28, 18:00 - 20:00, Tuesday,
Introduction to cloud computing: why cloud, what to do, and how to do it?
Homework: Construct a simple 3-tier website.
2. 3/6, 18:00 - 20:00, Tuesday,
Cluster-based scalable network services, SOA.
Homework: Learn to use Thrift and Memcached to implement a messaging system.
3. 3/13, 18:00 - 20:00, Tuesday,
Scalable file system, Google File System.
Homework: Learn to use the Swift file system.
4. 3/20, 18:00 - 20:00, Tuesday,
Distributed structured storage, Google Bigtable.
Homework: Learn to use Apache HBase.
5. 3/27, 18:00 - 20:00, Tuesday,
Invited seminar: Baidu.
6. 4/3, 18:00 - 20:00, Tuesday,
Distributed locking system, Paxos and Google Chubby.
Homework: Learn to use Apache ZooKeeper.
7. 4/10, 18:00 - 20:00, Tuesday,
Distributed NoSQL database.
Homework: Learn to use Apache Cassandra (originally developed at Facebook).
8. 4/17, 18:00 - 20:00, Tuesday,
Parallel computation, Google MapReduce.
Homework: Learn to use Hadoop MapReduce.
27. • Syllabus.
• Core cloud techniques.
Understand principles.
• Learn how to use,
but not re-implement
(that is for advanced courses).
9. 4/24, 18:00 - 20:00, Tuesday,
Invited seminar: Taobao.
10. 5/1, 18:00 - 20:00, Tuesday,
Virtual machines for dynamic resource allocation.
Homework: Learn to use KVM.
11. 5/8, 18:00 - 20:00, Tuesday,
Cloud security and VLAN.
Homework: TBD
12. 5/15, 18:00 - 20:00, Tuesday,
Invited seminar: EMC/VMware.
13. 5/22, 18:00 - 20:00, Tuesday,
Datacenter network topology and traffic management.
Homework: Learn to use ZooKeeper.
14. 5/29, 18:00 - 20:00, Tuesday,
Invited seminar: Google.
15. 6/5, 18:00 - 20:00, Tuesday,
Future trends:
Bigger: the datacenter as a warehouse-scale computer; the datacenter needs an OS.
Smaller: multicore CPUs and GPUs.
Faster: in-memory frameworks, Piccolo and Spark.
Easier: the Erlang and Pig Latin languages.
16. 6/12, 18:00 - 20:00, Tuesday,
Invited seminar: CloudValley.
28. • Invited seminars.
• Top cloud players will be your teachers.
• Diverse opinions, some deviating from theory,
and why?
• Scheduled for the mid-term and final exam periods,
with no homework!
29. • Homework.
• Homework: 50%
Mid-term exam: 20%
Final exam: 30%
• You will be able to build a cloud!
Not just Hadoop, but beyond.
30. • There are no stupid questions, but it is stupid not to ask!
• Ask a good question, and impress your professor and classmates!
• Sina Weibo (新浪微博): @北航云计算公开课