The dawn of digital businesses is upon us, with reimagined business models that make the best use of digital technologies such as automation, analytics, integration and cloud. Digital businesses are efficient, continuously optimizing, proactive, flexible and are able to fully understand their customers. Analytics is a key technology that helps in doing so. It acts as the eyes and ears of the system and provides a holistic view on the past and present so that decision-makers can predict what will happen in the future. This webinar will explore
Why becoming a digital business is not a choice
The role of analytics in digital transformation with examples
How best to leverage state of the art analytics technology
We are at the dawn of digital businesses, that are reimagined to make the best use of digital technologies such as automation, analytics, cloud, and integration. These businesses are efficient, continuously optimizing, proactive, flexible and able to understand customers in detail. A key part of a digital business is analytics: the eyes and ears of the system that tracks and provides a detailed view on what was and what is and lets decision makers predict what will be.
This session will explore how the WSO2 analytics platform
Plays a role in your digital transformation journey
Collects and analyzes data through batch, real-time, interactive and predictive processing technologies
Lets you communicate the results through dashboards
Brings together all analytics technologies into a single platform and user experience
SoC Keynote: The State of the Art in Integration Technology, by Srinath Perera
This talk outlines the state of the art of enterprise software and how we got there, as I see it. The second part describes Ballerina, a new programming language WSO2 has built for enterprise computing.
It was presented as a keynote at the 11th Symposium and Summer School on Service-Oriented Computing.
Predictive Analytics - Big Data & Artificial Intelligence, by Manish Jain
A quick overview of the latest in big data and artificial intelligence. A lot of buzzwords are being thrown around; hopefully this presentation will demystify many of the terms.
The Rise of Streaming SQL and Evolution of Streaming Applications, by Srinath Perera
First-generation stream processors, such as Apache Storm, wanted us to write code. It was a great start. However, when building real-world apps, which are used for a long time and evolve, writing code gets us into trouble.
If we want to query a database or query data stored in storage with Hadoop, we use SQL. Why can't we query data streaming using SQL? We can. Almost all open source stream processors, including Storm, Flink, and Kafka, have switched to SQL.
In this webinar, Srinath will talk about the evolution of stream processing, streaming SQL, the status quo, and what this means to stream applications. He will also dissect the experience of building streaming applications by exploring common patterns and pitfalls.
What is Big Data? What is Data Science? What are the benefits? How will they evolve in my organisation?
Built around the premise that the investment in big data is far less than the cost of not having it, this presentation, made at a tech media industry event, unveils and explores the nuances of Big Data and Data Science and their synergy forming Big Data Science. It highlights the benefits of investing in it and defines a path to their evolution within most organisations.
This presentation is prepared by one of our renowned tutors, "Suraj".
If you are interested in learning more about Big Data, Hadoop, and Data Science, then join our free Introduction class on 14 Jan at 11 AM GMT. To register your interest, email us at info@uplatz.com
A look back at how the practice of data science has evolved over the years, modern trends, and where it might be headed in the future. Starting from before anyone had the title "data scientist" on their resume, to the dawn of the cloud and big data, and the new tools and companies trying to push the state of the art forward. Finally, some wild speculation on where data science might be headed.
Presentation given to Seattle Data Science Meetup on Friday July 24th 2015.
Everybody has heard of Big Data, and its promise as the next great frontier for innovation. However, Big Data is neither new nor easily defined. What are the key drivers that make Big Data so critically important today? What is the single idea behind Big Data that promises such game changing outcomes for capable organizations? Who are the skilled talent that deliver Big Data results?
This presentation briefly reviews the opportunities, motivation and trends that are driving Big Data disruption. Data science is introduced as the enabling engine for Big Data transformation via the creation of new Data Products. The data scientist is defined and his tools, workflow and challenges are reviewed. Finally, practical tips are presented for approaching data product development.
Key takeaways include:
- Big Data disruption is driven by four megatrends
- Data is the essential raw material for creating valuable Data Products
- Data scientists are heterogeneous by role & skill set, but share common tools, workflows and challenges
- Data science talent is more important than raw data for Big Data success
These slides are modified from an invited presentation for the Gwinnett Chamber of Commerce on March 18, 2014. An excerpt was presented at the Georgia Pacific Social Media Working Session on March 19, 2014.
Predictive Analysis for Airbnb Listing Rating using Scalable Big Data Platform, by Savita Yadav
KMIS International Conference 2021.
This talk aims to provide insights and performance of predictive models for Airbnb Rating using Big Data and distributed parallel computing systems. We have predicted and classified using Two-Class Classification models if a property has a high or a low rating based on the features of the listing. It helps the hosts to know if their property is suitable and how their listing compares to other similar listings. We compare the results and the performance of rating prediction models with accuracy and computing time metrics.
Traffic Data Analysis and Prediction using Big Data, by Jongwook Woo
- Denser traffic on Freeways 101, 405, 10
- Rush hours from 7 am to 9 am produce a lot of traffic; the heaviest traffic starts at 3 pm and eases after 6 pm.
- Major areas of traffic in DTLA, Santa Monica, Hollywood
- More insights can be found with a bigger dataset using this framework for analysis of traffic
- Using such data and platform can also give an opportunity to predict traffic congestion. Prediction can be performed using a machine learning algorithm – Decision Forest – with an accuracy of 83% for predicting the heaviest traffic jam.
An individual project completed for the Macroeconomics module, analysing the trends in Sri Lanka's inflation rates during recent years.
Sri Lanka is a republic and a unitary state which is governed by a semi-presidential system with its official seat of government in Sri Jayawardenapura - Kotte, the capital.
The country is famous for the production and export of tea, coffee, coconuts, rubber and cinnamon, the last of which is native to the country.
The natural beauty of Sri Lanka has led to the title The Pearl of the Indian Ocean. The island is laden with lush tropical forests, white beaches and diverse landscapes with rich biodiversity.
Sri Lanka's rich culture can be attributed to the many different communities on the island
Sri Lanka is a founding member state of SAARC and a member of the United Nations, Commonwealth of Nations, G77 and Non-Aligned Movement. As of 2010, Sri Lanka was one of the fastest growing economies of the world. Its stock exchange was Asia's best performing stock market during 2009 and 2010.
Objectives: 1. Gain an understanding of key trends in ICT innovation which are influencing/disrupting crisis informatics. 2. Be able to trace these trends through discussions later this semester, and understand their influence and potential. 3. Introduce visualization lab
Geospatial Intelligence Middle East 2013_Big Data_Steven Ramage, by Steven Ramage
Some initial considerations and discussion points around geospatial big data. Location adds context and relevance. Need to consider a number of V factors including Value.
Slides of my presentation at 9th Amirkabir Linux & Open-source Softwares Festival, about Big Data Computing Platforms and the rise of the so-called "Fast Data" phenomenon, and the architectures and state-of-the-art platforms for dealing with them.
Filtering From the Firehose: Real Time Social Media Streaming, by Cloud Elements
All Things Cloud Developer Meetup.
Filtering From the Firehose: Real Time Social Media Streaming with Jim Moffitt from Gnip. Gnip is the world's largest and most trusted provider of social data.
Learn about collecting and filtering social media data with streaming APIs. Jim will cover best practices, use case examples and live demos of filtering data from Twitter.
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A..., by BigData_Europe
Slides for keynote talk at the Big Data Europe workshop nr 3 on 11.9.2017 in Amsterdam co-located with SEMANTiCS2017 conference by Ron Dekker, Director CESSDA: European Open Science Agenda: where we are and where we are going?
Abstract:
Big Data concern large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and the data collection capacity, Big Data are now rapidly expanding in all science and engineering domains, including physical, biological and biomedical sciences. This paper presents a HACE theorem that characterizes the features of the Big Data revolution, and proposes a Big Data processing model, from the data mining perspective. This data-driven model involves demand-driven aggregation of information sources, mining and analysis, user interest modeling, and security and privacy considerations. We analyze the challenging issues in the data-driven model and also in the Big Data revolution.
Similar to What Open Data and Open Source can do for Sri Lanka? (20)
Book: Software Architecture and Decision-Making, by Srinath Perera
Uncertainty is the leading cause of mistakes made by practicing software architects. The primary goal of architecture is to handle uncertainty arising from use cases as well as architectural techniques. The book discusses how to make architectural decisions and manage uncertainty. From the book, you will learn common problems encountered while designing a system, a default solution for each, more complex alternatives, and 5Q & 7P (Five Questions and Seven Principles) that help you choose.
Book, https://amzn.to/3v1MfZX
Blog: http://tinyurl.com/swdmblog
Six min video - https://youtu.be/jtnuHvPWlYU
We have critically evaluated how AI will shape integration use cases, their feasibility, and timelines. Emerging Technology Analysis Canvas (ETAC), a framework built to analyze emerging technologies, is the methodology of our study.
We observe that AI can significantly impact integration use cases and identify 13 AI-based use case classes for integration. Points to note include:
Enabling AI in an enterprise involves collecting, cleaning up, and creating a single representation of data as well as enforcing decisions and exposing data outside, each of which leads to many integration use cases. Hence, AI indirectly creates demand for integration.
AI needs data, which in some cases leads to significant competitive advantages. The need to collect data would drive vendors to offer most AI products in the cloud through APIs.
Due to lack of expertise and data, custom AI model building will be limited to large organizations. It is hard for small and medium-sized organizations to build and maintain custom models.
The Role of Blockchain in Future Integrations, by Srinath Perera
We have critically evaluated blockchain-based integration use cases, their feasibility, and timelines. Emerging Technology Analysis Canvas (ETAC), a framework built to analyze emerging technologies, is the methodology of our study. Based on our analysis, we observe that blockchain can significantly impact integration use cases.
In our paper, we identify 30-plus blockchain-based use cases for integration and four architecture patterns. Notably, each use case we identified can be implemented using one of the architecture patterns. Furthermore, we also discuss challenges and risks posed by blockchains that would affect these architecture patterns.
Our webinar presents a critical analysis of serverless technology and our thoughts about its future. We use Emerging Technology Analysis Canvas (ETAC), a framework built to analyze emerging technologies, as the methodology of our study. Based on our analysis, we believe that serverless can significantly impact applications and software development workflows.
We’ve also made two further observations:
Limitations, such as tail latencies and cold starts, are not deal breakers for adoption. There are significant use cases that can work with existing serverless technologies despite these limitations.
We see a significant gap in required tooling and IDE support, best practices, and architecture blueprints. With proper tooling, it is possible to train existing enterprise developers to program with serverless. If proper tools are forthcoming, we believe serverless can cross the chasm in 3-5 years.
A detailed analysis can be found here: A Survey of Serverless: Status Quo and Future Directions. Join our webinar as we discuss this study, our conclusions, and evidence in detail.
1. Blockchain's potential impact is real. If successful, blockchain technologies can transform the way we live our day-to-day lives.
2. We believe the technology is ready for limited applications in Digital Currency, Lightweight financial systems, Ledgers (of identity, ownership, status, and authority), Provenance (e.g. supply chains and other B2B scenarios) and Disintermediation, which we believe will happen in the next three years.
3. However, with other use cases, blockchain faces significant challenges such as performance, irrevocability, need for regulation and lack of consensus mechanisms. These are hard problems and
4. It is not clear whether blockchain can sustain the current level of effort for extended period of 5+ years. There are many startups and they run the risk of running out of money before markets are ready. Failure of startups can inhibit further funding and investments.
5. Value and need of decentralization compared to centralized and semi-centralized alternatives is not clear.
A Visual Canvas for Judging New Technologies, by Srinath Perera
In the fast-changing technology world, the technology landscape shifts faster and faster. The agents of these changes are new emerging technologies, which sometimes even create, destroy, or transform segments. In a shifting world, prevailing advantages are fleeting. Organizations that can master change and ride technology waves own the future.
Not all emerging technologies live up to their promise. Every year, as a part of annual planning, most organizations need to decide the relevance, impact, and probability of success of emerging technologies and pick their bets. Although it is a regular decision, there is no widely accepted framework for evaluating emerging technologies.
As a solution to this problem, we present the “Emerging Technology Analysis Canvas” (ETAC), a framework to assess an individual emerging technology. Inspired by the Business Model Canvas, it represents different aspects of a technology visually on a single page. This approach includes a set of questions that probe the technology, arranged around a logical narrative. The visual representation is concise, compact, and comprehensible at a glance.
The talk discusses how analytics can attack privacy and what we can do about it. It discusses the legal responses (e.g. GDPR) as well as technical responses (differential privacy and homomorphic encryption).
The video is in https://www.facebook.com/eduscopelive/videos/314847475765297/ from 1.18.
Blockchain is often cited as one of the most impactful technologies, along with AI. It has attracted many startups, venture investments, and academic research. If successful, blockchain technologies can transform the way we live our day-to-day lives.
However, blockchain faces significant challenges such as performance, irrevocability, need for regulation and lack of consensus mechanisms. They are hard problems, and it will likely take at least 5-10 years to find answers to them.
Given the risk involved as well as the significant potential returns, we recommend a cautiously optimistic approach for blockchain with the focus on concrete use cases.
Today's Technology and Emerging Technology Landscape, by Srinath Perera
We have seen the rise and fall of many technologies, some disappearing without a trace while others redefining the world. Collectively they have shaped our world beyond recognition. In this talk, Srinath will start with past technologies exploring their behavior. Then he will explore current middleware landscape, its composition, and relationships between different segments. He will discuss significant developments and discuss their future. Further, he will discuss emerging technologies, forces that shape them, and the promise of each technology, and finally, speculate about their evolution. You will walk away with knowledge on the evolution of middleware, the status quo, and discussion about how, at WSO2, we think those technologies will evolve.
Some died, some get by, but some have woven themselves to today's middleware so much that we do not notice them. The point I want to make is that not all emerging technologies are fads. Some are, and some are too early, like AI. But some are lasting.
Analytics and AI: The Good, the Bad and the Ugly, by Srinath Perera
Analytics lets us question the data, which in effect questions the world around us. This lets us understand, monitor, and shape the world. AI lets us discover connections, predict possible futures and automate tasks.
These twin technologies can change the world around us. On one hand, they can make us efficient, connected, and fulfilled. At the same time, the change of status quo can replace jobs, affect lives and build biases into our systems that can marginalize millions.
In this talk, we will discuss core ideas behind analytics and AI, their possible impact, both good and bad outcomes, and challenges.
How can we filter the truth from lies and the complex shades between the two? In a time of data avalanche, this is a skill that serves both our careers and our lives.
In this talk, we will discuss where to find information, the importance of sources, understanding bias and conflicts of interests, and finally how to communicate our conclusions with their associated confidence.
Machine learning, or predictive analytics, has started entering our daily lives. Businesses and enterprises can use predictive analytics to improve efficiency, improve user experience, and create new business opportunities. This talk will present WSO2 Machine Learner, our experiences of predicting Super Bowl winners, and a few real-life use cases. Furthermore, the talk will discuss open challenges and problems people are working on.
Introduction to WSO2 Data Analytics Platform, by Srinath Perera
WSO2 has had several analytics products (or Big Data products, if you prefer the term) for some time: WSO2 BAM and WSO2 CEP. We have added WSO2 Machine Learner, a product to create, evaluate, and deploy predictive models, and renamed WSO2 BAM to WSO2 DAS (Data Analytics Server).
The platform lets you publish (collect) data once and process it through batch (Spark) and real-time (CEP) processing, search the data (Lucene), and build machine learning models.
This post describes how all of those fit into a single story.
For more information, see https://iwringer.wordpress.com/2015/03/18/introducing-wso2-analytics-platform-note-for-architects/
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will come present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs. This meetup was formerly Milvus Meetup, and is sponsored by Zilliz, maintainers of Milvus.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ..., by Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, ai, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
3. Success Stories
• Moneyball (baseball drafting)
• Nate Silver predicted outcomes in 49 of the 50 states in the 2008 U.S. Presidential election
• Cancer detection from biopsy cells (Big Data found 12 patterns while we only knew 9), http://go.ted.com/CseS
• Bristol-Myers Squibb reduced the time it takes to run clinical trial simulations by 98%
• Xerox used big data to reduce the attrition rate in its call centers by 20%.
• Kroger Loyalty programs (growth in 45 consecutive quarters)
4. If you collect data about your business, and feed it to a Big Data system, you will find useful insights that will provide competitive advantage
– (e.g. Analysis of data sets can find new correlations to “spot business trends, prevent diseases, combat crime and so on”. [Wikipedia])
5. Putting Analytics to Work
What happened? And why? (Hindsight)
What is happening right now? (Oversight)
What will happen? (Foresight)
6. Open Source Market Share
• Apache (60%)
• Linux (Servers 16%)
• Firefox (25%)
• Tomcat and most of middleware
• Android (43%)
• Even Microsoft is looking favorably at open source projects
• There are a lot of open source projects bundled inside proprietary products
Copyright kafka4prez and licensed for reuse under CC License, http://www.flickr.com/photos/kafka4prez/198465913
7. What is Open Source?
• Most commercial software does not distribute its source code, and is developed and managed in a closed world.
• The idea of open source is to have the code in the open, and to improve it through volunteer contributions using “open development”
• The idea is that the project becomes an ecosystem
– More ideas
– More developers
– More testers
– More bug fixers
“There is no delight in owning anything unshared.” Seneca (Roman philosopher, mid-1st century AD)
8. How does Open Source Work?
• Open code repository (SVN, Git, etc.)
• Two parts of the community
– Developer Community
– User Community
• Communication through mailing lists / IRC channel
– Developer mailing list
– User mailing list
• Bug tracking database to track errors (Jira, Bugzilla)
• People submit improvements as patches through Jira etc.
Committers have write access to the repository
Committers review and apply patches, and when you submit a lot of them, they will make you a committer.
9. History of Opensource
• 1970s - UNIX, Emacs
• 1984-85 - GNU project and Free Software Foundation
• 1990 - GNU project almost complete ... well, not the OS
• 1991 - Linus Torvalds announces Linux; Python
• 1993 - NetBSD and FreeBSD
• 1994-95 - Linux 1.0 released
• 1995 - Apache, KDE, PHP
• 1997 - GNOME
• 1999 - Linux 2.2, OpenOffice
• 2003 - Firefox, Android
http://www.geograph.org.uk/photo/916456
http://www.fotopedia.com/items/flickr-3320704544
10. Why People Contribute?
• As a way to improve your profile (looking for a job)
• To gain experience
• To work with “like minded” people
• To be part of something bigger
• To be a “Geek”
• As a job: if you are a well-known open source developer, chances are that you will get paid for your contributions
• As a competitive strategy
Copyright U. S. Fish and Wildlife Service and licensed for reuse under CC License, http://www.flickr.com/photos/usfwsnortheast/4754624921 and Copyright WxMom and licensed for reuse under CC License, http://www.flickr.com/photos/wxmom/1359996991.
11. LKA Success Stories
• Sahana
• Apache Axis2 and other projects
http://www.geograph.org.uk/photo/1842872
12. Why People use Open Source Software?
• It is cheaper
• It is better
• Because it is open source (religiously)
• More visibility into the code, better security, auditing
• If there is a problem, I can fix it
• More control over releases, roadmap
• Patches become available faster
• Easy to understand how it works
• Can fork the code if needed
• Not owned by one person, so there is less risk in depending on it.
• Do not have to maintain the code
13. Big Data and Opensource
Most Big Data tools are free
Even the state of the art is being released as open source
Gives countries like Sri Lanka a unique opportunity with a level playing field
14. Open Data
Make the data public
Advanced form of the RTI act
Opensource idea applied to data science
E.g. programs like “Code for America”
15. Code Red: US healthcare.gov Rescue
A $300M project was failing; a small group of volunteers went into hackathon mode to fix it, and fixed it.
See
http://radar.oreilly.com/2014/03/code-red_-they-have-no-use-for-someone-who-looks-and-dresses-like-me.html
http://content.time.com/time/magazine/article/0,9171,2166770-1,00.html
16. Filtering Information with Big Data
Big Data can filter information (e.g. spam)
Rank information (show most relevant articles)
Find anomalies (detect fraud)
Make recommendations (product recommendations)
Handle reputations (e.g. eBay, Amazon)
George Caleb Bingham, 1846
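As an illustration of the "find anomalies (detect fraud)" item above, here is a minimal Python sketch. It is not taken from the talk: the payment amounts are made up, and the modified z-score based on the median absolute deviation (with the conventional 0.6745 constant and 3.5 threshold) is only one simple technique among many that a real fraud system might use.

```python
from statistics import median


def find_anomalies(amounts, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold.

    Uses the median and the median absolute deviation (MAD), which are
    less distorted by the outliers themselves than mean and std-dev.
    """
    med = median(amounts)
    mad = median(abs(x - med) for x in amounts)
    if mad == 0:
        # All values (nearly) identical: nothing can be flagged this way.
        return []
    return [x for x in amounts if 0.6745 * abs(x - med) / mad > threshold]


# Hypothetical payment amounts; one is clearly out of line with the rest.
payments = [120, 95, 130, 110, 105, 98, 5000, 115]
anomalies = find_anomalies(payments)
print(anomalies)
```

With the sample data, only the 5000 payment is flagged; at real scale the same idea would run over far more records and usually per customer or per merchant.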
17. Example: Reddit, Hacker News (Ranking)
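A rough sketch of how such news ranking can work, in Python. The slides do not give a formula; the time-decay score below follows a commonly cited approximation of Hacker News-style ranking, and the `gravity` value of 1.8, the `Story` type, and the sample stories are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    points: int
    age_hours: float


def rank_score(points, age_hours, gravity=1.8):
    """Votes count for less as the story ages (time-decayed score)."""
    return max(points - 1, 0) / (age_hours + 2) ** gravity


def rank(stories):
    """Sort stories from highest to lowest time-decayed score."""
    return sorted(
        stories,
        key=lambda s: rank_score(s.points, s.age_hours),
        reverse=True,
    )


# Hypothetical stories to rank.
stories = [
    Story("Old but popular", points=500, age_hours=48),
    Story("Fresh and rising", points=40, age_hours=1),
    Story("Brand new", points=2, age_hours=0.5),
]
ranked_titles = [s.title for s in rank(stories)]
print(ranked_titles)
```

The design point is that a recent story with modest votes can outrank an old story with many votes, which keeps the front page fresh.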
Keep Your Customers
Get New Customers
Improve Operations
Monetize your data
19. Urban Planning and Policy Decisions
• Urban Planning
– People distribution
– Mobility
– Waste Management
– Parking
• Policy Decision
– What if we change the minimum wage?
– What is the economic impact of a new law?
By Aqwis - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=6810430
20. Example: Big Data for Development
• Done using CDR data
• People density noon vs. midnight (red => increased, blue => decreased)
From: http://lirneasia.net/2014/08/what-does-big-data-say-about-sri-lanka/
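The noon-versus-midnight comparison above can be sketched in a few lines of Python. This is a toy illustration, not the method used in the cited study: the tower names and records are invented, and it assumes "increased" means more activity at noon than at midnight. A real analysis would count distinct anonymised subscribers rather than raw records.

```python
from collections import Counter

# Hypothetical, anonymised call records: (tower_id, hour_of_day).
records = [
    ("colombo_fort", 12), ("colombo_fort", 12), ("colombo_fort", 12),
    ("colombo_fort", 0),
    ("suburb_a", 12), ("suburb_a", 0), ("suburb_a", 0), ("suburb_a", 0),
    ("suburb_b", 12), ("suburb_b", 0),
]


def density_change(records, day_hour=12, night_hour=0):
    """Compare activity per tower at noon against midnight.

    Returns {tower: (difference, colour)} where red means more activity
    at noon and blue means less, mirroring the slide's legend.
    """
    day = Counter(t for t, h in records if h == day_hour)
    night = Counter(t for t, h in records if h == night_hour)
    changes = {}
    for tower in sorted(set(day) | set(night)):
        diff = day[tower] - night[tower]
        if diff > 0:
            colour = "red"
        elif diff < 0:
            colour = "blue"
        else:
            colour = "unchanged"
        changes[tower] = (diff, colour)
    return changes


changes = density_change(records)
for tower, (diff, colour) in changes.items():
    print(tower, diff, colour)
```

In this toy data the business district turns red at noon and the residential suburb turns blue, which is the commuting pattern a density map of this kind is meant to reveal.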
21. Traffic
A lot of us waste time in traffic
Know where the traffic is (Google Traffic does that)
Emergency Response
Know the traffic patterns
Long term planning
22. Manage Donors and Charities
Sri Lanka donates a lot (even the poorest)
Does the money go to the intended place?
Can we track how money is spent?
https://iwringer.files.wordpress.com/2015/09/traffic2.jpg?w=656
23. Day to day Maintenance
Are newspapers the best way to get day-to-day things done?
Can crowdsourcing help?
How to stop false tickets?
24. Disease spread
Earlier Malaria and now dengue
Know current situation
Know overall trends (focus on problematic areas)
Emergency Response
25. Summary
• There is a lot that open source, open data, and Big Data can do for Sri Lanka
• Some cases need money, and might be beyond us
• But that is not so for many cases
– e.g. Sahana
– Hackathon to build an app to decide what topics to take up in the parliament
• What we really need is collaboration between domain experts and computer scientists