"From Big Data to Smart data"
Jie (Jack) Yang, Associate Research Fellow, SMART Infrastructure Facility, presented a summary of his research as part of the SMART Seminar Series on 28 April 2016.
For more information, visit the event page at: http://smart.uow.edu.au/events/UOW212890.html.
This video includes:
Purpose of Data Science, Role of Data Scientist, Skills required for Data Scientist, Job roles for Data Scientist, Applications of Data Science, Career in Data Science.
Data Science Innovations : Democratisation of Data and Data Science suresh sood
Data Science Innovations : Democratisation of Data and Data Science covers the opportunity of citizen data science lying at the convergence of natural language generation and discoveries in data made by the professions, not data scientists.
Knowledge Architecture: Combining Strategy, Data Science and Information Arch... Connected Data World
"The most important contribution management needs to make in the 21st Century is to increase the productivity of knowledge work and the knowledge worker", said Peter F. Drucker in 1999, and time has proven him right.
Even NASA is no exception, as it faces a number of challenges. NASA has hundreds of millions of documents, reports, project data, lessons learned, scientific research, medical analysis, geospatial data, IT logs, and all kinds of other data stored nation-wide.
The data is growing in terms of variety, velocity, volume, value and veracity. NASA needs to provide accessibility to engineering data sources, whose visibility is currently limited. To convert data to knowledge a convergence of Knowledge Management, Information Architecture and Data Science is necessary.
This is what David Meza, Acting Branch Chief - People Analytics, Sr. Data Scientist at NASA, calls "Knowledge Architecture": the people, processes, and technology of designing, implementing, and applying the intellectual infrastructure of organizations.
From Data Platforms to Dataspaces: Enabling Data Ecosystems for Intelligent S... Edward Curry
The Real-time Linked Dataspace (RLD) is an enabling platform for data management for intelligent systems within smart environments that combines the pay-as-you-go paradigm of dataspaces, linked data, and knowledge graphs with entity-centric real-time query capabilities.
The RLD contains all the relevant information within a data ecosystem including things, sensors, and data sources and has the responsibility for managing the relationships among these participants.
It manages sources without presuming a pre-existing semantic integration among them using specialised dataspace support services for loose administrative proximity and semantic integration for event and stream systems. Support services leverage approximate and best-effort techniques and operate under a 5 star model for “pay-as-you-go” incremental data management.
Personalized News and Video Recommendation System at LinkSure Leanne Hwee
In recent years, the Internet industry has shifted more and more towards digital content distribution through online services. This presentation provides an overview of the overall system design and architecture of LinkSure News and Video Recommendations, the challenges encountered in practice, and the lessons learned from the production deployment of these systems at LinkSure. Specifically, we will highlight how news selection and personalisation of recommendations are formulated and addressed at LinkSure. By presenting our experiences in applying techniques at the intersection of recommender systems, information retrieval, machine learning, and statistical modelling in a large-scale industrial setting and highlighting the open problems, we hope to stimulate further research and collaborations.
Data Science Courses - BigData VS Data Science DataMites
Go through the slides to learn what Big Data is, what Data Science is, and the difference between the two.
DataMites is a global institute providing industry-aligned courses in Data Science, Machine Learning, and Artificial Intelligence.
The Certified Data Scientist certification offered by DataMites covers all the important aspects of data science knowledge. The course is designed around accepted standards that demonstrate the quality of a data science professional's knowledge.
For more details please visit: https://datamites.com/data-science-course-training-chennai/
Big Data Analytics : Understanding for Research Activity Andry Alamsyah
Big Data Analytics Presentation at International Workshop Colloquium Exploring Research Opportunity. School of Business and Management (SBM) - ITB. Bandung, 8 August 2019.
Big Data: Beyond the hype, Delivering value Edward Curry
Big Data: Beyond the hype, Delivering value explains Big Data technology and how it is transforming industry and society to members of the IDEAL-IST project.
IDEAL-IST is an international ICT (Information and Communication Technologies) network, with more than 65 ICT national partners from EU and Non-EU Countries. It assists ICT companies and research organizations worldwide wishing to find project partners for a participation in the Horizon 2020 program of the European Commission.
Convergence Partners has released its latest research report on big data and its meaning for Africa. The report argues that big data poses a threat to those it overlooks, namely a large percentage of Africa’s populace, who remain on big data’s periphery.
Metadata is "data that provides information about other data". In other words, it is "data about data". Many distinct types of metadata exist, including descriptive metadata, structural metadata, administrative metadata, reference metadata, statistical metadata and legal metadata.
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong...IT Network marcus evans
Bigger and Better: Employing a Holistic Strategy for Big Data toward a Strong Value-Adding Proposition
by Patrick Hadley, Australian Bureau of Statistics at the Australian CIO Summit 2014
The Evidence Hub: Harnessing the Collective Intelligence of Communities to Bu... Anna De Liddo
Presentation to the Large-Scale Idea Management and Deliberation Systems Workshop @ 6th International Conference on Communities and Technologies C&T2013
June 29, 2013
Munich, Germany
Attend The Data Science Course in Bangalore From ExcelR. Practical Data Science Course in Bangalore Sessions With Assured Placement Support From Experienced Faculty. ExcelR Offers The Data Science Course in Bangalore.
Students involved in the PetaJakarta.org Pilot Study Program shared their research experiences during a special presentation session at SMART Infrastructure Facility on Wednesday, 25th March 2015.
Classifying malicious websites using an ensemble weighted features Dharmendra Vishwakarma
Research Project - Master's in Data Analytics
Different statistical and machine learning techniques learned as part of the Data Analytics coursework are applied in this thesis project to detect malicious web pages.
These slides were used at the first Aarhus Follower Group meet-up for the EU-funded project IoTCrawler. They include an introduction to the project as well as a more in-depth presentation of the difference between web search and Internet of Things (IoT) search and of the development of the Internet of Things. Some of the scenarios from the project are also presented.
Data Science, Personalisation & Product management Bhaskar Krishnan
Does Data Matter?
Why are we discussing Data & Data Science?
Why is it relevant to Product Management?
What is Identity?
How do we understand users?
How do we Personalise user experiences?
What is Risk and Trust & Safety?
Advanced Analytics and Data Science Expertise SoftServe
An overview of SoftServe's Data Science service line.
- Data Science Group
- Data Science Offerings for Business
- Machine Learning Overview
- AI & Deep Learning Case Studies
- Big Data & Analytics Case Studies
Visit our website to learn more: http://www.softserveinc.com/en-us/
Building a healthy data ecosystem around Kafka and Hadoop: Lessons learned at... Yael Garten
2017 StrataHadoop SJC conference talk. https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/56047
Description:
So, you finally have a data ecosystem with Kafka and Hadoop both deployed and operating correctly at scale. Congratulations. Are you done? Far from it.
As the birthplace of Kafka and an early adopter of Hadoop, LinkedIn has 13 years of combined experience using Kafka and Hadoop at scale to run a data-driven company. Both Kafka and Hadoop are flexible, scalable infrastructure pieces, but using these technologies without a clear idea of what the higher-level data ecosystem should be is perilous. Shirshanka Das and Yael Garten share best practices around data models and formats, choosing the right level of granularity of Kafka topics and Hadoop tables, and moving data efficiently and correctly between Kafka and Hadoop and explore a data abstraction layer, Dali, that can help you to process data seamlessly across Kafka and Hadoop.
Beyond pure technology, Shirshanka and Yael outline the three components of a great data culture and ecosystem and explain how to create maintainable data contracts between data producers and data consumers (like data scientists and data analysts) and how to standardize data effectively in a growing organization to enable (and not slow down) innovation and agility. They then look to the future, envisioning a world where you can successfully deploy a data abstraction of views on Hadoop data, like a data API as a protective and enabling shield. Along the way, Shirshanka and Yael discuss observations on how to enable teams to be good data citizens in producing, consuming, and owning datasets and offer an overview of LinkedIn’s governance model: the tools, process and teams that ensure that its data ecosystem can handle change and sustain #DataScienceHappiness.
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha... Shirshanka Das
Richard Skarbez presented a seminar titled "Cognitive Illusions in Virtual Reality: What do I mean? And why should you care?" as part of the SMART Seminar Series on the 4th March 2019.
More information:
https://news.eis.uow.edu.au/event/cognitive-illusions-in-virtual-reality-what-do-i-mean-and-why-should-you-care/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility
Dr Ricardo Peculis presented a seminar titled "Trusted Autonomous Systems as System of Systems" as part of the SMART Seminar Series on 19th February 2019.
More information:
https://news.eis.uow.edu.au/event/trusted-autonomous-systems-as-system-of-systems/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility
David Kennewell presented a seminar titled "The Evolution of the Metric System: From Precious Lumps of Metal to Constants of Nature" as part of the SMART Seminar Series on 1st November 2018.
More information:
https://news.eis.uow.edu.au/event/the-evolution-of-the-metric-system-from-precious-lumps-of-metal-to-constants-of-nature/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility
Dr Ilya Budovsky presented a seminar titled "The Evolution of the Metric System: From Precious Lumps of Metal to Constants of Nature" as part of the SMART Seminar Series on 1st November 2018.
More information:
https://news.eis.uow.edu.au/event/the-evolution-of-the-metric-system-from-precious-lumps-of-metal-to-constants-of-nature/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Dr Johan Barthelemy presented a seminar titled "Using AI and edge computing devices for traffic flow monitoring" as part of the SMART Seminar Series on 11th October 2018.
More information: https://news.eis.uow.edu.au/event/using-ai-and-edge-computing-devices-for-traffic-flow-monitoring/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Prof Willy Susilo presented a seminar titled "Blockchain and its Applications" as part of the SMART Seminar Series on 20th September 2018.
More information: https://news.eis.uow.edu.au/event/blockchain-and-its-applications/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Prof Thierry Monteil & Fabian Ho presented a seminar titled "From an IoT cloud based architecture to Edge for dynamic service" as part of the SMART Seminar Series on 24th August 2018.
More information: https://news.eis.uow.edu.au/event/from-an-iot-cloud-based-architecture-to-edge-for-dynamic-service/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Dr Bobby Du and Paul-Antonin Dublanche presented a seminar titled "Is bus bunching serious in Sydney? Preliminary findings based on Opal card data analysis" as part of the SMART Seminar Series on 2nd August 2018.
More information: https://news.eis.uow.edu.au/event/is-bus-bunching-serious-in-sydney-preliminary-findings-based-on-opal-card-data-analysis/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Dr Nicolas Verstaevel presented a seminar titled "Keep it SMART, keep it simple! – Challenging complexity with self-organising software" as part of the SMART Seminar Series on 24th July 2018.
More information: https://news.eis.uow.edu.au/event/keep-it-smart-keep-it-simple-challenging-complexity-with-self-organising-software/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Dr Boulent Imam presented a seminar titled "Risk-based bridge assessment under changing load-demand and environmental conditions" as part of the SMART Seminar Series on 17th July 2018.
More information: https://news.eis.uow.edu.au/event/risk-based-bridge-assessment-under-changing-load-demand-and-environmental-conditions/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Dr Rohan Wickramasuriya presented a seminar titled "Deep Learning: Fundamentals and Practice" as part of the SMART Seminar Series on 29th May 2018.
More information: http://www.uoweis.co/event/deep-learning-fundamentals-and-practice/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Dr Sarah Dunn presented a seminar titled "Infrastructure Resilience: Planning for Future Extreme Events" as part of the SMART Seminar Series on 12th April 2018.
More information: http://www.uoweis.co/event/infrastructure-resilience-planning-for-future-extreme-events/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Dr George Grozev presented a seminar titled "Potential use of drones for infrastructure inspection and survey" as part of the SMART Seminar Series on 27th March 2018.
More information: http://www.uoweis.co/event/potential-use-of-drones-for-infrastructure-inspection-and-survey/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Professor Timoteo Carletti presented a seminar titled "A journey in the zoo of Turing patterns: the topology does matter" as part of the SMART Seminar Series on 8th March 2018.
More information: http://www.uoweis.co/event/a-journey-in-the-zoo-of-turing-patterns-the-topology-does-matter/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Dr Carole Adam presented a seminar titled "Human behaviour modelling and simulation for crisis management" as part of the SMART Seminar Series on 1st March 2018.
More information: http://www.uoweis.co/event/human-behaviour-modelling-and-simulation-for-crisis-management/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Professor Graham Harris presented a seminar titled "Dealing with uncertainty: With the observer in the loop" as part of the SMART Seminar Series on 13th February 2018.
More information: http://www.uoweis.co/event/dealing-with-uncertainty-with-the-observer-in-the-loop/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Senior Professor Pascal Perez presented on Smart Cities; The Good, The Bad & The Ugly as part of the SMART Seminar Series on 30th January 2018.
More information: http://www.uoweis.co/event/smart-cities-the-good-the-bad-the-ugly/
Keep updated with future events: http://www.uoweis.co/events/category/smart-infrastructure-facility/
Visiting PhD student, Morgane Dumont presented on how to improve the order of evolutionary models in agent-based simulations for population dynamics as part of the SMART Seminar Series on 15 December 2017.
More information: http://www.uoweis.co/event/how-to-improve-the-order-of-evolutionary-models-in-agent-based-simulations-for-population-dynamics/
Keep updated with future events: http://www.uoweis.co/tag/smart-infrastructure/
Professor Thierry Monteil, professor in computer science at INSA – University of Toulouse and researcher at LAAS-CNRS, presented on OneM2M and the interoperability of the IoT as part of the SMART Seminar Series on 13 December 2017.
More information: http://www.uoweis.co/event/onem2m-towards-end-to-end-interoperability-of-the-iot/
Keep updated with future events: http://www.uoweis.co/tag/smart-infrastructure/
Professor Peter Bridgewater, Chair of Landcare ACT and Adjunct Professor in Terrestrial and Marine Biodiversity Governance at the University of Canberra, presented on blue-green vs grey-black infrastructure and which is the best way forward, as part of the SMART Seminar Series on 24 November 2017.
More information: http://www.uoweis.co/event/blue-green-vs-grey-black-infrastructure-which-is-best-for-c21st-survival/
Keep updated with future events: http://www.uoweis.co/tag/smart-infrastructure/
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ... Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by the submission of a large number of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
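The idea in the abstract above can be sketched in a few lines of Python. This is only an illustrative, single-machine sketch and not the report's implementation: the function names, the damping factor of 0.85, the tolerance, and the choice to process one strongly connected component at a time (rather than grouping components into levels) are all assumptions made for the example.

```python
def tarjan_scc(graph):
    """Return strongly connected components, sink components first.

    `graph` maps every vertex to a list of its out-neighbours.
    Recursive Tarjan; fine for small illustrative graphs only.
    """
    index, low, on_stack, stack, comps = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            comps.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return comps


def levelwise_pagerank(graph, damping=0.85, tol=1e-10, max_iter=1000):
    """Compute PageRank component by component in topological order."""
    if any(len(outs) == 0 for outs in graph.values()):
        raise ValueError("graph has dead ends; this sketch requires none")
    n = len(graph)
    incoming = {v: [] for v in graph}
    for u, outs in graph.items():
        for v in outs:
            incoming[v].append(u)
    rank = {v: 1.0 / n for v in graph}
    # Reversing Tarjan's output puts source components first, so every
    # rank flowing in from an earlier component is already final.
    for comp in reversed(tarjan_scc(graph)):
        for _ in range(max_iter):
            new = {
                v: (1 - damping) / n
                + damping * sum(rank[u] / len(graph[u]) for u in incoming[v])
                for v in comp
            }
            delta = sum(abs(new[v] - rank[v]) for v in comp)
            rank.update(new)
            if delta < tol:
                break
    return rank


if __name__ == "__main__":
    # Hypothetical 4-vertex graph with two components: {0, 1} feeds {2, 3}.
    print(levelwise_pagerank({0: [1], 1: [0, 2], 2: [3], 3: [2]}))
```

Because each component is iterated only after everything upstream of it has converged, no vertex outside the current component needs to be touched during its iterations, which is the property the abstract relies on for avoiding per-iteration communication.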
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2... pchutichetpong
M Capital Group (“MCG”) expects demand and supply to keep evolving, facilitated by institutional investment rotating out of offices and into work from home (“WFH”), while the need for data storage keeps growing as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
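As a rough illustration of the "Automated Data Validation" point above, the following minimal Python sketch runs a set of named rules over incoming records and reports which records fail which rule. The field names (`id`, `amount`), the rule names, and the `validate` function are hypothetical and chosen only for the example; a real pipeline would typically rely on a dedicated data quality tool.

```python
from typing import Any, Callable

Record = dict[str, Any]
Rule = Callable[[Record], bool]


def validate(records: list[Record], rules: dict[str, Rule]) -> list[tuple[int, str]]:
    """Return (record index, rule name) for every rule a record fails."""
    failures = []
    for i, record in enumerate(records):
        for name, rule in rules.items():
            if not rule(record):
                failures.append((i, name))
    return failures


# Hypothetical rules for a hypothetical record layout.
RULES: dict[str, Rule] = {
    "id_present": lambda r: r.get("id") is not None,
    "amount_non_negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
}

if __name__ == "__main__":
    sample = [
        {"id": 1, "amount": 10.0},
        {"id": None, "amount": 5},
        {"id": 3, "amount": -2},
    ]
    for index, rule_name in validate(sample, RULES):
        print(f"record {index} failed {rule_name}")
```

Keeping each rule as a small named function makes it straightforward to report failures at the source and to add checks without touching the validation loop.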
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
The affect of service quality and online reviews on customer loyalty in the E...
SMART Seminar Series: "From Big Data to Smart data"
1. From Big Data to
Smart data
Jie (Jack) Yang | April 2016
2. —What is Big Data?
—Challenge of Big Data processing
—Smart Learning framework
—Applications
—Conclusions
Outline
3. —No single standard definition
—5-V information assets that require innovative
techniques, algorithms, and analytics that enable
decision making, and process automation
Big Data definition
4. 1 – Scale (Volume)
12+ TBs
of tweet data
every day
25+ TBs of
log data every
day
? TBs of
data every day
2+ billion people on
the Web by end
2011
30 billion RFID tags
today
(1.3B in 2005)
4.6 billion
camera
phones
world wide
100s of
millions of
GPS enabled
devices sold
annually
76 million smart meters in
2009…
200M by 2014
5. The ability to manage, analyse, summarise, visualise,
and discover knowledge from the collected data in a
timely and scalable manner
2 – Speed (Velocity)
Social media and networks
(millions of active users)
Mobile devices
(tracking objects all the time)
Infrastructure sensors and/or
instruments
(measuring all kinds of data)
6. Various formats, types and structures:
— Text
— Numerical
— Multi-dim arrays
— Images, audio, video, sequences
— Time series
— Graph (network)
— Streaming data
— etc
3 – Complexity (Variety)
8. 5 – Benefit (Value)
Value ($, time, performance)
9. Beer & Diaper (Woolworths in Illawarra)
“A number of convenience store clerks noticed that men
often bought beer at the same time they bought diapers.
The store mined its receipts and proved the clerks'
observations correct. So, the store began stocking
diapers next to the beer coolers, and sales skyrocketed”
A simple example
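The beer-and-diapers story is usually told as an instance of association rule mining. The following is a minimal sketch of how support, confidence and lift could be computed for the rule "diapers -> beer". The receipts are invented for illustration and are not data from the talk; the function names are hypothetical.

```python
# Hypothetical receipts; items and counts are invented for illustration only.
receipts = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"beer", "bread"},
    {"diapers", "milk"},
    {"bread", "milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), baskets)
    confidence = both / support(antecedent, baskets)
    lift = confidence / support(consequent, baskets)
    return both, confidence, lift


s, c, l = rule_metrics({"diapers"}, {"beer"}, receipts)
print(f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
# prints: support=0.40 confidence=0.67 lift=1.11
```

A lift above 1 suggests the two items appear together more often than they would if purchases were independent, which is the kind of signal the store in the anecdote acted on.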
14. Main features
— Collection across different platforms and formats
• APIs
• Web crawling
— 1 master and 6 workers
• distributing–working–waiting–reactivating
process
— Data volume (per day)
• 20K+ records user activities
• 25K+ records from social platforms
• 200K+ tweets around AU and EU
Data harvesting
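The "1 master and 6 workers" arrangement with its distributing, working, waiting and reactivating cycle could look roughly like the sketch below. This is an assumption about the general pattern, not the speaker's implementation: threads and an in-memory queue stand in for whatever machines and message transport were actually used, and `fetch` is a hypothetical placeholder for an API call or web crawl.

```python
import queue
import threading

NUM_WORKERS = 6  # the slide mentions 1 master and 6 workers


def fetch(source):
    """Hypothetical placeholder for an API call or a web crawl."""
    return {"source": source, "status": "ok"}


def worker(tasks, results):
    while True:
        source = tasks.get()        # waiting: blocks until work arrives (reactivating)
        if source is None:          # sentinel from the master: stop this worker
            tasks.task_done()
            break
        results.put(fetch(source))  # working
        tasks.task_done()


def harvest(sources):
    """Master: distribute sources to workers and collect their records."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(NUM_WORKERS)
    ]
    for t in threads:
        t.start()
    for source in sources:          # distributing
        tasks.put(source)
    for _ in threads:               # one stop sentinel per worker
        tasks.put(None)
    tasks.join()
    for t in threads:
        t.join()
    return [results.get() for _ in range(results.qsize())]


records = harvest([f"platform-{i}" for i in range(20)])
print(len(records))  # prints: 20
```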
15. Main features
— save data into different formats
• Pure TXT / CSV
• (NO)SQL
— Query across all
— Fast response
Data storage
SELECT * FROM
  (SELECT * FROM /web/logs/CSV) t0
JOIN
  (SELECT country, count(*)
   FROM mysql.web.users
   GROUP BY country) t1
JOIN
  (SELECT timestamp
   FROM s3.root.clicks.json
   WHERE user_id = 'jdoe') t2
16. Main features
— Preprocessing (filtering, cleansing, feature
extraction)
— Event simulation
— Saving to DBs
— Running ML jobs on the fly
• Receiver throughput = 3kb /sec
• Consumer throughput = 2kb /sec
• Consumer latency = 0.23 sec
Data streaming
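The streaming stages listed above (receive, preprocess, save, measure latency) can be sketched as a chain of generators. This is only an illustration under stated assumptions: the event strings are invented, a Python list stands in for the database, and the feature extracted (a word count) is a hypothetical example rather than anything named in the slides.

```python
import time


def receiver(raw_events):
    """Tag each incoming event with its arrival time."""
    for event in raw_events:
        yield {"received_at": time.perf_counter(), "text": event}


def preprocess(stream):
    """Filter, cleanse and extract a simple feature from each record."""
    for rec in stream:
        text = rec["text"].strip()
        if not text:                        # filtering: drop empty records
            continue
        rec["text"] = text.lower()          # cleansing: normalise case
        rec["n_words"] = len(text.split())  # feature extraction: word count
        yield rec


def consume(stream):
    """Store records (a list stands in for a database) and record latency."""
    store, latencies = [], []
    for rec in stream:
        store.append(rec)
        latencies.append(time.perf_counter() - rec["received_at"])
    return store, latencies


events = ["  Hello Big Data ", "", "Smart data"]  # invented sample events
store, latencies = consume(preprocess(receiver(events)))
print([(r["text"], r["n_words"]) for r in store])
# prints: [('hello big data', 3), ('smart data', 2)]
```

Because each stage is a generator, records flow through one at a time, so the latency measured in `consume` reflects per-record processing time rather than batch time.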
17. Main features (35 online training jobs per day)
— Supervised (with a human assisting in classification) /
unsupervised machine learning techniques, to assist with
classification, clustering and prediction;
— Geospatial analysis: K-pop cluster in geographical regions;
— Network analysis to understand social connections between
consumers and producers;
— Other analysis including:
• More sophisticated number crunching of comments, such as
time series analysis to examine trends;
• Natural language processing techniques to assist with
sentiment analysis.
Data mining
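The slides do not say which time series method was used to examine trends in comments. As one simple possibility, a moving average smooths daily counts so that an upward or downward trend is easier to see. The sketch below assumes invented daily comment counts and a hypothetical window of three days.

```python
def moving_average(values, window):
    """Simple moving average over a sliding window of the given size."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Invented daily comment counts, for illustration only.
daily_comments = [10, 12, 9, 15, 20, 18, 25]
trend = moving_average(daily_comments, window=3)
print([round(x, 2) for x in trend])
# prints: [10.33, 12.0, 14.67, 17.67, 21.0]
```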
18. Student behaviour analysis (OLPC, until Feb 2016):
— 153+ schools
— 20K+ active laptops
— 4.2M+ activity records
Application 1
[Figure: Most popular apps (per school); App usage (per school)]
25. Jie Yang, Jun Ma. A structure optimization algorithm of neural networks for large-scale data sets. Fuzz-IEEE, 2014.
Jie Yang, Jun Ma. A Sparsity-Based Training Algorithm for Least Squares SVM. IEEE SSCI, 2014.
Jie Yang, Jun Ma. A big-data processing framework for uncertainties in Transportation data. Fuzz-IEEE, 2015.
Jie Yang, Jun Ma, Sarah K. Howard. A Structure Optimization Algorithm of Neural Networks for Pattern Learning from Educational Data. Springer Studies in Computational Intelligence ANN Modelling, 2015.
Jie Yang, Jun Ma. A hybrid gene expression programming algorithm based on orthogonal design. International Journal of Computational Intelligence Systems, 2015.
Jie Yang, Brian Yecies. Mining Chinese Social Media UGC: A SmartLearning Framework For Analyzing Douban Movie Reviews. Journal of Big Data, 2016.
Jie Yang, Jun Ma. A structure optimization framework for feed-forward neural networks using sparse representation. Knowledge-Based Systems, 2016.
Jie Yang, Jun Ma, Sarah K. Howard. Exploring Technology Integration in Education using Fuzzy Representation and Feature Selection. Fuzz-IEEE, 2016.
Brian Yecies, Jie Yang, Matthew Berryman, Kai Soh. Marketing Bait: Using SMART Data to Identify E-guanxi Among China’s ‘Internet Aborigines’. Film Marketing in a Global Era, 2015.
Brian Yecies, Jie Yang, Matthew Berryman, Aegyung Shim, Kai Soh. Korean Female Writer-Directors and SMART Analysis of Douban Commentary Among China’s Digital Natives. Women Screenwriters: An International Guide, 2015.
Brian Yecies, Jie Yang, Matthew Berryman, Aegyung Shim, Kai Soh. Korean Female Writer–Directors and SMART Analysis of Douban Commentary Among China’s Digital Natives. Participations: International Journal of Audience Research, 2016.
Sarah K. Howard, Jun Ma, Jie Yang, Kate Thompson. The use of data mining to explore factors of technology integration in learning and teaching. EARLI, 2015.
Sarah K. Howard, Ellie Rennie, Jun Ma, Jie Yang. Big Data, Big Theory: Moving Beyond New Empiricism to Generate Powerful Explanations. The New Data “Revolution” in Sociology, 2016.
Jun Ma, Jie Yang, Rohan W. Denagamage, Murad Safadi. A Conceptual Model for Clustering Local Government Areas using Complex Fuzzy Sets. Fuzz-IEEE, 2016.
Publications
26. — OLPC (ARC-Linkage)
— NSW-DER
— CAAR
— China-South Korean Foundation
— Healthcare (PubMed, SEER)
— Tourism business project (UTS)
— MTR
Projects and grants
27. — Big Data processing:
• Data collection; streaming data; data storage; and Machine
learning
• Open source libraries
— Other domains:
• Public transportation
• Business Intelligence
• Health care
Conclusions