Using open data to answer the question: is it harder to find a taxi when it is raining? A live demo of analyzing taxi data with dashDB, R, and Bluemix.
Presented at the data2day conference.
Analyze Twitter data completely in Bluemix. Collect data, add sentiment, copy to an in-memory database, analyze with R or Watson Analytics. All in the cloud.
Sentiment Analysis with KNIME Analytics Platform (KNIMESlides)
“Great movie with a nice story!”
What do you think, did the person like the film or hate it?
Most of the time it’s easy for us to decide whether the message of a text is positive or negative. But what if you wanted to automate the process of understanding the sentiment? For example, if you have a lot of customers leaving comments, or people publishing movie reviews, you will want to discern the sentiment and find out who is posting positive or negative messages.
Sentiment analysis is an important piece of many data analytics use cases. Whether applied to customer feedback, movie reviews, or tweets, sentiment scores often contribute an important part of describing the overall scenario.
These are just some examples from a long list of use cases for sentiment analysis, which includes social media analysis, 360-degree customer views, customer intelligence, competitive analysis, and many more. To avoid doing this manually, we apply sentiment analysis and teach an algorithm to understand text and extract the sentiment using Natural Language Processing.
A copy of the webinar can be viewed at https://www.youtube.com/watch?v=By4IZeIzxIw
KNIME Data Science Learnathon: From Raw Data To Deployment (KNIMESlides)
Here are the slides from our Data Science Learnathons. A learnathon is where we learn more about the data science cycle - data access, data blending, data preparation, model training, optimization, testing, and deployment. We also work in groups to hack a workflow-based solution to guided exercises. The tool of choice for this learnathon is KNIME Analytics Platform.
The Finnish Meteorological Institute (FMI) opened its data in 2013, making essentially all of the data to which it holds the rights publicly available in machine-readable formats. This includes near real-time and historical weather and climate data. The data is distributed through FMI's Open Data Portal, which follows INSPIRE requirements, as well as on the Amazon Web Services (AWS) cloud platform through Amazon S3 buckets, as part of a two-year pilot project to increase access to and use of weather data. The AWS buckets contain HIRLAM surface and pressure-level weather model data for Europe.
The document provides 10 facts about cloud storage to prepare attendees for the NetApp Insight conference in October and November. Some key facts include that 80% of companies see business benefits within 6 months of adopting cloud technologies, 90% of enterprises have implemented a cloud strategy, and global data center traffic is expected to triple from 2012 to 2017. The conferences will provide over 300 technical sessions on building data fabrics across flash, disk and cloud storage.
What's New in KNIME Analytics Platform 4.0 and KNIME Server 4.9 (KNIMESlides)
All the information about the latest features in KNIME Analytics Platform 4.0 and KNIME Server 4.9.
What we cover:
- New features of the KNIME Hub (hub.knime.com)
- What components are and how you can use them to bundle functionality for sharing and reuse
- Performance improvements
- New database integration
- New machine learning functionality
- New Plotly Integration (which brings all kinds of exciting interactive visualizations)
- New Spark nodes
- KNIME Server Remote Workflow Editor
- Scheduling improvements
- KNIME Server Distributed Executors
Webinar link: https://youtu.be/slOIiQzT_7E
What's New Document here: https://www.knime.com/whats-new-in-knime-40
In March and April 2018 KNIME hosted a series of Learnathons in the US. You can find the slides that were presented here.
For more upcoming events and courses visit: https://www.knime.com/learning/events
Sentiment Analysis with Deep Learning, Machine Learning or Lexicon Based (KNIMESlides)
This document discusses different approaches to sentiment analysis: lexicon-based, machine learning, and deep learning. It provides an overview of how to perform sentiment analysis using KNIME Analytics Platform, including reading and parsing data, enriching documents with semantic information, preprocessing documents, computing frequencies, transforming data for classification models, and creating workflows for each approach. The document encourages attendees to download a free book on text analytics from KNIME Press using a provided code.
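The lexicon-based approach mentioned above can be sketched in a few lines: tokenize the text, lowercase it, and score tokens against positive and negative word lists. This is a minimal illustrative sketch; the tiny word lists below stand in for a real sentiment lexicon.

```python
# Minimal lexicon-based sentiment scorer. The word sets are illustrative
# stand-ins for a real sentiment lexicon such as those used in KNIME workflows.
import re

POSITIVE = {"great", "nice", "good", "love", "excellent"}
NEGATIVE = {"bad", "hate", "boring", "terrible", "poor"}

def sentiment(text):
    tokens = re.findall(r"[a-z']+", text.lower())  # simple tokenization
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("Great movie with a nice story!"))  # -> positive
print(sentiment("Boring plot, I hate it."))         # -> negative
```

Machine learning and deep learning approaches replace the fixed word lists with weights learned from labeled examples, but the tokenize-then-score shape stays the same.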
KNIME Data Science Learnathon: From Raw Data To Deployment - Paris - November... (KNIMESlides)
Here are the slides from our Data Science Learnathons. A learnathon is where we learn more about the data science cycle - data access, data blending, data preparation, model training, optimization, testing, and deployment. We also work in groups to hack a workflow-based solution to guided exercises. The tool of choice for this learnathon is KNIME Analytics Platform.
Hw09 Counting And Clustering And Other Data Tricks (Cloudera, Inc.)
This document summarizes the early use of Hadoop at The New York Times to generate PDFs of archived newspaper articles and analyze web traffic data. It describes how over 100 Amazon EC2 instances were used with Hadoop to pre-generate over 11 million PDFs from 4.3TB of source data in under 24 hours for a total cost of $240. The document then discusses how the Times began using Hadoop to perform web analytics, counting page views and unique users, and merging this data with demographic and article metadata to better understand user traffic and behavior.
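The page-view counting described above is classic MapReduce: map each log record to a (page, 1) pair, shuffle by key, then reduce by summing. The same shape can be sketched in plain Python; the log lines below are invented for illustration.

```python
# MapReduce-style page-view counting: map each log line to (page, 1),
# group values by key (shuffle), then sum per key (reduce).
from collections import defaultdict

log_lines = [  # illustrative stand-ins for web server log records
    "user1 /archive/1851.pdf",
    "user2 /archive/1851.pdf",
    "user1 /sports/index.html",
]

def map_phase(line):
    _user, page = line.split()
    yield (page, 1)

shuffled = defaultdict(list)  # shuffle: group mapped values by key
for line in log_lines:
    for key, value in map_phase(line):
        shuffled[key].append(value)

page_views = {page: sum(vals) for page, vals in shuffled.items()}  # reduce
print(page_views)  # -> {'/archive/1851.pdf': 2, '/sports/index.html': 1}
```

Hadoop distributes exactly these three phases across machines, which is what made counting views over terabytes of logs tractable.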
The document presents 12 facts about flash storage and its advantages over disk storage. Flash storage capacity is projected to grow to be 1000x more than disk storage by 2026. Flash storage is also more reliable than disk storage and flash memory costs have been decreasing rapidly. Flash storage density has been increasing 2-4x every 2 years, resulting in widespread adoption.
Why do we need open data? FMI Open Data on AWS (Roope Tervo)
Why we need open data, and how we should provide it. FMI provides the same data in its own open data portal and in the AWS public dataset program. Different use cases require different services and channels.
Heise Developer World 2016 - Big Data ist tot, es lebe Business Intelligenz (Markus Schmidberger)
In 2015, "Big Data" disappeared from the Gartner Hype Cycle report, and business intelligence solutions are finding renewed attention, above all in the cloud. BI experts are glad to have survived the big data hype with all its new technologies. But what is really happening in practice? Data-driven companies have recognized lambda architectures as suitable platforms for processing data in both real-time and batch mode and for responding flexibly to new requirements. In these lambda architectures, new big data technologies and proven BI solutions come together, and only together do they create new business value. This talk explains lambda architectures and brings big data and BI together.
Big Data Science in the Cloud from Big Data World Conference 2013 (Markus Schmidberger)
The document discusses big data and cloud computing. It notes that the new German coalition agreement will focus research and innovation funding on developing methods and tools for data analysis of big data. It then provides an overview of big data science in the cloud, including discussing putting applications and data in the best cloud locations, choosing cloud resources carefully, leveraging the full cloud technology stack such as AWS EMR with MapR, and ensuring data protection. The document promotes MongoSoup as the first German-based MongoDB cloud hosting solution and lists upcoming big data events.
PixieDust is an open source library that simplifies and improves Jupyter Python notebooks. It allows users to:
1. Easily install Python packages and libraries without modifying configuration files.
2. Create visualizations with a simple display() API that includes options for performance statistics, panning, and zooming.
3. Export data to cloud services or locally in CSV, JSON, HTML formats for further use or sharing.
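The export step in point 3 is, at its core, a serialization of tabular rows. A standard-library sketch of that kind of export looks like this; the field names and rows below are illustrative, and PixieDust wraps equivalent functionality behind its display() UI.

```python
# Export the same tabular rows to CSV and JSON -- a standard-library sketch
# of the export formats mentioned above (HTML export would follow the same
# pattern with a templated table).
import csv, io, json

rows = [{"city": "Madrid", "temp": 21.5}, {"city": "Helsinki", "temp": 3.2}]

# CSV: write a header from the dict keys, then one line per row
csv_buf = io.StringIO()
writer = csv.DictWriter(csv_buf, fieldnames=["city", "temp"])
writer.writeheader()
writer.writerows(rows)

# JSON: a direct dump of the row list
json_text = json.dumps(rows)

print(csv_buf.getvalue())
print(json_text)
```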
OVH Analytics Data Compute and Apache Spark as a Service (Mojtaba Imani)
If you have big data to process and need a fully provisioned, ready-to-use private Apache Spark cluster of your own, OVH Analytics Data Compute is the answer. It can save you a great deal of money and time.
This document summarizes the new features in FME 2017 including over 30 new data formats that can be read and written, more than 10 new transformers, updates to existing transformers, and improvements to usability such as additional options for inspecting data and managing transformer parameters. It also describes enhancements to FME Server including new instance types for running workspaces in the cloud and lower pricing for most users. The goal of FME is to allow data to flow freely between systems and formats while simplifying complex data integration tasks.
PowerStream: Propelling Energy Innovation with Predictive Analytics (SingleStore)
This document discusses a presentation about MemSQL PowerStream, a product for predicting the global health of wind turbines. The presentation covers renewable energy news stories, introduces PowerStream, demonstrates high-speed data ingestion and predictive analytics using MemSQL and Spark, and shows how SQL queries can be pushed down to MemSQL for faster processing. It concludes with a question and answer section.
1Spatial Australia: Introduction and getting started with FME 2017 (1Spatial)
This document introduces new features in FME 2017 including over 20 new data formats that can be read and written, more than 10 new transformers, updates to existing transformers, improved user interface features for workflows, expanded web services and file system capabilities, an updated data inspector, and new automation capabilities for running workflows on demand or on a schedule. The overall goal of FME is to allow data to flow freely between systems and applications while enabling users to spend more time making decisions rather than struggling with data integration tasks.
Kubernetes platform evolution at Audi Business Innovation started in 2016 with a PoC on AWS and grew to support over 1000 containers and 60 development teams by early 2018. Two outages occurred due to cluster upgrades and issues with the CNI plugin. Lessons learned included implementing a recovery plan using GitOps with Helm for deployments, monitoring, and backing up all configuration and data. The presentation emphasized choosing important work over urgent tasks, listening to feedback, and investing in relationships between people.
This document discusses machine data and how it can be used by local governments. It defines machine data as logs and usage data produced by computers, devices, sensors, and more. These data sources are found across government systems, infrastructure, and IoT devices. The document then provides examples of how machine data is currently used by governments for traffic analysis, utility management, ordinance enforcement, and more. It highlights the value of machine data for decision making, security, analytics, and monitoring.
This document discusses machine data and how it can be used by local governments. It defines machine data as logs and usage data produced by computers, devices, sensors, and more. These data sources are found across government systems, city infrastructure like traffic and utilities, and citizens' electronic devices. The document provides examples of how machine data is currently used by governments for tasks like traffic planning, utility billing, and ordinance enforcement. It notes that machine data was previously difficult to access and use but that it can now be applied to decision making, security, analytics, monitoring and more.
Massachusetts Digital Government Summit 2015 (Splunk)
This document discusses machine data and its value. It defines machine data as logs and usage data produced by computers, network devices, sensors, and other electronic devices. It notes that machine data comes from many sources, including applications, websites, consumer electronics, industrial equipment, sensors, and more. The document addresses common barriers to accessing machine data and suggests using a platform that can collect, store, and analyze data from any source in real time. It provides examples of how machine data is used from traffic management to regulatory compliance to improved government services. The overall message is that machine data has great potential value and is ready to be tapped into by the reader.
This document discusses machine data and its value. It defines machine data as logs and usage data produced by computers, network devices, sensors, and other electronic devices. It notes that machine data comes from many sources, including applications, websites, consumer electronics, industrial equipment, sensors, and more. The document addresses common barriers to accessing machine data and suggests using a tool that can collect, store, and analyze data from any source in real time. It provides examples of how machine data is used in areas like traffic management, utility billing, and regulatory compliance. Finally, it states that machine data has great potential value for organizations in both the public and private sectors.
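Platforms that collect and analyze machine data typically start by parsing semi-structured log lines into named fields before indexing and search. A minimal version of that first step, with an invented log format, might look like:

```python
# Parse a semi-structured log line into named fields -- the first step a
# machine-data platform performs before indexing. The log format and the
# sample line are invented for illustration.
import re

LOG_PATTERN = re.compile(
    r"(?P<ts>\S+) (?P<host>\S+) (?P<level>[A-Z]+) (?P<msg>.*)"
)

line = "2015-06-01T12:00:00 meter-17 WARN usage spike detected"
event = LOG_PATTERN.match(line).groupdict()
print(event["host"], event["level"])  # -> meter-17 WARN
```

Once lines are structured events like this, the traffic, utility, and compliance analyses described above become ordinary queries over fields.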
Mobile usage is growing rapidly, with people checking their phones hundreds of times per day. Many companies have rushed to create mobile apps but struggle with user acquisition, analytics, and iteration. Adobe's Project Fast Track created a unified solution across its mobile apps and Creative Cloud to gather usage data and enable data-driven product improvements. The project team integrated Adobe's Mobile SDK and Analytics to provide insights from nine apps in just nine weeks. This system allows Adobe to better understand user behavior and quickly iterate apps to increase engagement and subscriptions.
The Internet of Things (IoT) - What Really Matters for a Start-Up (Sandy Carter)
The document discusses the Internet of Things (IoT) ecosystem and what matters most for success. It notes that $1 billion was invested in IoT ventures in 2013 and that installed IoT units are estimated to reach 26 billion by 2020. What really matters is having the right focus in three areas: domain expertise in industries like automotive and healthcare; understanding the drivers of value for users and buyers; and prioritizing user-centered design. Integrated solutions that collect, analyze, and optimize data from devices and sensors will be important long-term opportunities. Thriving ecosystems provide mentoring, access to capital, and opportunities to scale globally.
Preparing the next generation for the cognitive era (Steven Miller)
A short version of my latest presentation, used during a panel session at the ASA Research Symposium at Southern Illinois University Carbondale on November 21, 2015.
Have your cake and eat it too: adopting technologies without sacrificing - Pa... (Internet World)
Interop Academy - June 19th, 11:30-12:00
The layer "cake" that IT has become is flavored with consolidation, cloud, big data, BYOD, SDN, and other acronyms to boot. Eating this cake inevitably leaves organizations with application performance issues. Come learn how IT can have its cake and eat it too.
BMC Engage 2015: IT Asset Management - An essential pillar for the digital en... (Jon Stevens-Hall)
This document discusses how IT asset management (ITAM) needs to evolve to support the digital enterprise. Assets are changing rapidly with virtualization, cloud computing, and the Internet of Things. Effective digital service management requires understanding both the services provided and the underlying assets. The document recommends aligning ITAM with digital services by using both traditional and new data collection methods, embedding ITAM into digital services, and taking a proactive approach to compliance and cost optimization. IT asset managers are well-positioned to provide oversight to CIOs in the new digital business environment.
How Spark Enables the Internet of Things - Paula Ta-Shma (Spark Summit)
The document summarizes an IBM research paper on how Spark can enable Internet of Things (IoT) use cases. It describes an IoT architecture used for a smart city use case with Madrid city buses. Data is collected from 3000 traffic sensors into Kafka and aggregated into Swift objects using Secor. Spark is used to access and analyze the data to detect traffic patterns and inform bus routing decisions in real-time. The system aims to improve customer satisfaction and reduce costs by responding efficiently to traffic issues.
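The aggregation step in that pipeline (grouping raw sensor readings by sensor and time window before pattern detection) can be sketched in plain Python. Sensor IDs, timestamps, and readings below are invented for illustration; in the real system Spark performs this grouping at scale over data landed in Swift.

```python
# Windowed aggregation of traffic-sensor readings: average the readings per
# sensor per 10-minute window, a plain-Python sketch of what the Spark job
# computes before detecting traffic patterns.
from collections import defaultdict

readings = [  # (sensor_id, epoch_seconds, vehicles_per_minute), illustrative
    ("s1", 0, 30), ("s1", 300, 50),
    ("s1", 700, 80), ("s2", 100, 10),
]
WINDOW = 600  # 10-minute windows, in seconds

groups = defaultdict(list)
for sensor, ts, count in readings:
    window_start = (ts // WINDOW) * WINDOW  # bucket timestamp into a window
    groups[(sensor, window_start)].append(count)

averages = {key: sum(v) / len(v) for key, v in groups.items()}
print(averages)
# -> {('s1', 0): 40.0, ('s1', 600): 80.0, ('s2', 0): 10.0}
```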
Building intelligent apps using IoT and cognitive services - Prabhjot Bakshi (aOS Community)
This document discusses building intelligent apps using IoT and cognitive services. It provides examples of using devices like Raspberry Pi and Arduino to collect data from connected vehicles and other IoT sources. The data is processed and analyzed in the cloud using Azure services like Event Hubs, HDFS, HBase, Hive and streaming technologies. Cognitive services are used to add capabilities like speech, vision, language and knowledge to solutions. The goal is to move from just collecting and analyzing data to taking prescriptive actions.
The document discusses the emergence of smart cities and the need for smart city platforms. It describes how smart city platforms can enable data-driven decision making, support growth through scalable architectures, and accelerate innovation. The platforms aim to promote transparency, trust and collaboration through open data sharing. The document provides examples of potential smart city use cases around traffic management and resource optimization. It also outlines attributes of an effective smart city platform including being open, agile, secure and able to support a city's entire portfolio through a software-defined approach.
This document discusses Ford's data analytics strategy. It notes that the volume of data Ford collects is increasing significantly from connected vehicles and other sources. This includes up to 25 gigabytes per hour from individual vehicles. Ford is working to build applications and drive adoption of analytics across the company through education and training programs to democratize access to tools and infrastructure while ensuring privacy, security, and governance of customer data. The goal is to provide the right data, tools, and support to analysts and data scientists to improve products and services.
Delivering Big Data - By Rod Smith at the CloudCon 2013 (exponential-inc)
Rod Smith, VP of Emerging Technology at IBM, presented on cloud and big data analytics. The presentation covered IBM's work in emerging technologies, how IT and lines of business are evolving towards data-driven solutions, and examples of big data applications in healthcare and crime fighting. It also demonstrated a healthcare readmissions use case and encouraged collaboration to influence IBM's direction. The presentation argued that cloud architectures are better suited for new applications leveraging big data and analytics.
The document provides information about various IBM Bluemix services including Gamification, Watson, Internet of Things Foundation, DevOps, and more. It includes descriptions of the services, code examples, links to documentation, and tutorials. Specifically, it summarizes the Gamification service and provides REST API examples for managing game plans, events, and users. It also outlines the Internet of Things Foundation for connecting devices to apps and APIs.
Cognitive Sustainability Presentation for Berkeley (Daryl Pereira)
This document discusses cognitive computing and its applications. It describes cognitive computing as systems that can learn at scale, reason with purpose, and interact naturally with humans. It provides examples of cognitive systems like IBM Watson that can analyze large amounts of structured and unstructured data, provide real-time intelligence and reduce mundane tasks. The document also discusses how cognitive computing can enable open government and new forms of engagement through chatbots.
A Step into the Future – Educational Cloud Services – Patrick Kirk, Synetrix
The document discusses educational cloud services and their benefits. It describes how cloud computing provides dynamically scalable resources over the internet. Educational institutions can access applications, software, and IT capabilities from the cloud without having to manage their own infrastructure. This reduces costs while improving flexibility, reliability, and efficiency for educational needs.
How Spark Enables the Internet of Things: Efficient Integration of Multiple ... – sparktc
IBM researchers in Haifa, together with partners from the COSMOS EU-funded project, are using Spark to analyze the new wave of IoT data and solve problems in a way that is generic, integrated, and practical.
Similar to "Is it harder to find a taxi when it is raining?" (20)
Cloud Data Services - from prototyping to scalable analytics on cloud – Wilfried Hoge
Presentation from the German customer conference of IBM's Technical Expert Council. It shows how IBM's cloud data services could be used to explore data for new insights or business models.
innovations born in the cloud - cloud data services from IBM to prototype you... – Wilfried Hoge
To bring your ideas to get insights from new data sources to live you must have the capabilities to prototype, fail fast if they don't work and bring to production easily if they are successful. See how IBM's cloud data services can help you to start testing your ideas with data.
- The document discusses IBM's Watson cognitive computing platform, which understands natural language, learns from interactions, and generates hypotheses.
- Watson Analytics allows users to analyze data using natural language and includes features like predictive analytics, data visualization, and self-service analytics.
- The document outlines IBM's Watson services like personality insights and describes the process for building cognitive apps using the Watson Developer Cloud.
InfoSphere BigInsights - Analytics power for Hadoop - field experience – Wilfried Hoge
This document provides an overview and summary of InfoSphere BigInsights, an analytics platform for Hadoop. It discusses key features such as real-time analytics, storage integration, search, data exploration, predictive modeling, and application tooling. Case studies are presented on analyzing binary data and developing applications for transformation and analysis. Partnerships and certifications with other vendors are also mentioned. The document aims to demonstrate how BigInsights brings enterprise-grade features to Apache Hadoop and provides analytics capabilities for business users.
Presentation about Big Data from a German webcast: http://business-services.heise.de/it-management/big-data/beitrag/big-data-technologie-einsatzgebiete-datenschutz-160.html?source=IBM_12_2013_IT_Conn
InfoSphere BigInsights is IBM's distribution of Hadoop that:
- Enhances ease of use and usability for both technical and non-technical users.
- Includes additional tools, technologies, and accelerators to simplify developing and running analytics on Hadoop.
- Aims to help users gain business insights from their data more quickly through an integrated platform.
2012.04.26 big insights streams im forum2 – Wilfried Hoge
This document summarizes IBM's Big Data platform called InfoSphere BigInsights and InfoSphere Streams. It discusses how the platform can integrate and manage large volumes, varieties and velocities of data, apply advanced analytics to data in its native form, and enable visualization and development of new analytic applications. It also describes the key components of the BigInsights platform including Hadoop, data integration, governance and various accelerators.
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake – Walaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today's world, where data privacy and compliance are a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) they are auto-generated from declarative data annotations; (2) they respect user-level consent and preferences; (3) they are context-aware, encoding a different set of transformations for different use cases; (4) they are portable: the SQL logic is implemented in only one SQL dialect, yet accessible from all engines.
#SQL #Views #Privacy #Compliance #DataLake
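The auto-generation of compliance-enforcing views from declarative annotations can be sketched roughly as follows. The table name, columns, policy labels, and SQL templates here are invented for illustration and are not ViewShift's actual implementation:

```python
# Hypothetical sketch: render a compliance-enforcing SQL view from
# declarative per-column annotations, masking or pseudonymizing columns
# according to their policy.

ANNOTATIONS = {                      # declarative per-column policy (invented)
    "member_id": "hash",             # pseudonymize identifiers
    "email":     "redact",           # hide direct PII
    "country":   "pass",             # non-sensitive, pass through
}

TRANSFORMS = {                       # policy -> SQL expression template
    "hash":   "sha2(cast({col} as string), 256) as {col}",
    "redact": "null as {col}",
    "pass":   "{col}",
}

def compliance_view(table: str, annotations: dict) -> str:
    """Render a CREATE VIEW statement that enforces the annotations."""
    cols = ",\n  ".join(
        TRANSFORMS[policy].format(col=col) for col, policy in annotations.items()
    )
    return (
        f"create or replace view {table}_enforced as\n"
        f"select\n  {cols}\nfrom {table};"
    )

print(compliance_view("profiles", ANNOTATIONS))
```

A context-aware engine would additionally pick different `TRANSFORMS` tables per use case and consult user-level consent before choosing a policy; the catalog then resolves `profiles` to `profiles_enforced` transparently.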
Open Source Contributions to Postgres: The Basics – POSETTE 2024 – ElizabethGarrettChri
Postgres is the most advanced open-source database in the world and it's supported by a community, not a single company. So how does this work? How does code actually get into Postgres? I recently had a patch submitted and committed and I want to share what I learned in that process. I’ll give you an overview of Postgres versions and how the underlying project codebase functions. I’ll also show you the process for submitting a patch and getting that tested and committed.
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W... – Social Samosa
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data – Kiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You... – Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai... – Kaxil Naik
Navigating today's data landscape isn't just about managing workflows; it's about strategically propelling your business forward. Apache Airflow has stood out as the benchmark in this arena, driving data orchestration forward since its early days. As we dive into the complexities of our current data-rich environment, where the sheer volume of information and its timely, accurate processing are crucial for AI and ML applications, the role of Airflow has never been more critical.
In my journey as the Senior Engineering Director and a pivotal member of Apache Airflow's Project Management Committee (PMC), I've witnessed Airflow transform data handling, making agility and insight the norm in an ever-evolving digital space. At Astronomer, our collaboration with leading AI & ML teams worldwide has not only tested but also proven Airflow's mettle in delivering data reliably and efficiently—data that now powers not just insights but core business functions.
This session is a deep dive into the essence of Airflow's success. We'll trace its evolution from a budding project to the backbone of data orchestration it is today, constantly adapting to meet the next wave of data challenges, including those brought on by Generative AI. It's this forward-thinking adaptability that keeps Airflow at the forefront of innovation, ready for whatever comes next.
The ever-growing demands of AI and ML applications have ushered in an era where sophisticated data management isn't a luxury—it's a necessity. Airflow's innate flexibility and scalability are what makes it indispensable in managing the intricate workflows of today, especially those involving Large Language Models (LLMs).
This talk isn't just a rundown of Airflow's features; it's about harnessing these capabilities to turn your data workflows into a strategic asset. Together, we'll explore how Airflow remains at the cutting edge of data orchestration, ensuring your organization is not just keeping pace but setting the pace in a data-driven future.
Session in https://budapestdata.hu/2024/04/kaxil-naik-astronomer-io/ | https://dataml24.sessionize.com/session/667627
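The orchestration idea at the heart of the talk (run each task only after all of its upstream dependencies have completed) can be sketched in plain Python. This is a conceptual illustration with invented task names, not Airflow's actual API:

```python
# Minimal sketch of DAG-based orchestration: tasks declare their upstream
# dependencies, and the scheduler executes them in topological order.

from graphlib import TopologicalSorter

results = {}

def extract():   results["extract"] = [1, 2, 3]
def transform(): results["transform"] = [x * 10 for x in results["extract"]]
def load():      results["load"] = sum(results["transform"])

# task -> set of upstream tasks it depends on
dag = {
    "extract":   set(),
    "transform": {"extract"},
    "load":      {"transform"},
}
tasks = {"extract": extract, "transform": transform, "load": load}

for name in TopologicalSorter(dag).static_order():
    tasks[name]()        # each task runs only after its upstreams

print(results["load"])   # 60
```

A real orchestrator like Airflow layers scheduling, retries, backfills, and monitoring on top of this dependency-ordering core.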
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens" – sameer shah
Embark on a captivating financial journey with 'Financial Odyssey,' our hackathon project. Delve deep into the past performance of two companies as we employ an array of financial statement analysis techniques. From ratio analysis to trend analysis, uncover insights crucial for informed decision-making in the dynamic world of finance.
End-to-end pipeline agility - Berlin Buzzwords 2024 – Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
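The schema-metaprogramming idea can be sketched as follows, with an invented schema: one declarative definition generates a typed record class, so a whole-pipeline schema change happens in a single place while construction and validation still fail fast, keeping the static-typing-style protection described above. This is a minimal illustration, not the presenters' actual implementation:

```python
# Sketch: generate a typed record class from one declarative schema
# definition (the single source of truth for every job in the pipeline).

from dataclasses import make_dataclass, fields

SCHEMA = [("user_id", int), ("country", str), ("plays", int)]  # invented fields

PlayEvent = make_dataclass("PlayEvent", SCHEMA)

def validate(event) -> None:
    """Fail fast if any field has the wrong type (schema-on-write protection)."""
    for f in fields(event):
        if not isinstance(getattr(event, f.name), f.type):
            raise TypeError(f"{f.name} must be {f.type.__name__}")

event = PlayEvent(user_id=42, country="SE", plays=7)
validate(event)                        # passes
print([f.name for f in fields(event)]) # ['user_id', 'country', 'plays']
```

Adding a field to `SCHEMA` propagates to every job that uses `PlayEvent`, eliminating the per-job boilerplate, while ill-typed records are rejected at validation time instead of surfacing downstream as with schema-on-read.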
The Ipsos - AI - Monitor 2024 Report.pdf – Social Samosa
According to Ipsos AI Monitor's 2024 report, 65% of Indians said that products and services using AI have profoundly changed their daily life in the past 3-5 years.