Unlocking Big Data through Analytics and Search - Big Data Cloud - June 3 Meetup

BigDataCloud

Big Data Cloud - June 3 Meetup - Presentation by Mark Davis: Unlocking Big Data through Analytics and Search

Big Data Cloud Meetup: Big Data & Cloud Computing - Help, Educate & Demystify. June 3rd 2011

Unlocking Big Data through Analytics and Search. Mark Davis, CTO, Kitenga. June 3rd 2011 Meetup

Big Data: enormous transactional data; enormous unstructured information; too big for databases; new tools are needed.

Unstructured data explosion. Multimedia content: text, imagery, audio, video, sensor streams, biometric data, 3D. Text: email, documents, web pages, tweets, posts. <5% structured enterprise data: data warehouse, CDRs, financial records, access logs.

Big Data: trillions of user interactions/transactions == Big Data. Scale tiers: >100M, <10M, <1M. Traditional (DBMS-based) solutions: open source (MySQL, PHP); data warehousing, parallel SQL, big hardware. Emerging technologies: NoSQL, Hadoop/MapReduce, HBase/Hive.

The Structured/Unstructured Chasm. Structured: SQL, RDBMS, transactional data, BI tools. Unstructured: search, documents, text classification, taxonomies, ontologies.

Unstructured Analytics: Surfacing Metadata

Information Extraction: machine learning, finite state transducers (three stages), parts-of-speech tagging, lemmatization, tokenization.
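
The slide names tokenization and finite state transducers only as building blocks, so the following is a minimal sketch rather than Kitenga's implementation. It assumes a regex tokenizer and a hypothetical two-state transducer (`tag_entities`) that marks runs of capitalized tokens as candidate named entities; both function names and the tag labels are illustrative.

```python
import re


def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def tag_entities(tokens):
    """Two-state transducer (hypothetical): read tokens, emit one tag per token.

    State OUT: outside an entity. State IN: inside a run of capitalized tokens.
    Emits B-ENT at the start of a run, I-ENT inside it, and O elsewhere.
    """
    state = "OUT"
    tags = []
    for tok in tokens:
        if tok[0].isupper():
            tags.append("B-ENT" if state == "OUT" else "I-ENT")
            state = "IN"
        else:
            tags.append("O")
            state = "OUT"
    return tags


if __name__ == "__main__":
    toks = tokenize("Mark Davis spoke in Santa Clara.")
    for tok, tag in zip(toks, tag_entities(toks)):
        print(tok, tag)
```

A capitalization rule this simple also tags ordinary sentence-initial words, which is one reason a real pipeline would place part-of-speech tagging and learned models around it.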

Search + Analytics: resource integration, facet browsing, facet charting, autosuggest, spellcheck, query language, indexing, metadata extraction.
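
To make the indexing and facet browsing items concrete, here is a small sketch under stated assumptions: the documents, the `source` metadata field, and the helper names `build_index`, `search`, and `facet_counts` are all hypothetical, and a production system would use an engine such as Solr or Lucene rather than in-memory dictionaries.

```python
from collections import Counter, defaultdict

# Hypothetical documents: free text plus one extracted metadata field.
DOCS = [
    {"text": "hadoop cluster report", "source": "report"},
    {"text": "hadoop imagery analysis", "source": "imagery"},
    {"text": "solr search report", "source": "report"},
]


def build_index(docs):
    """Build an inverted index mapping each lowercase term to a set of doc ids."""
    index = defaultdict(set)
    for doc_id, doc in enumerate(docs):
        for term in doc["text"].lower().split():
            index[term].add(doc_id)
    return index


def search(index, term):
    """Return the sorted ids of documents containing the term."""
    return sorted(index.get(term.lower(), set()))


def facet_counts(docs, doc_ids, field):
    """Count the values of a metadata field across the matching documents."""
    return Counter(docs[i][field] for i in doc_ids)


if __name__ == "__main__":
    index = build_index(DOCS)
    hits = search(index, "hadoop")
    print(hits, dict(facet_counts(DOCS, hits, "source")))
```

The facet counts are the analytics half of the pairing: the same query that returns hits also yields a distribution over metadata that can be browsed or charted.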

Defense Intelligence: analyst support staff need to convert raw data into actionable intelligence. Named entity extraction, image tagging, video analytics, linkage analysis, network visualization, search. Improve force effectiveness. Hadoop/MapReduce, GPUs, HDFS, HBase, Solr. Situation reports, geo-tagged imagery. US Army, Navy, DHS, NSA.

CASE STUDY: US ARMY. Analysis bottlenecks: 200 data feeds; unacceptable response time; analysts avoid complete searches; basic entity extraction; slow analysis cycles; distribution by PowerPoint. Enabling technologies: Oracle and custom thick clients. The solution: >200 data feeds; <0.5s queries; fast analysis cycles; machine learning analytics; biometrics; linkage analysis; face recognition; video tagging; collaborative systems. Enabling technologies: GPU clouds, Hadoop/MapReduce, Katta, Lucene, NoSQL, HBase.

Pharma Bioinformatics: increase speed of drug discovery. Biological named entity extraction, author name extraction and normalization, linkage analysis, timelines, faceted search. ZettaVox. Faster discovery. Hadoop/MapReduce, HDFS, HBase, GPUs, Solr. Patents, genetic sequence data, journal articles.

Summary: Big Data spans unstructured and structured data. Effective tools for managing both depend on understanding their differences and similarities. Bridging the chasm between them means merging search and analytics.

Contact Info: mark@kitenga.com, http://www.kitenga.com. Kitenga, Inc., 2953 Bunker Hill Lane, Suite 400, Santa Clara, CA 95054. 1-(408)-462-KITE; 1-(253)-541-6799 (FAX)

We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.

Elizabeth Buie - Older adults: Are we really designing for our future selves?

Nexer Digital

How to Get CNIC Information System with Paksim Ga.pptx

danishmna97

GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024

Neo4j

UiPath Test Automation using UiPath Test Suite series, part 5

DianaGray10

Encryption in Microsoft 365 - ExpertsLive Netherlands 2024

Albert Hoitingh

National Security Agency - NSA mobile device best practices

Quotidiano Piemontese

Securing your Kubernetes cluster_ a step-by-step guide to success !

KatiaHIMEUR1

Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster. However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks. In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.

20240607 QFM018 Elixir Reading List May 2024

Matthew Sinclair

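Hello everyone, how many characters can this hold? I really don't understand why 40 characters or fewer is not allowed, but since no character limit is stated, maybe it accepts a seriously huge number of characters? Eh, th...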

名前です男

Free Complete Python - A step towards Data Science

RinaMondal9

Communications Mining Series - Zero to Hero - Session 1

DianaGray10

This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered: • Communication Mining Overview • Why is it important? • How can it help today’s business and the benefits • Phases in Communication Mining • Demo on Platform overview • Q/A

The Art of the Pitch: WordPress Relationships and Sales

Laura Byrne

Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes? All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that leads to closing the deal.

GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...

Neo4j

Dr. Sean Tan, Head of Data Science, Changi Airport Group Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.

Climate Impact of Software Testing at Nordic Testing Days

Kari Kakkonen

My slides at Nordic Testing Days 6.6.2024. The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.

GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...

Neo4j

Leonard Jayamohan, Partner & Generative AI Lead, Deloitte This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.

Unlocking Big Data through Analytics and Search - Big Data Cloud - June 3 Meetup

1. Big Data Cloud Meetup Big Data & Cloud Computing - Help, Educate & Demystify. June 3rd 2011

2. Kitenga, Mark Davis CTO June 3rd 2011 Meetup Unlocking Big Data through Analytics and Search

3. Big Data Enormous transactional data Enormous unstructured information Too big for databases New tools are needed

4. Unstructured data explosion Multimedia Content Text Imagery Audio Video Sensor Streams Biometric data 3D Text Email Documents Web pages Tweets Posts <5% Structured Enterprise Data Datawarehouse CDRs Financial records Access logs

5. Big Data Trillions of user interactions/transactions == Big Data >100M <10M <1M Open source MySQL PHP Data warehousing Parallel SQL Big hardware NoSQL Hadoop/MapReduce Hbase/HIVE Emerging technologies Traditional (DBMS-based) solutions

6. The Structured/Unstructured Chasm SQL RDBMS Transactional Data BI Tools Search Documents Text Classification Taxonomies Ontologies

7. Unstructured Analytics: Surfacing Metadata

8. Information Extraction Machine-Learning Finite State Transducer Finite State Transducer Finite State Transducer Parts-of-Speech Tagging Lemmatization Tokenization

9. Search + Analytics Resource Integration Facet Browsing Facet Charting Autosuggest Spellcheck Query Language Indexing Metadata Extraction

10. Defense Intelligence Analyst support staff needs to convert raw data into actionable intelligence Named Entity Extraction Image tagging Video analytics Linkage Analysis Network Visualization Search Improve Force Effectiveness Hadoop/MapReduce, GPUs, HDFS, Hbase, SOLR Situation Reports Geo-tagged Imagery US Army Navy DHS NSA

11. CASE STUDY: US ARMY The Solution >200 data feeds <0.5s queries Fast analysis cycles Machine Learning Analytics Biometrics Linkage Analysis Face recognition Video tagging Collaborative systems Analysis Bottlenecks 200 data feeds Unacceptable response time Analysts avoid complete searches Basic entity extraction Slow analysis cycles Distribution by PowerPoint Enabling Technologies: GPU clouds, Hadoop/MapReduce, Katta, Lucene, NoSQL, Hbase Enabling Technologies: Oracle and custom thick clients

12. Pharma Bioinformatics Increase speed of drug discovery Biological Named Entity Extraction Author Name Extraction and Normalization Linkage Analysis Timelines Facetted Search ZettaVox Faster Discovery Hadoop/MapReduce, HDFS, Hbase, GPUs, SOLR Patents Genetic Sequence Data Journal Articles

13. PharmaTreemap

14.

15.

16. Demo

17.

18. Summary: Big Data spans unstructured and structured data. Effective tools for managing both involve understanding the differences and similarities of both. Bridging the chasm between them means merging search and analytics together.

19. Questions?

20. Contact Info mark@kitenga.com http://www.kitenga.com Kitenga, Inc. 2953 Bunker Hill Lane, Suite 400 Santa Clara, CA 95054 1-(408)-462-KITE 1-(253)-541-6799 (FAX)
