The document discusses Mongo-Hadoop integration and provides examples of using the Mongo-Hadoop connector to run MapReduce jobs on data stored in MongoDB. It covers reading data from and writing data to MongoDB (or BSON files) from Hadoop, using Java MapReduce, Hadoop Streaming with Python, and analyzing data with Pig and Hive. Examples show processing an email corpus to build a graph of sender-recipient relationships and message counts.
2. We will cover:
•what it is
•how it works
•a tour of what it can do
A quick briefing on what Mongo and Hadoop are all about. (Q+A at the end)
3. MongoDB: a document-oriented database with dynamic schema.
Stores data in JSON-like documents:

{
  _id : "kosmo kramer",
  age : 42,
  location : {
    state : "NY",
    zip : "10024"
  },
  favorite_colors : ["red", "green"]
}

Each document can have a different structure; values can be simple types like strings and ints, or nested documents and arrays.
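To make the document model concrete, here is a minimal sketch of inserting the document above with the MongoDB Java driver (a 2.x-era API to match the connector examples later in the deck; the host, database, and collection names are illustrative):

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.MongoClient;
import java.util.Arrays;

public class InsertExample {
    public static void main(String[] args) throws Exception {
        MongoClient client = new MongoClient("localhost", 27017);
        DB db = client.getDB("test");                      // illustrative database name
        DBCollection people = db.getCollection("people");  // illustrative collection name

        // Build a document mirroring the slide's example.
        BasicDBObject doc = new BasicDBObject("_id", "kosmo kramer")
            .append("age", 42)
            .append("location", new BasicDBObject("state", "NY").append("zip", "10024"))
            .append("favorite_colors", Arrays.asList("red", "green"));

        people.insert(doc);
        client.close();
    }
}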
5. Hadoop: a Java-based framework for MapReduce.
Excels at batch processing on large data sets by taking advantage of parallelism.
MapReduce was created by Google (described in their white paper) and implemented in open source by Hadoop.
6. Mongo-Hadoop Connector - Why?
Lots of people use Hadoop and MongoDB separately, but need integration.
Custom import/export scripts are often used to get data in and out, but such scripts are slow and fragile.
The connector provides scalability and flexibility as Hadoop or MongoDB configurations change.
Need to process data across multiple sources.
7. Mongo-Hadoop Connector
Turn MongoDB into a Hadoop-enabled filesystem: use it as the input or output for Hadoop.
[Diagram: input data (MongoDB or .BSON) → Hadoop Cluster → output results (MongoDB or .BSON)]
BSON file support is new in 1.1; .BSON files are the output of mongodump.
8. Mongo-Hadoop Connector
Benefits + Features
Takes advantage of full multi-core parallelism to process data in MongoDB
Full integration with Hadoop and JVM ecosystems
Can be used with Amazon Elastic MapReduce
Can read and write backup (.BSON) files from the local filesystem, HDFS, or S3
9. Mongo-Hadoop Connector
Benefits + Features
Vanilla Java MapReduce, or, if you don't want to use Java, support for Hadoop Streaming:
write MapReduce code in scripting languages such as Python or Ruby.
(You can also write your own language binding.)
10. Mongo-Hadoop Connector
Benefits + Features
Support for Pig: a high-level scripting language for data analysis and building MapReduce workflows
Support for Hive: a SQL-like language for ad-hoc queries + analysis of data sets on Hadoop-compatible file systems
11. Mongo-Hadoop Connector
How it works:
The adapter examines the MongoDB input collection and calculates a set of splits from the data.
Each split gets assigned to a node in the Hadoop cluster.
In parallel, Hadoop nodes pull the data for their splits from MongoDB (or BSON) and process it locally.
Hadoop merges the results and streams the output back to MongoDB or BSON.
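Split creation is tunable through job properties; for example, the split-size knob below is my recollection of the connector's configuration, so treat the property name as an assumption and check your connector version (the value is a target split size in MB):

mongo.input.split_size=8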
12. Tour of Mongo-Hadoop, by Example
- Using Java MapReduce with Mongo-Hadoop
- Using Hadoop Streaming
- Pig and Hive with Mongo-Hadoop
- Elastic MapReduce + BSON
14. {"_id": {"t":"bob@enron.com", "f":"alice@enron.com"}, "count" : 14}
{"_id": {"t":"bob@enron.com", "f":"eve@enron.com"}, "count" : 9}
{"_id": {"t":"alice@enron.com", "f":"charlie@enron.com"}, "count" : 99}
{"_id": {"t":"charlie@enron.com", "f":"bob@enron.com"}, "count" : 48}
{"_id": {"t":"eve@enron.com", "f":"charlie@enron.com"}, "count" : 20}
Let's use Hadoop to build a graph of (senders → recipients) and the count of messages exchanged between each pair.
[Diagram: a graph with nodes bob, alice, eve, charlie; the edge weights are the message counts above (14, 99, 9, 48, 20)]
Sample, simplified data. Nodes are people; edges/arrows show the # of msgs from A to B.
15. Example 1 - Java MapReduce
Map phase: each input document from MongoDB gets passed through a Mapper function.

@Override
public void map(NullWritable key, BSONObject val, final Context context)
        throws IOException, InterruptedException {
    BSONObject headers = (BSONObject) val.get("headers");
    if (headers.containsKey("From") && headers.containsKey("To")) {
        String from = (String) headers.get("From");
        String to = (String) headers.get("To");
        String[] recips = to.split(",");
        for (int i = 0; i < recips.length; i++) {
            String recip = recips[i].trim();
            // emit one count per (sender, recipient) pair
            context.write(new MailPair(from, recip), new IntWritable(1));
        }
    }
}

The input value is the document from MongoDB; the connector handles the translation into a BSONObject for you.
16. Example 1 - Java MapReduce (cont)
Reduce phase: the outputs of Map are grouped together by key and passed to a Reducer.

public void reduce(final MailPair pKey,                  // the {to, from} key
                   final Iterable<IntWritable> pValues,  // all values collected under the key
                   final Context pContext)
        throws IOException, InterruptedException {
    int sum = 0;
    for (final IntWritable value : pValues) {
        sum += value.get();
    }
    BSONObject outDoc = BasicDBObjectBuilder.start()
        .add("f", pKey.from)
        .add("t", pKey.to)
        .get();
    BSONWritable pkeyOut = new BSONWritable(outDoc);
    pContext.write(pkeyOut, new IntWritable(sum));
}

The output is written back to MongoDB.
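Both phases use MailPair as a composite key. The deck references this class without showing it; here is a hypothetical sketch of what it might look like (custom Hadoop keys must implement WritableComparable):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical sketch of the MailPair composite key referenced above.
public class MailPair implements WritableComparable<MailPair> {
    String from;
    String to;

    public MailPair() {}  // no-arg constructor required by Hadoop for deserialization
    public MailPair(String from, String to) { this.from = from; this.to = to; }

    public void write(DataOutput out) throws IOException {
        out.writeUTF(from);
        out.writeUTF(to);
    }

    public void readFields(DataInput in) throws IOException {
        from = in.readUTF();
        to = in.readUTF();
    }

    public int compareTo(MailPair o) {  // keys must be orderable for the shuffle/sort
        int c = from.compareTo(o.from);
        return c != 0 ? c : to.compareTo(o.to);
    }

    @Override
    public int hashCode() {  // used by the default HashPartitioner
        return from.hashCode() * 31 + to.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MailPair)) return false;
        MailPair p = (MailPair) o;
        return from.equals(p.from) && to.equals(p.to);
    }
}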
17. Example 1 - Java MapReduce (cont)
Read from MongoDB:

mongo.job.input.format=com.mongodb.hadoop.MongoInputFormat
mongo.input.uri=mongodb://my-db:27017/enron.messages

Read from BSON:

mongo.job.input.format=com.mongodb.hadoop.BSONFileInputFormat
mapred.input.dir=file:///tmp/messages.bson (or hdfs:///tmp/messages.bson, s3:///tmp/messages.bson)
18. Example 1 - Java MapReduce (cont)
Write output to MongoDB:

mongo.job.output.format=com.mongodb.hadoop.MongoOutputFormat
mongo.output.uri=mongodb://my-db:27017/enron.results_out

Write output to BSON:

mongo.job.output.format=com.mongodb.hadoop.BSONFileOutputFormat
mapred.output.dir=file:///tmp/results.bson (or hdfs:///tmp/results.bson, s3:///tmp/results.bson)
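These settings can also be wired up programmatically in a job driver. A minimal sketch using the Hadoop 2 Job API and the same URI keys as above (the driver class, job name, and the EnronMapper/EnronReducer names are hypothetical stand-ins for the map/reduce code shown earlier; package paths are as I recall them from the connector):

import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.MongoOutputFormat;
import com.mongodb.hadoop.io.BSONWritable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;

public class EnronJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // same keys as the property examples above
        conf.set("mongo.input.uri", "mongodb://my-db:27017/enron.messages");
        conf.set("mongo.output.uri", "mongodb://my-db:27017/enron.results_out");

        Job job = Job.getInstance(conf, "enron-graph");     // illustrative job name
        job.setJarByClass(EnronJob.class);
        job.setInputFormatClass(MongoInputFormat.class);    // read input splits from MongoDB
        job.setOutputFormatClass(MongoOutputFormat.class);  // write results back to MongoDB
        job.setMapperClass(EnronMapper.class);              // hypothetical class holding map()
        job.setReducerClass(EnronReducer.class);            // hypothetical class holding reduce()
        job.setMapOutputKeyClass(MailPair.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(BSONWritable.class);
        job.setOutputValueClass(IntWritable.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}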
20. Example 2 - Hadoop Streaming
Let’s do the same Enron MapReduce job with
Python instead of Java
$ pip install pymongo_hadoop
21. Example 2 - Hadoop Streaming (cont)
Hadoop passes data to an external process via STDOUT/STDIN.
[Diagram: Hadoop (JVM) streams map(k, v) records over STDIN to a Python / Ruby / JS interpreter, which returns its output over STDOUT]

def mapper(documents):
    . . .
22. Example 2 - Hadoop Streaming (cont)

import sys
from pymongo_hadoop import BSONMapper

def mapper(documents):
    i = 0
    for doc in documents:
        i = i + 1
        from_field = doc['headers']['From']
        to_field = doc['headers']['To']
        recips = [x.strip() for x in to_field.split(',')]
        for r in recips:
            yield {'_id': {'f': from_field, 't': r}, 'count': 1}

BSONMapper(mapper)
print >> sys.stderr, "Done Mapping."

BSONMapper is the pymongo_hadoop layer that translates between Hadoop Streaming and your Python code.
23. Example 2 - Hadoop Streaming (cont)

import sys
from pymongo_hadoop import BSONReducer

def reducer(key, values):
    print >> sys.stderr, "Processing from/to %s" % str(key)
    _count = 0
    for v in values:
        _count += v['count']
    return {'_id': key, 'count': _count}

BSONReducer(reducer)
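Streaming jobs are then submitted with the connector's streaming assembly jar. A hedged sketch of the launch command (the jar name and the -inputURI/-outputURI flags are from memory of the connector's streaming docs, so verify them against your version):

$ hadoop jar mongo-hadoop-streaming-assembly.jar \
    -mapper mapper.py \
    -reducer reducer.py \
    -inputURI mongodb://localhost:27017/enron.messages \
    -outputURI mongodb://localhost:27017/enron.results_out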
25. Example 3 - Mongo-Hadoop and Pig
Let's do the same thing yet again, but this time using Pig.
Pig is a powerful language that can generate sophisticated MapReduce workflows from simple scripts.
It can perform JOIN, GROUP, and execute user-defined functions (UDFs).
26. Example 3 - Mongo-Hadoop and Pig (cont)
Pig directives for loading data: BSONLoader and MongoLoader

data = LOAD 'mongodb://localhost:27017/db.collection'
    using com.mongodb.hadoop.pig.MongoLoader;

Writing data out: BSONStorage and MongoInsertStorage

STORE records INTO 'file:///output.bson'
    using com.mongodb.hadoop.pig.BSONStorage;
27. Example 3 - Mongo-Hadoop and Pig (cont)
Pig has its own special datatypes: Bags, Maps, and Tuples.
The Mongo-Hadoop Connector intelligently converts between Pig datatypes and MongoDB datatypes:
bags -> arrays
maps -> objects
28. Example 3 - Mongo-Hadoop and Pig (cont)

raw = LOAD 'hdfs:///messages.bson'
    using com.mongodb.hadoop.pig.BSONLoader('','headers:[]');
send_recip = FOREACH raw GENERATE $0#'From' as from, $0#'To' as to;
send_recip_filtered = FILTER send_recip BY to IS NOT NULL;
send_recip_split = FOREACH send_recip_filtered GENERATE
    from as from, TRIM(FLATTEN(TOKENIZE(to))) as to;
send_recip_grouped = GROUP send_recip_split BY (from, to);
send_recip_counted = FOREACH send_recip_grouped GENERATE
    group, COUNT($1) as count;
STORE send_recip_counted INTO 'file:///enron_results.bson'
    using com.mongodb.hadoop.pig.BSONStorage;
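One practical note: the script above assumes the connector is on Pig's classpath. A hedged sketch of the REGISTER statements typically needed first (the jar paths and names are illustrative and vary by version):

REGISTER /path/to/mongo-java-driver.jar;
REGISTER /path/to/mongo-hadoop-core.jar;
REGISTER /path/to/mongo-hadoop-pig.jar;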
29. Hive with Mongo-Hadoop
Similar idea to Pig: process your data without needing to write MapReduce code from scratch...
...but with SQL as the language of choice.
30. Hive with Mongo-Hadoop
Sample data, in db.users:

db.users.find()
{ "_id": 1, "name": "Tom", "age": 28 }
{ "_id": 2, "name": "Alice", "age": 18 }
{ "_id": 3, "name": "Bob", "age": 29 }
{ "_id": 101, "name": "Scott", "age": 10 }
{ "_id": 104, "name": "Jesse", "age": 52 }
{ "_id": 110, "name": "Mike", "age": 32 }
...

First, declare the collection to be accessible in Hive:

CREATE TABLE mongo_users (id int, name string, age int)
STORED BY "com.mongodb.hadoop.hive.MongoStorageHandler"
WITH SERDEPROPERTIES( "mongo.columns.mapping" = "_id,name,age" )
TBLPROPERTIES ( "mongo.uri" = "mongodb://localhost:27017/test.users");
31. Hive with Mongo-Hadoop
...then you can run SQL on it, like a table:

SELECT name, age FROM mongo_users WHERE id > 100;

You can use GROUP BY:

SELECT age, COUNT(*) FROM mongo_users WHERE id > 100 GROUP BY age;

or JOIN multiple tables/collections together:

SELECT * FROM mongo_users T1
JOIN user_emails T2 ON (T1.id = T2.id);

(a subset of SQL is supported)
32. Write the output of queries back into new tables:

INSERT OVERWRITE TABLE old_users SELECT id, name, age
FROM mongo_users WHERE age > 100;

Dropping a table in Hive deletes the underlying collection in MongoDB:

DROP TABLE mongo_users;

Use "external" when declaring your table to prevent the collection drop.
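Concretely, the earlier declaration becomes the following (same properties as above, just adding EXTERNAL), so that dropping the Hive table leaves the MongoDB collection intact:

CREATE EXTERNAL TABLE mongo_users (id int, name string, age int)
STORED BY "com.mongodb.hadoop.hive.MongoStorageHandler"
WITH SERDEPROPERTIES( "mongo.columns.mapping" = "_id,name,age" )
TBLPROPERTIES ( "mongo.uri" = "mongodb://localhost:27017/test.users");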
33. Usage with Amazon Elastic MapReduce
Run mongo-hadoop jobs without needing to set up or manage your own Hadoop cluster.
Pig, Hive, and streaming work on EMR, too!
Logs get captured into S3 files.
34. Usage with Amazon Elastic MapReduce
First, make a "bootstrap" script that fetches the dependencies (the mongo-hadoop jar and the Java driver):

#!/bin/sh
wget -P /home/hadoop/lib http://central.maven.org/maven2/org/mongodb/mongo-java-driver/2.12.2/mongo-java-driver-2.12.2.jar
wget -P /home/hadoop/lib https://s3.amazonaws.com/mongo-hadoop-code/mongo-hadoop-core_1.1.2-1.1.0.jar

This will get executed on each node in the cluster that EMR builds for us.
(Work is ongoing to update the Hadoop artifacts in Maven.)
35. Example 4 - Usage with Amazon Elastic MapReduce
Put the bootstrap script, and all your code, into an S3 bucket where Amazon can see it.

s3cp ./bootstrap.sh s3://$S3_BUCKET/bootstrap.sh
s3mod s3://$S3_BUCKET/bootstrap.sh public-read
s3cp $HERE/../enron/target/enron-example.jar s3://$S3_BUCKET/enron-example.jar
s3mod s3://$S3_BUCKET/enron-example.jar public-read
36. Example 4 - Usage with Amazon Elastic MapReduce
...then launch the job from the command line, pointing to your S3 locations.
Control the type and number of instances in the cluster:

$ elastic-mapreduce --create --jobflow ENRON000
    --instance-type m1.xlarge
    --num-instances 5
    --bootstrap-action s3://$S3_BUCKET/bootstrap.sh
    --log-uri s3://$S3_BUCKET/enron_logs
    --jar s3://$S3_BUCKET/enron-example.jar
    --arg -D --arg mongo.job.input.format=com.mongodb.hadoop.BSONFileInputFormat
    --arg -D --arg mapred.input.dir=s3n://mongo-test-data/messages.bson
    --arg -D --arg mapred.output.dir=s3n://$S3_BUCKET/BSON_OUT
    --arg -D --arg mongo.job.output.format=com.mongodb.hadoop.BSONFileOutputFormat
    # (any additional parameters here)
37. Example 4 - Usage with Amazon Elastic MapReduce
Easy to kick off a Hadoop job, without needing to manage a Hadoop cluster.
Pig, Hive, and streaming work on EMR, too!
Logs get captured into S3 files.
38. Example 5 - New Feature: MongoUpdateWritable
In the previous examples, we wrote job output data by inserting into a new collection.
But we can also modify an existing output collection.
Works by applying MongoDB update modifiers: $push, $pull, $addToSet, $inc, $set, etc.
Can be used to do incremental MapReduce or to "join" two collections.
39. Example 5 - MongoUpdateWritable
For example, let's say we have two collections.

Log events:

{
  "_id": ObjectId("51b792d381c3e67b0a18d678"),
  "sensor_id": ObjectId("51b792d381c3e67b0a18d4a1"),
  "value": 3328.5895416489802,
  "timestamp": ISODate("2013-05-18T13:11:38.709-0400"),
  "loc": [-175.13, 51.658]
}

Sensors:

{
  "_id": ObjectId("51b792d381c3e67b0a18d0ed"),
  "name": "730LsRkX",
  "type": "pressure",
  "owner": "steve",
}

The sensor_id of a log event refers to which sensor logged the event.
For each owner, we want to calculate how many events were recorded for each type of sensor that logged it.
40. For each owner, we want to calculate how many events were recorded for each type of sensor that logged it.
In plain English:
Bob's sensors for temperature have stored 1300 readings
Bob's sensors for pressure have stored 400 readings
Alice's sensors for humidity have stored 600 readings
Alice's sensors for temperature have stored 700 readings
etc...
41. We do this in two stages.
Stage 1: MapReduce on the sensors collection.
[Diagram: sensors collection → MapReduce → Results collection; read from MongoDB, insert() new records to MongoDB]
Map: for each sensor, emit {key: owner+type, value: _id}
Reduce: group the data from map() under each key and output {key: owner+type, val: [list of _ids]}
42. After stage one, the output docs look like:

{
  "_id": "alice pressure",
  "sensors": [
    ObjectId("51b792d381c3e67b0a18d475"),
    ObjectId("51b792d381c3e67b0a18d16d"),
    ObjectId("51b792d381c3e67b0a18d2bf"),
    ...
  ]
}

The _id is the sensor's owner and type; "sensors" is the list of IDs of sensors with this owner and type.
Now we just need to count the total # of log events recorded for any sensors that appear in the list for each owner/type group.
43. Stage 2: MapReduce on the log events collection.
[Diagram: log events collection → MapReduce → Results collection; read from MongoDB, update() existing records in MongoDB]
Map: for each log event, emit {key: sensor_id, value: 1}
Reduce: group the data from map() under each key, then for each value in that key:
update({sensors: key}, {$inc : {logs_count: 1}})

In the reducer, that update is emitted as:

context.write(null, new MongoUpdateWritable(
    query,    // which documents to modify
    update,   // how to modify ($inc)
    true,     // upsert
    false));  // multi
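A hedged sketch of how the query and update arguments above might be constructed inside the stage-2 reducer (BasicBSONObject is from the BSON library; sensorId and sum are illustrative stand-ins for the reducer's key and its summed event count):

import org.bson.BasicBSONObject;

// Match result docs whose "sensors" array contains this sensor's _id ...
BasicBSONObject query = new BasicBSONObject("sensors", sensorId);
// ... and increment their logs_count by the events counted for that sensor.
BasicBSONObject update = new BasicBSONObject("$inc",
        new BasicBSONObject("logs_count", sum));

context.write(null, new MongoUpdateWritable(query, update, true, false));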
44. Example - MongoUpdateWritable
Result after stage 2, now populated with the correct count:

{
  "_id": "1UoTcvnCTz temp",
  "sensors": [
    ObjectId("51b792d381c3e67b0a18d475"),
    ObjectId("51b792d381c3e67b0a18d16d"),
    ObjectId("51b792d381c3e67b0a18d2bf"),
    ...
  ],
  "logs_count": 1050616
}
45. New Features in v1.2 and beyond
Continually improving Hive support (the primary focus is Hive; Pig is next)
Performance improvements - lazy BSON
Support for multi-collection input sources
API for adding custom splitter implementations
...and more
Artifacts published to Maven Central
46. Recap
Mongo-Hadoop: use Hadoop to do massive computations on big data sets stored in MongoDB/BSON.
Tools and APIs make it easier: Streaming, Pig, Hive, EMR, etc.
MongoDB becomes a Hadoop-enabled filesystem.