A presentation on how Showyou uses the Riak datastore at Showyou.com, as well as work we've been doing on a custom Riak backend for search and analytics.
The document provides an overview of NoSQL databases and focuses on MongoDB and Riak. It discusses how these databases address the needs of web applications by providing flexibility, scalability, and high performance. Riak is highlighted as using a distributed architecture with no single point of failure and tunable consistency properties. Its ability to link documents and handle high availability through replication is also summarized.
James Turner (Caplin) - Enterprise HTML5 Patterns - akqaanoraks
Most HTML5 web applications are relatively small scale – they are maintained by a single team and contain relatively little JavaScript, CSS and HTML5 code.
At Caplin we build "thick client" replacement financial trading systems containing considerable business logic implemented in hundreds of thousands of lines of JavaScript code. The code is maintained by multiple development teams spread across multiple business units. The talk describes the problems faced and how they can be solved using componentization, loose coupling, services, an event bus, design patterns, BDD, the best open source libraries, test by contract, test automation, and more.
The document provides an overview of courses offered by Balujalabs related to ASP.NET, VB.NET, and other Microsoft technologies. It lists the duration, fees, and topics covered for courses in ASP.NET, VB.NET, Advanced .NET, and other technologies like LINQ, WPF, Silverlight, and WCF. It also provides contact information for Balujalabs and highlights their course structure which includes classroom guidance, study material, mock tests, personal attention, and tests.
Scala and Spark are ideal for big data applications. Scala is a functional programming language that runs on the Java Virtual Machine and has strong typing, concise syntax, and supports both object-oriented and functional programming. Spark is an open source cluster computing framework that provides fast, in-memory processing of large datasets across clusters of machines using its Resilient Distributed Datasets (RDDs). Using Scala with Spark provides benefits like leveraging Spark's Scala API and leveraging functional features of Scala that are a natural fit with Spark's programming model.
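The functional style that makes Scala a natural fit for Spark can be illustrated without a cluster. The sketch below is plain Python, not the Spark API: it walks through the flatMap/map/reduceByKey shape of the classic word-count pipeline, which is the same shape an RDD pipeline takes (Spark would additionally evaluate the transformations lazily and in parallel).

```python
# A toy, single-machine illustration (not actual Spark) of the
# flatMap -> map -> reduceByKey pipeline shape used by RDD programs.
lines = ["spark makes big data simple", "scala fits spark well"]

# "flatMap" step: split each line into words
words = [w for line in lines for w in line.split()]

# "map" step: pair each word with a count of 1
pairs = [(w, 1) for w in words]

# "reduceByKey" step: sum the counts per word
counts = {}
for word, n in pairs:
    counts[word] = counts.get(word, 0) + n

print(counts["spark"])  # "spark" appears twice across the two lines
```

In Spark the same program is three chained transformations on an RDD; the point of the sketch is that each step is a pure function over immutable data, which is exactly what Scala's collection API trains you to write.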
This document provides an overview of Central Log Management at the University of Cape Town. It discusses Splunk and the ELK stack for collecting, analyzing, and monitoring machine data from various sources. Splunk is featured for its collection, search, reporting, and alerting capabilities. The ELK stack deployed at UCT includes Logstash to process logs from firewalls and send them to Elasticsearch for storage and querying in Kibana for visualization. Shipper and indexer configurations are shown for ingesting Palo Alto firewall logs into Elasticsearch.
Scala and Spark are Ideal for Big Data - Data Science Pop-up Seattle - Domino Data Lab
Scala and Spark are each great tools for data processing and they work well together. They can process data via small simple interactive queries as well as in very large highly-available and scalable production systems. They provide an integrated framework for an ever growing wide range of data processing capabilities. We examine the reasons for this and also look a couple of simple data processing examples written in Scala. Presented by John Nestor, Sr Architect at 47 Degrees.
Presentation is about the Neo4j database. Some slides I have taken from other presentations as well, but it will give you a basic idea.
For sample exercises at the end, you can go with this schema:
1.) http://www.neo4j.org/graphgist?7820655
2.) Sample Movie Schema comes by default
Anatomy of Data Frame API: A deep dive into Spark Data Frame API - datamantra
In this presentation, we discuss the internals of the Spark Data Frame API. All the code discussed in this presentation is available at https://github.com/phatak-dev/anatomy_of_spark_dataframe_api
Using JPA applications in the era of NoSQL: Introducing Hibernate OGM - PT.JUG
Sanne Grinovero presented on Hibernate Object/Grid Mapper (OGM), which provides an object-oriented interface for NoSQL databases using JPA. OGM stores entities as serialized tuples and uses Lucene/Hibernate Search for querying. It reuses Hibernate Core and is targeted at Infinispan but also works with other NoSQL databases. The goals are to encourage new data usage patterns with a familiar programming model and ease of use while pushing NoSQL exploration in enterprises.
Datomic – A Modern Database - StampedeCon 2014 - StampedeCon
At StampedeCon 2014, Alex Miller (Cognitect) presented "Datomic – A Modern Database."
Datomic is a distributed database designed to run on next-generation cloud architectures. Datomic stores facts and retractions using a flexible schema, consistent transactions, and a logic-based query language. The focus on facts over time gives you the ability to look at the state of the database at any point in time and traverse your transactional data in many ways.
We’ll take a tour of the Datomic data model, transactions, query language, and architecture to highlight some of the unique attributes of Datomic and why it is an ideal modern database.
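The "facts over time" idea can be sketched in a few lines. The following is plain Python, not the Datomic API, and the entity and attribute names are invented for illustration: it stores assertions and retractions as immutable datoms tagged with a transaction id, so the database state at any past transaction can be reconstructed.

```python
# A minimal sketch (not Datomic itself) of an append-only fact store
# with "as of" queries. Each datom is (tx, entity, attribute, value, added?).
facts = []

def transact(tx, datoms):
    """Append assertions/retractions; nothing is ever updated in place."""
    for entity, attribute, value, added in datoms:
        facts.append((tx, entity, attribute, value, added))

def as_of(tx, entity, attribute):
    """Replay facts up to transaction tx to find the current value."""
    value = None
    for t, e, a, v, added in facts:
        if t <= tx and e == entity and a == attribute:
            value = v if added else None
    return value

transact(1, [("user-1", "email", "old@example.com", True)])
transact(2, [("user-1", "email", "old@example.com", False),   # retraction
             ("user-1", "email", "new@example.com", True)])   # new assertion

print(as_of(1, "user-1", "email"))  # old@example.com
print(as_of(2, "user-1", "email"))  # new@example.com
```

Because the update at transaction 2 is recorded as a retraction plus a new assertion rather than an overwrite, the query at transaction 1 still sees the old value — the property the talk highlights as looking at the database "at any point in time".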
Cassandra Summit 2015 - Building a multi-tenant API PaaS with DataStax Enterprise - Restlet
Lessons learned by Restlet when deploying DataStax Enterprise search with APISpark. Presentation by Jerome Louvel and Guillaume Blondeau at the Cassandra Summit 2015. Includes 7 challenges and solutions when deploying DataStax.
Masterless Distributed Computing with Riak Core - EUC 2010 - Rusty Klophaus
Riak Core--an open-source Erlang library created by Basho Technologies that powers Riak KV and Riak Search--allows developers to build distributed, scalable, failure-tolerant applications based on a generalized version of Amazon's Dynamo architecture. In this talk, Rusty will explain why Riak Core was built, discuss what problems it solves and how it works, and walk through the steps to using Riak Core in an Erlang application.
This document summarizes a presentation on the Elastic Stack. It discusses the main components - Elasticsearch for storing and searching data, Logstash for ingesting data, Kibana for visualizing data. It provides examples of using Elasticsearch for search, analytics, and aggregations. It also briefly mentions new features across the Elastic Stack like update by query, ingest nodes, pipeline improvements, and APIs for management and metrics.
TweetMogaz - The Arabic Tweets Platform: Presented by Ahmed Adel, BADR - Lucidworks
The document summarizes TweetMogaz, an Arabic tweets platform developed by BADR. It describes the key modules of the system including tweets processing, indexing, event detection, archiving and analytics. The system collects and analyzes Arabic tweets in real-time using Apache Solr, identifies trending topics and events, and allows users to browse, search and visualize tweets and analytics. It addresses challenges of analyzing micro-blogs and Arabic language variations. Future work includes improving the adaptive classifier and integrating statistical processing with R.
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana) - Cohesive Networks
Slides from the Chicago AWS user group on May 5th, 2016. Asaf Yigal, Co-Founder and VP Product at Logz.io, presented on using Elasticsearch, Logstash, and Kibana in Amazon Web Services.
"Setting up the increasingly-popular open-source ELK Stack (Elasticsearch, Logstash, and Kibana) on AWS might seem like an easy task, but we have gone through several iterations in our architecture and have made some mistakes in our deployments that have turned out to be common in the industry. In this talk, we will go through what we did and explain what worked and what failed -- and why. We will also provide a complete blueprint of how to set up ELK for production on AWS." ~ @asafyigal
Apache Arrow Workshop at VLDB 2019 / BOSS Session - Wes McKinney
Technical deep dive for database system developers in the Arrow columnar format, binary protocol, C++ development platform, and Arrow Flight RPC.
See demo Jupyter notebooks at https://github.com/wesm/vldb-2019-apache-arrow-workshop
1. Apache Spark is an open source cluster computing framework for large-scale data processing. It is compatible with Hadoop and provides APIs for SQL, streaming, machine learning, and graph processing.
2. Over 3000 companies use Spark, including Microsoft, Uber, Pinterest, and Amazon. It can run on standalone clusters, EC2, YARN, and Mesos.
3. Spark SQL, Streaming, and MLlib allow for SQL queries, streaming analytics, and machine learning at scale using Spark's APIs which are inspired by Python/R data frames and scikit-learn.
Introduction to Structured Data Processing with Spark SQL - datamantra
An introduction to structured data processing using the Data Source and DataFrame APIs of Spark. Presented at Bangalore Apache Spark Meetup by Madhukara Phatak on 31/05/2015.
Logging is one of those things that everyone complains about but doesn't dedicate time to. Of course, the first rule of logging is "do it". Without it, you have no visibility into system activity when investigations are required. But the end goal is much more than this. Almost all applications require security audit logs for compliance; application logs for visibility across all cloud properties; and application tracing for tracking usage patterns and business intelligence. The latter is the magic sauce that helps businesses learn about their customers, and in some cases the data is FOR the customer. Without a strategy this can get very messy, fast. In this session Michele will discuss design patterns for a sound logging and audit strategy; considerations for security and compliance; the benefits of a NoSQL approach; and more.
The document introduces Datomic, an immutable database with an architecture that separates reads, writes, and storage. It has several key benefits, including built-in data distribution and caching, elastic scaling, and a data model based on immutable facts rather than embedded structures. The programming model uses a peer embedded in applications to pull indexed data as needed, and supports transactional updates and time-based queries using a declarative Datalog language.
Introduction to CosmosDB - Azure Bootcamp 2018 - Josh Carlisle
Josh Carlisle introduces Azure Cosmos DB, a globally distributed, multi-model database service. Cosmos DB offers turnkey global distribution, high availability up to 99.999%, and low latency reads and writes typically under 10ms. It uses request units to reserve throughput and ensure service level agreements. Cosmos DB supports multiple APIs including MongoDB, SQL, Cassandra, and table storage and scales elastically.
Apache Arrow at DataEngConf Barcelona 2018 - Wes McKinney
Wes McKinney is a leading open source developer who created Python's pandas library and now leads the Apache Arrow project. Apache Arrow is an open standard for in-memory analytics that aims to improve data sharing and reuse across systems by defining a common columnar data format and memory layout. It allows data to be accessed and algorithms to be reused across different programming languages with near-zero data copying. Arrow is being integrated into various data systems and is working to expand its computational libraries and language support.
Riak 2.0: For Beginners, and Everyone Else - Engin Yoeyen
This document provides an overview of Riak, a key-value database. It explains how Riak uses buckets and bucket types to organize data, and how it supports common CRUD operations via its HTTP API. It also covers Riak's use of eventual consistency and quorums to balance availability and partition tolerance. Data types like sets and maps allow Riak to understand and resolve conflicts in the data.
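The conflict-resolution point can be made concrete with a toy example. The sketch below is plain Python, not Riak's actual CRDT implementation (Riak's sets also track removals): when a value is shaped like a set, two replicas that diverged under concurrent writes can be merged deterministically by union, with no "winner" to pick.

```python
# A grow-only-set sketch of why set-shaped data types let the store
# resolve conflicts itself. Two replicas accept concurrent writes
# while partitioned, then merge by union when they reconnect.
replica_a = {"alice", "bob"}     # write accepted on replica A
replica_b = {"alice", "carol"}   # concurrent write on replica B

merged = replica_a | replica_b   # deterministic, order-independent merge
print(sorted(merged))            # ['alice', 'bob', 'carol']
```

With an opaque value, the same divergence would surface as siblings the application must reconcile; declaring the value as a set lets the database do the merge.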
The document discusses the production of a music video. It describes using Final Cut Express to edit video footage from an HD camera, including a focus group, animatic, and final music video. Effects were added to make it look like a professional music video. An iMac was used for editing footage as well as accessing websites to support various aspects of producing the music video.
The document presents the format for proposing the topic of a degree-project preliminary proposal on a management audit of the digital television channel Tv Canal 28 in Morona, Ecuador. It includes sections on the topic, the problem, the justification, the objectives, and the theoretical-conceptual framework on internal management auditing. The general objective is to carry out a management audit to improve productivity through control of effectiveness, efficiency, and economy in the channel's service.
This document provides guidance for internal PSMs on inputting opportunities in Siebel, including which fields to populate for standard vs. sub-opportunities vs. partner opportunities. It also recommends reviewing the activation kit, creating logins for partners, and contacting specific people with any questions about the activation kit, PSPs, or reporting.
This document discusses security and ethics issues related to data breaches. It notes that in recent years there have been some massive data breaches, including over 77 million customer account details being hacked from Sony's PS3 network. Malware and SQL injection have accounted for many breaches. A 2010 report found that over 71% of hacking attacks were carried out via remote access and desktop services. While awareness programs are helpful for prevention, encryption is also gaining ground. Data breaches increased 7% in 2010 and can be very costly for companies to resolve. Going forward, securing data access, inhibiting copying of resources, keeping systems updated, and using strong passwords and encryption were recommended to help prevent further breaches.
Ralph the bunny got an idea to save himself and the other barnyard animals from attacking foxes by throwing Easter eggs and dye at the foxes from the attic window. When the foxes tried to get Ralph with a ladder and tree, Ralph scared them away by kicking a soccer ball that caused a chain reaction among the other animals, culminating in an angry bull chasing the foxes off for good. In the end, Ralph was rewarded with coffee cake and got to play soccer with the other animals to celebrate being safe.
This document provides information on voter eligibility and voting locations for residents of Phoenix, Arizona. It states that to vote in Arizona you must be 18 years or older, a US citizen, an Arizona resident for at least 29 days, not convicted of a felony or declared incapacitated, and registered to vote at least 29 days before the election. It then lists over 20 specific polling locations around Phoenix where residents can cast their ballot on election day. It also notes that a national mail voter registration form can be used to register to vote in federal elections from any state. The document concludes by reminding voters to bring valid photo ID to the polls.
New Haven, Connecticut was founded in 1638 by Reverend John Davenport and Theophilus Eaton, who hoped to develop the city's harbor for trade and transportation. While international trade increased in the 18th century, New Haven remained largely agricultural until later. In the 19th century, New Haven saw growth of its Long Wharf, oyster industry, and transportation including steamboats, canals, and railroads, transforming the city and economy. Recreational activities also emerged along the waterfront for residents.
The document summarizes the key elements included on the front and back cover of a digipak for an album. The front includes the artist name, album title, and image to identify the artist. The back lists the songs on the two CDs, barcode, website for additional information, and logo representing the compact disc format. Overall the digipak is designed to introduce customers to the artist and album through visual elements and tracklisting.
This document discusses product and price management. It covers topics such as product mix, new product development, product life cycles, pricing objectives, pricing strategies, and responding to competitors' pricing actions. The key points are:
1) A company's product mix considers the number of product lines, individual products, and variations of each product.
2) New product success rates are low, with most new products being improvements on existing concepts.
3) Products go through different stages in their life cycles from introduction to growth, maturity, and decline.
4) Setting prices involves determining objectives, demand, costs, competitors, and selecting a pricing method.
5) Pricing strategies include discounts,
Forefront for Exchange - Sales Training (ES) - Fitira
This document describes Forefront Server Security for Exchange, a security solution that includes 8 antivirus engines integrated directly with Microsoft Exchange Server 2007. The solution offers protection against viruses, file and content filtering, and administrator notifications. It includes a free 120-day trial.
Simplify and run your development environments with Vagrant on OpenStack - B1 Systems GmbH
Here are the steps to resolve the network issue:
1. Create a new internal network (e.g. 192.168.0.0/24)
2. Create a new router
3. Add the PublicNetwork as the gateway for the router
4. Add the internal network as an interface to the router
This will allow instances on the internal network to get floating IPs from the PublicNetwork via the router. The original error indicates direct access to the external network is forbidden, so routing traffic through an internal network and router is required.
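The steps above could be scripted with the openstacksdk Python library. This is a hedged sketch, not a tested recipe: the cloud name, network and router names, and the CIDR are examples, and "PublicNetwork" must match the external network's actual name in your deployment.

```python
# Sketch of the four steps using openstacksdk (names/CIDR are examples).
import openstack

conn = openstack.connect(cloud="mycloud")  # credentials from clouds.yaml

# 1. Create the internal network and a subnet on it
net = conn.network.create_network(name="internal")
subnet = conn.network.create_subnet(
    network_id=net.id, name="internal-subnet",
    ip_version=4, cidr="192.168.0.0/24")

# 2. + 3. Create a router with PublicNetwork as its external gateway
public = conn.network.find_network("PublicNetwork")
router = conn.network.create_router(
    name="vagrant-router",
    external_gateway_info={"network_id": public.id})

# 4. Attach the internal subnet to the router
conn.network.add_interface_to_router(router, subnet_id=subnet.id)
```

After this, instances launched on the internal network can be given floating IPs allocated from PublicNetwork, since their traffic reaches the external network through the router.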
The Poker Entrepreneurship: Speaking @ JFDI.Asia - saumilnanavati
This is a talk I gave at JFDI.Asia incubator in Singapore on March 23, 2012.
Most tips for budding entrepreneurs are about the "how-tos" of a start-up company. This is a RARE look into how to build and manage success (& risks) as a start-up entrepreneur.
This document provides an overview and contents for a 3-day Microsoft Inside Selling training module. The training is designed to equip sales representatives with core phone selling skills, including call preparation, questioning techniques, handling objections, and qualifying opportunities. Trainees will practice skills through role-playing and dedicated calling sessions with feedback. The module covers call opening, probing customers' needs, proving value, and advancing opportunities to closing.
The document discusses new features in the latest release of the Firefox browser including improved developer tools, a 3D view of pages using WebGL, and improved simplicity, speed and security. The Firefox browser is aiming to become the world's number one browser.
This document provides an overview and contents for a 3-day Microsoft Inside Selling training module. The training is designed to equip sales representatives with core phone selling skills like call preparation, questioning techniques, handling objections, and qualifying opportunities. The module consists of lessons on call structure, effective openings, probing questions, proving value, and next steps. Participants practice skills through role-plays and calling sessions with feedback. The goal is for reps to maximize effectiveness and results from phone interactions.
This document discusses processing large graphs. It introduces graph processing with MapReduce and Apache Giraph. MapReduce algorithms for finding triangles and connected components in graphs are described. The limitations of MapReduce for graph processing are discussed. Alternative graph processing technologies including Neo4j, a graph database, are presented. A movie recommendation use case is demonstrated using Neo4j to find similar users and recommend unseen movies.
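One of the graph computations mentioned, triangle counting, is easy to show in miniature. The sketch below is a single-machine Python version of the idea on an invented toy graph; the MapReduce variant described in such talks distributes the same neighbor-intersection step across mappers and reducers.

```python
# Count triangles in a small undirected graph: a triangle exists when
# two neighbors of a node are themselves connected. Iterating this per
# node counts each triangle once at each of its three corners.
from itertools import combinations

edges = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}

# Build an undirected adjacency map
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

count = 0
for node, neighbors in adj.items():
    for u, v in combinations(sorted(neighbors), 2):
        if v in adj[u]:
            count += 1

print(count // 3)  # one triangle: a-b-c
```

The division by three corrects for counting the same triangle from each of its vertices; a distributed version partitions nodes and ships neighbor lists so the `v in adj[u]` check can run in a reduce phase.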
This document provides an overview of Riak, a distributed NoSQL database. It discusses Riak's origins from Dynamo and Akamai, its support for the CAP theorem by allowing tradeoffs between consistency, availability, and partition tolerance. Key features covered include Riak being homogeneous, using a single keyspace, being distributed and replicated across nodes, providing predictable scalability, and being data agnostic. The document also discusses concepts like conflict resolution, replication, data distribution using vnodes, and extra features including MapReduce, links, hooks, and backends.
Tuning Flink For Robustness And PerformanceStefan Richter
Flink's stateful stream processing engine presents a huge variety of optional features and configuration choices to the user. Figuring out the "optimal" choices for any production environment and use-case can therefore often be challenging. In this talk, we will explore and discuss the universe of Flink configuration with respect to robustness and performance.
We will start with a closer look under the hood, at core data structures and algorithms, to build the foundation for understanding the impact of tuning parameters and the costs-benefit-tradeoffs that come with certain features and options. In particular, we will focus on state backend choices (Heap vs RocksDB), tuning checkpointing (incremental checkpoints, ...) and recovery (local recovery), file systems, TTL state, and considerations for the network stack. This also includes a discussion about estimating memory requirements and memory partitioning.
Stefan Richter
Flink Forward 2018
Berlin
Flink Forward Berlin 2018: Stefan Richter - "Tuning Flink for Robustness and ...Flink Forward
Flink's stateful stream processing engine presents a huge variety of optional features and configuration choices to the user. Figuring out the ""optimal"" choices for any production environment and use-case can therefore often be challenging. In this talk, we will explore and discuss the universe of Flink configuration with respect to robustness and performance.
We will start with a closer look under the hood, at core data structures and algorithms, to build the foundation for understanding the impact of tuning parameters and the costs-benefit-tradeoffs that come with certain features and options. In particular, we will focus on state backend choices (Heap vs RocksDB), tuning checkpointing (incremental checkpoints, ...) and recovery (local recovery), file systems, TTL state, and considerations for the network stack. This also includes a discussion about estimating memory requirements and memory partitioning.
OSDC 2012 | Scaling with MongoDB by Ross LawleyNETWAYS
MongoDB's architecture features built-in support for horizontal scalability, and high availability through replica sets. Auto-sharding allows users to easily distribute data across many nodes. Replica sets enable automatic failover and recovery of database nodes within or across data centers. This session will provide an introduction to scaling with MongoDB by one of the developers working on the project.
The document introduces Ruby on Rails and provides an overview of its features and benefits. It summarizes the speaker's experience with web development over time, introduces MVC and ORM concepts, and demonstrates Rails through a live coding example. Key advantages of Rails highlighted include its convention over configuration approach, use of Ruby as a dynamic scripting language, and ability to rapidly develop database-backed web applications.
This document discusses building a social analytics tool using MongoDB from a developer's perspective. It covers using MongoDB for its schema-less data and ability to handle fast read-write operations. Key topics include using aggregation queries to gain insights from data by chaining queries together and filtering/manipulating results at each stage. JavaScript capabilities in MongoDB allow applying business logic directly to data. Examples demonstrate removing garbage data and stopwords. Indexes, current progress, and tips/tricks learned around cloning collections and removing vs dropping are also covered, with a demo planned.
This document discusses how bookmarklets can function as applications by interacting with web pages in a secure manner. It describes how the bookmarklet uses elementFromPoint for fast hit detection, resets CSS to robustly render its UI, and transmits data to a server through signed cross-domain POST messages for security. Examples of embedding the bookmarklet code on a page and customizing its appearance are also provided.
This document summarizes lessons learned from building the Dutch public broadcasting company's website omroep.nl. Key points include:
- The site was built using Ruby on Rails with 6 developers over 6 months to handle 30,000-40,000 daily pageviews and traffic spikes.
- Extensive testing was done including over 2,000 RSpec tests and 410 Cucumber scenarios to help ensure quality.
- Caching was heavily used to improve performance including caching pages, fragments, and external data from feeds.
- Resilience was important given the large amounts of external data from various sources, and errors were rescued and logged.
- Ongoing monitoring and optimization was needed to
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]Kevin Xu
This presentation was delivered at the NYC SQL meetup on September 27, 2018. It provided a technical overview of the TiDB Platform, a deep dive into TiDB's MySQL compatible layer and MySQL ecosystem tools, use case of Mobike, and appendix with detail materials on coprocessor and transaction model.
This document provides lessons learned from building the Dutch public broadcasting company's website omroep.nl. Key points include using Ruby on Rails, BDD with RSpec and Cucumber, caching everything possible, rescuing errors, testing extensively, and handling large amounts of external data from various XML/RSS feeds and APIs. Performance was optimized through techniques like moving static assets to a front proxy, page caching, fragment caching, and using Memcache. The team of 6 people built the CMS from scratch over 6 months.
Tweaking perfomance on high-load projects_Думанский ДмитрийGeeksLab Odessa
This document discusses optimizing the performance of several high-load projects delivering billions of requests per month. It summarizes the evolution and delivery loads of different projects over time. It then analyzes the technical stacks and architectures used, identifying problems and solutions implemented around areas like querying, data storage, processing, and networking. Key lessons learned are around sharding and resharding data, optimizing I/O, using streaming processing like Storm over batch processing like Hadoop, and working within AWS limits and capabilities.
The Performance Engineer's Guide To HotSpot Just-in-Time CompilationMonica Beckwith
Adaptive compilation and runtime in the OpenJDK Hotspot VM offers significant performance enhancements for our tools and applications in Java and other JVM languages. Understanding how it works provides developers with critical information on the Java HotSpot JIT compilation and runtime techniques such as vectorization, compressed OOPs etc., to assist in understanding performance for both client and server applications. We will focus on the internals of OpenJDK 8, the reference implementation for Java SE 8.
The document discusses breaking the protections of the ionCube virtual machine used to protect PHP code. It describes how ionCube works at a high level to protect intellectual property in PHP code. It then details the steps taken to break ionCube protections and extract the raw protected PHP code, including decoding the base64 encoded data, validating headers and values against hard-coded constants, and interpreting encrypted values in the header. The goal is to understand how ionCube's protections work internally so the encoded PHP code can be executed natively.
The document discusses Java 8 streams and stream performance. It provides background on streams and why they were introduced in Java 8. It discusses sequential and parallel streams, how to visualize them, and practical benefits. It covers microbenchmarking and a case study comparing a sequential grep implementation to a parallelized version. Key points are that streams can improve readability but performance must be tested, parallelism helps if the workload is large enough to outweigh overhead, and stream sources need to be splittable for parallelism.
This document provides an overview and summary of TiDB, an open-source distributed SQL database compatible with MySQL. It discusses TiDB's architecture which includes TiDB for the SQL layer, TiKV for storage, and PD for placement driving. TiDB provides features like horizontal scalability, distributed transactions, and high availability. Example use cases are also presented, like Mobike's use of TiDB for locking/unlocking bikes and real-time analytics of bike usage data across 200 cities in China.
This document discusses optimizing performance for high-load projects. It summarizes the delivery loads and technologies used for several projects including mGage, mobclix and XXXX. It then discusses optimizations made to improve performance, including using Solr for search, Redis for real-time data, Hadoop for reporting, and various Java optimizations in moving to Java 7. Specific optimizations discussed include reducing garbage collection, improving random number generation, and minimizing I/O operations.
Things to Consider When Choosing a Website Developer for your Website | FODUUFODUU
Choosing the right website developer is crucial for your business. This article covers essential factors to consider, including experience, portfolio, technical skills, communication, pricing, reputation & reviews, cost and budget considerations and post-launch support. Make an informed decision to ensure your website meets your business goals.
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Speck&Tech
ABSTRACT: A prima vista, un mattoncino Lego e la backdoor XZ potrebbero avere in comune il fatto di essere entrambi blocchi di costruzione, o dipendenze di progetti creativi e software. La realtà è che un mattoncino Lego e il caso della backdoor XZ hanno molto di più di tutto ciò in comune.
Partecipate alla presentazione per immergervi in una storia di interoperabilità, standard e formati aperti, per poi discutere del ruolo importante che i contributori hanno in una comunità open source sostenibile.
BIO: Sostenitrice del software libero e dei formati standard e aperti. È stata un membro attivo dei progetti Fedora e openSUSE e ha co-fondato l'Associazione LibreItalia dove è stata coinvolta in diversi eventi, migrazioni e formazione relativi a LibreOffice. In precedenza ha lavorato a migrazioni e corsi di formazione su LibreOffice per diverse amministrazioni pubbliche e privati. Da gennaio 2020 lavora in SUSE come Software Release Engineer per Uyuni e SUSE Manager e quando non segue la sua passione per i computer e per Geeko coltiva la sua curiosità per l'astronomia (da cui deriva il suo nickname deneb_alpha).
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
OpenID AuthZEN Interop Read Out - AuthorizationDavid Brossard
During Identiverse 2024 and EIC 2024, members of the OpenID AuthZEN WG got together and demoed their authorization endpoints conforming to the AuthZEN API
Infrastructure Challenges in Scaling RAG with Custom AI modelsZilliz
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfTechgropse Pvt.Ltd.
In this blog post, we'll delve into the intersection of AI and app development in Saudi Arabia, focusing on the food delivery sector. We'll explore how AI is revolutionizing the way Saudi consumers order food, how restaurants manage their operations, and how delivery partners navigate the bustling streets of cities like Riyadh, Jeddah, and Dammam. Through real-world case studies, we'll showcase how leading Saudi food delivery apps are leveraging AI to redefine convenience, personalization, and efficiency.
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU und die Lizenzen nach dem CCB- und CCX-Modell sind für viele in der HCL-Community seit letztem Jahr ein heißes Thema. Als Notes- oder Domino-Kunde haben Sie vielleicht mit unerwartet hohen Benutzerzahlen und Lizenzgebühren zu kämpfen. Sie fragen sich vielleicht, wie diese neue Art der Lizenzierung funktioniert und welchen Nutzen sie Ihnen bringt. Vor allem wollen Sie sicherlich Ihr Budget einhalten und Kosten sparen, wo immer möglich. Das verstehen wir und wir möchten Ihnen dabei helfen!
Wir erklären Ihnen, wie Sie häufige Konfigurationsprobleme lösen können, die dazu führen können, dass mehr Benutzer gezählt werden als nötig, und wie Sie überflüssige oder ungenutzte Konten identifizieren und entfernen können, um Geld zu sparen. Es gibt auch einige Ansätze, die zu unnötigen Ausgaben führen können, z. B. wenn ein Personendokument anstelle eines Mail-Ins für geteilte Mailboxen verwendet wird. Wir zeigen Ihnen solche Fälle und deren Lösungen. Und natürlich erklären wir Ihnen das neue Lizenzmodell.
Nehmen Sie an diesem Webinar teil, bei dem HCL-Ambassador Marc Thomas und Gastredner Franz Walder Ihnen diese neue Welt näherbringen. Es vermittelt Ihnen die Tools und das Know-how, um den Überblick zu bewahren. Sie werden in der Lage sein, Ihre Kosten durch eine optimierte Domino-Konfiguration zu reduzieren und auch in Zukunft gering zu halten.
Diese Themen werden behandelt
- Reduzierung der Lizenzkosten durch Auffinden und Beheben von Fehlkonfigurationen und überflüssigen Konten
- Wie funktionieren CCB- und CCX-Lizenzen wirklich?
- Verstehen des DLAU-Tools und wie man es am besten nutzt
- Tipps für häufige Problembereiche, wie z. B. Team-Postfächer, Funktions-/Testbenutzer usw.
- Praxisbeispiele und Best Practices zum sofortigen Umsetzen
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
4. Riak
An awesome NoSQL data store:
• Super easy to scale up AND down
• Fault tolerant – no SPoF (single point of failure)
• Flexible schema
• Full-text search out of the box
• Can be fixed and improved in Erlang (the Basho folks awesomely take our commits)
5. Riak – Basics
• Data in Riak is grouped into buckets (effectively namespaces)
• Basic operations are: get, save, delete, search, map, reduce
• Eventual consistency is managed through the N, R, and W bucket parameters
• Everything we put in Riak is JSON
• We talk to Riak through the excellent riak-js node library by Francisco Treacy
6. Data Model – Clips
[Diagram: a clip document with fields title, ctime, domain, author, mentions, annotation, and tags]
7. Data Model – Clips
Clips are the gateway to all of our data.
[Diagram: the clip with key ‘abc’ points to an HTML blob (key ‘abc’ in the blobs bucket) and to a comment cache holding the comments on clip ‘abc’: “F1rst”, “Nice clip yo!”, “Saw this on Reddit…”]
9. Riak Search
• Gets things out of Riak by something other than the primary key.
• You specify a schema (the types for the fields within a JSON object).
• Works great, but with one big gotcha:
– The index uses term-based partitioning instead of document-based partitioning
– Implication: joins + sort + pagination suck
– We know how to work around this
10. Riak Search – Querying
• Query syntax based on Lucene
• Basic Query
text:funny
• Compound Query
login:greg OR (login:gary AND tags:riak)
• Range Query
ctime:[98685879630026 TO 98686484430026]
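As a small illustration, the three query shapes above could be composed with a few string helpers. The helper names here are ours for illustration (not part of riak-js), and escaping of special characters is ignored:

```javascript
// Hypothetical helpers for building Lucene-style Riak Search query strings.
function term(field, value) {
  return field + ':' + value;
}

function and(...clauses) {
  return clauses.join(' AND ');
}

function range(field, from, to) {
  return field + ':[' + from + ' TO ' + to + ']';
}

// The three query shapes from the slide:
console.log(term('text', 'funny'));
// text:funny
console.log(term('login', 'greg') + ' OR (' + and(term('login', 'gary'), term('tags', 'riak')) + ')');
// login:greg OR (login:gary AND tags:riak)
console.log(range('ctime', 98685879630026, 98686484430026));
// ctime:[98685879630026 TO 98686484430026]
```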
11. Clipboard App Flow
[Sequence diagram with lanes: Client, node.js, Riak]
1. Client: go to clipboard.com/home
2. node.js → Riak: search the clips bucket with query = login:greg
3. Riak → node.js: top 20 results
4. node.js → Client: top 20 results; the client starts rendering
5. For each clip, Client → node.js: API request for the blob
6. node.js → Riak: GET from the blobs bucket
7. node.js → Client: return the blob; the client renders it
12. Clipboard Queries
login:greg
mentions:greg
ctime:[98685879630026 TO 98686484430026]
(Search)
13. Clipboard Queries cont.
login:greg AND tags:riak
login:greg AND text:node AND text:javascript
14. Uh oh
login:greg AND private:false
(login:greg matches only my clips; private:false matches 20% of all clips!)
login:greg AND text:iPhone
16. Doc Partition Query Processing
1. x AND y (sort z, start = 990, count = 10)
2. On Each node:
1. Perform x AND y
2. Sort on z
3. Slice [ 0 .. 1000 ]
4. Send to aggregator
3. On aggregator
1. Merge all results (N x 1000)
2. Slice [ 990 .. 1000 ]
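The steps above can be sketched as a toy in-memory simulation (our illustration of the scheme, not Riak code):

```javascript
// Document-partitioned query processing: every node stores whole documents,
// so it runs the full query locally and ships only its top (start + count)
// candidates to the aggregator.
function docPartitionQuery(nodes, matches, start, count) {
  const candidates = nodes.map(docs =>
    docs.filter(matches)               // step 2.1: x AND y on each node
        .sort((a, b) => a.z - b.z)     // step 2.2: sort on z
        .slice(0, start + count)       // step 2.3: slice [0 .. start+count]
  );
  return candidates
    .flat()                            // step 3.1: merge N x (start+count)
    .sort((a, b) => a.z - b.z)
    .slice(start, start + count);      // step 3.2: slice [start .. start+count]
}

const nodeA = [{ id: 1, z: 3 }, { id: 2, z: 1 }];
const nodeB = [{ id: 3, z: 2 }, { id: 4, z: 4 }];
const page = docPartitionQuery([nodeA, nodeB], d => true, 1, 2);
console.log(page.map(d => d.id)); // [ 3, 1 ]
```

Note that network traffic is bounded by N × (start + count) documents regardless of how many documents match.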
17. Term Partition Query Processing
1. x AND y (sort z, start = 990, count = 10)
2. On x node: search for x (and send all)
3. On y node: search for y (and send all)
4. On aggregator:
1. Do x AND y
2. Sort on z
3. Slice to [ 990 .. 1000 ]
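For contrast, the term-partitioned flow looks like this toy simulation (again our illustration, not Riak code):

```javascript
// Term-partitioned query processing: the postings for a term live on one
// node, so EVERY match for each term must cross the network to the
// aggregator before the AND, sort, and slice can happen there.
function termPartitionQuery(xPostings, yPostings, zOf, start, count) {
  const ySet = new Set(yPostings);          // all of y was shipped over
  return xPostings                          // all of x was shipped over
    .filter(key => ySet.has(key))           // step 4.1: x AND y on aggregator
    .sort((a, b) => zOf(a) - zOf(b))        // step 4.2: sort on z
    .slice(start, start + count);           // step 4.3: slice the page
}

const xPostings = ['k1', 'k2', 'k3'];       // every match for term x
const yPostings = ['k2', 'k3', 'k4'];       // every match for term y
console.log(termPartitionQuery(xPostings, yPostings, k => k.charCodeAt(1), 0, 10));
// [ 'k2', 'k3' ]
```

Here traffic is proportional to the total number of matches per term, which is what makes common terms like private:false so painful.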
18. Riak Search Issues
1. For any single term, all results must be sent back to the aggregator.
2. Incorrectly performs sort and slice (does slice, then sort).
3. ANDs take time O(MAX(|x|, |y|)) instead of O(MIN(|x|, |y|)).
4. All matches must be read to get the sort field.
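Issue 3 in miniature: if the larger postings list is already available as an index (here a Set), the AND only needs to walk the smaller list and probe, which is O(MIN(|x|, |y|)). Walking the larger list instead costs O(MAX(|x|, |y|)). A toy illustration, not Riak internals:

```javascript
// Drive the intersection from the smaller postings list and probe the
// larger one -- O(|smallList|) probes instead of a scan of the big list.
function intersectSmallFirst(smallList, largeIndex) {
  return smallList.filter(key => largeIndex.has(key));
}

const gregsClips = ['doc7', 'doc42'];                 // e.g. login:greg
const privateFalse = new Set(                         // e.g. private:false
  Array.from({ length: 100000 }, (_, i) => 'doc' + i)
);
console.log(intersectSmallFirst(gregsClips, privateFalse));
// touches 2 keys, not 100000
```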
19. Riak Search Fixes
1. Inline fields for short and common attributes.
2. Dynamic fields for precomputed ANDs.
3. PRESORT option for sorting without document reads.
20. Inline Fields
Nifty feature added recently to Riak Search.
Fields used only to prune the result set can be made inline for a big perf win.
The normal query is applied first – then results are filtered quickly with the inline “filter” query.
High storage cost – only viable for small fields!
21. Riak Search – Inline Fields cont.
login:greg AND private:false
becomes
Query - login:greg
Filter Query – private:false
private:false is efficiently applied only to the results of login:greg. Hooray!
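The query-then-filter flow can be simulated in a few lines (a sketch of the shape, not the Riak Search implementation). The key point is that the inline value is stored with the index entry itself, so the filter query never reads the document:

```javascript
// Run the normal query first, then filter the hits using only the inline
// values carried in the index entries -- no document reads required.
function searchWithInlineFilter(index, queryTerm, filterField, filterValue) {
  const hits = index[queryTerm] || [];
  return hits.filter(hit => hit.inline[filterField] === filterValue)
             .map(hit => hit.key);
}

const index = {
  'login:greg': [
    { key: 'clip1', inline: { private: 'false' } },
    { key: 'clip2', inline: { private: 'true' } },
  ],
};

console.log(searchWithInlineFilter(index, 'login:greg', 'private', 'false'));
// [ 'clip1' ]
```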
22. Fixing ANDs
But what about login:greg AND text:iPhone?
text field is too large to inline!
We had to get creative.
23. Dynamic Fields
Our solution: create a new field, text_u (u for user).
The field name has the user’s name substituted in place of u:
In greg’s clip: text:iPhone → text_greg:iPhone
In bob’s clip: text:iPhone → text_bob:iPhone
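Index-time expansion could look roughly like this (our illustrative convention; the field-naming scheme is an assumption, not a riak-js feature):

```javascript
// Duplicate the clip's text into a per-user field, so that
// "login:greg AND text:iPhone" can instead be asked as the single-term
// query "text_greg:iPhone".
function expandDynamicFields(clip) {
  return {
    ...clip.fields,
    ['text_' + clip.login]: clip.fields.text,
  };
}

const gregsClip = { login: 'greg', fields: { text: 'iPhone' } };
console.log(expandDynamicFields(gregsClip));
// { text: 'iPhone', text_greg: 'iPhone' }
```

The trade-off is the usual one for precomputed ANDs: index size grows, but the expensive intersection disappears at query time.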
24. Presort on Keys
• Our addition to the Riak code base.
• Does sort before slice.
• If PRESORT=key, then the docs are never read.
• Tremendous win (> 100x compared to M/R approaches).
25. Clip Keys
<Time (ms)><User (guid)><SHA1 of Value>
• Base64-encode each component
• Only use the first 4 characters of the user & content components
• Only 16 bytes total
Collisions? 1 in 17M if the same thing is clipped at the same time.
26. Our Query Processing
1. w AND (x AND y)
(sort z, start = 990, count = 10)
2. On w_x node: search for w_x (and send all)
3. On w_y node: search for w_y (and send all)
4. On aggregator:
1. Do w_x AND w_y
2. Sort on z
3. Slice to [ 990 .. 1000 ]
27. Summary
• Use inline fields for short and common bits
• Use dynamic fields for prebuilt ANDs
• Use keys that imply sort order
• Use same techniques for pagination
• Our approach yields search throughput that is 100x better than out of the box (and gets better as you scale outward).