The document discusses using Cassandra and Celery to solve big data problems. Cassandra is described as a distributed, fault-tolerant, and scalable NoSQL database that stores data as key-value pairs. Celery is presented as a tool for distributing tasks across multiple machines by adding tasks to queues. An example architecture is proposed using RabbitMQ to connect an application to Celery workers that interface with the Cassandra database. Configuration and usage of Cassandra, Celery, and their integration are outlined.
Big Data with Cassandra and Celery #bbmnk
1. Big Data with Cassandra and Celery
#bbmnk November 2013
Santi Camps Taltavull
@santicamps
@socialvane
2. The Problem (Big Data)
➲ Large volume of information (terabytes)
➲ Unstructured information
➲ Low density of useful information
➲ Very high processing capacity needed
➲ Little money
3. The Solutions Applied
➲ Distributed database: Cassandra
➲ Distributed task manager: Celery
➲ Message broker: RabbitMQ
➲ Application --> RabbitMQ --> Celery <--> Cassandra
➲ 4 initial servers
➲ 12 TB of capacity
➲ 208 GB of RAM
➲ 44 CPU cores
➲ Fault tolerant
➲ Redundant
➲ Very easily scalable
➲ And cheap!
4. Cassandra
➲ Born inside Facebook and later released as open source
➲ Adopted by the Apache foundation
➲ Twitter uses it too
➲ Written in Java
➲ It is a NoSQL database
➲ Data is stored as key -> value
6. Cassandra - Drawbacks
➲ No transaction management
➲ Coordination relies on timestamps
➲ In RandomPartitioner mode, rows cannot be ordered by key
➲ In RandomPartitioner mode, filtering is difficult
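The ordering drawback follows from how a hash partitioner places rows: each row key is hashed (RandomPartitioner uses MD5) and rows are arranged by the resulting token rather than by the key itself. The sketch below is a simplified illustration using only the standard library; the row keys are hypothetical and the token computation is not Cassandra's exact token range.

```python
import hashlib


def md5_token(row_key: str) -> int:
    """Map a row key to an integer token by hashing it with MD5 (simplified)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


# Hypothetical row keys, chosen so that their natural order is obvious.
row_keys = ["user:0001", "user:0002", "user:0003", "user:0004"]

# A hash-partitioned cluster walks rows in token order, not in key order.
by_token = sorted(row_keys, key=md5_token)

print("key order:  ", sorted(row_keys))
print("token order:", by_token)
```

Because neighbouring keys usually land on unrelated tokens, a range scan over keys has no meaningful order, which is also why filtering by key range is awkward in this mode.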
7. Cassandra - Characteristics
➲ Keyspace (namespace) = database
➲ Column Family = table
➲ Each row can have different columns
➲ A row can have millions of columns
➲ Everything is stored as key -> value
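A toy model can make this layout concrete. The sketch below is an in-memory stand-in written with the standard library only, not Cassandra's API: `ColumnFamily`, its methods, and the row and column names are all hypothetical. It also shows the timestamp coordination listed among the drawbacks, in a simplified "highest timestamp wins" form.

```python
import time
from collections import defaultdict


class ColumnFamily:
    """Toy column family: row key -> {column name: (value, timestamp)}."""

    def __init__(self):
        self.rows = defaultdict(dict)

    def insert(self, row_key, column, value, timestamp=None):
        # Keep the write only if it is at least as new as the stored one.
        ts = time.time() if timestamp is None else timestamp
        current = self.rows[row_key].get(column)
        if current is None or ts >= current[1]:
            self.rows[row_key][column] = (value, ts)

    def get(self, row_key):
        # Return the row as plain column -> value pairs, without timestamps.
        return {col: val for col, (val, _) in self.rows.get(row_key, {}).items()}


tweets = ColumnFamily()  # hypothetical column family

# Rows in the same column family do not need the same columns.
tweets.insert("row-1", "author", "alice", timestamp=1)
tweets.insert("row-1", "text", "hello", timestamp=1)
tweets.insert("row-2", "author", "bob", timestamp=1)

# Two conflicting writes to one column: the higher timestamp is kept,
# regardless of the order in which the writes arrive.
tweets.insert("row-2", "lang", "ca", timestamp=5)
tweets.insert("row-2", "lang", "en", timestamp=3)

print(tweets.get("row-1"))
print(tweets.get("row-2"))
```

Since there are no transactions, correctness depends on the timestamps attached to writes, so clocks on the writing machines need to be kept in sync.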
13. Celery
➲ Execution queues are configured
➲ N workers are started on M machines, listening on each queue
➲ Distributable tasks are marked in the code
➲ The execution queue for each task is defined
➲ Tasks can be called synchronously or asynchronously
➲ Very simple to deploy
➲ Very easy to scale
➲ Concurrency needs to be watched
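The workflow in this slide can be sketched without a broker. The code below is a standard-library stand-in for the idea, not Celery's API: the `task` decorator, the `delay` helper, the queue names, and `count_words` are hypothetical, and threads on one machine play the role of workers that would normally run on several machines behind RabbitMQ.

```python
import queue
import threading
from concurrent.futures import Future

# Hypothetical execution queues, one per kind of work.
QUEUES = {"fetch": queue.Queue(), "analyze": queue.Queue()}
WORKERS_PER_QUEUE = 2


def task(queue_name):
    """Mark a function as distributable and bind it to an execution queue."""
    def decorator(func):
        def delay(*args):
            # Asynchronous call: enqueue the work and return a handle at once.
            future = Future()
            QUEUES[queue_name].put((func, args, future))
            return future
        func.delay = delay
        return func
    return decorator


def worker(queue_name):
    """Consume one queue until a None message asks the worker to stop."""
    q = QUEUES[queue_name]
    while True:
        message = q.get()
        if message is None:
            break
        func, args, future = message
        try:
            future.set_result(func(*args))
        except Exception as exc:  # hand failures back to the caller
            future.set_exception(exc)


@task("analyze")
def count_words(text):
    return len(text.split())


# Start N workers listening on each queue.
threads = [
    threading.Thread(target=worker, args=(name,))
    for name in QUEUES
    for _ in range(WORKERS_PER_QUEUE)
]
for t in threads:
    t.start()

# Synchronous call: runs in the caller. Asynchronous call: runs in a worker.
sync_result = count_words("big data with cassandra")
async_result = count_words.delay("and celery").result(timeout=5)

# Shut the workers down cleanly: one stop message per worker.
for name in QUEUES:
    for _ in range(WORKERS_PER_QUEUE):
        QUEUES[name].put(None)
for t in threads:
    t.join()

print(sync_result, async_result)
```

The concurrency warning applies here too: tasks picked up by different workers run at the same time, so any state they share needs to be safe under concurrent access.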