3. Definition
Active Data Store
Data that at any point can be queried, manipulated, and transformed within a service layer
Meaning: Anything within a database, daemon, or service. Manipulating files on the filesystem doesn’t count.
Exception/Debate: Parquet / HDFS / Big Data Anything
4. General Disclaimer
I pick technology with this mantra:
The right tool for the right job
I will present with this mantra.
That does not mean technology A can’t be used for purpose B
(and especially if a faculty is set on it)
Demos will use Docker.
Remember containers are ephemeral. Once the container is destroyed, so too is your data.
5. PostgreSQL - High Level
● RDBMS - Relational Database Management System
● Enforced relationships (and schemas) between data types / models
● Clustering can be… a chore
● Writes can be slow (and compounded when clustered)
● Read performance reasonable
● Joins/Views a-plenty
● Granular access control
● Queries written using… SQL (shock)
● Memory footprint: Depends on usage
● Overall a solid service
6. PostgreSQL - Docker Playground / Connect Info
Zero to SQL Shell with Docker
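#!/bin/bash
# Recent postgres images refuse to start without a superuser password ("example" is a placeholder)
docker run -d --name postgres -e POSTGRES_PASSWORD=example postgres:latest
# Give the server a few seconds to initialize, then open psql directly as the postgres superuser
docker exec -ti postgres psql -U postgres
● In prompt: “\h” for help, “\q” to quit SQL Shell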
● Default port (when exposed) is 5432
● Many GUIs, default (pgAdmin, https://www.pgadmin.org/ ) is fantastic
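The speaker notes for this slide mention creating a table and running a SELECT *. A minimal sketch of that step, assuming the playground container is running under the name "postgres"; the `researchers` table, its columns, and the file name `pg_demo.sql` are made up for illustration:

```shell
#!/bin/bash
# Sketch: basic SQL against the playground container (assumed to be named "postgres").
# The "researchers" table and its rows are hypothetical example data.

SQL_FILE="pg_demo.sql"

cat > "$SQL_FILE" <<'SQL'
CREATE TABLE IF NOT EXISTS researchers (
    id   SERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    dept TEXT
);
INSERT INTO researchers (name, dept) VALUES ('Ada', 'Statistics'), ('Grace', 'Physics');
SELECT * FROM researchers;
SQL

# Only talk to Docker if the playground container is actually running.
if command -v docker >/dev/null 2>&1 &&
   docker ps --format '{{.Names}}' 2>/dev/null | grep -qx postgres; then
    # -i without -t so psql reads the statements from stdin.
    docker exec -i postgres psql -U postgres < "$SQL_FILE"
else
    echo "No running 'postgres' container; wrote $SQL_FILE only."
fi
```

Keeping the statements in a file makes them easy to rerun after the container has been destroyed and recreated.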
7. MongoDB - High Level
● NoSQL (JSON Document store)
● Schemaless: Record A and Record B can have completely differing schemas
● Clustering and maintenance are fairly easy
● Writes are fairly fast (eventual consistency across cluster)
● Reads are extremely fast
● No tables? No joins. No views.
● Per-Database RBAC (More complex when clustered)
● Custom Query Language (Fairly easy to learn)
● Dedupe: You like it (Depending on storage engine)
● Memory footprint: It’s C++
● Problems in the past give me pause
8. MongoDB - Docker Playground / Connect Info
Zero to Mongo Shell with Docker
#!/bin/bash
docker run -d --name mongo mongo:latest
# Images from 6.0 on ship "mongosh"; older images use the legacy "mongo" shell instead
docker exec -ti mongo mongosh
● “exit” exits, “help” helps
● Default port (when exposed) is 27017
● Many GUIs, I prefer “mongoclient” which is third-party OSS
https://docs.mongodb.com/ecosystem/tools/administration-interfaces/
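To make the "schemaless" point concrete, here is a rough sketch of basic create/read/update/delete calls, assuming the playground container is running under the name "mongo". The `notes` collection, its fields, and the file name `mongo_demo.js` are invented for the example:

```shell
#!/bin/bash
# Sketch: basic CRUD in the Mongo shell (container assumed to be named "mongo").
# The "notes" collection and its documents are hypothetical example data.

JS_FILE="mongo_demo.js"

cat > "$JS_FILE" <<'JS'
// Two documents with different fields in the same collection: no schema required.
db.notes.insertOne({ title: "first", tags: ["demo"] });
db.notes.insertOne({ title: "second", author: "someone", pages: 3 });
printjson(db.notes.find({ title: "second" }).toArray());
db.notes.updateOne({ title: "first" }, { $set: { reviewed: true } });
db.notes.deleteOne({ title: "second" });
JS

# Only talk to Docker if the playground container is actually running.
if command -v docker >/dev/null 2>&1 &&
   docker ps --format '{{.Names}}' 2>/dev/null | grep -qx mongo; then
    # Newer images ship mongosh, older ones the legacy mongo shell; use whichever exists.
    MONGO_SHELL=$(docker exec mongo sh -c 'command -v mongosh || command -v mongo')
    docker exec -i mongo "$MONGO_SHELL" --quiet < "$JS_FILE"
else
    echo "No running 'mongo' container; wrote $JS_FILE only."
fi
```

The statements run against the shell's default database, which is fine for a throwaway container.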
9. ElasticSearch - High Level
● NoSQL (Document Store)
● Has a “schema” per “index” (think a table but not really)
● Press button: Receive Cluster (so easy even a caveman could do it)
● Writes extremely fast (eventually consistent w/ reads)
● Reads extremely fast (depending on “query”)
● No tables? You guessed it. No joins. Views depend on the client (it reads fast)
● You like security? Hope you like iptables (or have lots of money)
● Query language: R-E-S-T-F-U-L (Sing it like Aretha) on top of Lucene (the same search library Solr uses)
● Dedupe: You like it
● Memory footprint: It’s Java.
● Self healing, set it and forget it. Very solid platform.
10. ElasticSearch - Connect
Zero to ElasticSearch “console” with Docker
#!/bin/bash
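# No "latest" tag is published for this image, so pin a version (7.17.10 is only an example)
docker run -d --name elastic -e discovery.type=single-node elasticsearch:7.17.10
# Startup takes a little while; once it is up, run curl inside the container
docker exec -ti elastic curl -i -XGET 'localhost:9200/'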
● It’s R-E-S-T-F-U-L so just curl it
● Default port (when exposed) is 9200
● Many GUIs (Kibana and Grafana do dashboards)
Kid in a candy store for features / GUIs
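Since everything goes through the REST interface, indexing and searching are also just curl calls. A minimal sketch, assuming the "elastic" container is running a 7.x or later image over plain HTTP with security disabled; the `logs` index, the document fields, and the file name `es_doc.json` are hypothetical:

```shell
#!/bin/bash
# Sketch: index one log-style document and search for it.
# Assumes a running container named "elastic" (7.x or later, security disabled).
# The "logs" index and the document fields are hypothetical example data.

DOC_FILE="es_doc.json"

cat > "$DOC_FILE" <<'JSON'
{ "level": "error", "service": "demo", "message": "disk almost full" }
JSON

# Only talk to Docker if the playground container is actually running.
if command -v docker >/dev/null 2>&1 &&
   docker ps --format '{{.Names}}' 2>/dev/null | grep -qx elastic; then
    # Index the document; refresh=true makes it searchable immediately.
    docker exec -i elastic curl -s -H 'Content-Type: application/json' \
        -XPOST 'localhost:9200/logs/_doc?refresh=true&pretty' --data-binary @- < "$DOC_FILE"
    # URI search: find documents whose "level" field matches "error".
    docker exec elastic curl -s 'localhost:9200/logs/_search?q=level:error&pretty'
else
    echo "No running 'elastic' container; wrote $DOC_FILE only."
fi
```

The index is created on first write with an inferred mapping, which is the "schema per index" idea from the high-level slide.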
11. Conclusion / Q&A
● Small sampling of services
● Try to fit the right tool (service) for the right job (data)
● If not: fit the right handle (query/interface) for the right researcher
● All else fails or researcher wants “something completely different”
Contact ARC-TS (Jeremy) and we’ll facilitate a decision
Pick My Brain Time
Editor's Notes
Goal is under ten minutes.
Slides will be made available for reference.
Enjoy the ride and save questions for after; the goal is that there’ll be plenty of time for that.
A datastore is an active service with an endpoint you can query, curl, run a client, etc. That is the basis for this definition.
There is some gray area when we start talking about HiveSQL, Hadoop, or anything big data. We’re ignoring those for now.
Read the mantra. Live the mantra. It is why anything I design is pluggable and I’m never married to a single solution.
If you don’t have access to a machine with Docker and you want to play with some of this later, see me after class.
At this point if you want practical hands-on experience with these things, Docker is the path of least resistance.
Also this is not a Docker presentation so that’s the extent of that.
Create a table, Select *, basic SQL commands
Postgres also DOES have a JSON Datatype but if you’re working with JSON… there are better options
Click for Bobby Tables
Click for Data Loss
Back in 2011… 1.0 vs 0.98
Yes they’ve gotten significantly better but they still have some data assurance issues
If you’re dealing with large redundant data such as log output (what this was built for) this will win 100% of the time
Click for Aretha