This document contains code for a basic word count program written in Java using Apache Spark. It defines Mapper and Reducer classes to count the frequency of words in a text file. The main method sets up the job configuration and runs the job. Other sections provide links about the history of Spark and summaries of Spark surveys from 2016 and 2017 focusing on trends in machine learning, streaming, and scaling Spark applications.
Talk given by Álvaro López García of the Instituto de Física de Cantabria at the Cloud Interoperability Week tutorial session of Cloud Plugfest 10 in Madrid, Spain, 19 Sep 2013.
Spraykatz is a credentials gathering tool automating remote procdump and parse of lsass process. Spraykatz is a tool able to retrieve credentials on Windows machines and large Active Directory environments. It simply tries to procdump machines and parse dumps remotely in order to avoid detections by antivirus softwares as much as possible.
Have you ever been annoyed at a new toy being broken by your 3-year old? I have. But guess what - every time I pick up some new technology I start behaving like a 3-year-old myself. I want to see what's going to happen if I press here or bend it there. Or indeed - how much the thing can be squeezed before it cracks or worse! Call it vandalism if you will, but it can be a very instructive process. By succumbing to this urge you start 'feeling' the different components of the technology. What it can do, what it's limits are. There's a term coined for this sort of technology 'feeling' - mechanical sympathy. Look it up it makes for a great reading.
Unlike the 3-year-old's toys that you need to throw away once broken, we're dealing with software here, so with tools like Docker under your belt the crash is, well, soft and easy to fix.
So let's inflict some pain on Apache Spark and see how and where it cracks.
After attending the workshop you'll be able to:
- deploy Apache Spark to run locally on your laptop using Docker
- run a simple distributed Python spark app in Jupyter Notebook
- see how it's all connected using Spark UIs
- follow the cycle: break it, feel it broken, fix it
Packet Filter is OpenBSD's system for filtering TCP/IP traffic and doing Network Address Translation. PF is also capable of normalizing and conditioning TCP/IP traffic, as well as providing bandwidth control and packet prioritization.
.NET Conf 2019 - Indexing and searching NuGet.org with Azure Functions and Se...Maarten Balliauw
Which NuGet package was that type in again? In this session, let's build a "reverse package search" that helps finding the correct NuGet package based on a public type.
Together, we will create a highly-scalable serverless search engine using Azure Functions and Azure Search that performs 3 tasks: listening for new packages on NuGet.org (using a custom binding), indexing packages in a distributed way, and exposing an API that accepts queries and gives our clients the best result.
https://blog.maartenballiauw.be/post/2019/07/30/indexing-searching-nuget-with-azure-functions-and-search.html
This power point presentation explains retrofit technology, it includes performance analysis, The libraries support, Implementation, serialization, Interfaces, Interface methods, header manipulation. You can learn more about retrofit implementation.
Quicli - From zero to a full CLI application in a few lines of RustDamien Castelltort
This talk presents step by step of easy it is to do a cli application using the rust crate Quicli by leverage a simple example from the crate. It was presented during the Montpellier Rust Meetup, https://bit.ly/mtp-rust
Quick Start Guide using Virtuozzo 7 (β) on AWS EC2Kentaro Ebisawa
Virtuozzo 7 was open sourced and available on Amazon EC2 since October 2015.
This document aims to give you a quick overview of steps to setup Virtuozzo on Amazon EC2.
Talk given by Álvaro López García of the Instituto de Física de Cantabria at the Cloud Interoperability Week tutorial session of Cloud Plugfest 10 in Madrid, Spain, 19 Sep 2013.
Spraykatz is a credentials gathering tool automating remote procdump and parse of lsass process. Spraykatz is a tool able to retrieve credentials on Windows machines and large Active Directory environments. It simply tries to procdump machines and parse dumps remotely in order to avoid detections by antivirus softwares as much as possible.
Have you ever been annoyed at a new toy being broken by your 3-year old? I have. But guess what - every time I pick up some new technology I start behaving like a 3-year-old myself. I want to see what's going to happen if I press here or bend it there. Or indeed - how much the thing can be squeezed before it cracks or worse! Call it vandalism if you will, but it can be a very instructive process. By succumbing to this urge you start 'feeling' the different components of the technology. What it can do, what it's limits are. There's a term coined for this sort of technology 'feeling' - mechanical sympathy. Look it up it makes for a great reading.
Unlike the 3-year-old's toys that you need to throw away once broken, we're dealing with software here, so with tools like Docker under your belt the crash is, well, soft and easy to fix.
So let's inflict some pain on Apache Spark and see how and where it cracks.
After attending the workshop you'll be able to:
- deploy Apache Spark to run locally on your laptop using Docker
- run a simple distributed Python spark app in Jupyter Notebook
- see how it's all connected using Spark UIs
- follow the cycle: break it, feel it broken, fix it
Packet Filter is OpenBSD's system for filtering TCP/IP traffic and doing Network Address Translation. PF is also capable of normalizing and conditioning TCP/IP traffic, as well as providing bandwidth control and packet prioritization.
.NET Conf 2019 - Indexing and searching NuGet.org with Azure Functions and Se...Maarten Balliauw
Which NuGet package was that type in again? In this session, let's build a "reverse package search" that helps finding the correct NuGet package based on a public type.
Together, we will create a highly-scalable serverless search engine using Azure Functions and Azure Search that performs 3 tasks: listening for new packages on NuGet.org (using a custom binding), indexing packages in a distributed way, and exposing an API that accepts queries and gives our clients the best result.
https://blog.maartenballiauw.be/post/2019/07/30/indexing-searching-nuget-with-azure-functions-and-search.html
This power point presentation explains retrofit technology, it includes performance analysis, The libraries support, Implementation, serialization, Interfaces, Interface methods, header manipulation. You can learn more about retrofit implementation.
Quicli - From zero to a full CLI application in a few lines of RustDamien Castelltort
This talk presents step by step of easy it is to do a cli application using the rust crate Quicli by leverage a simple example from the crate. It was presented during the Montpellier Rust Meetup, https://bit.ly/mtp-rust
Quick Start Guide using Virtuozzo 7 (β) on AWS EC2Kentaro Ebisawa
Virtuozzo 7 was open sourced and available on Amazon EC2 since October 2015.
This document aims to give you a quick overview of steps to setup Virtuozzo on Amazon EC2.
High-level Programming Languages: Apache Pig and Pig LatinPietro Michiardi
This slide deck is used as an introduction to the Apache Pig system and the Pig Latin high-level programming language, as part of the Distributed Systems and Cloud Computing course I hold at Eurecom.
Course website:
http://michiard.github.io/DISC-CLOUD-COURSE/
Sources available here:
https://github.com/michiard/DISC-CLOUD-COURSE
A case study of the usage of Gradle in the Ratpack web framework. First, we'll examine the Ratpack Gradle plugins, including their functionality, implementation, and testing. Next, we'll examine the build script for the Ratpack project itself. Here, we'll discuss various details of the project's build, including handling multiple projects, multiple types of testing, support for multiple styles of target hardware (developer workstations, cloud CI), and more. For each, we'll go over the desired behavior, how it was achieved, and why it was necessary.
Scalable and Flexible Machine Learning With Scala @ LinkedInVitaly Gordon
The presentation given by Chris Severs and myself at the Bay Area Scala Enthusiasts meetup. http://www.meetup.com/Bay-Area-Scala-Enthusiasts/events/105409962/
EuroPython 2015 - Big Data with Python and HadoopMax Tepkeev
Big Data - these two words are heard so often nowadays. But what exactly is Big Data ? Can we, Pythonistas, enter the wonder world of Big Data ? The answer is definitely “Yes”.
This talk is an introduction to the big data processing using Apache Hadoop and Python. We’ll talk about Apache Hadoop, it’s concepts, infrastructure and how one can use Python with it. We’ll compare the speed of Python jobs under different Python implementations, including CPython, PyPy and Jython and also discuss what Python libraries are available out there to work with Apache Hadoop.
Recent developments in Hadoop version 2 are pushing the system from the traditional, batch oriented, computational model based on MapRecuce towards becoming a multi paradigm, general purpose, platform. In the first part of this talk we will review and contrast three popular processing frameworks. In the second part we will look at how the ecosystem (eg. Hive, Mahout, Spark) is making use of these new advancements. Finally, we will illustrate "use cases" of batch, interactive and streaming architectures to power traditional and "advanced" analytics applications.
This slide shows you how to use Akka cluster in Java.
Source Code: https://github.com/jiayun/akka_samples
If you want to use the links in slide, please download the pdf file.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
4. package org.myorg;
import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;
public class WordCount {
public static class Map extends MapReduceBase implements Mapper<Lon
private final static IntWritable one = new IntWritable(1);
private Text word = new Text();
public void map(LongWritable key, Text value, OutputCollector<Tex
String line = value.toString();
StringTokenizer tokenizer = new StringTokenizer(line);
while (tokenizer.hasMoreTokens()) {
word.set(tokenizer.nextToken());
output.collect(word, one);
}
}
}
public static class Reduce extends MapReduceBase implements Reducer
public void reduce(Text key, Iterator values, OutputCollector<Tex
int sum = 0;
while (values.hasNext()) {
sum += values.next().get();
}
output.collect(key, new IntWritable(sum));
}
}
public static void main(String[] args) throws Exception {
JobConf conf = new JobConf(WordCount.class);
conf.setJobName("wordcount");
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(IntWritable.class);
conf.setMapperClass(Map.class);
conf.setCombinerClass(Reduce.class);
conf.setReducerClass(Reduce.class);
conf.setInputFormat(TextInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);
FileInputFormat.setInputPaths(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));
JobClient.runJob(conf);
}
}
wordcount