This document discusses building an inverted index to search text documents. It shows how to tokenize documents into words, build a postings list to map words to the documents that contain them, and use the postings list to search for words in documents. It also discusses additional enhancements like handling punctuation, stemming, stop words, and storing more metadata with postings.
Slides from a workshop I held for some colleagues during December 2010. Slides, exercises and solutions downloadable at http://kjeldahlnilsson.net/ruby101.zip
MongoDB supports replication for failover and redundancy. In this session we will introduce the basic concepts around replica sets, which provide automated failover and recovery of nodes. We'll cover how to set up, configure, and initiate a replica set; methods for using replication to scale reads; and proper architecture for durability.
One of the strongest arguments for using a NoSQL database is its focus on distribution, both for replication and sharding. This talk takes a short look at what replication is, why you should use it, and what is so difficult about it. We then look at MongoDB’s implementation in general and finally focus on what can go wrong. A practical demo shows how to find the right balance between performance and data safety and how to use it in your Java application.
RESTing with the new Yandex.Disk API, Clemens Аuer - Yandex
A first-hand report on experiences writing a Swift SDK on top of Yandex.Disk’s REST API. The presentation will begin with a short introduction to the Yandex.Disk service, including a comparison of the various APIs and SDKs available for integrating third-party products with Yandex.Disk, and then move on to focus on the necessary steps taken, and the experiences gathered while implementing a REST API-based SDK in Swift.
PuppetCamp SEA @ Blk 71 - Nagios in under 10 mins with Puppet - OlinData
Choon Ming, senior consultant at OlinData, gave an overview of how Puppet complements Nagios, and how you can make Puppet work with Nagios in under 10 minutes.
Dev Jumpstart: Build Your First App with MongoDB - MongoDB
New to MongoDB? This talk will introduce the philosophy and features of MongoDB. We’ll discuss the benefits of the document-based data model that MongoDB offers by walking through how one can build a simple app to store books. We’ll cover inserting, updating, and querying the database of books. This session will jumpstart your knowledge of MongoDB development, providing you with context for the rest of the day's content.
Percona Live 4/15/15: Transparent sharding database virtualization engine (DVE) - Tesora
Amrith Kumar of Tesora and Peter Boros of Percona present an in-depth exploration of transparent database scale-out using the Tesora DVE framework for MySQL.
See Personal Carbon Rationing (http://www.throwingbeans.org/personal_carbon_rationing.html) for more information.
While climate change races up the political agenda, governments are failing to offer bold solutions to the looming threat of global warming. Meanwhile, economists and environmentalists have begun to consolidate around the concept of personal carbon rationing, a simple framework which shares the right to pollute among the global population and uses market forces to promote low-carbon living.
This presentation outlines the arguments for and against carbon rationing, examines the alternative economic proposals and imagines the changes we would have to make in order to live within our rations.
A brief talk from Oxford Geek Night #10, outlining the concept of dynamic demand and introducing http://caniturniton.com
A video of this talk is available at http://ogn.s3.amazonaws.com/10-TomDyson.mp4
These are the slides for the second part of this multi-part series from the Learn Python Den Haag meetup group. It covers list comprehensions, dictionary comprehensions and functions.
A talk I gave at the June 2010 meeting of the London Ruby User Group. It's about the first bit of ruby I ever wrote, way back in 2003. A little bit of personal history, a little bit of ruby history, a whole lot of terrible code for you to learn from.
PLOTCON NYC: Behind Every Great Plot There's a Great Deal of Wrangling - Plotly
If you are struggling to make a plot, tear yourself away from stackoverflow for a moment and ... take a hard look at your data. Is it really in the most favorable form for the task at hand? Time and time again I have found that my visualization struggles are really a symptom of unfinished data wrangling. R has long had excellent facilities for data aggregation or "split-apply-combine": split an object into pieces, compute on each piece, and glue the result back together again. Recent developments, especially in the purrr package, have made "split-apply-combine" even easier and more general. But this requires a certain comfort level with lists, especially with lists that are columns inside a data frame. This is unfamiliar to most of us. I give an overview of this set of problems and match them up with solutions based on grouped, nested, and split data frames.
The basics of Python are rather straightforward. In a few minutes you can learn most of the syntax. There are some gotchas along the way that might appear tricky. This talk is meant to bring programmers up to speed with Python. They should be able to read and write Python.
Simply Business is starting to look into new tools to improve some of our mission-critical systems. There is one application that would benefit hugely from the concurrency and fault tolerance model offered by languages like Elixir.
To increase awareness and gauge interest in the technology, we will have a bootcamp dedicated to giving us more insights into how to build and architect applications using Elixir and OTP.
The bootcamp aims at slightly more advanced concepts, so to prepare the rest of the team to read the code and have a basic understanding of the constructs and tooling, we have organised a LevelUP session to talk about exactly that.
A slightly-modified version of my IPRUG talk, this time for the BT DevCon5 developer conference at Adastral Park on 25 May 2012.
The main changes are the addition of the Ruby section and the increased number of HHGTTG references in honour of towel day.
How can I make it so my code works so my command line can look like -p (1).docx - PaulntmMilleri
How can I make my code work so that the command line can look like "python3 edfs.py -create /user/file" instead of "python3 edfs.py -create /user file"? At present it only works with a space between the two parts.
Here is the code transcript (indents may not be accurate):
import sys
import json
import requests
import base64
# Firebase URL
firebase_url = "insert url here"
# Function to get the data from Firebase
def get_firebase_data(url):
    response = requests.get(url)
    data = json.loads(response.text)
    return data
# Function to list all files and directories under a given directory
def list_directory(dir_path):
    data = get_firebase_data(firebase_url + dir_path + ".json")
    if isinstance(data, dict):
        for key in data.keys():
            if key == "content":
                print(dir_path + ": " + data[key])
            else:
                list_directory(dir_path + "/" + key)
# Function to create a new file with the given content
def create_file(file_path, content):
    data = {"content": content}
    response = requests.put(firebase_url + file_path + ".json", data=json.dumps(data))
    if response.status_code == 200:
        print("File created successfully")
    elif response.status_code == 400:
        print("File already exists")
# Function to create a new directory
def create_directory(dir_path):
    data = {}
    response = requests.put(firebase_url + dir_path + ".json", data=json.dumps(data))
    if response.status_code == 200:
        print("Directory created successfully")
    elif response.status_code == 400:
        print("Directory already exists")
# Function to remove a file or directory
def remove(path):
    response = requests.delete(firebase_url + path + ".json")
    if response.status_code == 200:
        print("File/directory removed successfully")
    elif response.status_code == 404:
        print("File/directory not found")
# Function to export the file system structure in XML format
def export_xml():
    root_data = get_firebase_data(firebase_url + ".json")
    print("<root>")
    for key in root_data.keys():
        print("<" + key + ">")
        export_directory(key)
        print("</" + key + ">")
    print("</root>")
# Function to export a directory and its contents in XML format
def export_directory(dir_path):
    data = get_firebase_data(firebase_url + dir_path + ".json")
    if isinstance(data, dict):
        for key in data.keys():
            if key == "content":
                content = data[key]
                print("<file name='" + dir_path + "'>" + base64.b64encode(content.encode('utf-8')).decode('utf-8') + "</file>")
            else:
                print("<directory name='" + key + "'>")
                export_directory(dir_path + "/" + key)
                print("</directory>")
# Parse the command-line arguments
args = sys.argv
if len(args) < 3:
    print("Usage: python3 edfs.py <command> <path> [content]")
    exit()
command = args[1]
path = args[2]
if command == "-create":
    if len(args) < 4:
        print("Usage: python3 edfs.py -create <path> <content>")
        exit()
    content = args[3]
    create_file(path, content)
elif command == "-mkdir":
    create_directory(path)
elif command == "-rmdir" or command == "-rm":
    remove(path)
elif command == "-ls":
    list_directory(path)
elif command == "-export":
    export_xml()
else:
    print("Invalid command: " + command)
    print("Usag.
Many developers will be familiar with lex, flex, yacc, bison, ANTLR, and other related tools to generate parsers for use inside their own code. For recognizing computer-friendly languages, however, context-free grammars and their parser-generators leave a few things to be desired. This talk is about how the seemingly simple prospect of parsing some text turned into a new parser toolkit for Erlang, and why functional programming makes parsing fun and awesome.
Generating a custom Ruby SDK for your web service or Rails API using Smithy - g2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Key Trends Shaping the Future of Infrastructure.pdf - Cheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... - Ramesh Iyer
In today's fast-changing business world, companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Securing your Kubernetes cluster: a step-by-step guide to success! - KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf - 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview - Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
DevOps and Testing slides at DASA Connect - Kari Kakkonen
Slides by Rik Marselis and me from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We closed with a lovely workshop in which the participants explored different ways to think about quality and testing in different parts of the DevOps infinity loop.
Epistemic Interaction - tuning interfaces to provide information for AI support - Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Connector Corner: Automate dynamic content and events by pushing a button - DianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality - Inflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
13. Postings
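>>> postings = {}
>>> for doc in docs:
...     # `word` is assumed to be a compiled regex defined earlier, e.g. re.compile(r'\W+')
...     for token in word.split(doc['content']):
...         if len(token) == 0: continue  # skip empty tokens instead of abandoning the document
...         token = token.lower()  # normalise once so the lookup and the insert use the same key
...         doc_name = doc['name']
...         if token not in postings:
...             postings[token] = [doc_name]
...         else:
...             postings[token].append(doc_name)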
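For readers who want to run the postings loop outside the original session, here is a minimal self-contained sketch. The sample `docs` list, the `WORD` regex and the `build_postings` helper are assumptions made for illustration; the slides' own documents and tokenizer are not shown in this section. Unlike the loop above, this version records each document name only once per token.

```python
import re

# Hypothetical sample documents; not the data used in the slides.
docs = [
    {"name": "doc 1", "content": "The cat says meow"},
    {"name": "doc 2", "content": "The dog says woof, woof"},
    {"name": "doc 3", "content": "Fish are quiet"},
]

# Assumed tokenizer: split on runs of non-word characters.
WORD = re.compile(r"\W+")


def build_postings(documents):
    """Map each lowercased token to the names of the documents containing it."""
    postings = {}
    for doc in documents:
        for token in WORD.split(doc["content"]):
            if not token:
                continue  # the split can yield empty strings at the edges
            names = postings.setdefault(token.lower(), [])
            if doc["name"] not in names:
                names.append(doc["name"])  # record each document once per token
    return postings


postings = build_postings(docs)
```

Looking up `postings["says"]` on this sample data gives the names of the two documents that contain the word.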
15. O(log n)
>>> def searcher(term):
...     if term in postings:
...         for match in postings[term]:
...             print("found '%s' in '%s'" % (term, match))
...
>>> searcher('says')
found 'says' in 'doc 1'
found 'says' in 'doc 2'
found 'says' in 'doc 4'
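The enhancements mentioned in the summary (punctuation handling, stop words, extra metadata on postings) could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the slides: the stop-word list, the `WORD` regex, the sample documents and all function names are hypothetical, and stemming is left out. Here the extra metadata is the position of each token within its document, counted after stop words are removed.

```python
import re

STOP_WORDS = {"the", "a", "an", "and", "of"}  # assumed, deliberately tiny list
WORD = re.compile(r"\W+")                     # assumed tokenizer; also strips punctuation


def tokenize(text):
    """Lowercase the text, split on non-word characters, drop empties and stop words."""
    tokens = (tok.lower() for tok in WORD.split(text))
    return [tok for tok in tokens if tok and tok not in STOP_WORDS]


def build_positional_postings(documents):
    """Map token -> {document name -> [positions of the token in that document]}."""
    postings = {}
    for doc in documents:
        for position, token in enumerate(tokenize(doc["content"])):
            postings.setdefault(token, {}).setdefault(doc["name"], []).append(position)
    return postings


def search(postings, term):
    """Return the sorted names of documents containing the term, ignoring case."""
    return sorted(postings.get(term.lower(), {}))


# Hypothetical sample documents.
sample_docs = [
    {"name": "doc 1", "content": "The cat says meow."},
    {"name": "doc 2", "content": "A dog says woof, and the cat says nothing."},
]
index = build_positional_postings(sample_docs)
```

Storing positions makes the index larger, but it is the kind of metadata that phrase or proximity queries would need.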