DiscoRank: optimizing discoverability on SoundCloud

Windows Phone 7- The Metro design

carolineberman99

Nearest neighbors refers to something that is conceptually very simple. For a set of points in some space (possibly with many dimensions), we want to find the k closest neighbors quickly. This presentation covers a library called Annoy, built by me, that helps you do (approximate) nearest neighbor queries in high-dimensional spaces. We go through vector models, how to measure similarity, and why nearest neighbor queries are useful.
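
As a point of reference for what an approximate index is meant to speed up, exact k-nearest-neighbor search can be sketched as a brute-force scan. This is not Annoy's API or algorithm; the function names, the choice of cosine distance, and the sample points are assumptions made for illustration only.

```python
import heapq
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # convention for zero vectors (an assumption of this sketch)
    return 1.0 - dot / (norm_u * norm_v)


def nearest_neighbors(points, query, k):
    """Exact k-NN: score every point against the query, keep the k closest."""
    scored = ((cosine_distance(p, query), i) for i, p in enumerate(points))
    return [i for _, i in heapq.nsmallest(k, scored)]


# Hypothetical 2-D vectors; real vector models use many more dimensions.
points = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (-1.0, 0.0)]
print(nearest_neighbors(points, (1.0, 0.05), k=2))  # indices of the 2 closest points
```

Every query touches all n points, so the cost grows linearly with the size of the data set. Approximate methods trade a little accuracy to avoid that full scan.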

Zillow's favorite big data & machine learning tools

njstevens

Imitation learning tutorial

Yisong Yue

Collaborative Filtering with Spark

Spotify uses a range of Machine Learning models to power its music recommendation features, including the Discover page and Radio. Because training these models is iterative, they suffer from the I/O overhead of Hadoop and are a natural fit for the Spark programming paradigm. In this talk I will present both the right way and the wrong way to implement collaborative filtering models with Spark. Additionally, I will take a deep dive into how Matrix Factorization is implemented in the MLlib library.

Hexagonify the World: The Theory and Applications of Uber H3

Safe Software

Imagine having more spatial data than you know what to do with, and yet it holds the key to improving your customer’s experience and your company’s bottom line. That’s the situation that Uber found themselves in, so they created a Hexagonal Hierarchical Spatial Indexing library (called H3) that indexes the world in 16 different levels of resolution as the foundation for solving their challenge. Fortunately for all of us, Uber open sourced H3, and the upcoming FME 2021.0 release will provide no-code, format-agnostic access to its functionality. Join us to learn the theory behind H3 directly from Unfolded’s Isaac Brodsky, one of the creators of H3, and see firsthand how you might apply the upcoming FME H3 transformer to solve a variety of real-world, big-data spatial problems.

Gremlin's Graph Traversal Machinery

Marko Rodriguez

This talk argues that the future of data query/analytic languages will be all about embedding the language into the native programming language of the developer. As an example of this style, the Gremlin graph traversal language is presented. Gremlin can be represented in any programming language that supports function composition and function nesting. The language representation is then compiled to Gremlin bytecode, which is ultimately executed by a Gremlin graph traversal machine. This enables both the Gremlin language and the machine to be agnostic to the execution language.
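
The idea of embedding a query language in the host language and compiling it to bytecode can be sketched with a toy example. This is not the TinkerPop API or the real Gremlin bytecode format: the `Traversal` class, the `execute` function, the instruction layout, and the sample graph are all hypothetical and exist only to show host-language method chaining being recorded as instructions and then interpreted by a separate machine.

```python
class Traversal:
    """Toy fluent builder: each method call appends one instruction."""

    def __init__(self, bytecode=()):
        self.bytecode = tuple(bytecode)

    def _step(self, op, *args):
        # Return a new traversal so partial queries can be reused safely.
        return Traversal(self.bytecode + ((op, args),))

    def V(self, *ids):
        return self._step("V", *ids)

    def out(self, label):
        return self._step("out", label)

    def values(self, key):
        return self._step("values", key)


def execute(bytecode, vertices, edges):
    """Toy traversal machine: interpret the instructions over in-memory data."""
    current = []
    for op, args in bytecode:
        if op == "V":
            # Start from the given vertex ids, or from all vertices if none given.
            current = [v for v in vertices if not args or v in args]
        elif op == "out":
            (label,) = args
            current = [dst for v in current
                       for src, lbl, dst in edges if src == v and lbl == label]
        elif op == "values":
            (key,) = args
            current = [vertices[v][key] for v in current if key in vertices[v]]
        else:
            raise ValueError(f"unknown step: {op}")
    return current


# Hypothetical sample graph: vertex id -> properties, edges as (src, label, dst).
vertices = {1: {"name": "marko"}, 2: {"name": "vadas"}, 3: {"name": "josh"}}
edges = [(1, "knows", 2), (1, "knows", 3)]

query = Traversal().V(1).out("knows").values("name")
print(query.bytecode)
print(execute(query.bytecode, vertices, edges))
```

Because the builder only records instructions, the same instruction sequence could in principle be produced from any host language and handed to any machine that understands it, which is the separation the talk describes.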

Homepage Personalization at Spotify

Oguz Semerci

Interactive Recommender Systems with Netflix and Spotify

Interactive recommender systems enable the user to steer the received recommendations in the desired direction through explicit interaction with the system. In the larger ecosystem of recommender systems used on a website, an interactive recommender is positioned between a lean-back recommendation experience and an active search for a specific piece of content. Besides this aspect, we will discuss several parts that are especially important for interactive recommender systems, including the following: design of the user interface and its tight integration with the algorithm in the back-end; computational efficiency of the recommender algorithm; and choosing the right balance between exploiting the feedback from the user so as to provide relevant recommendations, and enabling the user to explore the catalog and steer the recommendations in the desired direction. In particular, we will explore the field of interactive video and music recommendations and their application at Netflix and Spotify. We outline some of the user experiences built, and discuss the approaches followed to tackle the various aspects of interactive recommendations. We present our insights from user studies and A/B tests. The tutorial targets researchers and practitioners in the field of recommender systems, and will give the participants a unique opportunity to learn about the various aspects of interactive recommender systems in the video and music domain. The tutorial assumes familiarity with the common methods of recommender systems.
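
The balance between exploiting user feedback and letting the user explore the catalog can be illustrated with an epsilon-greedy rule. This is a generic textbook technique swapped in for illustration, not the approach used at Netflix or Spotify; the item names, initial scores, and reward convention are assumptions.

```python
import random


def recommend(scores, epsilon, rng):
    """Pick one item: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(sorted(scores))  # explore: uniform over the catalog
    return max(scores, key=scores.get)     # exploit: highest estimated relevance


def update(scores, counts, item, reward):
    """Fold explicit feedback (1.0 = liked, 0.0 = skipped) into a running mean.

    The first observation for an item replaces its prior score entirely.
    """
    counts[item] = counts.get(item, 0) + 1
    scores[item] += (reward - scores[item]) / counts[item]


# Hypothetical catalog with prior relevance estimates.
scores = {"jazz": 0.2, "rock": 0.6, "ambient": 0.4}
counts = {}
rng = random.Random(0)

item = recommend(scores, epsilon=0.0, rng=rng)  # pure exploitation
update(scores, counts, item, reward=0.0)        # the user skipped it
print(item, scores[item])
```

With `epsilon=0.0` the system only ever exploits; raising it hands more of each session to exploration, at the cost of showing some items the current estimates rank lower.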

Graph R-CNN for Scene Graph Generation

Sangmin Woo

Beyond Just Usability: Desirability and Usefulness Testing

Susan Mercer

Much of our work in UX research focuses on usability: evaluating products and interfaces to ensure they are easy to use. However, in today’s digital world, usability alone is no longer enough. Consumers have also come to expect entertaining and engaging experiences. Web and mobile applications need to be usable, useful and engaging. So, how do we evaluate web interfaces to determine how useful and engaging they are? Desirability has been evaluated in recent years by the use of the Product Reaction Card technique, originated by folks at Microsoft. However, there are many other techniques used in market and industrial design research that we can borrow to complement this technique. Likewise, we can use standard usability testing techniques with lines of questioning with a slightly different focus to evaluate the relative usefulness of different solutions for a particular user group. In this talk, I discuss several techniques that I have used in recent months to evaluate the usefulness and desirability of interfaces. The best techniques I have discovered to evaluate usefulness involve open-ended interview questions regarding current processes and pain points, followed by a usability evaluation of the interface and then a reflective interview discussing the benefits and drawbacks of that solution to their personal situation. To evaluate desirability, I will discuss the product reaction card technique and variations using more defined vocabularies for emotional responses and product personalities. In addition, I will show results from techniques borrowed from psychology and marketing research: sentence completion, collaging, and the use of dyad rating scales. These techniques offer a variety of both qualitative and quantitative data that can be used to compare different interface options.

Estimating Intervention Effects in Crowd Movement Guidance Using Causal Inference

Koh Takeuchi

Serendipity and Machine Learning

Kei Tateno

π-Calculus

Yuuki Takano

Deep Learning for Language and Knowledge @ Cognitive Science Society Summer School

Yuya Unno

ML+Hadoop at NYC Predictive Analytics

Intro to Cypher

Neo4j

This developer-focused webinar will explain how to use the Cypher graph query language. Cypher, a query language designed specifically for graphs, allows for expressing complex graph patterns using simple ASCII art-like notation and offers a simple but expressive approach for working with graph data. During this webinar you'll learn:
- Basic Cypher syntax
- How to construct graph patterns using Cypher
- Querying existing data
- Data import with Cypher
- Using aggregations such as statistical functions
- Extending the power of Cypher using procedures and functions

Graph attention network - deep learning paper review

taeseon ryu

Scaling Data Science at Airbnb

Work-Bench

Cassandra and Spark

nickmbailey

Apache Spark: The Next Gen toolset for Big Data Processing

prajods

The Spark project from Apache (spark.apache.org) is the next generation of Big Data processing systems. It uses a new architecture and in-memory processing for orders-of-magnitude improvement in performance. Some would call it the successor to the Hadoop set of tools. Hadoop is a batch-mode Big Data processor and depends on disk-based files. Spark improves on this and supports real-time and interactive processing, in addition to batch processing.
Table of contents:
1. The Big Data triangle
2. Hadoop stack and its limitations
3. Spark: An Overview
3.a. Spark Streaming
3.b. GraphX: Graph processing
3.c. MLlib: Machine Learning
4. Performance characteristics of Spark

What's hot

Embedded based retrieval in modern search ranking system

Marsan Ma

NS-CUK Journal club: HBKim, Review on "Neural Graph Collaborative Filtering",...

ssuser4b1f48

Approximate nearest neighbor methods and vector models – NYC ML meetup

Similar to DiscoRank: optimizing discoverability on SoundCloud

Cassandra Day Denver 2014: Using Cassandra to Support Crisis Informatics Rese...

DataStax Academy

Crisis Informatics is an area of research that investigates how members of the public make use of social media during times of crisis. The amount of social media data generated by a single event is significant: millions of tweets and status updates accompanied by gigabytes of photos and video. To investigate the types of digital behaviors that occur around these events requires a significant investment in designing, developing, and deploying large-scale software infrastructure for both data collection and analysis. Project EPIC at the University of Colorado has been making use of Cassandra since Spring 2012 to provide a solid foundation for Project EPIC's data collection and analysis activities. Project EPIC has collected terabytes of social media data associated with hundreds of disaster events that must be stored, processed, analyzed, and visualized. This talk will cover how Project EPIC makes use of Cassandra and discuss some of the architectural, modeling, and analysis challenges encountered while developing the Project EPIC software infrastructure.

«Scrapy internals» Alexander Sibiryakov, Scrapinghub

it-people

Frontera: a distributed robot for large-scale web crawling / Alexander S...

Ontico

In this talk I will share our experience crawling the Spanish web. We set out to crawl about 600,000 websites in the .es zone in order to collect statistics about hosts and their sizes. I will describe the architecture of the crawler and its storage, the problems we ran into during the crawl, and how we solved them. Our solution is available as the open source framework Frontera. The framework lets you build a distributed crawler that downloads pages from the Internet at large scale in real time. It can also be used to build focused crawlers that fetch a subset of websites known in advance. The framework offers: configurable URL document storage (RDBMS or key-value), crawl strategy management, a transport layer abstraction, and a fetcher module abstraction. The talk is structured in an engaging way: a description of the problem, the solution, and the problems that came up while developing the solution.

Processing Large Graphs

Nishant Gandhi

Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...

Chris Fregly

Greg Hogan – To Petascale and Beyond- Apache Flink in the Clouds

Flink Forward

http://flink-forward.org/kb_sessions/to-petascale-and-beyond-apache-flink-in-the-clouds/ Apache Flink performs with low latency but can also scale to great heights. Gelly is Flink’s laboratory for building and tuning scalable graph algorithms and analytics. In this talk we’ll discuss writing algorithms optimized for the Flink architecture, assembling and configuring a cloud compute cluster, and boosting performance through benchmarking and system profiling. This talk will cover recent developments in the Gelly library to include scalable graph generators and a mixed collection of modular algorithms written with native Flink operators. We’ll think like a data stream, keep a cool cache, and send the garbage collector on holiday. To this we’ll add a lightweight benchmarking harness to stress and validate core Flink and to identify and refactor hot code with aplomb.

Balboa Park Commons: Collaborative Digitization for a Public Resource

Anna Chiaretta Lavatelli

Presented at the University of San Diego 2014 Digital Initiatives Symposium.
Presentations by:
- Alan Renga, Archivist, San Diego Air and Space Museum
- Rosa Longacre, Registrar/Archivist, San Diego Museum of Man
- Kristi Ehrig-Burgess, Library Archives and Digitization Manager, Mingei International Museum
- Anna Chiaretta Lavatelli, Asst. Director of Digital Media, Balboa Park Online Collaborative
www.balboaparkcommons.org is an IMLS-funded project that was made possible by the hard work of Perian Sully, Christina DePaolo, Rich Cherry and Chris Borkowski and the participating partners of Balboa Park Online Collaborative.

JavaScript History

Rhio Kim

Solving Visibility and Streaming in The Witcher 3: Wild Hunt with Umbra 3

jasinb

TinkerPop: a story of graphs, DBs, and graph DBs

Joshua Shinavier

WebServices_Grid.ppt

EqinNiftalyev

LiveCoding Package for Pharo

ESUG

Implementing a VO archive for datacubes of galaxies

Jose Enrique Ruiz

Using the SDACK Architecture to Build a Big Data Product

Evans Ye

You have definitely heard about the SMACK architecture, which stands for Spark, Mesos, Akka, Cassandra, and Kafka. It’s especially suitable for building a lambda architecture system. But what is SDACK? It’s very similar to SMACK, except the “D” stands for Docker. While SMACK is an enterprise-scale solution with multi-tenant support, the SDACK architecture is particularly suitable for building a data product. In this talk, I’ll cover the advantages of the SDACK architecture and how TrendMicro uses it to build an anomaly detection data product. The talk will cover: 1) The architecture we designed based on SDACK to support both batch and streaming workloads. 2) The data pipeline built on Akka Stream, which is flexible, scalable, and self-healing. 3) The Cassandra data model designed to support time series data writes and reads.

Maa

blalbritton

RDA for Music: Scores

ALATechSource

Playlist Recommendations @ Spotify

Nikhil Tibrewal

Azure storage deep dive

Yves Goeleven

Recently uploaded

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...

James Anderson

Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM

The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks, has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management. The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).

Speakers:

Bob Boule: Robert Boule is a technology enthusiast with a passion for technology and making things work, along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated in the software delivery lifecycle.

Gopinath Rebala: Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...

DanBrown980551

Do you want to learn how to model and simulate an electrical network from scratch in under an hour? Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)! During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook. PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses.
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well. What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.

Securing your Kubernetes cluster_ a step-by-step guide to success !

KatiaHIMEUR1

Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster. However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks. In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.

De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...

Knowledge engineering: from people to machines and back

Elena Simperl

Elevating Tactical DDD Patterns Through Object Calisthenics

Dorra BARTAGUIZ

After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!

UiPath Test Automation using UiPath Test Suite series, part 4

DianaGray10

Welcome to UiPath Test Automation using UiPath Test Suite series, part 4. In this session, we will cover a Test Manager overview along with the SAP heatmap. The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies. Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
- Execution from the test manager
- Orchestrator execution result
- Defect reporting
- SAP heatmap example with demo
Speaker: Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP

Designing Great Products: The Power of Design and Leadership by Chief Designe...

PCI PIN Basics Webinar from the Controlcase Team

ControlCase

GraphRAG is All You need? LLM & Knowledge Graph

Guy Korland

Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs. 1. Unifying Large Language Models and Knowledge Graphs: A Roadmap. https://arxiv.org/abs/2306.08302 2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs: https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/

Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...

State of ICS and IoT Cyber Threat Landscape Report 2024 preview

Prayukth K V

The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development. The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
- State of global ICS asset and network exposure
- Sectoral targets and attacks as well as the cost of ransom
- Global APT activity, AI usage, actor and tactic profiles, and implications
- Rise in volumes of AI-powered cyberattacks
- Major cyber events in 2024
- Malware and malicious payload trends
- Cyberattack types and targets
- Vulnerability exploit attempts on CVEs
- Attacks on counties – USA
- Expansion of bot farms – how, where, and why
- In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
- Why are attacks on smart factories rising?
- Cyber risk predictions
- Axis of attacks – Europe
- Systemic attacks in the Middle East
Download the full report from here: https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/

Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024

Tobias Schneck

As AI technology pushes into IT, I wondered, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations point of view. Is it possible to apply our beloved cloud native principles as well? What benefits could the two technologies bring to each other? Let me take these questions and guide you on a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss which cloud/on-premise strategy we may need in order to apply it to our own infrastructure and make it work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into which approaches I already have working for real.

Essentials of Automations: Optimizing FME Workflows with Parameters

Safe Software

Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place. Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects. Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic. Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.

AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...

Transcript: Selling digital books in 2024: Insights from industry leaders - T...

BookNet Canada

The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more. Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/ Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.

The Future of Platform Engineering

Jemma Hussein Allen

Mission to Decommission: Importance of Decommissioning Products to Increase E...

From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...