These are the slides for the talk I presented at the LA Web Speed meetup hosted by Yahoo on May 17, 2013 - http://www.meetup.com/LAWebSpeed/events/115663212/
Build a Geospatial App with Redis 3.2- Andrew Bass, Sean Yesmunt, Sergio Prad...Redis Labs
We created an app to find nearby running partners, and to demonstrate Redis Data structures and functions. In this talk, we will review the data structures and walk through our NodeJS app that depends solely on Redis Geospatial Indexes. Functions demoed are GEOADD, ZREM, GEOHASH, GEOPOS, GEODIST, GEORADIUS, GEORADIUSBYMEMBER
Get more than a cache back! The Microsoft Azure Redis Cache (NDC Oslo)Maarten Balliauw
Serving up content on the Internet is something our web sites do daily. But are we doing this in the fastest way possible? How are users in faraway countries experiencing our apps? Why do we have three webservers serving the same content over and over again? In this session, we’ll explore the Azure Content Delivery Network or CDN, a service which makes it easy to serve up blobs, videos and other content from servers close to our users. We’ll explore simple file serving as well as some more advanced, dynamic edge caching scenarios.
A list of all URLs in the deck is at: https://gist.github.com/itamarhaber/87e8c8c7126fbfb3f722
A lightning talk filled to the brim with knowledge and tips about Redis, data structures, performance, and RAM to help take Redis to the max
Redis is an advanced key-value store or a data structure server. This presentation will cover the following topics:
* An overview of Redis
* Data Structures
* Basics of Setup and Installation
* Basics of Administration
* Programming with Redis
* Considerations of Running Redis in a Virtual Machine
* Redis Resources
There will be a number of demonstrations to help explain some of the concepts being presented.
RespClient - Minimal Redis Client for PowerShellYoshifumi Kawai
RespClient is a minimal RESP (REdis Serialization Protocol) client for C# and PowerShell.
https://github.com/neuecc/RespClient
at Japan PowerShell User Group #3
#jpposh
High Performance Redis - Tague Griffith, GoPro (Redis Labs)
High Performance Redis looks at a wide range of techniques, from programming to system tuning, to deploy and maintain an extremely high-performing Redis cluster. From the operational perspective, the talk lays out multiple techniques for clustering (sharding) Redis systems and examines how the different approaches impact performance. The talk further examines system settings (Linux network parameters, Redis system) and how they impact performance (both good and bad). Finally, for the developer, we look at how different ways of structuring data demonstrate different performance characteristics.
Using Redis as Distributed Cache for ASP.NET apps - Peter Kellner, 73rd Stre...Redis Labs
I will build from scratch in this session a Microsoft ASP.NET website that caches WebAPI REST calls with both MSOpenTech’s Redis implementation for running while developing in Visual Studio as well as running on a Windows server running IIS. I will show you how to build a safe reusable caching library in c# that can be used in any .net project. I will also demonstrate how to use the Redis cache services that are available on Microsoft’s Azure cloud platform. Further, I’ll demonstrate a real world web site that uses Azure Redis cache and show statistics on how Redis improves performance consistently and reliably.
Scalable Streaming Data Pipelines with RedisAvram Lyon
Slides for talk presented at LA Redis meetup, April 16, 2016 at Scopely.
This is a draft of a session to be presented at Redis Conference 2016.
Description:
Scopely's portfolio of social and mid-core games generates billions of events each day, covering everything from in-game actions to advertising to game engine performance. As this portfolio grew in the past two years, Scopely moved all event analysis from third-party hosted solutions to a new event analytics pipeline on top of Redis and Kinesis, dramatically reducing operating costs and enabling new real-time analysis and more efficient warehousing. Our solution receives events over HTTP and SQS and provides real-time aggregation using a custom Redis-backed application, as well as prompt loads into HDFS for batch analyses.
Recently, we migrated our realtime layer from a pure Redis datastore to a hybrid datastore with recent data in Redis and older data in DynamoDB, retaining performance while further reducing costs. In this session we will describe our experience building, tuning and monitoring this pipeline, and the role of Redis in supporting handling of Kinesis worker failover, deployment, and idempotence, in addition to its more visible role in data aggregation. This session is intended to be helpful for those building streaming data systems and looking for solutions for aggregation and idempotence.
As a data scientist I frequently need to create web apps to provide interactive functionality, deliver data APIs or simply publish results. It is now easier than ever to deploy your data driven web app by using cloud based application platforms to do the heavy lifting. Cloud Foundry (http://cloudfoundry.org) is an open source public and private cloud platform that enables simple app deployment, scaling and connectivity to data services like PostgreSQL, MongoDB, Redis and Cassandra.
Resources: http://www.ianhuston.net/2015/01/cloud-foundry-for-data-science-talk/
Back your App with MySQL & Redis, the Cloud Foundry Way- Kenny Bastani, PivotalRedis Labs
In this session, we will build a minimum viable Spring Data web service with REST API, add a MySQL backing service as the primary data store, and a Redis Labs backing service for caching. We will demonstrate performance metrics without Redis caching enabled and then with Redis caching enabled. I will also provide an intro-level explanation of the platform capabilities within Pivotal Web Services.
Redis & MongoDB: Stop Big Data Indigestion Before It StartsItamar Haber
Efficiently digesting data in large volumes can prove to be challenging for any database. The challenges are compounded when this influx must be analyzed on the fly, or "tasted", to satisfy the sophisticated palates of modern apps. Luckily, there are several proven remedies you can concoct with Redis to help with potential indigestion.
The URLs from the presentation are also available at: https://gist.github.com/itamarhaber/325e515c1715a12ef132
March 29, 2016 Dr. Josiah Carlson talks about using Redis as a Time Series DBJosiah Carlson
Hear my talk on using Redis as a time series database, presented at Nextspace in Culver City, and sponsored by Redis Labs.
Audio: https://soundcloud.com/josiah-carlson-180844738/2016-03-29-redis-talk-josiah-carlson-at-nextspace-in-culver-city
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuRedis Labs
Postgres and Redis Sitting in a Tree | In today’s world of polyglot persistence, it’s likely that companies will be using multiple data stores for storing and working with data based on the use case. Typically a company will start with a relational database like Postgres and then add Redis for more high velocity use-cases. What if you could tie the two systems together to enable so much more?
Approximate Vector Search at Scale, With Application to Image Search - SciPY ...Wakana Nogami
Mercari provides an image search feature, which makes it possible for users to find similar items by image. This talk describes how we implemented similar image search over 100s of millions of images, in a way that is accurate. We will also highlight the techniques we used to keep the system efficient and up to date.
Devoxx : being productive with JHipsterJulien Dubois
Slides from the "being productive with JHipster" talk at Devoxx Belgium 2016 by Julien Dubois (JHipster lead) & Deepu K Sasidharan (JHipster co-lead).
Live video is at: https://www.youtube.com/watch?v=dzdjP3CPOCs
Code committed (live!) during the presentation is at:
https://github.com/jhipster/devoxx-2016
"What we learned from 5 years of building a data science software that actual...Dataconomy Media
"What we learned from 5 years of building a data science software that actually works for everybody." Dr. Dennis Proppe, CTO and Chief Data Scientist at GPredictive GmbH
Watch more from Data Natives Berlin 2016 here: http://bit.ly/2fE1sEo
Visit the conference website to learn more: www.datanatives.io
Follow Data Natives:
https://www.facebook.com/DataNatives
https://twitter.com/DataNativesConf
https://www.youtube.com/c/DataNatives
Stay Connected to Data Natives by Email: Subscribe to our newsletter to get the news first about Data Natives 2017: http://bit.ly/1WMJAqS
About the Author:
Dennis Proppe is the CTO and Chief Data Scientist at Gpredictive, where he helps building software that enables data scientists to build and deploy predictive models in a few minutes instead of weeks. He has 10 years+ of expertise in extracting business value from data. Before co-founding Gpredictive, he worked as a marketing science consultant. Dennis holds a Ph.D. in statistical marketing.
In this webinar, we discussed a MikroTik feature called MetaRouter, which allows us to create an independent virtual router instance. We start the presentation with an introduction to MikroTik and GLC, and then cover MetaRouter. We also do a demo and Q&A at the end of the presentation.
The recording is available on YouTube: https://www.youtube.com/channel/UCI611_IIkQC0rsLWIFIx_yg
We have calculated 31.4 trillion digits of Pi in 2019 and broke the world record in the Pi computation. This talk will discuss the nature of the calculation, the architecture, challenges and techniques, and of course the brief history of Pi computation. Calculating Pi has been the speaker's childhood dream and this talk will also explain how the small dream grew to the new world record.
When we started the analytics department, we had a couple of AB tests a month and each took us about a week of work. With more apps and increased interest in data, we started running more and more tests, reaching a couple a week. A lot of metrics were the same in all reports (retention, engagement, revenue data), so we decided it would make sense to automatise this and free up hours for digging deeper into other metrics or even doing other analyses, such as churn prediction and user behaviour.
This brought an additional benefit: we can now see daily what is happening with the tests we are currently running, and if there are any bugs, we can spot them sooner than before.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), while the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Opendatabay - Open Data Marketplace.pptxOpendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated, synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will come present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs. This meetup was formerly Milvus Meetup, and is sponsored by Zilliz, maintainers of Milvus.
2. What do we want?
● Periodic UV (unique visitors)
○ daily
○ weekly
○ monthly
○ total
● UV by dimension
○ country
○ service
3. Previous logic
● Traditional logic
○ make and maintain a daily UV list
○ make the weekly/monthly UV lists from the daily UV lists
[Diagram: Daily Logs, Daily UV (one list per day), Weekly UV, Monthly UV, Total UV, UV by country, UV by service]
4. Previous logic
● The size is too large!
○ If daily UV is 50 million and we store (Date, User-ID, country, service), we need about 55GB
● It cannot support real-time UV!
● The problems are:
○ having to “maintain” a daily UV list
○ aggregating from the “whole” daily logs
5. New logic
● Concept:
○ use Redis HashSet & Bitmap
○ maintain a HashSet of all users
○ maintain a daily bitmap
○ calculate weekly, monthly, and total UV with bit operations
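A minimal in-memory sketch of this concept, assuming the "user HashSet" maps each user ID to a stable integer offset that is then used as the bit position in the daily bitmap (the slides that would confirm this, 6 and 7, are not in this transcript). The class name `UVTracker` and its methods are hypothetical; the Redis commands named in the comments are what a real deployment would presumably issue.

```python
class UVTracker:
    """Pure-Python model of the scheme; no Redis server is contacted."""

    def __init__(self):
        self.user_index = {}  # models the user hash: user ID -> bit offset (assumed layout)
        self.bitmaps = {}     # models Redis string keys: key name -> bytearray

    def _offset_for(self, user_id):
        # Assumed Redis equivalent: look the user up in a hash and,
        # for a new user, store the next value of a counter.
        if user_id not in self.user_index:
            self.user_index[user_id] = len(self.user_index)
        return self.user_index[user_id]

    def record_visit(self, day, user_id):
        # Redis equivalent: SETBIT UV_{day} <offset> 1
        offset = self._offset_for(user_id)
        bitmap = self.bitmaps.setdefault(f"UV_{day}", bytearray())
        byte_pos, bit_pos = divmod(offset, 8)
        if len(bitmap) <= byte_pos:
            bitmap.extend(b"\x00" * (byte_pos + 1 - len(bitmap)))
        # Redis numbers bits starting from the most significant bit of each byte.
        bitmap[byte_pos] |= 0x80 >> bit_pos

    def daily_uv(self, day):
        # Redis equivalent: BITCOUNT UV_{day}
        return sum(bin(b).count("1") for b in self.bitmaps.get(f"UV_{day}", b""))


tracker = UVTracker()
tracker.record_visit("20130517", "alice")
tracker.record_visit("20130517", "alice")  # a repeat visit sets the same bit again
tracker.record_visit("20130517", "bob")
print(tracker.daily_uv("20130517"))
```

Because a repeat visit only sets a bit that is already set, recording visits is idempotent and no separate de-duplication pass over the daily logs is needed.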
8. New logic
● When we need the UV for one day, we can get it with the BITCOUNT command
> BITCOUNT UV_{YYYYMMDD}
● If we need the UV for n days, we can get it with the BITOP and BITCOUNT commands
> BITOP OR dest UV_{YYYYMMDD1} UV_{YYYYMMDD2} … UV_{YYYYMMDDn}
> BITCOUNT dest
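To illustrate why OR followed by a bit count gives the n-day UV, here is a small pure-Python model of the two commands. The helper names `bitop_or` and `bitcount` and the sample bitmaps are made up for this sketch; it does not talk to Redis.

```python
def bitop_or(bitmaps):
    """Model of BITOP OR: byte-wise OR, with shorter inputs treated as zero-padded."""
    dest = bytearray(max((len(b) for b in bitmaps), default=0))
    for bitmap in bitmaps:
        for i, byte in enumerate(bitmap):
            dest[i] |= byte
    return bytes(dest)


def bitcount(bitmap):
    """Model of BITCOUNT: number of set bits in the whole string."""
    return sum(bin(byte).count("1") for byte in bitmap)


# Hypothetical daily bitmaps; the leftmost bit of the first byte is offset 0.
day1 = bytes([0b11000000])              # users at offsets 0 and 1
day2 = bytes([0b01100000])              # users at offsets 1 and 2
day3 = bytes([0b00000000, 0b10000000])  # user at offset 8

three_day_uv = bitcount(bitop_or([day1, day2, day3]))
sum_of_daily = bitcount(day1) + bitcount(day2) + bitcount(day3)
print(three_day_uv, sum_of_daily)
```

The user at offset 1 visited on two days but is counted once in the OR result, which is why the union count is smaller than the sum of the daily counts.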
9. New logic
● The bitmap name “UV_{YYYYMMDD}” can be extended to “UV_service_{YYYYMMDD}” and “UV_country_{YYYYMMDD}”
● When we need the UV for ServiceA and CountryA, we can get it with the BITOP command
> BITOP AND dest UV_serviceA_{YYYYMMDD} UV_countryA_{YYYYMMDD}
> BITCOUNT dest
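The dimension query can be modelled the same way. This sketch assumes both bitmaps use the same user-to-offset mapping; the helper names and sample data are hypothetical and nothing here contacts Redis.

```python
def bitop_and(bitmaps):
    """Model of BITOP AND: byte-wise AND, with shorter inputs treated as zero-padded."""
    length = max((len(b) for b in bitmaps), default=0)
    dest = bytearray(length)
    for i in range(length):
        value = 0xFF
        for bitmap in bitmaps:
            value &= bitmap[i] if i < len(bitmap) else 0
        dest[i] = value
    return bytes(dest)


def bitcount(bitmap):
    """Model of BITCOUNT: number of set bits in the whole string."""
    return sum(bin(byte).count("1") for byte in bitmap)


# Hypothetical bitmaps for one day; the leftmost bit of the first byte is offset 0.
uv_service_a = bytes([0b11100000])              # offsets 0, 1, 2 used ServiceA
uv_country_a = bytes([0b01110000, 0b10000000])  # offsets 1, 2, 3, 8 are in CountryA

both = bitcount(bitop_and([uv_service_a, uv_country_a]))
print(both)
```

Only offsets 1 and 2 are set in both inputs, so the intersection contains two users.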
10. Benefits of the new logic
● We can get UV for a flexible period (the last 3 days, 4 days, or any ad hoc period)
● We can get UV in real time
● Setting a bit (SETBIT) is O(1); BITCOUNT and BITOP are O(N) in the bitmap length
● It needs very little memory for UV (1.5GB when the total number of users is 1 billion)
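As a rough check on the memory claim, the size of a single bitmap follows directly from one bit per user. The slides do not say how the 1.5GB total is composed, so this sketch only computes the per-bitmap size; the function name is hypothetical.

```python
def bitmap_bytes(total_users):
    """Bytes needed for a bitmap with one bit per user, rounded up to whole bytes."""
    return (total_users + 7) // 8


per_bitmap = bitmap_bytes(1_000_000_000)
print(per_bitmap)             # 125000000 bytes, about 125 MB per bitmap
print(bitmap_bytes(2 ** 32))  # 536870912 bytes, i.e. 512 MiB
```

A bitmap covering 2^32 offsets comes to 512 MiB, which is the maximum size of a Redis string value.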
11. Future work
● A Redis bitmap can cover up to 2^32 bits (about 4 billion users)
● If the user count grows beyond that, this logic needs to be improved (sharding would be one candidate)
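One way the sharding idea could look, as a sketch only: split the global user offset into a shard number and a local offset, and keep one bitmap per shard. The key format `UV_{day}:{shard}` and the function name are assumptions for illustration, not something the slides specify.

```python
SHARD_BITS = 2 ** 32  # offsets per shard; the largest range one Redis bitmap can address


def shard_key(day, user_offset, shard_bits=SHARD_BITS):
    """Return the (key name, local bit offset) for a global user offset."""
    shard, local_offset = divmod(user_offset, shard_bits)
    return f"UV_{day}:{shard}", local_offset


print(shard_key("20130517", 5))
print(shard_key("20130517", 2 ** 32 + 7))
```

With this layout, the UV for a day would be the sum of BITCOUNT over that day's shard keys, and BITOP would be run shard by shard, since bits in different shards refer to different users.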