Meetup photo processing, storage, and serving - NYC Tech Talks Meetup

Greg Whalin

Meetup
founded in 2002 in NYC
81k+ paying groups
1.2m+ RSVPs per month
25k+ Meetups per week
6 straight quarters of profitability
team of 70 people working on MEME
we are hiring

Photos (processing and storage)
~500k photo uploads per month
processing
3 - 5 different dimensions
orientation (not dealing with for now)
exif
storage
cheap
easy to scale
performance not critical because ...
serving
fast
but CDN ok
local cache

Processing
ImageMagick (or GraphicsMagick)
CPU intensive
No processing in core application
for large uploads, can hold up request
separate service (backed by distributed cluster)
should be able to work behind dumb round-robin load balancer
Batch for background processing
rapid turnaround still important
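
As a rough illustration of the "3 - 5 different dimensions" step, here is a minimal sketch of how target sizes might be planned before handing the work to ImageMagick. The size names and bounding boxes are hypothetical (the slides do not list them), as are the function names; `-resize` with a trailing `>` is the ImageMagick geometry form that only shrinks larger images.

```python
from typing import Dict, List, Tuple

# Hypothetical size names and bounding boxes; the slides only say
# "3 - 5 different dimensions" without listing them.
TARGET_BOXES: Dict[str, Tuple[int, int]] = {
    "thumb": (80, 80),
    "member": (180, 180),
    "event": (600, 450),
}


def fit_within(width: int, height: int, max_w: int, max_h: int) -> Tuple[int, int]:
    """Scale (width, height) to fit inside (max_w, max_h).

    Keeps the aspect ratio and never upscales a smaller image.
    """
    scale = min(max_w / width, max_h / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def plan_resizes(width: int, height: int) -> Dict[str, Tuple[int, int]]:
    """Return the output dimensions for every configured target size."""
    return {name: fit_within(width, height, *box) for name, box in TARGET_BOXES.items()}


def convert_command(src: str, dst: str, max_w: int, max_h: int) -> List[str]:
    """Build (but do not run) an ImageMagick command line for one size.

    The trailing '>' tells ImageMagick to shrink only images larger than the box.
    """
    return ["convert", src, "-resize", f"{max_w}x{max_h}>", dst]
```

Because the command is only built here and not executed, the same planning code could sit behind a separate processing service, which is where the slides place the CPU-heavy work.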

Storage
Requirements: cheap, scalable, redundant, easy
local RAID
easy, but not scalable, redundant or cheap
SAN
scalable, easy, and redundant, but not cheap
NAS
scalable, easy, and cheap, but not super redundant
Distributed
can be cheap, scalable, redundant, and easy
We went w/ MogileFS
Cheap and scalable: jbod on entire network
Redundant: auto-replication schemes (rack and chassis aware), no SPOF
Easy: kinda, but no POSIX interface; web api (GET/PUT)
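
To make "rack aware" replication concrete, here is a small sketch of one way replica hosts could be chosen so that copies land on different racks where possible. This is an illustration only, not MogileFS's actual replication policy; the host names, rack labels, and function name are all hypothetical.

```python
from typing import Dict, List


def pick_replica_hosts(host_racks: Dict[str, str], copies: int) -> List[str]:
    """Pick `copies` hosts, preferring hosts on racks that are not used yet.

    `host_racks` maps a host name to the rack it lives in.
    """
    chosen: List[str] = []
    used_racks = set()

    # First pass: take at most one host per rack.
    for host in sorted(host_racks):
        if len(chosen) == copies:
            break
        rack = host_racks[host]
        if rack not in used_racks:
            chosen.append(host)
            used_racks.add(rack)

    # Second pass: if there are fewer racks than copies, reuse racks.
    for host in sorted(host_racks):
        if len(chosen) == copies:
            break
        if host not in chosen:
            chosen.append(host)

    if len(chosen) < copies:
        raise ValueError("not enough hosts for the requested number of copies")
    return chosen
```

With hosts `a` and `b` on one rack and `c` on another, asking for two copies yields `a` and `c`, so losing a single rack still leaves one copy reachable.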

Photo upload life cycle
1. User POSTs photo(s) from computer to Meetup App servers
2. Record inserted into db in a pending state
3. App server POSTs photo w/ some metadata to staging cluster (behind dumb load balancer)
4. lighttpd+fastCGI process running on each stager stores photo and meta to filesystem
5. Stager job monitors directory (using inotify) and wakes up when new photo to process
6. Stager code (Perl) processes photo, stores in MogileFS and updates DB that it is complete
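
The life cycle above can be sketched in a few lines. This is a simplified stand-in, not the actual stager: it is written in Python rather than Perl, scans the staging directory instead of using inotify, uses an in-memory SQLite table in place of the real database, and moves the file to a local directory where the real code would resize the photo and store it in MogileFS. The table, column, and function names are assumptions made for the example.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path
from typing import List


def open_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the photo metadata database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE photo (id TEXT PRIMARY KEY, state TEXT NOT NULL)")
    return db


def register_upload(db: sqlite3.Connection, photo_id: str) -> None:
    """Step 2: insert the record in a pending state."""
    db.execute("INSERT INTO photo (id, state) VALUES (?, 'pending')", (photo_id,))
    db.commit()


def process_staged(db: sqlite3.Connection, staging: Path, store: Path) -> List[str]:
    """Steps 5-6: pick up staged photos, store them, mark them complete."""
    done = []
    for path in sorted(staging.glob("*.jpg")):
        # Placeholder for the real work: resizing and the MogileFS store.
        shutil.move(str(path), str(store / path.name))
        db.execute("UPDATE photo SET state = 'complete' WHERE id = ?", (path.stem,))
        done.append(path.stem)
    db.commit()
    return done


def demo() -> List[str]:
    """Run one photo through the pending -> complete life cycle."""
    with tempfile.TemporaryDirectory() as tmp:
        staging, store = Path(tmp) / "staging", Path(tmp) / "store"
        staging.mkdir()
        store.mkdir()
        db = open_db()
        register_upload(db, "p1")                      # step 2
        (staging / "p1.jpg").write_bytes(b"fake")      # steps 3-4
        processed = process_staged(db, staging, store)  # steps 5-6
        state = db.execute("SELECT state FROM photo WHERE id = 'p1'").fetchone()[0]
        return processed + [state]
```

Keeping the pending row separate from the processing step means the upload request can return as soon as the file is staged, which matches the slides' point that large uploads should not hold up the core application.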

Current Setup
2 x dedicated upload app server machines
rarely ever restarted so as to avoid interrupting long uploads
java + tomcat
8 proc Opteron w/ 16 GB
2 x dedicated staging/processing boxes
lighttpd + perl daemon
4 proc Opteron w/ 8 GB
9 x storage nodes (Mogile)
4 proc Opteron w/ 8 GB
JBOD (old boxes 8 x 500GB, new 16 x 1TB)
total capacity now 86TB (72 used)
2 x db boxes for meta info (running mysql)

Serving
2 serving nodes
2 proc Opterons w/ 4 GB (old cheap boxes)
lighttpd + fastCGI
retrieve photo from MogileFS, and return it
not high speed at serving so...
Akamai
photos served w/ long TTL for cache
even so, 20% of all requests hit our origin servers (lots of edge?) so ...
Varnish cache in front of origin servers (coming soon)
Akamai midgress
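
The reasoning behind adding Varnish can be shown with a little arithmetic: if 20% of requests miss the CDN, a second cache in front of the origin only passes through the share that it also misses. The 20% figure is from the slides; the Varnish hit ratio below is a made-up value for illustration, since Varnish was not yet deployed, and the function name is hypothetical.

```python
def origin_fraction(cdn_miss: float, varnish_hit: float) -> float:
    """Fraction of all requests that reach the serving nodes.

    cdn_miss:    share of requests the CDN forwards to the origin (0..1)
    varnish_hit: share of those forwarded requests answered by Varnish (0..1)
    """
    for value in (cdn_miss, varnish_hit):
        if not 0.0 <= value <= 1.0:
            raise ValueError("ratios must be between 0 and 1")
    return cdn_miss * (1.0 - varnish_hit)


# Without Varnish, the 20% of requests that miss Akamai all reach the origin.
without_varnish = origin_fraction(0.20, 0.0)

# With a hypothetical 75% Varnish hit ratio, only a quarter of those remain.
with_varnish = origin_fraction(0.20, 0.75)
```

Under that assumed hit ratio, the two serving nodes would see roughly 5% of total traffic instead of 20%.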

Links
http://www.danga.com/mogilefs/
http://www.imagemagick.org/script/index.php
