This document discusses indexing NuGet packages using Azure Functions and Azure Search to power search capabilities in ReSharper and Rider. It proposes using Functions triggered by changes to the NuGet.org catalog to download packages, index them using reflection metadata, and upload the results to an Azure Search index. Each step would be a separate function to allow independent scaling. The final system would watch the catalog, index new/updated packages, and provide APIs for searching packages by type or namespace.
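The flow described above (watch the catalog, index each package, query by type or namespace) can be sketched in a few lines. This is a minimal in-memory illustration, not the talk's implementation: `CatalogEntry`, `TypeIndex`, `watch_catalog` and `index_packages` are hypothetical names, a `deque` stands in for the queue between functions, a dictionary stands in for the Azure Search index, and the list of type names is supplied by hand instead of being read from package metadata.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical stand-in for one 'package added or updated' catalog event."""
    package_id: str
    version: str
    type_names: tuple  # fully qualified public type names (hand-written sample data)


class TypeIndex:
    """Toy in-memory stand-in for the search index."""

    def __init__(self):
        self._by_type = defaultdict(set)
        self._by_namespace = defaultdict(set)

    def upload(self, entry):
        # Split "Namespace.TypeName" at the last dot and index both parts.
        for full_name in entry.type_names:
            namespace, _, type_name = full_name.rpartition(".")
            self._by_type[type_name.lower()].add(entry.package_id)
            if namespace:
                self._by_namespace[namespace.lower()].add(entry.package_id)

    def find_by_type(self, type_name):
        return sorted(self._by_type.get(type_name.lower(), ()))

    def find_by_namespace(self, namespace):
        return sorted(self._by_namespace.get(namespace.lower(), ()))


def watch_catalog(entries, queue):
    """Step 1: enqueue one message per new or updated package."""
    for entry in entries:
        queue.append(entry)


def index_packages(queue, index):
    """Step 2: drain the queue and upload each package's types to the index."""
    while queue:
        index.upload(queue.popleft())


# Illustrative sample data only.
catalog = [
    CatalogEntry("Newtonsoft.Json", "12.0.2",
                 ("Newtonsoft.Json.JsonConvert", "Newtonsoft.Json.Linq.JObject")),
    CatalogEntry("System.Text.Json", "4.6.0",
                 ("System.Text.Json.JsonSerializer",)),
]

queue = deque()
index = TypeIndex()
watch_catalog(catalog, queue)
index_packages(queue, index)

print(index.find_by_type("JObject"))
print(index.find_by_namespace("System.Text.Json"))
```

Keeping the two steps as separate functions joined only by the queue mirrors the summary's point that each step can scale independently.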
NDC Oslo 2019 - Indexing and searching NuGet.org with Azure Functions and Search | Maarten Balliauw
Which NuGet package was that type in again? In this session, let's build a "reverse package search" that helps find the correct NuGet package based on a public type.
Together, we will create a highly-scalable serverless search engine using Azure Functions and Azure Search that performs 3 tasks: listening for new packages on NuGet.org (using a custom binding), indexing packages in a distributed way, and exposing an API that accepts queries and gives our clients the best result.
CloudBurst 2019 - Indexing and searching NuGet.org with Azure Functions and S... | Maarten Balliauw
Which NuGet package was that type in again? In this session, let's build a "reverse package search" that helps find the correct NuGet package based on a public type.
Together, we will create a highly-scalable serverless search engine using Azure Functions and Azure Search that performs 3 tasks: listening for new packages on NuGet.org (using a custom binding), indexing packages in a distributed way, and exposing an API that accepts queries and gives our clients the best result.
These slides present a Twitter pipeline built on the ELK stack (Elasticsearch, Logstash, Kibana) and show how to integrate machine learning into it: https://github.com/melvynator/ELK_twitter
Distributed Eventing in OSGi - Carsten Ziegeler | mfrancis
OSGi Community Event 2013 (http://www.osgi.org/CommunityEvent2013/Schedule)
ABSTRACT
One of the major topics the OSGi Alliance is working on is a proposal for distributed eventing, especially in the cloud. This session starts with an overview of the current state in the Alliance and then shows already available solutions from the Apache Sling open source project. This includes distributing events through event admin and controlled processing of events by exactly one processor in distributed installations. The current implementations will be set in the context of the ongoing activities in the Alliance.
SPEAKER BIO
Carsten Ziegeler is a senior developer at Adobe Research Switzerland and spends most of his time on architectural and infrastructure topics. Working for over 25 years in open source projects, Carsten is a member of the Apache Software Foundation and heavily participates in several Apache communities including Sling, Felix and ACE. He is a frequent speaker at technology and open source conferences and participates in the OSGi Core Platform and Enterprise expert groups.
Large Scale Log Analytics with Solr (from Lucene Revolution 2015) | Sematext Group, Inc.
In this talk from Lucene/Solr Revolution 2015, Solr and centralized logging experts Radu Gheorghe and Rafal Kuć cover topics like: flow in Logstash, flow in rsyslog, parsing JSON, log shipping, Solr tuning, time-based collections and tiered clusters.
The ELK Stack workshop covers real-world use cases and works with the participants to implement them. This includes an Elastic overview, Logstash configuration, creation of dashboards in Kibana, guidelines and tips on processing custom log formats, designing a system to scale, choosing hardware, and managing the lifecycle of your logs.
Building Your First App with Shawn Mcarthy | MongoDB
This talk will introduce the philosophy and features of MongoDB. We’ll discuss the benefits of the document-based data model that MongoDB offers by walking through how one can build a simple app. We’ll cover inserting, updating, and querying data. This session will jumpstart your knowledge of MongoDB development, providing you with context for the rest of the day's content.
Hacking the Mesh: Extending Istio with WebAssembly Modules | DevNation Tech Talk | Red Hat Developers
Moving common network functionality such as client load-balancing and circuit-breaking into out-of-process proxies has kickstarted the technology we call Service Mesh, with Istio as its most popular incarnation. With the introduction of WASM support, you can now extend its functionality by writing custom Filters for Istio's proxy, Envoy, in any language that supports WASM. Let's take a closer look at developing, building, and deploying WASM Filters using the example of an OpenID Connect Filter that adds Keycloak Authentication functionality to any service in your Mesh - no code changes required.
Indexing and searching NuGet.org with Azure Functions and Search - .NET fwday... | Maarten Balliauw
Which NuGet package was that type in again? In this session, let's build a "reverse package search" that helps find the correct NuGet package based on a public type.
Together, we will create a highly-scalable serverless search engine using Azure Functions and Azure Search that performs 3 tasks: listening for new packages on NuGet.org (using a custom binding), indexing packages in a distributed way, and exposing an API that accepts queries and gives our clients the best result.
Maarten Balliauw "Indexing and searching NuGet.org with Azure Functions and S... | Fwdays
Which NuGet package was that type in again? In this session, let's build a "reverse package search" that helps to find the correct NuGet package based on a public type name.
Together, we will create a highly-scalable serverless search engine using Azure Functions and Azure Search that performs 3 tasks: listening for new packages on NuGet.org (using a custom binding), indexing packages in a distributed way, and exposing an API that accepts queries and gives our clients the best result. Expect code, insights into the service design process, and more!
.NET Conf 2019 - Indexing and searching NuGet.org with Azure Functions and Se... | Maarten Balliauw
Which NuGet package was that type in again? In this session, let's build a "reverse package search" that helps find the correct NuGet package based on a public type.
Together, we will create a highly-scalable serverless search engine using Azure Functions and Azure Search that performs 3 tasks: listening for new packages on NuGet.org (using a custom binding), indexing packages in a distributed way, and exposing an API that accepts queries and gives our clients the best result.
https://blog.maartenballiauw.be/post/2019/07/30/indexing-searching-nuget-with-azure-functions-and-search.html
Everybody is consuming or producing NuGet packages these days. It’s easy, right? We’ll look beyond what everyone is doing. How can we use the NuGet client API to fetch data from NuGet? Can we build an application plugin system based on NuGet? What hidden gems are there in the NuGet server API? Can we create a full copy of NuGet.org?
Everybody is consuming NuGet packages these days. It’s easy, right? But how can we create and share our own packages? What is .NET Standard? How should we version, create, publish and share our package?
Once we have those things covered, we’ll look beyond what everyone is doing. How can we use the NuGet client API to fetch data from NuGet? Can we build an application plugin system based on NuGet? What hidden gems are there in the NuGet server API? Can we create a full copy of NuGet.org?
Good questions! In this talk, we will get them answered.
- What are an Internal Developer Portal (IDP) and Platform Engineering?
- What is Backstage?
- How Backstage can help developers build a developer portal that makes their job easier
Jirayut Nimsaeng
Founder & CEO
Opsta (Thailand) Co., Ltd.
YouTube recording: https://youtu.be/u_nLbgWDwsA?t=850
Dev Mountain Tech Festival @ Chiang Mai
November 12, 2022
This presentation was given at the Boston Django meetup on November 16, and surveyed several leading PaaS providers including Stackato, Dotcloud, OpenShift and Heroku.
For each PaaS provider, I documented the steps necessary to deploy Mezzanine, a popular Django-based CMS and blogging platform.
At the end of the presentation, I do a wrap-up of the different providers and provide a comparison matrix showing which providers have which features. This matrix is likely to go out-of-date quickly because these providers are adding new features all the time.
Apache Calcite (a tutorial given at BOSS '21) | Julian Hyde
Apache Calcite is a dynamic data management framework. Think of it as a toolkit for building databases: it has an industry-standard SQL parser, validator, highly customizable optimizer (with pluggable transformation rules and cost functions, relational algebra, and an extensive library of rules), but it has no preferred storage primitives.
In this tutorial (given at BOSS '21 in Copenhagen as part of VLDB '21) the attendees will use Apache Calcite to build a fully fledged query processor from scratch with very few lines of code. This processor is a full implementation of SQL over an Apache Lucene storage engine. (Lucene does not support SQL queries and lacks a declarative language for performing complex operations such as joins or aggregations.) Attendees will also learn how to use Calcite as an effective tool for research.
Presenters: Julian Hyde and Stamatis Zampetakis
Django apps and ORM Beyond the basics [Meetup hosted by Prodeers.com] | Udit Gangwani
Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design. Built by experienced developers, it takes care of much of the hassle of Web development, so you can focus on writing your app without needing to reinvent the wheel. It’s free and open source.
Following is the agenda of the meetup:
1. How to get started with Django
2. Advanced overview of Django components
1. Views
2. Models
3. Templates
4. Middlewares
5. Routing
3. Deep dive into Django ORM
4. How to write complex Django queries using Model Managers, Query Sets and Q library
5. How do Django models work internally
Whether you're a newer Django developer wanting to improve your understanding of some key concepts, or a seasoned Djangonaut, there should be something for you.
Talk at RubyKaigi 2015.
Plugin architecture is known as a technique that brings extensibility to a program. Ruby has good language features for plugins. RubyGems.org is an excellent platform for plugin distribution. However, creating a plugin architecture is not as easy as writing code without one: it involves a plugin loader, packaging, a loosely coupled API, and performance. Loading two versions of a gem is an unsolved challenge that, on the other hand, is solved in Java.
I have designed some open-source software such as Fluentd and Embulk. They provide most of their functionality through plugins. I will talk about their plugin-based architecture.
Beyond Wordcount with spark datasets (and scalaing) - Nide PDX Jan 2018 | Holden Karau
Apache Spark is one of the most popular big data systems, but once the shiny finish starts to wear off you can find yourself wondering if you've accidentally deployed a Ford Pinto into production. This talk will look at the challenges that come with scaling Spark jobs. Also, the talk will explore Spark's new(ish) Dataset/DataFrame API, as well as how it’s evolving in Spark 2.3 with improved Python support.
If you're already a Spark user, come to find out why it’s not all your fault. If you aren't already a Spark user, come to find out how to save yourself from some of the pitfalls once you move beyond the example code.
Check out Holden's newest book, High Performance Spark, for more information!
From https://niketechtalksjan2018.splashthat.com/
Bringing nullability into existing code - dammit is not the answer | Maarten Balliauw
The C# nullability features help you minimize the likelihood of encountering that dreaded System.NullReferenceException. Nullability syntax and annotations give hints as to whether a type can be nullable or not, and better static analysis is available to catch unhandled nulls while developing your code. What's not to like?
Introducing explicit nullability into an existing code base is a Herculean effort. There's much more to it than just sprinkling some `?` and `!` throughout your code. It's not a silver bullet either: you'll still need to check non-nullable variables for null.
In this talk, we'll see some techniques and approaches that worked for me, and explore how you can migrate an existing code base to use the full potential of C# nullability.
Nerd sniping myself into a rabbit hole... Streaming online audio to a Sonos s... | Maarten Balliauw
After buying a set of Sonos-compatible speakers at IKEA, I was disappointed there's no support for playing audio from a popular video streaming service. They stream Internet radio, podcasts and whatnot. Well, just not the service I want them to play!
Determined - and not knowing how deep the rabbit hole would be - I ventured on a trip that included network sniffing on my access point, learning about UPnP and running a web server on my phone (without knowing how to write anything Android), learning how MP4 audio is packaged (and has to be re-packaged). This ultimately resulted in an Android app for personal use, which does what I initially wanted: play audio from that popular video streaming service on Sonos.
Join me for this story about an adventure that has no practical use, probably violates Terms of Service, but was fun to build!
Building a friendly .NET SDK to connect to Space | Maarten Balliauw
Space is a team tool that integrates chats, meetings, git hosting, automation, and more. It has an HTTP API to integrate third party apps and workflows, but it's massive! And slightly opinionated.
In this session, we will see how we built the .NET SDK for Space, and how we make that massive API more digestible. We will see how we used code generation, and incrementally made the API feel more like a real .NET SDK.
Microservices for building an IDE - The innards of JetBrains Rider - NDC Oslo... | Maarten Balliauw
Ever wondered how IDEs are built? In this talk, we’ll skip the marketing bit and dive into the architecture and implementation of JetBrains Rider. We’ll look at how and why we have built (and open sourced) a reactive protocol, and how the IDE uses a “microservices” architecture to communicate with the debugger, Roslyn, a WPF renderer and even other tools like Unity3D. We’ll explore how things are wired together, both in-process and across those microservices.
NDC Sydney 2019 - Microservices for building an IDE – The innards of JetBrain... | Maarten Balliauw
Ever wondered how IDEs are built? In this talk, we’ll skip the marketing bit and dive into the architecture and implementation of JetBrains Rider.
We’ll look at how and why we have built (and open sourced) a reactive protocol, and how the IDE uses a “microservices” architecture to communicate with the debugger, Roslyn, a WPF renderer and even other tools like Unity3D. We’ll explore how things are wired together, both in-process and across those microservices. Let’s geek out!
JetBrains Australia 2019 - Exploring .NET’s memory management – a trip down m... | Maarten Balliauw
The .NET Garbage Collector (GC) helps provide our applications with virtually unlimited memory, so we can focus on writing code instead of manually freeing up memory. But how does .NET manage that memory? What are hidden allocations? Can we do without allocations? Are strings evil? It still matters to understand when and where memory is allocated.
In this talk, we’ll go over the base concepts of .NET memory management and explore how .NET helps us and how we can help .NET – making our apps better. Expect profiling, Intermediate Language (IL), ClrMD and more!
Approaches for application request throttling - Cloud Developer Days Poland | Maarten Balliauw
Speaking from experience building a SaaS: users are insane. If you are lucky, they use your service, but in reality, they probably abuse it. Crazy usage patterns resulting in more requests than expected, request bursts when users come back to the office after the weekend, and more! These all pose a potential threat to the health of our web application and may impact other users or the service as a whole. Ideally, we can apply some filtering at the front door: limiting the number of requests over a given timespan, limiting bandwidth, ...
In this talk, we’ll explore the simple yet complex realm of rate limiting. We’ll go over how to decide on which resources to limit, what the limits should be and where to enforce these limits – in our app, on the server, using a reverse proxy like Nginx or even an external service like CloudFlare or Azure API management. The takeaway? Know when and where to enforce rate limits so you can have both a happy application as well as happy customers.
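As a rough illustration of "limit the number of requests over a given timespan", here is a minimal sliding-window limiter enforced inside the application. It is a sketch under stated assumptions, not the approach from the talk: `SlidingWindowLimiter`, its parameters and the injected `clock` are hypothetical names chosen for this example, and state lives in process memory only.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the example can control time
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id):
        now = self._clock()
        hits = self._hits[client_id]
        # Forget accepted requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: the caller would typically answer HTTP 429
        hits.append(now)
        return True


# Drive the limiter with a fake clock so the outcome is deterministic.
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10,
                               clock=lambda: fake_now[0])

results = [limiter.allow("alice"), limiter.allow("alice"), limiter.allow("alice")]
fake_now[0] = 10.0  # the window has passed, so earlier requests no longer count
results.append(limiter.allow("alice"))
print(results)  # [True, True, False, True]
```

An in-process limiter like this only sees traffic reaching one instance; enforcing limits at a reverse proxy or an external service, as the abstract mentions, moves the same check in front of the application.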
Approaches for application request throttling - dotNetCologne | Maarten Balliauw
Speaking from experience building a SaaS: users are insane. If you are lucky, they use your service, but in reality, they probably abuse it. Crazy usage patterns resulting in more requests than expected, request bursts when users come back to the office after the weekend, and more! These all pose a potential threat to the health of our web application and may impact other users or the service as a whole. Ideally, we can apply some filtering at the front door: limiting the number of requests over a given timespan, limiting bandwidth, ...
In this talk, we’ll explore the simple yet complex realm of rate limiting. We’ll go over how to decide on which resources to limit, what the limits should be and where to enforce these limits – in our app, on the server, using a reverse proxy like Nginx or even an external service like CloudFlare or Azure API management. The takeaway? Know when and where to enforce rate limits so you can have both a happy application as well as happy customers.
CodeStock - Exploring .NET memory management - a trip down memory lane | Maarten Balliauw
The .NET Garbage Collector (GC) is really cool. It helps provide our applications with virtually unlimited memory, so we can focus on writing code instead of manually freeing up memory. But how does .NET manage that memory? What are hidden allocations? Are strings evil? It still matters to understand when and where memory is allocated. In this talk, we’ll go over the base concepts of .NET memory management and explore how .NET helps us and how we can help .NET – making our apps better. Expect profiling, Intermediate Language (IL), ClrMD and more!
ConFoo Montreal - Microservices for building an IDE - The innards of JetBrain... | Maarten Balliauw
Ever wondered how IDEs are built? In this talk, we’ll skip the marketing bit and dive into the architecture and implementation of JetBrains Rider. We’ll look at how and why we have built (and open sourced) a reactive protocol, and how the IDE uses a “microservices” architecture to communicate with the debugger, Roslyn, a WPF renderer and even other tools like Unity3D. We’ll explore how things are wired together, both in-process and across those microservices. Let’s geek out!
ConFoo Montreal - Approaches for application request throttling | Maarten Balliauw
Speaking from experience building a SaaS: users are insane. If you are lucky, they use your service, but in reality, they probably abuse it. Crazy usage patterns resulting in more requests than expected, request bursts when users come back to the office after the weekend, and more! These all pose a potential threat to the health of our web application and may impact other users or the service as a whole. Ideally, we can apply some filtering at the front door: limiting the number of requests over a given timespan, limiting bandwidth, ...
In this talk, we’ll explore the simple yet complex realm of rate limiting. We’ll go over how to decide on which resources to limit, what the limits should be and where to enforce these limits – in our app, on the server, using a reverse proxy like Nginx or even an external service like CloudFlare or Azure API management. The takeaway? Know when and where to enforce rate limits so you can have both a happy application as well as happy customers.
Microservices for building an IDE – The innards of JetBrains Rider - TechDays... | Maarten Balliauw
Ever wondered how IDEs are built? In this talk, we’ll skip the marketing bit and dive into the architecture and implementation of JetBrains Rider. We’ll look at how and why we have built (and open sourced) a reactive protocol, and how the IDE uses a “microservices” architecture to communicate with the debugger, Roslyn, a WPF renderer and even other tools like Unity3D. We’ll explore how things are wired together, both in-process and across those microservices. Let’s geek out!
JetBrains Day Seoul - Exploring .NET’s memory management – a trip down memory... | Maarten Balliauw
The .NET Garbage Collector (GC) is really cool. It helps provide our applications with virtually unlimited memory, so we can focus on writing code instead of manually freeing up memory. But how does .NET manage that memory? What are hidden allocations? Are strings evil? It still matters to understand when and where memory is allocated. In this talk, we’ll go over the base concepts of .NET memory management and explore how .NET helps us and how we can help .NET – making our apps better. Expect profiling, Intermediate Language (IL), ClrMD and more!
VISUG - Approaches for application request throttling | Maarten Balliauw
Speaking from experience building a SaaS: users are insane. If you are lucky, they use your service, but in reality, they probably abuse it. Crazy usage patterns resulting in more requests than expected, request bursts when users come back to the office after the weekend, and more! These all pose a potential threat to the health of our web application and may impact other users or the service as a whole. Ideally, we can apply some filtering at the front door: limiting the number of requests over a given timespan, limiting bandwidth, ...
In this talk, we’ll explore the simple yet complex realm of rate limiting. We’ll go over how to decide on which resources to limit, what the limits should be and where to enforce these limits – in our app, on the server, using a reverse proxy like Nginx or even an external service like CloudFlare or Azure API management. The takeaway? Know when and where to enforce rate limits so you can have both a happy application as well as happy customers.
What is going on - Application diagnostics on Azure - TechDays Finland | Maarten Balliauw
We all like building and deploying cloud applications. But what happens once that’s done? How do we know if our application behaves like we expect it to behave? Of course, logging! But how do we get that data off of our machines? How do we sift through a bunch of seemingly meaningless diagnostics? In this session, we’ll look at how we can keep track of our Azure application using structured logging, AppInsights and AppInsights analytics to make all that data more meaningful.
ConFoo - Exploring .NET’s memory management – a trip down memory lane | Maarten Balliauw
The .NET Garbage Collector (GC) is really cool. It helps provide our applications with virtually unlimited memory, so we can focus on writing code instead of manually freeing up memory. But how does .NET manage that memory? What are hidden allocations? Are strings evil? It still matters to understand when and where memory is allocated. In this talk, we’ll go over the base concepts of .NET memory management and explore how .NET helps us and how we can help .NET – making our apps better. Expect profiling, Intermediate Language (IL), ClrMD and more!
Speaking from experience building MyGet.org: users are insane. If you are lucky, they use your service, but in reality, they probably abuse it. Crazy usage patterns resulting in more requests than expected, request bursts when users come back to the office after the weekend, and more! These all pose a potential threat to the health of our web application and may impact other users or the service as a whole. Ideally, we can apply some filtering at the front door: limiting the number of requests over a given timespan, limiting bandwidth, ...
In this talk, we’ll explore the simple yet complex realm of rate limiting. We’ll go over how to decide on which resources to limit, what the limits should be and where to enforce these limits – in our app, on the server, using a reverse proxy like Nginx or even an external service like CloudFlare or Azure API management. The takeaway? Know when and where to enforce rate limits so you can have both a happy application as well as happy customers.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
5. “Find this type on NuGet.org”
In ReSharper and Rider
Search for namespaces
& types that are not yet referenced
6. “Find this type on NuGet.org”
Idea in 2013, introduced in ReSharper 9
(2015 - https://www.jetbrains.com/resharper/whatsnew/whatsnew_9.html)
Consists of
ReSharper functionality
A service that indexes packages and powers search
Azure Cloud Service (Web and Worker role)
Indexer uses NuGet OData feed
https://www.nuget.org/api/v2/Packages?$select=Id,Version,NormalizedVersion,LastEdited,Published&$orderby=LastEdited%20desc&$filter=LastEdited%20gt%20datetime%272012-01-01%27
8. NuGet over time...
Repo-signing announced August 10, 2018
Big chunk of packages signed
over holidays 2018/2019
Re-download all metadata & binaries
Very slow over OData
Is there a better way?
https://blog.nuget.org/20180810/Introducing-Repository-Signatures.html
10. NuGet talks to a repository
Can be on disk/network share or remote over HTTP(S)
HTTP(S) APIs
V2 – OData based (used by pretty much all NuGet servers out there)
V3 – JSON based (available on NuGet.org, TeamCity, MyGet.org, Azure DevOps)
11. V2 Protocol
Started as “OData-to-LINQ-to-Entities” (V1 protocol)
Optimizations added to reduce # of random DB queries (VS2013+ & NuGet 2.x)
Search – Package manager list/search
FindPackagesById – Package restore (Does it exist? Where to download?)
GetUpdates – Package manager updates
https://www.nuget.org/api/v2 (code in https://github.com/NuGet/NuGetGallery)
12. V3 Protocol
JSON based
A “resource provider” of various endpoints per purpose
Catalog (NuGet.org only) – append-only event log
Registrations – materialization of newest state of a package
Flat container – .NET Core package restore (and VS autocompletion)
Report abuse URL template
Statistics
…
https://api.nuget.org/v3/index.json (code in https://github.com/NuGet/NuGet.Services.Metadata)
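The "resource provider" idea can be sketched as a lookup in the service index document: each entry in its `resources` array carries an `@id` (the endpoint URL) and an `@type` (the purpose, for example `Catalog/3.0.0`). This is a minimal sketch, not code from the talk; the class and method names are made up for illustration, and the JSON is assumed to have been downloaded already.

```csharp
using System;
using System.Text.Json;

public static class ServiceIndexSketch
{
    // Hypothetical helper: returns the "@id" URL of the first resource whose
    // "@type" equals resourceType, or null when no such resource is listed.
    public static string? FindResource(string serviceIndexJson, string resourceType)
    {
        using var document = JsonDocument.Parse(serviceIndexJson);
        foreach (var resource in document.RootElement.GetProperty("resources").EnumerateArray())
        {
            if (resource.GetProperty("@type").GetString() == resourceType)
            {
                return resource.GetProperty("@id").GetString();
            }
        }
        return null;
    }
}
```

Resolving endpoints this way, rather than hard-coding them, is what lets a client work against any V3 server that publishes a service index.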
13. How does NuGet.org work?
User uploads to NuGet.org
Data added to database
Data added to catalog (append-only data stream)
Various jobs run over catalog using a cursor
Registrations (last state of a package/version), reference catalog entry
Flatcontainer (fast restores)
Search index (search, autocomplete, NuGet Gallery search)
…
14. Catalog seems interesting!
Append-only stream of mutations on NuGet.org
Updates (add/update) and Deletes
Chronological
Can continue where left off (uses a timestamp cursor)
Can restore NuGet.org to a given point in time
Structure
Root https://api.nuget.org/v3/catalog0/index.json
+ Page https://api.nuget.org/v3/catalog0/page0.json
+ Leaf https://api.nuget.org/v3/catalog0/data/2015.02.01.06.22.45/adam.jsgenerator.1.1.0.json
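To make "continue where left off" concrete, here is a minimal sketch of the cursor check. The catalog root lists pages and each page lists leaves, in both cases under `items` with an `@id` and a `commitTimeStamp`, so one helper can filter either document. The class and method names are hypothetical, and downloading the JSON is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class CatalogCursorSketch
{
    // Hypothetical helper: returns the items (pages or leaves) committed after
    // the cursor, oldest first, so they can be processed chronologically.
    public static List<(string Url, DateTimeOffset CommitTimestamp)> ItemsAfter(
        string catalogJson, DateTimeOffset cursor)
    {
        using var document = JsonDocument.Parse(catalogJson);
        return document.RootElement.GetProperty("items").EnumerateArray()
            .Select(item => (
                Url: item.GetProperty("@id").GetString()!,
                CommitTimestamp: item.GetProperty("commitTimeStamp").GetDateTimeOffset()))
            .Where(item => item.CommitTimestamp > cursor)
            .OrderBy(item => item.CommitTimestamp)
            .ToList();
    }
}
```

After processing, the caller would store the largest `CommitTimestamp` it handled as the new cursor value.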
16. “Find this type on NuGet.org”
Refactor from using OData to using V3?
Mostly done, one thing missing: download counts (using search now)
https://github.com/NuGet/NuGetGallery/issues/3532
Build a new version?
Welcome to this talk
18. What do we need?
Watch the NuGet.org catalog for package changes
For every package change
Scan all assemblies
Store relation between package id+version and namespace+type
API that is compatible with all ReSharper and Rider versions
Bonus points!
Easy way to re-index later (copy .nupkg binaries + dump index to JSON blobs)
19. What do we need?
Watch the NuGet.org catalog for package changes (periodic check)
For every package change (based on a queue)
Scan all assemblies
Store relation between package id+version and namespace+type
API that is compatible with all ReSharper and Rider versions (always up, flexible scale)
Bonus points!
Easy way to re-index later (copy .nupkg binaries + dump index to JSON blobs)
21. Sounds like functions!
[Architecture diagram. Labels: NuGet.org catalog, Watch catalog, Index command, Download command, Index package, Download package, Search index, Index as JSON, Raw .nupkg, Find type API, Find namespace API]
23. Functions best practices
https://medium.com/@PaulDJohnston/serverless-best-practices-b3c97d551535
Each function should do only one thing (“it depends”)
Easier error handling & scaling
Functions don’t call other functions
Avoid coupling
Use as few libraries in your functions as possible (preferably zero)
Slower cold starts with more dependencies
Learn to use messages and queues
Asynchronous means of communicating, helps scale
...
27. We’re making progress!
[Architecture diagram, same labels as on slide 21]
29. Next up: indexing
[Architecture diagram, same labels as on slide 21]
30. Indexing
Opening up the .nupkg and reflecting on assemblies
System.Reflection.Metadata
Does not load the assembly being reflected into application process
Provides access to Portable Executable (PE) metadata in assembly
Store relation between package id+version and namespace+type
Azure Search? A database? Redis? Other?
32. System.Reflection.Metadata
// Needs: System.Linq, System.Reflection, System.Reflection.Metadata, System.Reflection.PortableExecutable
using (var portableExecutableReader = new PEReader(assemblySeekableStream))
{
    var metadataReader = portableExecutableReader.GetMetadataReader();
    foreach (var typeDefinition in metadataReader.TypeDefinitions.Select(metadataReader.GetTypeDefinition))
    {
        // Skip types that are not public
        if (!typeDefinition.Attributes.HasFlag(TypeAttributes.Public)) continue;

        var typeNamespace = metadataReader.GetString(typeDefinition.Namespace);
        var typeName = metadataReader.GetString(typeDefinition.Name);

        // Skip compiler-generated types
        if (typeName.StartsWith("<") || typeName.StartsWith("__Static") ||
            typeName.Contains("c__DisplayClass")) continue;

        typeNames.Add($"{typeNamespace}.{typeName}");
    }
}
33. Azure Search
“Search-as-a-Service”
Scales across partitions and replicas
Define an index that will hold documents consisting of fields
Fields can be searchable, facetable, filterable, sortable, retrievable
Can’t be changed easily, think upfront!
Have to define what we want to search, and what we want to display
My function will also write documents to a JSON blob
Can re-index using Azure Search importer in case needed
35. “Do one thing well”
Our function shouldn’t care about creating a search index.
Better: return index operations, have something else handle those
Custom output binding?
37. Almost there…
[Architecture diagram, same labels as on slide 21]
40. One issue left...
Download counts - used for sorting and scoring search results
Change continuously on NuGet
Not part of V3 catalog
Could use search but that’s N(packages) queries
https://github.com/NuGet/NuGetGallery/issues/3532
If that data existed, how to update search?
Merge data! new PackageDocumentDownloads(key, downloadcount)
41. We’re done!
[Architecture diagram, same labels as on slide 21]
42. We’re done!
Functions
Collect changes from NuGet catalog
Download binaries
Index binaries using PE Header
Make search index available in API
Trigger, input and output bindings
Each function should do only one thing
43. We’re done!
All our functions can scale (and fail) independently
Full index in May 2019 took ~12h on 2 B1 instances
Can be faster on more CPUs
~ 1.7 million packages (NuGet.org homepage says)
~ 2.1 million packages (the catalog says)
~ 8 400 catalog pages with ~ 4 200 000 catalog leaves
(hint: repo signing)
44. Closing thoughts…
Would deploy in separate function apps for cost
Trigger binding collects all the time so needs dedicated capacity (and thus, cost)
Others can scale within bounds (think of $$$)
Would deploy in separate function apps for failure boundaries
Trigger, indexing, downloading should not affect health of API
Are bindings portable...?
Avoid them if lock-in matters to you
45. Please rate this session using
Event Master
Mobile App
at the booth
in Lobby
login.developerdays.pl
Show feature in action in Visual Studio (and show you can see basic metadata etc.)
Copied in 2017 in VS - https://www.hanselman.com/blog/VisualStudio2017CanAutomaticallyRecommendNuGetPackagesForUnknownTypes.aspx
Demo the feed quickly?
Around 3 TB in May 2019
Demo ODataDump quickly
Demo: click around in the API to show some base things
Raw API - click around in the API to show some base things, explain how a cursor could go over it
Root https://api.nuget.org/v3/catalog0/index.json
Page https://api.nuget.org/v3/catalog0/page0.json
Leaf https://api.nuget.org/v3/catalog0/data/2015.02.01.06.22.45/adam.jsgenerator.1.1.0.json
Explain CatalogDump
NuGet.Protocol.Catalog comes from GitHub
CatalogProcessor fetches all pages between min and max timestamp
My implementation BatchCatalogProcessor fetches multiple pages at the same time and builds a “latest state” – much faster!
Fetches leaves, for every leaf calls into a simple method
Much faster, easy to pause (keep track of min/max timestamp)
LOL input, process, output
More serious: events trigger code
Periodic check for packages
Queue message to index things
API request runs a search
No server management or capacity planning
Will use storage queues in demos to be able to run things locally. Ideally use Service Bus topics or Event Grid (transactional)
Create a new TimerTrigger function
We will need a function to index things from NuGet
Timer will trigger every X amount of time
Timer provides last timestamp and next timestamp, so we can run our collector for that period
Snippet: demo-timertrigger
Mention HttpClient is not used correctly: a new instance is created on every invocation and never disposed, so it will starve TCP connections at some point
Go over code example and run it
// Demo only: creates a new HttpClient on every timer invocation (see note above)
var httpClient = new HttpClient();
var cursor = new InMemoryCursor(timer.ScheduleStatus?.Last ?? DateTimeOffset.UtcNow);
var processor = new CatalogProcessor(
    cursor,
    new CatalogClient(httpClient, new NullLogger<CatalogClient>()),
    new DelegatingCatalogLeafProcessor(
        added =>
        {
            log.LogInformation("[ADDED] " + added.PackageId + "@" + added.PackageVersion);
            return Task.FromResult(true);
        },
        deleted =>
        {
            log.LogInformation("[DELETED] " + deleted.PackageId + "@" + deleted.PackageVersion);
            return Task.FromResult(true);
        }),
    new CatalogProcessorSettings
    {
        // Only process catalog entries committed between the previous and next timer run
        MinCommitTimestamp = timer.ScheduleStatus?.Last ?? DateTimeOffset.UtcNow,
        MaxCommitTimestamp = timer.ScheduleStatus?.Next ?? DateTimeOffset.UtcNow,
        ServiceIndexUrl = "https://api.nuget.org/v3/index.json"
    },
    new NullLogger<CatalogProcessor>());
await processor.ProcessAsync(CancellationToken.None);
Each function should only do one thing! We are violating this.
Go over Approach1 code – Enqueuer class
Mention we are using roughly the same code as before
Differences are that our function is now no longer doing things itself, instead it’s adding messages to a queue for processing later on
That Queue binding is interesting. This is where the input/output comes from. Instead of managing our own queue connection, we let the framework handle all plumbing so we can focus on adding messages.
In Indexer, we use the Queue as an input binding, and read messages.
We can now scale enqueuing and indexing separately! But are we there yet?
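The decoupling can be sketched without any Azure dependency. In this sketch an in-memory `Channel` stands in for the storage queue (an assumption made only to keep the example runnable locally), one producer plays the Enqueuer, and a configurable number of consumers play the Indexer. `PackageOperation` here is a simplified, hypothetical shape, not the type from the demo code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

// Simplified stand-in for the queue message; the real demo type may differ.
public sealed record PackageOperation(string PackageId, string PackageVersion, bool IsDelete);

public static class QueueDecouplingSketch
{
    // Returns the number of messages the "indexers" handled.
    public static async Task<int> RunAsync(PackageOperation[] operations, int indexerCount)
    {
        var queue = Channel.CreateUnbounded<PackageOperation>();
        var handled = new ConcurrentBag<string>();

        // "Indexer": any number of consumers read from the same queue.
        var indexers = Enumerable.Range(0, indexerCount)
            .Select(_ => Task.Run(async () =>
            {
                await foreach (var operation in queue.Reader.ReadAllAsync())
                {
                    handled.Add($"{operation.PackageId}@{operation.PackageVersion}");
                }
            }))
            .ToArray();

        // "Enqueuer": only writes messages, does no indexing work itself.
        foreach (var operation in operations)
        {
            await queue.Writer.WriteAsync(operation);
        }
        queue.Writer.Complete();

        await Task.WhenAll(indexers);
        return handled.Count;
    }
}
```

The producer never needs to know how many consumers exist, which is the property that lets the two sides scale separately.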
Go over Approach2 code
Show this is MUCH simpler – trigger binding that provides input, queue output binding to write that input to a queue
Let’s go over what it takes to build a trigger binding
NuGetCatalogTriggerAttribute – the data needed for the trigger to work – go over properties and attributes
Hooking it up requires a binding configuration – NuGetCatalogTriggerExtensionConfigProvider
It says: if you see this specific binding, register it as a trigger that maps to some provider
So we need that provider – NuGetCatalogTriggerAttributeBindingProvider
Provider is there to create an object that provides data. In our case we need to store the NuGet catalog timestamp cursor, so we do that on storage, and then return the actual binding – NuGetCatalogTriggerBinding
In NuGetCatalogTriggerBinding, we have to specify how data can be bound. What if I use a different type of object than PackageOperation? What if someone uses a Node.js or Python function instead of .NET? We need to define the shape of the data our trigger provides.
PackageOperationValueProvider is also interesting, this provides data shown in the portal diagnostics
CreateListenerAsync is where the actual trigger code will be created – NuGetCatalogListener
NuGetCatalogListener uses the BatchCatalogProcessor we had previously, and when a package is added or deleted it will call into the injected ITriggeredFunctionExecutor
ITriggeredFunctionExecutor is Azure Functions framework specific, but it’s the glue that will call into our function with the data we provide
Note StartAsync/StopAsync where you can add startup/shutdown code
ONE THING LEFT THAT IS NOT DOCUMENTED – Startup.cs to register the binding.
And since we are in a different class library, also need Microsoft.Azure.WebJobs.Extensions referenced to generate \bin\Debug\netcoreapp2.1\bin\extensions.json
As a result our code is now MUCH cleaner, show it again and maybe also show it in action
Mention [Singleton(Mode = SingletonMode.Listener)] – we need to ensure this binding only runs single-instance (cursor clashes otherwise). This is due to how the catalog works; parallel processing is harder to do. But we can fix that by scaling the Indexer later on.
Show Approach3 PopulateQueueAndTable
Same code, but a bit more production worthy
Sending data to two queues (indexing and downloading)
Storing data in a table (and yes, violating “do one thing” again but I call it architectural freedom)
Next up will be downloading and indexing. Let’s start with downloading.
Grab a copy of the .nupkg from NuGet and store it in a blob
Redundancy - no need to re-download/stress NuGet on a re-index
Go over Approach3 code
DownloadToStorage uses a QueueTrigger to run whenever a message appears in queue
Note no singleton: we can scale this across multiple instances/multiple servers
Uses a Blob input binding that provides access to a blob
Note the parameters, name of the blob is resolved based on data from other inputs which is pretty nifty
Our code checks whether it’s an add or a delete, and either downloads+uploads to the blob reference, or deletes the blob reference
Next up will be indexing itself. There are a couple of things here…
Go over Approach3 code
PackageIndexer uses a QueueTrigger to run whenever a message appears in queue
Uses a Blob input binding that provides access to a blob where we can write our indexed entity – will show this later
Based on package operation, we will add or delete from the index
RunAddPackageAsync has some plumbing, probably too much, to download the .nupkg file and store it on disk
Note: we store it on disk as we need a seekable stream. So why no memory stream? Some NuGet packages are HUGE.
Find PEReader usage and show how it will index a given package’s public types and namespaces
All goes into a typeNames collection.
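The steps above (open the .nupkg, get a seekable stream per assembly, read type definitions) can be sketched end to end as follows. This is a simplified illustration rather than the demo code: the class and method names are hypothetical, the compiler-generated-name filter from the slide is left out, and corrupt assemblies are not handled. A .nupkg is a zip archive, so `ZipFile` can open it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Reflection;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

public static class NupkgTypeScanner
{
    // Hypothetical helper: collects "Namespace.Type" for public top-level types
    // in every .dll found inside the package.
    public static IReadOnlyCollection<string> GetPublicTypeNames(string nupkgPath)
    {
        var typeNames = new HashSet<string>(StringComparer.Ordinal);
        using var archive = ZipFile.OpenRead(nupkgPath);

        foreach (var entry in archive.Entries.Where(e =>
            e.FullName.EndsWith(".dll", StringComparison.OrdinalIgnoreCase)))
        {
            // Zip entry streams are not seekable, so extract to a temp file first.
            var tempFile = Path.GetTempFileName();
            try
            {
                entry.ExtractToFile(tempFile, overwrite: true);
                using var stream = File.OpenRead(tempFile);
                using var peReader = new PEReader(stream);
                if (!peReader.HasMetadata) continue; // native DLL, nothing to index

                var reader = peReader.GetMetadataReader();
                foreach (var handle in reader.TypeDefinitions)
                {
                    var type = reader.GetTypeDefinition(handle);
                    var visibility = type.Attributes & TypeAttributes.VisibilityMask;
                    if (visibility != TypeAttributes.Public) continue;

                    typeNames.Add($"{reader.GetString(type.Namespace)}.{reader.GetString(type.Name)}");
                }
            }
            finally
            {
                File.Delete(tempFile);
            }
        }
        return typeNames;
    }
}
```

One deliberate difference from the slide: the visibility check masks with `VisibilityMask` and keeps only top-level public types, which is stricter than the `HasFlag(TypeAttributes.Public)` test shown there.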
Now: how do we add this info to the index?
Show PackageDocument class, has MANY properties
First important: the Identifier property has [Key] applied. Azure Search needs a key for the document so we can retrieve by key, which could be useful when updating existing content or to find a specific document and delete it from the index.
Second important: TypeNames is searchable. Also mention “simpleanalyzer”: “Divides text at non-letters and converts them to lower case.” Other analyzers remove stopwords and do other things, this one should be as searchable as possible.
Other fields are sometimes searchable, sometimes facetable – a bit of leftover from me thinking about search use cases. The R# API only searches on typename so could make everything else just retrievable as well.
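To show what "think upfront" means for the index shape, here is a sketch that builds an index definition as the JSON document the Azure Search REST API accepts when creating an index. Only `Identifier` (the key) and `TypeNames` (searchable, `simple` analyzer) come from the notes above; `PackageId` and `PackageVersion` are illustrative guesses at the other fields, and the talk itself defines the index through attributes on the PackageDocument class rather than raw JSON.

```csharp
using System;
using System.Text.Json;

public static class IndexDefinitionSketch
{
    // Hypothetical helper: returns a JSON index definition with one key field,
    // two retrievable/filterable fields and one searchable collection field.
    public static string BuildIndexDefinition(string indexName)
    {
        var definition = new
        {
            name = indexName,
            fields = new object[]
            {
                // Key field: lets us look up, update or delete one document.
                new { name = "Identifier", type = "Edm.String", key = true, searchable = false, retrievable = true },
                // Assumed display/filter fields (names are not from the talk).
                new { name = "PackageId", type = "Edm.String", searchable = false, filterable = true, retrievable = true },
                new { name = "PackageVersion", type = "Edm.String", searchable = false, filterable = true, retrievable = true },
                // The only field the search itself runs against.
                new { name = "TypeNames", type = "Collection(Edm.String)", searchable = true, analyzer = "simple", retrievable = true }
            }
        };

        return JsonSerializer.Serialize(definition, new JsonSerializerOptions { WriteIndented = true });
    }
}
```

Keeping everything except `TypeNames` non-searchable matches the remark that the ReSharper API only ever queries on type name.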
Of course, index is not there by default, so need to create it. We do this when our function is instantiated (static constructor, so only once per launch of our functions)
Is this good? Yes, because only once per server instance our function runs on. No because we do it at one point, what if the index is deleted in between and needs to be recreated? Edge case, but a retry strategy could be a good idea...
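A retry strategy for that edge case could look like the sketch below: run the operation, and if it fails in a way that suggests the index is missing, recreate the index and try once more. The helper and its parameter names are hypothetical; which exception signals a missing index depends on the search client in use, so that decision is passed in as a predicate.

```csharp
using System;
using System.Threading.Tasks;

public static class RetrySketch
{
    // Hypothetical helper: runs action; when it fails with a repairable error,
    // runs repair (e.g. "create the index") and retries the action exactly once.
    public static async Task ExecuteWithRepairAsync(
        Func<Task> action,
        Func<Exception, bool> isRepairable,
        Func<Task> repair)
    {
        try
        {
            await action();
        }
        catch (Exception exception) when (isRepairable(exception))
        {
            await repair();
            await action(); // a second failure propagates to the caller
        }
    }
}
```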
Next, we create our package document, and at one point we add it to a list of index actions, and to blob storage
indexActions.Add(IndexAction.MergeOrUpload(packageToIndex));
JsonSerializer.Serialize(jsonWriter, packagesToIndex);
Writing to index using batch - var indexBatch = IndexBatch.New(actions);
Leftover code from earlier, batch makes no sense for one document, but in case you want to do multiple in one go this is the way. Do beware a batch can only be several MB in size, for this NuGet indexing I can only do ~25 in a batch before payload is too large.
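Splitting a larger list of index actions into fixed-size batches, so each upload stays under the payload limit, can be sketched with a small generic helper. The helper name is hypothetical, and the batch size of 25 in the usage note is simply the figure mentioned above, not a documented limit.

```csharp
using System;
using System.Collections.Generic;

public static class BatchingSketch
{
    // Hypothetical helper: yields consecutive batches of at most batchSize items.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }

        // Emit the final, partially filled batch (if any).
        if (batch.Count > 0) yield return batch;
    }
}
```

Each batch returned by `BatchingSketch.InBatches(indexActions, 25)` would then be wrapped in its own index batch and sent separately.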
That’s… it!
Run approach 3 (for last hour) and see functions being hit / packages added to index
Go to Azure Search portal as well, show how importer would work in case of fire
Go over Approach3 code
PackageIndexerWithCustomBinding is mostly the same code
One difference: it uses the [AzureSearchIndex] binding to write add/delete operations to the index instead
Go over how it works. Again, an attribute with settings – AzureSearchIndexAttribute
Also a configuration that registers the binding as an output binding using BindToCollector – AzureSearchExtensionConfigProvider
Now, what’s this OpenType?
It’s some sort of dynamic type. If we want to create an AzureSearch output binding, we better support more than just our PackageDocument use case!
So we need a collector builder that can create the actual binding implementation based on the real type requested by our function parameter – AzureSearchAsyncCollectorBuilder
In AzureSearchAsyncCollectorBuilder, we do that. Very simple bootstrap code in this case, but could be more complex depending on the type of binding you are creating.
Our AzureSearchAsyncCollector uses the attribute to check for Azure Search connection details, as well as the type of operation we expect it to handle. Why not all? Well, IAsyncCollector only has Add and Flush.
Note: add called manually, flush at function complete – could use flush to send things in a batch...
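The "use flush to send things in a batch" idea can be sketched as a collector that only buffers on Add and sends everything on Flush. To keep the example free of the Azure Functions SDK, it declares its own `IAsyncCollectorLike<T>` interface with the same Add/Flush shape; that interface, the class name and the `sendBatch` delegate are all assumptions for illustration, not the binding from the demo.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Local stand-in with the Add/Flush shape described in the notes.
public interface IAsyncCollectorLike<T>
{
    Task AddAsync(T item, CancellationToken cancellationToken = default);
    Task FlushAsync(CancellationToken cancellationToken = default);
}

public sealed class BufferingCollector<T> : IAsyncCollectorLike<T>
{
    private readonly Func<IReadOnlyList<T>, CancellationToken, Task> _sendBatch;
    private readonly List<T> _buffer = new List<T>();
    private readonly object _gate = new object();

    public BufferingCollector(Func<IReadOnlyList<T>, CancellationToken, Task> sendBatch)
    {
        _sendBatch = sendBatch ?? throw new ArgumentNullException(nameof(sendBatch));
    }

    // Add only buffers; nothing is sent yet.
    public Task AddAsync(T item, CancellationToken cancellationToken = default)
    {
        lock (_gate) { _buffer.Add(item); }
        return Task.CompletedTask;
    }

    // Flush sends everything buffered so far as one batch, then clears the buffer.
    public async Task FlushAsync(CancellationToken cancellationToken = default)
    {
        List<T> pending;
        lock (_gate)
        {
            if (_buffer.Count == 0) return;
            pending = new List<T>(_buffer);
            _buffer.Clear();
        }
        await _sendBatch(pending, cancellationToken);
    }
}
```

The trade-off is that a failure surfaces at flush time rather than at the Add call that produced the bad item.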
Code itself pretty straightforward. On Add, we add an action to search. With a retry in case the index does not exist – we then create it.
Creation code kind of interesting as we use some reflection in case we specify a given type of document to index.
Why? Because when we do Upserts, we may want to update just one or two properties, and can use a different Dto in that case (but still have the index shaped to the full document shape)
Run when time left, but nothing fancy here...
Now we need to make ReSharper talk to our search. We have the index, so that should be a breeze, right?
Go over Web code
RunFindTypeApiAsync and RunFindNamespaceAsync
Both use “name” as their query parameter to search for
RunInternalAsync does the heavy lifting
Grabs other parameters
Runs search, and collects several pages of results
Why is this ForEachAsync there?
Search index has multiple versions for every package id, yet ReSharper expects only the latest matching all parameters
Azure Search has no group by / distinct by, so need to do this in memory. Doing it here by fetching a maximum number of results and doing the grouping manually.
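The in-memory grouping can be sketched with LINQ. `SearchHit` is a hypothetical, trimmed-down result type, and the version comparison is a simplification: it parses only the numeric part with `System.Version` and ignores prerelease labels, whereas real code would more likely rely on a proper NuGet version type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified search result.
public sealed record SearchHit(string PackageId, string PackageVersion);

public static class LatestVersionSketch
{
    // Keeps one hit per package id: the one with the highest numeric version.
    public static List<SearchHit> LatestPerPackage(IEnumerable<SearchHit> hits) =>
        hits.GroupBy(hit => hit.PackageId, StringComparer.OrdinalIgnoreCase)
            .Select(group => group.OrderByDescending(hit => NumericVersion(hit.PackageVersion)).First())
            .ToList();

    // Assumption: versions look like "major.minor.patch[-label][+metadata]".
    private static Version NumericVersion(string packageVersion) =>
        Version.Parse(packageVersion.Split('-', '+')[0]);
}
```

Because the grouping happens after the search call, the number of results fetched has to be large enough that the latest version of each matching package is actually in the set.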
Use the collected data to build result. Add matching type names etc.
Example requests:
http://localhost:7071/api/v1/find-type?name=JsonConvert
http://localhost:7071/api/v1/find-type?name=CamoServer&allowPrerelease=true&latestVersion=false
https://nugettypesearch.azurewebsites.net/api/v1/find-type?name=JsonConvert
In ReSharper (devenv /ReSharper.Internal, go to NuGet tool window, set base URL to https://nugettypesearch.azurewebsites.net/api/v1/)
Write some code that uses JsonConvert / JObject and try it out.