In Data Engineer's Lunch #76, Arpan Patel will cover how to connect Airflow and Dataproc with a demo using an Airflow DAG to create a Dataproc cluster, submit an Apache Spark job to Dataproc, and destroy the Dataproc cluster upon completion.
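The create-cluster → submit-job → delete-cluster pattern described above is typically wired up with the Dataproc operators from the Airflow Google provider (DataprocCreateClusterOperator, DataprocSubmitJobOperator, DataprocDeleteClusterOperator). As a minimal sketch of the request bodies those operators consume — with placeholder project, bucket, and cluster names that are illustrative assumptions, not from the original demo:

```python
# Sketch of the Dataproc request bodies an Airflow DAG would hand to
# DataprocCreateClusterOperator / DataprocSubmitJobOperator /
# DataprocDeleteClusterOperator
# (airflow.providers.google.cloud.operators.dataproc).
# All names (project, bucket, cluster) are placeholder assumptions.

def make_cluster_config(machine_type: str = "n1-standard-2", workers: int = 2) -> dict:
    """Cluster spec for the `cluster_config` argument of the create operator."""
    return {
        "master_config": {"num_instances": 1, "machine_type_uri": machine_type},
        "worker_config": {"num_instances": workers, "machine_type_uri": machine_type},
    }


def make_pyspark_job(project_id: str, cluster_name: str, main_py_uri: str) -> dict:
    """Job spec for the `job` argument of the submit operator."""
    return {
        "reference": {"project_id": project_id},
        "placement": {"cluster_name": cluster_name},
        "pyspark_job": {"main_python_file_uri": main_py_uri},
    }


if __name__ == "__main__":
    cluster = make_cluster_config()
    job = make_pyspark_job("my-project", "demo-cluster",
                           "gs://my-bucket/jobs/wordcount.py")
    print(cluster["worker_config"]["num_instances"])
    print(job["placement"]["cluster_name"])
```

In the DAG itself, the delete task is usually given a trigger rule that fires even when the Spark job fails, so the cluster is always torn down.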
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop (huguk)
At Google Cloud Platform, we're combining the Apache Spark and Hadoop ecosystem with our software and hardware innovations. We want to make these awesome tools easier, faster, and more cost-effective, from 3 to 30,000 cores. This presentation will showcase how Google Cloud Platform is innovating with the goal of bringing the Hadoop ecosystem to everyone.
Bio: "I love data because it surrounds us - everything is data. I also love open source software, because it shows what is possible when people come together to solve common problems with technology. While they are awesome on their own, I am passionate about combining the power of open source software with the potential unlimited uses of data. That's why I joined Google. I am a product manager for Google Cloud Platform and manage Cloud Dataproc and Apache Beam (incubating). I've previously spent time hanging out at Disney and Amazon. Beyond Google, love data, amateur radio, Disneyland, photography, running and Legos."
Introduction to Oracle Cloud Infrastructure Services (Knoldus Inc.)
Knoldus is a technology consulting firm that delivers solutions using Reactive Products, IoT, Microservices, API, Data Science, Data Engineering and DevOps. Oracle Cloud Infrastructure (OCI) provides compute, storage and networking capabilities in a highly available hosted environment. OCI's Free Tier offers $300 in cloud credits valid for 30 days to spend on eligible services. OCI accounts have Always Free resources like VMs and databases that can be used for small applications without charges. OCI provides regions, availability domains and fault domains for high availability and scalability across isolated data centers.
Deploying software and controlling infrastructure quickly and safely is a hard task.
In this talk, Brice Fernandes, Customer Success Engineer at Weaveworks, discusses GitOps, an operational model for Kubernetes and beyond to speed up development, while retaining extremely strong security guarantees. Brice describes and shows several open source tools developed at Weaveworks to support this approach. You will have a good idea of how to use the GitOps principles to create software pipelines that are fast, safe, and reproducible, while creating clear and high quality audit trails.
Check out the full presentation on YouTube: https://youtu.be/QdCwUUtcj4I
The document discusses challenges facing today's enterprises such as cutting costs, driving value with tight budgets, maintaining security while increasing access, and finding the right transformative capabilities. It then discusses challenges in building applications related to scaling, availability, and costs. The remainder summarizes Microsoft's Windows Azure cloud computing platform, how it addresses these challenges, example use cases, and pricing models.
Oracle Cloud Infrastructure (OCI) provides computing power and services for running cloud native applications and workloads. It offers autonomous services, integrated security, and performance. OCI delivers compute, storage, networking and other services globally. Key features include automated services using AI/ML, lower costs than AWS, and easy migration of Oracle applications from on-premises. OCI also includes analytics services like Oracle Analytics Cloud for data visualization and insights.
Using Azure DevOps to continuously build, test, and deploy containerized appl... (Adrian Todorov)
Using Azure DevOps and containers, developers can continuously build, test, and deploy applications to Kubernetes with ease. Azure DevOps provides tools for continuous integration, release management, and monitoring that integrate well with containerized applications on Kubernetes. Developers benefit from being able to focus on writing code while operations manages the infrastructure. Azure Kubernetes Service (AKS) makes it simple to deploy and manage Kubernetes clusters in Azure without having to worry about installing or maintaining the Kubernetes master components.
The document discusses Azure Arc, Microsoft's solution for extending Azure management and security capabilities to any infrastructure. Key points include:
- Azure Arc allows deploying and managing Kubernetes applications across environments using DevOps techniques and ensuring consistent configuration.
- It enables running data services anywhere for latency or compliance reasons and seamlessly managing data assets across on-premises, clouds and edge.
- Azure Arc provides a way to centrally organize and govern Kubernetes clusters and servers that may be sprawling across clouds, datacenters and edge from a single place.
Cloud Run is a serverless compute platform that allows running stateless containers without managing infrastructure or clusters. It supports many languages and automatically scales applications. Cloud Run has several advantages over Google Kubernetes Engine (GKE), such as automatic scaling, pay-per-use pricing, and a fully managed platform. However, GKE allows more control and supports additional GCP products and deployment strategies, at the cost of managing infrastructure.
The document provides an introduction to Red Hat OpenShift, including:
- An overview of the differences between virtual machines and container technologies like Docker.
- The evolution of container technologies and standards like Kubernetes, CRI, and CNI.
- Why Kubernetes is used for container orchestration and why Red Hat OpenShift is a popular Kubernetes distribution.
- Key features of Red Hat OpenShift like source-to-image builds, integrated monitoring, security, and log aggregation with EFK.
1- Introduction of Azure data factory.pptx (BRIJESH KUMAR)
Azure Data Factory is a cloud-based data integration service that allows users to easily construct extract, transform, load (ETL) and extract, load, transform (ELT) processes without code. It offers job scheduling, security for data in transit, integration with source control for continuous delivery, and scalability for large data volumes. The document demonstrates how to create an Azure Data Factory from the Azure portal.
Continuous Integration and Continuous Delivery with Azure DevOps - Deploy Anyt... (Janusz Nowak)
Continuous Integration and Continuous Delivery with Azure DevOps - Deploy Anything to Anywhere with Azure DevOps
Janusz Nowak
@jnowwwak
https://www.linkedin.com/in/janono
https://github.com/janusznowak
https://blog.janono.pl
Serverless with Google Cloud Functions (Jerry Jalava)
This document discusses Google Cloud Functions, a serverless platform for running code in response to events. It provides an overview of Google Cloud Functions' features such as triggers from Cloud Pub/Sub and Storage, integration with other Google Cloud services, and use cases including building mobile backends, APIs, data processing, and IoT. The document also discusses using Google Cloud Functions with Firebase and pricing.
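To make the event-driven model concrete, here is a minimal sketch of the kind of function body the Cloud Functions Python runtime invokes for an HTTP trigger (the runtime passes a Flask-style request object). The function name and response format are illustrative assumptions, not from the original talk:

```python
# Sketch of an HTTP-triggered function body (Python runtime convention:
# the platform calls it with a Flask-style request object).
# The function name and response format are illustrative assumptions.

def handle_request(request) -> str:
    """Return a greeting; fall back to 'World' when no name is supplied."""
    name = "World"
    if request is not None:
        args = getattr(request, "args", None)  # query parameters, if any
        if args and args.get("name"):
            name = args.get("name")
    return f"Hello, {name}!"


if __name__ == "__main__":
    print(handle_request(None))  # Hello, World!
```

A Pub/Sub- or Storage-triggered function follows the same shape, except the runtime passes an event payload and context instead of a request object.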
Learn how to deliver software like Pivotal and Google.
In this one-day program, Pivotal and Google share how we deliver software applications. By demonstrating the capabilities of a cloud-native software organization, we’ll share the promises Pivotal Cloud Foundry can help you keep when combined with industry-leading services and infrastructure using Google Cloud Platform (GCP).
We built Pivotal Cloud Foundry so you can deliver software with increased velocity and reduced risk. Together we will share how to make the principles of Google’s Site Reliability Engineering (SRE) achievable on Pivotal Cloud Foundry. Google and Pivotal collaborated to make Pivotal Cloud Foundry a reliable place for your applications to live.
The day will open with an introduction to Pivotal, Google, and our shared partner ecosystem. Pivotal will share how culture and technology combine to reinforce each other. We will go hands-on to show you how easy it is to develop applications with Spring Boot, integrate with Google Cloud services, and use Concourse to automate shipping applications to Pivotal Cloud Foundry.
In the afternoon, we’ll show you how Pivotal Cloud Foundry operators can empower development teams by enabling GCP integrations in their Pivotal Cloud Foundry environment. We’ll then focus on the developer experience of integrating applications with GCP’s powerful services.
Questions? Please email us at cloudnativeroadshow@pivotal.io.
This document discusses improving the developer experience through GitOps and ArgoCD. It recommends building developer self-service tools for cloud resources and Kubernetes to reduce frustration. Example GitLab CI/CD pipelines are shown that handle releases, deployments to ECR, and patching apps in an ArgoCD repository to sync changes. The goal is to create faster feedback loops through Git operations and automation to motivate developers.
The Azure platform comprises more than 200 cloud products and services designed to help you bring new solutions to life, solving today's challenges and creating the future. Build, run, and manage applications across multiple clouds, on-premises, and at the edge, with the tools and frameworks of your choice.
Helm - Application deployment management for Kubernetes (Alexei Ledenev)
Use Helm to package and deploy a composed application to any Kubernetes cluster. Manage your releases easily over time and across multiple K8s clusters.
Simplify DevOps with Microservices and Mobile Backends.pptx (ssuser5faa791)
This document discusses simplifying DevOps with microservices and mobile backends. It introduces Oracle's Backend for Spring Boot platform, which provides a unified backend for developing apps using Kubernetes, containers, and the Oracle database. The platform offers developer tools, platform services, and integration with the Oracle database. It also discusses managing transactions across microservices using sagas and Oracle's Transaction Manager. The presentation concludes by inviting attendees to try out building a sample banking application in the provided hands-on lab.
The document discusses infrastructure as code (IAC) and its principles and categories. Some key points:
- IAC treats infrastructure like code by writing code to define, deploy, and update infrastructure. This allows infrastructure to be managed programmatically.
- Common categories of IAC include ad hoc scripts, configuration management tools like Ansible and Puppet, server templating tools like Packer, and server provisioning tools like Terraform.
- Benefits of IAC include automation, consistency, repeatability, versioning, validation, reuse, and allowing engineers to focus on code instead of manual tasks.
- AWS offers CloudFormation for provisioning AWS resources through templates. Other tools integrate with Cloud
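The core IAC idea above — infrastructure described as data, so it can be versioned, diffed, and validated like any other code — can be sketched in a few lines. A minimal sketch of building a CloudFormation-style template programmatically; the resource and bucket names are illustrative assumptions:

```python
import json

# Sketch: infrastructure described as data so it can be versioned,
# reviewed, and validated like any other code. Resource and bucket
# names are illustrative assumptions, not from the original document.

def make_template(bucket_name: str) -> dict:
    """Build a minimal CloudFormation-style template with one S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "AppBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {"BucketName": bucket_name},
            }
        },
    }


if __name__ == "__main__":
    # Serialize to JSON, the form the CloudFormation service consumes.
    print(json.dumps(make_template("my-app-artifacts"), indent=2))
```

In practice the template would live in version control and be deployed through a pipeline, which is what gives IAC its repeatability and audit-trail benefits.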
In this session, Diógenes gives an introduction of the basic concepts that make OpenShift, giving special attention to its relationship with Linux containers and Kubernetes.
Oracle Cloud Infrastructure (OCI) is a secure, scalable, and highly available cloud computing service provided by Oracle. It offers infrastructure services like compute, storage, and networking, and features built-in security, high performance, and hybrid integration capabilities. Customers can use OCI to run enterprise workloads, develop applications, process big data, and more, with flexible pricing and 24/7 technical support.
Developer Experience at Zalando - Handelsblatt Strategisches IT-Management 2019 (Henning Jacobs)
Talk given at 25. Handelsblatt Jahrestagung Strategisches IT-Management in Munich on 2019-01-23. Original title (German): "Developer Experience bei Zalando: Entwicklerproduktivität steigern mit Cloud Native Infrastruktur"
- How do you make more than 1,100 developers happy and effective?
- The developer as customer: product management for platform teams
- You build it, you run it: self-service infrastructure with Kubernetes and AWS
- The path from a classical infrastructure team to Developer Productivity as a department
Introduction to Kubernetes and Google Container Engine (GKE), by Opsta
Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. This presentation gives an overview of Kubernetes concepts and their benefits with Google Container Engine (GKE).
GDG DevFest Bangkok 2017 at Ananda UrbanTech FYI Center on October 7, 2017
See Facebook Live here
https://www.facebook.com/gamez.always/videos/10204052467627401/
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google... (James Anderson)
Do you know The Cloud Girl? She makes the cloud come alive with pictures and storytelling.
The Cloud Girl, Priyanka Vergadia, Chief Content Officer @Google, joins us to tell us about Scalable Data Analytics in Google Cloud.
Maybe, with her explanation, we'll finally understand it!
Priyanka is a technical storyteller and content creator who has created over 300 videos, articles, podcasts, courses, and tutorials that help developers learn Google Cloud fundamentals, solve their business challenges, and pass certifications! Check out her content on the Google Cloud Tech YouTube channel.
Priyanka enjoys drawing and painting which she tries to bring to her advocacy.
Check out her website The Cloud Girl: https://thecloudgirl.dev/ and her new book: https://www.amazon.com/Visualizing-Google-Cloud-Illustrated-References/dp/1119816327
OpenShift is a Platform-as-a-Service that provides development environments on demand using containers. It automates application lifecycles including build, deploy, and retirement. OpenShift uses containers to package applications and dependencies in a portable way. Red Hat addresses concerns around adopting containers at scale through OpenShift, which provides security, scalability, integration, management and certification capabilities. OpenShift runs on a user's choice of infrastructure and orchestrates applications across nodes using Kubernetes.
Scaling your Data Pipelines with Apache Spark on Kubernetes (Databricks)
There is no doubt Kubernetes has emerged as the next generation of cloud native infrastructure to support a wide variety of distributed workloads. Apache Spark has evolved to run both machine learning and large-scale analytics workloads. There is growing interest in running Apache Spark natively on Kubernetes. By combining the flexibility of Kubernetes with the scalable data processing of Apache Spark, you can run data and machine learning pipelines on this infrastructure while effectively utilizing the resources at your disposal.
In this talk, Rajesh Thallam and Sougata Biswas will share how to effectively run your Apache Spark applications on Google Kubernetes Engine (GKE) and Google Cloud Dataproc, and how to orchestrate data and machine learning pipelines with managed Apache Airflow on GKE (Google Cloud Composer). The following topics will be covered:
- Understanding key traits of Apache Spark on Kubernetes
- Things to know when running Apache Spark on Kubernetes, such as autoscaling
- Demonstrating analytics pipelines on Apache Spark orchestrated with Apache Airflow on a Kubernetes cluster
Talend Summer '17 Release: New Features and Tech Overview (Talend)
See the new release: https://www.talend.com/products/talend-6/
Talend Summer ’17 delivers the latest cloud and big data innovations so you can get a 360-degree view of your customer across multiple cloud platforms. Accelerate AWS, Microsoft Azure and Google Cloud Platform adoption, with the flexibility and portability to easily reuse development work across the cloud.
In this presentation, we break down new features for data quality, ESB, cloud, big data and more.
Spark on Dataproc - Israel Spark Meetup at taboola (tsliwowicz)
This document summarizes a presentation about Google Cloud Dataproc, a fully managed Spark and Hadoop service. It provides an overview of Dataproc's features like fast cluster provisioning, minute-based billing, and integration with other Google Cloud services. The presentation demonstrates Dataproc's pricing and performance advantages over AWS EMR, and outlines Google's roadmap to add more frameworks, tools, and data stores to Dataproc.
Introduction to Apache Airflow, its main concepts and features, and an example of a DAG. Afterwards, some lessons and best practices learned from the three years I have been using Airflow to power workflows in production.
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Anant Corporation
This document discusses automating Apache Cassandra operations using Apache Airflow. It recommends using Airflow to schedule and automate workflows for ETL, data hygiene, import/export, and more. It provides an overview of using Apache Spark jobs within Airflow DAGs to perform tasks like data cleaning, deduplication, and migrations for Cassandra. The document includes demos of using Airflow and Spark with Cassandra on DataStax Astra and discusses considerations for implementing this solution.
Hybrid data lake on google cloud with alluxio and dataprocAlluxio, Inc.
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
Hybrid Data Lake on Google Cloud with Alluxio and Dataproc
Roderick Yao, Strategic Cloud Engineer (Google Cloud)
About Alluxio: alluxio.io
Engage with the open source community on slack: alluxio.io/slack
[Study Guide] Google Professional Cloud Architect (GCP-PCA) CertificationAmaaira Johns
Start here: https://bit.ly/3bGEd9l. Get complete detail on the GCP-PCA exam guide to crack the Professional Cloud Architect certification. You can collect all information on the GCP-PCA tutorial, practice tests, books, study material, exam questions, and syllabus. Firm up your knowledge of Professional Cloud Architect and get ready to crack the GCP-PCA certification. Explore all information on the GCP-PCA exam, including the number of questions, passing percentage, and time duration to complete the test.
Running Dataproc At Scale in production - Searce Talk at GDG DelhiSearce Inc
This document provides information about Dataproc, Google Cloud's fully managed Spark and Hadoop service. It discusses how Dataproc allows users to create clusters on-demand to process large datasets in a flexible and cost-effective manner. It also covers how Dataproc integrates with other Google Cloud services and provides open-source tools like Spark, Hadoop, Hive and Pig. Additionally, it summarizes best practices for using Dataproc such as leveraging initialization actions, specifying cluster versions, and using the Jobs API for submissions.
The document discusses upcoming features and changes in Apache Airflow 2.0. Key points include:
1. Scheduler high availability will use an active-active model with row-level locks to allow killing a scheduler without interrupting tasks.
2. DAG serialization will decouple DAG parsing from scheduling to reduce delays, support lazy loading, and enable features like versioning.
3. Performance improvements include optimizing the DAG file processor and using a profiling tool to identify other bottlenecks.
4. The Kubernetes executor will integrate with KEDA for autoscaling and allow customizing pods through templating.
5. The official Helm chart, functional DAGs, and smaller usability changes are also planned.
Extending Twitter's Data Platform to Google CloudDataWorks Summit
Twitter's Data Platform is built using multiple complex open source and in-house projects to support data analytics on hundreds of petabytes of data. Our platform supports storage, compute, data ingestion, discovery, and management, along with various tools and libraries to help users with both batch and realtime analytics. Our Data Platform operates on multiple clusters across different data centers to help thousands of users discover valuable insights. As we were scaling our Data Platform to multiple clusters, we also evaluated various cloud vendors to support use cases outside of our data centers. In this talk we share our architecture and how we extend our data platform to use the cloud as another data center. We walk through our evaluation process and the challenges we faced supporting data analytics at Twitter scale in the cloud, and present our current solution. Extending Twitter's data platform to the cloud was a complex task, which we deep dive into in this presentation.
How a distributed graph analytics platform uses Apache Kafka for data ingesti...HostedbyConfluent
Using Kafka to stream data into TigerGraph, a distributed graph database, is a common pattern in our customers’ data architecture. In the TigerGraph database, the Kafka Connect framework was used to build the native S3 data loader. In TigerGraph Cloud, we will be building native integration with many data sources such as Azure Blob Storage and Google Cloud Storage, using Kafka as an integrated component of the Cloud Portal.
In this session, we will be discussing both architectures: 1. built-in Kafka Connect framework within TigerGraph database; 2. using Kafka cluster for cloud native integration with other popular data sources. Demo will be provided for both data streaming processes.
Cloud Composer workshop at Airflow Summit 2023.pdfLeah Cole
Cloud Composer workshop agenda includes:
- Introductions from engineering managers and staff
- Setting up workshop projects and GCP credits for participants
- Introduction to Cloud Composer architecture and features
- Disaster recovery process using Cloud Composer snapshots for high availability
- Demonstrating data lineage capabilities between Cloud Composer, BigQuery and Dataproc
The document summarizes a meetup on data streaming and machine learning with Google Cloud Platform. The meetup consisted of two presentations:
1. The first presentation discussed using Apache Beam (Dataflow) on Google Cloud Platform to parallelize machine learning training for improved performance. It showed how Dataflow was used to reduce training time from 12 hours to under 30 minutes.
2. The second presentation demonstrated building a streaming pipeline for sentiment analysis on Twitter data using Dataflow. It covered streaming patterns, batch vs streaming processing, and a demo that ingested tweets from PubSub and analyzed them using Cloud NLP API and BigQuery.
Flink Forward SF 2017: James Malone - Make The Cloud Work For YouFlink Forward
You should spend your time using the powerful Apache Flink ecosystem to get value from your data, not on your data processing infrastructure. Cloud environments can help you with this problem by providing managed services and infrastructure. Since Google Cloud Dataproc, Google's managed service to power the Apache big data ecosystem, runs Flink, you can easily combine the benefits of cloud with your Flink data pipelines. With new support for Flink and long-running streaming jobs, we will show you how you can set up a cluster and a streaming job in less than three minutes.
AWS Big Data Demystified #3 | Zeppelin + spark sql, jdbc + thrift, ganglia, r...Omid Vahdaty
AWS Big Data Demystified is all about knowledge sharing, because knowledge should be given for free. In this lecture we will discuss the advantages of working with Zeppelin + Spark SQL, JDBC + Thrift, Ganglia, R + SparkR + Livy, and a little bit about Ganglia on EMR.
Subscribe to our YouTube channel to see the video of this lecture:
https://www.youtube.com/channel/UCzeGqhZIWU-hIDczWa8GtgQ?view_as=subscriber
Cloud Dataflow is a fully managed service and SDK from Google that allows users to define and run data processing pipelines. The Dataflow SDK defines the programming model used to build streaming and batch processing pipelines. Google Cloud Dataflow is the managed service that will run and optimize pipelines defined using the SDK. The SDK provides primitives like PCollections, ParDo, GroupByKey, and windows that allow users to build unified streaming and batch pipelines.
SEC302 Twitter's GCP Architecture for its petabyte scale data storage in gcs...Vrushali Channapattan
Twitter collects petabytes of data every day and empowers its engineers and data scientists for large data processing with a hybrid on-premises and cloud model. In this talk, we will look at its GCP architecture and the resource hierarchy. We will deep dive into the storage design that uses Google Cloud Storage to organize petabytes of data replicated from on-premises HDFS clusters. We will take a look at how the user-management tooling has been designed to create and manage access for thousands of accounts (human and service accounts) at Twitter. We will talk about how the design deals with the security measures for accounts and tooling systems running in GCP and the complexities of dataset permissions. We will share the challenges we faced as we tried to design our system at scale, along with our learnings and solutions.
As more workloads move to serverless-like environments, the importance of properly handling downscaling increases. While recomputing the entire RDD makes sense for dealing with machine failure, if your nodes are being removed more frequently, you can end up in a seemingly endless loop, where you scale down, need to recompute the expensive part of your computation, scale back up, and then need to scale back down again.
Even if you aren't in a serverless-like environment, preemptible or spot instances can encounter similar issues with large decreases in workers, potentially triggering large recomputes. In this talk, we explore approaches for improving the scale-down experience on open source cluster managers such as YARN and Kubernetes: everything from how to schedule jobs to the location of blocks and their impact (shuffle and otherwise).
This document introduces Spark 2.0 and its key features, including the Dataset abstraction, Spark Session API, moving from RDDs to Datasets, Dataset and DataFrame APIs, handling time windows, and adding custom optimizations. The major focus of Spark 2.0 is standardizing on the Dataset abstraction and improving performance by an order of magnitude. Datasets provide a strongly typed API that combines the best of RDDs and DataFrames.
Similar to Data Engineer's Lunch #76: Airflow and Google Dataproc (20)
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137Anant Corporation
Discussion of LLM fine-tuning with an overview of fine-tuning types and datasets: specifically we will talk about the method that we used to turn an existing collection of Cassandra information into a set of instructions and responses that we can use for fine tuning.
What's AGI? How is it different from an Agent or an AI Assistant? If you're looking to understand how AI Agents/AGI can help your company, check this out.
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache PinotAnant Corporation
In this meetup, we will introduce the concepts of Real Time Analytics, why it is important, the evolution of Analytics, and how companies such as LinkedIn, Stripe, Uber and more are using Real Time analytics to grow their audience and improve usability by using Apache Pinot. What is Apache Pinot? Followed by Demo and Q&A.
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...Anant Corporation
Series: Using AI / ChatGPT at Work - GPT Automation
Are you a small business owner or web developer interested in leveraging the power of GPT (Generative Pretrained Transformer) technology to enhance your business processes? If so, Join us for a series of events focused on using GPT in business. Whether you're a small business owner or a web developer, you'll learn how to leverage GPT to improve your workflow and provide better services to your customers.
GPT Automation: What it is and How it Works
How Time-Saving GPT Automation Can Improve Your Business
Cost-Effective GPT Automation: How it Can Save Your Business Money
Using GPT Automation for Customer Service: Benefits and Best Practices
The Power of GPT Automation for Content Creation
Data Analysis Made Easy with GPT Automation
Top GPT-3 Automation Tools for Businesses
The Ethical Considerations of GPT Automation
Overcoming Bias in GPT Automation: Best Practices
The Future of GPT Automation: Trends and Predictions
Since we focus on "no code" here, we'll explore the tools that are already out there such as ChatGPT plugins for Chrome, OpenAI GPT API, low-code/no-code platforms like Make/Integromat and Zapier, existing apps like Jasper/Rytr, and ecosystem tools like Everyprompt. We'll also discuss the resources available for those interested in learning more about GPT, including other people’s prompts.
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAnant Corporation
This document provides an agenda for a full-day bootcamp on large language models (LLMs) like GPT-3. The bootcamp will cover fundamentals of machine learning and neural networks, the transformer architecture, how LLMs work, and popular LLMs beyond ChatGPT. The agenda includes sessions on LLM strategy and theory, design patterns for LLMs, no-code/code stacks for LLMs, and building a custom chatbot with an LLM and your own data.
In Apache Cassandra Lunch #131: YugabyteDB Developer Tools, we discussed third party developer tools that are compatible with YugabyteDB. We talked about using Yugabyte Developer Tools for data visualization and schema management. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST.
Developer tools play a critical role in simplifying and streamlining database development and management. They allow developers and administrators to be more productive, reducing the time and effort required to create and maintain database schemas, write SQL queries, test database performance, and enable collaboration. Developer tools also make it possible to track changes over time, improving the ability to manage the entire development lifecycle.
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapAnant Corporation
In this episode we'll discuss the different flavors of prompt engineering in the LLM/GPT space. Depending on your skill level, you should be able to pick up at any of the following:
Leveling up with GPT
1: Use ChatGPT / GPT Powered Apps
2: Become a Prompt Engineer on ChatGPT/GPT
3: Use GPT API with NoCode Automation, App Builders
4: Create Workflows to Automate Tasks with NoCode
5: Use GPT API with Code, make your own APIs
6: Create Workflows to Automate Tasks with Code
7: Use GPT API with your Data / a Framework
8: Use GPT API with your Data / a Framework to Make your own APIs
9: Create Workflows to Automate Tasks with your Data /a Framework
10: Use Another LLM API other than GPT (Cohere, HuggingFace)
11: Use open source LLM models on your computer
12: Finetune / Build your own models
In Data Engineer’s Lunch #89: Machine Learning Orchestration with Airflow, we discussed using Apache Airflow to manage and schedule machine learning tasks. By following the best practices of MLOps, teams can streamline their ML workflows and build scalable, efficient, and accurate models that deliver real-world business value. Properly implemented MLOps can help organizations stay ahead of the curve and achieve their goals in the fast-paced world of machine learning.
Apache Airflow is an open-source tool for scheduling and automating workflows. Airflow allows you to define workflows in Python, with tasks defined as Python functions that can include Operators for all sorts of external tools. This makes it easy to automate repeated processes and define dependencies between tasks, creating directed acyclic graphs of tasks that can be scheduled using cron syntax or fixed frequencies. Airflow also features a user-friendly UI for monitoring task progress and viewing logs, giving you greater control over your data pipeline.
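The "directed acyclic graphs of tasks" idea can be illustrated without Airflow itself. A minimal pure-Python sketch using the standard library's graphlib (the task names here are hypothetical, not from the talk): each task lists its upstream dependencies, and a topological sort yields a valid execution order, which is essentially what Airflow's scheduler computes before running tasks.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks that must finish before it can run,
# mirroring how Airflow resolves a DAG of tasks before scheduling them.
dag = {
    "extract": set(),
    "clean": {"extract"},
    "train": {"clean"},
    "report": {"train", "clean"},
}

# static_order() raises CycleError if the graph is not acyclic.
order = list(TopologicalSorter(dag).static_order())
print(order)  # a valid run order, e.g. ['extract', 'clean', 'train', 'report']
```

Airflow adds scheduling, retries, and operators on top, but the dependency-resolution core is the same idea.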
Cassandra Lunch 130: Recap of Cassandra Forward TalksAnant Corporation
If you didn't attend, you don't want to miss a much shorter synopsis of what was covered and get some thoughts from us as to why they are important. We'll talk about the main topics of the event.
1. ACID transactions on Cassandra by Aaron Ploetz, Datastax
2. Apache Flink with Apache Cassandra by Satyajit Thadeswar, Netflix
3. Durable Execution built on Apache Cassandra by Loren Sands-Ramshaw, Temporal
4. Switching from Mongo to Cassandra with Mongoose & new Stargate JSON API, Valeri Karpov
5. Cloud Native and Realtime AI/ML with Patrick McFadin and Davor Bonaci, Datastax
Data Engineer's Lunch 90: Migrating SQL Data with ArcionAnant Corporation
In Data Engineer's Lunch 90, Eric Ramseur teaches our audience how to use Arcion.
From best practices to real-world examples, this talk will provide you with the knowledge and insights you need to ensure a successful migration of your SQL data. So whether you're new to data migration or looking to improve your existing process, join us and discover how Arcion can help you achieve your goals.
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Anant Corporation
In Data Engineer's Lunch 89, Obioma Anomnachi will discuss how to manage and schedule Machine Learning operations via Airflow. Learn how you can write complete end-to-end pipelines starting with retrieving raw data to serving ML predictions to end-users, entirely in Airflow.
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Anant Corporation
As the demand for real-time data processing continues to grow, so too do the challenges associated with building production-ready applications that can handle large volumes of data and handle it quickly. In this talk, we will explore common problems faced when building real-time applications at scale, with a focus on a specific use case: detecting and responding to cyclist crashes. Using telemetry data collected from a fitness app, we’ll demonstrate how we used a combination of Apache Kafka and Python-based microservices running on Kubernetes to build a pipeline for processing and analyzing this data in real-time. We'll also discuss how we used machine learning techniques to build a model for detecting collisions and how we implemented notifications to alert family members of a crash. Our ultimate goal is to help you navigate the challenges that come with building data-intensive, real-time applications that use ML models. By showcasing a real-world example, we aim to provide practical solutions and insights that you can apply to your own projects.
Key takeaways:
An understanding of the common challenges faced when building real-time applications at scale
Strategies for using Apache Kafka and Python-based microservices to process and analyze data in real-time
Tips for implementing machine learning models in a real-time application
Best practices for responding to and handling critical events in a real-time application
Data Engineer's Lunch #85: Designing a Modern Data StackAnant Corporation
What are the design considerations that go into architecting a modern data warehouse? This presentation will cover some of the requirements analysis, design decisions, and execution challenges of building a modern data lake/data warehouse.
In Apache Cassandra Lunch #121: Migrating to Azure Managed Instance for Apache Cassandra, we discussed different methods for migrating data from existing Cassandra instances to Azure hosted options.
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergAnant Corporation
In this talk, Dremio Developer Advocate, Alex Merced, discusses strategies for migrating your existing data over to Apache Iceberg. He'll go over the following:
How to Migrate Hive, Delta Lake, JSON, and CSV sources to Apache Iceberg
Pros and Cons of an In-place or Shadow Migration
Migrating between Apache Iceberg catalogs Hive/Glue -- Arctic/Nessie
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsAnant Corporation
In this lunch, Johnny will show us how easy it is to start monitoring your Cassandra cluster in minutes. He will explain the various aspects and features of Cassandra that need to be monitored, how to do it, and most importantly why! Approaches for backups and Cassandra repairs will be discussed and explored in detail.
Learn how AxonOps significantly reduces the complexity and overhead when looking after Cassandra and ensures your Cassandra cluster is reliable and resilient.
Experienced developer, DevOps, architect, and AxonOps co-founder, Johnny Miller, has worked with a wide variety of companies – from small start-ups to large enterprises. He has been working with Cassandra for many years and has a deep understanding of the challenges facing modern companies looking to adopt Apache Cassandra.
In Apache Cassandra Lunch #119, Rahul Singh will cover a refresher on GUI desktop/web tools for users that want to get their hands dirty with Cassandra but don't want to deal with CQLSH to do simple queries. Some of the tools are web-based and others are installed on your desktop. Since the beginning days of Cassandra, a lot has changed and there are many options for command-line-haters to use Cassandra.
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessAnant Corporation
In Data Engineer's Lunch #60, Rahul Singh, CEO here at Anant, will discuss modern data processing/pipeline approaches.
Want to learn about modern data engineering patterns & practices for global data platforms? A high-level overview of different types, frameworks, and workflows in data processing and pipeline design.
Generative Classifiers: Classifying with Bayesian decision theory, Bayes’ rule, Naïve Bayes classifier.
Discriminative Classifiers: Logistic Regression, Decision Trees: Training and Visualizing a Decision Tree, Making Predictions, Estimating Class Probabilities, The CART Training Algorithm, Attribute selection measures- Gini impurity; Entropy, Regularization Hyperparameters, Regression Trees, Linear Support vector machines.
06-18-2024-Princeton Meetup-Introduction to MilvusTimothy Spann
tim.spann@zilliz.com
https://www.linkedin.com/in/timothyspann/
https://x.com/paasdev
https://github.com/tspannhw
https://github.com/milvus-io/milvus
Get Milvused!
https://milvus.io/
Read my Newsletter every week!
https://github.com/tspannhw/FLiPStackWeekly/blob/main/142-17June2024.md
For more cool Unstructured Data, AI and Vector Database videos check out the Milvus vector database videos here
https://www.youtube.com/@MilvusVectorDatabase/videos
Expand LLMs' knowledge by incorporating external data sources into LLMs and your AI applications.
Data Engineer's Lunch #76: Airflow and Google Dataproc
1. Version 1.0
Airflow and Google Dataproc
In Data Engineer's Lunch #76, Arpan Patel will cover how to
connect Airflow and Google Dataproc with a demo using an Airflow
DAG to create a Dataproc cluster, submit an Apache Spark job to
Dataproc, and destroy the Dataproc cluster upon completion.
Arpan Patel
Engineer @ Anant
2. Google Dataproc
● Fully managed and highly scalable service for running
Apache Spark, Apache Flink, Presto, and 30+ open source
tools and frameworks
○ Lets you take advantage of open source data tools
for batch processing, querying, streaming, and
machine learning
● Dataproc clusters are quick to start, scale, and shutdown,
with each of these operations taking 90 seconds or less,
on average
● Built-in integration with other Google Cloud Platform
services, such as BigQuery, Cloud Storage, Cloud
Bigtable, Cloud Logging, and Cloud Monitoring
● Can easily interact with clusters and Spark or Hadoop
jobs through the Google Cloud console, the Cloud SDK, or
the Dataproc REST API
4. Google Dataproc + DataStax Astra
● Cluster Properties
○ dataproc:dataproc.conscrypt.provider.enable=false
● Job Properties
○ spark.jars.packages → com.datastax.spark:spark-cassandra-connector_2.12:3.1.0
● DAG param mappings to GCP REST API mappings
○ camelCase keys need to be converted to snake_case. For example, masterConfig -> master_config
○ if we want to use GKE for Dataproc cluster creation, then we need to swap cluster_config for virtual_cluster_config
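The camelCase-to-snake_case mapping above is mechanical enough to script. A minimal sketch in plain Python (not tied to any Airflow release; the payload below is an illustrative fragment, not the demo's actual config) that converts GCP REST API field names into the operator-style keys:

```python
import re

def camel_to_snake(name: str) -> str:
    """Convert a GCP REST API camelCase field name to snake_case."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()

def convert_keys(obj):
    """Recursively convert all dict keys in a REST-style payload."""
    if isinstance(obj, dict):
        return {camel_to_snake(k): convert_keys(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [convert_keys(v) for v in obj]
    return obj

rest_payload = {"masterConfig": {"numInstances": 1, "machineTypeUri": "n1-standard-4"}}
print(convert_keys(rest_payload))
# {'master_config': {'num_instances': 1, 'machine_type_uri': 'n1-standard-4'}}
```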
5. Demo
● Open repo on Gitpod
● Set GCP Connection and Variables
● Run Dag that will:
○ Spin up Dataproc Cluster on GCE
○ Submit Dataproc Spark Job to read from DataStax Astra
○ Destroy Cluster
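The three demo steps above can be sketched as an Airflow DAG file. This is a hedged sketch, assuming the apache-airflow-providers-google package; the project ID, region, jar URI, and job class are placeholders, not the demo's actual values:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import (
    DataprocCreateClusterOperator,
    DataprocDeleteClusterOperator,
    DataprocSubmitJobOperator,
)

PROJECT_ID = "my-gcp-project"  # placeholder
REGION = "us-east1"            # placeholder
CLUSTER_NAME = "demo-cluster"  # placeholder

CLUSTER_CONFIG = {
    # snake_case keys, converted from the REST API's camelCase (masterConfig, ...)
    "master_config": {"num_instances": 1, "machine_type_uri": "n1-standard-4"},
    "worker_config": {"num_instances": 2, "machine_type_uri": "n1-standard-4"},
    # Required when reading from DataStax Astra (see the cluster property above)
    "software_config": {
        "properties": {"dataproc:dataproc.conscrypt.provider.enable": "false"}
    },
}

SPARK_JOB = {
    "reference": {"project_id": PROJECT_ID},
    "placement": {"cluster_name": CLUSTER_NAME},
    "spark_job": {
        "main_class": "com.example.AstraReadJob",                # placeholder
        "jar_file_uris": ["gs://my-bucket/astra-read-job.jar"],  # placeholder
        "properties": {
            "spark.jars.packages": "com.datastax.spark:spark-cassandra-connector_2.12:3.1.0"
        },
    },
}

with DAG("dataproc_demo", start_date=datetime(2022, 1, 1), schedule_interval=None) as dag:
    create = DataprocCreateClusterOperator(
        task_id="create_cluster",
        project_id=PROJECT_ID,
        region=REGION,
        cluster_name=CLUSTER_NAME,
        cluster_config=CLUSTER_CONFIG,
    )
    submit = DataprocSubmitJobOperator(
        task_id="submit_spark_job",
        project_id=PROJECT_ID,
        region=REGION,
        job=SPARK_JOB,
    )
    delete = DataprocDeleteClusterOperator(
        task_id="delete_cluster",
        project_id=PROJECT_ID,
        region=REGION,
        cluster_name=CLUSTER_NAME,
        trigger_rule="all_done",  # tear down the cluster even if the job fails
    )
    create >> submit >> delete
```

Setting trigger_rule="all_done" on the delete task keeps a failed Spark job from leaving a billable cluster running.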
6. Strategy: Scalable Fast Data
Architecture: Cassandra, Spark, Kafka
Engineering: Node, Python, JVM, CLR
Operations: Cloud, Container
Rescue: Downtime!! I need help.
www.anant.us | solutions@anant.us | (855) 262-6826
3 Washington Circle, NW | Suite 301 | Washington, DC 20037