Every company, no matter how far from tech it is, is evolving into a software company, and by extension a data company. For a small company it’s important to have access to modern Big Data tools without running a dedicated team for them.
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery - Márton Kodok
Every scientist who needs big data analytics to save millions of lives should have that power. Powering interactive data analysis requires a massive architecture and the know-how to build a fast real-time computing system. You will learn how Google BigQuery solves this problem by enabling super-fast SQL queries against petabytes of data using the processing power of Google’s infrastructure. After this session you will be able to work with BigQuery, do streaming inserts, write User Defined Functions in Javascript, and apply these to several use cases for the everyday developer: funnel analytics, behavioral analytics, exploring unstructured data. You will be able to run arbitrary queries on open data such as historical data about GitHub commits, Stack Overflow Q&A data, or analyse Reddit comments to find the books the community talks about.
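As a taste of what such a query looks like (a minimal sketch: the public dataset name is real, but the exact column layout should be checked against the current schema):

```sql
-- Top committer names in the public GitHub commits dataset
SELECT
  author.name AS author_name,
  COUNT(*) AS commits
FROM `bigquery-public-data.github_repos.commits`
GROUP BY author_name
ORDER BY commits DESC
LIMIT 10;
```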
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery - Márton Kodok
Teaser: give developers a new way of understanding advanced analytics and of choosing the right cloud architecture.
The new buzzword is #serverless, as there are many great services that help us abstract away the complexity associated with managing servers. In this session we will see how serverless helps with large data analytics backends.
We will see how to architect for the cloud and add to an existing project the components that take us to a #serverless architecture: one that ingests our streaming data and runs advanced analytics on petabytes of data using BigQuery on Google Cloud Platform - all alongside the existing stack, without being forced to re-engineer our app.
BigQuery enables super-fast SQL/Javascript queries against petabytes of data using the processing power of Google’s infrastructure. We will cover its core features, the SQL:2011 standard, working with streaming inserts, User Defined Functions written in Javascript, referencing external JS libraries, and several use cases for the everyday backend developer: funnel analytics, email heatmaps, custom data processing, building dashboards, extracting data using JS functions, emitting rows based on business logic.
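A minimal sketch of such a JavaScript UDF (the events table and its columns are hypothetical, not from the talk):

```sql
-- Extract the domain from a URL with a JavaScript UDF;
-- external JS libraries could be attached via
-- OPTIONS (library=["gs://bucket/lib.js"])
CREATE TEMP FUNCTION urlDomain(url STRING)
RETURNS STRING
LANGUAGE js AS r"""
  var m = /https?:\/\/([^\/]+)/.exec(url);
  return m ? m[1] : null;
""";

SELECT urlDomain(page_url) AS domain, COUNT(*) AS hits
FROM `myproject.analytics.events`
GROUP BY domain
ORDER BY hits DESC;
```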
VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google BigQuery - Márton Kodok
Every scientist who needs big data analytics to save millions of lives should have that power. Complex interactive Big Data analytics solutions require a massive architecture and the know-how to build a fast real-time computing system. BigQuery solves this problem by enabling super-fast, SQL-like queries against petabytes of data using the processing power of Google’s infrastructure. We will cover its core features, working with BigQuery, streaming inserts, User Defined Functions in Javascript, and several use cases for the everyday developer: funnel analytics, behavioral analytics, exploring unstructured data.
Google BigQuery for Everyday Developer - Márton Kodok
IV. IT&C Innovation Conference - October 2016 - Sovata, Romania
A. Every scientist who needs big data analytics to save millions of lives should have that power
Legacy systems don’t provide the power.
B. The simple fact is that you are brilliant but your brilliant ideas require complex analytics.
Traditional solutions are not applicable.
The Plan: have oversight over developments as they happen.
Goal: Store everything accessible by SQL immediately.
What is BigQuery?
Analytics-as-a-Service - Data Warehouse in the Cloud
Fully-Managed by Google (US or EU zone)
Scales into Petabytes
Ridiculously fast
Decent pricing (queries: $5/TB, storage: $20/TB) *October 2016 pricing
100,000 rows/sec Streaming API (see the sketch below)
Open Interfaces (Web UI, BQ command line tool, REST, ODBC)
Familiar DB structure (tables, views, records, nested fields, JSON)
Convenience of SQL + Javascript UDF (User Defined Functions)
Integrates with Google Sheets + Google Cloud Storage + Pub/Sub connectors
Client libraries available in YFL (your favorite languages)
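As a quick sketch of the Streaming API through the bq command line tool (the dataset, table, and JSON payload here are hypothetical):

```
# Stream newline-delimited JSON rows into an existing table
echo '{"user_id": "u1", "event": "signup"}' > rows.json
bq insert mydataset.events rows.json
```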
Our benefits
no provisioning/deploy
no running out of resources
no more focus on large-scale execution plans
no need to re-implement tricky concepts
(time windows / joining streams)
pay only for the columns used in our queries (see the dry-run sketch below)
run raw ad-hoc queries (by analysts, sales, or devs)
no more throwing away, expiring, or aggregating old data.
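Because queries are billed by the bytes scanned in the referenced columns, a dry run shows the cost before anything is executed (a sketch; the query itself is arbitrary):

```
# Estimate bytes processed without running the query
bq query --dry_run --use_legacy_sql=false \
  'SELECT author.name FROM `bigquery-public-data.github_repos.commits`'
```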
BigQuery ML - Machine learning at scale using SQL - Márton Kodok
With BigQuery ML, you can build machine learning models without leaving the data warehouse environment and train them on massive datasets. We are going to demonstrate how to build, train, evaluate, and predict with your own scalable machine learning models using standard SQL in Google BigQuery.
We will see how we can use the CREATE MODEL SQL syntax to build different models such as:
Linear regression
Multiclass logistic regression for classification
K-means clustering
Import TensorFlow models for prediction in BigQuery
We will see how we can apply these models on tabular data in retail and marketing use cases.
Models are trained and accessed in BigQuery using SQL — a language data analysts know. This enables business decision making through predictive analytics across the organization without leaving the query editor.
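A minimal CREATE MODEL sketch for the logistic regression case (the dataset, table, and column names are hypothetical):

```sql
CREATE OR REPLACE MODEL `mydataset.purchase_model`
OPTIONS (
  model_type = 'logistic_reg',
  input_label_cols = ['purchased']
) AS
SELECT country, device, pageviews, purchased
FROM `mydataset.sessions`;

-- Evaluate the trained model, then predict on fresh rows
SELECT * FROM ML.EVALUATE(MODEL `mydataset.purchase_model`);
SELECT * FROM ML.PREDICT(MODEL `mydataset.purchase_model`,
  (SELECT country, device, pageviews FROM `mydataset.new_sessions`));
```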
Supercharge your data analytics with BigQuery - Márton Kodok
Powering interactive data analysis requires a massive architecture and the know-how to build a fast real-time computing system. BigQuery solves this problem by enabling super-fast, SQL-like queries against petabytes of data using the processing power of Google’s infrastructure. We will cover its core features, creating tables, columns, views, working with partitions, clustering for cost optimizations, streaming inserts, User Defined Functions, and several use cases for the everyday developer: funnel analytics, behavioral analytics, exploring unstructured data.
The other part will be about BigQuery ML, which enables users to create and execute machine learning models in BigQuery using standard SQL queries. BigQuery ML democratizes machine learning by enabling SQL practitioners to build models using existing SQL tools and skills. BigQuery ML increases development speed by eliminating the need to move data.
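Partitioning and clustering, mentioned above, are plain DDL; a sketch with hypothetical table and column names:

```sql
CREATE TABLE `mydataset.events_partitioned`
PARTITION BY DATE(event_ts)
CLUSTER BY user_id
AS SELECT * FROM `mydataset.events_raw`;

-- Filtering on the partition column scans only the matching
-- partitions, which directly lowers query cost
SELECT user_id, COUNT(*) AS events
FROM `mydataset.events_partitioned`
WHERE DATE(event_ts) BETWEEN '2019-01-01' AND '2019-01-31'
GROUP BY user_id;
```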
An introduction to our data warehouse solution called BigQuery.
The Google Cloud Platform products are based on our internal systems which power Google AdWords, Search, YouTube and our leading research in the field of real-time data analysis.
You can get access to our free trial ($300 for 60 days) through google.com/cloud
DevFest Romania 2020 Keynote: Bringing the Cloud to you - Márton Kodok
Next OnAir ’20 in review:
1. Real-time AI solutions like anomaly detection, pattern recognition, and predictive forecasting
2. Recommendations AI: a rich experience for personalized product recommendations
3. Media Translation API: real-time speech translation from streaming audio
4. Lending DocAI: a solution powered by Document AI for the mortgage industry
5. Contact Center AI: support over chat/voice calls by identifying intent and providing assistance
Confidential VMs are a breakthrough technology that allows customers to encrypt their most sensitive data in the cloud while it is being processed.
Cloud Run:
- Minimum idle instances
- Allocate 4 vCPUs and 4GiB memory
- Requests up to 60 minutes
- Server-side HTTP + gRPC streaming
- VPC access support
- External Load Balancing
Serverless orchestration and automation with Cloud Workflows (beta)
- Steps defined in YAML (see the sketch after this list)
- Built-in decision and conditional exec
- Subworkflows
- Support for external API calls
- Custom predicate for retries
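A minimal Cloud Workflows definition showing YAML steps, a condition, and an external API call (the URL and values are hypothetical):

```yaml
main:
  steps:
    - getQuote:
        call: http.get
        args:
          url: https://example.com/api/quote
        result: quote
    - decide:
        switch:
          - condition: ${quote.body.amount > 100}
            next: bigQuote
        next: smallQuote
    - bigQuote:
        return: "large"
    - smallQuote:
        return: "small"
```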
Predict, recommend and forecast with BigQuery ML
CREATE MODEL syntax in BigQuery to run Machine Learning tasks
Supported models:
- K-means clustering for data segmentation
- Recommend with Matrix Factorization
- Perform time-series forecasts (see the sketch below)
- Import TensorFlow models
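For the time-series case, a forecast sketch using the ARIMA_PLUS model type (dataset and columns are hypothetical):

```sql
CREATE OR REPLACE MODEL `mydataset.sales_forecast`
OPTIONS (
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'day',
  time_series_data_col = 'revenue'
) AS
SELECT day, revenue FROM `mydataset.daily_sales`;

-- Forecast the next 30 points
SELECT * FROM ML.FORECAST(MODEL `mydataset.sales_forecast`,
                          STRUCT(30 AS horizon));
```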
Single interface for multiple services with API Gateway
Find Your Topic and Skill Level
Qwiklabs + New Tutorials Center
[LondonSEO 2020] BigQuery & SQL for SEOs - Areej AbuAli
In this talk, Areej will share her learning process, how SEOs can get acquainted with the world of BigQuery and why SQL is the new and improved Excel. The audience will walk away with a handful of scripts and tips to get their BigQuery journey started!
Data Lineage with Apache Airflow using Marquez - Willy Lulciuc
The term data quality is used to describe the dependability, reliability, and usability of datasets. Data scientists and business analysts often determine the quality of a dataset by its trustworthiness and completeness. But what information might be needed to differentiate between useful vs noisy data? How quickly can data quality issues be identified and explored? More importantly, how can metadata enable data scientists to make better sense of the high volume of data within their organization from a variety of data sources?
With Airflow now ubiquitous for DAG orchestration, organizations increasingly depend on Airflow to manage complex inter-DAG dependencies and provide up-to-date runtime visibility into DAG execution. At WeWork, Airflow has quickly become an important component of our Data Platform powering billing, space inventory, etc. But what effects (if any) would upstream DAGs have on downstream DAGs if dataset consumption was delayed? What alerting rules should be in place to notify downstream DAGs of possible upstream processing issues or failures?
At WeWork, we feel it’s critical that DAG metadata is collected, maintained, and shared across the organization. This investment in metadata enables:
● Data lineage
● Data governance
● Data discovery
In this talk, we introduce Marquez: an open source metadata service for the collection, aggregation, and visualization of a data ecosystem’s metadata. We will demonstrate how metadata management with Marquez helps maintain inter-DAG dependencies, catalog historical runs of DAGs, and minimize data quality issues.
Applying BigQuery ML on e-commerce data analytics - Márton Kodok
With BigQuery ML, you can build machine learning models without leaving the database environment and train them on massive datasets. We are going to demonstrate common marketing Machine Learning use cases we handle at REEA.net: how to build, train, evaluate, and predict with your own scalable machine learning models using SQL in Google BigQuery, addressing the following use cases:
Customer Segmentation
Customer Lifetime Value (LTV) prediction
Conversion/Purchase prediction
The audience will get first-hand experience of how to write the CREATE MODEL SQL syntax to build machine learning models such as:
Multiclass logistic regression for classification
K-means clustering
Import TensorFlow models for prediction in BigQuery
Models are trained and accessed in BigQuery using SQL — a language data analysts know. This enables business decision making through predictive analytics across the organization without leaving the query editor.
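A segmentation sketch with k-means (the table and feature names are hypothetical):

```sql
CREATE OR REPLACE MODEL `mydataset.customer_segments`
OPTIONS (
  model_type = 'kmeans',
  num_clusters = 4
) AS
SELECT order_count, avg_basket_value, days_since_last_order
FROM `mydataset.customer_features`;

-- Assign each customer to a segment (CENTROID_ID column)
SELECT * FROM ML.PREDICT(MODEL `mydataset.customer_segments`,
                         TABLE `mydataset.customer_features`);
```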
A short introduction to BigQuery. With this presentation you'll quickly discover:
How to load data into BigQuery (see the bq load sketch below)
How to build dashboards using BigQuery
How to work with BigQuery
and, last but not least, we've added some best practices.
We hope you'll enjoy this presentation and that it will help you start exploring this wonderful solution. Don't hesitate to send us your feedback or questions.
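Loading data can be as small as one bq command (the bucket, dataset, and schema here are hypothetical):

```
# Load a CSV file from Cloud Storage into a new table
bq load --source_format=CSV --skip_leading_rows=1 \
  mydataset.orders gs://my-bucket/orders.csv \
  order_id:STRING,amount:FLOAT,created_at:TIMESTAMP
```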
Connecta Event: BigQuery and data analysis with Google Cloud Platform - ConnectaDigital
Advanced data analysis and ”big data” have climbed the trend lists in recent years and are now among the most prioritized areas in the development of new services and products for leading companies in the digital landscape.
The information that accumulates in these systems as customer interactions are digitalized has proven to be worth its weight in gold. It holds everything we need to know to make our business more efficient.
Since the summer of 2013, Connecta has had an established partnership with Google to help our customers transition to cloud services for, among other things, advanced data analysis. To get ready to help our customers, we have over a number of years built up both knowledge and hands-on experience with Google's various cloud products, such as ”Big Query”.
Big Query is a cloud-based analytics tool and part of the Google Cloud Platform. Big Query makes it possible to run fast queries against enormous datasets in just seconds. Big Query and Google Cloud Platform offer ready-made solutions for setting up and maintaining the infrastructure that makes all of this possible with simple means.
At Connecta Digital Consulting's third event of the spring, we introduced our customers and partners to the concepts of data analysis and Big Query.
The event covered the following points:
- Big Data and Business Intelligence (BI)
- “The Google Big Data tools” - success factors and how to get started
- Google Cloud Platform and how to carry out a successful cloud initiative
We presented cases and shared key lessons learned from our collaboration with Google and our customers.
Webinar: Introducing the MongoDB Connector for BI 2.0 with Tableau - MongoDB
Pairing your real-time operational data stored in a modern database like MongoDB with first-class business intelligence platforms like Tableau enables new insights to be discovered faster than ever before.
Many leading organizations already use MongoDB in conjunction with Tableau including a top American investment bank and the world’s largest airline. With the Connector for BI 2.0, it’s never been easier to streamline the connection process between these two systems.
In this webinar, we will create a live connection from Tableau Desktop to a MongoDB cluster using the Connector for BI. Once we have Tableau Desktop and MongoDB connected, we will demonstrate the visual power of Tableau to explore the agile data storage of MongoDB.
You’ll walk away knowing:
- How to configure MongoDB with Tableau using the updated connector
- Best practices for working with documents in a BI environment
- How leading companies are using big data visualization strategies to transform their businesses
Building an Enterprise-Scale Dashboarding/Analytics Platform Powered by the C... - Imply
Target is one of the largest retailers in the United States, with brick-and-mortar stores in all 50 states and one of the most-visited ecommerce sites in the country. In addition to typical merchandising functions like assortment planning, pricing and inventory management, Target also operates a large supply chain, financial/banking operations and property management organizations. As a data-driven organization, we need a data analytics platform that can address the unique needs of each of these various business units, while scaling to hundreds of thousands of users and accommodating an ever-increasing amount of data.
In this talk we’ll cover why Target chose to create our own analytics platform and specifically how Druid makes this platform successful. We’ll cover how we utilize key features in Druid, such as union datasources, arbitrary granularities, real-time ingestion, complex aggregation expressions and lightning-fast query response to provide analytics to users at all levels of the organization. We’ll also cover how Druid’s speed and flexibility allow us to provide interactive analytics to front-line, edge-of-business consumers to address hundreds of unique use-cases across several business units.
The 'macro view' on Big Query:
We started with an overview and some typical uses, then moved on to project hierarchy, access control, and security.
At the end we touched on tools and demos.
As Twitch grew, both the amount of data we received and the number of employees interested in the data grew rapidly. In order to continue empowering decision making as we scaled, we turned to Druid and Imply to provide self-service analytics to both our technical and non-technical staff, allowing them to drill into high-level metrics in lieu of reading generated reports.
In this talk, learn how Twitch implemented a common analytics platform for the needs of many different teams supporting hundreds of users, thousands of queries, and ~5 billion events each day. This session will explain our Druid architecture in detail, including:
-The end-to-end architecture deployed on Amazon that includes Kinesis, RDS, S3, Druid, Pivot and Tableau
-How the data is brought together to deliver a unified view of live customer engagement and historical trends
-Operational best practices we learnt scaling Druid
-An example walk through using the platform
MongoDB .local Paris 2020: Best practices for working with time series data - MongoDB
Time series data is increasingly at the heart of modern applications: think IoT, stock transactions, clickstreams, social media, and so on. With the shift from batch to real-time systems, efficiently capturing and analyzing time series data lets companies better detect and react to events ahead of their competitors, or improve operational efficiency to reduce cost and risk. Working with time series data often differs from working with classic application data, and there are best practices to observe. This talk covers: common components of an IoT solution; the challenges of managing time series data in IoT applications; different schema designs and their impact on memory and disk utilization, two determining factors in application performance; and how to query, analyze, and present IoT time series data using MongoDB Compass and MongoDB Charts. By the end of the session you will have a better understanding of the key best practices for managing IoT time series data with MongoDB.
Creating a Modern Data Architecture for Digital Transformation - MongoDB
By managing Data in Motion, Data at Rest, and Data in Use differently, modern Information Management Solutions are enabling a whole range of architecture and design patterns that allow enterprises to fully harness the value in data flowing through their systems. In this session we explored some of the patterns (e.g. operational data lakes, CQRS, microservices and containerisation) that enable CIOs, CDOs and senior architects to tame the data challenge, and start to use data as a cross-enterprise asset.
Complex realtime event analytics using BigQuery @Crunch Warmup - Márton Kodok
Complex event analytics solutions require a massive architecture and the know-how to build a fast real-time computing system. Google BigQuery solves this problem by enabling super-fast, SQL-like queries against append-only tables, using the processing power of Google’s infrastructure. In this presentation we will see how BigQuery solves our ultimate goal: store everything accessible by SQL immediately, at petabyte scale. We will discuss some common use cases: funnels, user retention, affiliate metrics.
Building Data Products with BigQuery for PPC and SEO (SMX 2022) - Christopher Gutknecht
In this data management session, Christopher describes how to build robust and reliable data products in BigQuery and dbt for PPC and SEO use cases. After an introduction to the modern data stack, six principles of reliable data products are presented, followed by these use cases:
- Google Ads Conversion upload
- SEO sitemap efficiency report
- Google Shopping product rating sync
- Large-Scale link checker with advertools
- Inventory-based PPC campaigns with dbt
Here is the referenced selection of gists on github: https://gist.github.com/ChrisGutknecht
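As a sketch of the dbt side of this pattern (the model and source names are hypothetical), a data product in dbt is just a SQL file that references upstream models:

```sql
-- models/ppc_campaign_inventory.sql (hypothetical dbt model)
{{ config(materialized='table') }}

SELECT
  p.product_id,
  p.stock_quantity,
  c.campaign_id
FROM {{ ref('stg_products') }} AS p
JOIN {{ ref('stg_campaigns') }} AS c
  ON c.product_group = p.product_group
WHERE p.stock_quantity > 0  -- only advertise in-stock products
```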
Modern Thinking: How Big Data and Cognitive are changing Marketing strategy
By: Ismael Yuste, Strategic Cloud Engineer, Google Cloud
Presentation: An introduction to Google's Big Data solutions
Applications need data, but the legacy approach of n-tiered application architecture doesn’t solve for today’s challenges. Developers aren’t empowered to build and iterate their code quickly without lengthy review processes from other teams. New data sources cannot be quickly adopted into application development cycles, and developers are not able to control their own requirements when it comes to data platforms.
Part of the challenge here is the existing relationship between two groups: developers and DBAs. Developers are trying to go faster, automating build/test/release cycles with CI/CD, and thrive on the autonomy provided by microservices architectures. DBAs are stewards of data protection, governance, and security. Both of these groups are critically important to running data platforms, but many organizations deal with high friction between these teams. As a result, applications get to market more slowly, and it takes longer for customers to see value.
What if we changed the orientation between developers and DBAs? What if developers consumed data products from data teams? In this session, Pivotal’s Dormain Drewitz and Solstice’s Mike Koleno will speak about:
- Product mindset and how balanced teams can reduce internal friction
- Creating data as a product to align with cloud-native application architectures, like microservices and serverless
- Getting started bringing lean principles into your data organization
- Balancing data usability with data protection, governance, and security
Presenters: Dormain Drewitz, Pivotal & Mike Koleno, Solstice
Discover BigQuery ML, build your own CREATE MODEL statement - Márton Kodok
With BigQuery ML, you can build machine learning models without leaving the database environment and train them on massive datasets. In this demo session we are going to demonstrate common marketing Machine Learning use cases: how to build, train, evaluate, and predict with your own scalable machine learning models using SQL in Google BigQuery, addressing the following use cases:
- Customer Segmentation + product cross-sale recommendation
- Conversion/Purchase prediction
- Inference with the other 20+ built-in models
The audience will get first-hand experience of how to write the CREATE MODEL SQL syntax to build machine learning models such as:
- Multiclass logistic regression for classification
- K-means clustering
- Matrix factorization
- ARIMA time series predictions
... and more.
Models are trained and accessed in BigQuery using SQL — a language data analysts know. This enables business decision-making through predictive analytics across the organization without leaving the query editor. In the end, the audience will learn how everyday developers can build, train, and run their own machine-learning models straight from the database query editor by issuing CREATE MODEL statements.
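A sketch of the cross-sale recommendation case with matrix factorization (names are hypothetical; note that this model type requires reserved/flex slots to train):

```sql
CREATE OR REPLACE MODEL `mydataset.product_recommender`
OPTIONS (
  model_type = 'matrix_factorization',
  feedback_type = 'implicit',
  user_col = 'customer_id',
  item_col = 'product_id',
  rating_col = 'purchase_count'
) AS
SELECT customer_id, product_id, purchase_count
FROM `mydataset.purchases`;

-- Top product recommendations for every customer
SELECT * FROM ML.RECOMMEND(MODEL `mydataset.product_recommender`);
```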
Webinar: Faster Big Data Analytics with MongoDB - MongoDB
Learn how to leverage MongoDB and Big Data technologies to derive rich business insight and build high performance business intelligence platforms. This presentation includes:
- Uncovering Opportunities with Big Data analytics
- Challenges of real-time data processing
- Best practices for performance optimization
- Real world case study
This presentation was given in partnership with CIGNEX Datamatics.
BigdataConference Europe - BigQuery ML - Márton Kodok
One of the hottest topics in database land these days is BigQuery ML. A new way to use machine learning on top of tabular data straight on your tables without leaving the query editor.
With BigQuery ML, you can build machine learning models without leaving the database environment and train them on massive datasets.
In this demo session, we are going to demonstrate common marketing Machine Learning use cases: how to build, train, evaluate, and predict with your own scalable machine learning models using SQL.
The audience will get first-hand experience of how to write the CREATE MODEL SQL syntax to build machine learning models such as:
– Multiclass logistic regression for classification
– K-means clustering
– Matrix factorization
– ARIMA time series predictions
– Import TensorFlow models for prediction in BigQuery
Models are trained and accessed in BigQuery using SQL — a language data analysts know. This enables business decision making through predictive analytics across the organization without leaving the query editor.
How Data Virtualization Puts Enterprise Machine Learning Programs into Production - Denodo
Watch full webinar here: https://bit.ly/3offv7G
Presented at AI Live APAC
Advanced data science techniques, like machine learning, have proven to be an extremely useful tool for deriving valuable insights from existing data. Platforms like Spark and complex libraries for R, Python, and Scala put advanced techniques at the fingertips of data scientists. However, these data scientists spend most of their time looking for the right data and massaging it into a usable format. Data virtualization offers a new alternative to address these issues in a more efficient and agile way.
Watch this on-demand session to learn how companies can use data virtualization to:
- Create a logical architecture to make all enterprise data available for advanced analytics exercises
- Accelerate data acquisition and massaging, providing the data scientist with a powerful tool to complement their practice
- Integrate popular tools from the data science ecosystem: Spark, Python, Zeppelin, Jupyter, etc.
GDG Heraklion - Architecting for the Google Cloud Platform - Márton Kodok
Learn about cloud components and architecture overviews to build an app using GCP components.
You will get hands-on information on how to build highly scalable and flexible applications optimized to run in GCP on the same infrastructure that powers Google. We will discuss cloud concepts and highlight various design patterns and best practices.
By the end of the session you will have hands-on experience building a basic cloud application: it could be a simple web tier powered by a highly distributed database, with background tasks executed on a pub/sub system, plus information on how to go to the next level with advanced concepts like an analytics warehouse, recommendation engines, and ML.
Data Ingestion in Big Data and IoT platforms - Guido Schmutz
Many of the Big Data and IoT use cases are based on combining data from multiple data sources and making it available on a Big Data platform for analysis. The data sources are often very heterogeneous, from simple files and databases to high-volume event streams from sensors (IoT devices). It’s important to retrieve this data in a secure and reliable manner and integrate it with the Big Data platform so that it is available for analysis in real time (stream processing) as well as in batch (typical big data processing). In the past few years some new tools have emerged which are especially capable of handling the process of integrating data from outside, often called Data Ingestion. From an outside perspective, they are very similar to traditional Enterprise Service Bus infrastructures, which larger organizations often use to handle message-driven and service-oriented systems. But there are also important differences: they are typically easier to scale horizontally, offer a more distributed setup, are capable of handling high volumes of data/messages, provide very detailed monitoring at the message level, and integrate very well with the Hadoop ecosystem. This session will present and compare Apache NiFi, StreamSets and the Kafka ecosystem, and show how they handle data ingestion in a Big Data solution architecture.
As the official MongoDB-as-a-Service offering from MongoDB Inc., the maker of MongoDB, Atlas is becoming a very popular service for those who wish to build their applications in the cloud, whether on AWS, Azure, or GCP. One lesser-known cloud product offered on the Atlas platform is Stitch, a group of services designed to interact with Atlas in every conceivable way, including creating endpoints, triggers, user authentication flows, serverless functions, and a UI to handle all of this. Adding these together, you have a serverless solution running on top of MongoDB's cloud.
Maximizing Data Lake ROI with Data Virtualization: A Technical Demonstration - Denodo
Watch full webinar here: https://bit.ly/3ohtRqm
Companies with corporate data lakes also need a strategy for how to best integrate them with their overall data fabric. To take full advantage of a data lake, data architects must determine what data belongs in the Lake vs. other sources, how end users are going to find and connect to the data they need as well as the best way to leverage the processing power of the data lake. This webinar will provide you with a deep dive look at how the Denodo Platform for data virtualization enables companies to maximize their investment in their corporate data lake.
Watch on-demand this webinar to learn:
- How to create a logical data fabric with Denodo
- How to leverage the data lake for MPP Acceleration and Summary Views
- How to leverage Presto with Denodo for file-based data lakes (e.g. S3, ADLS, HDFS)
Entrepreneurship Tips With HTML5 & App Engine Startup Weekend (June 2012) - Ido Green
My talk at Startup Weekend 2012 during Google I/O. It covers startup life tips, modern web apps, and how to leverage Google Cloud (specifically App Engine).
Similar to Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Gen Apps on Google Cloud PaLM2 and Codey APIs in Action - Márton Kodok
Build applications with generative AI on Google Cloud! We are going to see in action what Gen App Builder is for developers to build and deploy AI-driven applications. We will explore Model Garden powered experiences, then we are going to learn more about the integration of these generative AI APIs. Vertex AI includes a suite of models that work with code. Together these code models are referred to as the PaLM and Codey APIs. The Vertex AI Codey APIs include the code generation API which supports generating code using a natural language description. We will show strategies for creating prompts that work with the model to generate code. At the end of the session, developers will understand how to innovate with generative AI and develop apps using the generative AI industry trends.
DevBCN Vertex AI - Pipelines for your MLOps workflows - Márton Kodok
In recent years, one of the biggest trends in applications development has been the rise of Machine Learning solutions, tools, and managed platforms. Vertex AI is a managed unified ML platform for all your AI workloads. On the MLOps side, Vertex AI Pipelines solutions let you adopt experiment pipelining beyond the classic build, train, eval, and deploy a model. It is engineered for data scientists and data engineers, and it’s a tremendous help for those teams who don’t have DevOps or sysadmin engineers, as infrastructure management overhead has been almost completely eliminated. Based on practical examples we will demonstrate how Vertex AI Pipelines scores high in terms of developer experience, how it fits custom ML needs, and how to analyze results. It’s a toolset for a fully-fledged machine learning workflow, a sequence of steps in the model development and deployment cycle, such as data preparation/validation, model training, hyperparameter tuning, model validation, and model deployment. Vertex AI comes with all classic resources plus an ML metadata store, a fully managed feature store, and a fully managed pipelines runner. Vertex AI Pipelines is a managed serverless toolkit, which means you don't have to fiddle with infrastructure or back-end resources to run workflows.
Cloud Run - the rise of serverless and containerization - Márton Kodok
Two of the biggest trends in applications development in recent years have been the rise of serverless and containerization. Cloud Run has become a de facto container runtime service, taking you to production in seconds. Based on practical examples we will demonstrate how Cloud Run scores high in terms of developer experience. It differs from a functions runtime as you can bring your own container, your own code, a folder, or binaries, and it pairs great with the container ecosystem: Cloud Build, Cloud Code, Artifact Registry, and Docker. Each Cloud Run service gets an out-of-the-box stable HTTPS endpoint, with TLS termination handled for you. Map your services to your own domains and use them for web sites, backend APIs, or workflows; invoke and connect services with the newest protocols of HTTP/2, WebSockets or gRPC (unary and streaming). Cloud Run is serverless containers, which means you don't have to fiddle with infrastructure or back-end resources to run applications.
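Deploying is a single command (the service, image, and region below are hypothetical):

```
gcloud run deploy my-service \
  --image gcr.io/my-project/my-app:latest \
  --region europe-west1 \
  --allow-unauthenticated
```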
BigQuery best practices and recommendations to reduce costs with BI Engine, S... - Márton Kodok
Best practices and recommendations for tuning BI Engine for your existing BigQuery workloads, for cheaper and faster queries. Learn how we at REEA orchestrate BI Engine reservations on a 5TB dataset - considered small for BigQuery, but with big cost savings and accelerated queries. There are many presentations aimed at big enterprises; here we showcase how our queries perform better at lower cost. We will address the top considerations for when to turn on BI Engine, and how to use cloud orchestration to make this an automatic process, combined with BigQuery and Data Studio query patterns that can save precious development time, lower bills, and speed up queries.
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud - Márton Kodok
Vertex AI is a managed ML platform for practitioners to accelerate experiments and deploy AI models.
Enhanced developer experience
- Build with the groundbreaking ML tools that power Google
- Approachable from the non-ML developer perspective (AutoML, managed models, training)
- Ease the life of a data scientist/ML (has feature store, managed datasets, endpoints, notebooks)
- Infrastructure management overhead has been almost completely eliminated
- Unified UI for the entire ML workflow
- End-to-end integration for data and AI with build pipelines that outperform and solve complex ML tasks
- Explainable AI and TensorBoard to visualize and track ML experiments
Vertex AI: Pipelines for your MLOps workflows - Márton Kodok
In recent years, one of the biggest trends in applications development has been the rise of Machine Learning solutions, tools, and managed platforms. Vertex AI is a managed unified ML platform for all your AI workloads. On the MLOps side, Vertex AI Pipelines solutions let you adopt experiment pipelining beyond the classic build, train, eval, and deploy a model. It is engineered for data scientists and data engineers, and it’s a tremendous help for those teams who don’t have DevOps or sysadmin engineers, as infrastructure management overhead has been almost completely eliminated.
Based on practical examples we will demonstrate how Vertex AI Pipelines scores high in terms of developer experience, how it fits custom ML needs, and how to analyze results. It’s a toolset for a fully-fledged machine learning workflow, a sequence of steps in the model development and deployment cycle, such as data preparation/validation, model training, hyperparameter tuning, model validation, and model deployment. Vertex AI comes with all standard resources plus an ML metadata store, a fully managed feature store, and a fully managed pipelines runner.
Vertex AI Pipelines is a managed serverless toolkit, which means you don't have to fiddle with infrastructure or back-end resources to run workflows.
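A minimal sketch of such a pipeline, assuming the KFP v2 SDK and the google-cloud-aiplatform client (the component logic, names, project, region, and paths are illustrative):

# Sketch of a minimal Vertex AI pipeline using the KFP v2 SDK.
from kfp import dsl, compiler
from google.cloud import aiplatform

@dsl.component
def prepare_data() -> str:
    # data preparation/validation step would live here
    return "gs://my-bucket/prepared-data"

@dsl.component
def train_model(dataset_uri: str) -> str:
    # model training (or hyperparameter tuning) step would live here
    return f"model trained on {dataset_uri}"

@dsl.pipeline(name="demo-training-pipeline")
def pipeline():
    data_task = prepare_data()
    train_model(dataset_uri=data_task.output)

# Compile to a pipeline spec and hand it to the managed pipelines runner.
compiler.Compiler().compile(pipeline_func=pipeline, package_path="pipeline.json")
job = aiplatform.PipelineJob(
    display_name="demo-training-pipeline",
    template_path="pipeline.json",
    project="my-project",
    location="us-central1",
)
job.run()

Vertex AI Pipelines executes the compiled spec serverlessly and records each step's inputs and outputs in the ML metadata store.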
Cloud Workflows What's new in serverless orchestration and automationMárton Kodok
Understand how Cloud Workflows resolves challenges in connecting services, HTTP-based service orchestration, and automation. We are going to dive deep into how serverless HTTP service automation works to automate step engines. Based on practical examples we will demonstrate the newest features that let you automate the cloud and integrate with any Google Cloud product without worrying about authentication.
Serverless orchestration and automation with Cloud WorkflowsMárton Kodok
Join this session to understand how Cloud Workflows resolves challenges in connecting services, HTTP-based service orchestration, and automation. We are going to dive deep into how serverless HTTP service automation works to automate step engines. Based on practical examples we will demonstrate the built-in decision and conditional executions, subworkflows, support for external built-in API calls, and integration with any Google Cloud product without worrying about authentication. We are going to cover Marketing, Retail, Industrial, and Developer use cases, such as event-driven marketing workflow execution, inventory chain operations, generating automated state machines, orchestrating DevOps workflows, and automating the Cloud.
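Workflow definitions themselves are written in YAML; from application code you typically just trigger executions. A minimal sketch using the google-cloud-workflows Python client (project, region, workflow name, and arguments are placeholders):

# Sketch: trigger a deployed workflow and pass it JSON arguments.
import json

from google.cloud.workflows import executions_v1

client = executions_v1.ExecutionsClient()
parent = "projects/my-project/locations/europe-west4/workflows/order-pipeline"
execution = client.create_execution(
    parent=parent,
    execution=executions_v1.Execution(argument=json.dumps({"order_id": 42})),
)
print("Started:", execution.name)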
Vibe Koli 2019 - A journey from the university desks to Google Developer ExpertMárton Kodok
VIBE Koli 2019 - Vibe Garázs - Go-kart.
After finishing his studies at Sapientia, Márton Kodok built an IT career for himself and is today a member of the Google Developer Expert (GDE) team, placing him among the country's outstanding professionals. At VIBE Koli he will help you find your own path, proving that willpower is all it takes to do something different, something more than your peers.
Google Cloud Platform Solutions for DevOps EngineersMárton Kodok
Learn the DevOps essentials about cloud components and FaaS/PaaS architectural patterns that make use of Cloud Functions, Pub/Sub, Dataflow, and Kubernetes, and how we develop and deploy cloud software. You will get hands-on information on how to build, run, and monitor highly scalable and flexible applications optimized to run on GCP. We will discuss cloud concepts and highlight various design patterns and best practices.
GDG DevFest Romania - Architecting for the Google Cloud PlatformMárton Kodok
Learn about FaaS and PaaS architectural patterns that make use of Cloud Functions, Pub/Sub, Dataflow, and Kubernetes - platforms that hide the management of servers from the user and have changed how we develop and deploy software.
We discuss the difference between an event-driven approach - meaning you can trigger a function whenever something interesting happens within the cloud environment - and the simpler HTTP approach, as well as quotas, per-invocation pricing, and the advantages and disadvantages of serverless systems.
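As a sketch of that difference, using the newer Python Functions Framework purely for illustration (the function names and the bucket event are made up):

# Sketch contrasting the two trigger styles with functions-framework.
import functions_framework

@functions_framework.http
def hello_http(request):
    # Simpler HTTP approach: invoked by a plain HTTPS request.
    name = request.args.get("name", "world")
    return f"Hello {name}!"

@functions_framework.cloud_event
def on_file_uploaded(cloud_event):
    # Event-driven approach: triggered whenever something interesting
    # happens in the cloud, e.g. an object landing in a bucket.
    data = cloud_event.data
    print(f"File {data['name']} uploaded to {data['bucket']}")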
6. DISZ - Scalability of web applications on the Google Cloud PlatformMárton Kodok
The talk covers how a flexible, highly scalable service can be built on cloud providers' platforms. How can a service that at launch only has to serve a few dozen or a few hundred users elastically scale to serve thousands, or orders of magnitude more? Sit back and admire the autoscaling feature on Black Friday. We will talk about virtualization, platform-level virtualization, ultra-light application containers, and the near-real-time "shuffling" of workloads. Numerous components of the Google Cloud Platform will be presented. Banks, insurers, webshops and the like all see the cloud as their breakout point.
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
1. Powering Interactive Data Analysis
with Google BigQuery
Márton Kodok / @martonkodok
Google Developer Expert at REEA.net
Nov 2017 - Cluj Napoca, Romania
2. ● Geek. Hiker. Do-er.
● Among the top 3 Romanians on Stack Overflow
● Google Developer Expert on Cloud technologies
● Crafting Web/Mobile backends at REEA.net
● BigQuery and database engine expert
● Active in mentoring
Twitter: @martonkodok
StackOverflow: pentium10
Slideshare: martonkodok
GitHub: pentium10
Powering Interactive Data Analysis with Google BigQuery @martonkodok
About me
3. Every company,
no matter how far from the tech they are,
is evolving into a software company,
and by extension a data company.
Turning everything into “data” drives innovation
Powering Interactive Data Analysis with Google BigQuery @martonkodok #dfua
4. For a small company it's important
to have access to modern BigData tools
without running a dedicated team for it.
Small companies should do BigData - but how?
Powering Interactive Data Analysis with Google BigQuery @martonkodok #dfua
5. ❏ Need backend/database to STORE, QUERY, EXTRACT data
❏ Deep analytics - large, multi-source, complex, unstructured
❏ Be real time
❏ Petabyte scale - Cost effective
❏ Run ad-hoc reports - without a developer - interactive
❏ Minimal engineering efforts - no dedicated BigData team
❏ Simple query language (SQL / Javascript preferred)
Making analytics accessible to more companies
Powering Interactive Data Analysis with Google BigQuery @martonkodok #dfua
6. “We can't solve problems by
using the same kind of
thinking we used when we
created them”
-Albert Einstein
Powering Interactive Data Analysis with Google BigQuery @martonkodok
The Challenge
7. Powering Interactive Data Analysis with Google BigQuery @martonkodok
Legacy Business Reporting System (architecture diagram): Web and Mobile clients → Web Server (CMS/Framework, Platform Services) → SQL Database (cached) → Scheduled Tasks → Batch Processing (Compute Engine, multiple instances) → Report & Share → Business Analysis
8. Powering Interactive Data Analysis with Google BigQuery @martonkodok
Behind the Scenes: Days To Insights (the same legacy reporting diagram)
9. Powering Interactive Data Analysis with Google BigQuery @martonkodok
Legacy Business Reporting System (diagram annotated with timings): Scheduled Tasks take minutes to kick in, Batch Processing takes hours to run, Clean and Aggregate takes hours more - DAYS TO INSIGHTS
10. ● Terabyte scalable storage
● Real-time row ingestion
● Ask sophisticated queries
● Query-performance
● Low-maintenance
● Cost effective
● Wire them up easily
Goal: Store everything accessible by SQL immediately.
Powering Interactive Data Analysis with Google BigQuery @martonkodok
Desired system/platform
Engines:
● MongoDB, Riak, Redis
● ELK Stack (Elasticsearch-Logstash-Kibana)
● Cassandra, Hive, Hadoop...
● Amazon Athena, Google BigQuery...
12. ● Analytics-as-a-Service - Data Warehouse in the Cloud
● Fully-Managed by Google (US or EU zone)
● Scales into Petabytes
● Ridiculously fast
● SQL 2011 Standard + Javascript UDF (User Defined Functions)
● Familiar DB Structure (table, views, record, nested, JSON)
● Open Interfaces (Web UI, BQ command line tool, REST, ODBC)
● Integrates with Google Sheets + Google Cloud Storage + Pub/Sub connectors
● Decent pricing (queries: $5/TB; storage: $20/TB, cold: $10/TB) *Oct 2017
Powering Interactive Data Analysis with Google BigQuery @martonkodok
What is BigQuery?
13. ● Columnar storage (max 10,000 columns per table)
● Batch load file size limits: 5TB (CSV or JSON)
● User Defined Functions in SQL or Javascript
● Rich SQL 2011: JSON, IP, Math, RegExp, Window functions
● Data types: String, Integer, Float, Boolean, Timestamp,
Record, Nested, Struct, Array.
● Append-only tables preferred (DML syntax available)
● Day partitioned tables
● ACL - row level locking (individual or group based)
Powering Interactive Data Analysis with Google BigQuery @martonkodok
BigQuery: Convenience of SQL
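To make the "convenience of SQL" concrete, here is a hedged sketch of querying a day-partitioned table with a nested, repeated RECORD column through the Python client (the project, dataset, and field names are made up for illustration):

# Sketch: standard SQL over nested/repeated fields via the Python client.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
sql = """
    SELECT user_id, event.name AS event_name, COUNT(*) AS hits
    FROM `my-project.analytics.events` AS t,
         UNNEST(t.events) AS event      -- flatten a REPEATED RECORD column
    WHERE _PARTITIONTIME >= TIMESTAMP("2017-11-01")  -- day-partitioned table
    GROUP BY user_id, event_name
    ORDER BY hits DESC
    LIMIT 10
"""
for row in client.query(sql).result():
    print(row.user_id, row.event_name, row.hits)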
14. Powering Interactive Data Analysis with Google BigQuery @martonkodok
Architecting for the Cloud (diagram): Frontend and Platform Services on On-Premises Servers emit Metrics / Logs / Streaming events → Event Sourcing → Pipelines (ETL Engine) → BigQuery
15. Data Pipeline Integration at REEA.net - Analytics Backend (diagram): Frontend, Platform Services, and Application Servers (on-premises, plus standard devices over HTTPS) emit Metrics / Logs / Streaming events into FluentD pipelines (event sourcing); FluentD feeds the Analytics Backend (BigQuery) and archives to Cloud Storage (with Load / Export / Replay paths); alongside the existing SQL Database, the Development Team and Data Analysts consume the data through Report & Share / Business Analysis tools: Tableau, QlikView, Data Studio, and an internal dashboard.
Powering Interactive Data Analysis with Google BigQuery @martonkodok #dfua
16. The following slides will present a sample Fluentd configuration to:
1. Transform a record
2. Copy event to multiple outputs
3. Store event data in File (for backup/log purposes)
4. Stream to BigQuery (for immediate analyses)
Powering Interactive Data Analysis with Google BigQuery @martonkodok #dfua
<filter frontend.user.*>
  @type record_transformer   # (1) transform the record
</filter>
<match frontend.user.*>
  @type copy                 # (2) copy the event to multiple outputs
  <store>
    @type forest             # (3) store event data in a file (backup/log)
    subtype file
  </store>
  <store>
    @type bigquery           # (4) stream the event to BigQuery
  </store>
  …
</match>
● The filter plugin mutates incoming data: add/modify/delete event data and transform attributes without a code deploy.
● The copy output plugin copies events to multiple outputs - file(s), multiple databases, DB engines. Great to ship the same event to multiple subsystems.
● The BigQuery output plugin streams the event on the fly into the BigQuery warehouse. No need to write an integration; data is available immediately for querying.
● Whenever needed, other output plugins can be wired in: Kafka, Google Cloud Storage output plugins.
Powering Interactive Data Analysis with Google BigQuery @martonkodok #dfua
18. record_transformer → copy → file → BigQuery
<filter frontend.user.*>
  @type record_transformer
  enable_ruby
  remove_keys host
  <record>
    bq {"insert_id":"${uid}","host":"${host}","created":"${time.to_i}"}
    avg ${record["total"] / record["count"]}
  </record>
</filter>
Syntax: Ruby, easy to use. Great for:
- date transformation,
- quick normalizations,
- calculating something on the fly and storing it in a clean log/analytics DB,
- renaming without a code deploy.
Powering Interactive Data Analysis with Google BigQuery @martonkodok #dfua
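For completeness, what the Fluentd BigQuery output plugin does under the hood is essentially a streaming insert; a hedged Python-client equivalent (the table and row fields are illustrative):

# Sketch: streaming insert via the Python client.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
rows = [
    {"insert_id": "evt-001", "host": "web-1", "created": 1509876543,
     "total": 120, "count": 4, "avg": 30.0},
]
# Rows become queryable within seconds - no batch load job needed.
errors = client.insert_rows_json("my-project.analytics.frontend_user", rows)
if errors:
    print("Streaming insert failed:", errors)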
21. ● On data that is difficult to process/analyze using traditional databases
● On exploring unstructured data
● Not a replacement for traditional DBs; it complements the system
● Applying Javascript UDF on columnar storage to resolve complex tasks
(eg: JS for natural language processing)
● On streams (forms, Kafka)
● On IoT streams
● Its major strength is handling large datasets
Powering Interactive Data Analysis with Google BigQuery @martonkodok
Where to use BigQuery?
22. ➢ Optimize product pages
Find, store, and analyse in BQ the time-consuming user actions, using
25x more custom events/hits than Google Analytics
➢ Email engagement
With every open/click stored as raw data, improve: subject lines, layout,
follow-up action emails, and assistant-like experiences through heavy
A/B split tests on email marketing campaigns (an interactive feedback loop)
➢ Funnel Analysis
Wrangle all the data to discover: small improvements, an AI-driven,
personalized upsell experience, pre-selling products configured on the go -
not yet in the catalog, but easily tweaked/customized
Achievements - goal reached by measuring everything
Powering Interactive Data Analysis with Google BigQuery @martonkodok #dfua
23. Go to the BigQuery web UI.
https://bigquery.cloud.google.com/
Powering Interactive Data Analysis with Google BigQuery @martonkodok
Query a public dataset
24. Powering Interactive Data Analysis with Google BigQuery @martonkodok
Romanian stations that record the most days of snow
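A sketch of the kind of query behind this slide, assuming the documented schema of the bigquery-public-data.noaa_gsod public dataset (the join keys, country code, and snow flag are taken from that schema; verify before running):

# Sketch: Romanian stations with the most recorded snow days in 2016.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
sql = """
    SELECT s.name AS station, COUNT(*) AS snow_days
    FROM `bigquery-public-data.noaa_gsod.gsod2016` AS g
    JOIN `bigquery-public-data.noaa_gsod.stations` AS s
      ON g.stn = s.usaf AND g.wban = s.wban
    WHERE s.country = 'RO'            -- Romanian stations
      AND g.snow_ice_pellets = '1'    -- days on which snow was recorded
    GROUP BY station
    ORDER BY snow_days DESC
    LIMIT 10
"""
for row in client.query(sql).result():
    print(row.station, row.snow_days)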
25. Powering Interactive Data Analysis with Google BigQuery @martonkodok
Mentions of RO politicians since Nov '16 in GDELT articles
31. ● Funnel Analysis
● Email URL click heatmap
● Email Health Dashboard (SPAM, ISP deferral, content
A/B split tests, trends or low open rate campaigns)
● Advanced segmentation (all raw data stored)
● Behavioral analytics - engaged users etc...
Powering Interactive Data Analysis with Google BigQuery @martonkodok
Achievements Continued
32. ● no provisioning/deploy
● no running out of resources
● no more focus on large scale execution plan
● no need to re-implement tricky concepts
(time windows / join streams)
● run raw ad-hoc queries (either by analysts/sales or Devs)
● no more throwing away, expiring, or aggregating old data.
Powering Interactive Data Analysis with Google BigQuery @martonkodok
Our benefits
33. ● No manual sharding
● No capacity guessing
● No idle resources
● No maintenance windows
● No manual scaling
● No file mgmt
BigQuery: Serverless Data Warehouse
Powering Interactive Data Analysis with Google BigQuery @martonkodok #dfua
(diagram: the serverless data warehouse depicted)
34. Powering Interactive Data Analysis with Google BigQuery @martonkodok
Easily Build Custom Reports and Dashboards
35. Thank you.
Slides available on: slideshare.net/martonkodok
Reea.net - Integrated web solutions driven by creativity to deliver projects.