2. Quick Definitions
Firebase: The name of the suite of tools Google uses to provide BaaS (Backend as a Service)
Realtime Database: Firebase's original NoSQL database, which stores data as a single JSON tree; used for smaller projects that require low latency
Cloud Firestore: The newer document-based NoSQL database, designed to be faster and more scalable than Realtime Database
Document: Holds data as key-value pairs, where each key can be indexed and has a value associated with it (think of a table of contents that has a chapter name (key) and a page number (value))
Collection: List of documents
3. Pricing Models
Google charges users a fixed fee for every read, write and delete operation
Google also charges for the amount of GB stored on their network
Google offers three plans:
- Spark: Free tier with limited daily usage
- Flame: $25/month plan that stops charging if users go over a specific limit
- Blaze: Pay-as-you-go plan that charges based on usage (See next slide)
See https://firebase.google.com/pricing for more details
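As a rough illustration of how per-operation and per-GB charges add up, the sketch below estimates a monthly bill. The rates in `Rates` are assumed placeholder values, not figures from these slides, and the helper name `estimate_monthly_cost` is hypothetical; free quotas and network charges are ignored. Check the pricing page for current numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Assumed, illustrative USD rates; replace with values from the pricing page."""
    per_100k_reads: float = 0.06
    per_100k_writes: float = 0.18
    per_100k_deletes: float = 0.02
    per_gib_month: float = 0.18


def estimate_monthly_cost(reads, writes, deletes, stored_gib, rates=Rates()):
    """Estimate a monthly bill in USD from operation counts and stored data.

    Free daily quotas and network egress are deliberately left out.
    """
    operations = (
        reads / 100_000 * rates.per_100k_reads
        + writes / 100_000 * rates.per_100k_writes
        + deletes / 100_000 * rates.per_100k_deletes
    )
    storage = stored_gib * rates.per_gib_month
    return round(operations + storage, 2)
```

Calling `estimate_monthly_cost(reads=50_000_000, writes=0, deletes=0, stored_gib=0)` shows how quickly reads alone become the dominant cost, which is why the following slides focus on reducing them.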
5. Managing Reads and Writes
Google sets the Blaze plan as default, but it can be switched to any plan based on the user's needs
Since Google charges per read, write and delete operation, there are strategies that minimize reads and writes and thereby reduce the cost of your backend
The goal is to give Google as little money as possible and avoid spending “$30,356.56 USD in just 72 hours” [8]
[8]N. Contreras, "How we spent 30k USD in Firebase in less than 72 hours - By", Hackernoon.com, 2019. [Online]. Available:
https://hackernoon.com/how-we-spent-30k-usd-in-firebase-in-less-than-72-hours-307490bd24d. [Accessed: 22- Jul- 2019].
6. How Reads and Writes Work
Reads
- When data is retrieved from a document using get() or exists()
- If data in a document changes and the client reads the update
- If a user logs out, logs back in after 30 minutes and reads the same data again
Writes
- When set() or update() is called
- Every time the data is manually changed in Cloud Firestore
Deletes
- Anytime a document is deleted or document field is deleted
[1]"Understand Cloud Firestore billing | Firebase", Firebase, 2019. [Online]. Available:
https://firebase.google.com/docs/firestore/pricing. [Accessed: 15- Jul- 2019].
7. Strategies for Optimizing Reads and Writes
Strategy 1:
- Minimize hotspotting on Firestore
Strategy 2:
- Use Transactions and Batch Writes along with other Google recommended practices
Strategy 3:
- Follow Document Based NoSQL design patterns when modeling data
8. Hotspotting
Hotspotting: When one part of a system is overloaded instead of the load being distributed across the whole system
This occurs when:
- Many documents are being created at once with incrementing/decrementing ids
- Generating lots of documents in small collections
- Adding data that frequently changes (e.g. timestamps)
- Deleting multiple documents in a collection
- Writing to a document too frequently without gradually increasing traffic
[2]"Best practices | Cloud Firestore | Google Cloud", Google Cloud, 2019. [Online]. Available:
https://cloud.google.com/firestore/docs/best-practices#hotspots. [Accessed: 15- Jul- 2019].
9. Minimizing hotspotting
Document Ids
- Avoid the document ids . and .. and avoid the character / in ids
- Do not use incrementing ids (e.g. Customer1, Customer2, Customer3 …)
- Best to use a unique identifier such as a username or email
Field names
- Avoid using periods, brackets, asterisks and backticks (requires extra processing)
Indexing
- Avoid unnecessary indexing as it increases storage costs
- Only use indexing to partition or retrieve expensive data (e.g. large text files, large arrays)
[2]"Best practices | Cloud Firestore | Google Cloud", Google Cloud, 2019. [Online]. Available:
https://cloud.google.com/firestore/docs/best-practices#hotspots. [Accessed: 15- Jul- 2019].
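The document id advice above can be sketched as a small check plus a generator for non-sequential ids. Both helper names are hypothetical and only illustrate the rules on this slide; using `uuid4` is an assumption made here, one possible way to avoid incrementing ids when no natural identifier such as a username exists.

```python
import uuid


def is_safe_document_id(doc_id: str) -> bool:
    """Apply the slide's id advice: non-empty, not '.' or '..', and no '/'."""
    return bool(doc_id) and doc_id not in {".", ".."} and "/" not in doc_id


def random_document_id() -> str:
    """Return a random, non-sequential id (hypothetical helper built on uuid4)."""
    return uuid.uuid4().hex
```

For example, `is_safe_document_id("alice@example.com")` is accepted while `is_safe_document_id("customers/1")` is rejected. Random ids spread writes across the key range instead of clustering them at one end, as incrementing ids do.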
10. Following Google’s Best Practices
Avoid writing to the same document more than once per second
- This can lead to high latency, timeouts or worse
Use asynchronous calls over synchronous calls
Use cursors instead of offsets
Use transactions and batch writes for reads and writes
[2]"Best practices | Cloud Firestore | Google Cloud", Google Cloud, 2019. [Online]. Available:
https://cloud.google.com/firestore/docs/best-practices#hotspots. [Accessed: 15- Jul- 2019].
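The advice to use cursors instead of offsets matters for billing because documents skipped by an offset are still charged as reads. The sketch below models this on an in-memory sorted list rather than a real Firestore query; `DOCS`, `page_by_offset` and `page_by_cursor` are hypothetical names used only for illustration.

```python
from bisect import bisect_right

# Sorted values of the field the query orders by; an in-memory stand-in for a collection.
DOCS = list(range(0, 2000, 2))


def page_by_offset(offset, limit):
    """Return (page, billed_reads); skipped documents still count as reads."""
    page = DOCS[offset:offset + limit]
    return page, offset + len(page)


def page_by_cursor(start_after, limit):
    """Return (page, billed_reads); only the returned documents are read."""
    start = 0 if start_after is None else bisect_right(DOCS, start_after)
    page = DOCS[start:start + limit]
    return page, len(page)
```

Fetching the same ten documents deep in the collection costs 910 reads with `page_by_offset(900, 10)` but only 10 reads with `page_by_cursor(1798, 10)`, where 1798 is the last value seen on the previous page.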
11. Transactions and Batch Writes
Transactions and batch writes are used to perform atomic operations, meaning each one is “guaranteed to be isolated from other operations that may be happening at the same time.” [3]
A transaction is a set of read and write operations on one or more documents. [4]
A batch write is a set of write operations on one or more documents. [4]
[3]J. Fisher, "What the Heck Is an "Atomic Object"?", Atomic Spin, 2019. [Online]. Available:
https://spin.atomicobject.com/2016/01/06/defining-atomic-object/. [Accessed: 15- Jul- 2019].
[4]"Transactions and batched writes | Cloud Firestore | Google Cloud", Google Cloud, 2019. [Online]. Available:
https://cloud.google.com/firestore/docs/manage-data/transactions#batched-writes. [Accessed: 15- Jul- 2019].
12. Transactions
A transaction consists of any number of get() operations followed by any number of set(), update() or
delete() operations
Transactions guarantee that the data they read is up to date and consistent
Things to note:
- Read operations must come before write operations
- A transaction function may run more than once if a concurrent edit affects a document it reads
- Transaction functions should not directly modify the application state
- Transactions will fail if the client is offline
13. Transaction in Python
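The code originally shown on this slide is not preserved in the text, so the following is only a sketch of what a Python transaction can look like, modelled on the pattern in [4]. The collection name "cities", the document "SF" and the field "population" are illustrative assumptions. The function is written against the transaction and document reference objects passed to it, so it reads first and writes second, as required.

```python
def add_to_population(transaction, city_ref, amount):
    """Read a counter inside a transaction, then write the updated value."""
    # All reads must happen before any write.
    snapshot = city_ref.get(transaction=transaction)
    new_population = snapshot.get("population") + amount
    # The write is only applied when the transaction commits.
    transaction.update(city_ref, {"population": new_population})
    return new_population


# Usage with the real client (assumes google-cloud-firestore is installed
# and credentials are configured):
#   from google.cloud import firestore
#   db = firestore.Client()
#   run = firestore.transactional(add_to_population)
#   run(db.transaction(), db.collection("cities").document("SF"), 1)
```

Because the function may be retried, it only computes and returns a value and leaves any application state untouched.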
14. Transaction Failure
A transaction will fail if:
- The transaction contains read operations after write operations
- The transaction read a document that was modified outside of the transaction. In this case the
transaction is retried a finite number of times
- The transaction exceeds the maximum request size of 10 MiB
A failed transaction does not write anything to Firestore
15. Batch Write
Batch writes allow you to write a combination of set(), update() or delete() operations as a single
atomic action.
A batch write can contain up to 500 operations
Field transforms such as serverTimestamp(), arrayUnion() and increment() each count as an additional
operation
Batch writes have fewer failure cases than transactions and are not retried the way transactions are
Batch writes will execute even if the client is offline
16. Batch Write in Python
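The code originally shown on this slide is not preserved in the text. As a sketch only, a batch write in Python can combine set(), update() and delete() and commit them together. The collection names ("orders", "archived_orders", "stats") and the function name archive_order are hypothetical; db is assumed to be a Firestore client offering batch() and collection().

```python
def archive_order(db, order_id):
    """Move an order to an archive collection in one atomic batch."""
    batch = db.batch()

    # set(): create the archived copy (only a marker field in this sketch).
    batch.set(db.collection("archived_orders").document(order_id), {"archived": True})
    # update(): record the change on an existing summary document.
    batch.update(db.collection("stats").document("orders"), {"last_archived": order_id})
    # delete(): remove the original order.
    batch.delete(db.collection("orders").document(order_id))

    # Nothing is written until commit(); all three writes succeed or fail together.
    batch.commit()


# Usage with the real client (assumes google-cloud-firestore and credentials):
#   from google.cloud import firestore
#   archive_order(firestore.Client(), "order-123")
```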
17. Designing Document Based NoSQL
In a traditional database, tables have a schema: a set of constraints the data must follow
In Firestore, data is schema-less, meaning it does not have to follow such constraints
[5]Microsoft, LocalDB used in Microsoft Visual Studio. 2019.
[6]Medium, Document used in Firebase Firestore. 2019.
18. Polymorphic Schema
Because there are no constraints to follow, we can put any type of data into a collection, which makes
the schema polymorphic, that is, able to take “many forms” [7]
An example could be an online store that sells Appliances, CDs and Books
Each item has similar attributes like price, name and quantity but also unique ones like:
- Books have a Page Number
- CDs have a Song Count
- Appliances have a type such as Kitchen
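The online store example can be sketched as plain Python dictionaries, which is how documents appear in the Python client. The field names and values below (kind, page_number, song_count and so on) are illustrative assumptions, not taken from the slides.

```python
# Three documents that could live in the same collection: the shared fields
# are identical, and each kind adds its own extra field.
products = [
    {"name": "NoSQL Basics", "price": 29.99, "quantity": 4,
     "kind": "book", "page_number": 320},
    {"name": "Greatest Hits", "price": 9.50, "quantity": 10,
     "kind": "cd", "song_count": 12},
    {"name": "Toaster", "price": 35.00, "quantity": 2,
     "kind": "appliance", "type": "Kitchen"},
]


def listing_line(product):
    """Format the fields every product shares, whatever its kind."""
    return f"{product['name']}: ${product['price']:.2f} ({product['quantity']} in stock)"


for product in products:
    print(listing_line(product))
```

The display code never needs to know which kind of product it is handling, because it only touches the shared fields.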
[7]D. Sullivan, NoSQL for mere mortals®. Hoboken [etc.]: Addison-Wesley, 2015, pp. 152 - 217.
19. Polymorphic Schema
Since our online store will always display the price, name and quantity to our users, the three
product types will be retrieved the same way
Instead of storing each product type in a separate collection for Books, CDs and Appliances, it is
better to have a single products collection because the data is retrieved the same way
By simplifying our collections using a process known as denormalization, we reduce the number of reads
and writes to our database
Warning: Don’t over-simplify collections, as doing so may reduce performance
20. One To Many Relationships
One to Many: When an instance of an entity has one or more related instances of another entity [7]
Examples include:
- A garage contains one or more cars
- A shelf contains one or more books
Suggested Practice: Store the multiple instances as a map or array inside the single instance [7]
21. One To Many Example
[Figure: a single Customer document (the single instance) containing Location Instance 1 and Location Instance 2]
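As a sketch of the same idea in code, one customer document can hold its locations as an embedded array. The field names and values are illustrative assumptions.

```python
# One customer (the "one" side) with its locations (the "many" side)
# embedded as an array of maps inside the same document.
customer = {
    "name": "Ada Example",
    "locations": [
        {"label": "Home", "city": "Brussels"},
        {"label": "Office", "city": "Ghent"},
    ],
}


def cities_for(customer_doc):
    """List the city of every location embedded in a customer document."""
    return [location["city"] for location in customer_doc.get("locations", [])]


print(cities_for(customer))
```

Reading the customer returns all of its locations in a single document read.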
22. Many to Many Relationships
Many to Many: When multiple instances of one entity are related to multiple instances of another
entity [7]
Examples Include:
- Many students take many classes
- Many doctors have many patients
Suggested Practice: Use a separate collection for each class of entity. Documents in each collection
contain references to the documents they are related to. [7]
23. Many to Many Example
[Figure: a Course document in the Courses collection holds a reference to a Student document; a Student document in the Students collection holds a reference to a Course document]
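The two collections can be sketched as dictionaries keyed by document ID, with each side storing the IDs of the documents it is related to. All names and IDs below are illustrative assumptions; in Firestore the references could equally be stored as document reference values.

```python
# Students collection: each student lists the courses they take.
students = {
    "s1": {"name": "Ana", "course_ids": ["c1", "c2"]},
    "s2": {"name": "Ben", "course_ids": ["c1"]},
}

# Courses collection: each course lists the students enrolled in it.
courses = {
    "c1": {"title": "Databases", "student_ids": ["s1", "s2"]},
    "c2": {"title": "Networks", "student_ids": ["s1"]},
}


def course_titles(student_id):
    """Follow a student's references into the courses collection."""
    return [courses[cid]["title"] for cid in students[student_id]["course_ids"]]


def student_names(course_id):
    """Follow a course's references into the students collection."""
    return [students[sid]["name"] for sid in courses[course_id]["student_ids"]]


print(course_titles("s1"))
print(student_names("c1"))
```

Storing the references on both sides means an enrolment has to be updated in two documents, which is a good candidate for a batch write.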
24. Hierarchy Relationships
Hierarchy: Instances of entities in some kind of parent-child or part-subpart relationship [7]
Examples:
- Creating a recliner, table and desk as parts of a furniture collection
- Creating a lion, tiger and bobcat as children of a cat collection
Suggested Practice: Give child entities a reference to the parent entities [7]
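A minimal sketch of the parent-reference approach, using the cat example above. The document IDs and the parent field name are illustrative assumptions.

```python
# Each child document stores the ID of its parent; the root has no parent.
animals = {
    "cat": {"name": "Cat", "parent": None},
    "lion": {"name": "Lion", "parent": "cat"},
    "tiger": {"name": "Tiger", "parent": "cat"},
    "bobcat": {"name": "Bobcat", "parent": "cat"},
}


def children_of(parent_id):
    """Return the IDs of all documents whose parent is parent_id, sorted."""
    return sorted(doc_id for doc_id, doc in animals.items()
                  if doc["parent"] == parent_id)


def path_to_root(doc_id):
    """Walk the parent references from a document up to the root."""
    path = [doc_id]
    while animals[doc_id]["parent"] is not None:
        doc_id = animals[doc_id]["parent"]
        path.append(doc_id)
    return path


print(children_of("cat"))
print(path_to_root("tiger"))
```

In Firestore, finding the children would be an equality query on the parent field rather than a scan of a dictionary.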
26. Conclusion
In my experience, following these guidelines will help:
- Organize your data
- Make faster queries
- Create repeatable quality
- Reduce costs
Overall, this improves not just your users’ experience but your wallet’s as well
27. References
[1]"Understand Cloud Firestore billing | Firebase", Firebase, 2019. [Online]. Available:
https://firebase.google.com/docs/firestore/pricing. [Accessed: 15- Jul- 2019].
[2]"Best practices | Cloud Firestore | Google Cloud", Google Cloud, 2019. [Online]. Available:
https://cloud.google.com/firestore/docs/best-practices#hotspots. [Accessed: 15- Jul- 2019].
[3]J. Fisher, "What the Heck Is an "Atomic Object"?", Atomic Spin, 2019. [Online]. Available:
https://spin.atomicobject.com/2016/01/06/defining-atomic-object/. [Accessed: 15- Jul- 2019].
[4]"Transactions and batched writes | Cloud Firestore | Google Cloud", Google Cloud, 2019. [Online].
Available: https://cloud.google.com/firestore/docs/manage-data/transactions#batched-writes.
[Accessed: 15- Jul- 2019].
28. References
[5]Microsoft, LocalDB used in Microsoft Visual Studio. 2019.
[6]Medium, Document used in Firebase Firestore. 2019.
[7]D. Sullivan, NoSQL for mere mortals®. Hoboken [etc.]: Addison-Wesley, 2015, pp. 152 - 217.
[8]N. Contreras, "How we spent 30k USD in Firebase in less than 72 hours - By", Hackernoon.com, 2019.
[Online]. Available: https://hackernoon.com/how-we-spent-30k-usd-in-firebase-in-less-than-72-hours-307490bd24d.
[Accessed: 22- Jul- 2019].