Tools and Tips: From Accidental to Efficient Data Warehouse Developer (SQLSaturday Gothenburg)
14. PASS and the SQL Server Community
PASS Summit
SQLSaturdays
24 Hours of PASS
Local Chapters
Virtual Chapters
passsummit.com
sqlsaturday.com
24hoursofpass.com
sqlpass.org
sqlug.se
25. Keyboard Shortcuts
Assign shortcuts you frequently use
Remove shortcuts you accidentally click (no more "ooops")
msdn.microsoft.com/en-us/library/ms174205.aspx
50. SARGable Queries
"The query can efficiently seek using an index to find the rows searched for in WHERE or JOIN clauses"
Compare it to finding a person in a phone book
(…let's just pretend we still use phone books…)
51. SARGable Queries
Adama, Lee
Adama, William
Agathon, Karl
Baltar, Gaius
Dualla, Anastasia
Gaeta, Felix
Henderson, Cally
Roslin, Laura
Thrace, Kara
Tigh, Saul
Tyrol, Galen
Valerii, Sharon
Find all rows where Name starts with 'T'
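The phone book comparison can be sketched in code. This is a minimal Python illustration, not a description of how SQL Server is implemented: the sorted list stands in for an index on Name, and `seek_prefix` is a hypothetical helper that uses binary search to jump to the first 'T' entry and stops reading as soon as the names no longer match.

```python
from bisect import bisect_left

# The phone book from the slide, kept in sorted order like an index on Name.
NAMES = [
    "Adama, Lee", "Adama, William", "Agathon, Karl", "Baltar, Gaius",
    "Dualla, Anastasia", "Gaeta, Felix", "Henderson, Cally", "Roslin, Laura",
    "Thrace, Kara", "Tigh, Saul", "Tyrol, Galen", "Valerii, Sharon",
]


def seek_prefix(sorted_names, prefix):
    """Return the names starting with `prefix` by seeking into the sorted list."""
    matches = []
    # Binary search jumps straight to the first possible match.
    i = bisect_left(sorted_names, prefix)
    # Read forward only while entries still match, then stop.
    while i < len(sorted_names) and sorted_names[i].startswith(prefix):
        matches.append(sorted_names[i])
        i += 1
    return matches


starts_with_t = seek_prefix(NAMES, "T")
print(starts_with_t)
```

Only the three 'T' entries are read in the loop; the rest of the list is never inspected, which is the point of a seek.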
53. Non-SARGable Queries
"The query has to scan each row in the table to find the rows searched for in WHERE or JOIN clauses"
Compare it to finding a person in a phone book
(…let's just keep pretending we still use phone books…)
54. Non-SARGable Queries
Adama, Lee
Adama, William
Agathon, Karl
Baltar, Gaius
Dualla, Anastasia
Gaeta, Felix
Henderson, Cally
Roslin, Laura
Thrace, Kara
Tigh, Saul
Tyrol, Galen
Valerii, Sharon
Find all rows where Name contains 'al'
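For the contains search, the sort order of the phone book gives no starting point, so every entry has to be read. A minimal Python sketch of that scan follows; `scan_contains` is a hypothetical helper that also counts how many rows it had to look at.

```python
# The same phone book as on the slide.
NAMES = [
    "Adama, Lee", "Adama, William", "Agathon, Karl", "Baltar, Gaius",
    "Dualla, Anastasia", "Gaeta, Felix", "Henderson, Cally", "Roslin, Laura",
    "Thrace, Kara", "Tigh, Saul", "Tyrol, Galen", "Valerii, Sharon",
]


def scan_contains(names, fragment):
    """Return (matches, rows_examined) for a 'contains' search."""
    matches = []
    rows_examined = 0
    for name in names:
        # Every row must be checked: a match can sit anywhere in the string.
        rows_examined += 1
        if fragment in name:
            matches.append(name)
    return matches, rows_examined


contains_al, rows_examined = scan_contains(NAMES, "al")
print(contains_al, rows_examined)
```

Five names match, but all twelve rows are examined to find them.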
56. WHERE LEFT(Name,1) = 'T'
WHERE YEAR(EpisodeDate) = 2005
WHERE EpisodeDate >= '20050101'
AND EpisodeDate < '20060101'
SARGable or Non-SARGable?
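One way to explore the question is to ask a query planner. The sketch below uses Python's built-in sqlite3 module as a stand-in, so it is an analogy rather than SQL Server behavior: the Episodes table and the index name are assumptions, strftime('%Y', ...) plays the role of YEAR(), and dates are stored as ISO text. SQLite labels an index seek as SEARCH and a full pass as SCAN.

```python
import sqlite3


def plan(conn, sql):
    """Return the planner's description of how `sql` would be executed."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table and index, named only for this illustration.
conn.execute("CREATE TABLE Episodes (Title TEXT, EpisodeDate TEXT)")
conn.execute("CREATE INDEX IX_Episodes_EpisodeDate ON Episodes (EpisodeDate)")

# Function wrapped around the column: the index order cannot be used for a seek.
year_plan = plan(
    conn,
    "SELECT Title FROM Episodes WHERE strftime('%Y', EpisodeDate) = '2005'",
)

# Bare column compared to a half-open range: the planner can seek into the index.
range_plan = plan(
    conn,
    "SELECT Title FROM Episodes "
    "WHERE EpisodeDate >= '2005-01-01' AND EpisodeDate < '2006-01-01'",
)

print(year_plan)
print(range_plan)
```

Both predicates ask for the same year, but only the one that leaves the column untouched lets the planner use the index to seek.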
58. WHERE Name LIKE 'T%'
WHERE Name LIKE '%al%'
WHERE LEFT(Name,1) = 'T'
SARGable or Non-SARGable?
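The three predicates can be compared with the same kind of planner check. This is again a SQLite stand-in via Python's sqlite3 module, not SQL Server: the Crew table and index name are assumptions, substr(Name, 1, 1) plays the role of LEFT(Name,1), and the column is declared COLLATE NOCASE because SQLite only turns a prefix LIKE into an index range when the index collation matches its case-insensitive LIKE.

```python
import sqlite3


def plan(conn, sql):
    """Return the planner's description of how `sql` would be executed."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table and index; NOCASE matches SQLite's default LIKE behavior.
conn.execute("CREATE TABLE Crew (Name TEXT COLLATE NOCASE)")
conn.execute("CREATE INDEX IX_Crew_Name ON Crew (Name)")

# Leading characters are fixed, so the pattern maps to a range in the index.
prefix_plan = plan(conn, "SELECT Name FROM Crew WHERE Name LIKE 'T%'")

# Leading wildcard: a match can start anywhere, so every entry is read.
contains_plan = plan(conn, "SELECT Name FROM Crew WHERE Name LIKE '%al%'")

# Function on the column: same intent as the prefix LIKE, but no seek.
function_plan = plan(conn, "SELECT Name FROM Crew WHERE substr(Name, 1, 1) = 'T'")

print(prefix_plan)
print(contains_plan)
print(function_plan)
```

The first and third predicates return the same rows, yet only the prefix LIKE is written in a form the planner can turn into a seek.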
60. WHERE CAST(EpisodeDate AS DATE) = '20050114'
WHERE CONVERT(CHAR(6),EpisodeDate,112) = '200501'
WHERE YEAR(EpisodeDate) = 2005
WHERE EpisodeDate >= '20050101'
AND EpisodeDate < '20060101'
SARGable or Non-SARGable?
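The CONVERT(CHAR(6),EpisodeDate,112) = '200501' predicate asks for one calendar month. A common rewrite compares the bare column against a half-open range instead, in the same shape as the last predicate on the slide. The Python sketch below only computes the two bounds; `month_bounds` is a hypothetical helper, and the predicate string is built purely for display (real code would pass the bounds as parameters).

```python
from datetime import date


def month_bounds(yyyymm):
    """Return (start, end) as 'YYYYMMDD' strings for the half-open range [start, end)."""
    year, month = int(yyyymm[:4]), int(yyyymm[4:6])
    start = date(year, month, 1)
    # First day of the following month, rolling into the next year after December.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.strftime("%Y%m%d"), end.strftime("%Y%m%d")


start, end = month_bounds("200501")
# Display only: shows the range form of the month filter.
predicate = f"WHERE EpisodeDate >= '{start}' AND EpisodeDate < '{end}'"
print(predicate)
```

Using an exclusive upper bound avoids having to know the last day of the month or the time precision of the column.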
62. WHERE Survivors < 40000
WHERE @Survivors BETWEEN Survivors-1000
AND Survivors+1000
WHERE Survivors BETWEEN @Survivors-1000
AND @Survivors+1000
SARGable or Non-SARGable?
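The two BETWEEN predicates are algebraically the same filter; they differ only in whether the arithmetic is done on the column or on the variable. The sketch below checks both points with Python's sqlite3 module as a stand-in for SQL Server: the Fleet table, its index, and the sample values are assumptions, and the named parameter :survivors plays the role of @Survivors.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the planner's description of how `sql` would be executed."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table, index, and sample data for illustration only.
conn.execute("CREATE TABLE Fleet (Survivors INTEGER)")
conn.execute("CREATE INDEX IX_Fleet_Survivors ON Fleet (Survivors)")
conn.executemany(
    "INSERT INTO Fleet VALUES (?)", [(n,) for n in range(36000, 50001, 500)]
)

# Arithmetic on the column: each row's value must be computed before comparing.
column_math = (
    "SELECT Survivors FROM Fleet "
    "WHERE :survivors BETWEEN Survivors - 1000 AND Survivors + 1000"
)
# Arithmetic on the parameter: the bounds are known up front, the column is bare.
param_math = (
    "SELECT Survivors FROM Fleet "
    "WHERE Survivors BETWEEN :survivors - 1000 AND :survivors + 1000"
)

args = {"survivors": 41000}
column_plan = plan(conn, column_math, args)
param_plan = plan(conn, param_math, args)
column_rows = sorted(r[0] for r in conn.execute(column_math, args))
param_rows = sorted(r[0] for r in conn.execute(param_math, args))

print(column_plan)
print(param_plan)
```

Both queries return the same rows, so moving the arithmetic to the variable side changes how the rows are found, not which rows are found.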