This document appears to be a record of a training or seminar on negotiation skills. It lists a reference number, the name of the presenter "Larry Price", and the topic "Introducing the Art of Negotiation", along with the date it took place, December 21, 2013.
Opportunities in Big Data by Sumant Mandal, Founder of The Hive for The Hive I... - The Hive
Big data is disrupting many industries by generating and analyzing large amounts of data from diverse sources, enabling new products and services. Uber and Airbnb have disrupted transportation and hospitality by leveraging big data, while Waze uses traffic data. Sensors are now everywhere and producing huge amounts of scalable data that can create value by addressing real problems. Companies should capture data from all activities, use diverse sources, solve core issues, and retain data for unanticipated future uses.
#5 DataBeersBCN - "How to do Data Journalism… and not die trying" - DataBeersBCN
1. The document discusses the history and evolution of data journalism, from early examples in the 1800s to modern practices using new digital tools.
2. It outlines key aspects of modern data journalism, such as multidisciplinary teams and making sources and methods transparent.
3. The author argues that data journalism is increasingly important for accountability by enabling investigative reporting using transparency laws and open data.
The document discusses visualization techniques for mobility data mining. It describes work done by the Knowledge Discovery and Data Mining Laboratory on several EU projects involving GPS data to analyze individual and collective mobility patterns. Visualizations are shown of individual daily and weekly movements, borders of human mobility, and traffic flows into and out of the city of Pisa. The goal is to develop techniques to create an atlas of urban mobility and a permanent mobility observatory.
The document discusses the use of 3D data management and visualization for places. It questions whether a national 3D data set is needed or whether it would be a solution looking for problems to solve. It also discusses using 3D modeling at different levels of detail to improve decision making in areas like information management, conservation, urban planning, and public consultation. Level of detail can range from block models to detailed interior models. CityGML is presented as an open standard for 3D city and landscape models.
The document discusses the history and rapid growth of artificial intelligence, highlighting major AI breakthroughs and the datasets and algorithms that enabled them, noting that on average the key dataset was created only 3 years before a breakthrough, while the enabling algorithm preceded it by 18 years. It then explores potential future directions for AI, including its growing utility, the rise of machine-readable content on the semantic web, the potential for artificial general intelligence, and applications across various industries.
The document discusses how data visualization and geospatial data can be used for urban planning purposes. It provides examples of how game designers use geospatial death maps to improve game levels. Additionally, it discusses how John Snow's cholera map was an early example of using geospatial data and how individuals and organizations now emit large amounts of geospatial data daily through connected devices. The document advocates that urban planners could use this abundant geospatial data from citizens to inform community-focused planning and design processes. It provides examples of how geospatial data has been visualized regarding topics like Netflix queues, political donations, tourist vs local photos, and the costs of incarceration to provide new insights.
This document discusses various cool technologies including systems engineering, crowdsourcing, open data, open linked data, autonomy, GIS, image processing, pattern analysis, intelligent agents, assisted creativity, and paper-based user interfaces. It explores how these technologies can be combined with people and processes to drive innovation through ideas, crisis response teams, data applications, spatial data mapping, image analysis, summarization, swarm intelligence, creative collaboration, and interactive documents.
The document discusses the evolution of technology from the 1980s to the present and future, focusing on the convergence of different technologies. It describes how in the 1980s, the personal computer started to converge with other technologies like the television and newspaper. In the 1990s and 2000s, further convergence occurred as the personal computer combined with radio, CD players, games, social networks, and telephones. The document predicts that future technologies will involve reality and virtual worlds on devices with ultra-high connectivity. It also discusses how research and business models are becoming more integrated and converged across different fields.
CDT Away Day Talk: Qualitative–Quantitative reasoning and lightweight numbers - Alan Dix
Talk at EPIC CDT Away Day, St Davids Hotel, Cardiff, 11th April 2024.
https://alandix.com/academic/talks/CDT-away-day-April-2024-QQ/
As academics we need to deal with numbers, including project management spreadsheets and student marks. In addition, they are part of day-to-day life, whether household budgeting or working out how many socks to pack for a journey. Perhaps most crucially, many national and global issues require an understanding of numeric information, from climate change to tax rates, and of course the Covid-19 pandemic. If citizens are not able to make sense of this, democracy fails. Of course, many are not only uncertain when dealing with numbers, but suffer more or less extreme maths anxiety. Indeed, a recent UK survey found that “over a third of adults (35%) say that doing maths makes them feel anxious, while one in five are so fearful it even makes them feel physically sick”. Sometimes detailed calculations are necessary, but often the critical skill is qualitative–quantitative reasoning, that is, a qualitative understanding of quantitative phenomena. This can often be aided by the ability to use back-of-the-envelope calculations and to deal with lightweight numeric information. This talk discusses these issues and presents some prototype tools to explore the design space for personal numeric information.
This talk is largely the same as the one of the same name given at Ulster University in February. However, the slides have been updated to correct web material that was misattributed to the BBC but was actually from the Guardian. An eagle-eyed member of the audience spotted that the font in the screenshot was one used on the Guardian website, not the BBC's.
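As a flavour of the back-of-the-envelope, qualitative-quantitative reasoning the talk describes (this is an illustrative sketch, not an example taken from the talk or its tools), even the sock-packing question above can be turned into a tiny estimate:

```python
# Illustrative back-of-the-envelope estimate (not from the talk): roughly how many pairs of
# socks to pack, given days away, access to laundry, and a small safety margin.

def socks_to_pack(days_away: int, days_between_laundry: int, spare_fraction: float = 0.25) -> int:
    """One pair per day until laundry is available, plus at least one spare pair."""
    pairs_needed = min(days_away, days_between_laundry)
    spare = max(1, round(pairs_needed * spare_fraction))
    return pairs_needed + spare

if __name__ == "__main__":
    # A ten-day trip with laundry after a week: 7 pairs plus 2 spare = 9.
    print(socks_to_pack(days_away=10, days_between_laundry=7))
```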
This document provides an overview of geographic information systems (GIS) and mapping tools for non-profits. It discusses how maps can be used for storytelling, advocacy, program delivery, research, fundraising and community mapping. It also covers topics like data sources, tools, stakeholder participation and challenges around data acquisition. Overall, the document serves as an introduction to using maps and GIS for social causes.
There's a wealth of data readily available, but few people know what to do with it. Based on our 7 years of practical experience running the leading Canadian data-visualization studio and working with high-profile clients, we share practical ways to use data in design & communications, while giving an overview of the challenges & opportunities ahead.
Creatives will be interested in learning how to use data in their work, while marketers will discover new ways of communicating information.
Five things you will learn:
1- How data can be used as an input in the creative process
2- How data can be used in communication & public relations
3- Discover "the spectrum of visualization"
4- Learn about the challenges of working with data
5- Discover the new disciplines emerging around the usage of data
Functional Leap of Faith (Keynote at JDay Lviv 2014) - Tomer Gabel
Keynote talk given at JDay Lviv 2014 in Ukraine (http://www.jday.com.ua/). Video coming soon.
Abstract:
Some say that there's nothing new under the sun. However, looking back on five to six decades of computing, it's easy to see that things progress at their own leisurely pace. Structured programming, originating in the '60s, did not gain mainstream adoption until the '80s; object-oriented programming was hotly debated in the '70s and '80s but only gained widespread acceptance in the '90s. Every couple of decades sees an engineering leap that radically improves the software engineering discipline across the board. I believe we are now at such an inflection point, with functional programming concepts slowly sifting into the mainstream. After this talk, I hope you will too.
The Digital Divides or the third industrial revolution: concepts and figures - Ismael Peña-López
It is usual to think about the digital divide as a very concrete aspect of the impact of ICTs, mainly concerning whether the infrastructure exists (sometimes computers, sometimes computers connected to the Internet).
It is usual to think about digital literacy as the ability to switch on a computer and play some card game, send an e-mail and, optimistically, run some word processor and type in a love letter.
It is usual to think about ICTs as something that will not make hunger in the world disappear or heal the thousands of people suffering from countless diseases, especially in places where citizens live on less than one dollar a day.
It is usual to think about the digital divide as something that does not affect me, as I live on the sunny side of the world, in a developed country that will stay this way for centuries.
With the aim of dismantling all these (almost) false assumptions, the seminar will try to give "correct" definitions for concepts such as Digital Divide, Digital Literacy, eReadiness or eAwareness, and show examples of how ICTs can help underdeveloped and developing countries reach higher levels of welfare… and how so-called developed countries can swap places with less developed ones if they do not pay attention to what is happening in a globalized world.
More info, citation and download, here: http://ictlogy.net/bibciter/reports/projects.php?idp=287
This document discusses the future of AI and presents a timeline for progress and cost reductions. It predicts that by 2035, AI systems capable of human-level perception will exist, and by 2055, systems may develop human-level cognition. The cost of AI is expected to decrease dramatically over time, with supercomputers potentially costing $1,000 by 2040 and $1 by 2060. Experts may be surprised if progress is faster or slower than the predicted timeline. The document encourages students to help build the future of AI through open source contributions.
Community Technology Centers (CTCs) have struggled with changing names and priorities but have also achieved victories in expanding access to technology. CTCs originated in the 1980s to provide equal computer access and now over 1,000 are united through the Community Technology Centers' Network. Major accomplishments include federal grants in the 1990s-2000s totaling over $150 million. However, CTCs now face challenges such as broadband deployment without training, social media risks, and changing technologies. The top priorities for CTCs are expanding broadband access combined with training, supporting legislation to fund community technology programs, and ensuring CTCs remain relevant in a changing digital landscape.
The document discusses the value of data and the rise of big data. It notes that Matthew Fontaine Maury in the 1800s recognized the value of analyzing ship log data collectively. Today, new sources of data like sensors have exploded the volume of data. Characteristics of big data include volume, variety, and velocity. Technological challenges include scalability, heterogeneity, and low latency. The document provides examples of non-relational databases and MapReduce as approaches to handle big data.
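Since the summary names MapReduce as one of the approaches, here is a minimal sketch of the map/group/reduce idea in plain Python (not the Hadoop API), counting words across ship-log-like records in the spirit of the Maury example:

```python
# Minimal illustration of the MapReduce idea in plain Python: map each record to (key, 1)
# pairs, group the pairs by key (the "shuffle"), then reduce each group to a count.
from collections import defaultdict

def map_phase(records):
    for record in records:
        for word in record.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    groups = defaultdict(list)
    for key, value in pairs:              # group all values by key
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}

logs = ["wind WSW 12 knots", "wind SSW 8 knots", "calm"]
print(reduce_phase(map_phase(logs)))      # e.g. {'wind': 2, 'wsw': 1, 'knots': 2, ...}
```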
HS DAM Chicago 2019 - Reframing the Conversation - Christina Gibbs
Reframing the Conversation - Innovations in DAM, Collections Information, and Data at the Detroit Institute of Arts
September 24, 9:55AM
Presenters: Jessica Herczeg-Konecny, Digital Asset Manager, and Christina Gibbs, Collections Database Manager
Museums need to publish and widely share their data sets and images to remain relevant in today’s digital age. What does it take to provide the widest possible access to digital collections? This case study will reveal insights into the need for interoperability between Collections Information Systems, DAM, and the greater semantic web. Christina and Jessica will address challenges, risk analysis, and outcomes as they facilitate building a bigger and stronger foundation through implementing a new DAM system as well as an API.
The document provides an introduction to data mining. It discusses the growth of data from terabytes to petabytes and how data mining can help extract knowledge from large datasets. The document outlines the evolution of sciences from empirical to theoretical to computational and now data-driven. It also describes the evolution of database technology and defines data mining as the process of discovering interesting patterns from large amounts of data. The key steps of the knowledge discovery process are discussed.
The document defines data science as incorporating machine learning, data mining, capturing and cleaning unstructured data from sources like social media, using big data technologies to store and process large datasets, and considering ethics and regulation. It lists the key skills required of a data scientist as including communication, statistics, computer science, machine learning, data wrangling, visualization, and domain expertise. Common data science techniques are described as clustering, classification, association rule mining, and outlier detection.
This document provides an introduction to data mining concepts and techniques. It discusses why data mining is needed due to the massive growth of data, defines data mining as the extraction of patterns from large data sets, and outlines the data mining process. A variety of data types that can be mined are described, including relational, transactional, time-series, text and web data. The document also covers major data mining functionalities like classification, clustering, association rule mining and trend analysis. Top 10 popular data mining algorithms are listed.
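For a concrete taste of one of the techniques these summaries name, the sketch below runs clustering with scikit-learn's KMeans on a handful of invented points; it is an illustration only, not material from the documents above:

```python
# Minimal clustering example with scikit-learn's KMeans; the points are made up.
import numpy as np
from sklearn.cluster import KMeans

points = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],    # one dense group
                   [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]])   # another dense group

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)           # each point's cluster assignment, e.g. [0 0 0 1 1 1]
print(kmeans.cluster_centers_)  # the two cluster centroids
```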
A Training & Simulation Perspective on Maritime Information & Automation - Andy Fawkes
Presented at the 2nd SMi Maritime Information Warfare Conference in London on 27 November 2018. It proposed that Information Warfare & Automation and Training & Simulation have a number of parallels. It looked at the Modern Sailor; the latest Training & Simulation developments; Data & Digital Twins/Siblings; the latest gaming technology; and Automation.
BigData & Supply Chain: A "Small" Introduction - Ivan Gruer
As part of the master's in logistics LOG2020 at IUAV, a brief presentation about BigData and its impact on supply chains.
Topics and contents were developed during the research for the MBA final dissertation at MIB School of Management.
Data Colonialism and Digital Sustainability: Problems and Solutions to Curren... - Matthias Stürmer
The global datasphere is growing from 60 zettabytes today to 175 zettabytes in 2025. Much of this data and software is privately controlled by American and Chinese corporations with enormous market power. The seven largest big tech companies alone, such as Microsoft, Facebook, Alibaba or Tencent, already have a combined market capitalization of over USD 8,700 billion, which is almost three times India's GDP. This trend is called data colonialism of cyberspace. What problems arise from this and how can they be solved? The concept of digital sustainability addresses this challenge by presenting a new pathway towards greater data sovereignty.
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf - GetInData
Recently we have observed the rise of open-source Large Language Models (LLMs) that are community-driven or developed by AI market leaders such as Meta (Llama3), Databricks (DBRX) and Snowflake (Arctic). On the other hand, there is growing interest in specialized, carefully fine-tuned yet relatively small models that can efficiently assist programmers in day-to-day tasks. Finally, Retrieval-Augmented Generation (RAG) architectures have gained a lot of traction as the preferred approach to LLM context and prompt augmentation for building conversational SQL data copilots, code copilots and chatbots.
In this presentation, we will show how we built upon these three concepts a robust Data Copilot that can help to democratize access to company data assets and boost performance of everyone working with data platforms.
Why do we need yet another (open-source) Copilot?
How can we build one?
Architecture and evaluation
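This is not GetInData's Data Copilot; purely as a reference for the RAG pattern the abstract mentions, the sketch below shows the general shape, with embed() and call_llm() as stand-ins for a real embedding model and LLM:

```python
# Bare-bones sketch of Retrieval-Augmented Generation: embed documents, retrieve the most
# similar ones to a question, and stuff them into the prompt. embed() and call_llm() are
# placeholders, not a real model or API.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: deterministic pseudo-embedding; swap in a real embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(16)

def call_llm(prompt: str) -> str:
    """Placeholder: send the prompt to an LLM and return its answer."""
    return f"[LLM answer based on a prompt of {len(prompt)} characters]"

documents = [
    "Table sales_2024 holds one row per order with columns order_id, amount, region.",
    "Table customers maps customer_id to signup_date and segment.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def answer(question: str, top_k: int = 1) -> str:
    q = embed(question)
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    context = "\n".join(documents[i] for i in np.argsort(scores)[::-1][:top_k])
    return call_llm(f"Context:\n{context}\n\nQuestion: {question}")

print(answer("Which table has order amounts?"))
```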
Learn SQL from basic queries to advanced queries - manishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
Global Situational Awareness of A.I. and where it's headed - vikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be unleashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, ai, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake - Walaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W... - Social Samosa
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
End-to-end pipeline agility - Berlin Buzzwords 2024 - Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
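The snippet below is not the tooling described in the talk; it is only a generic, minimal illustration of what an end-to-end pipeline test can look like: two chained jobs run against a tiny in-memory fixture, with an assertion on the final output so an upstream change that breaks the downstream job is caught immediately:

```python
# Generic end-to-end pipeline test sketch (illustrative, not the talk's approach).

def clean(records):
    """Upstream job: drop malformed rows and normalise the country code."""
    return [{**r, "country": r["country"].upper()} for r in records if r.get("country")]

def count_by_country(records):
    """Downstream job: aggregate cleaned rows per country."""
    counts = {}
    for r in records:
        counts[r["country"]] = counts.get(r["country"], 0) + 1
    return counts

def test_pipeline_end_to_end():
    fixture = [{"country": "se"}, {"country": "SE"}, {"country": None}]
    assert count_by_country(clean(fixture)) == {"SE": 2}

if __name__ == "__main__":
    test_pipeline_end_to_end()
    print("pipeline fixture test passed")
```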
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data - Kiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
Slide 13: Platform Society - Foursquare (2008), Big Data, widespread adoption, democratization of data
Statistics: Bayes Theorem (1763), Regression (1805)
Computer Age: Turing (1936), Neural Networks (1943), Evolutionary Computation (1965), Databases (1970s), Genetic Algorithms (1975)
Data Mining: KDD or Knowledge Discovery from Databases (1989), supervised machine learning (1992), Data Science (2001)
Slide 35: Data Mining and related fields - Machine Learning, Pattern Recognition, Algorithms, High-Performance Computing, Statistics, Database Systems, Data Warehouse, Information Retrieval Applications, Visualization
Class overview
Slide 72: DATA MINING THE CITY - Weds 7p-9p, 200 Buell - Violet Whitney, vw2205@columbia.edu - attendance/reflection: shoutkey.com/us
Editor's Notes
We’re going to do some exercises: this first one will be on getting data which will start the weekly assignment.
D<>D just means paired designers; we're going to pair up with whoever has computers because it's more fun together, and then we can meet each other
I just graduated with my MArch from GSAPP
Aleppo project at CSR
sidewalk
Where we fit into history
King's College practiced statistics through engineering. The world's most powerful computer was at Watson Lab in 1954.
Paperless studio (CAD)
CBIP - Columbia Building Intelligence Project - data/metric-driven design of the built environment
Columbia also hosted Cities Lab and Network Cities
Center for Spatial Research - humanitarian mapping
This is the best place for technology and architecture
As Professor José van Dijck has described, the computerization of every aspect of life has created a Platform society.
Today most of our social and economic relations take place through platforms like Facebook and Venmo
Tinder’s matching algorithm leads to an increasing number of matches and marriages each year. Ultimately its algorithm will shape the genetic makeup of the human race, as swipes are made, humans are matched and babies are born.
The filters of StreetEasy and Apartment Finder literally filter who lives in what neighborhoods, reprogramming entire city zones.
Where the Nolli map once exposed accessible public space, Yelp is now telling individuals what spaces they should like, but everyone sees a different map. These recommendation systems algorithmically segregate cities, generating spatialized filter bubbles which choreograph pedestrian flows through siloed canals across the city.
From Yelp reviews directing people to preferred restaurants to Airbnb reprogramming homes into vacation rentals, the invisible code that powers a city’s use may have more drastic influence than any physical invention in the last century.
But cities have always operated as platforms, as Manuel Castells states - they are the ‘material interfaces’ that connect individual city dwellers.
Just like the networks on the internet, room adjacencies and hallways too act like networks.
Not only have cities operated like platforms; the use of data in cities isn't new. In the '30s, surveys and statistics about the makeup of a place were used to justify the redevelopment of “blighted areas” and for racial redlining. So what is so different about data in the city now?
Today it's the quantity and ubiquity of that data which is new. The democratization of data through public APIs allows various apps and lone coders to access giant pools of data dropped by tiny transactions throughout the city.
The interconnectedness and availability of this data give designers immense power to choreograph the use of cities and to speculate creatively about the urban environment.
This course will focus on encoding spatial analytical processes. We will hypothesize about the relationships of tools and space, as well as develop models and simulations so designers can gain a foothold in the changing landscape of the digital city.
We will develop technical training in relevant techniques: Python, public APIs, batch image and video processing, and visualization techniques in Processing,
as well as a critical understanding of the social, economic, and political dynamics caused by these technologies, such as data bias and privacy issues.
In Session A, we will learn about data types, preprocessing data, about location and accuracy
About mapping Data & Other Visualization techniques,
About defining Spatial Patterns
About recommendation systems
And about Pixels, Images, Video, and computer vision
Session B will be run as workshops tailored to your specific interests (such as sentiment analysis or natural language processing) and will give you the opportunity to deep dive into your own project which can orient around your studio.
Workshops will include expert guest critics from data, cloud computing and urban analytics.
Set of processes or methods for discovering patterns
We’ll do a quick reflection at the end of each class through a google form to give you the opportunity to submit regular feedback on the class as well as mark yourself as here
Every week there will be a tutorial or an assignment that will develop your Project which you will post on Medium.
Who knows what Medium is?
We’ll get started on the first week’s assignment and you’ll continue it at the end of class.
The course project asks students to use at least 2 NYC datasets to generate a visual argument about change in the city. Projects will be individual; however, students are encouraged to share their datasets and methods with a pair-coding partner.
Super open on what people want to do for midterm and final review.
critics?
Who has computers?
groups
Google Street View is an amazing archive of the city but is not yet easily sortable. If we want to see all locations that are marked as historic in New York City, we would need to look up each location from a database of addresses, copy the address into Google Maps, drop the pegman into each location, screenshot each street scene, and then repeat the steps for each location before being able to compare them all.
Artists like Josh Begley have found smarter ways to sample Google Street View. He uses Google’s API and custom scripts to automate the downloading of street view from various locations. In “Officer Involved”, he uses databases of police brutality (collected by non-governmental and news organizations) to sample Street View scenes at the location of each incident, thus immersing us in “the environment of someone’s last moment”.
Where is data stored? Flat files, databases and websites, APIs - what's an API?
Google Maps (church, CVS, bridge, bar, etc.) -> Google Sheets
manually scraping
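To make "an API" concrete: instead of manually scraping, a few lines of Python can request records directly from an open endpoint. The sketch below assumes a Socrata-style NYC Open Data endpoint; the dataset id is a placeholder you would replace with a real one from data.cityofnewyork.us.

```python
# What "using an API" looks like in practice: request a few rows of JSON over HTTP.
import requests

DATASET_ID = "xxxx-xxxx"  # placeholder Socrata dataset id from data.cityofnewyork.us
url = f"https://data.cityofnewyork.us/resource/{DATASET_ID}.json"

response = requests.get(url, params={"$limit": 5})  # ask for just five records
response.raise_for_status()
for row in response.json():                         # each record comes back as a dictionary
    print(row)
```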
Each dataset has the same summary statistics (mean, standard deviation, correlation), yet the datasets are clearly different and visually distinct.
Anscombe’s Quartet is the classic example showing how visualization can trump statistics alone.
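To see the point in code, the snippet below compares two of Anscombe's four datasets using their standard published values: the summary statistics come out (nearly) identical even though a plot shows one is roughly linear and the other follows a curve.

```python
# Quick numeric check of the Anscombe's Quartet point.
import numpy as np

x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y1 = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])  # dataset I
y2 = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])   # dataset II

for name, y in [("I", y1), ("II", y2)]:
    r = np.corrcoef(x, y)[0, 1]
    print(f"dataset {name}: mean={y.mean():.2f}  std={y.std(ddof=1):.2f}  corr(x, y)={r:.3f}")
# Both print roughly mean=7.50, std=2.03, corr=0.816; yet plotted, they look nothing alike.
```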
In a paper on the coastline of Britain, Benoit Mandelbrot showed that it is inherently nonsensical to discuss certain spatial concepts (such as the length of the perimeter of the coastline), despite the inherent presumption that discussing the length of a coastline seems valid. Lengths in ecology depend directly on the scale at which they are measured and experienced. So while surveyors commonly measure the length of a river, this length only has meaning in the context of the relevance of the measuring technique to the question under study.
He captured this idea in fractal geometry: certain forms and branching patterns can be seen at multiple scales.
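A toy numeric illustration of that scale dependence (not Mandelbrot's analysis): measuring the same wiggly "coastline" with a coarse ruler and a fine ruler gives noticeably different lengths, because the fine ruler traces wiggles the coarse one cuts across.

```python
# Measure the same wiggly curve at two different step sizes and compare the totals.
import numpy as np

def coastline_length(step: float) -> float:
    x = np.arange(0.0, 10.0 + step, step)
    y = np.sin(3 * x) + 0.3 * np.sin(17 * x)          # a wiggly curve standing in for a coast
    return float(np.sum(np.hypot(np.diff(x), np.diff(y))))

print(coastline_length(step=1.0))    # coarse ruler: shorter measured length
print(coastline_length(step=0.01))   # fine ruler: the same "coast" measures much longer
```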
binary is the way computers store data at their lowest level, as electric charge.
We don’t use ones and zeroes. When working with binary data, we often use hexadecimal instead.
But given the proper context, this hexadecimal string actually represents color (you’ve probably used these numbers in photoshop)
What you may not know is that internally, most data are held as long, one-dimensional sequences of values, either binary (as hexadecimal) or text (as characters).
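As a small illustration (not from the slides), the same hex value can be read as a number, as raw bytes, or as a color, depending on the context we bring to it:

```python
# The hex string "4682B4" interpreted as three RGB color channels, and as raw bytes.
hex_color = "4682B4"                       # the kind of six-digit code used in Photoshop/CSS
r, g, b = (int(hex_color[i:i + 2], 16) for i in range(0, 6, 2))
print(r, g, b)                             # 70 130 180: red, green and blue levels out of 255

raw = bytes.fromhex(hex_color)             # the very same value as a raw byte sequence
print(list(raw))                           # [70, 130, 180]
```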
In computers, encoding is the process of putting a sequence of characters (letters, numbers, punctuation, and certain symbols) into a specialized format for efficient transmission or storage.
Decoding is the opposite process -- the conversion of an encoded format back into the original sequence of characters.
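For example, in Python, UTF-8 encoding turns characters into bytes for storage or transmission, and decoding reverses it:

```python
# Encoding and decoding in practice: the same text, as characters and as stored bytes.
message = "Café on 125th St"
encoded = message.encode("utf-8")          # encode: characters -> bytes
print(encoded)                             # b'Caf\xc3\xa9 on 125th St' (é becomes two bytes)
print(encoded.decode("utf-8"))             # decode: bytes -> the original characters
```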
Now that we know a bit about what data are and how they’re stored… let's get into formatting data
We’re going to use location data to get streetview images from Google’s API (their open data)
We want to clean our data to turn our addresses into latitude and longitude
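A sketch of what that exercise can look like, assuming Google's Geocoding and Street View Static web APIs and your own API key; endpoint details, parameters and quotas should be checked against Google's current documentation before relying on this:

```python
# Geocode an address to lat/long, then request a Street View image at that point.
import requests

API_KEY = "YOUR_GOOGLE_API_KEY"   # placeholder: substitute your own key

def geocode(address: str) -> tuple[float, float]:
    resp = requests.get(
        "https://maps.googleapis.com/maps/api/geocode/json",
        params={"address": address, "key": API_KEY},
    )
    resp.raise_for_status()
    location = resp.json()["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]

def save_street_view(lat: float, lng: float, filename: str) -> None:
    resp = requests.get(
        "https://maps.googleapis.com/maps/api/streetview",
        params={"location": f"{lat},{lng}", "size": "640x640", "key": API_KEY},
    )
    resp.raise_for_status()
    with open(filename, "wb") as f:
        f.write(resp.content)     # the response body is the image itself

lat, lng = geocode("200 Buell Hall, New York, NY")
save_street_view(lat, lng, "buell.jpg")
```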
When we’re talking about our data, there are a couple terms to know...