Calit2 has grown significantly in its first five years from 2001-2005. It started with a small space and 25 people, and has expanded to include two new buildings providing 340,000 square feet of space and over 1000 researchers. Calit2 has also established several new laboratories for nanotechnology and virtual reality. It has received over $350 million in federal grants and $72 million from industry partnerships. Calit2 works with over 300 faculty across dozens of departments at UCSD and UCI on various projects including undergraduate research.
VRA2013 Engaging New Technology - Trendler – John Trendler
Tech on the Horizon. Conference presentation by John Trendler at the Visual Resources Association's 2013 conference in Providence, RI, part of the Engaging New Technology Session.
Songification: Enhancing live music data – Deb Verhoeven
This document summarizes a presentation about using music gig data to create musical compositions. It discusses analyzing the location, date, and frequency of gigs played by several Australian bands from the 1960s-1970s. This data is converted into musical notes based on the frequency and distance of gigs from a central point. The resulting notes are played together to create songified versions of the bands' gig histories and movements around venues in that era. Examples of songs created from the gig data of bands like Max Merritt and the Meteors, Billy Thorpe and the Aztecs, and Doug Parkinson are presented.
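A minimal Python sketch of the kind of mapping this summary describes, in which a venue's gig count and distance from a central point become pitches. The function name, the scaling constants, and the coordinates are all invented for illustration; the presentation's actual sonification rules are not specified here.

```python
# A sketch (not the presenters' actual code) of mapping gig data to notes:
# more gigs -> higher pitch; far-away venues drop an octave. All constants
# here are assumptions for illustration.
from math import hypot

def gig_to_midi_note(gig_count: int, venue_xy: tuple[float, float],
                     centre_xy: tuple[float, float]) -> int:
    """Map a venue's gig frequency and distance from the centre to a MIDI pitch."""
    distance = hypot(venue_xy[0] - centre_xy[0], venue_xy[1] - centre_xy[1])
    base = 48                                      # C3 as an arbitrary floor
    pitch = base + min(gig_count, 24)              # more gigs -> higher pitch, clamped
    octave_shift = -12 if distance > 10.0 else 0   # far venues sound an octave lower
    return pitch + octave_shift

# e.g. a venue played 12 times, about 3 km from the city centre:
print(gig_to_midi_note(12, (2.0, 2.2), (0.0, 0.0)))  # -> 60 (middle C)
```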
Visualizing Cinema Data: Presentation at HOMER (Prague 2013) – Deb Verhoeven
Cinema data is characteristically complex, heterogeneous and interlinked. Rather than relying on simple information retrieval techniques, researchers are increasingly turning to the creative exploration and reapplication of data in order to more fully explore the meaning of newly available and diverse data sets. In this context, the cinema historian becomes the creator of visual texts which can be assessed for both their interpretive insight and their aesthetic qualities. This paper presents four research projects that develop different spatio-temporal visualisation techniques to understand the industrial dynamics of post-war film exhibition and distribution in Australia. The research integrates groundbreaking work by a group of inter-disciplinary investigators into the effectiveness of techniques such as dendritic mapping, Circos, time-series graphs, animation, cartogram mapping, and multivariate visualisation for the study of cinema circuits and operations at a number of scales.
Kinomatics: Presentation at HOMER (Prague 2013) – Deb Verhoeven
The document discusses the Kinomatics project which uses large datasets of cinema data to analyze trends in the global film industry. The project is led by researchers from Deakin and RMIT Universities and collects data on movie showtimes, venues and box office earnings from 48 countries. It aims to use this "big data" to better understand factors influencing the film industry and enable predictive analysis. The volume of data collected each week is demonstrated and challenges around data veracity are also discussed.
The Pacific Research Platform: a Science-Driven Big-Data Freeway System – Larry Smarr
The Pacific Research Platform will create a regional "Big Data Freeway System" along the West Coast to support science. It will connect major research institutions with high-speed optical networks, allowing them to share vast amounts of data and computational resources. This will enable new forms of collaborative, data-intensive research for fields like particle physics, astronomy, biomedicine, and earth sciences. The first phase aims to establish a basic networked infrastructure, with later phases advancing capabilities to 100Gbps and beyond with security and distributed technologies.
The document discusses the rapidly growing volumes of data being generated across many scientific domains such as biology, astronomy, climate science, and others. It notes that while "big science" projects have been able to develop robust cyberinfrastructure to manage and analyze large datasets, most individual researchers and smaller research groups lack adequate computing resources and software tools to effectively handle the data. The author argues that providing research cyberinfrastructure as a cloud-based service could help address this problem by reducing costs and barriers to entry for researchers. Specific services like Globus Online for data transfer and potential future services for storage, collaboration, and integration with other tools are presented as examples of this approach.
Advancing Science through Coordinated Cyberinfrastructure – Daniel S. Katz
How local, regional, and national cyberinfrastructure can be coordinated and linked to advance science and engineering, based on experiences and lessons from the Center for Computation & Technology at LSU (ideas, funding, implementation), plus some thoughts on what might be done differently if we were starting today. Presented at First Workshop - Center for Computational Engineering & Sciences, Unicamp, Campinas, Brazil 10 APR 2014
This document summarizes a webinar about using cloud technologies in teaching and learning. It discusses the various cloud tools used, such as Twitter, HTML pages, CloudConcepts, Popplet, Echo livestreaming, ScoopIT, a WordPress blog, and Top Hat Monocle. These tools enabled adding Twitter feeds, creating content pages, hosting videos, mind mapping for assessments, live streaming lectures, curating resources, blogging, and conducting polls and quizzes. Student feedback on the live streaming and tools was positive. Producing reusable resources and collaborating took time, and ongoing support is needed to manage the technologies. Contact information is provided for further assistance.
Biological Science Collections Tagging and Tracking presented at SPNHC – Rob Guralnick
This document describes the Biological Science Collections Tracker (BiSciCol) project. BiSciCol aims to track biodiversity objects across collections by implementing globally unique identifiers and a linked data approach. It creates relationships between objects, such as specimens, tissues, sequences, and taxonomic concepts, stored across various biodiversity data providers. BiSciCol addresses how to link together and query across these diverse data sources to form a growing constellation of biodiversity data and knowledge.
This document discusses building knowledge graphs using DIG (Distributed Information Graphs) to integrate heterogeneous data sources. It describes the steps involved, including data acquisition, feature extraction, mapping to an ontology, entity resolution, graph construction, and deployment. As a use case, DIG has been used to build a knowledge graph from over 100 million web pages related to human trafficking to help law enforcement identify victims and prosecute traffickers.
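As a rough illustration of the pipeline stages listed in this summary (feature extraction, mapping to an ontology, entity resolution, graph construction), here is a stdlib-only Python sketch. It is not DIG's actual API; every name, the toy regex extractor, and the phone-based resolution rule are invented for illustration.

```python
# Illustrative pseudologic for the stages named above, not DIG itself.
import re

pages = [
    {"url": "http://example.org/a", "text": "Contact Jane at 555-0101"},
    {"url": "http://example.org/b", "text": "Call 555 0101 for Jane"},
]

def extract_phone(text: str) -> str | None:            # feature extraction
    m = re.search(r"\d{3}[- ]\d{4}", text)
    return m.group().replace(" ", "-") if m else None  # normalise for resolution

triples = []                    # graph as (subject, predicate, object) triples
entities: dict[str, str] = {}   # entity resolution: same normalised phone -> same node
for page in pages:
    phone = extract_phone(page["text"])
    if phone is None:
        continue
    node = entities.setdefault(phone, f"person:{len(entities)}")
    triples.append((node, "ont:phoneNumber", phone))       # map to ontology property
    triples.append((node, "ont:mentionedIn", page["url"]))

print(triples)  # both pages resolve to the single node person:0
```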
What's beyond Virtualization - The Future of Cloud Platforms – Derek Collison
My updated talk on the future of IT at QCon NY.
What lies beyond virtualization? How do we start the journey to a secure, composable, and trusted hybrid platform that truly delivers the business value and velocity we all want?
In the era of software-defined everything, one goal is to reach a fluid infrastructure with the plasticity needed to self-heal and provide higher-level SLAs for applications and services. Adding value to existing applications and services in a transparent fashion requires a rethinking of core technologies in the platform space. In this talk we will look at some low-level technologies and approaches to achieving this goal. Topics will range from intelligent layer-7 SDN with semantic awareness, distributed scheduling algorithms, policy distribution and invalidation, health monitoring and management, and self-healing techniques, to the role of unsupervised deep machine learning and anomaly detection.
Water and Technology, some stuff we've learned by Robert Cheetham, President,... – Kim Beidler
This document discusses lessons learned from applying technology to water management. It outlines Azavea's work in areas like open source software, open data, user interface design, gamification, and mobile applications. The document also identifies opportunities in big data, cognitive surplus, data science, and cloud computing. It concludes by encouraging readers to believe in the impossible and get to work on these challenges.
This document discusses crowdfunding for university research projects. It provides information on different types of crowdfunding platforms, including domain-specific, university-specific, and education platforms. Data is presented on the success of the Pozible crowdfunding platform in launching and fully funding research projects. Charts show funding amounts over time for individual projects and overall. Factors that contribute to project success are discussed, including the Twitter network of the project principal and social media engagement with the project website.
The Humanities Networked Infrastructure (HuNI) combines data from 30 Australian cultural websites into the largest humanities and creative arts database in Australia. It covers all disciplines and brings together information on people, works, events, organizations and places that make up Australia's rich cultural landscape. Researchers can search HuNI, save search results as virtual collections, refine collections by adding links between records, and share collections with other researchers.
Big Cinema Data: Analysing global cinema showtimes – Deb Verhoeven
Looking at cinema exhibition and distribution at an international scale requires data beyond broad aggregates: it requires data that is specific to individual films and cinema venues in order to appreciate the intricate temporal and geographic aspects of flows and patterns. The Kinomatics Project has tracked the global flow of individual film screenings (down to date and time) for over 54,000 films across 30,000 venues in 48 countries.
This presentation will highlight the importance of global-scale analysis and data through three case studies. The first tracks the spatial and temporal relationships of The Hobbit: An Unexpected Journey, highlighting the complexities of international cinema enterprises and the subtleties of contemporary releasing strategies. The second explores the relationship between remittance flows and the movement of film around the globe, with a focus on Bollywood films. The third tests dyadic relationships between countries. The presentation will introduce some of the methods for analysing and visualising data used in the three case studies.
The document discusses the Humanities Networked Infrastructure (HuNI) project. HuNI aims to (1) integrate cultural data from 28 sources at a national level, (2) make this aggregated data accessible through a new national data service, and (3) connect the data to the Linked Data Cloud to allow for complex queries. The HuNI lab app will allow users to discover, explore, connect, curate and share data as well as save and import their own data. HuNI intends to change the nature of humanities research by enabling work with larger datasets and breaking down disciplinary data boundaries to promote sharing and collaboration.
Research My World: Pilot Project Evaluation – Deb Verhoeven
This document provides an evaluation of a pilot crowdfunding project called "Research My World" conducted by Deakin University in Australia. The key findings were:
- 6 out of 8 research projects were successfully funded, raising over $50,000, with additional funds raised after the campaigns ended.
- The projects generated over 200 media stories reaching over 1.4 million people and over 3,600 tweets.
- The researchers improved their digital and social media skills, and saw increased profiles and networks.
- Factors like the reach of Twitter networks and driving traffic to project websites correlated with funding success.
The document contains checklists for researchers and universities to prepare for crowdfunding campaigns. The institutional checklist covers having a point person, payment methods, receipting procedures, PR and marketing support, and being prepared to support projects beyond the campaign. The researcher checklist covers having an existing project, promotion and networking strategies, production of a promotional video, dedicating time to promotion, and support from their school and faculty. The overall document provides guidance to help ensure researchers and their universities are ready to successfully conduct a crowdfunding campaign.
The document describes the Humanities Networked Infrastructure (HuNI) project. HuNI aims to create a virtual laboratory that integrates 28 Australian cultural datasets and enables new forms of humanities research. It will harvest data from partner organizations, transform it into a searchable format and linked open data, and develop tools for researchers to discover, analyze, annotate, and share collections across the integrated datasets. The project is led by Deakin University with funding from NeCTAR and contributions from partner organizations.
Presentation at the Australasian Consortium of Humanities Research Centres (ACHRC), July 2013. Panel description:
The Digital Humanities offers not only new tools to support what we do in the Humanities, but also new ways of thinking about what it is that we do. This panel will build upon Alan Liu’s keynote discussion of ideas for digital tools for humanities advocacy and speak to the way non-digital centres can benefit from digital humanities initiatives.
Mapping the Australian Screen Content Producer – Deb Verhoeven
The document discusses a survey of Australian screen content producers. It aimed to map the culture, motivations, and aspirations of over 4,000 producers in defined populations. The survey included open and closed questions about classifications, projects, education, employment, industry sentiment, attitudes, and perceptions.
The document also discusses the role and responsibilities of a producer. Producers are involved in all aspects of a production, from development through post-production and marketing. They oversee the creative process and make important decisions. While no single producer handles every task, they must perform a majority of producing functions.
Producers tend to work in industries like health, education, media, and finance. They are motivated more by intrinsic rewards than extrinsic ones.
Global Situational Awareness of A.I. and where it's headed – vikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be unleashed, and before long, The Project will be on. If we're lucky, we'll be in an all-out race with the CCP; if we're unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
The Ipsos - AI - Monitor 2024 Report.pdf – Social Samosa
According to Ipsos AI Monitor's 2024 report, 65% of Indians said that products and services using AI have profoundly changed their daily lives in the past 3-5 years.
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake – Walaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
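A toy Python sketch of property (1) above: generating a compliance-enforcing SQL view from declarative column annotations. This is not ViewShift's real annotation format or API; the consent table, policy labels, and helper function are assumptions for illustration.

```python
# Hypothetical declarative annotations and view generation, sketched in Python.
ANNOTATIONS = {          # invented annotation format, not ViewShift's
    "member_id": "keep",
    "email":     "mask",  # privacy-sensitive: null out unless consented
    "country":   "keep",
}

def compliance_view_sql(table: str, annotations: dict[str, str]) -> str:
    """Emit a CREATE VIEW that masks annotated columns per user consent."""
    cols = []
    for col, policy in annotations.items():
        if policy == "mask":
            # respect per-member consent recorded in a (hypothetical) consent table
            cols.append(f"CASE WHEN c.allow_{col} THEN t.{col} ELSE NULL END AS {col}")
        else:
            cols.append(f"t.{col}")
    return (
        f"CREATE VIEW {table}_compliant AS\n"
        f"SELECT {', '.join(cols)}\n"
        f"FROM {table} t JOIN consent c ON c.member_id = t.member_id"
    )

print(compliance_view_sql("profiles", ANNOTATIONS))
```

Because the view is generated rather than hand-written, the same declarative annotations can in principle be re-rendered for each engine's SQL dialect, which is the portability property the description mentions.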
End-to-end pipeline agility - Berlin Buzzwords 2024 – Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long does it take for all downstream pipelines to be adapted to an upstream change?", the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
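To make the schema-metaprogramming idea concrete, here is a small Python sketch in which downstream schemas are derived programmatically from the upstream one, so a field added upstream propagates everywhere without hand-edited boilerplate while remaining a typed, statically inspectable class. The helper and field names are assumptions, not the speaker's actual tooling.

```python
# Deriving downstream schemas from an upstream schema in code, so schema
# changes propagate without boilerplate. Names are invented for illustration.
from dataclasses import make_dataclass, fields

UpstreamEvent = make_dataclass(
    "UpstreamEvent",
    [("user_id", str), ("ts", int), ("country", str)],  # upstream contract
)

def derive_schema(name: str, base, extra: list[tuple[str, type]]):
    """Downstream schema = every upstream field plus job-specific columns."""
    inherited = [(f.name, f.type) for f in fields(base)]
    return make_dataclass(name, inherited + extra)

# A downstream aggregate keeps all upstream fields and adds its own count;
# adding a field to UpstreamEvent automatically appears here too.
DailyAgg = derive_schema("DailyAgg", UpstreamEvent, [("event_count", int)])

print([f.name for f in fields(DailyAgg)])
# -> ['user_id', 'ts', 'country', 'event_count']
```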
Analysis insight about a Flyball dog competition team's performance – roli9797
Insights from my analysis of a Flyball dog competition team's performance over the last year. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
Open Source Contributions to Postgres: The Basics – POSETTE 2024 – ElizabethGarrettChri
Postgres is the most advanced open-source database in the world, and it's supported by a community, not a single company. So how does this work? How does code actually get into Postgres? I recently submitted a patch that was committed, and I want to share what I learned in the process. I'll give you an overview of Postgres versions and how the underlying project codebase functions. I'll also show you the process for submitting a patch and getting it tested and committed.
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W... – Social Samosa
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data – Kiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You... – Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
1. KINOMATICS: BIG DATA GOES TO THE MOVIES
DEB VERHOEVEN (DEAKIN UNIVERSITY)
@BESTQUALITYCRAB
2. KINOMATICS: BIG DATA WILL GO TO THE MOVIES
DEB VERHOEVEN (DEAKIN UNIVERSITY)
@BESTQUALITYCRAB
3. KINOMATICS: BIG DATA WENT TO THE MOVIES
DEB VERHOEVEN (DEAKIN UNIVERSITY)
@BESTQUALITYCRAB
4. RECAP: What is Kinomatics?
• KINEMATICS: THE STUDY OF THE GEOMETRY OF MOTION
• KINOMATICS: THE STUDY OF THE INDUSTRIAL GEOMETRY OF MOTION PICTURES
6. RECAP: Who is Kinomatics?
Deb Verhoeven (Cinema Studies; Digital Humanities)
Colin Arrowsmith (Geospatial Science)
Alwyn Davidson (Geospatial Science)
Bronwyn Coate (Economist)
Ben Eltham (Cultural Policy)
Stuart Palmer (Network Analyst)
9. RECAP: What is Kinomatics?
The major research aim of Kinomatics is to foster and investigate:
• the global flow of culture – in particular in the film industry;
• the computational turn in cinema (and music) research;
• innovative approaches to digital research methodologies;
• models for ‘big’ cultural data research.
10. “BIG” data
It’s not how big your data is, it’s what you do with it that counts.
11. “BIG” data and anxiety
• Size anxiety - dimensia - who has the biggest?
• Primacy anxiety - who came first? Or who did it first?
12. ONE YEAR LATER…
Big Data doesn’t immediately equate to Big Breakthroughs. Moving beyond The Obvious takes work.
17. TIME?: PLAYWEEKS
[Figure legend: Wednesday-Tuesday, Thursday-Wednesday, Friday-Thursday, Saturday-Friday, No Data]
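The legend above reflects that territories start their cinema "week" on different days, so the same screening date falls into different playweeks depending on the country. A minimal Python sketch of the resulting computation, with an invented start-day table (the real country-to-playweek mapping is not given on the slide):

```python
# Mapping a screening date to the start of its playweek, given the weekday
# on which that territory's playweek begins. The table below is illustrative.
from datetime import date, timedelta

PLAYWEEK_START = {"DE": 3, "AU": 3, "US": 4, "FR": 2}  # Mon=0 ... Sun=6 (assumed)

def playweek_start(screening: date, country: str) -> date:
    """Return the first day of the playweek containing this screening."""
    offset = (screening.weekday() - PLAYWEEK_START[country]) % 7
    return screening - timedelta(days=offset)

d = date(2014, 6, 14)           # a Saturday
print(playweek_start(d, "US"))  # 2014-06-13 (Friday-Thursday playweek)
print(playweek_start(d, "AU"))  # 2014-06-12 (Thursday-Wednesday playweek)
```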
18. MULTIPLE TEMPORALITIES
• Data describes showtimes at the level of individual sessions that are projected to occur in a forthcoming week
• Data arrives at Kinomatics weekly
• Data is obsolete after one month (and is disposed of by the data provider)
• Showtime data in the Kinomatics database is replaced on a weekly basis if records are incomplete
• The Kinomatics database itself is built in ‘legacy’ data formats (MySQL)
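A stdlib-only sketch of the weekly refresh these bullets describe, with sqlite3 standing in for the project's MySQL store; the table schema and the "incomplete record" rule are assumptions for illustration.

```python
# Weekly load that inserts new sessions and overwrites stored ones only if
# they are incomplete (here: missing start_time). Schema is hypothetical.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE showtimes (
    session_id TEXT PRIMARY KEY, venue TEXT, film TEXT, start_time TEXT)""")

def weekly_load(rows: list[tuple[str, str, str, str | None]]) -> None:
    """Insert new sessions; replace existing ones only if incomplete."""
    for session_id, venue, film, start_time in rows:
        existing = db.execute(
            "SELECT start_time FROM showtimes WHERE session_id = ?",
            (session_id,)).fetchone()
        if existing is None or existing[0] is None:   # new or incomplete
            db.execute(
                "INSERT OR REPLACE INTO showtimes VALUES (?, ?, ?, ?)",
                (session_id, venue, film, start_time))

weekly_load([("s1", "Astor", "The Hobbit", None)])     # week 1: incomplete
weekly_load([("s1", "Astor", "The Hobbit", "19:30")])  # week 2: completed
print(db.execute("SELECT * FROM showtimes").fetchall())
```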
19. BEYOND BINARIES
Above | Below
Theoretical | Empirical
Big | Small
Fast | Slow
Uniform | Heterogeneous
Distant | Close
Global | Local
Generated | Recuperated
Now | Then
Abstract | Concrete
20. BIG DATA AND NCH
Big Data changes the past by changing our approach to it. Working with Big Data invites researchers to reflect on the nature of history itself: how do we deal with passing media, passing technologies and also passing ideas about ‘newness’ itself?
If we understand information systems as inherently theoretical and temporary formations/formulations, then what theoretical and historical questions do they themselves recommend?
21. BIG NEW CINEMA HISTORY
“I am large, I contain multitudes”
– Walt Whitman, Song of Myself
22. THE ORDERING OF THINGS
Reflect on changes in our own temporality as researchers: knowledge is always in process – not a priori nor a posteriori.
To date, digital cinema research has been undertaken through a series of initiatives produced “from below” (Maltby et al). This preference for working with local datasets is consistent with recent methodological breakthroughs in cinema historiography sometimes known as the New Cinema History.
Without exception, the existing datasets that form the empirical basis for digital cinema research have been created at the national or sub-national level. Cinema datasets have been generated for scholarly research projects focussed on (and not limited to): London (Christie), the Netherlands (Dibbets), Ghent (Biltereyst et al), Antwerp (Meers et al), Australia (Maltby et al; Verhoeven et al), Scotland (Hopwood et al) and North Carolina (Robert C. Allen). Each of these datasets was developed independently to solve specific research problems, and they are not technically or semantically compatible. The prospect of interoperating these data collections remains a tantalising but near-impossible challenge, with few options for resourcing an undertaking of this magnitude.
Whilst the proliferation of these digital case studies has produced a great deal of methodological innovation in Cinema Studies, this disjointed approach has also resulted in a significant deficit in our understanding of the global nature of the cinema. These distributed computational platforms are not yet capable of addressing the global, elastic, and networked nature of the contemporary international film industry currently producing and exploiting huge quantities and varieties of data.
- See more at: http://kinomatics.com/about/what-is-kinomatics/
Through its use and development of digital research techniques the project will also open up broader questions: How might the opportunities presented by an unprecedented proliferation of networked data, for example, also challenge the unspoken assumptions and ordinary practices of conventional film studies research? And how might the ‘computational turn’ present opportunities (and challenges) for a New Cinema History at the intersection of qualitative historiographies (focused on the social experience of the cinema) and quantitative research approaches such as data mining, empirical analysis and digital visualisations?
How many cars stacked to the moon? (220,000 km – almost all the way to the moon.)
How many soccer balls in a swimming pool? (We would fill 452 Olympic-sized swimming pools.)
All the research described here relies on our showtimes data.
How many cars to the moon? (Around 785 round trips to the moon.)
How many soccer balls in a swimming pool? (We would fill 452 Olympic-sized swimming pools.)
Imperial anxieties – field setting, territorial, anxieties of occupation and ownership (control).
Volume: spatial anxiety.
Velocity: temporal anxiety.
Other anxieties – hydrophobia, technophobia and so on.
Decoherence – an interaction with the wider environment that wipes out quantum behaviour; a loss of information from a system into the environment.
Entanglements are generated between the system and its environment. Decoherence provides an explanation for the transition of the system to a mixture of states that seem to correspond to the states observers perceive.
HOBBIT viz – does the word ‘flow’ have any meaning in the world of ‘select all’?
Scale – country and city level for the moment. Complex levels of analysis: combining different data sets is a huge problem at a global scale (e.g. there is no global database of cities). We use country information when city information is not available (use cinema cities).
Working with country boundaries – where they are administrative but not necessarily practically lived. Accounting for border effects, for example.
How can we use the detailed data to full advantage (i.e. not just to generalize)? We cannot combine information at a suburb level, so our analysis will always be multi-scaled.
Data describes a legacy (yet persistent) media industry.
Show me the history? How does DICIS relate to HOMER? Richard Maltby’s remarks distinguish historical effort from contemporary studies.
The 5th V of big data – volume, velocity, variety, veracity and also vantage
Every new field also has a theory about the past and the present, of change, of what was and what is, a notion of time: a theory of history.
New Cinema History already contains an historiographic inclination in its title: the use of ‘new’ implies a notion of a past.
Franco Moretti got it partly right. Changes are occurring at scale but …
Big new cinema history is multi-scaled – not just about the long versus the short of it.
Through its focus on multiple scales, velocities and locations of cultural diffusion, Kinomatics addresses the global, elastic and networked nature of the contemporary international film industry that is itself producing and exploiting huge quantities and varieties of data.
Research itself is always in beta mode—what’s important is not the end result of a purely scholarly exercise but the iterative, multimodal, recursive, and co-created aspects of engagement.