The amount of data in the world is growing exponentially, with unstructured data making up over 80% of collected data by 2020. Apache Drill provides data agility for Hadoop by enabling self-service data exploration through a flexible data model and schema discovery. Drill allows business users to rapidly query diverse data sources such as files, HBase tables, and Hive through a simple SQL interface, without requiring IT involvement.
Generic presentation about Big Data Architecture/Components. This presentation was delivered by David Pilato and Tugdual Grall during JUG Summer Camp 2015 in La Rochelle, France
MapR-DB is an enterprise-grade, high-performance, in-Hadoop NoSQL ("Not Only SQL") database management system. It adds real-time, operational analytics capabilities to Hadoop and now natively supports JSON.
MapR M7: Providing an enterprise-quality Apache HBase API (mcsrivas)
Provides an overview of M7, the first unified data platform for tables and files. Takes a deep dive into the MapR architecture, especially containers, and explains how M7 tables integrate with the rest of the MapR architecture, including volumes, management, and Hadoop.
Describes some of the problems with Apache HBase, and how M7 from MapR solves many of these issues.
Design, Scale and Performance of MapR's Distribution for Hadoop (mcsrivas)
Details the first-ever Exabyte-scale system, able to hold a trillion large files. Describes MapR's Distributed NameNode (tm) architecture and how it scales easily and seamlessly. Shows MapReduce performance across a variety of benchmarks, including DFSIO, PigMix, NNBench, TeraSort, and YCSB.
Architectural Overview of MapR's Apache Hadoop Distribution (mcsrivas)
Describes the thinking behind MapR's architecture. MapR's Hadoop achieves better reliability on commodity hardware than anything else on the planet, including custom, proprietary hardware from other vendors. Apache HDFS and Cassandra replication are also discussed, as are SAN and NAS storage systems from vendors such as NetApp and EMC.
http://bit.ly/1BTaXZP – Hadoop has been a huge success in the data world. It’s disrupted decades of data management practices and technologies by introducing a massively parallel processing framework. The community and the development of all the Open Source components pushed Hadoop to where it is now.
That's why the Hadoop community is excited about Apache Spark. The Spark software stack includes a core data-processing engine, an interface for interactive querying, Spark Streaming for streaming data analysis, and growing libraries for machine learning and graph analysis. Spark is quickly establishing itself as a leading environment for fast, iterative in-memory and streaming analysis.
This talk will give an introduction to the Spark stack, explain how Spark delivers lightning-fast results, and show how it complements Apache Hadoop.
Keys Botzum - Senior Principal Technologist with MapR Technologies
Keys is Senior Principal Technologist with MapR Technologies, where he wears many hats. His primary responsibility is interacting with customers in the field, but he also teaches classes, contributes to documentation, and works with engineering teams. He has over 15 years of experience in large scale distributed system design. Previously, he was a Senior Technical Staff Member with IBM, and a respected author of many articles on the WebSphere Application Server as well as a book.
Scale 12x: Efficient Multi-tenant Hadoop 2 Workloads with YARN (David Kaiser)
Hadoop is about so much more than batch processing. With the recent release of Hadoop 2, there have been significant changes to how a Hadoop cluster uses resources. YARN, the new resource management component, allows for a more efficient mix of workloads across hardware resources, and enables new applications and new processing paradigms such as stream-processing. This talk will discuss the new design and components of Hadoop 2, and examples of Modern Data Architectures that leverage Hadoop for maximum business efficiency.
John Sing's Edge 2013 presentation, detailing when, where, and how external storage products and/or system software (e.g., GPFS) can be used effectively in a Hadoop storage environment. Many Hadoop situations absolutely require direct-attached storage; however, there are many situations where shared external storage may make sense. This presentation details how, why, and where, and promotes taking an intelligent, Hadoop-aware approach to deciding between internal storage and external shared storage. Full awareness of Hadoop considerations is essential to selecting either option in a Hadoop environment.
Self-Service BI for big data applications using Apache Drill (Big Data Amster...) (Dataconomy Media)
Modern big data applications such as social, mobile, web, and IoT deal with a larger number of users and a larger amount of data than traditional transactional applications. The datasets associated with these applications evolve rapidly, are often self-describing, and can include complex types such as JSON and Parquet. In this demo we will show how Apache Drill can be used to provide low-latency queries natively on rapidly evolving multi-structured datasets at scale.
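The "schema-on-read" idea behind such queries can be sketched in plain Python: records are self-describing JSON, and a query discovers fields at read time instead of from a predeclared schema. This is a conceptual sketch, not Drill's API; the sample records and field names are invented for illustration.

```python
import json

# Self-describing records: no schema declared up front, and the
# fields may differ from record to record (schema evolution).
raw = """
{"name": "alice", "clicks": 10, "device": {"os": "ios"}}
{"name": "bob", "clicks": 25}
{"name": "carol", "clicks": 7, "device": {"os": "android"}}
"""

records = [json.loads(line) for line in raw.strip().splitlines()]

# Roughly "SELECT name, device.os FROM records WHERE clicks > 8":
# nested fields are navigated at read time; missing ones yield None.
result = [
    (r["name"], r.get("device", {}).get("os"))
    for r in records
    if r["clicks"] > 8
]

print(result)  # [('alice', 'ios'), ('bob', None)]
```

A real engine like Drill does the same field discovery, but in a distributed, columnar execution engine over files, HBase, Hive, and other stores.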
Transform Unstructured Data Into Relevant Data with IBM StoredIQPerficient, Inc.
Recent studies indicate more than 90% of the world's data was created in the last 2 years, and organizational data is rising 20-50% year-over-year. As the amount of structured and unstructured data dramatically expands, the expense of maintaining data stores is outpacing the reduction in storage costs.
To help your organization address the issues associated with growing data capacity requirements, IBM offers StoredIQ, a leading unstructured data management and intelligent eDiscovery solution.
Learn about:
Data governance challenges
Information lifecycle governance implementation options
The benefits of early action when uncovering redundant, obsolete and trivial (ROT) content
Typical industry use cases
You'll also learn how to most effectively deploy IBM StoredIQ to be able to:
Analyze data sources in-place
Identify ROT content
Uncover personally identifiable information and sensitive data
Limit your compliance risks and reduce storage costs
Hands On with the Unity 5 Game Engine! - Andy Touch - Codemotion Roma 2015 (Codemotion)
Codemotion Roma 2015 - Unity 5 is here! The latest version of the industry-standard, cross-platform game engine brings a whole variety of new features and tools: Physically-Based Rendering, Reflection Probes, Global Illumination, Audio Mixing, Analytics, Game Recording and Social Media Sharing and many more! This talk will be a hands-on, in-editor demonstration of these new features and how they can easily be used to create beautiful and performant 3D and 2D games!
Webinar: Selecting the Right SQL-on-Hadoop Solution (MapR Technologies)
In the crowded SQL-on-Hadoop market, choosing the right solution for your business can be difficult. In this webinar, learn firsthand from Rick van der Lans, independent analyst and managing director of R20/Consultancy, how to sort through this market complexity and what tough questions to ask when evaluating prospective SQL-on-Hadoop solutions.
Hadoop and the Future of SQL: Using BI Tools with Big Data (Senturus)
Hadoop is changing how businesses operate; learn about this emerging technology stack. View the webinar video recording and download this deck: http://www.senturus.com/resource-video/hadoop-future-sql/?rId=3410.
Learn the role SQL queries play for big data, and how SQL-on-Hadoop technologies enable organizations to leverage their existing SQL skills and investments in business intelligence (BI) tools to dramatically improve: 1) Recommendation engines for online retail, 2) Transactional fraud prevention for financial services, 3) Customized advertising and 4) Predictive failure analytics for manufacturing.
Senturus, a business analytics consulting firm, has a resource library with hundreds of free recorded webinars, trainings, demos and unbiased product reviews. Take a look and share them with your colleagues and friends: http://www.senturus.com/resources/.
http://bit.ly/1EUxliI - Drill provides the agility, flexibility and the familiarity required for users to derive timely insights from big data and to build the next generation big data applications.
From the Hadoop Summit 2015 Session with Tomer Shiran.
To deliver real-time impact from big data, organizations must evolve beyond traditional analytic approaches to support a new class of agile, distributed applications. Real-time Hadoop overcomes the limitations of batch programs that rely on data transformations and schema management. This session highlights how leading organizations are leveraging Hadoop and NoSQL to merge analytics with production data, making adjustments while business is happening to optimize revenue, mitigate risk, and reduce operational costs. Details include how companies have achieved real-time impact on their business, collapsed data silos, and automated in-line analytics with operational data for immediate impact.
It’s no longer a world of just relational databases. Companies are increasingly adopting specialized datastores such as Hadoop, HBase, MongoDB, Elasticsearch, Solr and S3. Apache Drill, an open source, in-memory, columnar SQL execution engine, enables interactive SQL queries against more datastores.
Hadoop 2.0 - Solving the Data Quality Challenge (Inside Analysis)
The Briefing Room with Dr. Claudia Imhoff and RedPoint Global
Live Webcast on July 22, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=7bb4cbc33402c3b5f649343052cb9a6d
Whether data is big or small, quality remains the critical characteristic. While traditional approaches to cleansing data have made strides, nonetheless, data quality remains a serious hurdle for all organizations. This is especially true for identity resolution in customer data, but also for a range of other data sets, including social, supply chain, financial and other domains. One of the most promising approaches for solving this decades-old challenge incorporates the power of massive parallel processing, a la Hadoop.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Claudia Imhoff, who will explain how Hadoop 2.0 and its YARN architecture can make a serious impact on the previously intractable problem of data quality. She’ll be briefed by George Corugedo of RedPoint Global, who will show how his company’s platform can serve as a super-charged marshaling area for accessing, cleansing and delivering high-quality data. He’ll explain how RedPoint was one of the first applications to be certified for running on YARN, which is the latest rendition of the now-ubiquitous Hadoop.
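The identity-resolution problem described above can be illustrated at miniature scale in plain Python: fuzzy name matching plus agreement on a hard attribute decides whether two records refer to the same customer. This is a sketch of the general technique, not RedPoint's product; the records, threshold, and matching rule are invented for illustration.

```python
from difflib import SequenceMatcher

# Toy customer records with name variants: the identity-resolution
# problem at miniature scale. All data is invented.
customers = [
    {"id": 1, "name": "Jonathan Smith", "zip": "02139"},
    {"id": 2, "name": "Jon Smith",      "zip": "02139"},
    {"id": 3, "name": "Maria Garcia",   "zip": "94103"},
]

def similarity(a, b):
    """Normalized edit-based similarity between two strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match(r1, r2, threshold=0.7):
    """Treat two records as the same person if the names are close
    and a hard attribute (zip code) agrees."""
    return (r1["zip"] == r2["zip"]
            and similarity(r1["name"], r2["name"]) >= threshold)

# Compare all pairs; at scale, this pairwise step is what gets
# parallelized across a Hadoop/YARN cluster.
pairs = [
    (a["id"], b["id"])
    for i, a in enumerate(customers)
    for b in customers[i + 1:]
    if match(a, b)
]
print(pairs)  # [(1, 2)]
```

The naive all-pairs comparison is quadratic, which is exactly why massive parallel processing is attractive for data quality at scale.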
Visit InsideAnalysis.com for more information.
Hadoop and NoSQL joining forces, by Dale Kim of MapR (Data Con LA)
More and more organizations are turning to Hadoop and NoSQL to manage big data. In fact, many IT professionals consider each of those terms to be synonymous with big data. At the same time, these two technologies are seen as different beasts that handle different challenges. That means they are often deployed in a rather disjointed way, even when intended to solve the same overarching business problem. The emerging trend of “in-Hadoop databases” promises to narrow the deployment gap between them and enable new enterprise applications. In this talk, Dale will describe that integrated architecture and how customers have deployed it to benefit both the technical and the business teams.
The Value of the Modern Data Architecture with Apache Hadoop and Teradata (Hortonworks)
This webinar discusses why Apache Hadoop is most typically the technology underpinning "Big Data", how it fits into a modern data architecture, and the current landscape of databases and data warehouses already in use.
August Pittsburgh Hadoop User Group meetup ( http://www.meetup.com/HUG-Pittsburgh/events/195143712/ ), where we discuss Apache Drill and how it can provide agility, flexibility, and speed for both structured and unstructured data analytics, on Hadoop and otherwise.
Join our experts Neeraja Rentachintala, Sr. Director of Product Management and Aman Sinha, Lead Software Engineer and host Sameer Nori in a discussion about putting Apache Drill into production.
Similar to The Future of Hadoop: MapR VP of Product Management, Tomer Shiran (20)
How Data-Driven Approaches are Changing Your Data Management Strategies
Introducing data-driven strategies into your business model alters the way your organization manages and provides information to your customers, partners and employees. Gone are the days of “waterfall” implementation strategies from relational data to applications within a data center. Now, data-driven business models require agile implementation of applications based on information from all across an organization–on-premises, cloud, and mobile–and includes information from outside corporate walls from partners, third-party vendors, and customers. Data management strategies need to be ready to meet these challenges or your new and disruptive business models will fail at the most critical time: when your customers want to access it.
ML Workshop 2: Machine Learning Model Comparison & Evaluation (MapR Technologies)
How Rendezvous Architecture Improves Evaluation in the Real World
In this edition of our machine learning logistics webinar series, we build on the key requirements for effective management of machine learning logistics presented in the Overview webinar and in the Part I Workshop. Here we focus on model-to-model comparison and evaluation, the use of decoy models, and more. Listen here: http://info.mapr.com/machine-learning-workshop2.html?_ga=2.35695522.324200644.1511891424-416597139.1465233415
Self-Service Data Science for Leveraging ML & AI on All of Your Data (MapR Technologies)
MapR has launched the MapR Data Science Refinery which leverages a scalable data science notebook with native platform access, superior out-of-the-box security, and access to global event streaming and a multi-model NoSQL database.
Enabling Real-Time Business with Change Data Capture (MapR Technologies)
Machine learning (ML) and artificial intelligence (AI) enable intelligent processes that can autonomously make decisions in real-time. The real challenge for effective ML and AI is getting all relevant data to a converged data platform in real-time, where it can be processed using modern technologies and integrated into any downstream systems.
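The core of change data capture is turning successive states of a table into a stream of change events that downstream systems can consume. A minimal sketch in plain Python, with an invented "accounts" table; production CDC tools read the database's transaction log rather than diffing snapshots, but the events they emit look much like these.

```python
def capture_changes(before, after):
    """Diff two snapshots of a table (dicts keyed by primary key)
    into insert/update/delete change events, in key order."""
    events = []
    for key in sorted(before.keys() | after.keys()):
        if key not in before:
            events.append({"op": "insert", "key": key, "new": after[key]})
        elif key not in after:
            events.append({"op": "delete", "key": key, "old": before[key]})
        elif before[key] != after[key]:
            events.append({"op": "update", "key": key,
                           "old": before[key], "new": after[key]})
    return events

# Two snapshots of a toy "accounts" table (all values invented).
snapshot_1 = {101: {"balance": 50}, 102: {"balance": 75}}
snapshot_2 = {101: {"balance": 40}, 103: {"balance": 10}}

events = capture_changes(snapshot_1, snapshot_2)
for e in events:
    print(e)
```

Each event carries enough context (operation, key, old/new values) for a downstream consumer, such as an ML feature pipeline, to stay in sync in real time.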
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ... (MapR Technologies)
Big data technologies are being applied to a wide variety of use cases. We will review tangible examples of machine learning, discuss an autonomous driving project and illustrate the role of MapR in next generation initiatives. More: http://info.mapr.com/WB_Machine-Learning-for-Chickens_Global_DG_17.11.02_RegistrationPage.html
ML Workshop 1: A New Architecture for Machine Learning Logistics (MapR Technologies)
Having heard the high-level rationale for the rendezvous architecture in the introduction to this series, we will now dig in deeper to talk about how and why the pieces fit together. In terms of components, we will cover why streams work, why they need to be persistent, performant and pervasive in a microservices design and how they provide isolation between components. From there, we will talk about some of the details of the implementation of a rendezvous architecture including discussion of when the architecture is applicable, key components of message content and how failures and upgrades are handled. We will touch on the monitoring requirements for a rendezvous system but will save the analysis of the recorded data for later. Listen to the webinar on demand: https://mapr.com/resources/webinars/machine-learning-workshop-1/
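The rendezvous idea can be sketched in a few lines of Python: every model scores every request, the answers land on a shared stream, and a rendezvous step returns the trusted model's answer while retaining the rest for comparison. This is a single-process toy, with in-memory queues standing in for the persistent, performant streams the webinar describes; the model functions are invented.

```python
import queue

# A stream modeled as an in-process queue; in the real architecture
# this would be a persistent message stream shared by microservices.
scores_stream = queue.Queue()

def primary_model(x):
    return x * 2          # the model currently trusted in production

def challenger_model(x):
    return x * 2 + 1      # a candidate being evaluated side by side

def handle_request(request_id, x):
    # Every model scores every request; all results land on one stream.
    for name, model in [("primary", primary_model),
                        ("challenger", challenger_model)]:
        scores_stream.put({"request": request_id, "model": name,
                           "score": model(x)})

def rendezvous(request_id, expected=2):
    """Collect all model answers for a request, return the primary's
    answer, and keep the rest for offline model-to-model comparison."""
    answers = [scores_stream.get() for _ in range(expected)]
    chosen = next(a for a in answers if a["model"] == "primary")
    return chosen["score"], answers

handle_request("r1", x=10)
result, recorded = rendezvous("r1")
print(result)          # 20: the primary model's answer is returned
print(len(recorded))   # 2: both answers are retained for evaluation
```

The isolation property falls out naturally: swapping or adding a challenger model changes nothing for the caller, because only the rendezvous step decides which answer is returned.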
Machine Learning Success: The Key to Easier Model Management (MapR Technologies)
Join Ellen Friedman, co-author (with Ted Dunning) of a new short O’Reilly book Machine Learning Logistics: Model Management in the Real World, to look at what you can do to have effective model management, including the role of stream-first architecture, containers, a microservices approach and a DataOps style of work. Ellen will provide a basic explanation of a new architecture that not only leverages stream transport but also makes use of canary models and decoy models for accurate model evaluation and for efficient and rapid deployment of new models in production.
Data Warehouse Modernization: Accelerating Time-To-Action (MapR Technologies)
Data warehouses have been the standard tool for analyzing data created by business operations. In recent years, increasing data volumes, new types of data formats, and emerging analytics technologies such as machine learning have given rise to modern data lakes. Connecting application databases, data warehouses, and data lakes using real-time data pipelines can significantly improve the time to action for business decisions. More: http://info.mapr.com/WB_MapR-StreamSets-Data-Warehouse-Modernization_Global_DG_17.08.16_RegistrationPage.html
Live Tutorial – Streaming Real-Time Events Using Apache APIs (MapR Technologies)
For this talk, we will explore the power of streaming real-time events in the context of the IoT and smart cities.
http://info.mapr.com/WB_Streaming-Real-Time-Events_Global_DG_17.08.02_RegistrationPage.html
Bringing Structure, Scalability, and Services to Cloud-Scale Storage (MapR Technologies)
Deploying storage with a forklift is so 1990s, right? Today’s applications and infrastructure demand systems and services that scale. Customers require performance and capacity to fit the use case and workloads, not the other way around. Architects need multi-temperature, multi-location, highly available, and compliance friendly platforms that grow with the generational shift in data growth and utility.
Churn prediction is big business. It minimizes customer defection by predicting which customers are likely to cancel a service. Though originally used within the telecommunications industry, it has become common practice for banks, ISPs, insurance firms, and other verticals. More: http://info.mapr.com/WB_PredictingChurn_Global_DG_17.06.15_RegistrationPage.html
The prediction process is data-driven and often uses advanced machine learning techniques. In this webinar, we'll look at customer data, do some preliminary analysis, and generate churn prediction models – all with Spark machine learning (ML) and a Zeppelin notebook.
The goal of Spark’s ML library is to make machine learning scalable and easy. Zeppelin with Spark provides a web-based notebook that enables interactive machine learning and visualization.
In this tutorial, we'll do the following:
Review classification and decision trees
Use Spark DataFrames with Spark ML pipelines
Predict customer churn with Apache Spark ML decision trees
Use Zeppelin to run Spark commands and visualize the results
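The steps above use Spark ML at scale; the underlying idea of a decision tree can be shown at miniature scale in plain Python as a one-split tree (a "stump") learned on a single feature. The customer data, feature, and labels here are invented for illustration, not taken from the tutorial.

```python
# Miniature of the churn-prediction idea: learn a one-split decision
# tree on a single feature. All data is invented; the tutorial itself
# does this at scale with Spark ML DataFrames and pipelines.
customers = [
    # (monthly support calls, churned?)
    (1, False), (0, False), (2, False), (5, True), (7, True), (6, True),
]

def best_threshold(data):
    """Try every candidate split point and keep the one that makes
    the fewest training errors when predicting churn for calls >= t."""
    best = None
    for t in sorted({calls for calls, _ in data}):
        errors = sum((calls >= t) != churned for calls, churned in data)
        if best is None or errors < best[1]:
            best = (t, errors)
    return best

threshold, errors = best_threshold(customers)
predict = lambda calls: calls >= threshold

print(threshold, errors)  # the learned split and its training errors
print(predict(8), predict(1))
```

A full decision tree repeats this split search recursively over many features, and Spark ML parallelizes the search across the cluster; the greedy "pick the best split" step is the same.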
An Introduction to the MapR Converged Data Platform (MapR Technologies)
Listen to the webinar on-demand: http://info.mapr.com/WB_Partner_CDP_Intro_EMEA_DG_17.05.31_RegistrationPage.html
In this 90-minute webinar, we discuss:
- The MapR Converged Data Platform and its components
- Use cases for the Converged Data Platform
- MapR Converged Partner Program
- How to get started with MapR
- Becoming a partner
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon... (MapR Technologies)
IT budgets are shrinking, and the move to next-generation technologies is upon us. The cloud is an option for nearly every company, but just because it is an option doesn’t mean it is always the right solution for every problem.
Most cloud providers would prefer that every customer be tightly coupled with their proprietary services and APIs to create lock-in with that cloud provider. The savvy customer will leverage the cloud as infrastructure and stay loosely bound to a cloud provider. This creates an opportunity for the customer to execute a multicloud strategy or even a hybrid on-premises and cloud solution.
Jim Scott explores different use cases that may be best run in the cloud versus on-premises, points out opportunities to optimize cost and operational benefits, and explains how to get the data moved between locations. Along the way, Jim discusses security, backups, event streaming, databases, replication, and snapshots across a variety of use cases that run most businesses today.
Is your organization at the analytics crossroads? Have you made strides collecting and sharing massive amounts of data from electronic health records, insurance claims, and health information exchanges but found these efforts made little impact on efficiency, patient outcomes, or costs?
Changes in how business is done combined with multiple technology drivers make geo-distributed data increasingly important for enterprises. These changes are causing serious disruption across a wide range of industries, including healthcare, manufacturing, automotive, telecommunications, and entertainment. Technical challenges arise with these disruptions, but the good news is there are now innovative solutions to address these problems. http://info.mapr.com/WB_Geo-distributed-Big-Data-and-Analytics_Global_DG_17.05.16_RegistrationPage.html
MapR announced a few new releases in 2017, and we want to go over those exciting new products and features that are available now. We’d like to invite our customers and partners to this webinar in which members of the MapR product team will share details about the latest updates.
3 Benefits of Multi-Temperature Data Management for Data Analytics (MapR Technologies)
SAP® HANA and SAP® IQ are popular platforms for various analytical and transactional use cases. If you’re an SAP customer, you’ve experienced the benefits of deploying these solutions. However, as data volumes grow, you’re likely asking yourself: How do I scale storage to support these applications? How can I have one platform for various applications and use cases?
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments (MapR Technologies)
SAP HANA is an increasingly popular platform for various analytical and transactional use cases, thanks to its in-memory architecture. If you’re an SAP customer, you’ve experienced the benefits.
However, the underlying storage for SAP HANA is painfully expensive. This slows down your ability to grow your SAP HANA footprint and serve up more applications.
You’re not the only one still loading your data into data warehouses and building marts or cubes out of it. But today’s data requires a much more accessible environment that delivers real-time results. Prepare for this transformation because your data platform and storage choices are about to undergo a re-platforming that happens once in 30 years.
With the MapR Converged Data Platform (CDP) and Cisco Unified Compute System (UCS), you can optimize today’s infrastructure and grow to take advantage of what’s next. Uncover the range of possibilities from re-platforming by intimately understanding your options for density, performance, functionality and more.
Epistemic Interaction - tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
GraphRAG is All You Need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
UiPath Test Automation using UiPath Test Suite series, part 3 (DianaGray10)
Welcome to part 3 of the UiPath Test Automation using UiPath Test Suite series. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation introduction
UI automation sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...) (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Have someone introduce me.
Thank audience (tie to morning activities), sponsors, HP, etc.
We’re here because this is the biggest thing that has happened to Hadoop…
Here at the conference we’re talking about data science. But before we can appreciate the changes happening in data science, we must first talk about data. Data is doubling every two years. The fast-growing volume, variety and velocity of data is overwhelming traditional systems and approaches. A revolutionary approach is required to leverage this data. And with this new technology, data science as we know it is undergoing tremendous change.
To give you a sense of the data volumes that we’re talking about, I’ve included this chart that shows why a revolutionary approach is needed. You can see the amount of data growing from 1.8 zettabytes in 2011 to a projected 44 zettabytes by 2020. To put this into perspective, a large data warehouse contains terabytes of data; a zettabyte is 1 billion terabytes.
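As a back-of-envelope check (a hypothetical Python sketch, assuming the roughly 2011-to-2020 window of the IDC reports cited just below), the implied growth rate can be computed directly:

```python
# Back-of-envelope check of the data-growth figures on this slide.
# 1 zettabyte (ZB) = 1e9 terabytes (TB), so even a large multi-terabyte
# data warehouse is a vanishingly small fraction of a single zettabyte.

ZB_IN_TB = 1_000_000_000  # 1 ZB = 1 billion TB

start_zb, end_zb = 1.8, 44.0  # figures from the IDC reports
years = 9                     # roughly 2011 -> 2020

# Implied compound annual growth rate
cagr = (end_zb / start_zb) ** (1 / years) - 1

print(f"1.8 ZB = {start_zb * ZB_IN_TB:,.0f} TB")
print(f"Implied growth rate: {cagr:.0%} per year")
```

A growth rate in the low forties of percent per year corresponds to data doubling roughly every two years, which is consistent with the claim on the previous slide.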
Numbers in the chart are from two IDC reports (sponsored by EMC):
http://www.emc.com/collateral/about/news/idc-emc-digital-universe-2011-infographic.pdf
http://www.emc.com/leadership/digital-universe/2014iview/executive-summary.htm
What is the source of this data growth? While structured data growth has been relatively modest, the growth in unstructured data has been exponential.
Source of statistic: http://link.springer.com/chapter/10.1007/978-3-642-39146-0_2
sensor data, social media, clickstream, genomic data, location information, video files, etc.
The system that is enabling this growth in data capture is Hadoop.
We are proud/fortunate that Forrester has named MapR as the best Hadoop distribution in the market.
Many organizations now want to unlock the data in Hadoop and make it accessible to a broader audience within their organizations. That’s easier said than done. While we’ve largely solved the infrastructure scalability challenge, the massive volume, variety and velocity of this data introduces serious challenges on the human side, such as how to prepare all that data and make it available to users, how to make operational data available in real-time for analytics, etc. We need better technology to empower users to take advantage of these massive volumes of data.
Past: Enable organizations to capture the data.
Future: Enable organizations to more easily extract value from all this captured data.
What does the future of Hadoop look like?
The problem
I’m sure many of you have experienced this (just like the quotes)
Why we want to solve it
Here’s what we’re doing about it
One of the challenges with Hadoop as well as traditional data management tools is the business user’s “distance from the data”.
The dependency on IT (or additional development) increases time to value and reduces agility. It also creates a burden on IT at a time when IT is already overworked. The red arrows in this illustration can represent significant backlogs and delays (often many months).
Many of you are likely having to spend a lot of time on plumbing development and data preparation. How many of you have had to do this? (show of hands)
“Data modeling and transformations” may seem easy, but when you look at a real-world environment, you could have thousands of data sets.
Opportunity
This is the opportunity.
The audience should feel like this is their chance to become heroes by bringing this to their companies.
They have to feel (be emotional) about the problem at this point.
IT-driven = months of delay, unnecessary work (data is no longer relevant, etc.)
The so-what needs to be conveyed: why does it matter that IT involvement is no longer needed?
6 months -> 3 months -> 3 months -> day zero
So imagine now what you can get…
Data Agility is needed for Business Agility
>>> Stand still during slide, move in at the punchline (why does this matter to YOU)
Need an example or analogy to explain self-describing data.
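One possible example (a minimal Python sketch with made-up field names): the same values as a bare CSV row, which needs an external schema, versus a self-describing JSON record, which carries its own field names.

```python
import json

# A CSV row is NOT self-describing: without a separate schema you
# cannot tell what the three values mean.
csv_row = "123,gold,2014-06-03"

# A JSON record IS self-describing: field names and nesting travel
# with the data, so an engine can discover the schema while reading.
json_record = '{"user_id": 123, "status": "gold", "since": "2014-06-03"}'

record = json.loads(json_record)
print(sorted(record.keys()))  # the field names come from the data itself
```

The analogy for the slide: CSV is a locker that needs a key kept somewhere else; JSON is a labeled box.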
All SQL engines (traditional or SQL-on-Hadoop) view tables as spreadsheet-like data structures with rows and columns: all records have the same structure, and there is no support for nested data or repeating fields. Drill instead views tables conceptually as collections of JSON documents (with additional data types). Each record can have a different structure (hence, schema-less). This is revolutionary and has never been done before.
If you consider the four data models shown in the 2x2, all of them can be represented by the complex, no-schema model (JSON) because it is the most flexible. However, no other data model can be represented by the flat, fixed-schema model. Therefore, with any SQL engine except Drill, the data has to be transformed before it can be queried.
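To make the 2x2 concrete, here is a minimal sketch (Python, with made-up records) of a collection that the complex, no-schema model handles directly but a flat, fixed-schema engine cannot accept without a prior transform:

```python
# Three records from the same "table". A flat, fixed-schema engine cannot
# take this collection as-is: the record structures differ, one field is
# nested, and one is a repeating list.
records = [
    {"name": "alice", "age": 34},
    {"name": "bob", "address": {"city": "Paris", "zip": "75001"}},  # nested
    {"name": "carol", "phones": ["555-0100", "555-0199"]},          # repeating
]

# A document-model engine like Drill can treat each record on its own
# terms; a classic SQL engine would force a transform (flatten and unify
# the columns) before the data could be queried at all.
all_fields = sorted({field for rec in records for field in rec})
print(all_fields)  # ['address', 'age', 'name', 'phones']
```

The point for the audience: with a fixed schema, every one of these shape differences is an up-front ETL task; with the document model it is just data.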
TODO: Add Impala and Splunk logos
What I want you to see now is how easy it is to ….
Is there something from Israel?
With other technologies you have to do this, then this, then this, …
Key takeaways
Core message – We are revolutionizing Hadoop
Call to action – get involved, and enjoy the conference as we have great speakers
If doing Q&A, set boundaries (time - how much time we have, topic – what questions can I answer about this revolution), back pocket question (someone asked me this morning)