The Data Interoperability extension is an optional ArcGIS extension based on Safe Software's FME technology. It supports reading over 100 GIS, CAD and database formats.
4. Format Support
• Over 100 supported formats
− Read only
• Directly accessible in ArcCatalog
• Allows direct analysis, mapping and visualisation
5. Data Translation
• Data Interoperability toolbox
− Quick Import
• To personal or file geodatabase
− Quick Export
• To over 75 formats
• Geoprocessing integration
The Data Interoperability extension is an optional extension to ArcGIS that adds functionality for reading over 100 GIS, CAD and database formats. The extension is based on Safe Software's FME technology and includes the FME Workbench application. It allows data to be read directly from the source and also includes Quick Import and Quick Export tools. Within the Workbench application you can build complex spatial ETL tools. The extension is fully integrated with standard geoprocessing tools and ModelBuilder.
Work directly with more than 100 data formats without the need to convert between them: GML, XML, WFS, Autodesk DWG/DXF, MicroStation Design, MapInfo MID/MIF and TAB, Oracle and Oracle Spatial, and Intergraph GeoMedia Warehouse. Once the extension is activated you can see supported data in ArcCatalog, and you can use it the same way you would use Esri datasets, for viewing, analysis and mapping.
The extension installs a Data Interoperability toolbox in ArcToolbox that includes the Quick Import and Quick Export tools. The Quick Import tool loads data directly from the source into a personal or file geodatabase, which is very handy if you don't require any data manipulation. The Quick Export tool can export to over 75 industry formats. Both tools can be run standalone or as part of a geoprocessing model or script tool. Data Interoperability is fully integrated with geoprocessing, meaning you can keep your source data in its industry format and simply use it as input to any geoprocessing tool or in ModelBuilder. This model uses a MapInfo MIF file as input to a buffer operation; an Esri shapefile is generated from the buffer, which is then output to three different formats: GML, file geodatabase and GeoMedia.
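Because Quick Import and Quick Export are ordinary geoprocessing tools, the same flow can also be scripted in Python. The sketch below is illustrative only: the paths and the "FORMAT,file" input strings are assumptions, and the available format keywords depend on the readers and writers installed with your FME engine.

import arcpy

# The Quick Import/Export tools require the Data Interoperability licence.
arcpy.CheckOutExtension("DataInteroperability")

# Load a MapInfo MIF file straight into a file geodatabase (no manipulation).
arcpy.QuickImport_interop("MIF,C:/data/properties.mif", "C:/work/staging.gdb")

# The imported feature class behaves like any other geoprocessing input.
arcpy.Buffer_analysis("C:/work/staging.gdb/properties",
                      "C:/work/staging.gdb/properties_buffer", "25 Meters")

# Push the result back out to another industry format, e.g. GML.
arcpy.QuickExport_interop("C:/work/staging.gdb/properties_buffer",
                          "GML,C:/out/properties_buffer.gml")

arcpy.CheckInExtension("DataInteroperability")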
Demo: Look at Esri data in ArcCatalog. Enable the Data Interoperability extension and refresh; you can now see the new data formats, and each looks like any other feature class. It can be added to a map and its attributes inspected. Look at the Interoperability Connections: not all data is immediately visible; double-click to add it. Select the format and you can see the data, and once again all standard map functions are supported. The data can be used within ArcMap: symbolise, select, buffer.
Extract the data from a source system, transform the data into the format and data model required by the target system, and load the data into the target system. Spatial ETL simply means you can extract, transform and load spatial data. For example: enterprise organizations with multiple database GIS technologies.
Safe Software's FME Workbench application is included with the ArcGIS Data Interoperability extension. It provides a visual diagramming environment with more than 240 transformers that enable you to transform both geographic and attribute information. Attribute transformers include things like joins, renaming or even creating attributes; geometric transformers include building geometry (lines from points, polygons from lines), filtering by geometry, setting coordinates, and so on.
Give an overview of Workbench. Open an existing ETL tool. Show the Navigator pane: sources, transformers, bookmarks. Show the Transformer Gallery: categories, search, help, favourites. The workspace canvas reads from left to right: readers, transformers, writers. Readers and writers can be added from the File menu.
Some commonly used transformers: To select a subset of your data you can use the AttributeFilter (for example, select all records where the value is more than 25,000) or the GeometryFilter (popular with CAD) to split data into point, line, polygon and annotation features. For translation there is the AttributeValueMapper, used when you want to change attribute values from 1, 2, 3 to a, b, c. To construct geometries there is the AreaBuilder. To manipulate attributes there are joins and the NeighborFinder, which can, for instance, find the annotation closest to a feature and assign it as an attribute.
Demo: Look at the data in ArcCatalog and familiarise yourself with it; show that properties are drawn as blocks and fence lines. We want properties to be individual polygons, with the property number assigned as an attribute. (If time allows) we also want to join a DBF file of the owners to the properties. Create an ETL tool using the wizard: go through the steps, level names, etc., and the transformers used.
Q: When do I use Workbench rather than the Quick Import and Quick Export tools?
A: Quick Import/Export performs a straight translation without any manipulation. Workbench gives you the ability to manipulate your data.
Q: What is the difference between Workbench and ModelBuilder?
A: They are both graphical authoring environments, but Workbench allows you to manipulate data at the feature level, whereas ModelBuilder manipulates data at the feature class (dataset) level. You can also use a spatial ETL tool within a ModelBuilder model.
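A spatial ETL tool authored in Workbench is stored in a custom toolbox, so it too can be run from a script once the toolbox is imported. A minimal sketch; the toolbox path, alias and tool name below are hypothetical placeholders, not part of the original presentation.

import arcpy

# Spatial ETL tools also need the Data Interoperability licence.
arcpy.CheckOutExtension("DataInteroperability")

# Import the custom toolbox that holds the spatial ETL tool (hypothetical path and alias).
arcpy.ImportToolbox(r"C:\tools\PropertyLoader.tbx", "propetl")

# Run the ETL tool like any other geoprocessing tool (hypothetical tool name and parameters).
arcpy.PropertiesFromCAD_propetl(r"C:\data\properties.dwg",
                                r"C:\work\staging.gdb\properties")

arcpy.CheckInExtension("DataInteroperability")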