Information Consolidation and Concentration (WP4 ForgetIT 1st year review)
ForgetIT Project
Techniques for analysing similarity and redundancy in textual and multimedia data, semantic multimedia analysis for condensation, and information condensation and consolidation.
The Preserve-or-Forget Reference Model and Framework (WP8 ForgetIT 1st year r...
ForgetIT Project
Design of the Preserve-or-Forget framework architecture, definition of the integration approach for all the components developed in the other technical work packages, and definition of a preliminary reference model.
Budapest Data Forum 2017 - BigQuery, Looker And Big Data Analytics At Petabyt...
Rittman Analytics
As big data and data warehousing scale up and move into the cloud, they’re increasingly likely to be delivered as services using distributed cloud query engines such as Google BigQuery, loaded using streaming data pipelines and queried using BI tools such as Looker. In this session the presenter will walk through how data modelling and query processing work when storing petabytes of customer event-level activity in a distributed data store and query engine like BigQuery, how data ingestion and processing work in an always-on streaming data pipeline, how additional services such as Google Natural Language API can be used to classify for sentiment and extract entity nouns from incoming unstructured data, and how BI tools such as Looker and Google Data Studio bring data discovery and business metadata layers to cloud big data analytics.
Reducing large S3 API costs using Alluxio at Datasapiens
Alluxio, Inc.
Alluxio Global Online Meetup
August 4, 2020
For more Alluxio events: https://www.alluxio.io/events/
Speakers:
Koen Michiels, Datasapiens
Juraj Pohanka, Datasapiens
Bin Fan, Alluxio
Datasapiens is an international data-analytics startup based in Prague. We help our clients to uncover the value of their data and open up new revenue streams for them. We provide an end-to-end service that manages the data pipeline and automates the process of generating data insights.
In this talk, we will describe how we have solved an issue with large S3 API costs incurred by Presto under several usage concurrency levels by implementing Alluxio as a data orchestration layer between S3 and Presto. Also, we will show the results of an experiment with estimating the per-query S3 API costs using the TPC-DS dataset.
This talk will focus on:
- The Hadoop ecosystem at Datasapiens
- Drastic increase of S3 API costs during performance tests with Presto
- S3 API costs tests with TPC-DS
- Implications for the cloud data lake architecture
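As a rough illustration of the per-query cost estimate mentioned above, the sketch below multiplies request counts by per-request prices. It is not taken from the talk: the prices, the function names, and the assumption that every cache hit avoids exactly one S3 request are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Pricing:
    # Hypothetical prices in USD per 1,000 requests; check the current
    # price list for your region before relying on these numbers.
    get_per_1000: float = 0.0004
    list_per_1000: float = 0.005


def query_api_cost(get_requests: int, list_requests: int,
                   pricing: S3Pricing = S3Pricing()) -> float:
    """Estimate the S3 API cost (USD) of one query from its request counts."""
    return ((get_requests / 1000) * pricing.get_per_1000
            + (list_requests / 1000) * pricing.list_per_1000)


def cost_with_cache(get_requests: int, list_requests: int, hit_ratio: float,
                    pricing: S3Pricing = S3Pricing()) -> float:
    """Estimate the cost when a caching layer absorbs a share of requests.

    Assumption: each cache hit avoids exactly one S3 request.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_ratio = 1.0 - hit_ratio
    return query_api_cost(round(get_requests * miss_ratio),
                          round(list_requests * miss_ratio),
                          pricing)


if __name__ == "__main__":
    print(query_api_cost(1_000_000, 10_000))
    print(cost_with_cache(1_000_000, 10_000, hit_ratio=0.9))
```

Under these placeholder prices, the estimate scales linearly with the number of requests that actually reach S3, which is why a caching layer between the query engine and the object store can reduce the bill.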
Alluxio Community Office Hour
July 14, 2020
For more Alluxio events: https://www.alluxio.io/events/
Speakers:
Calvin Jia, Alluxio
Bin Fan, Alluxio
Alluxio 2.3 was just released at the end of June 2020. Calvin and Bin will go over the new features and integrations available and share learnings from the community. Any questions about the release and on-going community feature development are welcome.
In this Office Hour, we will go over:
- Glue Under Database integration
- Under Filesystem mount wizard
- Tiered Storage Enhancements
- Concurrent Metadata Sync
- Delegated Journal Backups
Alluxio Use Cases and Future Directions
Alluxio, Inc.
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
Data Orchestration for Analytics and AI in the Cloud Era
Calvin Jia, Founding Engineer (Alluxio)
Bin Fan, Founding Engineer, VP of Open Source (Alluxio)
About Alluxio: alluxio.io
Engage with the open source community on slack: alluxio.io/slack
Accelerate Analytics and ML in the Hybrid Cloud Era
Alluxio, Inc.
Alluxio Webinar
April 6, 2021
For more Alluxio events: https://www.alluxio.io/events/
Speakers:
Alex Ma, Alluxio
Peter Behrakis, Alluxio
Many companies we talk to have on-premises data lakes and use the cloud(s) to burst compute. Many are now establishing new object data lakes as well. As a result, analytics engines such as Hive, Spark, and Presto, as well as machine learning workloads, experience sluggish response times with data and compute in multiple locations. We also know there is an immense and growing data management burden to support these workflows.
In this talk, we will walk through what Alluxio’s Data Orchestration for the hybrid cloud era is and how it solves the performance and data management challenges we see.
In this tech talk, we'll go over:
- What is Alluxio Data Orchestration?
- How does it work?
- Alluxio customer results
Discussion on cloud-based data storage and databases. Presentation given by Zia Babar at the July event of the Waterloo Data Science and Data Engineering meetup.
Data Orchestration for the Hybrid Cloud Era
Alluxio, Inc.
Alluxio Community Office Hour
October 20, 2020
For more Alluxio events: https://www.alluxio.io/events/
Speaker(s):
Alex Ma, Alluxio
Peter Behrakis, Alluxio
Many companies we talk to have on-premises data lakes and use the cloud(s) to burst compute. Many are now establishing new object data lakes as well. As a result, analytics engines such as Hive, Spark, and Presto, as well as machine learning workloads, experience sluggish response times with data and compute in multiple locations. We also know there is an immense and growing data management burden to support these workflows.
In this talk, we will walk through what Alluxio’s Data Orchestration for the hybrid cloud era is and how it solves the performance and data management challenges we see.
In this tech talk, we'll go over:
- What is Alluxio Data Orchestration?
- How does it work?
- Alluxio customer results
PGDay.Amsterdam 2018 - Jeroen de Graaff - Step-by-step implementation of Post...
PGDay.Amsterdam
Rijkswaterstaat is the Service of the Ministry of Infrastructure and Water Management in the Netherlands. During this presentation, I will share our journey to develop and apply PostgreSQL at Rijkswaterstaat. Our work is ICT-driven, and access to our data, both historical and current, is key to executing our task now and in the future.
End-to-end security analytics with the Elastic Stack
Elasticsearch
Do you want to stay ahead of the competition in the constantly evolving world of security solutions? Learn how to build a centralized security analytics platform that meets your requirements for data volume and speed of investigation.
Decoupling Compute and Storage for Data Workloads
Alluxio, Inc.
This was presented by Carlos Quieroz, Head of Data Platform at Development Bank of Singapore, at the Data Transformation in Financial Services meetup in Singapore jointly hosted by Accenture, Talend, BigDataSG Hadoop, and Alluxio.
Build an Open Source Data Lake For Data Scientists
Shawn Zhu
This is a talk I presented at the 2019 ICSA (International Chinese Statistics Association) Applied Statistics Symposium in the session "How Data Science Drives Success in Enterprises".
Digital dark age - Are we doing enough to preserve our website heritage?
Olivier Dobberkau
When creating websites, we often expect them to live for only 3 to 5 years. With every relaunch and overhaul we are confronted with content migration and short-term motives to delete content that may be valuable. On the other hand, what is the value of our content? Can we assess it meaningfully? Do we really know in which context it is used?
Scientists have stated that while we are producing more and more digital artifacts, we fail to see that we are not preserving them in a manner that will enable us to find and use them more than a few years into the future.
This talk will introduce you to the aspects of digital preservation, with a special look at how TYPO3 is preparing to help its users create a digital heritage.
This talk is part of the ForgetIT project, "Concise Preservation by combining Managed Forgetting and Contextualized Remembering". The ForgetIT project is funded by the EC within the 7th Framework Programme under the objective "Digital Preservation" (GA 600826).
Foundations of Forgetting and Remembering (WP2 - ForgetIT 1st year review)
ForgetIT Project
Conceptual foundations of human and organizational remembering and forgetting in order to identify aspects of human memory and forgetting that might be helpful in the design of a digital preservation and managed forgetting system.
ForgetIT – Some store to remember, some store to forget
Søren Schaffstein
With growing storage capacities and sinking storage prices, the paradigm of keeping everything is prevailing. However, keeping information accessible, usable and useful goes far beyond purely keeping things, especially in the long run, and entails expenses much larger than just the storage costs. This issue especially applies to content in Content Management Systems, where we increasingly face the situation of creating, managing and storing (preserving) multimedia content that we might never access again due to the sheer volume of content.
To overcome these issues, we envision the concept of flexible managed forgetting for information that progressively decreases in importance and finally becomes obsolete, as well as for redundant information. We will extend TYPO3 with preservation and forgetting. The forgetting will also reduce the user’s cognitive burden for past activities and information in TYPO3 while still allowing access if needed. Just as our brain retrieves details of our past when remembering and forming associations, the approach will provide comparable means.
Within the Seventh Framework Programme for Research (FP7) of the European Union, the "ForgetIT" project strives to build a solution for the problems mentioned. The project has a scope of 3 years, and TYPO3 has been selected as the CMS to build upon because it is Open Source Software and has an open and active community.
This talk will give an introduction to digital preservation and explain why companies can profit greatly from it. The current status of the research project will be demonstrated.
An overview of the project can be found on the project's website (of course made with TYPO3): http://www.forgetit-project.eu/
Managed Forgetting (WP3 - ForgetIT 1st year review)
ForgetIT Project
Data model and a computation method based on Semantic Web technologies; integration into the PIMO semantic desktop and the Preserve-or-Forget middleware.
Exploratory studies: collective memory analysis of public events in Wikipedia; high-impact feature analysis for content retention in the Social Web; feature selection for efficiency and scalability.
The CMIS standard provides an answer to most issues met by typical content-centric applications by offering a common model and a set of services for ECM interoperability. In this presentation we'll first provide an introduction to the CMIS services and bindings, then we'll offer a view of the landscape of the different ECM providers and clients implementing CMIS, and we'll finish with practical examples of the uses of OpenCMIS, the Apache Chemistry (Java) library, designed to help you easily write CMIS applications.
Presented at ApacheCon 2010 (http://na.apachecon.com/c/acna2010/sessions/591)
Windows Azure - A Platform for Application Development
Comunidade NetPonto
The Windows Azure platform opens the way to developing applications using the new paradigm: "the Cloud". Applications that are scalable, redundant, and closer to the end user. All of this builds on the knowledge you already have and the new Visual Studio 2010.
An overview of the CMIS standard and of the AIIM Demo put together by the iECM committee. This presentation was given by Karin Ondricek of EMC and Laurence Hart of AIIM and Washington Consulting, Inc.
Alfresco Community Edition 3.2
Designed to Improve Compliance, Productivity & Integration
In addition to enabling mobile content management, streamlining email management and supporting open specifications and standards - including CMIS and IMAP - Alfresco Community 3.2 also lays the groundwork for records management support for U.S. Department of Defense (DoD) 5015.2 certification in September 2009.
Reducing costs, improving integration, increasing innovation and supporting regulatory requirements top the list of priorities for IT executives in 2009, according to a recent report issued by Forrester Research*.
Alfresco has responded to each of these industry demands in developing enhancements to Alfresco Community Edition 3.2.
New features in Alfresco Community Edition 3.2 include:
* Enhanced Records Management – Laying the groundwork for supporting Records Management (RM) in readiness for DoD 5015.2 certification in September 2009, Alfresco RM will enable companies to support the strict legal requirements needed to manage vital company information.
Functionality will include:
- metadata management,
- YUI-based forms,
- lifecycle management,
- CMIS-based query access,
- email capture, import/export facility and
- auditing.
* Mobile Access – The Smartphone client provides support for mobile devices (including Apple’s iPhone), providing the first ECM designed to enable mobile collaboration for business processes on the go.
Users can now search, view and edit content, activities and tasks from anywhere. Rather than delivery by desktop application, the client interface is designed for the new Smartphone form factors.
* Email Client Access & Archiving via IMAP Support – Simple ‘drag and drop’ allows users to share key messages with colleagues and team members without contributing to the ever-growing volume of forwarded email.
Alfresco’s unique transparent IMAP standard protocol support provides full access to repository services without a client install and can be accessed from mobile devices.
The virtualized repository support enables users to manage and archive emails within the corporate content repository.
* CMIS/Interoperability Support – Alfresco Community 3.2 offers full support (SOAP Web Services, REST and Query) for version 0.61 of the Content Management Interoperability Services (CMIS) specification, providing the most complete implementation of CMIS to date and allowing CMIS compliant clients and repositories to interoperate and share content across information silos.
* Extranet Collaboration – Alfresco Community 3.2 is scalable to tens of thousands of users, is cloud-ready for EC2 and other cloud service providers and supports content collaboration outside the enterprise or in the cloud.
* WCM Authoring and Deployment – Huge performance increases to improve deployment of web site content to external web sites through highly parallel web site deployment and publishing.
Alfresco Community Edition 3.2 is immediately available for download from http://wiki.alfresco.com/wiki/Download_Community_Edition
The Dispatch Printing Company is a leading regional media company in the USA, anchored by its flagship newspaper The Columbus Dispatch. Its Dispatch Broadcast Group owns and operates two TV stations, the WBNS radio station, the Ohio News Network radio service, and a 24-hour cable news channel.
This session is a case study in migrating OpenCms sites, generating millions of daily page views, from a traditional data center to the Amazon Web Services platform. Through this migration there were many lessons learned about how to successfully use Amazon's cloud service offerings to improve OpenCms scalability and lower total costs to the business. An overview of select Amazon services and how they have been leveraged in a production OpenCms environment will be presented.
We will talk about possible uses for a variety of Amazon services including:
EC2 - Implementation strategy for running OpenCms on Amazon's Elastic Compute Cloud virtual hardware
CloudWatch - Provide detailed visibility into the health of an OpenCms environment
Simple Storage Service (S3) - Work with OpenCms's export functionality to push exported files directly to Amazon's web-accessible storage space
CloudFront - Leverage the power of a content delivery network for your OpenCms environment
We will discuss the effort prior to launch to convince the business that Amazon would be reliable, allow for a disaster recovery plan, be secure, and save the business money. We will provide tips on how we set up our infrastructure to alleviate the various concerns the business had.
The first service leveraged was Amazon CloudWatch. This service can provide a detailed look at the health of the entire OpenCms infrastructure with little to no custom development effort. This includes the ability to quickly create alerts and notifications for when anything goes wrong in your environment.
We also decided to leverage Amazon Relational Database Service (RDS). We will present the trade-offs in the decision to use a managed data layer and how we justified taking the managed database approach.
Finally, we will briefly cover the other Amazon services that have been used as a part of our OpenCms deployment including ElastiCache, CloudFront, Simple Queue Service, Simple Email Service, SimpleDB, and Amazon S3.
With the Topology and Orchestration Specification for Cloud Applications (TOSCA) framework, one expects to achieve a strong level of interoperability when packaging an application or service for deployment to a Cloud Platform. T-Systems tested the OASIS TOSCA specification together with its Labs and University partners. This session will share the results and some of the important considerations that arose from the PoC.
A lightning-fast search engine that can do more: Apache Solr for TYPO3 processes search queries in milliseconds and also offers intelligent features such as filters, synonym search, and autocompletion. Olivier Dobberkau shows how extensive product catalogs, publications, or staff directories become "searchable" in no time.
In this presentation, which we gave at MeetTYPO3 Rotterdam, we show how we used TYPO3 and Apache Solr to solve a problem our customer had with their product catalog.
This presentation was given at the TYPO3 Launch Event in Milano, Italy.
I will show you how TYPO3 has evolved to become cloud-ready. Additionally, I will show how your organization can profit from easier and faster innovation cycles. This will include a demo of TYPO3 v8 being deployed on Platform.sh.
Disclaimer: Beware of the quotes given in this presentation! :-)
In this presentation we will speak about how universities and universities of applied sciences can cooperate with TYPO3.
How can we arrive at a common view of the current state of TYPO3 usage?
And what could a joint future within the technical requirements look like?
Specifically: how can cooperation happen under the umbrella of the TYPO3 Association?
This presentation elaborates on how your relationship with TYPO3 can be improved and can evolve. Being part of the TYPO3 community means receiving a lot of hidden perks that are unlocked with trust, contribution and attrition.
Do visitors to your website really find the information they are looking for? A good search on your website leads to longer visits and more transactions. Apache Solr for TYPO3 provides the foundation for this, and this talk covers advanced integration into TYPO3 CMS.
This is a presentation of Hosted Solr as a Search as a Service component for your CMS or web application. We also showcase some of the TYPO3 Solr implementations made by us and other TYPO3 community members.
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...
Olivier Dobberkau
What is the value of the content on your website? Which content is creating value for your business? Who created it, and how does the network of your editors perform?
In this presentation we want to introduce you to the ideas behind our work in the ForgetIT project. We will also give an insight into how we think CMIS will be implemented in TYPO3 CMS so that content can be exchanged through a content repository.
Last but not least we will give you a brief view into our semantic and concept detection services that we will introduce to TYPO3 CMS.
ForgetIT: Beyond the page: Giving content a meaning and value
Olivier Dobberkau
Following the concept of human memory, ForgetIT aims to create a framework that will bring “managed forgetting” to TYPO3 CMS. It will provide semantic annotation, intelligent preservation and managed archiving of content objects. Learn what dkd plans for 2014 and how you can contribute.
While preservation of digital content is now well established in memory institutions such as national libraries and archives, it is still in its infancy in most other organizations, and even more so for personal content. ForgetIT combines three new concepts to ease the adoption of preservation in the personal and organizational context.
Managed Forgetting:
Managed Forgetting models resource selection as a function of attention and significance dynamics. It is inspired by the important role of forgetting in human memory and focuses on characteristic signals of reduction in salience.
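To make the idea of attention dynamics a little more concrete, here is a minimal sketch of a score that decays as accesses to a content object recede into the past. It is an illustrative assumption, not the ForgetIT model: the exponential decay, the 30-day half-life, the threshold, and the function names are all hypothetical.

```python
import math
from typing import Iterable


def memory_buoyancy(access_ages_days: Iterable[float],
                    half_life_days: float = 30.0) -> float:
    """Sum of exponentially decayed access events.

    Each access contributes 1.0 when it happens and half as much after
    every `half_life_days` (a hypothetical parameter).
    """
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    decay = math.log(2) / half_life_days
    return sum(math.exp(-decay * age) for age in access_ages_days)


def preserve_or_forget(buoyancy: float, threshold: float = 0.5) -> str:
    """Map a buoyancy score to a coarse decision (threshold is hypothetical)."""
    return "keep active" if buoyancy >= threshold else "candidate for forgetting"


if __name__ == "__main__":
    recent = memory_buoyancy([1, 3, 10])     # accessed several times lately
    stale = memory_buoyancy([200, 365])      # last touched long ago
    print(recent, preserve_or_forget(recent))
    print(stale, preserve_or_forget(stale))
```

A real system would presumably combine such attention signals with significance signals (for example editorial importance), which this sketch leaves out.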
Synergetic Preservation:
Synergetic Preservation crosses the chasm that exists between active information use and preservation management by making intelligent preservation processes an integral part of the content lifecycle in information management.
Contextualized Remembering:
Contextualized Remembering targets keeping preserved content meaningful and useful. It will be based on a process of dynamic evolution-aware contextualization.
Impact on TYPO3 CMS:
Together with the TYPO3 community and selected pilot customers, dkd will work on establishing the respective extensions to provide these concepts to TYPO3 CMS and its user base.
Olivier will introduce you to the project, its concepts and the framework architecture. The past year was used to define these, and a solid foundation was laid.
We elaborated the design and functional requirements by using two use cases (I. Press release, II. DAM integration into the backend).
The current year in the project will be used to create a first and working implementation.
What does this mean for you?
After a short break, we will hold a joint brainstorming session on how you can get involved and what the potential benefits would be.
Things to look at will be:
* the value of content objects
* semantic annotation and contextualization
* memory buoyancy, which provides mechanics for forgetting content over time
* utilization of open standards like CMIS, OData, Stanbol
3. Meeting, Date, Location
Content Management Interoperability Services
(CMIS)
An open standard, ensuring CMS
interoperability
Abstraction layer
Defined protocols and domain
model
Common data model with generic
properties
4. T3CON14 Berlin
CMIS Benefits
Easy to learn and adopt
Supported by the widest range of vendors and user organizations
(e.g. Alfresco, Sharepoint, Magnolia, Adobe, Nuxeo)
End users can use one application to access / exchange
documents between various systems that support CMIS
Libraries for Java, Python, .NET, Objective-C and PHP
Standard service API
6. Meeting, Date, Location
CMIS Use Cases
Repository to Repository: CMIS repositories talk directly to each other
Application to Repository: E.g. End user gets information from a
mobile website via CMIS
10. Meeting, Date, Location
Services
Service                 Description
Repository Services     Used to get information about and capabilities of a repository
Navigation Services     Used to traverse the repository's folder hierarchy
Object Services         Used to perform CRUD operations on objects
Multi-filing Services   If the repository supports storing an object in more than one folder, this service handles it
Discovery Services      Used to handle queries
Versioning Services     Used to check out documents and work with document versions
Relationship Services   Used to query an object for its relationships
Policy Services         Used to apply, remove, and query for policies
ACL Services            Used to manage the ACL of an object
12. T3CON14 Berlin
Web Services Binding
Maps CMIS operations directly to SOAP calls (Simple Object
Access Protocol)
Covers entire CMIS specification
Authentication:
WS-Security 1.1 for Username Token Profile 1.1
other authentication mechanisms are possible
CMIS repository needs MTOM (Message Transmission
Optimization Mechanism) for content transfer
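
To make the Web Services Binding more concrete, the following is a minimal sketch of a SOAP 1.1 request with a WS-Security UsernameToken header, built with the Python standard library only. It assumes the getRepositories operation from the CMIS messaging namespace; the credentials are placeholders, and real repositories may expect further header elements (for example a timestamp), so treat this as an illustration rather than a complete client.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for SOAP 1.1, WS-Security and CMIS messaging.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
WSSE_NS = ("http://docs.oasis-open.org/wss/2004/01/"
           "oasis-200401-wss-wssecurity-secext-1.0.xsd")
CMISM_NS = "http://docs.oasis-open.org/ns/cmis/messaging/200908/"


def build_get_repositories_request(username: str, password: str) -> bytes:
    """Build a SOAP envelope that calls getRepositories with a UsernameToken."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")

    # Header: WS-Security block carrying the user name and password.
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    security = ET.SubElement(header, f"{{{WSSE_NS}}}Security")
    token = ET.SubElement(security, f"{{{WSSE_NS}}}UsernameToken")
    ET.SubElement(token, f"{{{WSSE_NS}}}Username").text = username
    ET.SubElement(token, f"{{{WSSE_NS}}}Password").text = password

    # Body: the CMIS operation itself (no parameters in this sketch).
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, f"{{{CMISM_NS}}}getRepositories")

    return ET.tostring(envelope, encoding="utf-8")


if __name__ == "__main__":
    # Placeholder credentials, for illustration only.
    print(build_get_repositories_request("alice", "secret").decode("utf-8"))
```

The resulting bytes would be sent as the body of an HTTP POST to the repository's service endpoint; the endpoint URL depends on the repository and is not shown here.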
13. T3CON14 Berlin
AtomPub Binding
Built on the AtomPub specification (mainly designed for publishing
and simple editing of resources)
Extends AtomPub to support features like hierarchies, versioning,
renditions, permissions, and so on.
Follows REST paradigm by using HTTP methods GET, POST,
PUT, DELETE
Recommended Authentication:
HTTP Basic Authentication in conjunction with SSL
14. T3CON14 Berlin
AtomPub Binding
Disadvantages:
Does not cover the entire specification (e.g. does not support
createDocumentFromSource())
Usually needs two HTTP calls to access the content of a
document: first get the document’s Atom entry, which contains the
link to the content, then fetch the content itself
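
The two-call pattern can be sketched as follows. The first response is an Atom entry, and the client reads the content location from it before issuing the second request. The sample entry and its URL are made up for illustration, and the helper name is hypothetical; only the standard library XML parser is used.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical, heavily trimmed Atom entry as a repository might return it
# for a document (first HTTP call).
SAMPLE_ENTRY = """<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:title>press-release.pdf</atom:title>
  <atom:content type="application/pdf"
                src="https://cms.example.org/cmis/atom/content?id=42"/>
</atom:entry>"""


def content_url_from_entry(entry_xml: str) -> Optional[str]:
    """Return the content URL named in an Atom entry, or None if absent."""
    entry = ET.fromstring(entry_xml)
    content = entry.find(f"{{{ATOM_NS}}}content")
    if content is None:
        return None
    # The src attribute points at the content stream (second HTTP call).
    return content.get("src")


if __name__ == "__main__":
    print(content_url_from_entry(SAMPLE_ENTRY))
```

A client would then issue a GET request against the returned URL to download the actual file.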
15. T3CON14 Berlin
Browser Binding
Based on JSON (JavaScript Object Notation)
HTTP methods:
GET (read), POST (create, update, delete)
Covers entire specification
Recommended authentication:
HTTP Basic Authentication for non-browser Clients
Authentication with Tokens for browser Clients
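
As a rough illustration of the GET/POST split, the sketch below builds a read request URL and the form fields of a create request. The parameter names (cmisselector, cmisaction, propertyId[n], propertyValue[n]) follow the CMIS 1.1 Browser Binding as far as we know it; the repository URL and the function names are assumptions made for this example.

```python
from urllib.parse import urlencode


def children_url(root_folder_url: str, max_items: int = 10) -> str:
    """URL for a GET request that lists the children of a folder."""
    query = urlencode({"cmisselector": "children", "maxItems": max_items})
    return f"{root_folder_url}?{query}"


def create_document_form(name: str, type_id: str = "cmis:document") -> dict:
    """Form fields for a POST request that creates a document."""
    return {
        "cmisaction": "createDocument",
        # Properties are sent as indexed id/value pairs.
        "propertyId[0]": "cmis:objectTypeId",
        "propertyValue[0]": type_id,
        "propertyId[1]": "cmis:name",
        "propertyValue[1]": name,
    }


if __name__ == "__main__":
    # Hypothetical repository root URL.
    root = "https://cms.example.org/cmis/browser/repo1/root"
    print(children_url(root))
    print(urlencode(create_document_form("press release.txt")))
```

The GET response is plain JSON, which is why browser and mobile clients can work with it without an additional client library.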
16. T3CON14 Berlin
Browser Binding Benefits
More compact and performant than AtomPub and Web Services
Binding
Suitable for use in mobile and browser apps
Additional client libraries are not necessary
17. T3CON14 Berlin
Why we need CMIS 1.1 (1/2)
Main new features:
Type Mutability: CMIS clients can create, modify and delete
type definitions (see Data Model)
Secondary object types: set of properties that can be
dynamically added and removed from CMIS objects
Browser Binding
Supports bulk property updates with a single service call
18. T3CON14 Berlin
Why we need CMIS 1.1 (2/2)
Main new features:
New Item object type: exposes any other object types via CMIS
that do not fit the model's definition for document, folder,
relationship or policy
Append to a content stream: CMIS 1.1 allows large files to be
moved into the repository in chunks
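
The chunked upload idea can be sketched without any repository at hand. The helper below splits a byte string into pieces and marks the final one, which mirrors the "last chunk" flag that an append operation needs. The function name and chunk size are illustrative choices, not part of any CMIS library.

```python
from typing import Iterator, Tuple


def iter_chunks(data: bytes, chunk_size: int) -> Iterator[Tuple[bytes, bool]]:
    """Yield (chunk, is_last_chunk) pairs that cover data in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = len(data)
    if total == 0:
        # An empty stream still needs one final (empty) chunk.
        yield b"", True
        return
    for start in range(0, total, chunk_size):
        end = min(start + chunk_size, total)
        yield data[start:end], end == total


if __name__ == "__main__":
    for chunk, is_last in iter_chunks(b"abcdefg", 3):
        # A real client would send each chunk with an append call here.
        print(chunk, is_last)
```

Each pair would be passed to one append request, so a large file never has to be held in a single HTTP request body.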
20. T3CON14 Berlin
Current status
Several CMIS extensions for TYPO3 CMS are already available, BUT
they have not been maintained or updated for years
they support only CMIS 1.0
Disadvantages of the existing CMIS library for PHP (Apache
Chemistry)
Not CMIS 1.1 compatible
Not object-oriented
22. T3CON14 Berlin
Table configuration array (TCA)
Database field definition beyond SQL possibilities
the type of a field (text, date, select field, checkbox, etc.)
which fields should be displayed in the Backend and in which
layout
how to validate the content of the field (required, integer, etc.)
relations between records / TCA tables
highly extensible: own validators or special field types can be
implemented
23. T3CON14 Berlin
File abstraction layer (FAL)
Abstract API to store files
Support for multiple „storages“
Each storage has a driver that communicates with the target
system
Available Drivers and ideas: Local, WebDAV, Dropbox, FTP,
Amazon S3, Flickr, Database, CMIS
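
The storage/driver split can be illustrated with a small sketch. It is written in Python for brevity and does not reproduce the actual TYPO3 FAL interfaces; all class and method names are hypothetical. The point is that the storage only talks to an abstract driver, so a CMIS driver could be swapped in without changing calling code.

```python
from abc import ABC, abstractmethod
from typing import Dict


class Driver(ABC):
    """Abstract driver: the only part that knows the target system."""

    @abstractmethod
    def write(self, identifier: str, data: bytes) -> None:
        ...

    @abstractmethod
    def read(self, identifier: str) -> bytes:
        ...


class InMemoryDriver(Driver):
    """Stand-in for a Local, WebDAV or CMIS driver; keeps files in a dict."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def write(self, identifier: str, data: bytes) -> None:
        self._files[identifier] = data

    def read(self, identifier: str) -> bytes:
        if identifier not in self._files:
            raise FileNotFoundError(identifier)
        return self._files[identifier]


class Storage:
    """A named storage that delegates all file access to its driver."""

    def __init__(self, name: str, driver: Driver) -> None:
        self.name = name
        self.driver = driver

    def add_file(self, identifier: str, data: bytes) -> None:
        self.driver.write(identifier, data)

    def get_file_contents(self, identifier: str) -> bytes:
        return self.driver.read(identifier)


if __name__ == "__main__":
    storage = Storage("demo", InMemoryDriver())
    storage.add_file("/docs/readme.txt", b"hello")
    print(storage.get_file_contents("/docs/readme.txt"))
```

A CMIS-backed driver would implement the same two methods by calling the repository's object services instead of a dictionary.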
27. T3CON14 Berlin
Neos Nodes vs. CMIS
Neos
Storage inspired by the PHP-based JCR implementation
PHPCR
TYPO3 CR offers some funky / future features (Dimensions)
JCR / PHPCR Nodes can be translated to a CMIS compatible
format
Expectations are high that PHPCR <-> CMIS is much easier to
implement than TCA+FAL <-> CMIS
29. T3CON14 Berlin
Benefits for Web CMS
Content gets decoupled from presentation layer
Specialized Applications doing one job right
„Future proof“
Reduction of redundancies, many content objects are used in
production systems and those can be linked to CMIS repository
Expanding possibilities
31. T3CON14 Berlin
A real php-CMIS-lib
dkd is currently building an open source CMIS library
in PHP as part of the ForgetIT project
following the Java implementation to keep interfaces consistent
threaded, object-oriented, scalable
supporting CMIS 1.1
> Want to contribute? Talk to us!
32. T3CON14 Berlin
Roadmap
First version Q1/2015
Browser Binding
CRUD for standard TCA objects
More features as CMIS evolves and gets accepted by the TYPO3
community as an additional content repository technology
Native Support in TYPO3 CMS 7 onwards?
33. T3CON14 Berlin
Where to get more information …
Book: CMIS and Apache Chemistry
CMIS - OASIS Specs/Site
http://docs.oasis-open.org/cmis/CMIS/v1.1/cs01/CMIS-v1.1-cs01.html
Apache Chemistry
http://chemistry.apache.org
Alternative TER Plugins
http://typo3.org/extensions/repository/?id=23&L=0&q=CMIS&tx_solr%5Bfilter%5D%5Boutdated%5D=outdated%3AshowOutdateddf