Chris Rockwell, University of Michigan
Based on lessons learned, a presentation of some nifty techniques for expediting and automating content migration leveraging Ruby, Cucumber, Selenium, Capybara, CURB, and the SlingPostServlet
How to migrate from any CMS (thru the front-door)
1. CIRCUIT – An Adobe Developer Event
Presented by ICF Interactive
How to migrate any CMS through the front door
2. Agenda
• About @rockwell105
– Recent Experiences in Content Migration
• A process for any CMS
• Frontend Tools
• Example Code
– Cucumber / Step-Definition using Capybara
– LSA Department Profile Pages
• Demonstration
• Summary / Questions
– Resources & References
3. About Me
Chris Rockwell
– University of Michigan, LSA
• College of Literature, Science and the Arts, Web Services
– Technical Lead on AEM project
– Neither software consultant nor database expert
• API User; Java, Ruby and Frontend
– Recent Experience
• Database Migration from OpenText
• R2Integrated did a great job in migrating our SQL database to AEM
– Java classes calling SQL Stored Procedures and creating the content in the JCR
– We also used frontend techniques, which I want to talk about today
4. Querying the Database
Why database migrations can be difficult
- Requires skills in both systems
- Source CMS and target AEM
- Source DB table names are like …
- vgnAsAtmyContentChannel?
- vgnAsAtmyContentObjRef?
- Relationships between the tables were unclear
- In our case, no foreign keys
- Legacy system customizations may not be well understood or documented
What about the Legacy system API? Or Screen Scraping?
5. Migrate ANY CMS
HTML, CSS, JS
WordPress, AEM, OpenText, Joomla, Drupal, MediaWiki, Magnolia
Assumption: Every Web CMS places content in HTML templates, which provide a consistent HTML document structure.
Template Mapping: Old system to New system
Group URLs by template group
For each group, identify extra information needed to migrate properly
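The "group URLs by template group" step can be sketched in plain Ruby. This is only an illustration: the path patterns, the group names and the example.edu URLs are assumptions (the profile path shape mirrors the URLs shown on slide 9), and a real migration would need one pattern per legacy template.

```ruby
require 'uri'

# Hypothetical mapping from a legacy URL path pattern to a template group.
TEMPLATE_PATTERNS = {
  profile: %r{/people/ci\.[^/]+_ci\.detail\z},
  listing: %r{/people/?\z}
}.freeze

# Returns the template group for a URL, or :unmapped when no pattern matches.
def template_group(url)
  path = URI.parse(url).path
  name, _pattern = TEMPLATE_PATTERNS.find { |_name, pattern| pattern.match?(path) }
  name || :unmapped
end

# Hypothetical input list; in practice this would be the URLs to migrate.
URLS_TO_MIGRATE = %w[
  http://www.example.edu/earth/people/ci.aaronssarah_ci.detail
  http://www.example.edu/earth/people/ci.abbeyalyssa_ci.detail
  http://www.example.edu/earth/people/
  http://www.example.edu/earth/news
].freeze

# Hash of template group => list of URLs rendered by that template.
GROUPS = URLS_TO_MIGRATE.group_by { |url| template_group(url) }
```

URLs that land in `:unmapped` are the ones that still need a template decision before they can be migrated.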
6. Data / Screen Scaping
hGps://en.wikipedia.org/wiki/Data_scraping
“Data
scraping
is
generally
considered
an
ad
hoc,
inelegant
technique,
o2en
used
only
as
a
"last
resort"
when
no
other
mechanism
for
data
interchange
is
available.
Aside
from
the
higher
programming
and
processing
overhead,
output
displays
intended
for
human
consump>on
o2en
change
structure
frequently.”
For us, some content was much easier (and more fun) to migrate by automating a browser and getting the content from the frontend.
Why is this easier?
• Content is consolidated on the page
• No reverse engineering of messy legacy systems
• Knowledge of the DOM can be used to get content using CSS selectors
• Consistent HTML template structure provided by the legacy system
• UAT fails if the migrating URL does not meet assumptions
7. Data / Screen Scraping
Other reasons to do this
• Going after business with no access to the database (POC)
• Can be done quickly without knowledge about the legacy system
• Can be done in phases (migrates based on the URLs listed)
• Works against live websites (not stale database snapshots)
8. Frontend Tools
Makes it easy to
• Provide tables of input for migration
• Script Selenium
• Visit every page
• Get the content
• Format the content
• Post using curl (curb)
Takes time: usually 5s per page, or more
User Acceptance Tools (UAT): Cucumber, Capybara, Selenium Webdriver
source 'https://rubygems.org'

gem 'cucumber', '~> 2.0.0'
gem 'capybara', '~> 2.4.4'
gem 'rspec', '~> 2.8.0'
gem 'selenium-webdriver', '2.45.0'
gem 'curb', '~> 0.8.8'
gem 'capybara-webkit', '~> 1.5.1'
9. Example Code - Cucumber
Feature: Given a list of URLs, go to each and create or update the AEM profile

  Scenario Outline: Visit live profile, get profile data, update the AEM page
    Given the profile page, visit the <URL>
    Then profiles should migrate into these dept categories:
      | uniqname           | dept  | categories        |
      | smaarons@umich.edu | earth | graduate-students |
      | alabbey@umich.edu  | earth | graduate-students |
      | carliana@umich.edu | earth | graduate-students |
      | mjbegin@umich.edu  | earth | graduate-students |

    Examples:
      | URL |
      | http://www.lsa.umich.edu/earth/people/ci.aaronssarah_ci.detail |
      | http://www.lsa.umich.edu/earth/people/ci.abbeyalyssa_ci.detail |
      | http://www.lsa.umich.edu/earth/people/ci.aciegosarah_ci.detail |
      | http://www.lsa.umich.edu/earth/people/ci.altjeffrey_ci.detail |
      | http://www.lsa.umich.edu/earth/people/ci.ansongjoseph_ci.detail |
      | http://www.lsa.umich.edu/earth/people/ci.apsitisbrenda_ci.detail |
• Use Scenario Outlines, and list each URL to migrate under Examples:
• All Steps will run for each page (URL example)
• The steps are defined under the step_definitions folder
• These are UAT tools, so we can take advantage of that and include steps to test the success of the page migration
Create one (or more) *.feature file for each Template Group (or URL group)
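Writing one feature file per template group by hand gets tedious for long URL lists. The sketch below generates the Scenario Outline text from a list of URLs. The helper name build_feature and the feature title wording are hypothetical, and the data-table step is left out for brevity.

```ruby
# Builds the text of a feature file whose Examples table lists every URL
# in one template group. build_feature is a hypothetical helper name.
def build_feature(group_name, urls)
  width = (urls.map(&:length) + ['URL'.length]).max
  lines = [
    "Feature: Migrate the #{group_name} template group",
    '',
    '  Scenario Outline: Visit the live page and migrate it',
    '    Given the profile page, visit the <URL>',
    '',
    '    Examples:',
    "      | #{'URL'.ljust(width)} |"
  ]
  lines += urls.map { |url| "      | #{url.ljust(width)} |" }
  lines.join("\n") + "\n"
end

feature_text = build_feature('profile', [
  'http://www.lsa.umich.edu/earth/people/ci.aaronssarah_ci.detail',
  'http://www.lsa.umich.edu/earth/people/ci.abbeyalyssa_ci.detail'
])
puts feature_text
```

The returned string could then be saved with File.write under the features folder, one file per group.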
10. Example Code - Step Definition
# Matches the feature step "Given the profile page, visit the <URL>"
Given /^the profile page, visit the (.*)$/ do |url|
  visit url
end

Given /^profiles should migrate into these dept categories:$/ do |table|
  @peopleDeptCat = table.raw
  # uniqname => [dept, category]
  @peopleHash = Hash[@peopleDeptCat.map { |key, value, v2| [key, [value, v2]] }]

  # Read content from the live page using CSS selectors
  @phone = find("#phone", :visible => false).value
  @imageURI = find(".peopleImg")[:src]
  @education = find("#education").all('li').collect(&:text)

  curlAuthenticate(@profilePath)
  buildJsonContent
  postContent(@peoplePath, @categoryHash) # create category page
  postContent(@categoryPath, @profileHash) # create profile
  # ….
  @c.close
end
The Capybara gem provides convenient ways to…
• Drive Selenium, visit url
• Get content from the page, find(".peopleImg")[:src]
A Data Table is passed in from Cucumber listing email, department and category. This extra information is used to create the new paths for the migrated pages.
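How the data table turns into target paths is not shown on the slide, so the following is only a sketch. It assumes a hypothetical content root and path layout (root/dept/people/category/node), and it skips the header row that table.raw returns.

```ruby
# Rows in the shape Cucumber's table.raw returns them (header row first).
raw_rows = [
  ['uniqname', 'dept', 'categories'],
  ['smaarons@umich.edu', 'earth', 'graduate-students'],
  ['alabbey@umich.edu', 'earth', 'graduate-students']
]

# uniqname => [dept, category], without the header row.
people = Hash[raw_rows.drop(1).map { |uniqname, dept, category| [uniqname, [dept, category]] }]

# Builds the target JCR path for one person.
# content_root and the path layout are assumptions, not taken from the deck.
def profile_path(uniqname, people, content_root)
  dept, category = people.fetch(uniqname)
  node_name = uniqname.split('@').first
  "#{content_root}/#{dept}/people/#{category}/#{node_name}"
end

path = profile_path('smaarons@umich.edu', people, '/content/example-site')
puts path
```

Using fetch means an email that is missing from the table raises a KeyError, which makes the scenario fail instead of posting to a wrong path.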
11. Example Code - Sling Post Servlet
def buildJsonContent
  @profileHash = {
    "jcr:primaryType"=> "cq:Page",
    @uniqueName => {
      "jcr:primaryType"=> "cq:Page",
      "jcr:content"=> {
        "jcr:primaryType"=> "cq:PageContent",
        "officeLocation"=> "#{@officeLocation}",
        "jcr:title"=> "#{@firstName} #{@lastName}",
        "website1"=> "#{@url}",
        "website2"=> "#{@url2}",
        "lastName"=> "#{@lastName}",
        "cq:template"=> "/apps/sweet-aem-project/templates/department_person_profile",
        "officeHours"=> "#{@officeHours}",
        # keep only the trailing file name, e.g. name.pdf
        "fileName"=> "#{@cvFileName.match(/\w*\.\w{3,4}$/) if !@cvFileName.nil?}",
        "education"=> @education || "",
        "about"=> "#{@about}",
        "phone"=> "#{@phone.gsub(/<br>/,', ') if !@phone.nil?}",
        "title"=> "#{@title.gsub(/<br>/,'; ') if !@title.nil?}",
        "firstName"=> "#{@firstName}",
        "uniqueName"=> "#{@uniqueName}",
        "hideInNav"=> "true",
        "sling:resourceType"=> "sweet-aem-project/components/pages/department_person_profile",
        "cq:designPath"=> "/etc/designs/sweet-aem-project",
        "profileImage"=> {
          "jcr:primaryType"=> "nt:unstructured",
          "sling:resourceType"=> "foundation/components/image",
          "imageRotate"=> "0",
        },
      }
    }
  }
end
Step Definition Overview
• Visit the page, get the content.
• Build nested hash(es), which convert nicely to JSON
• Use *.infinity.json on example content. Use this as a starting point for the nested hash.
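As a small illustration of the nested-hash-to-JSON step, the sketch below builds a reduced page structure and serializes it with Ruby's standard json library. The helper name build_page_hash, the node name, and the title are hypothetical; the real hash would carry the full set of properties copied from the *.infinity.json output.

```ruby
require 'json'

# Builds a reduced version of the nested page structure.
# build_page_hash is a hypothetical helper, not part of the original code.
def build_page_hash(node_name, title)
  {
    'jcr:primaryType' => 'cq:Page',
    node_name => {
      'jcr:primaryType' => 'cq:Page',
      'jcr:content' => {
        'jcr:primaryType' => 'cq:PageContent',
        'jcr:title' => title
      }
    }
  }
end

page_hash = build_page_hash('smaarons', 'Example Person')
json = page_hash.to_json
puts JSON.pretty_generate(page_hash)
```

Because the hash mirrors the node tree, each nested hash becomes a child node and each string value becomes a property when the JSON is imported.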
def postContent(jcrPath, contentHash)
  @c.url = jcrPath
  @c.on_success {|easy| puts "ON SUCCESS #{easy.response_code}"}
  @c.on_failure {|easy| fail easy.to_s}
  @c.http_post("#{jcrPath}",
    Curl::PostField.content(':operation', 'import'),
    Curl::PostField.content(':contentType', 'json'),
    Curl::PostField.content(':replaceProperties', 'true'),
    Curl::PostField.content(':content', contentHash.to_json))
  puts "FINISHED: HTTP #{@c.response_code}"
end
Step Definition Overview (cont.)
• Post JSON to the desired path using :operation import
• The JSON contains a structure needed for the page in AEM containing properties needed; jcr:primaryType, cq:template, sling:resourceType
content hash example
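For readers without curb installed, the same import request can be sketched with Ruby's standard Net::HTTP instead. This swaps in a different HTTP client from the one used in the deck. The host, port, target path, and credentials below are placeholders, and the request is only built here, not sent.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Builds (but does not send) a SlingPostServlet import request.
# build_import_request is a hypothetical helper name.
def build_import_request(target_url, content_hash, user, password)
  uri = URI.parse(target_url)
  request = Net::HTTP::Post.new(uri.request_uri)
  request.basic_auth(user, password)
  request.set_form_data(
    ':operation' => 'import',
    ':contentType' => 'json',
    ':replaceProperties' => 'true',
    ':content' => content_hash.to_json
  )
  request
end

request = build_import_request(
  'http://localhost:4502/content/example-site/earth/people', # placeholder URL
  { 'jcr:primaryType' => 'cq:Page' },
  'admin', 'admin' # placeholder credentials
)
puts request.body
```

To actually send it, the request could be passed to Net::HTTP.start(host, port) { |http| http.request(request) } against a running AEM instance.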
12. [Diagram: Legacy System, AEM System, Operation Import, SlingPostServlet]
Wrap-up
Demo
Questions
13. • Several options for Content Migration
– Scraping webpages is one option to consider
– :operation import is great
• Ways to speed up frontend migration
– Scale migration across machines using
Selenium Grid to launch parallel operations
– Use a headless browser
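Selenium Grid setup is outside the scope of these slides, but the idea of splitting the work can be sketched in plain Ruby: divide the URL list into slices and process each slice in its own thread. The helper name migrate_in_parallel and the example URLs are hypothetical; in a real run the block would drive one browser session per worker.

```ruby
# Splits the URL list into one slice per worker and processes the slices
# in parallel threads. Results come back in the original URL order.
def migrate_in_parallel(urls, worker_count)
  slice_size = [(urls.size.to_f / worker_count).ceil, 1].max
  threads = urls.each_slice(slice_size).map do |slice|
    Thread.new { slice.map { |url| yield url } }
  end
  threads.flat_map(&:value)
end

urls = (1..6).map { |n| "http://legacy.example.com/page-#{n}" } # hypothetical URLs
results = migrate_in_parallel(urls, 3) { |url| "migrated #{url}" }
puts results
```

At roughly 5s per page, three workers would cut a 600-page run from about 50 minutes to about 17, assuming the legacy site and AEM can take the extra load.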