This document discusses migrating a large 8 TB SharePoint site collection to a new farm within a 96-hour maintenance window. Key points:
- The site collection is too large to migrate as-is, so it will be split by promoting some subsites to new site collections.
- Metalogix Content Matrix will be used to script the migration in parallel batches to complete it on time.
- Challenges include maintaining performance over the large data set and validating a 99% accurate migration within the narrow window. Careful scripting and testing are required to successfully migrate such a large amount of content.
3.
Exostar is a leader in secure cloud-based solutions that improve
collaboration, information sharing, and supply chain management for
over 100,000 companies worldwide, including some of the largest players
in aerospace and defense, life sciences, pharmaceuticals, and financial
services.
Exostar’s solutions are accessed with our award-winning Identity &
Access Management services that enable speedy adoption by your trusted
external partners.
Exostar’s ForumPass is a cloud-based, enterprise-class collaboration
platform that addresses the needs of users sharing sensitive data with
external partners. The Cloud/SaaS “ready infrastructure” delivery
provides real-time scalability for collaborative teams of 20-30 people or
larger teams with 1,000s of users. The service is delivered on a per-seat,
per-month basis, which helps our customers avoid high startup costs for a
robust, secure service.
ForumPass is built on the SharePoint platform, currently utilizing the
SharePoint 2010 Enterprise version.
4.
Per Microsoft, about 9,000 medium-size businesses have SharePoint data stores ranging from 2 TB to 100 TB
Many of these data stores eventually need to be migrated
• To a new version of SharePoint
• To a new location
• Etc.
By the time of migration, some of the site collections can grow quite large
5.
One of the ForumPass site collections is 8 TB
This is twice the recommended maximum for any content database, and 81.92 times the recommended maximum for a site collection
More than 30,000 users
Content is added/modified very frequently
The farm is being migrated to a new version of SharePoint
The maintenance window is only 4 days (96 hours)
At least 99% of data must be preserved during the migration
The big question: How to migrate the 8 TB site collection?
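The figures on this slide can be checked with a little arithmetic. The sketch below assumes binary units (1 TB = 1024 GB) and uses the limits implied by the slide's own ratios: 4 TB for a content database and 100 GB for a site collection. It also works out the average copy rate that one full pass over 8 TB would need in order to fit the 96-hour window; that rate is illustrative, not a measured value.

```python
GB_PER_TB = 1024  # binary units (assumption)

site_collection_gb = 8 * GB_PER_TB    # the 8 TB site collection
content_db_limit_gb = 4 * GB_PER_TB   # limit implied by "twice"
site_collection_limit_gb = 100        # limit implied by "81.92 times"
window_hours = 96                     # maintenance window

db_ratio = site_collection_gb / content_db_limit_gb
sc_ratio = site_collection_gb / site_collection_limit_gb
required_gb_per_hour = site_collection_gb / window_hours

print(f"{db_ratio:.2f}x the content database limit")
print(f"{sc_ratio:.2f}x the site collection limit")
print(f"{required_gb_per_hour:.1f} GB/hour for one full pass within the window")
```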
6.
Not feasible to migrate without splitting the site collection
Copying would take much more time than the maintenance window allows
Even if it were somehow feasible to finish the migration within the maintenance window, it would probably be wiser to find a long-term solution that conforms to Microsoft best practices
None of the out-of-the-box methods offered by Microsoft is able to split a site collection
7.
We chose Metalogix Content Matrix as our migration software
Allows a read-only direct connection to the source database
Important for performance reasons
Metalogix allows scripting of migration activities
Provides PowerShell cmdlets
Allows running several migration activities simultaneously, thus speeding up the process
Allows full and incremental copies
Important because incremental copies take less time than full copies, so a final incremental copy is more likely than a full copy to fit into the maintenance window
Each script can take parameters
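The combination of parameterised scripts and parallel execution can be sketched as a small job runner. This sketch is in Python rather than the PowerShell used for the actual migration; `run_migration_job` is a hypothetical stand-in for launching one migration script, and the parameter names (`site`, `incremental`) are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def run_migration_job(params):
    """Hypothetical stand-in for launching one parameterised migration script."""
    # A real runner would start the script here and wait for it to finish.
    mode = "incremental" if params["incremental"] else "full"
    return f"{params['site']}: {mode} copy finished"

def run_in_parallel(jobs, max_parallel):
    """Run jobs concurrently, at most max_parallel at a time, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_migration_job, jobs))

jobs = [
    {"site": "SiteA", "incremental": False},
    {"site": "SiteB", "incremental": False},
    {"site": "SiteC", "incremental": True},
]
for line in run_in_parallel(jobs, max_parallel=2):
    print(line)
```

Capping the number of simultaneous jobs keeps the load on the source database bounded, at the cost of a longer total run.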
8.
The new environment has to be fully functional
• SharePoint farm installation
• Web application configuration
• Service application configuration
• Firewalls configured
• Etc.
Code has to be migrated
• Feature IDs need to be preserved
• If migrating from MOSS 2007, code has to be compatible with SharePoint 2010
• In particular, code that refers to user profiles or search
• All the solutions need to be deployed
PowerShell has to be prepared
• Use Content Matrix PowerShell Console
• Make sure your powershell.exe.config file contains the settings necessary to initialize features
9.
Each first-level subsite is promoted to a site collection
Some but not all second-level subsites are promoted to site collections
No other subsites are promoted to site collections (for complexity reasons)
The content of the top-level site of the site collection (libraries, lists, images, etc.) is NOT migrated
10.
For each first-level subsite
• Create a new content database
• In this content database, create a new site collection based on the standard template
• Then, two options:
• 1) copy the content of the subsite to the top-level site of this new site collection
• Since some second-level subsites are promoted to their own site collection, a site filter is required
• or
• 2) copy the subsite to this new site collection
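The per-subsite steps above can be sketched as a plan builder. The content database naming scheme, the field names, and the sample URLs are hypothetical; the sketch only records what would be done for each first-level subsite and does not call SharePoint or Content Matrix.

```python
def build_plan(first_level_subsites, promoted_second_level, copy_to_root=True):
    """Return one plan entry per first-level subsite.

    promoted_second_level holds the URLs of second-level subsites that get
    their own site collection and so must be filtered out of the parent copy.
    """
    plan = []
    for url in first_level_subsites:
        base = url.rstrip("/")
        name = base.split("/")[-1]
        filtered = sorted(u for u in promoted_second_level
                          if u.startswith(base + "/"))
        plan.append({
            "source_url": base,
            "content_db": f"WSS_Content_{name}",   # hypothetical naming scheme
            "option": 1 if copy_to_root else 2,    # 1: content to root, 2: copy subsite
            "site_filter": filtered,
        })
    return plan

plan = build_plan(
    ["/sites/mycompany/SiteA", "/sites/mycompany/SiteB"],
    ["/sites/mycompany/SiteA/Projects"],
)
for entry in plan:
    print(entry)
```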
11.
Copy-MLAllSharePointSiteContent or Copy-MLSharePointSite
The specific parameters depend on the choice of cmdlet, as well as on your migration requirements
E.g., you don’t want to migrate themes if you are migrating from MOSS 2007 to SharePoint 2010
Make sure that the SiteFilterExpression is present if you plan to promote certain subsites to their own site collections
Certain parameters might affect performance
Sometimes it is worth prototyping the migration operation in the GUI
12.
Use Copy-MLAllSharePointSiteContent when
• The URL of the new site collection has to stay exactly the same as that of the first-level subsite, or
• You want the first-level subsite content on the root level of the newly-created site collection, and the site template of that subsite does not interfere with the site template of the root subsite
In all other cases, use Copy-MLSharePointSite
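The selection rule on this slide can be written down as a small decision function. The function and argument names are hypothetical; only the two cmdlet names come from the slide.

```python
def choose_cmdlet(url_must_stay_same, want_content_at_root, templates_conflict):
    """Pick a cmdlet name following the rule stated on the slide."""
    if url_must_stay_same:
        return "Copy-MLAllSharePointSiteContent"
    if want_content_at_root and not templates_conflict:
        return "Copy-MLAllSharePointSiteContent"
    return "Copy-MLSharePointSite"

print(choose_cmdlet(url_must_stay_same=False,
                    want_content_at_root=True,
                    templates_conflict=True))
```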
14.
At the very least, the input CSV file should include:
Server-relative source URL
E.g., /sites/mycompany/SomeCoolSite
Managed path
E.g., /customers/ or /sites/mycompany
Site Name
E.g., SomeCoolSite
Site Description
E.g., Some Cool Site
Whether migration is full or incremental
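A minimal sketch of reading such an input file follows. The column names (`SourceUrl`, `ManagedPath`, `SiteName`, `SiteDescription`, `Incremental`) are assumptions chosen to mirror the list above; the actual file layout used in the migration is not shown in the slides.

```python
import csv
import io

REQUIRED_COLUMNS = ["SourceUrl", "ManagedPath", "SiteName",
                    "SiteDescription", "Incremental"]  # hypothetical names

def read_input_rows(handle):
    """Parse the input CSV and fail early if a required column is missing."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {', '.join(missing)}")
    rows = []
    for row in reader:
        # Convert the full/incremental flag from text to a boolean.
        row["Incremental"] = row["Incremental"].strip().lower() == "true"
        rows.append(row)
    return rows

sample = io.StringIO(
    "SourceUrl,ManagedPath,SiteName,SiteDescription,Incremental\n"
    "/sites/mycompany/SomeCoolSite,/customers/,SomeCoolSite,Some Cool Site,false\n"
)
rows = read_input_rows(sample)
print(rows[0]["SiteName"], rows[0]["Incremental"])
```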
15.
At the very least, the exclusion CSV file should contain the site-collection-relative URLs of excluded subsites
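A minimal sketch of how such an exclusion list could be applied is shown below. It assumes a headerless, single-column file and treats any descendant of an excluded subsite as excluded too; both are assumptions for illustration, not the behaviour of the actual scripts or of the Content Matrix site filter.

```python
import csv
import io

def read_exclusions(handle):
    """Read one site-collection-relative URL per row, normalised for comparison."""
    return {
        row[0].strip().strip("/").lower()
        for row in csv.reader(handle)
        if row and row[0].strip()
    }

def is_excluded(relative_url, exclusions):
    """True if the subsite, or any ancestor of it, is on the exclusion list."""
    parts = relative_url.strip("/").lower().split("/")
    return any("/".join(parts[:i]) in exclusions
               for i in range(1, len(parts) + 1))

exclusions = read_exclusions(io.StringIO(
    "SomeCoolSite/Projects\nSomeCoolSite/Archive\n"
))
print(is_excluded("SomeCoolSite/Projects/2012", exclusions))
print(is_excluded("SomeCoolSite/Docs", exclusions))
```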
16.
The script configuration should contain:
• Input CSV file path
• Exclusion CSV file path
• Source information
• DB server, content DB, root URL, template path, etc.
• Target information
• DB server, farm administrator, root URL
• Metalogix job history path
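One way to keep these settings together is a small configuration record that is checked before any job starts. The field names and sample values below are hypothetical and cover only a subset of the settings listed above.

```python
from dataclasses import dataclass, fields

@dataclass
class MigrationConfig:
    input_csv_path: str
    exclusion_csv_path: str
    source_db_server: str
    source_content_db: str
    source_root_url: str
    target_db_server: str
    target_root_url: str
    job_history_path: str

    def missing_settings(self):
        """Return the names of settings that were left blank."""
        return [f.name for f in fields(self)
                if not getattr(self, f.name).strip()]

config = MigrationConfig(
    input_csv_path="input.csv",
    exclusion_csv_path="exclusions.csv",
    source_db_server="SRC-SQL",           # placeholder value
    source_content_db="WSS_Content_Old",  # placeholder value
    source_root_url="http://old-farm",    # placeholder value
    target_db_server="TGT-SQL",           # placeholder value
    target_root_url="http://new-farm",    # placeholder value
    job_history_path="",                  # left blank on purpose
)
print(config.missing_settings())
```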
17.
18.
Some second-level subsites are promoted to site collections
These site collections’ URLs are new
A separate script is needed
Script configuration similar to what we’ve seen
Input CSV should include the URL of the new site collection, as well as the web template of the site copied
The Copy-MLSharePointSite cmdlet is used in the script
New site collections are created in new content databases
19.
20.
Be careful with Team Sites
-MergeSiteFeatures parameter
If it is true and you migrate from MOSS 2007 to SharePoint 2010, then the web parts from default.aspx will move to SitePages/Home.aspx and default.aspx will be empty, which causes great confusion for users
If it is false and you used the Copy-MLAllSharePointSiteContent cmdlet, you need to make sure that all necessary site collection features are activated
21.
Full copy: Workflow associations are copied, workflow instances are NOT
Incremental copy: Workflow associations are NOT copied
Possible to copy Nintex or SharePoint Designer workflow associations
Thus, the users should NOT create new workflow associations after the full copy has run
LegacyWorkflows feature needs to be activated on newly-created site collections
22.
Make sure you add site collection admins to the newly-created site collections
Involve users (CFT)
Their feedback will identify the problem areas
Run incremental migrations as needed
We needed several incremental migrations, plus a final one to be run during the maintenance window
23.
Metalogix allows comparison reports to verify completeness of the migration job
Also, Metalogix provides logs for each job
When your testers identify a migration issue, the reports and logs will help you troubleshoot
Sometimes, an additional incremental copy might be needed
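The idea behind a completeness check can be sketched as a set comparison of source and target item identifiers. This is not the Metalogix comparison report itself; the item identifiers are made up, and the 99% threshold is the preservation requirement stated earlier in the deck.

```python
def completeness(source_items, target_items):
    """Return (percent of source items present in target, sorted missing items)."""
    source = set(source_items)
    if not source:
        return 100.0, []
    missing = sorted(source - set(target_items))
    percent = 100.0 * (len(source) - len(missing)) / len(source)
    return percent, missing

# Made-up data: 1,000 source documents, 4 of which did not arrive.
source_items = [f"doc-{i:04d}" for i in range(1000)]
target_items = [item for item in source_items
                if item not in {"doc-0007", "doc-0120", "doc-0500", "doc-0999"}]

percent, missing = completeness(source_items, target_items)
print(f"{percent:.1f}% migrated, {len(missing)} missing")
print("meets 99% requirement:", percent >= 99.0)
```

The list of missing items is what would feed an additional incremental copy.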
24.
Performance is the hardest thing to troubleshoot
Migrating an 8 TB site collection may well take more than 1,024 times as long as migrating an 8 GB site collection
Migration rate can go down with time
C:\Users\SomeUser\AppData\Roaming\Metalogix\Content Matrix Console – SharePoint Edition\ApplicationSettings.xml
PerActionResourceUse - Controls how many migration activities are run in parallel
Trade-off - Higher value means more parallelism but less predictability
Since parallelism is available where possible, the variance of load within a job is less predictable.
SQLQueryTimeoutTime – You can also lose data if the timeout time is too low
Disable verbose logging
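Because the migration rate can drop over time, it helps to re-project the finish time from recent progress rather than from the overall average. The sketch below does this from progress checkpoints; the checkpoint values are invented for illustration, and how such checkpoints would be collected from the job logs is left open.

```python
def interval_rates(checkpoints):
    """checkpoints: list of (elapsed_hours, cumulative_gb) in time order."""
    return [
        (gb1 - gb0) / (h1 - h0)
        for (h0, gb0), (h1, gb1) in zip(checkpoints, checkpoints[1:])
    ]

def projected_total_hours(checkpoints, total_gb):
    """Project the finish time assuming the most recent rate holds."""
    last_hours, last_gb = checkpoints[-1]
    latest_rate = interval_rates(checkpoints)[-1]
    return last_hours + (total_gb - last_gb) / latest_rate

# Invented checkpoints showing a slowing job.
checkpoints = [(0, 0), (10, 500), (20, 900), (30, 1200)]
rates = interval_rates(checkpoints)
projected = projected_total_hours(checkpoints, total_gb=2400)

print("GB/hour per interval:", rates)
print("projected total hours:", projected)
print("fits a 96-hour window:", projected <= 96)
```

A projection based on the first interval alone would be far more optimistic than one based on the latest interval, which is why a falling rate is easy to miss until late in the window.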
25.
Accomplished a 96-hour go-live event
Migrated more than three million documents
Achieved 99.6% quality metric with customer concurrence
Achieved major improvements in Metalogix Content Matrix migration tool, in direct collaboration with Metalogix
26.
Migrating a very large site collection:
Typically involves splits, which means that a third-party product such as Metalogix Content Matrix will be needed
Can be scripted, with scripts running in parallel
Requires comparison reports to ensure completeness
Presents performance challenges as the migration rate tends to go down