The document discusses Helix4Git, a product from Perforce Software that allows their versioning engine to directly manage Git data. It provides automated mirroring of Git repositories to remote locations for improved scalability and performance in continuous integration and delivery pipelines. A new version, Helix TeamHub, adds additional features to solve challenges for developers working with multiple Git repositories at scale, such as improved collaboration, administration, and integration capabilities.
Visit www.perforce.com
Editor's Notes
Title perhaps a bit misleading … but try to keep it within that frame if possible.
Keep this short, maybe just give people a glance, it’s just to provide a quick idea of who’s talking.
Project Management & Issue Tracking
Hansoft (complex project with lots of dependencies and tight deadlines, particularly popular in the media/gaming world)
HTH Issues (lightweight agile, tightly integrated with workflow)
Helix ALM (Requirements focus, particularly useful for regulated efforts)
Integrations (JIRA, Bugzilla, Redmine, etc)
Developer Collaboration
Swarm (for Classic, and Streams)
Helix TeamHub (for Git, Git @ Scale)
Version Control & Repository Management
Helix Core (Classic, Streams, Graph Depot can be used without TeamHub)
Helix TeamHub (Graph Depot)
20 years is a long time to work on scale and performance issues, and we have used the time well.
It’s an honor to serve our extensive list of high-profile customers, including Samsung, Qualcomm, Nvidia, Pixar, the New York Stock Exchange, EA (and most other game developers), etc.
However, so far we haven’t had a great story for small teams.
Also, our first wave of Git support was a great engineering effort and used by many clients to manage some of their challenges. But it was a bit cumbersome and didn’t scale as far as we’d like.
GitFusion: A translation layer allowing users to work both in Git and Helix against the same repository.
GitSwarm: A modified GitLab version with support for GitFusion.
It worked fairly well, even amazingly well in some respects, but added more complexity than we like. Translation is not free, obviously, and teams generally work only in one VCS format (at least per component/service/product).
Thus, we are moving away from both and I’m just about to tell you what we have done and are doing instead.
Now why would we want to do that?
Let’s compare the two and get into what this can bring to the table!
Git is very popular and we use it for some projects too to make sure we have expertise in the area.
It isn’t really a version control system though, it’s more of a snapshot tracker.
The Helix Classic data model provides a lot more detail about what’s going on than Git’s repository snapshots.
Helix’s core tenets are to (1) never lose or destroy data and (2) operate quickly.
Some distinctive features:
Work on only partial and/or multiple repositories
File locking
Know who’s working on what
Almost crazy granular permissions
Amazing auditability
Massive scale (size, number of files, change rate, replication)
Change identifiers are sequential; combined with the above, this makes it easier to manage complex products with hundreds of codelines and massive merges at the end.
Perforce organizes data in Depots and Git, in this context, is a new type of Depot that can hold Git repositories.
Many p4 commands can operate on both, even at the same time.
For builds you can, for example, grab whatever you want from anywhere on the server.
However, we are not going back to protocol/data translations, so users will submit/push to one model at a time.
We'll touch on some unique features later ...
Normal Git clients and management solutions still work.
Helix makes sure nothing is stale by checking and updating caches as needed.
Helix4Git also leverages our “normal” replication and scalability features.
We can cover more of this if you have interest and there's time.
Generally, when scalability is an issue customers make sure they have the data accessible in their LAN so 80% is realistic in most real-world scenarios.
But, if that isn't the case, 40% isn't bad either.
18% less data might seem unimportant to some but it really isn't when you're talking about high-end petabyte volumes.
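To put the 18% figure in perspective, here is a minimal sketch of the arithmetic. The 1 PB volume is an illustrative assumption, not a customer number, and the helper name is made up for this example.

```python
def data_saved_tb(total_tb: float, percent: float = 18) -> float:
    """Return the storage avoided, in terabytes, for a given percentage reduction."""
    if total_tb < 0 or not 0 <= percent <= 100:
        raise ValueError("total_tb must be >= 0 and percent within 0..100")
    return total_tb * percent / 100


# Assumed volume: 1 PB, counted as 1000 TB (decimal units).
print(data_saved_tb(1000))  # 180.0
```

At that assumed size, 18% less data is roughly 180 TB that never has to be stored, replicated, or backed up.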
Perhaps skip this one?
Perhaps skip this one?
I've already touched on this but perhaps explain more what this means and why it is important? ...
As an interesting data point, one of our customers who evaluated Helix4Git expects to reduce their build farm from 400 servers to under 150 when they have it fully in production.
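As a quick check of what that data point means in percentage terms, here is a small sketch. The function name is made up for this example; the only inputs are the 400 and 150 figures quoted above.

```python
def reduction_percent(before: int, after: int) -> float:
    """Return how much smaller `after` is than `before`, as a percentage."""
    if before <= 0 or after < 0:
        raise ValueError("before must be positive and after non-negative")
    return (before - after) / before * 100


# 400 build servers down to 150.
print(reduction_percent(400, 150))  # 62.5
```

Since the customer expects to land under 150 servers, 62.5% is the lower bound of the reduction they are planning for.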
Anybody want to venture a guess?
We obviously need to do something about that, right?
Enter Helix TeamHub ...
We started this effort in 2016 as a new project and decided to accelerate it by acquiring Deveo and integrating it with Helix Core.
By doing so, our customers can leverage the full functionality of Helix Core, Streams, and Graph Depots.
These are some of the unique attributes of Helix TeamHub.
It has features that work for the individual developer, as well as small and large teams.
Build acceleration delivers 40-80% faster builds
One of our customers reported they could go from 400 build servers to 150
Integrations are easy to set up and based on standard webhooks.
Let’s start by listing the pain points that we are trying to address.
Perforce is all about scale and for us that means:
@ Scale = Number of developers, number of locations, number of repos, number of assets, number of commits, number of technologies involved, and rate of change.
As these numbers grow, so do the problems:
- Role of IT: how does IT fit into the picture and how can they help
- Tracking changes on features or releases that span across multiple repositories
- Making the development workflow as smooth and automated as possible
- Continuous integration and other automated feedback loops
And finally, serving the developers’ needs as well as the organization’s needs
Ever since the PC, followed by the Web, clouds, SaaS, PaaS, BYOD, and in general “Software Eating The World”, IT departments have been struggling to respond quickly enough.
New projects are created more often, and teams, particularly distributed ones, cannot wait for IT to get around to it. Self-service is a requirement, not a nice-to-have feature.
We have customers with team/department servers all-over-the-place and it’s causing massive issues with security (hacked cloud instances, data leaks, etc), availability, DR, and performance.
Self service in code hosting and HTH in particular means anyone with proper credentials can create a project and become the owner and thus the administrator of that project
Previously, this and all of the related tasks required filing tickets with IT
Now that person who created the project can invite others to the project
Set permissions based on roles
Create necessary repositories and the project structures
Set up the rest of the tooling around it
Everything happens with a couple of clicks
It sounds very basic, but platforms that can deliver superior performance generally tend to be hard to configure and maintain, or lack flexibility.
The role of the IT organization is more and more becoming
To enable self-service within the development organization and
To make sure the compliance and security requirements are fulfilled
Self-service actually helps in this regard as the number of “self managed” resources diminishes and the self-service platform acts as a single source of truth
Multi-repo in single team:
In the early days we were building monoliths: software consisted of a single component stored in a single repository
-> divide development into components; a good example is backend and frontend, where we already typically split the code into two different repositories
- starting to use libraries and frameworks
-> coming to today, where micro-services architectures are more than common
- biggest percentage of code comes from 3rd party components
-> in the future, we see trends such as serverless emerging
- even smaller components
The number of code lines per component is getting smaller
The number of components is getting larger
- Various build tools used, various artifacts being produced by those build tools, need a way to store and version those artifacts
We rely on an ever-growing number of external dependencies, 3rd party libraries and tools
Up to today, there’s been no ability to manage multiple components seamlessly
Managing both code and the dependencies under one platform is something that bridges the gap between the development tools
If we then think about these problems in a larger context, we face the same problems across the organization:
We have various projects, some of them legacy, the projects have been done in various technologies and use various version control systems
Companies do more and more acquisitions and obtain the IPR, which means the source code and the dependencies
What we really need is a single source of truth for managing the sprawl that happens. Having that single source of truth allows us to
Protect IP
Protect against claims
Enable discovery and reuse of software, components or even source code
Code Review benefits:
- Fewer defects in code, better design
- Improved communication and sharing of best practices
- Education of junior programmers and peer learning
Done right, code reviews can result in
Shorter feedback loops,
Shorter lifetime for bugs and
More maintainable code
And what all this boils down to is, of course, higher customer satisfaction and fewer support calls
- For code reviews, we need proper tools to both conduct and store the review information
- What we need from the code review tools is:
- Easy workflow for developers to conduct the reviews
- And here I see that a “pull request”, by which I mean a contribution from outside the project, happens less often outside the open source community;
code reviews are typically done within the same repo, across branches, not across repositories
The typical workflow is building a feature in isolation inside a branch; there can be numerous feature branches in parallel
When the feature is done, create a review, which will also conduct the merge between the branches after the review is done
In order to make this code review workflow efficient, we want to have multiple layers of control, before we actually merge
In HTH we can set the number of approvals, meaning the number of “other” team members who need to explicitly approve the changes
We can also require that a continuous integration tool, such as Jenkins, gives a successful build status for the feature branch
We can ensure that the feedback given during the code review gets addressed, using task comments that need to be resolved
And finally, we can set default reviewers, who are automatically assigned as reviewers to new code reviews; we can think of default reviewers as owners or guardians of a specific component
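The layers of control described above can be sketched as a simple merge gate. This is an illustrative model only, not Helix TeamHub's implementation or API: the class names, field names, and status strings are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ReviewPolicy:
    """Hypothetical per-repository review settings."""
    required_approvals: int = 1
    require_successful_build: bool = True
    require_tasks_resolved: bool = True
    default_reviewers: frozenset[str] = frozenset()


@dataclass
class Review:
    """Hypothetical state of one code review on a feature branch."""
    author: str
    reviewers: set[str] = field(default_factory=set)
    approvals: set[str] = field(default_factory=set)
    build_status: str = "pending"  # assumed values: pending, success, failure
    open_tasks: int = 0


def open_review(author: str, policy: ReviewPolicy) -> Review:
    """Create a review and pre-assign the default reviewers (never the author)."""
    return Review(author=author, reviewers=set(policy.default_reviewers) - {author})


def blocking_reasons(review: Review, policy: ReviewPolicy) -> list[str]:
    """List every rule that currently prevents the merge."""
    reasons = []
    # Only approvals from "other" team members count.
    others = review.approvals - {review.author}
    if len(others) < policy.required_approvals:
        reasons.append(f"needs {policy.required_approvals - len(others)} more approval(s)")
    if policy.require_successful_build and review.build_status != "success":
        reasons.append(f"build status is {review.build_status}")
    if policy.require_tasks_resolved and review.open_tasks > 0:
        reasons.append(f"{review.open_tasks} unresolved task comment(s)")
    return reasons


def can_merge(review: Review, policy: ReviewPolicy) -> bool:
    return not blocking_reasons(review, policy)
```

For example, with `ReviewPolicy(required_approvals=2)` a review stays blocked until two people other than the author approve, the CI build reports success, and no task comments remain open.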
Code hosting solutions make setting up CI with a team much simpler
Earlier it took custom scripts or a lot of manual work to test that features work well together and not just “on my machine”
CI is the feedback loop that we build on top of the VCS.
It ties to the development workflow giving early feedback often and constantly.
It also acts as a gatekeeper or quality gate during the code review process
Now, as the projects get bigger and bigger, the CI can become the bottleneck.
Especially when we are talking about projects that span across geographical locations
The problem we usually see is that the clone and pull operations take longer and longer.
Usually this is tackled with multi-site replication, and this is a good solution in some cases.
However, when the projects span both across multiple geographical locations, and across tens, hundreds or even thousands of repositories,
we start to experience the problems with clone and pull performance even if we use replication.
In Helix TeamHub Enterprise we have solved this by creating Helix4Git
Helix4Git is basically a reimplementation of the Git backend utilizing the proprietary capabilities of the Helix versioning engine,
but the difference is that you can both manage and interact with multiple Git repositories atomically.
Currently this allows checking out changes 80% faster and, in some cases, reducing the number of build servers required by 75%
In the future, it will allow us to do even more
We divide the reasoning between the developers and the organization around them
For the developer:
We have always wanted to build tools that Developers love to use, after all, we are developers ourselves
One of our key values has been Simplicity, which means getting things done efficiently
Another is extensibility, the ability to build your own tooling around the product. We accomplish this with 100% API coverage, which means everything you can do via the web UI you can also do via APIs
Additionally, we have a lot of hooks that can trigger actions in other tools and services according to actions taken in Helix TeamHub
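The idea of hooks triggering actions in other tools can be sketched with a tiny in-process registry. This is a generic illustration of the pattern, not Helix TeamHub's hook mechanism: the `HookRegistry` class, the event name `review.approved`, and the payload fields are all hypothetical.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class HookRegistry:
    """Maps event names to the callbacks that should run when they fire."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def on(self, event: str, handler: Handler) -> None:
        """Register a callback for an event name."""
        self._handlers[event].append(handler)

    def emit(self, event: str, payload: dict) -> int:
        """Run every callback registered for the event; return how many ran."""
        handlers = self._handlers.get(event, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


# Usage with a made-up event: queue a build whenever a review is approved.
hooks = HookRegistry()
queued_builds: list[str] = []
hooks.on("review.approved", lambda payload: queued_builds.append(payload["branch"]))
hooks.emit("review.approved", {"branch": "feature/login"})
```

In a real setup the callback would typically make an HTTP request to the other tool rather than append to a list.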
If we think about the organization as a whole:
We strive to deliver our promise of a Single source of truth, where all the assets, both code, build artifacts and other digital assets can be found from one place
This ensures that the Compliance and security requirements are met and that the IPR are safe.
Three cloud versions:
Free is for up to 5 users, and includes 1 GB of storage per user.
Standard is for 6+ users at $19/user/year and adds email support.
Premium is $29/user/year and adds SSO, Repo and Branch-level authorizations, Code Search and Collaborator accounts.
Enterprise is $179/user/year and adds:
HA/DR and Higher Performance Build with H4G.
Unique features backed by GD (multi repo CR, replication, …)
You can see the feature breakout and pricing on Perforce.com
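To make the per-user pricing concrete, here is a minimal sketch that computes the yearly cost for a team. It uses only the list prices quoted in these notes; the function and constant names are made up, and it deliberately ignores discounts, storage add-ons, and the 6+ user guidance for Standard.

```python
# List prices from the notes above, in US dollars per user per year.
PRICE_PER_USER_PER_YEAR = {"standard": 19, "premium": 29, "enterprise": 179}
FREE_MAX_USERS = 5


def annual_cost(plan: str, users: int) -> int:
    """Return the yearly cost in dollars for `users` seats on `plan`."""
    if users < 1:
        raise ValueError("users must be at least 1")
    if plan == "free":
        if users > FREE_MAX_USERS:
            raise ValueError(f"free plan is limited to {FREE_MAX_USERS} users")
        return 0
    if plan not in PRICE_PER_USER_PER_YEAR:
        raise ValueError(f"unknown plan: {plan}")
    return PRICE_PER_USER_PER_YEAR[plan] * users


# A 10-person team on each paid plan.
for plan in ("standard", "premium", "enterprise"):
    print(plan, annual_cost(plan, 10))
```

For a 10-person team this works out to $190, $290, and $1,790 per year respectively.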