Parameter Store provides an easy and secure way to manage secrets in AWS. It encrypts secrets at rest with AWS KMS and allows fine-grained access control through IAM roles. Chamber is an open source tool that makes it simple for developers to access secrets from Parameter Store through environment variables or directly in commands. It handles authentication with AWS using AWS-Vault, so humans don't have to manage raw credentials themselves. Together, Parameter Store and Chamber provide encryption, access control, support, auditability, and scalability for secrets management in AWS.
3. Who Am I
- I am Evan. I spoke here last year too.
- I have an engineering and security background.
- Worked at Cloudflare and LastPass
- Recently joined Segment!
4. Last Year Recap: “JWTs in a flash”
- Last year my talk was a cautionary tale about JWTs.
- It ended up ringing true.
- JWTs became widely adopted but are widely disliked.
5. Kevin Burke: Things to Use Instead of JWT.
Scott Arciszewski: No Way, JOSE!
JWT Pushback
7. - Joined Segment. They didn’t have a great
solution there, but wanted something great.
- Tasked me with finding a solution
How I started with secrets management
- Ian Haken (Netflix): Automated Bootstrapping of Secrets & Identity in the Cloud
Secrets management prior talks
9. What are the properties of a good secrets
management tool?
11. - Secrets encrypted at rest & centrally managed
- Access Control to secrets.
- Supported
- Auditable.
- Scalable.
Secrets Management Properties
12. - Lots of ways to skin this cat.
- Lots of cryptography questions here.
- Probably the least important of the bullet
points though.
Encrypted Secrets
13. - Access control implies authentication and
authorization.
- Authentication and authorization implies
identity
- Without identity you will fail at making secure
secrets management.
Access Control
14. - If you want people to use it and not make
mistakes, it needs to be simple.
Easy to work with
24. 1. Vault receives a PKCS7 identity document.
2. Checks the signature.
3. Checks the identity document for the instance ID.
4. Uses the AWS EC2 API to fetch the IAM role of the instance.
Vault: Authenticating on AWS EC2
25. - Using metadata doesn’t work for ECS.
- Uses the brand new AWS IAM Authentication.
- Used to not work at all, so it’s nice that it
works now.
Vault: Authenticating on AWS ECS
26. - Scoping a secret to a specific IAM Role takes
work.
- Doesn’t inherently understand a role name.
- Not well documented how to attach IAM roles
to Vault Roles and policies. Two completely
separate concepts.
Vault: Roles & Policies
27. - Have to provision roles and policies manually.
- Have to “unseal” the password vault.
- Used to not be possible to use ECS.
- How does Vault authenticate with its secret store?
- How do humans authenticate with Vault?
Vault: Problems
28. - I built a service that does EC2 auth for AWS,
and doesn’t have the rest of the problems:
- https://github.com/ejcx/dssss
- It works, but don’t use it. Just shows how vault
could be improved.
dssss
30. - AWS Parameter Store ended up being the
perfect solution for us.
- I don’t hear anyone talking about parameter
store.
- Why?
AWS Parameter Store
31. - Parameter Store can store “secure strings”
- Secure strings are encrypted by an AWS KMS key.
- Access to each Parameter Store secret can be scoped with IAM.
AWS Parameter Store: Intro
32. ✅ Secrets encrypted at rest & centrally managed
✅ Access Control to secrets.
✅ Supported
✅ Auditable.
✅ Scalable.
✅ Easy to work with.
✅ Easy to automate
Security properties of parameter store
35. - Audit logs automagically are part of cloudtrail.
- Cloudtrail is delivered free of charge to S3.
- Because we describe our roles in code, all
changes go through pull-request and review
process.
AWS Parameter Store: Audit logs
36. AWS Parameter Store: Easy to use
- https://github.com/segmentio/chamber
- Chamber is the interface that engineers and services use
to access, manage, and fetch their secrets.
- Chamber fetches secrets and sets an environment
variable
40. Parameter Store: What does chamber do?
- Exec: run the command after --, with env vars set from names fetched from Parameter Store.
- List: show what secrets exist for a given project.
- Write: put a secret into Parameter Store.
41. Parameter Store: How do humans use it?
- We use a tool called AWS-Vault all the time.
- AWS-Vault allows developers to securely authenticate with AWS on the command line.
- We have an aws config set up with dev, staging, and prod.
42. Parameter Store: Chamber example usages
aws-vault exec prod -- chamber list web
aws-vault exec dev -- chamber write web Name hunter2
aws-vault exec dev -- chamber exec web -- ./run.sh
aws-vault exec staging -- chamber write web Name hunter2
43. Thoughts
- My thoughts are that we hit the secrets management lotto.
- We get identity for free. USE IT!
- We have a plan for moving this same setup to kubernetes,
still with IAM roles.
This talk is called secrets management in the cloud
But another name for it could probably be “Don’t use hashicorp vault”
Here is my contact information. You can email me questions at my work email. Evan at segment.com
My twitter is ejcx_ since this is the only thing that matters in infosec these days.
All the code I talk about is available at ejcx and segmentio on github
So who am I?
I’m Evan Johnson.
My talks are all over the place normally. I’ve done talks about web vulnerabilities, breaking stuff, building secure solutions, software engineering. Most people would call what I’m doing security engineering or application security.
I might be uniquely well-qualified to talk about it, having worked at LastPass, a password manager for people. “Secrets management” is just a password manager for your code!
I’m working at segment. We just publicly announced our Series C about two weeks ago. Based in San Francisco. Lots of fun. Great folks. If you watch this and you feel like we are on the same wavelength then reach out to me
I spoke at the Crypto Village last year. When I sat down and began working on my slides I started to think about my talk from last year. I talked about JWTs and the JavaScript Object Signing and Encryption standards.
The talk was admittedly really atrocious. I wasn’t prepared and I remember looking down at the time and only 14 minutes had gone by and I remember thinking “only 10 more minutes of this”.
The general message of that talk ended up being spot-on. The message was that people do not fully understand JOSE and JWT and are making a lot of mistakes. They also don’t understand the crypto details and make really basic mistakes like mistaking a Signed JSON Web Token for an encrypted JSON Web Token.
I said the spec was a mess. Over the last year the people have spoken!
I went on hacker news and these were the two most popular blog posts of the year about JWTs
I think the two blog posts are really neat. Kevin is a great engineer I’ve worked with and is definitely not a cryptographer. Scott is a great engineer too, but has a different skillset from Kevin. He’s a cryptographer first and focused on security engineering.
With posts written by both Kevin Burke and Scott Arciszewski, cryptographers and developers kind of see eye to eye that JWTs are dangerous and no good.
I think anti-JWT fever really caught on this last year.
So. Last year I feel like I predicted the future with my talk about JOSE and JWTs. People are choosing to avoid them when possible.
This year I really want to predict how people will start doing secrets management.
I want to simplify secrets management for people because right now they don’t really know what they are looking for in a solution.
I joined Segment and they did not have a great solution to secrets management.
About segment’s infrastructure. It’s almost 100% dockerized services running on ECS. At any given time we have over 1000 running services or tasks.
How secrets management worked before me: they had some special config repos where secrets sat in plaintext and were passed as environment variables to running containers. They wanted something really top notch they could be proud of.
Cloudflare had a solution to secrets management that worked really well for their CDN nodes: employees could encrypt secrets with public keys that were then decrypted on the CDN nodes. But this was the extent of my familiarity with secrets management.
My experience was really really limited in this department and so I really had to dive in and figure this space out.
I really think the main talk to watch on this subject is this talk by Ian Haken. Had a lot of really great ideas that I used to get started in this space.
So I set out to design us a secrets management solution. I think when you start out building any engineering solution, it is important to ask yourself at the beginning what security properties the desired solution will have. It’s a simple way to design security directly into the product
These properties might be things like
Revocation if you’re talking about web sessions.
Forward secrecy if you’re building a communication protocol.
Your desired properties might be different, but I started thinking of what the desired properties were for ANY secret store.
I came up with the following properties
Just kidding this would be way too easy!
These are the main desired properties that I wanted in the secrets management solution.
Just to quickly go over them.
I think it goes without saying a secrets store should encrypt the data it is storing. We should be encrypting the world
Access control to the secrets is the whole reason for centrally managing your secrets. Denying access and granting access based on identity.
An unsupported security product is usually a very bad thing. All software has bugs.
Scalable. Like I said we had thousands of services. Really wanted the secrets service to not be a bottleneck. To not go down at 4am and suddenly new services fail to deploy because it can’t fetch something stupid like a datadog api key.
Developer experience I think is really important to making a secrets management solution. Segment has a lot of front end engineers and full stack node developers. And is changing and growing to become an enterprise. If we want buy in from all engineers then the “right way” that we come up with for secrets to be present in a container needs to be the easiest way possible.
The details of the cryptography matters the least out of these.
Do you encrypt each secret with a separate key?
Do you use public key crypto? Symmetric key?
I don’t really care about these details in this talk.
I’m fine with one key unlocking the entire secrets vault.
If you’re breaking new ground to encrypt your secret store then you’re doing it wrong.
This might be one of the most important slides in the deck.
The vocabulary word access control implies more than may meet the eye.
A prerequisite of access control with secrets management is identity.
If you don’t have a way for your code to identify itself then you will not be able to set up secrets management in a secure way.
You can punt on identity and just have everything be able to access everything, or break your secrets management solution into different trust zones, where everything in each trust zone has access to each secret in that trust zone, but generally, you won’t have a great time.
This should go without saying.
Usable security is a crazy important initiative that I really wish people would talk about again
If you have a secrets management solution and hundreds of engineers, you need it to be simple enough for anyone to use, understand, and not need to know the details of.
Engineers should not need to be trained to use your secrets management solution or whatever you build. Security should be inherent and you should invest making the tools you use really high quality, because it pays off in time spent incident-responding and time spent training people.
The rest of the features are a lot less security impactful.
It’s obvious that a supported solution is necessary here. Nobody wants vaporware. All software has bugs. In security software, you need bugs patched.
Being able to audit who accesses secrets. When they are modified.
Scalable, it should be easy to run thousands of services without having a bottleneck.
At segment we had the additional requirement
Segment runs pretty much all of their code within docker containers on ECS.
It was important for us to be able to handle services that restart often and do not live on snowflake boxes that are hand-configured or spun up with some playbook.
We wanted each container to be able to handle getting the secrets on their own.
So it was time to build an MVP and I had heard about a lot of people using vault.
I decided to try to learn what I could and spin up a vault instance for segment’s secrets.
This is their logo.
I think the logo is a V shaped safe with a keypad on it. Not a bad logo.
It’s a hashicorp product!
How does it work?
So this is my final report and thoughts about Vault.
One important takeaway from this talk that I hope I convey to you. If you are building cloud native software. If you are running all on AWS or the cloud, look for solutions that are cloud native. Built specifically for the cloud. Don’t even look at Vault. If you’re on AWS and have a lot of stuff automated in a modern dev shop, vault is no good.
I’ll tell you why.
I think it’s pretty obvious that vault is supported, auditable, and scalable.
They offer a “high availability mode” to run multiple vault servers at once in case one dies.
There are people who respond when you complain about it on twitter. Their CEO and I have had some banter back and forth about it.
It offers audit logs and you can go in and see what policies are set up.
I think the problems are the ease of use, access control that is a little questionable, and, biggest of all, that it is extremely difficult to automate.
I feel like to actually convince you not to use Vault I really have to explain how it works: it looks great when you first hear about it, but then it goes downhill with the amount of things you need to do. I’m going to explain a few things about how Vault works.
Vault is a server.
It is a service that runs in your infrastructure.
What it is actually doing is living in your infrastructure, reading data from an encrypted data store, decrypting it, and giving it to your running services.
The encrypted data store can be AWS S3, Consul (another HashiCorp product), or just a local directory instead of something remote.
This brings us to the first problem. The vault server needs to authenticate with whatever data store is chosen otherwise, anyone can delete or modify the secret store.
But how can vault securely store credentials it needs to start? Is the answer more vaults? No. It isn’t more vaults.
So we are just spinning up Vault and we hit a roadblock. Let me come back to how we solve this later, because we will hit the problem again. The solution isn’t pretty.
So we have a server. But the server talks to a password store. This password store needs to be initialized.
When you send an init command to vault it returns back two things. A root api token that is all powerful and a set of unseal keys or a single unseal key, depending on how it is configured.
This arrives at our first problem again.
We now need to manage these three secrets ourselves.
We can revoke the root token.
Then we are left with unseal keys to decrypt the vault.
An unseal key, a root token (though we can deactivate the root token and not manage it), and credentials for the data store. This is all just to start Vault.
When the vault server starts up it is in a “sealed” state.
Meaning it doesn’t have the crypto keys it needs to decrypt the secrets.
In order to make vault usable, it needs to be unsealed.
The set of Unseal keys is by default 5 keys and you need 3 of them to unlock the chain.
So they imagine that to unlock your vault you will have people type these secrets in and decrypt.
This is incompatible with an automated world, so I set it to be only 1 key with a threshold of 1 to unseal
Under the hood they accomplish this with shamir’s secret sharing algorithm to derive the master key. The threshold is configurable.
This sounds cool but trust me it doesn’t sound cool when 5 keys were issued, one person is on a plane, one person is on vacation, one person isn’t picking up their phone, and none of your services can start.
Not automated
This part is critical for access control of secrets.
There are many many many ways to authenticate with vault.
An exhaustive list is:
App ID, AppRole, AWS, GitHub, LDAP, MFA, Okta, RADIUS, TLS Certificates, Tokens, Username & Password
Something interesting: many of these require secrets that end up baked into source code or baked into images. It ends up not solving the problem but pushing it around if you choose some of these methods, unless you have a way to render a TLS cert in an ephemeral way.
I think something else is pretty interesting here. This set of auth mechanisms is not focused on services. It’s focused on people, organizations, and stuff like that.
For us, we cared about authentication on aws.
The metadata service is a special IP (169.254.169.254) that runs an HTTP server that gives you basic metadata about your instance.
The metadata service provides a signed PKCS7 document that contains the instance ID.
It is available magically. It’s a lighttpd server that runs on the hypervisor and serves you your identity document. All local.
How Vault authenticates:
The metadata service provides a signed PKCS7 document.
Vault checks the signature and makes sure it was signed by AWS.
It gets the instance ID out of the document.
It inspects the instance using the AWS EC2 API and fetches the IAM role.
It issues a token if the role is expected.
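To make the identity-document step concrete, here is a minimal Go sketch of decoding the (already signature-verified) instance identity document to pull out the instance ID. The JSON field names match the documented EC2 identity document format; the function and type names are my own, not Vault's.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// identityDocument models the subset of the EC2 instance identity
// document that the auth flow cares about.
type identityDocument struct {
	InstanceID string `json:"instanceId"`
	AccountID  string `json:"accountId"`
	Region     string `json:"region"`
}

// parseIdentityDocument decodes the JSON payload extracted from the
// signed PKCS7 document. Signature verification must happen first;
// this only handles the decoding step.
func parseIdentityDocument(payload []byte) (identityDocument, error) {
	var doc identityDocument
	err := json.Unmarshal(payload, &doc)
	return doc, err
}

func main() {
	sample := []byte(`{"instanceId":"i-0abc123","accountId":"123456789012","region":"us-west-2"}`)
	doc, err := parseIdentityDocument(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(doc.InstanceID) // i-0abc123
}
```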
I think this is really cool, so I went down the rabbit hole and was asking people at AWS “how does the metadata service work?” One of the people I talked to was convinced there is a security hole in it somewhere; he just couldn’t get it to work.
This is just a side note.
AWS ECS auth used to not work at all. Just shipped in May
Running containers. There was a very awkward moment at work where I built all this Vault stuff, ran it all, and expected it to work, but my container couldn’t get its secrets from Vault.
The cluster was bigcluster-mywebsite, and it authed as bigcluster instead of as mywebsite.
Turned out that my container was trying to do ec2 auth. I was new to AWS and didn’t realize there was a difference and didn’t know about this gotcha.
Ended up completely scrapping vault when I hit this gotcha.
I can’t remember which is which right now, but to actually scope a secret to a specific service or piece of code that runs, you must create the roles and policies in vault.
This is honestly a mess. There is no automation.
An IAM role is created for a new service in AWS. How does the task role get propagated to Vault?
Vault doesn’t automatically fetch them.
IAM Roles and policies don’t just automatically exist in vault. Vault doesn’t know anything about your hundreds of IAM roles. Someone needs to tell vault about each one, and provide a scope to scope secrets to each IAM role.
To fix this I ended up needing to build a service that is running all the time and inspecting AWS IAM, to look for new roles.
Every time it finds a new role that didn’t have a matching role in vault, it provisions a new role and policy that creates a new “directory” in vault for those secrets
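The core of that reconciliation service is just a set difference between IAM role names and Vault role names. A hedged Go sketch with hypothetical names, assuming the two lists have already been fetched from the AWS IAM API and the Vault API:

```go
package main

import "fmt"

// missingVaultRoles returns the IAM role names that have no matching
// Vault role yet -- the set difference iamRoles \ vaultRoles. The
// reconciler would then provision a Vault role and policy for each.
func missingVaultRoles(iamRoles, vaultRoles []string) []string {
	seen := make(map[string]bool, len(vaultRoles))
	for _, r := range vaultRoles {
		seen[r] = true
	}
	var missing []string
	for _, r := range iamRoles {
		if !seen[r] {
			missing = append(missing, r)
		}
	}
	return missing
}

func main() {
	iam := []string{"web", "billing", "ingest"}
	vault := []string{"web"}
	fmt.Println(missingVaultRoles(iam, vault)) // [billing ingest]
}
```

The real service would loop on this, sleeping between passes, and call Vault's API to create the role and policy for each missing entry.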
This means that out of the box, vault isn’t completely automated.
There were numerous issues with vault that caused me to pass on it.
I think it really became clear that vault was not “cloud native”.
It has a ton of features and ways to authenticate, backends to store data in, fancy key sharing, but it doesn’t have the basic stuff that people need.
If you roll out vault in a non-automated way, someone at the company always needs to be a vault person.
We didn’t even get the opportunity to run into problems about how employees would add and manage secrets in Vault in a usable way, but I believe I would have needed to build a web UI that talks to the Vault API. Vault sells this, but why pay for something you don’t want to use.
Everyone has uber for cats. Or Instacart for people who love bikes.
I built “vault but cloud native”.
Vault is a lot of code and dssss is not. It’s worth reading just to learn.
It’s called dssss and it’s on my GitHub. You don’t really want to use this, but it’s kind of neat to read the code.
I will show you how to actually store your secrets in a moment.
This does PKCS7 auth of each IAM role, and knows about each role automatically.
It doesn’t need special secrets. Instead it encrypts all the bootstrapping secrets with KMS and stores them.
It just runs without configuration.
BUT! This isn’t the right way to do secrets management in the cloud. It isn’t battle tested. It’s a cool project but not cut out for prime time.
So, what was the right way to do secrets management for us?
I kept thinking: why do we need these tokens? We have roles for identity and can auth by role. Why do we need anything more than KMS and IAM?
We discovered AWS Parameter Store.
We actually settled on AWS parameter store.
AWS Parameter store is a way to store strings in AWS.
It ended up being the perfect fit for us.
If you are running all on AWS and need secrets management. Don’t do any more looking. Just use parameter store.
It’s fairly new and not well documented. But it has everything you need using the API.
Parameter Store stores plain strings, but it can also store “secure strings”. It stores other non-secret stuff too.
Secure strings though are special. Encrypted with an AWS KMS Key that you specify.
Access to each parameter store secret can be granted or denied using IAM.
This is something I love.
Parameter Store doesn’t need to mess with these awful tokens. Instead it is just pure AWS IAM roles.
I want to say that I only have good things to say about AWS parameter store.
It’s amazing and I personally think if you are on AWS and not using it, you are 100% doing it wrong.
Some of these I will talk more about but some of them I won’t
Specifically I think it goes without saying that this is supported and scalable. AWS offers this product and will support anything you write them a check for.
Terraform is an “infrastructure as code” project. Also built by hashicorp.
Made by the same people who makes vault. Provisions infrastructure in aws.
All it does is configure things in AWS as code. Very useful. Pretty buggy but better than clicking around by far.
This is from our terraform configuration that all services use.
We use terraform to define all services in our infrastructure.
If you’re sleeping. This is the slide with all the security in it.
This is the actual IAM configuration we give to all AWS Services we run.
This is where access control to secrets is enforced.
If a service spins up named web, it has access to secrets that begin with “web.”
All services that get created get their own IAM Task Role.
Each IAM Task Role has access to read secrets that begin with their name.
Each Task role gives each service the ability to decrypt secrets.
If a service tries to access secrets of a different services name, they have the wrong IAM permissions and end up not finding any of the secrets they wanted.
So. Parameter Store is key-value. Every secret has a name and a value.
The name of the service prefixes the name of the secret, with a dot.
What ends up happening is the website’s IAM Role is provisioned to allow it to access all secrets that begin with the word website.
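The prefix convention above can be captured in one line of code. A minimal Go sketch, purely illustrative — the real enforcement happens in the IAM policy, not in application code:

```go
package main

import (
	"fmt"
	"strings"
)

// mayRead mirrors the IAM scoping rule: the task role for a service
// may read a parameter only when the parameter name starts with the
// service name followed by a dot.
func mayRead(service, paramName string) bool {
	return strings.HasPrefix(paramName, service+".")
}

func main() {
	fmt.Println(mayRead("website", "website.stripe_secret_key")) // true
	fmt.Println(mayRead("website", "billing.stripe_secret_key")) // false
}
```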
All audit logs end up in cloudtrail
Cloudtrail flows to S3.
We store all audit logs forever in S3.
Additionally, because all of our changes flow through Terraform code, we are protected from things like a service changing its name to get new permissions and access to new secrets.
These changes would be caught during code review and the pull-request process.
This is all fine and dandy but we still need a way to talk to parameter store.
This is kind of the missing link that is left.
How do humans and services interact with parameter store?
Segment built and open sourced a very small utility written in golang.
It’s called chamber. It is meant to be usable, secure, and one size fits all for containers and services that use parameter store.
It uses the AWS Golang SDKs to communicate with parameter store.
Here’s a way to conceptualize what is happening and how chamber fetches secrets for our services.
When you run chamber and tell it to fetch secrets for website.
It will fetch all secrets that begin with the word website.
It takes the portion after the dot and sets an environment variable of that name to the secret’s value.
So in Parameter Store we have website.stripe_secret_key with the value “seven”. Chamber will set that in the env.
Next, chamber does an exec and runs run.sh
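That fetch-and-rename step can be sketched in Go. This is an illustration of the convention described above, not chamber's actual code — the function name is my own, and real chamber's key format may differ:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// envName converts a Parameter Store key such as
// "website.stripe_secret_key" into the environment variable name to
// set: strip the "service." prefix and upper-case the remainder.
func envName(service, key string) string {
	return strings.ToUpper(strings.TrimPrefix(key, service+"."))
}

func main() {
	key, value := "website.stripe_secret_key", "seven"
	os.Setenv(envName("website", key), value)
	fmt.Println(os.Getenv("STRIPE_SECRET_KEY")) // seven
}
```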
This is before instrumenting a project with chamber.
They really just say chamber exec and chamber does the rest.
Chamber handles signals like sigterm and stuff as well, so that the service can gracefully shut down.
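The wrapper pattern — start the child command, forward termination signals so the service can shut down gracefully — can be sketched in Go. This illustrates the idea only; chamber's real implementation may differ (for example, it can exec(2) directly, in which case the kernel delivers signals to the service itself):

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"os/signal"
	"syscall"
)

// runWithSignals starts the child command and forwards SIGINT and
// SIGTERM to it, so the wrapped service can shut down gracefully when
// the wrapper receives a termination signal.
func runWithSignals(name string, args ...string) error {
	cmd := exec.Command(name, args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Start(); err != nil {
		return err
	}
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
	done := make(chan struct{})
	go func() {
		for {
			select {
			case s := <-sigs:
				cmd.Process.Signal(s) // relay the signal to the child
			case <-done:
				return
			}
		}
	}()
	err := cmd.Wait()
	signal.Stop(sigs)
	close(done)
	return err
}

func main() {
	if err := runWithSignals("echo", "graceful shutdown demo"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```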
This is after instrumenting a project with chamber.
For services, it is obvious how they use chamber.
They have an IAM role, and the AWS SDKs that chamber uses automatically try to get AWS credentials using the AWS Security Token Service.
Humans, though. Humans need a program called AWS-Vault, which securely stores their account’s AWS credentials in the Apple keychain.
When they run a chamber command, they run aws-vault first.
We have several AWS accounts for dev, staging, and production so here is how it ends up looking when our engineers add things to the parameter store.
`aws-vault exec development -- chamber list website`
These are four examples of real commands we would run with aws-vault
This first example is listing the secrets in production that web will have access to.
The next one is adding a new secret for web named “Name”, with the secret being the word “hunter2”.
The last example is how programs are run locally. You can authenticate as yourself with our dev account, fetch all of dev’s secrets and stuff them in your environment, and then run the run.sh script.
There’s a lot of things that engineers can do to solve their own problems with this set up.
I have not heard of anyone with a more usable and more secure scheme than this.
We rely on Amazon so some of the hairy parts are done for us.
Especially identity. Every service has a task-role and without this it’s not possible.
This was pretty amazing to me because I came to Segment from Cloudflare and LastPass. Both companies ran 100% of their own infrastructure and things like service identity and KMS were pretty mind blowing to me.
It’s not all cookies and cream where we turned this thing on in one afternoon. Some services didn’t have a task role because they were the ugly stepchildren at segment. Services that just had been sitting around and running well, not requiring maintenance.
We moved what we needed to the newer cluster so they all had a task role, and then automatically had scopes to the Parameter Store paths they needed.
People are afraid of “vendor lockin”. It’s a real thing.
We are planning to move some of our infrastructure to kubernetes. We have a plan for this to work!
Kube2IAM is a project that should mean our secrets storage can stay on AWS, while our services run on kubernetes with IAM roles.
So we can be half on kubernetes, half off of kubernetes, but still using chamber and our existing AWS toolset.
Next year I hope to see people on AWS ditching Vault. You don’t need it and it’s a lot of work.