Dealing with large datasets and tight security can make it difficult for a data science team to finish their work. A shared computational environment that scales for big data, and a single place to establish security protocols, make it much easier. Enter JupyterHub.
JupyterHub is a computational environment that provides shared resources to a team of data scientists. Each team member can work on their own tasks while accessing common data sources and scaled computational resources with minimal DevOps experience.
Don't want to set this up yourself?
JupyterHub installations can be complex to set up and even more complex to manage. If you want a quicker solution for your team, consider Saturn Cloud Hosted Organizations or Saturn Cloud Enterprise at www.saturncloud.io.
Setting up JupyterHub on AWS
1. Setting up JupyterHub on AWS
A step-by-step guide to installing JupyterHub for your
organization on your internal systems.
Saturn Cloud
3. Dealing with large datasets and tight security can make it difficult for a data science team to get their
work done. JupyterHub provides a shared computational environment in which each team member can work on their
own tasks while accessing common data sources and scaled computational resources with minimal DevOps experience.
The official JupyterHub documentation covers much of what follows, but it often points users to
other resources, which can be disorienting. In the following tutorial, we’ve outlined the necessary
steps to get started with JupyterHub on AWS EKS.
4. Compare and Contrast JupyterHub Versions
The Littlest JupyterHub (TLJH) is easy to set up, but probably not suitable as a long-term production data science
environment. That said, TLJH is a great way to get an understanding of how JupyterHub works and to get a
feel for its user interface. Zero-to-JupyterHub (ZTJH) is a containerized solution, so all users work in
isolation from each other, and any particular set of work can be run on arbitrary hardware. This allows users to
get the right amount of CPU, RAM, and GPUs for their particular work, and avoids the scenario where one
user’s work could impact another’s.
If you need to scale to potentially thousands of users, or if you prefer to have more control over how your
JupyterHub is administered, then a containerized solution such as ZTJH is the better option. The
Littlest JupyterHub is an environment that runs on a single server and, as the name implies, is the smallest and
simplest way to get started.
5. Compare and Contrast JupyterHub Versions
When choosing an Amazon Machine Image, note that the official documentation references Ubuntu 18.04 LTS; feel free to
choose the more recent Ubuntu 20.04 LTS version instead. When choosing an instance type, keep in mind how
many users plan to work on the platform simultaneously. Once TLJH is up and running, navigate to the
JupyterHub endpoint.
From there, you can add users and install conda/pip packages.
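For the instance-type choice mentioned above, a back-of-the-envelope memory estimate can help. The sketch below assumes a simple rule of thumb: concurrent users times memory per user, plus a fixed overhead for the hub itself. The function name tljh_memory_mb and the 128 MB default overhead are assumptions made for illustration, so check them against the official sizing guide before picking an instance.

```shell
# Rough memory estimate for a TLJH server, in MB.
# Assumption: concurrent_users * mem_per_user_mb + overhead_mb,
# where overhead_mb defaults to 128 if the third argument is omitted.
tljh_memory_mb() {
  echo $(( $1 * $2 + ${3:-128} ))
}

# Example: 20 concurrent users at 1024 MB each.
tljh_memory_mb 20 1024
```

With 20 users at 1024 MB each this prints 20608, i.e. roughly 20 GB, which would point to an instance type with at least that much RAM.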
6. Compare and Contrast JupyterHub Versions
Feature                                             TLJH          ZTJH
Can run on remote servers (AWS)
Run in a containerized environment (Kubernetes)
Ability to auto-scale
Ability to add preferred authentication providers
Ability to enable HTTPS
Production ready
Max number of users                                 up to ~100    up to 1000+
7. The Littlest JupyterHub on AWS EC2
When choosing an instance type, keep in mind how many users plan to work on the platform
simultaneously. JupyterHub offers a guide on how to properly size your instance. When configuring
the instance details, remember to add an admin user to the JupyterHub install script.
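The install script with an admin user can be sketched as follows. This assumes the TLJH bootstrap installer and its --admin flag as described in the TLJH installation docs; the helper tljh_user_data and the user name ztjh-admin are hypothetical, chosen only for illustration.

```shell
# Print an EC2 user-data script that installs TLJH and
# creates one admin user. $1 is the admin user name.
tljh_user_data() {
  printf '#!/bin/bash\n'
  printf 'curl -L https://tljh.jupyter.org/bootstrap.py | sudo python3 - --admin %s\n' "$1"
}

# Example: print the script for a hypothetical admin user.
tljh_user_data ztjh-admin
```

The printed script is what would go into the "User data" field when configuring the instance details, so that the installer runs on first boot.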
Zero-to-JupyterHub on AWS EKS
Whereas TLJH runs on a single EC2 instance, ZTJH can be configured to run on a Kubernetes
cluster so that your team can scale computations and handle very large datasets. The setup is more
involved, but it allows for broader customization.
8. Create your Amazon EKS cluster
1. Log into the AWS Management Console and make sure you’re in the AWS Region you would like your
AWS resources created in. If possible, they should be close to your users.
2. Create an EC2 VPC with both public and private subnets by navigating to the AWS CloudFormation
tool.
NOTE: Creating a VPC with both public and private subnets is recommended for any production deployment.
Specify template
Copy this CloudFormation template URL and paste it in the “Amazon S3 URL” field. This template creates both
public and private subnets; don’t let the name confuse you.
https://amazon-eks.s3.us-west-2.amazonaws.com/cloudformation/2020-10-29/amazon-eks-vpc-private-subnets.yaml
Click “Next”.
10. Create your Amazon EKS cluster
Configure stack options
Use the default options and click “Next”.
Review
This last page allows you to review the configurations you have made. From here, click “Create” and let
CloudFormation create the VPC for you. This shouldn’t take more than a few minutes.
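The same VPC stack can also be created from the command line instead of the console. The sketch below assumes the AWS CLI is installed and configured with credentials; the helper name create_vpc_stack is hypothetical, and the stack name ztjh-vpc follows the naming used in this guide.

```shell
# Template URL from the "Specify template" step.
TEMPLATE_URL="https://amazon-eks.s3.us-west-2.amazonaws.com/cloudformation/2020-10-29/amazon-eks-vpc-private-subnets.yaml"

# Create the VPC stack ($1) from the template URL ($2) using the
# template's default parameters, then block until CloudFormation
# reports that stack creation has completed.
create_vpc_stack() {
  aws cloudformation create-stack --stack-name "$1" --template-url "$2" &&
    aws cloudformation wait stack-create-complete --stack-name "$1"
}

# Usage (creates real AWS resources, so it is left commented out):
# create_vpc_stack ztjh-vpc "$TEMPLATE_URL"
```

Wrapping the two calls in a function keeps the block safe to source: nothing is created until the function is called explicitly.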
3. Create an EKS cluster IAM Role by navigating to the IAM > Roles page.
Select trusted entity
Select “AWS Service” as the “Trusted entity type”.
Then search for “EKS” under the “Use case”.
Select the radio button next to “EKS cluster”.
12. Create your Amazon EKS cluster
Add permissions
Confirm the AmazonEKSClusterPolicy permissions are attached. No other permissions are needed.
Click “Next”.
13. Create your Amazon EKS cluster
Name, review, and create
Give your EKS role a name that clearly indicates what kind of resource it is (i.e. an EKS role) so
that it is easily identifiable in later steps. Then click “Create” and your role should be instantly available. In
our case, we gave it the name ztjh-eks-role.
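For reference, the console steps for the role correspond roughly to the AWS CLI calls below: a trust policy that lets the EKS service assume the role, plus the AmazonEKSClusterPolicy managed policy. This is a minimal sketch assuming the AWS CLI is configured; the helper names and the file name eks-trust-policy.json are hypothetical.

```shell
# Write a trust policy to the file $1 that allows the EKS service
# to assume the role.
write_eks_trust_policy() {
  cat > "$1" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "eks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Create the role ($1) from the trust policy file ($2), then attach
# the AmazonEKSClusterPolicy managed policy to it.
create_eks_role() {
  aws iam create-role --role-name "$1" \
    --assume-role-policy-document "file://$2" &&
    aws iam attach-role-policy --role-name "$1" \
      --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
}

# Usage (creates real AWS resources, so it is left commented out):
# write_eks_trust_policy eks-trust-policy.json
# create_eks_role ztjh-eks-role eks-trust-policy.json
```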
14. Create your Amazon EKS cluster
4. Navigate to the EKS console and create an EKS cluster.
Configure cluster
Give your EKS cluster a unique and identifiable name. In our case, we gave it the name ztjh-eks.
Attach the EKS role that we created in the previous step. In our case, we would use ztjh-eks-role.
Click “Next”.
15. Create your Amazon EKS cluster
Specify networking
Now, we configure our EKS cluster to use the VPC that we created. In
the “VPC” section, select your VPC. In our case, we would use
ztjh-vpc. The name on the screenshot might seem a little confusing
but this is because the first part is the unique VPC ID and the second
part (ztjh-vpc-VPC) is the name we chose with -VPC appended.
The subnets should be automatically added in the “Subnet” section
once the VPC is selected.
Leave the remaining items as is, and click “Next”.
Configure logging
Keep the default logging configuration settings, and click “Next”.
Review and create
Finally, review your EKS configurations, then click “Create”. It might take
10-15 minutes for the EKS cluster to be created.
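The cluster creation above can be approximated with the AWS CLI as well. The sketch below assumes the AWS CLI is configured and that you have the role ARN and the subnet IDs of your VPC at hand; the helper name create_eks_cluster is hypothetical, and the values in angle brackets are placeholders, not real identifiers.

```shell
# Create the cluster ($1) with the IAM role ARN ($2) in the given
# comma-separated subnets ($3), then wait until the cluster is active.
create_eks_cluster() {
  aws eks create-cluster --name "$1" --role-arn "$2" \
    --resources-vpc-config "subnetIds=$3" &&
    aws eks wait cluster-active --name "$1"
}

# Usage (placeholders in <...> must be replaced with real values;
# left commented out because it creates real AWS resources):
# create_eks_cluster ztjh-eks \
#   "arn:aws:iam::<account-id>:role/ztjh-eks-role" \
#   "<subnet-id-1>,<subnet-id-2>"
```

The wait call blocks while the cluster is being created, which matches the 10-15 minutes mentioned above.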
16. Configure your computer to communicate with your cluster
1. Create or update your local kubeconfig file.
This step assumes you have installed the AWS CLI, one of the prerequisites listed above. It also assumes you have
your AWS credentials stored in the ~/.aws/credentials file.
NOTE: AWS allows you to store multiple credentials (labeled “profiles”) in the ~/.aws/credentials file. If you need to specify which account
these credentials are tied to, run export AWS_PROFILE=<your-profile> before the command below. If you only have one AWS profile, and it’s set
to “default”, you can ignore this note.
aws eks update-kubeconfig --region <region-code> --name <my-cluster>
2. Verify you can communicate with the Kubernetes cluster.
NOTE: Kubernetes allows you to connect with multiple contexts (or clusters). If you need to switch contexts, run kubectl config use-context
<my-context>, where <my-context> is the context that points at your cluster (ztjh-eks in our example). Running kubectl config get-contexts lists the available context names.
kubectl get svc
The output should look like this:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
svc/kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 1m
If you get an error message like could not get token: AccessDenied: Access denied, inspect your ~/.kube/config
and have a look at this troubleshooting guide.
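When access is denied, a useful first check is which AWS identity and which kubectl context are actually in use, since by default EKS grants cluster access only to the IAM principal that created the cluster. A minimal sketch, assuming the AWS CLI and kubectl are installed; the helper name show_cluster_identity is hypothetical.

```shell
# Print the two pieces of information most often behind an
# access-denied error when talking to an EKS cluster.
show_cluster_identity() {
  # IAM user or role behind the current AWS credentials.
  aws sts get-caller-identity
  # Context that kubectl will send requests to.
  kubectl config current-context
}

# Usage (needs valid AWS credentials and a kubeconfig):
# show_cluster_identity
```

If the identity shown is not the one that created the cluster, or the context points at a different cluster, that mismatch is a likely place to start looking.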