Cloud Vedas: aws cheat sheet


AWS DynamoDB Cheat Sheet

DynamoDB is a fast and flexible NoSQL database service for applications that need consistent, single-digit millisecond latency at any scale. It is a fully managed database and supports both document and key-value data models. It is a good fit for IoT, mobile and web gaming, and many other applications.

Quick facts about DynamoDB

Stored on SSD storage.
Spread across 3 geographically distinct data centers.
Eventually consistent reads:- Consistency across all copies is usually reached within a second. Repeating a read after a short time should return the updated data. (Best read performance.)
Strongly consistent reads:- Returns a result that reflects all writes that received a successful response prior to the read.

Table
Items(Like row of data in a table)
Attributes(Like column of data in a table)

Here everything between the braces {} is an item. ID, Name and Phone are attributes, and 1587, "Alan" etc. are their values.

{
"ID" : 1587,
"Name" : "Alan",
"Phone": "555-5555"
}

Two types of primary keys are available:-
Single attribute (think unique ID)
Partition key (hash key) composed of one attribute.

Composite (think unique ID and date range)
Partition key and sort key (hash and range) composed of 2 attributes.

Partition key

DynamoDB uses the partition key's value as input to an internal hash function. The output from the hash function determines the partition (the physical location in which the data is stored).
When the primary key is the partition key alone, no two items in a table can have the same partition key value.

Partition Key and Sort Key

DynamoDB uses the partition key's value as input to an internal hash function. The output from the hash function determines the partition (the physical location in which the data is stored).
Two items in a table can have the same partition key value, but they must have different sort key values.
All items with the same partition key are stored together, in sorted order by sort key value.
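
To make the partition key and sort key behaviour concrete, here is a minimal sketch in plain Python. It is only a toy model: the SHA-256 hash, the NUM_PARTITIONS value and the ToyTable class are assumptions for illustration, since DynamoDB's internal hash function and partition layout are not public.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical; DynamoDB decides the real partition count itself


def partition_for(partition_key):
    """Hash the partition key value and map it to a partition number."""
    digest = hashlib.sha256(str(partition_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


class ToyTable:
    """Toy table with a composite primary key (partition key + sort key)."""

    def __init__(self):
        # partition number -> {(partition key, sort key): item}
        self.partitions = defaultdict(dict)

    def put(self, pk, sk, item):
        # Writing the same (pk, sk) pair again replaces the earlier item.
        self.partitions[partition_for(pk)][(pk, sk)] = item

    def items_for(self, pk):
        # All items sharing a partition key live in one partition,
        # and are returned in sort key order.
        partition = self.partitions[partition_for(pk)]
        keys = sorted(key for key in partition if key[0] == pk)
        return [partition[key] for key in keys]


table = ToyTable()
table.put("user-1", "2024-03-01", {"event": "login"})
table.put("user-1", "2024-01-15", {"event": "signup"})
table.put("user-2", "2024-02-01", {"event": "signup"})
print(table.items_for("user-1"))  # signup first, then login
```

Both "user-1" items share a partition key but have different sort keys, so they are kept together and come back ordered by the sort key.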

Local secondary index

It has the same partition key as the table but a different sort key.
Can only be created when creating a table. It cannot be removed or modified later.

Global secondary index:

It can have a different partition key and a different sort key from the table.
Can be created at table creation or added later.

DynamoDB streams

If a new item is added to the table, the stream captures an image of the entire item, including all of its attributes.
If an item is updated, the stream captures the before and after image of any attributes that were modified in the item.
If an item is deleted from the table, the stream captures an image of the entire item before it was deleted.
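
The three cases above can be sketched as a small function. This is a toy model that follows the description in this post, not the real service: stream_record is a hypothetical helper, and in real DynamoDB Streams the content of each record depends on the StreamViewType chosen for the stream. The eventName values (INSERT, MODIFY, REMOVE) and the OldImage / NewImage field names match the ones used in real stream records.

```python
def stream_record(old_item, new_item):
    """Build a toy stream record from the item before and after a write."""
    if old_item is None and new_item is None:
        raise ValueError("need an old item, a new item, or both")
    if old_item is None:
        # New item: capture the whole item.
        return {"eventName": "INSERT", "NewImage": dict(new_item)}
    if new_item is None:
        # Deleted item: capture the whole item as it was before deletion.
        return {"eventName": "REMOVE", "OldImage": dict(old_item)}
    # Updated item: capture before/after values of the modified attributes.
    changed = {
        name
        for name in old_item.keys() | new_item.keys()
        if old_item.get(name) != new_item.get(name)
    }
    return {
        "eventName": "MODIFY",
        "OldImage": {n: old_item[n] for n in changed if n in old_item},
        "NewImage": {n: new_item[n] for n in changed if n in new_item},
    }


before = {"ID": 1587, "Name": "Alan", "Phone": "555-5555"}
after = {"ID": 1587, "Name": "Alan", "Phone": "555-0000"}
print(stream_record(before, after))
```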

Query:-
A Query operation finds items in a table using only primary key attribute values. You must provide a partition key attribute name and a single value to search for. You can optionally provide a sort key attribute name and value, and use a comparison operator to refine the search results.
By default, a query returns all of the data attributes for the items with the specified primary key(s). However, you can use the ProjectionExpression parameter so that the query returns only some of the attributes rather than all of them.

Query results are always sorted by the sort key. If the data type of the sort key is a number, the results are returned in numeric order. Otherwise, the results are returned in order of ASCII character code values. By default the sort order is ascending. To reverse the order, set the ScanIndexForward parameter to false.

By default a query is eventually consistent, but it can be changed to strongly consistent.
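
As a rough illustration of how these Query options behave, here is a toy version in plain Python working on an in-memory list. The toy_query function and the ID / Date attribute names are hypothetical. Only the ideas (partition key match, sort key ordering, ScanIndexForward, projection) come from the description above.

```python
def toy_query(items, partition_key, scan_index_forward=True, projection=None):
    """Return items whose partition key matches, ordered by sort key.

    Assumes "ID" is the partition key and "Date" is the sort key.
    """
    matches = [item for item in items if item["ID"] == partition_key]
    # Ascending by default; scan_index_forward=False reverses the order.
    matches.sort(key=lambda item: item["Date"], reverse=not scan_index_forward)
    if projection is not None:
        # Keep only the requested attributes, like ProjectionExpression.
        matches = [
            {name: item[name] for name in projection if name in item}
            for item in matches
        ]
    return matches


items = [
    {"ID": 1587, "Date": "2024-03-01", "Name": "Alan"},
    {"ID": 1587, "Date": "2024-01-15", "Name": "Alan"},
    {"ID": 2000, "Date": "2024-02-01", "Name": "Bea"},
]
print(toy_query(items, 1587))
print(toy_query(items, 1587, scan_index_forward=False, projection=["Date"]))
```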

Scan:-
A Scan operation examines every item in the table. By default, a scan returns all of the data attributes of every item. However, you can use the ProjectionExpression parameter so that the scan returns only some of the attributes rather than all of them.

Hope you find this quick look at DynamoDB useful. Do let us know in the comments if you have any query or suggestion.

Today we also want to share some good news: our blog is now included by Feedspot in its list of Top 10 AWS blogs. We would like to thank you all for your help and support in achieving this.

AWS certification exam cheat sheets

AWS certification exams grill you on a vast range of topics and a lot of services. In this post we have consolidated the major services and topics of the different exams so that you can access them from a single location.

The links below give you better info on which topics and services are important for each exam and how best to prepare for them.

Which AWS certification is suitable for me?

How to prepare for AWS Certified Developer - Associate

How to prepare for AWS Certified SysOps Administrator – Associate

How to prepare for AWS Certified Solutions Architect - Associate

How to prepare for AWS Certified Solutions Architect - Professional

Once you have decided which certification you want to pursue, you can take a quick look at each service in these cheat sheets.

Services

AWS EB CLI Cheat Sheet - Elastic Beanstalk

In this post we will discuss the Elastic Beanstalk CLI, called EB CLI.
If you are new to Elastic Beanstalk, it’s recommended that you go through this free AWS Elastic Beanstalk crash course.
If you want to manage Elastic Beanstalk using the traditional AWS CLI, follow this post.
Installation
Follow these guides to install EB CLI on Windows, Linux and macOS.
Get help

eb -h

Initialize eb cli

eb init

It will ask questions:-

Default region
Access key details
Select existing application or create new.
Application name
Platform e.g. PHP, Python etc.
Setup ssh
Select keypair or create one.

Create environment

eb create

Check status

eb status

Check health information

eb health

Check events

eb events

Pull logs

eb logs

Open environment website in browser

eb open

Deploy Update

eb deploy

Check configuration options

eb config

Terminate environment

eb terminate

List environments

eb list

Change current environment

eb use cldvds-env

Below are some other useful commands

eb abort	Cancel an in-progress deployment
eb appversion	Manage EB application versions
eb clone	Create a clone of an environment
eb console	Open the environment in the AWS console
eb labs	Extra experimental commands
eb local	Run commands on the local machine
eb platform	Manage platforms
eb printenv	Show environment variables
eb restore	Rebuild a terminated environment
eb scale	Scale the number of instances
eb setenv	Set environment variables
eb ssh	Connect to an instance via SSH
eb swap	Swap the CNAMEs of two environments
eb tags	Modify environment tags
eb upgrade	Update the platform to the most recent version

The above list was created by referring to the AWS documentation for the Elastic Beanstalk CLI. If you have any query or concern, please feel free to contact us.

AWS EC2 CLI - Cheat sheet

Below is the cheat sheet of AWS CLI commands for EC2.
If you are new to EC2, it’s recommended that you go through this free AWS EC2 crash course.
If you want to know how to install AWS CLI please follow steps on this post
Get help

aws ec2 help

Create instance EC2 Classic

aws ec2 run-instances --image-id ami-xxxxxxxx --count 1 --instance-type t1.micro --key-name MyKeyPair --security-groups my-sg

Create instance in VPC

aws ec2 run-instances --image-id ami-xxxxxxxx --count 1 --instance-type t2.micro --key-name MyKeyPair --security-group-ids sg-xxxxxxxx --subnet-id subnet-xxxxxxxx

Start instance

aws ec2 start-instances --instance-ids <instance-id>

Stop instance

aws ec2 stop-instances --instance-ids <instance-id>

Reboot instance

aws ec2 reboot-instances --instance-ids <instance-id>

Terminate instance

aws ec2 terminate-instances --instance-ids <instance-id>

View console output

aws ec2 get-console-output --instance-id <instance-id>

Describe Instance

aws ec2 describe-instances --instance-ids <instance-id>

Create an AMI

aws ec2 create-image --instance-id <instance-id> --name myAMI --description 'CloudVedas Test AMI'

List images(AMIs)

aws ec2 describe-images --image-ids <ami-id>

List security groups

aws ec2 describe-security-groups

Create security group

aws ec2 create-security-group --vpc-id vpc-1234abcd --group-name db-access --description "cloudvedas db access"

Get details of security group

aws ec2 describe-security-groups --group-names <group-name>

Delete Security group

aws ec2 delete-security-group --group-id sg-1234abcd

List key pairs

aws ec2 describe-key-pairs

Create keypair

aws ec2 create-key-pair --key-name <value>

Import keypair

aws ec2 import-key-pair --key-name keyname_test --public-key-material file:///cldvds/sagu/id_rsa.pub

Delete keypair

aws ec2 delete-key-pair --key-name <value>

Check the networking attribute

aws ec2 describe-instance-attribute --instance-id <instance-id> --attribute sriovNetSupport

Add tags to instance

aws ec2 create-tags --resources i-xxxxxxxx --tags Key=Name,Value=MyInstance

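Launch instance with an additional EBS volume

aws ec2 run-instances --image-id ami-xxxxxxxx --count 1 --instance-type t2.micro --key-name MyKeyPair --block-device-mappings "[{\"DeviceName\":\"/dev/sdf\",\"Ebs\":{\"VolumeSize\":20,\"DeleteOnTermination\":false}}]"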
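
The escaped JSON passed to --block-device-mappings is easy to get wrong by hand. As a small sketch, you can let Python build and quote the same mapping for you. The values below (/dev/sdf, 20 GiB, DeleteOnTermination false) are simply the ones from the command above. The shlex quoting assumes a POSIX shell, so the output uses single quotes rather than backslash escapes.

```python
import json
import shlex

# Same mapping as in the command above, written as ordinary Python data.
mapping = [
    {
        "DeviceName": "/dev/sdf",
        "Ebs": {"VolumeSize": 20, "DeleteOnTermination": False},
    }
]

# Compact JSON, then quote it so a POSIX shell passes it as one argument.
arg = json.dumps(mapping, separators=(",", ":"))
print("--block-device-mappings " + shlex.quote(arg))
```

You can paste the printed option at the end of your run-instances command instead of typing the escapes yourself.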

List EBS volumes

aws ec2 describe-volumes

Check snapshot associated with EBS volume

aws ec2 describe-volumes --volume-ids vol-01c6l3de3v21bd46s

Note:- All the above commands are taken from different AWS EC2 CLI reference guides and collected in one place here. Please run the commands with due diligence, as we won’t be responsible for any mistakes in executing the commands and their consequences. If you have any concern or query, feel free to contact us.

AWS S3 CLI - Cheat sheet

Below is the cheat sheet of AWS CLI commands for S3.
If you are new to S3 it’s recommended that you go through this free AWS S3 crash course.
If you want to know how to install AWS CLI, follow steps on this post.
Get help

aws s3 help

aws s3api help

Create bucket

aws s3 mb s3://bucket-name

Removing bucket

aws s3 rb s3://bucket-name

To remove a non-empty bucket (be extremely careful while running this). This removes all contents of the bucket, including subfolders and the data in them.

aws s3 rb s3://bucket-name --force

Copy object

aws s3 cp mypic.png s3://mybucket/

Copy folder

aws s3 cp myfolder s3://mybucket/myfolder --recursive

(Note: --recursive copies everything recursively, including subfolders.)
Sync buckets

aws s3 sync <source> <target> [--options]

List buckets

aws s3 ls

List specific bucket

aws s3 ls s3://mybucket

Bucket location

aws s3api get-bucket-location --bucket <bucket-name>

Logging status

aws s3api get-bucket-logging --bucket <bucket-name>

ACL (Access Control List)
The following example copies an object into a bucket. It grants read permissions on the object to everyone and full permissions (read, readacl, and writeacl) to the account associated with user@example.com.

aws s3 cp file.txt s3://my-bucket/ --grants read=uri=http://acs.amazonaws.com/groups/global/AllUsers full=emailaddress=user@example.com

AWS Crash Course - EMR

What is EMR?

AWS EMR (Elastic MapReduce) is a managed Hadoop framework.
It provides an easy, cost-effective and highly scalable way to process large amounts of data.
It can be used for many things such as indexing, log analysis, financial analysis, scientific simulation and machine learning.

Cluster and Nodes

The centerpiece of EMR is the cluster.
A cluster is a collection of EC2 instances, also called nodes.
All nodes of an EMR cluster are launched in the same Availability Zone.
Each node has a role in the cluster.

Type of EMR Cluster Nodes
Master Node:- The main node, which manages the cluster by running software components and distributing tasks to the other nodes. The master node monitors task status and the health of the cluster.
Core Node:- A slave node which runs tasks and stores data in HDFS (Hadoop Distributed File System).
Task Node:- Also a slave node, but it only runs tasks and doesn’t store any data. It’s an optional node.
Cluster Types
EMR has two types of clusters:
1) Transient :- Clusters which are shut down once the job is done. These are useful when you don’t need the cluster running all day long and can save money by shutting it down.
2) Persistent :- Clusters which need to be always available, either to process a continuous stream of jobs or to keep data always available on HDFS.
Different Cluster States
An EMR cluster goes through multiple stages as described below:-
STARTING – The cluster provisions, starts, and configures EC2 instances.
BOOTSTRAPPING – Bootstrap actions are being executed on the cluster.
RUNNING – A step for the cluster is currently being run.
WAITING – The cluster is currently active, but has no steps to run.
TERMINATING – The cluster is in the process of shutting down.
TERMINATED – The cluster was shut down without error.
TERMINATED_WITH_ERRORS – The cluster was shut down with errors.

Types of filesystem in EMR
Hadoop Distributed File System (HDFS)
Hadoop Distributed File System (HDFS) is a distributed, scalable file system for Hadoop. HDFS distributes the data it stores across instances in the cluster, storing multiple copies of data on different instances to ensure that no data is lost if an individual instance fails. HDFS is ephemeral storage that is reclaimed when you terminate a cluster.
EMR File System (EMRFS)
Using the EMR File System (EMRFS), Amazon EMR extends Hadoop to add the ability to directly access data stored in Amazon S3 as if it were a file system like HDFS. You can use either HDFS or Amazon S3 as the file system in your cluster. Most often, Amazon S3 is used to store input and output data and intermediate results are stored in HDFS.
Local File System
The local file system refers to a locally connected disk. When you create a Hadoop cluster, each node is created from an Amazon EC2 instance that comes with a preconfigured block of preattached disk storage called an instance store. Data on instance store volumes persists only during the lifecycle of its Amazon EC2 instance.
Programming languages supported by EMR

Perl
Python
Ruby
C++
PHP
R

EMR Security

EMR integrates with IAM to manage permissions.
EMR has Master and Slave security groups for nodes to control the traffic access.
EMR supports S3 server-side and client-side encryption with EMRFS.
You can launch EMR clusters in your VPC to make it more secure.
EMR integrates with CloudTrail, so you will have a log of all activities done on the cluster.
You can login via ssh to EMR cluster nodes using EC2 Key Pairs.

EMR Management Interfaces

Console :- You can manage your EMR clusters from AWS EMR Console .
AWS CLI :- Command line provides you a rich way of controlling the EMR. Refer here the EMR CLI .
Software Development Kits (SDKs) :- SDKs provide functions that call Amazon EMR to create and manage clusters. It’s currently available only for the supported languages mentioned above. You can check here some sample code and libraries.
Web Service API :- You can use this interface to call the Web Service directly using JSON. You can get more information from API reference Guide .

EMR Billing

You pay for the EC2 instances used in the cluster, plus the EMR charge.
You are charged per instance hour.
EMR supports On-Demand, Spot, and Reserved Instances.
As a cost-saving measure, it is recommended that task nodes be Spot instances.
It’s not a good idea to use Spot instances for Master or Core nodes, as they store data and you will lose that data once the node is terminated.

If you want to try some EMR hands-on, refer to this tutorial.

This AWS Crash Course series is created to give you a quick snapshot of AWS technologies. You can check about other AWS services in this series over here .

AWS Crash Course - Redshift

Redshift is a data warehouse from Amazon. It’s like a virtual place where you store a huge amount of data.

Redshift is a fully managed, petabyte-scale system.
Amazon Redshift is based on PostgreSQL 8.0.2.
It is optimized for data warehousing.
Supports integrations and connections with various applications, including Business Intelligence tools.
Redshift provides custom JDBC and ODBC drivers.
Redshift can be integrated with CloudTrail for auditing purpose.
You can monitor Redshift performance from CloudWatch.

Features of Amazon Redshift
Supports VPC − Users can launch Redshift within a VPC and control access to the cluster through the virtual networking environment.
Encryption − Data stored in Redshift can be encrypted; encryption is configured when creating the cluster.
SSL − SSL encryption is used to encrypt connections between clients and Redshift.
Scalable − With a few simple clicks, you can choose vertical scaling (increasing instance size) or horizontal scaling (increasing the number of compute nodes).
Cost-effective − Amazon Redshift is a cost-effective alternative to traditional data warehousing practices. There are no up-front costs and no long-term commitments, and pricing is on-demand.
MPP (Massively Parallel Processing) – Redshift leverages parallel processing: a large number of processors (or separate computers) perform coordinated computations in parallel. This reduces computing time and improves query performance.
Columnar Storage – Redshift uses columnar storage, so it stores table data by column rather than by row. The goal of a columnar database is to efficiently write and read data to and from disk storage in order to reduce the time it takes to return a query result.
Advanced Compression – Compression is a column-level operation that reduces the size of data when it is stored, thus helping to save space.
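
To see why columnar storage and column-level compression go well together, here is a tiny sketch in plain Python. The sample rows and the helper names (to_columns, run_length_encode) are made up for illustration. Run-length encoding is used here as one simple example of a column encoding, and nothing below reflects Redshift's actual storage format.

```python
rows = [
    {"order_id": 1, "country": "IN", "amount": 120},
    {"order_id": 2, "country": "IN", "amount": 80},
    {"order_id": 3, "country": "US", "amount": 200},
    {"order_id": 4, "country": "US", "amount": 50},
]


def to_columns(rows):
    """Turn a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Compress consecutive repeats into (value, count) pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [tuple(pair) for pair in encoded]


columns = to_columns(rows)
# An aggregate over one column only needs to read that column.
total = sum(columns["amount"])
# Repeated values sitting next to each other compress well.
compressed = run_length_encode(columns["country"])
print(total, compressed)
```

A query like SUM(amount) touches only the amount column, and a column with many repeated neighbouring values (like country) shrinks to a few pairs.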
Type of nodes in Redshift
Leader Node
Compute Node
What do these nodes do?
Leader Node:- A leader node receives queries from client applications, parses the queries and develops execution plans, which are ordered sets of steps to process these queries. The leader node then coordinates the parallel execution of these plans with the compute nodes. The good part is that you are not charged for leader node hours; only compute nodes incur charges. In a single-node Redshift cluster there is no separate leader node; the one node performs both roles.
Compute Node:- Compute nodes execute the steps specified in the execution plans and transmit data among themselves to serve queries. The intermediate results are sent back to the leader node for aggregation before being sent back to the client applications. You can have 1 to 128 compute nodes.
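
The leader/compute split can be pictured as a scatter-gather pattern. Below is a minimal sketch in plain Python, where threads stand in for compute nodes. The function names and the choice of three nodes are assumptions for illustration only, and this is not how Redshift is implemented internally.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_node(data_slice):
    """Each compute node returns a partial result for its slice of data."""
    return sum(data_slice), len(data_slice)


def leader_average(values, num_nodes=3):
    """Leader: split the work, run it in parallel, aggregate the partials.

    Raises ZeroDivisionError if values is empty.
    """
    slices = [values[i::num_nodes] for i in range(num_nodes)]
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(compute_node, slices))
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total / count


print(leader_average([10, 20, 30, 40, 50, 60]))
```

Each node only sees its own slice, and the leader combines the intermediate (sum, count) pairs into the final average.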
From which sources can you load data into Redshift?
You can load data from multiple sources, such as:-

Amazon S3
Amazon DynamoDB
Amazon EMR
AWS Data Pipeline
Any SSH-enabled host on Amazon EC2 or on-premises

Redshift Backup and Restores

Redshift can take automatic snapshots of the cluster.
You can also take manual snapshots of the cluster.
Redshift continuously backs up its data to S3.
Redshift attempts to keep at least 3 copies of the data.

Hope the above snapshot gives you a decent understanding of Redshift. If you want to try some hands-on, check this tutorial.

AWS Crash Course - SQS

Today we will discuss an AWS messaging service called SQS.

SQS is Simple Queue Service.
It’s a message queue service which acts as a buffer between message-producing and message-receiving components.

Using SQS you can decouple the components of an application.
Messages can contain up to 256 KB of text in any format.
Any component can later retrieve the messages programmatically using the SQS API.
SQS queues are dynamically created and scale automatically, so you can build and grow applications quickly and efficiently.
You can combine SQS with Auto Scaling of EC2 instances, taking warm-up and cool-down periods into account.
Used by companies like Vodafone, BMW, RedBus, Netflix etc.
You can use Amazon SQS to exchange sensitive data between applications, using server-side encryption (SSE) to encrypt each message body.
SQS is a pull-based (polling) system, so messages are pulled from SQS queues.
Multiple copies of every message are stored redundantly across multiple Availability Zones.
Amazon SQS is deeply integrated with other AWS services such as EC2, ECS, RDS, Lambda etc.

Two types of SQS queues:-

Standard Queue
FIFO Queue

Standard Queue :-

Standard Queue is the default type offered by SQS.
Allows a nearly unlimited number of transactions per second.
Guarantees that a message will be delivered at least once.
But it can occasionally deliver a message more than once.
It provides best-effort ordering.
Messages can be kept from 1 minute to 14 days. The default is 4 days.
It has a visibility timeout window. If a message is not processed and deleted within that time, it becomes visible again and can be processed by another reader.
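
The visibility timeout is easier to follow with a small simulation. This is a toy in-memory queue in plain Python, not the SQS API: the ToyStandardQueue class is hypothetical, and time is passed in as a plain number (the now argument) so the behaviour is easy to trace.

```python
class ToyStandardQueue:
    """Toy queue that hides a received message for visibility_timeout seconds."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, invisible_until]
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self._messages[self._next_id] = [body, 0]
        return self._next_id

    def receive(self, now):
        """Return (id, body) of the first visible message, or None."""
        for message_id, entry in self._messages.items():
            if entry[1] <= now:
                # Hide the message from other readers for a while.
                entry[1] = now + self.visibility_timeout
                return message_id, entry[0]
        return None

    def delete(self, message_id):
        """A reader deletes the message once it has processed it."""
        self._messages.pop(message_id, None)


queue = ToyStandardQueue(visibility_timeout=30)
queue.send("order-42")
first = queue.receive(now=0)     # reader A gets the message
hidden = queue.receive(now=10)   # reader B sees nothing yet
again = queue.receive(now=31)    # A never deleted it, so it is visible again
queue.delete(again[0])
print(first, hidden, again, queue.receive(now=100))
```

Because reader A never deleted the message, it is delivered a second time. That is the at-least-once behaviour of a standard queue.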

FIFO Queue :-

FIFO queue complements the standard queue.
It has a First-In-First-Out delivery mechanism.
Messages are processed exactly once.
The order of messages is strictly preserved.
Duplicates are not introduced into the queue.
Supports message groups.
Limited to 300 transactions per second.
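
Here is a rough sketch of how a FIFO queue keeps order and rejects duplicates. It is a toy in plain Python with hypothetical names, not the SQS API. The 5-minute window mirrors the SQS deduplication interval, and dedup_id plays the role of the message deduplication ID.

```python
from collections import deque

DEDUP_WINDOW_SECONDS = 300  # SQS FIFO queues deduplicate within 5 minutes


class ToyFifoQueue:
    """Toy FIFO queue: strict ordering plus duplicate rejection."""

    def __init__(self):
        self._queue = deque()
        self._seen = {}  # dedup id -> time the message was accepted

    def send(self, body, dedup_id, now):
        """Return True if accepted, False if it is a recent duplicate."""
        accepted_at = self._seen.get(dedup_id)
        if accepted_at is not None and now - accepted_at < DEDUP_WINDOW_SECONDS:
            return False
        self._seen[dedup_id] = now
        self._queue.append(body)
        return True

    def receive(self):
        """Return the oldest message, or None if the queue is empty."""
        return self._queue.popleft() if self._queue else None


fifo = ToyFifoQueue()
results = [
    fifo.send("create order", dedup_id="a1", now=0),
    fifo.send("create order", dedup_id="a1", now=5),   # retry, rejected
    fifo.send("pay order", dedup_id="b2", now=6),
]
received = [fifo.receive(), fifo.receive(), fifo.receive()]
print(results, received)
```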

Hope the above snapshot gives you a decent understanding of SQS. If you want to try some hands-on, check this tutorial.

AWS Crash Course - RDS

Welcome to AWS Crash Course.
What is RDS?

RDS is the Relational Database Service of Amazon.
It is part of its PaaS offering.
A new DB instance can easily be launched from the AWS Management Console.
Complex administration processes like patching, backup etc. are managed automatically by RDS.
Amazon has its own relational database engine called Amazon Aurora.
RDS also supports other popular database engines like MySQL, Oracle, SQL Server, PostgreSQL and MariaDB.

RDS supports Multi-AZ (Availability Zone) failover.
What does that mean?
It means that if your primary DB goes down, services automatically fail over to the secondary DB in another AZ.

Multi-AZ deployments for the MySQL, Oracle and PostgreSQL engines use synchronous physical replication to keep data on the standby up to date with the primary.
Multi-AZ deployments for the SQL Server engine use synchronous logical replication to achieve the same result, employing SQL Server native mirroring technology.
Both approaches safeguard your data in the event of a DB instance failure or loss of an AZ.
Backups are taken from the secondary DB, which avoids I/O suspension on the primary.
Restores are taken from the secondary DB, which avoids I/O suspension on the primary.
You can force a failover from one AZ to another by rebooting your DB instance.

But RDS Multi-AZ failover is not a scaling solution.
Read replicas are for scaling.
What are Read Replicas?
As discussed above, Multi-AZ uses synchronous replication of the DB, while read replicas use asynchronous replication.

You can have 5 read replicas for both MySQL and PostgreSQL.
You can have read replicas in different regions, but for MySQL only.
Read replicas can be built off Multi-AZ databases.
You can have read replicas of read replicas, however only for MySQL, and this further increases replication latency.
You can use read replicas for generating reports, so you don’t put load on the primary DB.

RDS supports automated backups.
But keep these things in mind.

There is a performance hit during backups if Multi-AZ is not enabled.
If you delete an instance, all automated backups are deleted; however, manual DB snapshots are not deleted.
All snapshots are stored on S3.
When you do a restore, you can change the engine edition (e.g. SQL Server Standard to SQL Server Enterprise), provided you have enough storage space.

Hope the above snapshot gives you a decent understanding of RDS. If you want to try some hands-on, check this tutorial.

AWS Crash Course - S3

Welcome to AWS Crash Course.
What is S3?
S3 is Simple Storage Service. It’s object storage, which means it’s used for storing objects like photos, videos etc.

S3 provides 11 9’s of durability (99.999999999%). On average, that means losing 1 out of 100 billion objects per year.
S3 provides 99.99% availability.
Files can be from 1 byte to 5 TB in size.
Read-after-write consistency for PUTs of new objects: you can immediately read what you have written.
Overwrite PUTs and DELETEs used to be eventually consistent (changes to an existing object took time to propagate). Since December 2020, S3 provides strong read-after-write consistency for all PUT and DELETE operations.
Secure your data using ACLs and bucket policies.
S3 is designed to sustain the loss of 2 facilities concurrently, i.e. 2 Availability Zone failures.
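
The 11 9’s figure is easier to grasp with a little arithmetic. The sketch below treats the durability number as an average annual loss rate per object, which is how the design target is usually read. Decimal is used to avoid floating-point rounding, and the 10 million object count is just an example.

```python
from decimal import Decimal

DURABILITY = Decimal("0.99999999999")   # 11 nines
ANNUAL_LOSS_RATE = 1 - DURABILITY       # expected fraction of objects lost per year


def expected_annual_losses(object_count):
    """Average number of objects you would expect to lose in one year."""
    return object_count * ANNUAL_LOSS_RATE


# How many stored objects correspond to one expected loss per year.
objects_per_expected_loss = 1 / ANNUAL_LOSS_RATE
# With 10 million objects: expected losses per year, and years per single loss.
losses_10m = expected_annual_losses(10_000_000)
years_per_loss_10m = 1 / losses_10m

print(objects_per_expected_loss, losses_10m, years_per_loss_10m)
```

So with 10 million objects stored, you would on average expect to lose a single object about once every 10,000 years.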
S3 has multiple storage classes. One is S3 Standard, which we discussed above. The other classes are:-

S3-IA (Infrequently Accessed)

S3-IA is for data which is not frequently accessed but still needs immediate access when requested.
You get the same durability as S3 Standard at a reduced storage price, with slightly lower availability (99.9%).
Can tolerate up to 2 concurrent facility failures.

S3-RRS (S3- Reduced Redundancy Storage)

99.99% durability and availability.
Use RRS if you are storing non-critical data that can be easily reproduced, like thumbnails of images.
No Concurrent facility fault tolerance

S3 Glacier

Data is stored in Amazon Glacier in “archives”.
An archive can be any data such as a photo, video, or document.
A single archive can be as large as 40 terabytes.
You can store an unlimited number of archives in Glacier
Amazon Glacier uses “vaults” as containers to store archives.
Under a single AWS account, you can have up to 1000 vaults.

S3 supports versioning.
What does that mean?
It means that if you change a file, S3 can keep both the old and the new versions.

If you enable versioning in S3, it keeps all the versions, even if you delete or update the object.
Great backup tool.
Once enabled, versioning cannot be disabled, only suspended.
Integrates with lifecycle rules.
Versioning’s MFA Delete capability, which uses multi-factor authentication, can be used to provide an additional layer of security.
Cross-region replication requires versioning to be enabled on the source bucket.
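
A toy model can show what "keeps all the versions even if you delete" means in practice. The ToyVersionedBucket class below is hypothetical and lives purely in memory. It borrows one real idea from S3: deleting an object in a versioned bucket adds a delete marker on top instead of erasing the older versions.

```python
class ToyVersionedBucket:
    """Toy versioned bucket: every write is kept, deletes add a marker."""

    def __init__(self):
        # key -> list of versions, oldest first; None stands for a delete marker
        self._versions = {}

    def put(self, key, data):
        self._versions.setdefault(key, []).append(data)

    def delete(self, key):
        # Nothing is erased; a delete marker becomes the latest version.
        self._versions.setdefault(key, []).append(None)

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1] is None:
            raise KeyError(key)
        return versions[-1]

    def history(self, key):
        return list(self._versions.get(key, []))


bucket = ToyVersionedBucket()
bucket.put("report.txt", "draft")
bucket.put("report.txt", "final")
latest = bucket.get("report.txt")
bucket.delete("report.txt")
print(latest, bucket.history("report.txt"))
```

After the delete, a normal get fails, but both earlier versions are still in the history and could be restored.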

S3 Supports Lifecycle Management

What does that mean?
It means you can move objects from one storage class to another after a set number of days, as per your schedule. This is used to reduce cost by moving less critical data to a cheaper storage class.

Lifecycle configuration enables you to specify the lifecycle management of objects in a bucket.
Can be used with versioning.
Can be applied to current versions and previous versions.
Transition from Standard to the Infrequent Access storage class can be done only after the data has been in Standard storage for 30 days.
You can transition data directly from Standard to Glacier.
A lifecycle policy will not transition objects that are smaller than 128 KB.
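
The two transition limits above (30 days in Standard, 128 KB minimum size) can be written as a quick eligibility check. This is only a sketch of the rules as stated in this post. The function name and constants are made up, and real lifecycle rules are configured on the bucket rather than coded like this.

```python
MIN_DAYS_IN_STANDARD = 30        # from the 30-day rule above
MIN_SIZE_BYTES = 128 * 1024      # from the 128 KB rule above


def can_transition_to_ia(age_days, size_bytes):
    """True if an object meets both limits for a Standard to IA transition."""
    old_enough = age_days >= MIN_DAYS_IN_STANDARD
    big_enough = size_bytes >= MIN_SIZE_BYTES
    return old_enough and big_enough


print(can_transition_to_ia(age_days=45, size_bytes=5 * 1024 * 1024))  # old and large
print(can_transition_to_ia(age_days=10, size_bytes=5 * 1024 * 1024))  # too new
print(can_transition_to_ia(age_days=45, size_bytes=4 * 1024))         # too small
```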

If you want to try some hands-on, try this exercise.

AWS Crash Course - Elastic Beanstalk

Welcome back to AWS Crash Course.
In the last section we discussed EBS.
In this section we will discuss AWS Elastic Beanstalk.
AWS Elastic Beanstalk makes it even easier for developers to quickly deploy and manage applications in the AWS cloud. Developers simply upload their application, and Elastic Beanstalk automatically handles the deployment details of capacity provisioning, load balancing, auto-scaling, and application health monitoring.

You can push updates from Git and only the modified files are transmitted to AWS Elastic Beanstalk.
Elastic Beanstalk supports IAM, EC2, VPC and RDS instances.
You have full access to the resources under Elastic Beanstalk.
Code is stored in S3.
Multiple environments are allowed to support version control. You can roll back changes.
Amazon Linux AMI and Windows 2008 R2 are supported.
What are the supported Languages and Development Stacks?
Apache Tomcat for Java applications
Apache HTTP Server for PHP applications
Apache HTTP Server for Python applications
Nginx or Apache HTTP Server for Node.js applications
Passenger or Puma for Ruby applications
Microsoft IIS 7.5, 8.0, and 8.5 for .NET applications
Java SE
Docker
Go

How can you update Elastic Beanstalk?

You can upload the updated code to AWS Elastic Beanstalk.
It supports multiple running environments such as test, pre-prod and prod.
Each environment is independently configured and runs on its own separate AWS resources.
Elastic Beanstalk also stores and tracks application versions over time, so an existing environment can easily be rolled back to a prior version.
A new environment can be launched using an older version to try to reproduce a customer problem.

Fault Tolerance

Always design, implement, and deploy for automated recovery from failure
Use multiple Availability Zones for your Amazon EC2 instances and for Amazon RDS
Use ELB for balancing the load.
Configure your Auto Scaling settings to maintain your fleet of Amazon EC2 instances.
If you are using Amazon RDS, then set the retention period for backups, so that Amazon RDS can perform automated backups.

What about Security?

Security on AWS is a shared responsibility
You are responsible for the security of data coming in and out of your Elastic Beanstalk environment.
Configure SSL to protect information from your clients.
Configure security groups and NACL with least privilege.

This short course was meant to give you an understanding of Elastic Beanstalk. If you want to try some hands-on, follow this AWS tutorial.

AWS Crash Course - EBS

In the last section we discussed VPC. In this section we will discuss EBS.
What is EBS?

EBS is Elastic Block Store.
An EBS volume is durable, block-level storage. It’s similar to the hard disk that you have in your laptop or desktop.
EBS volumes can be used as primary storage for data that requires frequent updates.
An EBS volume in an Availability Zone is automatically replicated within that zone to prevent data loss due to failure.
You can create encrypted EBS volumes with the Amazon EBS encryption feature or use 3rd party software for encryption.
To improve performance or redundancy, use RAID groups, e.g. RAID 0, RAID 1, RAID 10.

What are the different types of EBS volumes?

General Purpose SSD (gp2) – Provides up to 10,000 IOPS (input/output operations per second) and can range in size from 1GB to 16TB. This is used for normal loads and should be enough for your Dev or UAT setups.
Provisioned IOPS SSD (io1) – Provides up to 20,000 IOPS and can range in size from 4GB to 16TB. These are generally used for large SQL/NoSQL databases.
Throughput Optimized HDD (st1) – Provides up to 500 IOPS and can range in size from 500GB to 16TB. These are mostly useful for Big Data and data warehouses.
Cold HDD (sc1) – These are the cheapest kind of disks. They provide up to 250 IOPS and can range in size from 500GB to 16TB. These are commonly used for data archiving, as they provide low IOPS but are cheap for storing data which is not used frequently.

You can take snapshots of EBS volumes.

So what is a snapshot?

You can back up the data on your EBS volumes to Amazon S3 by taking point-in-time snapshots
Snapshots are incremental backups – Saves time and storage costs
Snapshots support encryption
Snapshots exclude data that has been cached by any applications or the OS
You can share your unencrypted snapshots with others
You can use a copy of a snapshot for Migrations, DR, Data retention etc.

You can try hands-on with EBS using this exercise.

AWS Crash Course – Route 53

Route 53 is a DNS service that routes user requests.

Amazon Route 53 (Route 53) is a scalable and highly available Domain Name System (DNS) service.
The name Route 53 is a reference to port 53 (UDP and TCP), which is generally used for DNS.
Route 53's DNS service allows administrators to direct traffic by simply updating DNS records in the hosted zone.
The TTL (Time to Live) of resource records can be shortened, which allows record changes to propagate faster to clients.
One of the key features of Route 53 is programmatic access to the service, which allows customers to modify DNS records via web service calls.

Three Main functions of Route 53 are:-
Domain registration:- It allows you to register domain names from your AWS account.
DNS service:- This service maps a domain name to your website's IP address, e.g. example.com to 54.168.4.10. It also supports many other record types, which we will discuss below.
Health Monitoring:- It can monitor the health of your servers/VMs/instances and route traffic according to the routing policy. It can also work as a load balancer for region-level traffic management.
Route 53 supports different routing policies, and you can use the one that is most suitable for your application.
Routing Policies :-

Simple:- Route 53 answers DNS queries using only the values in the record set, with no weighting or health-based logic.
Weighted:- This policy lets you split traffic based on the weights you assign, e.g. 10% of traffic goes to us-east-1 and 90% goes to eu-west-1.
Latency:- Routes traffic based on the lowest network latency for your end user (i.e. the region that will give the end user the fastest response time).
Failover:- This policy is used when you create an active/passive setup. Route 53 monitors the health of your primary site using a health check and sends traffic to the secondary site if the primary becomes unhealthy.
Geolocation:- This policy lets you choose where your traffic goes based on the geographic location of your end users. For example, requests coming from France can be sent to the servers you have designated for France.
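
The weighted and failover policies can be sketched in a few lines of Python. This is only a simulation of the decision logic, not a call to Route 53. The function names are invented for this example, and the 10/90 split reuses the regions from the weighted example above.

```python
import random
from collections import Counter


def pick_weighted(records, rng):
    """Pick one target, with probability proportional to its weight."""
    names = [name for name, _ in records]
    weights = [weight for _, weight in records]
    return rng.choices(names, weights=weights, k=1)[0]


def pick_failover(primary, secondary, primary_healthy):
    """Active/passive: use the secondary only when the health check fails."""
    return primary if primary_healthy else secondary


records = [("us-east-1", 10), ("eu-west-1", 90)]
rng = random.Random(0)  # fixed seed so repeated runs behave the same
counts = Counter(pick_weighted(records, rng) for _ in range(10_000))
print(counts)  # roughly a 10% / 90% split

print(pick_failover("primary-site", "standby-site", primary_healthy=False))
```

Over many simulated queries the split approaches the configured weights, which is the behaviour to expect from weighted records in aggregate rather than for any single user.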

Route 53 supports many DNS record formats:-

A Format :- Returns a 32-bit IPv4 address, most commonly used to map hostnames to an IP address of the host.
AAAA Format:- Returns a 128-bit IPv6 address, most commonly used to map hostnames to an IP address of the host.
CNAME Format:- Alias of one name to another. For example, www.example.com can be a CNAME that points to example.com. Note that a CNAME cannot be created for the zone apex (example.com itself).
MX Format :- Maps a domain name to a list of message transfer agents for that domain
NS Format:- Delegates a DNS zone to use the given authoritative name servers.
PTR Format :- Pointer to a canonical name. Unlike a CNAME, DNS processing stops and just the name is returned. The most common use is for implementing reverse DNS lookups, but other uses include such things as DNS-SD.
SOA Format:- Specifies authoritative information about a DNS zone, including the primary name server, the email of the domain administrator, the domain serial number, and several timers relating to refreshing the zone.
SRV Format:- Generalized service location record, used for newer protocols instead of creating protocol-specific records such as MX.
TXT Format :- Originally for arbitrary human-readable text in a DNS record.
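
To make the A, AAAA and PTR entries above more concrete, the short Python sketch below uses the standard ipaddress module. It does not query DNS. It only shows which record type fits an address, how many bits the address has, and the name a reverse (PTR) lookup would use. The function name describe is made up for this example, 54.168.4.10 is the sample address used earlier, and 2001:db8::1 is a documentation-only IPv6 address.

```python
import ipaddress


def describe(address):
    """Return (record type, address size in bits, reverse-lookup name)."""
    ip = ipaddress.ip_address(address)
    record_type = "A" if ip.version == 4 else "AAAA"
    return record_type, ip.max_prefixlen, ip.reverse_pointer


print(describe("54.168.4.10"))  # IPv4 address, fits an A record
print(describe("2001:db8::1"))  # IPv6 address, fits an AAAA record
```

For the IPv4 address the reverse-lookup name is the octets in reverse order followed by in-addr.arpa, which is the name a PTR record would be stored under.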

Tip:- For the exam, understanding the A and CNAME formats should be enough.
If you want some hands-on practice, try this exercise.
This series is created to give you a quick snapshot of AWS technologies. You can check about other AWS services in this series over here .

Solved: How to download a complete S3 bucket or a S3 folder?

If you ever want to download an entire S3 folder, you can do it with the AWS CLI.
You can download the AWS CLI from the AWS CLI Download page.
Download the AWS CLI version for your system: Windows, Linux or Mac.
In our case we use Windows 64-bit. Once you download the .exe, simply double-click it to install the AWS CLI.
Once the AWS CLI is installed, open the Windows command prompt (CMD) and enter the command

aws configure

It will ask for the details of the AWS user you want to log in with, and for the region name. Check this post to learn how to create an IAM user.
You can get the access details of the IAM user from the IAM console.
You can look up the region name here.
Fill in the user details as below:

AWS Access Key ID: <Key ID of the user>

AWS Secret Access Key: <Secret key of the user>

Default region name: <region name, e.g. us-east-1>

Default output format: None

Once you have downloaded and configured the AWS CLI on your machine, execute the “sync” command as shown below.

aws s3 sync s3://mybucket/dir /local/folder

You can also do the same with the “cp” command. It needs the --recursive option to copy the contents of subdirectories as well.

aws s3 cp s3://mybucket/dir /local/folder --recursive
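
If you would rather script the download than use the CLI, the same idea can be sketched in Python. This is a minimal sketch that assumes the third-party boto3 library is installed and that credentials are already set up with aws configure. The function names local_path_for and download_prefix are made up for this example, and mybucket, dir and /local/folder are the same placeholders used in the commands above.

```python
import os


def local_path_for(key, prefix, dest):
    """Map an S3 key under `prefix` to a file path under `dest`."""
    relative = key[len(prefix):].lstrip("/") or os.path.basename(key)
    return os.path.join(dest, *relative.split("/"))


def download_prefix(bucket, prefix, dest):
    """Download every object under `prefix`, keeping the folder layout."""
    import boto3  # third-party; imported here so the helper above works without it

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip zero-byte "folder" placeholder objects
            target = local_path_for(key, prefix, dest)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            s3.download_file(bucket, key, target)


# Example call (needs valid AWS credentials and an existing bucket):
# download_prefix("mybucket", "dir", "/local/folder")
```

Unlike sync, this sketch downloads every object on each run; it does not skip files that already exist locally.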

Refer to this S3 cheat sheet to learn more tricks.

AWS Crash Course – VPC

In the last section we discussed EC2. In case you missed it, you can check it here: AWS Crash Course – EC2.
In this section we will discuss VPC.
What is VPC?

VPC is Virtual Private Cloud.
VPC is like your own private cloud inside the AWS public cloud.
You can decide the network range.
Your VPC is not shared with others.
You can launch instances in VPC and restrict inbound/outbound access to them.
You can leverage multiple layers of security, including security groups and network access control lists.
You can create a Virtual Private Network (VPN) connection between your corporate datacenter and your VPC.

Components of Amazon VPC:-

Subnet: A segment of a VPC’s IP address range. This is basically the range of IPs from which you assign addresses to your resources, e.g. EC2 instances.
Internet Gateway: If you want your instance in VPC to be able to access Public Internet, you create an internet gateway.
NAT Gateway: You can use a network address translation (NAT) gateway to enable instances in a private subnet to connect to the Internet or other AWS services, but prevent the Internet from initiating a connection with those instances.
Hardware VPN Connection: A hardware-based VPN connection between your Amazon VPC and your datacenter, home network, or co-location facility.
Virtual Private Gateway: A virtual private gateway is the VPN concentrator on the Amazon side of the VPN connection.
Customer Gateway: A customer gateway is a physical device or software application on your side of the VPN connection.
Router: Routers act as a mediator for the subnets in your VPC. They interconnect subnets and direct traffic between Internet gateways, virtual private gateways, NAT gateways, and subnets.
Peering Connection: A peering connection enables you to route traffic via private IP addresses between two peered VPCs. It is what VPC Peering uses to establish a connection between two different VPCs.
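
To make the relationship between a VPC's network range and its subnets concrete, here is a small Python sketch using the standard ipaddress module. The 10.0.0.0/16 range and the /24 subnet size are example values chosen for illustration, and plan_subnets is a made-up helper name. The count of usable addresses reflects the fact that AWS reserves five IP addresses in every subnet.

```python
import ipaddress
from itertools import islice

AWS_RESERVED_PER_SUBNET = 5  # AWS keeps the first four and the last address


def plan_subnets(vpc_cidr, new_prefix, count):
    """Carve the first `count` subnets of the given size out of a VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    plan = []
    for subnet in islice(vpc.subnets(new_prefix=new_prefix), count):
        usable = subnet.num_addresses - AWS_RESERVED_PER_SUBNET
        plan.append((str(subnet), usable))
    return plan


for cidr, usable in plan_subnets("10.0.0.0/16", 24, 3):
    print(cidr, usable)
```

Each /24 subnet holds 256 addresses, of which 251 are left for your instances.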

VPC has a few more components, but to avoid confusion we will discuss them in later sections.

AWS Crash Course - EC2

We are starting this series on AWS to give you a decent understanding of different AWS services. These will be short articles which you can go through in 15-20 minutes every day.
You can check the complete AWS Crash Course series here.
Introduction:-

AWS compute is part of its IaaS offerings.
With compute, you can deploy virtual servers to run your applications.
You don’t have to wait for days or weeks to get your desired server capacity.
You can manage the OS or let AWS manage it for you.
It can be used for anything from building mobile apps to running massive clusters.
You can even deploy applications serverless.
It provides high fault tolerance.
It offers easy scalability and load balancing.
You are billed as per your usage.

What is EC2?

EC2 is Elastic Compute Cloud
It is a virtual machine (VM) in the cloud.
You can commission one or thousands of instances simultaneously, and pay only for what you use, making web-scale cloud computing easy.
Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change.
Amazon EC2 provides developers the tools to build failure resilient applications and isolate them from common failure scenarios.

What are EC2 pricing models?

On Demand – Pay by the hour with no long-term commitment.
Reserved – Reservations for a one- or three-year term, up to 75% cheaper compared to On Demand.
Dedicated – A dedicated physical server is provided to you. It can be up to 70% cheaper than the On Demand price when purchased as a reservation.
Spot – Bid on spare Amazon computing capacity. Up to 90% cheaper compared to On Demand.
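
To get a feel for what these percentages mean, the Python sketch below computes the best-case monthly cost under each model. The hourly rate of 0.10 and the 730 hours per month are hypothetical numbers picked for illustration, not AWS prices, and the discounts are the "up to" figures quoted in the list above, so real savings will usually be smaller.

```python
# Maximum discounts versus On Demand, as quoted in the list above.
MAX_DISCOUNT = {
    "on_demand": 0.0,
    "reserved": 0.75,
    "dedicated": 0.70,
    "spot": 0.90,
}


def lowest_monthly_cost(hourly_rate, hours, model):
    """Best-case cost for a month, assuming the full quoted discount applies."""
    return round(hourly_rate * hours * (1 - MAX_DISCOUNT[model]), 2)


HOURLY_RATE = 0.10     # hypothetical On Demand price per hour
HOURS_PER_MONTH = 730  # roughly one month of continuous running

for model in MAX_DISCOUNT:
    print(model, lowest_monthly_cost(HOURLY_RATE, HOURS_PER_MONTH, model))
```

The gap between On Demand and Spot is large in the best case, which is why Spot is attractive for workloads that can tolerate interruption.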

EC2 Instance Types:-

General Purpose (T2, M4 and M3) – Small and mid-size databases
Compute Optimized (C4 and C3) – High performance front-end fleets, web-servers, batch processing etc.
Memory Optimized (X1, R4 and R3) – High performance databases, data mining & analysis, in-memory databases
Accelerated Computing (P2, G2 and F1) – GPU compute, graphics workloads and FPGA-based acceleration
Storage Optimized (I3) – High I/O instances – NoSQL databases like Cassandra, MongoDB
Storage Optimized (D2) – Dense-storage instances – Massively Parallel Processing (MPP) data warehousing, MapReduce and Hadoop distributed computing

Check out more details in the next section: AWS Crash Course – VPC.
If you want to try some hands-on practice, you can follow this guide to launch an Amazon Linux instance, or this one for a Windows instance.