Thursday, 6 July 2017

Spotrestart

AWS EC2 spot instances can save up to 90% compared with on-demand instance pricing. Spot instances can also be quite stable: many instance types run for months without termination, at a much lower cost than on-demand. See here for details.

If an application can tolerate a short outage of 2-3 minutes every couple of months in exchange for savings of up to 90%, it could be a good candidate for spot instances. That covers many applications that can afford such an outage, as well as many applications in Dev, SIT and UAT environments. The idea of spotrestart is to automatically manage the whole spot process.

Spotrestart allows an application running on an AWS EC2 instance to use a spot instance, and automatically launches a replacement spot instance if the running one receives a termination notice. When the spot instance starts, it performs the following:
  • Mounts additional file systems (if required). All state is stored on external EBS volumes, so no data is lost if a spot instance terminates and a replacement starts.
  • Registers its hostname with Route 53, so that the instance is still reachable under the same hostname if its IP address changes.
  • Starts spotmonitor.sh to continually check for a spot instance termination notice. If a notice is received, it launches another spot instance to take over.
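
The scripts in this post register the hostname with cli53. For readers who do not have cli53 installed, one possible alternative is the AWS CLI command aws route53 change-resource-record-sets with an UPSERT change batch. The sketch below only builds the change batch; the function name route53_upsert_batch, the sample IP address and the hosted zone ID placeholder are assumptions for illustration and do not appear in the original scripts.

```shell
#!/bin/bash
# route53_upsert_batch FQDN IP [TTL]: print a Route 53 change batch that
# creates or replaces an A record. Hypothetical helper, not part of spotrestart.
route53_upsert_batch() {
    local fqdn="$1" ip="$2" ttl="${3:-60}"
    cat <<_EOF_
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "$fqdn",
        "Type": "A",
        "TTL": $ttl,
        "ResourceRecords": [ { "Value": "$ip" } ]
      }
    }
  ]
}
_EOF_
}

# Example: write the batch to a file, then apply it with the AWS CLI.
# The hosted zone ID below is a placeholder and must be replaced:
#   route53_upsert_batch spotrestart1.mycompany.com.au 10.0.0.5 > batch.json
#   aws route53 change-resource-record-sets --hosted-zone-id <zone-id> --change-batch file://batch.json
route53_upsert_batch spotrestart1.mycompany.com.au 10.0.0.5 60
```

UPSERT creates the record if it is missing and overwrites it otherwise, which is the same effect the --replace flag is used for in the cli53 call.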

Spotrestart consists of the following:

  • spotrestart.config, parameter file:
      • Region
      • Possible instance types
      • AMI to be used
      • Subnet to be used
      • Security group(s) to be used
      • Volumes to mount on startup
      • Hostname and Domain name to register in Route 53
  • spotlaunch.sh:
      • Launches a spot instance as per the spotrestart.config parameter file
  • userdata.template (used by spotlaunch.sh):
      • Script run on the spot instance at startup, which invokes spotrestart
  • bestinstance.sh:
      • Reads spotrestart.config parameter to determine potential spot instance types
      • Calculates best spot instance type based on spot price history
  • spotmonitor.sh:
      • Monitors for spot instance termination notice
      • If a termination notice is received, refreshes the best instance type and launches another spot instance from the spotrestart.config parameter file to take over from the existing instance
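
All of the scripts read spotrestart.config with the same grep | cut pipeline. Because that match is case-insensitive and not anchored to the start of the line, it could in principle pick up another line that happens to contain the same word. A minimal sketch of a stricter lookup is shown here; cfg_get is a hypothetical helper name, and the sample file is throwaway demo data.

```shell
#!/bin/bash
# cfg_get KEY [FILE]: print the value of the first non-comment "KEY:value" line.
# Hypothetical helper, not part of the original scripts.
cfg_get() {
    local key="$1" file="${2:-/usr/local/spotrestart/spotrestart.config}"
    # Match the key only at the start of the line, then print everything
    # after the first colon (values such as ARNs contain colons themselves).
    awk -v key="$key" '
        /^#/ { next }
        index($0, key ":") == 1 { print substr($0, length(key) + 2); exit }
    ' "$file"
}

# Demo against a throwaway config file.
demo_config=$(mktemp)
cat > "$demo_config" <<'_EOF_'
#REGION:Tokyo
REGION:Sydney
PROFILE:arn:aws:iam::123456789023:instance-profile/spotrestart-ec2role
HOSTNAME:spotrestart1
_EOF_

cfg_get REGION "$demo_config"    # prints Sydney
cfg_get PROFILE "$demo_config"   # prints the full ARN, colons included
rm -f "$demo_config"
```

With a helper like this, a line such as REGION=`cfg_get REGION` could stand in for each of the longer pipelines.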

spotlaunch.sh
#!/bin/bash

myregion()
{
case "$REGION" in

Frankfurt) export AWS_DEFAULT_REGION=eu-central-1
    ;;
Ireland) export AWS_DEFAULT_REGION=eu-west-1
    ;;
N.-California) export AWS_DEFAULT_REGION=us-west-1
    ;;
N.-Virginia) export AWS_DEFAULT_REGION=us-east-1
   ;;
Oregon) export AWS_DEFAULT_REGION=us-west-2
   ;;
US) export AWS_DEFAULT_REGION=
   ;;
Sao-Paulo) export AWS_DEFAULT_REGION=sa-east-1
   ;;
Singapore) export AWS_DEFAULT_REGION=ap-southeast-1
   ;;
Tokyo) export AWS_DEFAULT_REGION=ap-northeast-1
   ;;
Seoul) export AWS_DEFAULT_REGION=ap-northeast-2
   ;;
Sydney) export AWS_DEFAULT_REGION=ap-southeast-2
   ;;
*) echo "Region $REGION not recognised"; exit 1
   ;;
esac
}

echo -e "\n\n`date` Starting spotlaunch"
AWS_DEFAULT_OUTPUT="text"
BASE=/usr/local/spotrestart
config=spotrestart.config
MAIL=`grep -v '^#' $BASE/$config | grep -i MAIL: | cut -d':' -f2-100`
cd $BASE

(
cat << '_END_'
#!/bin/bash
source /etc/profile
curl --user-agent abc123topsecret  https://s3-ap-southeast-2.amazonaws.com/mybucket/spotrestart.tgz > /tmp/spotrestart.tgz
mkdir -p /usr/local/spotrestart; cd /usr/local/spotrestart
tar xzf /tmp/spotrestart.tgz
_END_

echo 'cat > /usr/local/spotrestart/spotrestart.config <<_EOF_'
cat $BASE/$config
echo '_EOF_'


cat userdata.template
) > userdata




REGION=`grep -v '^#' $BASE/$config | grep -i REGION: | cut -d':' -f2-100`
myregion; REGION=$AWS_DEFAULT_REGION
AMI=`grep -v '^#' $BASE/$config | grep -i AMI: | cut -d':' -f2-100`
AZs=`grep -v '^#' $BASE/$config | grep -i AZs: | cut -d':' -f2-100`
PROFILE=`grep -v '^#' $BASE/$config | grep -i PROFILE: | cut -d':' -f2-100`
KEY=`grep -v '^#' $BASE/$config | grep -i KEY: | cut -d':' -f2-100`
SUBNETS=`grep -v '^#' $BASE/$config | grep -i SUBNETS: | cut -d':' -f2-100`
SECURITYGROUPIDS=`grep -v '^#' $BASE/$config | grep -i SECURITYGROUPIDS: | cut -d':' -f2-100`
BIDPRICE=`grep -v '^#' $BASE/$config | grep -i BIDPRICE: | cut -d':' -f2-100`
USERDATA=$(base64 -w 0 < userdata)
BESTINSTANCE=`cat $BASE/bestinstance/lowestcost | sort -t, -nk4 | head -1  | awk -F',' ' { print $0 "," $5*1.2 } '`
#ap-southeast-2b,2017-04-19T05:46:48.000Z,c3.xlarge,0.043200,.265,0.163019,0.318

AZ=`echo $BESTINSTANCE | cut -d, -f1`
INSTANCE=`echo $BESTINSTANCE | cut -d, -f3`
[ -z "$BIDPRICE" ] &&  BIDPRICE=`echo $BESTINSTANCE | cut -d, -f7`
i=`echo $AZs | tr ' ' '\n' | cat -n | grep $AZ | tr '\t' ':' | cut -d ':' -f1`; AZSUBNET=$(echo $SUBNETS | awk " { print \$$i } ")


(
echo '{'
echo '  "ImageId": "'$AMI'",'
echo '  "InstanceType": "'$INSTANCE'",'
echo '  "SubnetId": "'$AZSUBNET'",'
echo '  "KeyName": "'$KEY'",'
echo '  "IamInstanceProfile": { "Arn": "'$PROFILE'" },'
cat <<'_EOF_'
  "BlockDeviceMappings": [
    {
      "DeviceName": "/dev/xvda",
      "Ebs": {
        "DeleteOnTermination": true
      }
    },
        {
          "DeviceName": "/dev/xvdca",
          "VirtualName": "ephemeral0"
        },
        {
          "DeviceName": "/dev/xvdcb",
          "VirtualName": "ephemeral1"
        }
  ],
  "SecurityGroupIds": [
_EOF_

last=`echo $SECURITYGROUPIDS | tr ' ' '\n' | tail -1`
for i in $SECURITYGROUPIDS
do
    echo -n '    "'$i'"'; [ "$i" == "$last" ] && echo || echo ","
done
echo '  ],'
echo '  "UserData": "'$USERDATA'"'
echo '}'
) > specs.json


# Read HOSTNAME before the subshell so the mail subject below can see it
HOSTNAME=`grep -v '^#' $BASE/$config | grep -i HOSTNAME: | cut -d':' -f2-100`
(
echo aws ec2 request-spot-instances --region $REGION --spot-price $BIDPRICE --instance-count 1 --launch-specification file://specs.json
echo "Launched from: `hostname` `date`"; aws ec2 request-spot-instances --region $REGION --spot-price $BIDPRICE --instance-count 1 --launch-specification file://specs.json
) | tee spotlaunch.out | mailx -v -s "$HOSTNAME Spot instance launch `date`" $MAIL

exit 0
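
The one-liner in spotlaunch.sh that sets AZSUBNET pairs each entry in AZs with the subnet at the same position in SUBNETS. The sketch below spells out that positional lookup with bash arrays and an exact string comparison. The function name subnet_for_az and the subnet IDs in the demo call are made up for illustration.

```shell
#!/bin/bash
# subnet_for_az AZ "AZ list" "subnet list": print the subnet found at the same
# position as AZ in the two space-separated lists. Hypothetical helper.
subnet_for_az() {
    local want="$1" i
    local -a azs subnets
    read -r -a azs <<< "$2"
    read -r -a subnets <<< "$3"
    for i in "${!azs[@]}"; do
        if [ "${azs[$i]}" = "$want" ]; then
            echo "${subnets[$i]}"
            return 0
        fi
    done
    # No matching availability zone in the list.
    echo "no subnet configured for $want" >&2
    return 1
}

# Demo with made-up subnet IDs: the second AZ maps to the second subnet.
subnet_for_az ap-southeast-2b "ap-southeast-2a ap-southeast-2b" "subnet-aaaa1111 subnet-bbbb2222"
```

Returning a non-zero status when the AZ is missing would let the caller stop before writing an empty SubnetId into specs.json.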

userdata.template
myregion()
{
case "$REGION" in

Frankfurt) export AWS_DEFAULT_REGION=eu-central-1
    ;;
Ireland) export AWS_DEFAULT_REGION=eu-west-1
    ;;
N.-California) export AWS_DEFAULT_REGION=us-west-1
    ;;
N.-Virginia) export AWS_DEFAULT_REGION=us-east-1
   ;;
Oregon) export AWS_DEFAULT_REGION=us-west-2
   ;;
US) export AWS_DEFAULT_REGION=
   ;;
Sao-Paulo) export AWS_DEFAULT_REGION=sa-east-1
   ;;
Singapore) export AWS_DEFAULT_REGION=ap-southeast-1
   ;;
Tokyo) export AWS_DEFAULT_REGION=ap-northeast-1
   ;;
Seoul) export AWS_DEFAULT_REGION=ap-northeast-2
   ;;
Sydney) export AWS_DEFAULT_REGION=ap-southeast-2
   ;;
*) echo "Region $REGION not recognised"; exit 1
   ;;
esac
}

updaterclocal()
{
BASE=/usr/local/spotrestart
config=spotrestart.config

sleep 300
echo "Adding to rc.local `date`."
echo "$BASE/spotmonitor.sh > $BASE/spotmonitor.out 2>&1 &" >> /etc/rc.local
}

BASE=/usr/local/spotrestart
config=spotrestart.config
cd $BASE
cp bin/* /bin
yum install -y mailx
alias cp='cp'
cat /etc/mail/sendmail.cf | sed -e "1,\$s/^DS/DSsmtp.mycompany.local/g" > /tmp/sendmail.cf; cp -f /tmp/sendmail.cf /etc/mail/sendmail.cf
service sendmail restart

REGION=`grep -v '^#' $BASE/$config | grep -i REGION: | cut -d':' -f2-100`
myregion

(
config=spotrestart.config
TAGS=`grep -v '^#' $BASE/$config | grep -i TAGS: | cut -d':' -f2-100`
DOMAIN=`grep -v '^#' $BASE/$config | grep -i DOMAIN: | cut -d':' -f2-100`
HOSTNAME=`grep -v '^#' $BASE/$config | grep -i HOSTNAME: | cut -d':' -f2-100`

echo "`hostname` $HOSTNAME `curl -s http://169.254.169.254/latest/meta-data/instance-type` `wget -q -O- http://169.254.169.254/latest/meta-data/instance-id` `date`"

echo -e "\nProcessing spotmonitor start & rc.local\n"
echo "Before spotmonitor start:"; ps -ef | grep spotmonitor
#echo "$BASE/spotmonitor.sh > $BASE/spotmonitor.out 2>&1 &" >> /etc/rc.local
$BASE/spotmonitor.sh > $BASE/spotmonitor.out 2>&1 &
#$BASE/rcmonitor.sh >> $BASE/spotmonitor.out 2>&1 &
updaterclocal &
echo "After spotmonitor start:"; ps -ef | grep spotmonitor

VOLUMES=`grep -v '^#' $BASE/$config | grep -i VOLUMES: | cut -d':' -f2-100`
# Strip the trailing AZ letter to get the region (awk strings are 1-indexed)
REGION=`curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone | awk ' { x=length($1)-1; print substr($1,1,x) } '`

for i in `seq 10`
do
  ebs=`echo $VOLUMES | cut -d';' -f$i | cut -d: -f1`
  device=`echo $VOLUMES | cut -d';' -f$i | cut -d: -f2`
  mount=`echo $VOLUMES | cut -d';' -f$i | cut -d: -f3`
  fs=`echo $VOLUMES | cut -d';' -f$i | cut -d: -f4`
  if [ -n "$ebs" ]
  then
       echo -e "\nProcessing: $ebs,$device,$mount,$fs\n"
       su - ec2-user -c "aws ec2 attach-volume --region $REGION --volume-id $ebs --instance-id `curl -s http://169.254.169.254/latest/meta-data/instance-id` --device $device"
       sleep 5
       mkdir -p $fs
       mount $mount $fs
       x=$?; [ $x -eq 0 ] && echo "$mount                      $fs      ext4      defaults,noatime    1 2" >> /etc/fstab
  fi
done
df -h


echo -e "\n\nProcessing Tags:\n"
for i in `seq 6`
do
  key=`echo $TAGS | cut -d';' -f$i | cut -d: -f1`
  value=`echo $TAGS | cut -d';' -f$i | cut -d: -f2`
  # Skip empty slots when fewer than 6 tags are configured
  [ -n "$key" ] || continue
  echo "$key,$value"
  su - ec2-user -c "aws ec2 create-tags --region $AWS_DEFAULT_REGION --resources `curl -s http://169.254.169.254/latest/meta-data/instance-id` --tags Key=\"$key\",Value=\"$value\""
done

echo -e "\nProcessing DNS:\n"
su - ec2-user -c "$BASE/bin/cli53 rrcreate $DOMAIN \"$HOSTNAME 60 A `curl -s http://169.254.169.254/latest/meta-data/local-ipv4`\" --replace"
#tail -1 /var/log/messages

#set also the hostname to the running instance
FQDN=$HOSTNAME.$DOMAIN
hostname $FQDN

echo -e "\nProcessing bestinstance start & cron:\n"
source /etc/profile && nice -10 bash /usr/local/spotrestart/bestinstance/bestinstance.sh > /usr/local/spotrestart/bestinstance/bestinstance.out 2>&1
su - ec2-user -c 'crontab -l' | grep bestinstance.out > /dev/null
x=$?; [ $x -eq 1 ] && su - ec2-user -c '(crontab -l 2>/dev/null; echo "0 * * * *  source /etc/profile && nice -10 bash /usr/local/spotrestart/bestinstance/bestinstance.sh >> /usr/local/spotrestart/bestinstance/bestinstance.out 2>&1" ) | crontab -'



service sendmail start
echo "`hostname` $HOSTNAME `curl -s http://169.254.169.254/latest/meta-data/instance-type` `wget -q -O- http://169.254.169.254/latest/meta-data/instance-id` `date`"
) > $BASE/userdata.out 2>&1

BASE=/usr/local/spotrestart
config=spotrestart.config
MAIL=`grep -v '^#' $BASE/$config | grep -i MAIL: | cut -d':' -f2-100`
cat $BASE/userdata.out | mailx -v -s "`hostname` Spot instance started `wget -q -O- http://169.254.169.254/latest/meta-data/instance-id` `date`" $MAIL
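
The VOLUMES value is a list of entries separated by semicolons, each of the form volume-id:attach-device:mount-device:mount-point. Note that cut prints a line unchanged when it contains no delimiter, so with a single volume (no semicolon) the seq 10 loop above would see the same entry on every pass. A minimal sketch of an alternative parser using read is given below; parse_volumes is a hypothetical name, and it only prints the fields rather than attaching or mounting anything.

```shell
#!/bin/bash
# parse_volumes "VOLUMES string": print one line per entry as
# "volume-id attach-device mount-device mount-point". Hypothetical helper.
parse_volumes() {
    local entry ebs device mount fs
    local -a entries
    # Split the whole value on ';' into an array of entries.
    IFS=';' read -r -a entries <<< "$1"
    for entry in "${entries[@]}"; do
        [ -n "$entry" ] || continue
        # Split one entry on ':' into its four fields.
        IFS=':' read -r ebs device mount fs <<< "$entry"
        echo "$ebs $device $mount $fs"
    done
}

# Demo with the VOLUMES value from the sample spotrestart.config.
parse_volumes "vol-0ceb97324eaaa:/dev/xvdh:/dev/xvdh:/data;vol-079d0dc71d02ecbbb:/dev/xvdi:/dev/xvdi:/archive"
```

The output could then be consumed with a loop such as: parse_volumes "$VOLUMES" | while read -r ebs device mount fs; do ...; done.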

bestinstance.sh
#!/bin/bash

#0 * * * * nice -10 bash /usr/local/spotrestart/bestinstance.sh > /usr/local/spotrestart/bestinstance.new 2>&1

myregion()
{
case "$REGION" in

Frankfurt) export AWS_DEFAULT_REGION=eu-central-1
    ;;
Ireland) export AWS_DEFAULT_REGION=eu-west-1
    ;;
N.-California) export AWS_DEFAULT_REGION=us-west-1
    ;;
N.-Virginia) export AWS_DEFAULT_REGION=us-east-1
   ;;
Oregon) export AWS_DEFAULT_REGION=us-west-2
   ;;
US) export AWS_DEFAULT_REGION=
   ;;
Sao-Paulo) export AWS_DEFAULT_REGION=sa-east-1
   ;;
Singapore) export AWS_DEFAULT_REGION=ap-southeast-1
   ;;
Tokyo) export AWS_DEFAULT_REGION=ap-northeast-1
   ;;
Seoul) export AWS_DEFAULT_REGION=ap-northeast-2
   ;;
Sydney) export AWS_DEFAULT_REGION=ap-southeast-2
   ;;
*) echo "Region $REGION not recognised"; exit 1
   ;;
esac
}


BASE=/usr/local/spotrestart
config=spotrestart.config
BASE=`grep -v '^#' $BASE/$config | grep -i BASE: | cut -d':' -f2-100`
INSTANCES=`grep -v '^#' $BASE/$config | grep -i INSTANCES: | cut -d':' -f2-100`
AZs=`grep -v '^#' $BASE/$config | grep -i AZs: | cut -d':' -f2-100`
REGION=`grep -v '^#' $BASE/$config | grep -i REGION: | cut -d':' -f2-100`

echo -e "\n\n`date` Starting bestinstance $1"
cd $BASE/bestinstance
myregion
if [ "$1" == "-quick1dayupdate" ]
then
    time aws ec2 describe-spot-price-history --region=$AWS_DEFAULT_REGION --product-description "Linux/UNIX (Amazon VPC)" --start-time `date -u --date="1 days ago" +'%Y-%m-%dT%H:%M:00'` | jq -r -c '.SpotPriceHistory[] | "\(.AvailabilityZone),\(.Timestamp),\(.InstanceType),\(.SpotPrice)"' >> spotprice
else
    time aws ec2 describe-spot-price-history --region=$AWS_DEFAULT_REGION --product-description "Linux/UNIX (Amazon VPC)" --start-time `date -u --date="7 days ago" +'%Y-%m-%dT%H:%M:00'` | jq -r -c '.SpotPriceHistory[] | "\(.AvailabilityZone),\(.Timestamp),\(.InstanceType),\(.SpotPrice)"' > spotprice
fi

> lowestcost
for i in $INSTANCES
do
 cost=`grep $i aws.costs.$REGION | grep ^ec2 | cut -d, -f5`
 cost=`echo "scale=3; $cost/730" | bc`
 echo "Instance-type=$i,$REGION,On-demand price=$cost"
 for a in $AZs
 do
     echo "AZ=$a"
     cat spotprice | grep $i | grep $a | sort -n | sed "1,\$s/$/,$cost/" | awk -F',' ' { print $0 "," $4/$5 } ' | sort -t, -nk 6 -n | tail -1 | tee -a lowestcost
 done
 echo "==========================================================================================="
done
echo -e "Lowest cost instance: "
cat lowestcost | sort -t, -nk4 | head -1  | awk -F',' ' { print $0 "," $5*1.2 } '

exit 0
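
The final selection step can be read as: take the lowestcost line with the smallest spot price (field 4) and append a bid of 1.2 times the on-demand hourly price (field 5). The sketch below does the same in a single awk pass. The function name pick_best is hypothetical; the c3.xlarge line is the sample from the comment in spotlaunch.sh, and the m4.large line is made-up demo data.

```shell
#!/bin/bash
# pick_best FILE: print the line with the lowest spot price (field 4) and
# append a bid of 1.2 x the on-demand hourly price (field 5).
# Hypothetical helper mirroring the sort | head | awk pipeline.
pick_best() {
    awk -F',' '
        NF >= 5 && (best == "" || $4 + 0 < min) { min = $4 + 0; best = $0; bid = $5 * 1.2 }
        END { if (best != "") print best "," bid }
    ' "$1"
}

# Demo. Fields: AZ,timestamp,instance type,spot price,on-demand price,ratio
demo=$(mktemp)
cat > "$demo" <<'_EOF_'
ap-southeast-2b,2017-04-19T05:46:48.000Z,c3.xlarge,0.043200,.265,0.163019
ap-southeast-2a,2017-04-19T05:40:10.000Z,m4.large,0.021500,.134,0.160448
_EOF_
pick_best "$demo"   # prints the m4.large line followed by ,0.1608
rm -f "$demo"
```

An empty lowestcost file produces no output here, which the caller could test for before requesting a spot instance.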

spotmonitor.sh
#!/bin/bash
date; ps -ef | grep spotmonitor; echo

source /etc/profile
BASE=/usr/local/spotrestart
config=spotrestart.config
MAIL=`grep -v '^#' $BASE/$config | grep -i MAIL: | cut -d':' -f2-100`

cd $BASE

while true
  do
    if curl -s http://169.254.169.254/latest/meta-data/spot/termination-time | grep -q ".*T.*Z"
      then
        (
        echo "`hostname` Spot instance termination notice received `wget -q -O- http://169.254.169.254/latest/meta-data/instance-id` `date`."
        echo; cat spotmonitor.out
        curl -Is http://169.254.169.254/latest/meta-data/spot/termination-time; echo
        sync;sync
        $BASE/bestinstance/bestinstance.sh -quick1dayupdate | tee -a  $BASE/bestinstance/bestinstance.out
        $BASE/spotlaunch.sh; cat $BASE/spotlaunch.out
        ) 2>&1 | mailx -v -s "`hostname` Spot instance termination notice received `wget -q -O- http://169.254.169.254/latest/meta-data/instance-id` `date`" $MAIL
        sleep 5
       break
      else
        # Spot instance not yet marked for termination.
        echo "`date` No termination notice:"
        sleep 5
    fi
  done
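
The monitor treats any response that matches .*T.*Z as a termination notice. A stricter check that only accepts a full UTC timestamp is less likely to fire on an unexpected response body. The sketch below assumes the metadata service returns the time in the form 2017-07-06T10:15:00Z; is_termination_time is a hypothetical helper name, and the demo strings are illustrative only.

```shell
#!/bin/bash
# is_termination_time STRING: succeed only if STRING is a UTC timestamp of the
# form YYYY-MM-DDThh:mm:ssZ. Hypothetical helper, not part of spotmonitor.sh.
is_termination_time() {
    local re='^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$'
    [[ "$1" =~ $re ]]
}

# Demo with canned response bodies instead of a live metadata call.
for body in "2017-07-06T10:15:00Z" "<title>404 - Not Found</title>"; do
    if is_termination_time "$body"; then
        echo "termination notice: $body"
    else
        echo "no termination notice"
    fi
done
```

In the monitor loop, the curl output could be captured into a variable and passed to this function in place of the grep test.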

spotrestart.config
BASE:/usr/local/spotrestart

REGION:Sydney
INSTANCES:c3.large c4.large m3.large m4.large r3.large r4.large
AZs:ap-southeast-2a 


AMI:ami-5a081aaa
KEY:ec2-zorankey
PROFILE:arn:aws:iam::123456789023:instance-profile/spotrestart-ec2role
VOLUMES:vol-0ceb97324eaaa:/dev/xvdh:/dev/xvdh:/data;vol-079d0dc71d02ecbbb:/dev/xvdi:/dev/xvdi:/archive
SUBNETS:subnet-72d01000
SECURITYGROUPIDS:sg-86bbcaaa sg-80b35bbb sg-8ab35ccc
DOMAIN:mycompany.com.au
HOSTNAME:spotrestart1

TAGS:Bu:My Business Unit;CC:12345;Environment:prod;Name:Spot Restart;Owner:Zoran Gagic;Product:Spotrestart

MAIL:zorang@mycompany.com.au

