Mateus Nava

January 20, 2022

Ruby AWS Lambda Functions to generate NDVI maps

By spaceX (unsplash.com)
At my company, we have an application that generates NDVI (normalized difference vegetation index) maps from Sentinel-2 scenes. This is a complex operation that requires a lot of RAM and CPU.

To understand the process, imagine that you have a polygon and you want the NDVI map for it.

My example polygon
 
A simplified version of the whole process for producing an NDVI map is:
  1. Find out which Sentinel-2 scene you need - it takes many scenes to cover the world, so in this step you find the one that covers your polygon
  2. Download the scene from Amazon S3 - this step is simple but expensive, because the Sentinel-2 bucket is Requester Pays
  3. Calculate the NDVI using the GDAL project (https://gdal.org/) - GDAL has everything needed for this, we just orchestrate the CLI commands
  4. Color the NDVI with a color scheme - the raw NDVI map is grayscale, and we want to deliver a color version to our customers
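
To make steps 3 and 4 more concrete, here is a minimal sketch in Ruby. It is not our production code. The NDVI formula itself is standard: (NIR - Red) / (NIR + Red), where for Sentinel-2 NIR is band B08 and Red is band B04. The helper names (ndvi, ndvi_command, color_command) and the file names are hypothetical, and the two command builders only assemble the arguments for gdal_calc.py and gdaldem color-relief without running them.

```ruby
# NDVI for a single pixel: (NIR - Red) / (NIR + Red), a value in -1.0..1.0.
# For Sentinel-2, NIR is band B08 and Red is band B04.
def ndvi(nir, red)
  sum = nir + red
  return 0.0 if sum.zero? # avoid dividing by zero on empty pixels

  (nir - red).to_f / sum
end

# Hypothetical helper: arguments for gdal_calc.py applying the same
# formula to every pixel of two single-band rasters.
def ndvi_command(red_path, nir_path, out_path)
  [
    'gdal_calc.py',
    '-A', nir_path,
    '-B', red_path,
    "--outfile=#{out_path}",
    '--calc=(A.astype(float)-B)/(A.astype(float)+B)',
    '--type=Float32'
  ]
end

# Hypothetical helper: arguments for gdaldem color-relief, which maps the
# grayscale NDVI values to colors listed in a text color ramp file.
def color_command(ndvi_path, ramp_path, out_path)
  ['gdaldem', 'color-relief', ndvi_path, ramp_path, out_path]
end

puts ndvi(3000, 1000)
puts ndvi_command('B04.tif', 'B08.tif', 'ndvi.tif').join(' ')
```

Healthy vegetation reflects much more near-infrared than red light, so its NDVI is close to 1, while bare soil and water stay near or below 0.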


We use a Rails API and Sidekiq to generate the maps. The first problem is that we have a lot of polygons to process, and that takes a lot of CPU and RAM. In the beginning, we had EC2 machines doing all this work, but we realized that was very expensive.

So we decided to move the Sidekiq workers to our k8s cluster and run them at night (when the load is low), which saved us a few dollars on this operation.

It worked fine but created another problem: Sentinel-2 scenes are large, and we paid a lot to transfer data from Amazon S3 to our k8s cluster. So we decided to create a Lambda function that cuts the scenes inside the Amazon environment and returns only a small file.
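
As a rough sketch of the cutting step, assuming the scene band and the polygon are already available as local files: gdalwarp can clip a raster to a polygon with its -cutline and -crop_to_cutline options. The helper names (crop_command, run!) and the file names are hypothetical, and our real function may differ in the details.

```ruby
require 'open3'

# Hypothetical helper: arguments for gdalwarp that clip a raster to a polygon.
def crop_command(scene_path, polygon_path, out_path)
  [
    'gdalwarp',
    '-cutline', polygon_path, # vector file with the polygon (e.g. GeoJSON)
    '-crop_to_cutline',       # shrink the output extent to the polygon
    scene_path,
    out_path
  ]
end

# Runs a command without going through a shell and raises with the
# captured stderr if the command exits with a non-zero status.
def run!(argv)
  stdout, stderr, status = Open3.capture3(*argv)
  raise "#{argv.first} failed: #{stderr}" unless status.success?

  stdout
end

# Usage (needs GDAL installed):
# run!(crop_command('B08.jp2', 'polygon.geojson', 'B08_cut.tif'))
```

Passing the arguments as an array instead of one interpolated string avoids shell quoting problems with file names.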

So far so good, but there was one more problem: we need the GDAL binaries to cut the scene, so a plain Lambda function is not enough. At this point, we found that it is possible to write a Dockerfile to provision the environment that runs the AWS Lambda function :). We created a Docker image with all the GDAL tools and it worked great.

Our Dockerfile
FROM public.ecr.aws/lambda/ruby:2.7

# GDAL Install
RUN yum update -y && \
 yum install libXcomposite libXcursor libXi libXtst libXrandr alsa-lib mesa-libEGL libXdamage mesa-libGL libXScrnSaver python3-minimal wget git -y && \
 cd /tmp && \
 wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
 sh Miniconda3-latest-Linux-x86_64.sh -b && \
 /root/miniconda3/bin/conda install -y -c conda-forge gdal && \
 chmod 777 -R /root && \
 yum clean all && \
 rm -rf /var/cache/yum && \
 rm -rf /tmp/*

# Miniconda installs into /root/miniconda3 (see the RUN step above)
ENV PATH="/root/miniconda3/bin:${PATH}"
ENV PROJ_LIB="/root/miniconda3/share/proj"

COPY Gemfile ${LAMBDA_TASK_ROOT}
COPY Gemfile.lock ${LAMBDA_TASK_ROOT}

RUN bundle install

COPY app.rb ${LAMBDA_TASK_ROOT}
COPY lib ${LAMBDA_TASK_ROOT}/lib

CMD [ "app.LambdaFunctions::Handler.process" ]


The Ruby code (app.rb)
require 'json'

module LambdaFunctions
  class Handler

    def self.process(event:, context:)
      # The Code
    end
  end
end
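
To give an idea of what can go inside the handler, here is a hedged sketch of a handler that validates its input and returns a small JSON response. Everything specific in it is hypothetical: the class name ExampleHandler, the event keys scene and polygon, and the response shape are assumptions for illustration, not our real payload. The only part taken from the Lambda Ruby runtime is the method signature with the event: and context: keyword arguments.

```ruby
require 'json'

module LambdaFunctions
  # Hypothetical handler for illustration; the event keys are assumptions.
  class ExampleHandler
    REQUIRED_KEYS = %w[scene polygon].freeze

    def self.process(event:, context:)
      # Reject the request early if a required key is missing.
      missing = REQUIRED_KEYS.reject { |key| event.key?(key) }
      unless missing.empty?
        message = "missing: #{missing.join(', ')}"
        return { statusCode: 400, body: JSON.generate(error: message) }
      end

      # The real work (download, cut, NDVI) would go here.
      { statusCode: 200, body: JSON.generate(scene: event['scene'], status: 'accepted') }
    end
  end
end

# Local usage, no AWS needed:
response = LambdaFunctions::ExampleHandler.process(
  event: { 'scene' => 'example-scene', 'polygon' => {} },
  context: nil
)
puts response[:statusCode]
```

Because the handler is a plain class method, it can be called locally like this before the image is ever built.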


NDVI map generated by our app



You can find the complete guide to creating this type of function at https://docs.aws.amazon.com/lambda/latest/dg/ruby-image.html