January 20, 2022
Ruby AWS Lambda Functions to generate NDVI maps
In my company, we have an application that generates NDVI (normalized difference vegetation index) maps from Sentinel-2 scenes. This is a complex operation that requires a lot of RAM and CPU.
To understand the process, imagine that you have a polygon and you want to have the NDVI map for that polygon.
A simplified version of the process for producing an NDVI map is:
- Find out which Sentinel-2 scene you need - many scenes are required to cover the world, so in this step you find the one that covers your polygon
- Download the scene from Amazon S3 - this step is simple but expensive, because the Sentinel-2 bucket is Requester Pays
- Calculate the NDVI using the GDAL project (https://gdal.org/) - GDAL has all the features needed for this, we just orchestrate the CLI commands
- Color the NDVI with a color scheme - the raw NDVI map is grayscale and we want to deliver a color version to our customers
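The last two steps can be sketched in Ruby. This is only an illustration, not our production code: the helper names (`ndvi`, `ndvi_command`, `color_command`) and the file paths are hypothetical. It assumes the red and near-infrared bands (B04 and B08 in Sentinel-2) are already on disk as separate rasters, and that `gdal_calc.py` and `gdaldem` are on the PATH. NDVI is (NIR - Red) / (NIR + Red), so values fall between -1 and 1.

```ruby
# NDVI for a single pixel, from near-infrared and red values.
# Returns 0.0 when both bands are zero, to avoid dividing by zero.
def ndvi(nir, red)
  sum = nir + red
  return 0.0 if sum.zero?

  (nir - red).to_f / sum
end

# Hypothetical helper: argv for gdal_calc.py, applying the same formula
# to whole rasters. A is the NIR band, B is the red band. The bands are
# cast to float so that unsigned integer pixels cannot wrap around.
def ndvi_command(red_path, nir_path, out_path)
  [
    'gdal_calc.py',
    '-A', nir_path,
    '-B', red_path,
    "--outfile=#{out_path}",
    '--calc=(A.astype(float)-B)/(A.astype(float)+B)',
    '--type=Float32'
  ]
end

# Hypothetical helper: argv for gdaldem color-relief, which maps the
# grayscale NDVI values to colors listed in a text file
# (one "value R G B" entry per line).
def color_command(ndvi_path, color_table_path, out_path)
  ['gdaldem', 'color-relief', ndvi_path, color_table_path, out_path]
end
```

Each array can be passed to `system(*command)` or `Open3.capture2e(*command)`. Building the command as an array rather than a single string avoids shell quoting problems with file paths.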
We use a Rails API and Sidekiq to generate the maps. The first problem is that we have a lot of polygons to process, and we need a lot of CPU and RAM to do that. In the beginning, we had EC2 machines doing all this work, but we realized it was very expensive.
So we decided to run the Sidekiq workers on our Kubernetes (k8s) cluster at night, when the load is low, and we saved a few dollars on this operation.
It worked fine but created another problem: Sentinel-2 scenes are large, and we paid a lot to transfer data from Amazon S3 to our k8s cluster. So we decided to create a Lambda function that cuts the scenes inside the Amazon environment and returns only a small file.
So far so good, but there was one more problem: we need the GDAL binaries to cut the scene, so a plain Lambda function is not enough. At this point, we found that it is possible to write a Dockerfile that provisions the environment in which the AWS Lambda function runs :). We created a Docker image with all the GDAL tools and it worked great.
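As a rough sketch of what "cutting" means here, a scene can be cropped to a polygon with `gdalwarp` and its `-cutline` and `-crop_to_cutline` options. The helper name `crop_command` and the paths are hypothetical, and the nodata value of 0 is an assumption. The polygon is assumed to be saved in a vector file GDAL can read, such as GeoJSON.

```ruby
# Hypothetical helper: argv for gdalwarp that crops a scene to a polygon.
# -cutline         vector file containing the polygon
# -crop_to_cutline shrink the output extent to the polygon's extent
# -dstnodata       value written to pixels outside the polygon (assumed 0)
def crop_command(scene_path, cutline_path, out_path)
  [
    'gdalwarp',
    '-cutline', cutline_path,
    '-crop_to_cutline',
    '-dstnodata', '0',
    scene_path,
    out_path
  ]
end

# Example (hypothetical paths):
#   system(*crop_command('/tmp/scene.tif', '/tmp/polygon.geojson', '/tmp/cropped.tif'))
```

The cropped file covers only the polygon's extent, so it is usually much smaller than the full scene, and that is what keeps the transfer out of AWS cheap.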
Our Dockerfile
FROM public.ecr.aws/lambda/ruby:2.7

# GDAL Install
RUN yum update -y && \
    yum install -y libXcomposite libXcursor libXi libXtst libXrandr alsa-lib mesa-libEGL libXdamage mesa-libGL libXScrnSaver python3-minimal wget git && \
    cd /tmp && \
    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
    sh Miniconda3-latest-Linux-x86_64.sh -b && \
    /root/miniconda3/bin/conda install -y -c conda-forge gdal && \
    chmod 777 -R /root && \
    yum clean all && \
    rm -rf /var/cache/yum && \
    rm -rf /tmp/*

# The Miniconda batch install (-b) goes to /root/miniconda3
ENV PATH="/root/miniconda3/bin:${PATH}"
ENV PROJ_LIB="/root/miniconda3/share/proj"

# Install the gems before copying the code, so this layer is cached
COPY Gemfile ${LAMBDA_TASK_ROOT}
COPY Gemfile.lock ${LAMBDA_TASK_ROOT}
RUN bundle install

COPY app.rb ${LAMBDA_TASK_ROOT}
COPY lib ${LAMBDA_TASK_ROOT}/lib

# Handler format: <file>.<Module>::<Class>.<method>
CMD [ "app.LambdaFunctions::Handler.process" ]
The ruby code (app.rb)
require 'json'

module LambdaFunctions
  class Handler
    def self.process(event:, context:)
      # The Code
    end
  end
end
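To give an idea of how such a handler can drive GDAL, here is a minimal sketch. It is not our actual handler: the class name `CropHandler`, the event keys (`scene_path`, `cutline_path`, `output_path`) and the response shape are all assumptions made for the example. The `runner` argument exists only so the sketch can be exercised on a machine without GDAL installed.

```ruby
require 'json'
require 'open3'

module LambdaFunctions
  # Hypothetical handler that crops a scene with gdalwarp.
  class CropHandler
    def self.process(event:, context:, runner: Open3.method(:capture2e))
      scene   = event.fetch('scene_path')    # assumed event key
      cutline = event.fetch('cutline_path')  # assumed event key
      # /tmp is the writable directory inside a Lambda container
      output  = event.fetch('output_path', '/tmp/cropped.tif')

      command = ['gdalwarp', '-cutline', cutline, '-crop_to_cutline', scene, output]
      # capture2e returns the combined stdout/stderr and the process status
      log, status = runner.call(*command)

      if status.success?
        { statusCode: 200, body: JSON.generate({ output: output }) }
      else
        { statusCode: 500, body: JSON.generate({ error: log }) }
      end
    end
  end
end
```

With this class in app.rb, the CMD line in the Dockerfile would be "app.LambdaFunctions::CropHandler.process".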
You can find the complete guide to creating this type of function at https://docs.aws.amazon.com/lambda/latest/dg/ruby-image.html