Automate AWS Lambda Layer Creation with Python and Shell Scripts
In AWS Lambda, a layer provides additional code and dependencies that Lambda functions need in order to run. A layer is a .zip archive that separates the libraries a function depends on from the business logic implemented in the function. Layers promote code sharing and the good practice of keeping business logic apart from the dependencies required to execute it.
For each supported Python runtime and version, AWS Lambda provides a set of libraries to support function execution. These include the Python built-in modules as well as some third-party libraries. A list of predefined Python modules available in AWS Lambda can be found here.
For libraries that are not on the predefined module list, users are expected to package and upload them for use by their Lambda functions. Creating and packaging a Lambda layer is a fairly manual process, and it becomes tedious when several modules, possibly of different versions, have to be packaged as a layer for a given function. This post automates the process with a Python script and a shell script that can be run with the desired modules as arguments to create a .zip archive layer for use with Lambda functions.
Although layers make life easier when deploying lambda functions, there are important quotas that users need to be aware of. AWS currently allows a total of 5 layers per lambda function and the maximum size for an uploadable .zip archive deployment package (including layers and custom runtimes) is 50MB. The unzipped deployment package maximum size must not go above 250MB. It's therefore a good practice to ensure that too many libraries are not added to the layer to avoid ballooning the size of the package above the limit. It is also advisable to first go through the predefined module list to ensure that only modules that are not already included in AWS Lambda are being packaged as layers for lambda functions.
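To make the size limits above concrete, here is a minimal sketch that measures a layer archive before you upload it. The function names are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption. It only looks at a single archive, so it does not account for the function code or other layers that share the unzipped limit.

```python
import zipfile
from pathlib import Path

# Limits quoted in this post, in bytes (1 MB = 1024 * 1024 bytes is an assumption).
MAX_ZIPPED_BYTES = 50 * 1024 * 1024
MAX_UNZIPPED_BYTES = 250 * 1024 * 1024


def layer_sizes(zip_path):
    """Return (zipped_bytes, unzipped_bytes) for a layer archive."""
    zipped = Path(zip_path).stat().st_size
    with zipfile.ZipFile(zip_path) as archive:
        # file_size is the uncompressed size recorded for each entry
        unzipped = sum(info.file_size for info in archive.infolist())
    return zipped, unzipped


def within_quota(zip_path):
    """True when this single archive stays under both limits."""
    zipped, unzipped = layer_sizes(zip_path)
    return zipped <= MAX_ZIPPED_BYTES and unzipped <= MAX_UNZIPPED_BYTES
```

Calling within_quota("projectLambdaLayer/pandas-pyarrow-s3fs.zip") after a build is a quick way to spot a layer that has grown too large.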
Solution Steps
- Create a new conda environment with the desired Python version
- Activate the conda environment
- Create a project directory
- Get into the project directory
- Create a package path for libraries to be added to the layer e.g. build/python/lib/python3.7/site-packages
- Create a shell array to store the desired module names
- Add the required libraries to the array
- Install each library in the array to the defined path
- Consider removing unnecessary files and folders such as __pycache__, LICENSES, etc. to reduce the final package size.
- Navigate into the build directory
- Zip the installed libraries within the build folder and name the zip file
- Deactivate and remove the conda environment
- Move the zip layer package into the project directory
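The path, clean-up and zip steps above can also be sketched in pure Python with the standard library. This is an illustration under assumptions, not the approach the script in this post takes (it delegates these steps to bash), and the helper names are hypothetical.

```python
import shutil
from pathlib import Path


def site_packages_dir(build_dir, runtime_version):
    """Path pip should install into, e.g. build/python/lib/python3.9/site-packages."""
    return Path(build_dir) / "python" / "lib" / f"python{runtime_version}" / "site-packages"


def prune_pycache(root):
    """Delete every __pycache__ directory under root; return how many were found."""
    targets = [p for p in Path(root).rglob("__pycache__") if p.is_dir()]
    for target in targets:
        shutil.rmtree(target, ignore_errors=True)
    return len(targets)


def zip_layer(build_dir, package_name, dest_dir):
    """Zip build_dir/python into dest_dir/package_name.zip; return the archive path."""
    base_name = str(Path(dest_dir) / package_name)
    # base_dir="python" keeps "python/" as the top-level folder inside the archive
    return shutil.make_archive(base_name, "zip", root_dir=build_dir, base_dir="python")
```

Keeping "python" as the top-level folder in the archive matters because the package path in the steps above starts at python/lib.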
Code Design
import argparse
import subprocess


def getParser():
    # create the parser
    parser = argparse.ArgumentParser(
        prog="createLambdaLayerPackage",
        description="Create a Lambda layer package",
        allow_abbrev=False,
    )
    # add arguments
    parser.add_argument(
        "--layer-package-name",
        required=True,
        type=str,
        help="Name of the package zip file",
    )
    parser.add_argument(
        "--runtime-version",
        required=True,
        type=str,
        help="Python version to use for the Lambda layer e.g. 3.7",
    )
    parser.add_argument(
        "--layer-library",
        nargs=argparse.REMAINDER,
        required=True,
        type=str,
        help="Python libraries to be included in the Lambda layer package. Separate multiple libraries with spaces.",
    )
    # parse the arguments
    args = parser.parse_args()
    return args


def createLambdaLayerPackage(myArgs: argparse.Namespace) -> None:
    """A function to create an AWS Lambda layer package

    Args (read from the parsed command-line arguments in myArgs):
        --runtime-version (str): Python version to use for the Lambda layer e.g. 3.7
        --layer-library (str): Python libraries to be included in the Lambda layer package.
        --layer-package-name (str): Name of the package zip file.
    Returns:
        A zip layer package/file under projectLambdaLayer directory.
    Usage: createLambdaLayerPackage.py --layer-package-name pandas-pyarrow-layer --runtime-version 3.9 --layer-library pandas pyarrow
    """
    try:
        # get a dict of arguments and values
        mydict = vars(myArgs)
        # extract values from arguments
        modList = mydict["layer_library"]
        runtimeVal = mydict["runtime_version"]
        packageVal = mydict["layer_package_name"]
        # combine layer-library arg values into a string
        modStr = " ".join(modList)
        # doubled braces ({{ }}) are literal braces in the f-string; \\; reaches bash as \;
        shellCmd = f"""
        pyLayerEnv="envLayer"
        projectDir="projectLambdaLayer"
        pythonVersion={runtimeVal}
        packageName={packageVal}
        # get your base conda environment name (awk fields must be unquoted to be expanded)
        condaBase=$(conda info | grep "base environment" | awk '{{ print $4 }}' | awk -F"/" '{{ print $NF }}')
        # create a new conda environment with python version desired
        conda create --name $pyLayerEnv python=$pythonVersion --yes
        # configure conda to avoid CommandNotFoundError
        source ~/$condaBase/etc/profile.d/conda.sh
        # activate the conda environment
        conda activate $pyLayerEnv
        # create a project directory
        mkdir $projectDir
        # get into the project directory
        cd $projectDir
        # create a path for libraries to be added to the layer
        mkdir -p build/python/lib/python$pythonVersion/site-packages
        # create an array to store libraries
        declare -a libArr
        # Add libraries to array
        libArr+=({modStr})
        # install libraries in the array to the defined path
        for item in "${{libArr[@]}}"
        do
            pip install "$item" -t build/python/lib/python$pythonVersion/site-packages
        done
        # consider removing unnecessary files and folders such as __pycache__, LICENSES, etc. to reduce package size
        find build/python/lib/python$pythonVersion/site-packages -type d -name "__pycache__" -exec rm -rf {{}} \\;
        find build/python/lib/python$pythonVersion/site-packages -type f -name "*LICENSE*" -exec rm -rf {{}} \\;
        # get into the build directory
        cd build
        # zip the installed libraries within the build folder and name the zip file
        zip -r $packageName python
        # deactivate the conda environment
        conda deactivate
        # remove the conda environment
        conda remove --name $pyLayerEnv --all --yes
        # move the zip layer package under the project directory
        mv $packageName.zip ../
        # step out of build folder
        cd ..
        # remove the build folder
        rm -rf build
        """
        # run the shell script
        result = subprocess.run(
            ["/bin/bash"],
            input=shellCmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            encoding="utf-8",
        )
        if result.returncode != 0:
            print(result.stderr)
            raise Exception("Error creating Lambda layer package")
        else:
            print("Lambda layer package created successfully")
    except Exception as err:
        print(err)


def main():
    # get the required arguments
    modArgs = getParser()
    # create the lambda layer
    createLambdaLayerPackage(modArgs)


if __name__ == "__main__":
    main()
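Because the argument values are pasted straight into the bash script, it may be worth rejecting anything that does not look like a version number or a package name before the script runs. The sketch below is one possible way to do that with argparse type callables. The validator names, the accepted patterns (a 3.x version, and a package name with an optional ==version pin) and the use of nargs="+" are my assumptions, not part of the script above.

```python
import argparse
import re

# Assumed patterns: "3.9" or "3.12"; "pandas" or "pandas==1.5.3"
_VERSION_RE = re.compile(r"3\.\d{1,2}")
_LIBRARY_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*(==[A-Za-z0-9._]+)?")


def runtime_version(value):
    """argparse type: accept only values shaped like 3.9."""
    if not _VERSION_RE.fullmatch(value):
        raise argparse.ArgumentTypeError(f"expected a version such as 3.9, got {value!r}")
    return value


def library_spec(value):
    """argparse type: accept only plain names with an optional ==version pin."""
    if not _LIBRARY_RE.fullmatch(value):
        raise argparse.ArgumentTypeError(f"unsupported library spec: {value!r}")
    return value


def build_parser():
    """Hypothetical parser showing where the validators plug in."""
    parser = argparse.ArgumentParser(prog="createLambdaLayerPackage", allow_abbrev=False)
    parser.add_argument("--runtime-version", required=True, type=runtime_version)
    parser.add_argument("--layer-library", required=True, nargs="+", type=library_spec)
    return parser
```

With these validators, a value such as "3.9; rm -rf x" makes argparse exit with a usage error instead of reaching the shell.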
Example Script Usage
To use the script, first make sure conda is installed and available on your PATH. The example execution below creates a layer consisting of the pandas, pyarrow and s3fs modules. When the script finishes, a .zip archive named pandas-pyarrow-s3fs.zip will be created under the project folder projectLambdaLayer in your current working directory.
python createLambdaLayerPackage.py --layer-package-name pandas-pyarrow-s3fs --runtime-version 3.9 --layer-library pandas pyarrow s3fs
Additionally, you can run the command below to see the help tips on how to use the script and the required arguments.
python createLambdaLayerPackage.py -h
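After a run, a quick sanity check is to confirm that everything in the archive sits under a single top-level python folder, which is the layout the build path in this post produces. The helper names below are hypothetical and the check is only a sketch.

```python
import zipfile


def top_level_entries(zip_path):
    """Return the sorted set of top-level names found in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted({name.split("/", 1)[0] for name in archive.namelist()})


def has_layer_layout(zip_path):
    """True when the only top-level entry in the archive is 'python'."""
    return top_level_entries(zip_path) == ["python"]
```

For the example above, has_layer_layout("projectLambdaLayer/pandas-pyarrow-s3fs.zip") should be True if the build completed as described.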
See the full code, including the shell script, in my GitHub repository. Thanks for reading.