Jul 5, 2022 5 min read Data

Automate AWS Lambda Layer Creation with Python and Shell Scripts

In AWS Lambda, a layer provides additional code and dependencies that Lambda functions need in order to run. A layer is a .zip archive that separates the libraries a function depends on from the business logic implemented in the function. Layers promote code sharing and the best practice of keeping business logic separate from the dependencies required to execute it.

For its various Python runtimes and versions, AWS Lambda provides a set of libraries to support function execution. These include all the Python built-in modules as well as some third-party libraries. A list of predefined Python modules available in AWS Lambda can be found here.
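
One quick way to see which libraries still need packaging is to ask the interpreter itself. The sketch below is only an illustration: `missing_modules` is a hypothetical helper name, and the result reflects whichever Python environment runs it, so it is most useful when executed inside a Lambda function of the target runtime.

```python
import importlib.util


def missing_modules(candidates):
    """Return the top-level module names in `candidates` that cannot be found.

    find_spec() returns None when a top-level module is not installed,
    without importing the module itself.
    """
    return [name for name in candidates if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Anything printed here is a candidate for a layer.
    print(missing_modules(["json", "polars", "s3fs"]))
```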

For libraries that are not in the predefined module list, users are expected to package and upload them for their Lambda functions to use. Creating and packaging a Lambda layer is a largely manual process, and it becomes tedious when several modules, possibly at different versions, have to be packaged as a layer for a given function. This post automates the process with a Python script and a shell script that take the desired modules as arguments and produce a .zip archive layer for use with Lambda functions.

Although layers make life easier when deploying Lambda functions, there are important quotas to be aware of. AWS currently allows a total of 5 layers per Lambda function, and the maximum size of an uploadable .zip deployment package (including layers and custom runtimes) is 50MB. The unzipped deployment package must not exceed 250MB. It is therefore good practice to keep the number of libraries in a layer small so that the package stays under these limits. It is also advisable to check the predefined module list first, so that only modules not already included in AWS Lambda are packaged as layers.
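
The size quotas can be checked locally before uploading. The following is a minimal sketch, assuming the limits quoted above; `layer_sizes` and `within_quota` are hypothetical helper names, and the check looks at one archive only, whereas the unzipped quota applies to the function together with all of its layers.

```python
import os
import zipfile

# Limits taken from the figures quoted in this post.
MAX_ZIPPED_BYTES = 50 * 1024 * 1024
MAX_UNZIPPED_BYTES = 250 * 1024 * 1024


def layer_sizes(zip_path):
    """Return (zipped_bytes, unzipped_bytes) for a layer archive."""
    zipped = os.path.getsize(zip_path)
    with zipfile.ZipFile(zip_path) as zf:
        # file_size is the uncompressed size of each archive member.
        unzipped = sum(info.file_size for info in zf.infolist())
    return zipped, unzipped


def within_quota(zip_path):
    """True when this single archive is within both limits."""
    zipped, unzipped = layer_sizes(zip_path)
    return zipped <= MAX_ZIPPED_BYTES and unzipped <= MAX_UNZIPPED_BYTES
```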

Solution Design Steps

The solution steps highlighted below assume that you have Anaconda or its smaller footprint version, Miniconda, installed.

Create a new conda environment with the required Python version


# Define some variables
pyLayerEnv="layerEnv"
projectDir="projectLambdaLayer"
pythonVersion="3.12"
packageName="polars_s3fs_layer"

# Get your base conda environment name
condaBase=$(conda info | grep "base environment" | awk '{ print $4 }' | awk -F"/" '{ print $NF }')

# Create a new conda environment with a new python version
conda create --name $pyLayerEnv python=$pythonVersion --yes

# configure conda to avoid CommandNotFoundError 
source ~/$condaBase/etc/profile.d/conda.sh
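
To make the grep/awk pipeline above easier to follow, here is the same extraction written as a small Python function. It is a sketch under an assumption: that `conda info` prints a line shaped like `base environment : /home/user/miniconda3  (writable)`. The name `conda_base_name` is hypothetical.

```python
def conda_base_name(conda_info_text):
    """Return the last path component of the base environment, or None.

    Mirrors: grep "base environment" | awk '{ print $4 }' | awk -F"/" '{ print $NF }'
    """
    for line in conda_info_text.splitlines():
        if "base environment" in line:
            fields = line.split()
            if len(fields) < 4:
                return None
            base_path = fields[3]  # the fourth whitespace-separated field, awk's $4
            # Unlike the awk version, a trailing slash is tolerated here.
            return base_path.rstrip("/").split("/")[-1]
    return None
```

With the assumed sample line, the function yields `miniconda3`, which is the value the shell snippet stores in `condaBase`.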

Activate the new conda environment


conda activate $pyLayerEnv

Create a project directory and navigate to the folder


mkdir $projectDir && cd $projectDir

Create a package path for libraries to be included in the layer


mkdir -p build/python/lib/python$pythonVersion/site-packages
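
The directory layout matters because Lambda looks for Python packages under the `python` folder of a layer. As an illustration only, the same path can be built with `pathlib`; `site_packages_dir` is a hypothetical helper name.

```python
from pathlib import Path


def site_packages_dir(build_root, python_version):
    """Build the layer's site-packages path for a given Python version."""
    return (
        Path(build_root)
        / "python"
        / "lib"
        / f"python{python_version}"
        / "site-packages"
    )


if __name__ == "__main__":
    target = site_packages_dir("build", "3.12")
    # Equivalent of `mkdir -p`.
    target.mkdir(parents=True, exist_ok=True)
    print(target.as_posix())
```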

Create a shell array and add the required libraries to it


declare -a libArr && libArr+=("polars" "s3fs")

Install each library in the array to the defined path


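# --only-binary=:all: is required by pip whenever --platform is set (unless --no-deps is used)
for item in "${libArr[@]}"
    do
        pip install --platform manylinux2014_x86_64 --only-binary=:all: "$item" -t build/python/lib/python$pythonVersion/site-packages
    done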
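
If you prefer to drive pip from Python rather than from a shell loop, the command can be assembled as an argument list. This is a sketch, not the script used in this post: `pip_install_command` is a hypothetical helper, and it includes `--only-binary=:all:` because pip generally refuses `--platform` unless that option or `--no-deps` is given.

```python
import sys


def pip_install_command(library, target_dir, platform="manylinux2014_x86_64"):
    """Return the pip invocation that installs `library` into `target_dir`."""
    return [
        sys.executable, "-m", "pip", "install",
        "--platform", platform,
        "--only-binary=:all:",  # wheels only, needed alongside --platform
        "--target", str(target_dir),
        library,
    ]


if __name__ == "__main__":
    for lib in ["polars", "s3fs"]:
        cmd = pip_install_command(lib, "build/python/lib/python3.12/site-packages")
        # Pass cmd to subprocess.run(cmd, check=True) to actually install.
        print(" ".join(cmd))
```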

Consider removing unnecessary files and folders such as __pycache__, LICENSE files, etc. to reduce the layer size.


# -prune stops find from descending into a directory it is about to delete
find build/python/lib/python$pythonVersion/site-packages -type d -name "__pycache__" -prune -exec rm -rf {} \;

find build/python/lib/python$pythonVersion/site-packages -type f -name "*LICENSE*" -exec rm -rf {} \;
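
The same cleanup can be sketched in Python with `pathlib` and `shutil`. The function name `prune_site_packages` is hypothetical, and the patterns simply mirror the two `find` commands above.

```python
import shutil
from pathlib import Path


def prune_site_packages(site_packages):
    """Delete __pycache__ folders and *LICENSE* files; return how many were removed."""
    root = Path(site_packages)
    removed = 0
    # Collect matches first so that deleting does not disturb the directory walk.
    for cache_dir in [p for p in root.rglob("__pycache__") if p.is_dir()]:
        if cache_dir.exists():  # may already be gone if nested in a removed folder
            shutil.rmtree(cache_dir)
            removed += 1
    for license_file in [p for p in root.rglob("*LICENSE*") if p.is_file()]:
        license_file.unlink()
        removed += 1
    return removed
```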

Change to build directory and zip the installed libraries


cd build && zip -r $packageName python
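
The important detail in this step is that `python` sits at the top level of the archive. A minimal sketch of the same step with the standard `zipfile` module follows; `zip_layer` is a hypothetical helper name.

```python
import zipfile
from pathlib import Path


def zip_layer(build_dir, package_name):
    """Zip build_dir/python into build_dir/<package_name>.zip and return its path."""
    build = Path(build_dir)
    archive = build / f"{package_name}.zip"
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted((build / "python").rglob("*")):
            if path.is_file():
                # Store paths relative to build/, so entries start with "python/".
                zf.write(path, path.relative_to(build).as_posix())
    return archive
```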

Deactivate and remove the conda environment


conda deactivate && conda remove --name $pyLayerEnv --all --yes

Move the zip layer package into the project directory


mv $packageName.zip ../

If the zipped layer file is less than 10MB, you can upload it directly on the Lambda console when you're creating the layer. For a larger file size, you can upload it via Amazon S3.
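
The same choice between a direct upload and S3 exists when publishing the layer programmatically. The sketch below only builds the `Content` argument accepted by boto3's `publish_layer_version`; it does not call AWS. The helper name `layer_content`, the bucket and key values, and reusing 10MB as the cut-off are all assumptions made for illustration, and the archive must already have been uploaded to S3 for the second form to work.

```python
import os

# Assumed cut-off, borrowed from the console guidance mentioned above.
DIRECT_UPLOAD_LIMIT_BYTES = 10 * 1024 * 1024


def layer_content(zip_path, bucket=None, key=None, direct_limit=DIRECT_UPLOAD_LIMIT_BYTES):
    """Return the Content mapping for publish_layer_version."""
    if os.path.getsize(zip_path) < direct_limit:
        with open(zip_path, "rb") as fh:
            return {"ZipFile": fh.read()}
    if not bucket or not key:
        raise ValueError("archive is too large for direct upload; give bucket and key")
    return {"S3Bucket": bucket, "S3Key": key}


# With boto3 installed and credentials configured, the call would look like:
#   boto3.client("lambda").publish_layer_version(
#       LayerName="polars-s3fs",               # hypothetical layer name
#       Content=layer_content("polars-s3fs.zip"),
#       CompatibleRuntimes=["python3.12"],
#   )
```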

Final Code Design

Putting the above shell commands together, you can run the code either as a standalone shell script or as a command passed to Python's subprocess module, as shown below.


import argparse
import subprocess


def getParser():

    # create the parser
    parser = argparse.ArgumentParser(
        prog="createLambdaLayerPackage",
        description="Create a Lambda layer package",
        allow_abbrev=False,
    )

    # add arguments
    parser.add_argument(
        "--layer-package-name",
        required=True,
        type=str,
        help="Name of the package zip file",
    )

    parser.add_argument(
        "--runtime-version",
        required=True,
        type=str,
        help="Python version to use for the Lambda layer e.g. 3.12",
    )

    parser.add_argument(
        "--layer-library",
        nargs="+",
        required=True,
        type=str,
        help="Python libraries to be included in the Lambda layer package. Separate multiple libraries with spaces.",
    )

    # parse the arguments
    args = parser.parse_args()
    return args



def createLambdaLayerPackage(myArgs: argparse.Namespace) -> None:
    """ A function to create an AWS Lambda layer package
    Args:
        --runtime-version (str): Python version to use for the Lambda layer e.g. 3.12
        --layer-library (str): Python libraries to be included in the Lambda layer package.
        --layer-package-name (str): Name of the package zip file.

    Returns:
        A zip layer package/file under projectLambdaLayer directory.

    Usage: createLambdaLayerPackage.py --layer-package-name polars-pyarrow-layer --runtime-version 3.12 --layer-library polars pyarrow
    """

    try:
        # get a dict of arguments and values
        mydict = vars(myArgs)

        # extract values from arguments
        modList = mydict["layer_library"]
        runtimeVal = mydict["runtime_version"]
        packageVal = mydict["layer_package_name"]

        # combine layer-library arg values into a string
        modStr = " ".join(modList)

        shellCmd = f"""
            pyLayerEnv="envLayer"
            projectDir="projectLambdaLayer"
            pythonVersion={runtimeVal}
            packageName={packageVal}

            # get your base conda environment name
            condaBase=$(conda info | grep "base environment" | awk '{{ print $4 }}' | awk -F"/" '{{ print $NF }}')

            # create a new conda environment with python version desired
            conda create --name $pyLayerEnv python=$pythonVersion --yes

            # configure conda to avoid CommandNotFoundError
            source ~/$condaBase/etc/profile.d/conda.sh

            # activate the conda environment
            conda activate $pyLayerEnv

            # create a project directory and navigate into the folder
            mkdir $projectDir && cd $projectDir

            # create a package path for libraries to be included in the layer
            mkdir -p build/python/lib/python$pythonVersion/site-packages

            # create an array to store libraries
            declare -a libArr

            # add libraries to array
            libArr+=({modStr})

            # install libraries in the array to the defined path
            for item in "${{libArr[@]}}"
                do
                    pip install --platform manylinux2014_x86_64 --only-binary=:all: "$item" -t build/python/lib/python$pythonVersion/site-packages
                done

            # consider removing unnecessary files and folders such as __pycache__, LICENSES, etc. to reduce package size
            find build/python/lib/python$pythonVersion/site-packages -type d -name "__pycache__" -prune -exec rm -rf {{}} \\;
            find build/python/lib/python$pythonVersion/site-packages -type f -name "*LICENSE*" -exec rm -rf {{}} \\;

            # change to build directory and zip the installed libraries
            cd build && zip -r $packageName python

            # deactivate and remove the conda environment
            conda deactivate && conda remove --name $pyLayerEnv --all --yes

            # move the zip layer package under the project directory
            mv $packageName.zip ../

            # step out of build folder
            cd ..

            # remove the build folder
            rm -rf build
            """

        # run the shell script
        result = subprocess.run(
            ["/bin/bash"],
            input=shellCmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            encoding="utf-8",
        )

        if result.returncode != 0:
            print(result.stderr)
            raise Exception("Error creating Lambda layer package")
        else:
            print("Lambda layer package created successfully")

    except Exception as err:
        print(err)


def main():
    # get the required arguments
    modArgs = getParser()

    # create the lambda layer
    createLambdaLayerPackage(modArgs)


if __name__ == "__main__":
    main()
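
Because the script interpolates command-line values straight into shell text, it can be worth validating and quoting them first. The helpers below are a defensive sketch and not part of the script above: `validate_runtime_version` and `shell_words` are hypothetical names, and the accepted version pattern (3.x) is an assumption.

```python
import re
import shlex


def validate_runtime_version(version):
    """Accept versions such as 3.9 or 3.12; reject anything else."""
    if not re.fullmatch(r"3\.\d{1,2}", version):
        raise ValueError(f"unexpected Python version: {version!r}")
    return version


def shell_words(values):
    """Join values into one string that is safe to embed in a shell script."""
    return " ".join(shlex.quote(value) for value in values)


if __name__ == "__main__":
    print(validate_runtime_version("3.12"))
    # Version specifiers contain shell metacharacters, so they get quoted.
    print(shell_words(["polars>=1.0", "s3fs"]))
```

In the script, `shell_words(modList)` could stand in for the plain `" ".join(modList)`, so that a requirement such as `polars>=1.0` is not treated as a redirection by bash.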

Example Script Usage

The example below creates a layer consisting of polars and s3fs modules. When the script execution is completed, a .zip archive file polars-s3fs.zip will be created under the project folder named projectLambdaLayer in your current working directory.

python createLambdaLayerPackage.py --layer-package-name polars-s3fs --runtime-version 3.12 --layer-library polars s3fs

Additionally, you can run the command below to see help tips on how to use the script and the required arguments.

python createLambdaLayerPackage.py -h

See the full code, including the shell script, in my GitHub repository. Thanks for reading.
