AWS Athena – Update an existing table using AWS Lambda + CloudWatch Events schedule

Scenario

1. An Athena external table is created over an S3 folder structure (/mybucket-s3-bucket/logs/2017/03/03). A new folder is created each day, and that day's logs are pushed into it.

CREATE EXTERNAL TABLE tbl03 (col1 string) PARTITIONED BY (year string, month string, date string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 's3://mybucket-s3-bucket/';
ALTER TABLE tbl03 ADD PARTITION (year='2017', month='3', date='03') LOCATION 's3://mybucket-s3-bucket/logs/2017/03/03/';

2. The table now has a partition for today’s date (year='2017', month='3', date='03'), and a new partition must be added every day.

3. The attached Java code connects to your Athena database, lists the existing tables (under any database), and creates a new partition every day from a Lambda function triggered by a CloudWatch Events schedule.
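The attached files are not reproduced here. As a rough sketch of the statement such a function would need to issue each day, the helper below builds the ALTER TABLE text for a given date. The class name DailyPartition and the method addPartitionSql are hypothetical (they are not taken from the attached code), the logs/yyyy/MM/dd folder layout is assumed from the scenario above, and ADD IF NOT EXISTS is an addition so that a repeated run does not fail on an existing partition.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper for illustration; not one of the attached files.
final class DailyPartition {

    // Assumed folder layout under the bucket: logs/yyyy/MM/dd
    private static final DateTimeFormatter FOLDER =
            DateTimeFormatter.ofPattern("yyyy/MM/dd", Locale.ROOT);

    // Builds the ADD PARTITION statement for one day.
    // Partition values mirror the example above: month unpadded ('3'), date zero-padded ('03').
    static String addPartitionSql(String table, String bucket, LocalDate day) {
        String location = "s3://" + bucket + "/logs/" + day.format(FOLDER) + "/";
        return String.format(Locale.ROOT,
                "ALTER TABLE %s ADD IF NOT EXISTS PARTITION (year='%d', month='%d', date='%02d') LOCATION '%s'",
                table, day.getYear(), day.getMonthValue(), day.getDayOfMonth(), location);
    }

    public static void main(String[] args) {
        // Lambda runs in UTC, so the current UTC date is used here.
        System.out.println(addPartitionSql("tbl03", "mybucket-s3-bucket", LocalDate.now(ZoneOffset.UTC)));
    }
}
```

The resulting string would then be executed against Athena by whatever client the function uses; that part depends on the attached code and is left out of this sketch.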


Hello.Java

CloudwatchEventsRequest.java

CloudwatchEventsResponse.java

dependcy_for_build


4. Build the class files and package them into a JAR file.

5. Create a new Lambda function by selecting “blank function”, and choose “CloudWatch Events – Schedule” as the trigger.

Step:1 Configure trigger
–> Rule Name “some_name”
–> Rule Description “something”
–> Schedule expression: cron(0 1 * * ? *). This example invokes the Lambda function at 01:00 (UTC) every day.
–> Enable Trigger (check this)
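
The six fields of the schedule expression are minutes, hours, day of month, month, day of week and year, so cron(0 1 * * ? *) means minute 0 of hour 1 on every day. To make the timing concrete, here is a small sketch that computes when such a rule would next fire. The class ScheduleSketch and its method nextRun are hypothetical names used only for illustration; the code does not parse cron expressions, it simply hard-codes the 01:00 UTC rule.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical illustration of the daily 01:00 UTC schedule; not part of the attached files.
final class ScheduleSketch {

    // Returns the first 01:00 UTC instant strictly after the given time.
    static ZonedDateTime nextRun(ZonedDateTime now) {
        ZonedDateTime utc = now.withZoneSameInstant(ZoneOffset.UTC);
        ZonedDateTime candidate = utc.toLocalDate().atTime(1, 0).atZone(ZoneOffset.UTC);
        return candidate.isAfter(utc) ? candidate : candidate.plusDays(1);
    }

    public static void main(String[] args) {
        System.out.println(nextRun(ZonedDateTime.now(ZoneOffset.UTC)));
    }
}
```

Because the schedule is evaluated in UTC, the date on which the function runs may differ from the local date on which the logs were written, which is worth keeping in mind when choosing the partition date.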

Step:2 Configure function
–> Name of function
–> Description
–> Runtime “Java 8”

–> Upload your JAR file (the one you built from the attached files)

–> Handler* “Hello”
–> Choose existing AWS lambda role
