Lead Engineer at Multitudes
Emily loves to be involved in a range of things from blowing up circuits to riding horses to rockets. She's an Electrical Engineer by trade but has found herself working in the realms of software and data science.
With a passion for both hardware and software, she hopes to combine the two to make technology smarter and more sustainable. Currently, she is leading engineering at Multitudes, a startup providing engineering metrics that aren't creepy to unlock happier and higher-performing teams.
It can be easy to get trapped in expensive streaming products when building a real-time data pipeline. As a startup, you don't have deep pockets to foot a big bill, nor the time to learn a complicated new platform. Can we achieve a close-to-real-time pipeline using cheap serverless products? Emily thinks yes!
She'll dive into how she built a data pipeline that processes GitHub pull requests with AWS Lambda and SQS, one that is fast enough for customers and cheap to run! We will look at the technical details of setting up a Lambda webhook, an SQS reader, and a Lambda that processes data from SQS and loads it into DynamoDB. This pipeline keeps up with events coming in every microsecond and updates data almost immediately, and it does not cost an arm and a leg.
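As a rough illustration of the flow described above, here is a minimal TypeScript sketch of the two pure steps such a pipeline might contain: validating a webhook body before it is queued, and turning a batch of queued message bodies into table items. It is not the talk's actual implementation. The function names, the item shape, and the `REPO#`/`PR#` key scheme are all assumptions made for illustration, and the calls to SQS and DynamoDB themselves are deliberately left out so the sketch runs without the AWS SDK.

```typescript
// Trimmed-down shape of a GitHub pull_request webhook payload.
// Only the fields this sketch needs are listed.
interface PullRequestEvent {
  action: string;
  pull_request: { id: number; title: string; updated_at: string };
  repository: { full_name: string };
}

// Hypothetical item shape for the DynamoDB table (an assumption, not from the talk).
interface PullRequestItem {
  pk: string;
  sk: string;
  title: string;
  lastAction: string;
  updatedAt: string;
}

// Step 1 (webhook Lambda): parse and validate the raw request body.
// Returns null for anything that is not a usable pull request event,
// so the caller can reject it instead of queueing it.
function parsePullRequestEvent(rawBody: string): PullRequestEvent | null {
  try {
    const parsed = JSON.parse(rawBody);
    const ok =
      typeof parsed?.action === "string" &&
      typeof parsed?.pull_request?.id === "number" &&
      typeof parsed?.pull_request?.title === "string" &&
      typeof parsed?.pull_request?.updated_at === "string" &&
      typeof parsed?.repository?.full_name === "string";
    return ok ? (parsed as PullRequestEvent) : null;
  } catch {
    return null;
  }
}

// Step 2 (processing Lambda): convert a batch of SQS message bodies into items.
// Queue delivery order is not guaranteed, so only the most recently updated
// event per pull request is kept.
function toItems(messageBodies: string[]): PullRequestItem[] {
  const latest = new Map<string, PullRequestItem>();
  for (const body of messageBodies) {
    const event = parsePullRequestEvent(body);
    if (event === null) continue; // skip malformed messages
    const item: PullRequestItem = {
      pk: `REPO#${event.repository.full_name}`,
      sk: `PR#${event.pull_request.id}`,
      title: event.pull_request.title,
      lastAction: event.action,
      updatedAt: event.pull_request.updated_at,
    };
    const key = `${item.pk}|${item.sk}`;
    const existing = latest.get(key);
    // ISO 8601 UTC timestamps in the same format compare correctly as strings.
    if (existing === undefined || existing.updatedAt <= item.updatedAt) {
      latest.set(key, item);
    }
  }
  return Array.from(latest.values());
}
```

In a real deployment, the webhook handler would send the validated body to the queue and the processing handler would write the returned items to the table; keeping the transformation separate from those calls makes it easy to test locally.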
When building a data pipeline, you must think about how/when/why the data is used so you can build a platform that is fit for purpose and your budget. Sometimes the simplest solution is best.
Technologies we will cover: AWS Lambda, AWS SQS, AWS DynamoDB, Terraform, TypeScript