Introducing Corral: A Serverless MapReduce Framework
This post gives a technical overview and architectural justification for my latest project, corral — a serverless MapReduce framework.
For a recent project, I needed to read data from a specific chunk of a file. The data was a sequence of serialized records, so I used a bufio Scanner for splitting. Scanners are great, but they obscure the exact number of bytes read. In working through the problem, I found a solution that worked quite nicely.
I’ve been going through a period of programming language wanderlust over the past couple of months. Recently, I’ve become quite interested in Rust. Coming from Python, I’ve found many of Rust’s language features to be quite powerful.
If told to write a web crawler, the tools at the top of my mind would be Python-based: BeautifulSoup or Scrapy. However, the ecosystem for writing web scrapers and crawlers in Go is quite robust. In particular, Colly and Goquery are extremely powerful tools that afford a similar amount of expressiveness and flexibility to their Python-based counterparts.
Yelp’s MRJob is a fantastic way of interfacing with Hadoop MapReduce in Python. It has built-in support for several ways of running Hadoop jobs: AWS’s EMR, GCP’s Dataproc, local execution, and normal Hadoop.
During the month of December, I used the daily Advent of Code puzzles to teach myself Go.
2017 was an interesting year. Personally and professionally, I found it to be a successful year of growth.
This summer, I worked primarily in Java on a backend service for Zillow’s advertising platform. This service was built on top of Spring Boot, an “opinionated” Java web framework.
This summer, I had the opportunity to do a Software Engineering internship at Zillow. I grew quite a bit as a developer during my time there, and gained a lot of hands-on experience applying the technologies that I’ve learned in school and in my side projects.
I’ve been working on a site for the past month that generates 3D models from GitHub’s contribution graphs.
After a month of work, it’s complete! So, I’m happy to announce GitTrophy. It’s pretty slick: type in a GitHub user or repo name, and you’ll get a 3D preview of the contribution model. I 3D-printed my chart from 2016, and it turned out pretty well.