Top Netflix Open Source Projects You Must Know


This is the 3rd post in the series covering top open source projects by the most popular companies in the world. Today, we are covering Netflix, which has an incredible open source culture. To get a sense of what they do in open source, take a look at their GitHub repository. They’ve released over 50 open source projects, with several more in the pipeline. They also host regular public NetflixOSS meetups in the Bay Area.

Netflix both leverages and provides open source technology focused on providing the leading Internet television network. Their technology focuses on providing immersive experiences across all internet-connected screens. Netflix’s deployment technology allows for continuous build and integration into their worldwide deployments serving members in over 50 countries. Their focus on reliability defined the bar for cloud-based elastic deployments with several layers of failover. Netflix also provides the technology to operate services responsibly with operational insight, peak performance, and security. This list of top Netflix open source projects covers the following six categories: Big Data; Build and Delivery Tools; Common Runtime Services and Libraries; Data Persistence; Insight, Reliability, and Performance; and Security.

Let’s look at the top Netflix open source projects one by one:

Top Netflix Open Source Projects – Big Data

Genie

Genie is a federated job execution engine which provides RESTful APIs to run a variety of big data jobs like Hadoop, Pig, Hive, Presto, Sqoop, and more. It also provides APIs for managing the configurations of many distributed processing clusters and the commands and applications which run on them.

Inviso

Inviso is an interface to search and visualize Hadoop jobs, job performance, and cluster utilization data.

Lipstick

Lipstick is a godsend for Pig developers. It combines a graphical depiction of a Pig workflow with information about the job as it executes, giving developers insight that previously required a lot of sifting through logs (or a Pig expert) to piece together.


Aegisthus

Aegisthus enables the bulk extraction of data out of Cassandra for downstream analytic processing. It does so by implementing a reader for the SSTable format and providing a map/reduce program to create a compacted snapshot of the data contained in a column family.
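To illustrate what a compacted snapshot means, here is a minimal Python sketch. It is not Aegisthus code: it assumes each cell read from the SSTables is a hypothetical (row_key, column, value, timestamp) tuple, and it keeps only the newest value per row and column in a map/reduce style. Deletions (tombstones) are ignored for brevity.

```python
from collections import defaultdict

def map_cells(cells):
    """Map step: key each cell by (row_key, column)."""
    for row_key, column, value, timestamp in cells:
        yield (row_key, column), (timestamp, value)

def reduce_latest(mapped):
    """Reduce step: keep the value with the highest timestamp per key."""
    grouped = defaultdict(list)
    for key, versioned in mapped:
        grouped[key].append(versioned)
    return {
        key: max(versions, key=lambda v: v[0])[1]
        for key, versions in grouped.items()
    }

def compact(cells):
    """Collapse many versions of each cell into one snapshot."""
    return reduce_latest(map_cells(cells))

# Hypothetical cells, as if read from several SSTables of one column family.
cells = [
    ("user1", "plan", "basic", 100),
    ("user1", "plan", "premium", 200),
    ("user2", "plan", "basic", 150),
]
snapshot = compact(cells)
```

Here the older "basic" value for user1 is dropped because a newer write to the same row and column exists.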

This was all about the big data related top Netflix open source projects. Let’s discuss build and delivery tools:

Top Netflix Open Source Projects – Build and Delivery Tools

Nebula

Nebula is a collection of Gradle plugins that Netflix has open sourced to share its internal build infrastructure with the public. The nebula-plugins organization was set up to facilitate the generation, governance, and releasing of Gradle plugins. It does this by providing a space to host plugins in SCM, CI, and a repository.

Asgard

Asgard is a web-based tool for managing cloud-based applications and infrastructure. Asgard helps Netflix to build and deploy hundreds of applications and services to the Amazon cloud. Asgard is released under the Apache License, Version 2.0. Please feel free to fork the project and make improvements to it.


Top Netflix Open Source Projects – Common Runtime Services & Libraries

Hystrix

Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems, services, and 3rd party libraries, stop cascading failures, and enable resilience in complex distributed systems where failure is inevitable. In a distributed environment, some of the many service dependencies will inevitably fail. Hystrix helps you control the interactions between these distributed services by adding latency tolerance and fault tolerance logic, and it provides fallback options when a dependency fails, all of which improve your system’s overall resiliency.
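To make the idea concrete, here is a minimal Python sketch of the circuit breaker with fallback pattern that Hystrix is known for. This is not Hystrix's API (Hystrix is a Java library); the CircuitBreaker class, its parameters, and the flaky function are hypothetical names chosen for illustration.

```python
import time

class CircuitBreaker:
    """Illustrative circuit breaker; not part of any Netflix library."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, command, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Circuit is open: skip the remote call entirely.
                return fallback()
            # Timeout elapsed: fall through and allow one trial call.
        try:
            result = command()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        # Success: close the circuit and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result

def flaky():
    """Stand-in for a remote dependency that is currently down."""
    raise ConnectionError("remote service unavailable")

breaker = CircuitBreaker(failure_threshold=2, reset_timeout=60.0)
results = [breaker.call(flaky, lambda: "cached default") for _ in range(3)]
```

After two failures the breaker opens, so the third call returns the fallback without touching the failing dependency. That is how a slow or broken service is kept from dragging down its callers.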

Karyon

Karyon is a framework and library that essentially contains the blueprint of what it means to implement a cloud-ready web service. All the other fine-grained web services and applications that form Netflix’s SOA graph can essentially be thought of as being cloned from this basic blueprint.

Turbine

Turbine is a tool for aggregating streams of Server-Sent Event (SSE) JSON data into a single stream. The targeted use case is metrics streams from instances in an SOA being aggregated for dashboards. Netflix uses Hystrix, which has a real-time dashboard that uses Turbine to aggregate data from hundreds or thousands of machines.
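The aggregation idea can be sketched in a few lines of Python. This is an illustration only, not Turbine's implementation: the field names (name, requestCount, errorCount) are assumptions, and a real aggregator emits a continuous stream rather than the single combined snapshot produced here.

```python
import json
from collections import Counter

def parse_sse(lines):
    """Yield the JSON payload of each 'data:' line in an SSE stream."""
    for line in lines:
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())

def aggregate(streams):
    """Sum numeric fields across instances, grouped by the 'name' field."""
    totals = {}
    for stream in streams:
        for event in parse_sse(stream):
            bucket = totals.setdefault(event["name"], Counter())
            for field, value in event.items():
                # bool is a subclass of int, so exclude it explicitly.
                if isinstance(value, (int, float)) and not isinstance(value, bool):
                    bucket[field] += value
    return {name: dict(counts) for name, counts in totals.items()}

# Hypothetical per-instance streams; blank lines and comments are skipped.
instance_a = ['data: {"name": "GetUser", "requestCount": 10, "errorCount": 1}', ""]
instance_b = ['data: {"name": "GetUser", "requestCount": 5, "errorCount": 0}', ": ping"]
combined = aggregate([instance_a, instance_b])
```

A dashboard then only has to read one combined view instead of polling every machine.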


Top Netflix Open Source Projects – Data Persistence

EVCache

EVCache is a memcached and spymemcached based caching solution that is mainly used for AWS EC2 infrastructure for caching frequently used data.

Dynomite

Inspired by Amazon’s Dynamo whitepaper, Dynomite is a generic Dynamo implementation for different key-value storage engines.

Astyanax

Astyanax is a high-level Java client for Apache Cassandra, a highly available column-oriented database. Astyanax borrows many concepts from Hector but diverges in the connection pool implementation as well as the client API. One of the main design considerations was to provide a clean abstraction between the connection pool and the Cassandra API so that each may be customized and improved separately. Astyanax provides a fluent style API which guides the caller to narrow the query from key to column, and it also provides queries for more complex use cases that Netflix has encountered. The operational benefits of Astyanax over Hector include lower latency, reduced latency variance, and better error handling.

This was all about the data persistence related top Netflix open source projects. Let’s discuss insight, reliability, and performance tools:

Top Netflix Open Source Projects – Insight, Reliability, and Performance

Atlas

Atlas is used for managing dimensional time series data for near real-time operational insight. Atlas features in-memory data storage, allowing it to gather and report very large numbers of metrics, very quickly. It was primarily created to address issues with scale and query capability in the previous system.
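To show what "dimensional" means here, the following Python sketch stores each time series under a set of tags and answers queries by matching a subset of those tags. It is a toy model, not Atlas itself: the InMemoryStore class, the tag names, and query_sum are hypothetical, and Atlas has its own query language.

```python
from collections import defaultdict

class InMemoryStore:
    """Toy in-memory store for tagged (dimensional) time series."""

    def __init__(self):
        # Maps a frozen set of (tag, value) pairs to a list of (timestamp, value) points.
        self._series = defaultdict(list)

    def record(self, tags, timestamp, value):
        self._series[frozenset(tags.items())].append((timestamp, value))

    def query_sum(self, **match):
        """Sum, per timestamp, every series whose tags include all of `match`."""
        wanted = set(match.items())
        totals = defaultdict(float)
        for tags, points in self._series.items():
            if wanted <= tags:  # subset test: all requested tags are present
                for timestamp, value in points:
                    totals[timestamp] += value
        return dict(sorted(totals.items()))

store = InMemoryStore()
store.record({"name": "requests", "app": "api", "zone": "us-east-1a"}, 60, 120.0)
store.record({"name": "requests", "app": "api", "zone": "us-east-1b"}, 60, 80.0)
store.record({"name": "requests", "app": "web", "zone": "us-east-1a"}, 60, 40.0)
api_total = store.query_sum(name="requests", app="api")
```

Because every dimension is just a tag, the same data can be sliced by app, by zone, or by both without defining those views in advance.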

Ice

Ice provides a bird’s-eye view of Netflix’s large and complex cloud landscape from a usage and cost perspective. Cloud resources are dynamically provisioned by dozens of service teams within the organization, and any static snapshot of resource allocation has limited value. The ability to trend usage patterns on a global scale, yet break them down by region, availability zone, or service team, provides incredible flexibility. Ice allows Netflix to quantify its AWS footprint and to make educated decisions regarding reservation purchases and reallocation of resources.


Simian Army

The Simian Army is a suite of tools for keeping your cloud operating in top form. Simian Army consists of services (Monkeys) in the cloud for generating various kinds of failures, detecting abnormal conditions, and testing your ability to survive them. The goal is to keep the cloud safe, secure, and highly available. Currently the simians include Chaos Monkey, Janitor Monkey, and Conformity Monkey.
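The failure-generating idea behind Chaos Monkey can be sketched as follows. This is a simulation under stated assumptions, not the Simian Army code: unleash_chaos, the group names, and the instance IDs are hypothetical, and the terminate callback only records the victim instead of calling any cloud API.

```python
import random

def unleash_chaos(groups, probability, terminate, rng=None):
    """For each group, maybe pick one random instance and terminate it."""
    rng = rng or random.Random()
    victims = []
    for group, instances in groups.items():
        # Skip empty groups; otherwise roll the dice for this group.
        if instances and rng.random() < probability:
            victim = rng.choice(instances)
            terminate(victim)
            victims.append((group, victim))
    return victims

# Hypothetical instance groups; "batch" has nothing to terminate.
groups = {"api": ["i-a1", "i-a2"], "web": ["i-w1"], "batch": []}
terminated = []
victims = unleash_chaos(
    groups,
    probability=1.0,               # always strike, for demonstration
    terminate=terminated.append,   # simulated termination
    rng=random.Random(42),
)
```

Running something like this regularly in production forces every service to prove that it survives the loss of an instance.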

After the insight, reliability, and performance projects, let’s talk about top Netflix open source projects related to security:

Top Netflix Open Source Projects – Security

Security Monkey

Security Monkey monitors policy changes and alerts on insecure configurations in an AWS account. It also proves to be a useful tool for tracking down potential problems as it is essentially a change tracking system.
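Change tracking plus alerting can be illustrated with a small Python sketch. It does not use Security Monkey's code or data model: diff_config, the snapshot layout, and the rule that a 0.0.0.0/0 CIDR is insecure are assumptions made for the example.

```python
def diff_config(old, new):
    """Return {item: (before, after)} for every item that changed."""
    changes = {}
    for key in sorted(old.keys() | new.keys()):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes

# Hypothetical snapshots of security group settings taken at two points in time.
previous = {"sg-web": {"port": 443, "cidr": "10.0.0.0/8"}}
current = {
    "sg-web": {"port": 443, "cidr": "10.0.0.0/8"},
    "sg-db": {"port": 5432, "cidr": "0.0.0.0/0"},
}

changes = diff_config(previous, current)

# Alert on changed items that are now open to the whole internet.
alerts = [
    name
    for name, (_, after) in changes.items()
    if after is not None and after["cidr"] == "0.0.0.0/0"
]
```

The diff doubles as an audit trail: even when nothing is flagged, it records what changed between two snapshots.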

Scumblr

Scumblr is a web application that allows performing periodic searches and storing or taking actions on the identified results. Scumblr searches utilize plugins called Search Providers. Each Search Provider knows how to perform a search via a certain site or API (Google, Bing, eBay, Pastebin, Twitter, etc.). Searches can be configured from within Scumblr based on the options made available by the Search Provider.
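The plugin idea can be sketched like this in Python. It is not Scumblr's actual plugin interface: SearchProvider, StaticProvider, and run_searches are hypothetical names, and the stand-in provider searches a fixed dictionary rather than a real site.

```python
from abc import ABC, abstractmethod

class SearchProvider(ABC):
    """Hypothetical plugin interface: one subclass per site or API."""
    name = "base"

    @abstractmethod
    def search(self, query, **options):
        """Return a list of result URLs for `query`."""

class StaticProvider(SearchProvider):
    """Stand-in provider that searches a fixed set of pages."""
    name = "static"

    def __init__(self, pages):
        self.pages = pages  # maps URL to page text

    def search(self, query, **options):
        limit = options.get("limit", 10)
        hits = [url for url, text in self.pages.items() if query.lower() in text.lower()]
        return hits[:limit]

def run_searches(providers, queries, seen):
    """Run every query through every provider and keep only unseen results."""
    new_results = []
    for provider in providers:
        for query in queries:
            for url in provider.search(query):
                if url not in seen:
                    seen.add(url)
                    new_results.append((provider.name, query, url))
    return new_results

pages = {
    "https://example.com/a": "leaked API key",
    "https://example.com/b": "cat pictures",
}
seen = set()
first = run_searches([StaticProvider(pages)], ["api key"], seen)
second = run_searches([StaticProvider(pages)], ["api key"], seen)
```

The second run returns nothing new because the result was already stored, which is what makes periodic searching practical.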

Message Security Layer

Message Security Layer (MSL) is an extensible and flexible secure messaging framework that can be used to transport data between two or more communicating entities. Data may also be associated with specific users, and treated as confidential or non-replayable if so desired.

The above list of top Netflix open source projects was prepared with inputs from Netflix OSS.

Use our comment section below to share your views. Let us know if we missed any popular open source project from Netflix in our list.

Check out our other articles on open source projects here.