Availability


The Netflix OSS "stack"

posted Mar 30, 2015, 11:48 AM by Chris G


The Netflix Open Source Software (OSS) suite of products: http://netflix.github.io/#repo

Karyon: The nucleus of a Composable Web Service. More:

http://techblog.netflix.com/2013/03/karyon-nucleus-of-composable-web-service.html


Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd party libraries, stop cascading failure and enable resilience in complex distributed systems where failure is inevitable.

https://github.com/Netflix/Hystrix

http://techblog.netflix.com/2012/02/fault-tolerance-in-high-volume.html
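
The idea of stopping cascading failure can be illustrated with a small circuit breaker. This is only a sketch of the general pattern, not the Hystrix API: the class name, the thresholds and the injectable clock are all assumptions made for illustration.

```python
import time


class CircuitBreaker:
    """Hypothetical minimal circuit breaker; NOT the Hystrix API."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        # While open, skip the remote call and answer with the fallback.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                return fallback()
            # Timeout expired: let one trial call through (half-open).
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        # A success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote call, for example `breaker.call(fetch_profile, lambda: DEFAULT_PROFILE)`, where both names are placeholders. Once the dependency has failed often enough, the caller gets the fallback immediately instead of waiting on a service that is already struggling.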



ZooKeeper: ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.

http://www-01.ibm.com/software/data/infosphere/hadoop/zookeeper/

https://www.hakkalabs.co/articles/apache-zookeeper-introduction/

https://www.usenix.org/legacy/event/usenix10/tech/full_papers/Hunt.pdf



Curator - ZooKeeper client wrapper and rich ZooKeeper framework

http://netflix.github.com/curator


Archaius - Library for configuration management API

https://github.com/Netflix/archaius

http://techblog.netflix.com/2012/06/annoucing-archaius-dynamic-properties.html

http://jlordiales.me/2014/10/07/configuration-management-with-archaius-from-netflix/



Ribbon is an Inter-Process Communication (remote procedure call) library with built-in software load balancers. The primary usage model involves REST calls with support for various serialization schemes.

https://github.com/Netflix/ribbon
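
What a client-side ("software") load balancer does can be sketched in a few lines. The class below is a hypothetical round-robin balancer, not the Ribbon API; the server addresses and method names are made up for illustration.

```python
import itertools


class RoundRobinBalancer:
    """Hypothetical client-side balancer; NOT the Ribbon API."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self.servers = list(servers)
        self._cycle = itertools.cycle(self.servers)
        self.down = set()  # servers currently considered unhealthy

    def mark_down(self, server):
        self.down.add(server)

    def mark_up(self, server):
        self.down.discard(server)

    def choose(self):
        # Try each server at most once per call, skipping ones marked down.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server not in self.down:
                return server
        raise RuntimeError("no healthy servers available")
```

The point of the design is that the caller picks the target itself, so no separate load balancer sits in the request path.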


Eureka - Service registry for resilient mid-tier load balancing and failover.

https://github.com/Netflix/eureka
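
A service registry boils down to instances renewing a lease and clients asking who is currently alive. The sketch below is a hypothetical in-memory model, not the Eureka API; the 90-second lease is an assumed value chosen for illustration.

```python
import time


class ServiceRegistry:
    """Hypothetical in-memory registry; NOT the Eureka API."""

    def __init__(self, lease_seconds=90.0, clock=time.monotonic):
        self.lease_seconds = lease_seconds  # assumed lease duration
        self.clock = clock                  # injectable for testing
        self._last_seen = {}  # (service name, address) -> last heartbeat time

    def heartbeat(self, service, address):
        # Registering and renewing are the same operation in this sketch.
        self._last_seen[(service, address)] = self.clock()

    def instances(self, service):
        # Return only the instances whose lease has not expired.
        now = self.clock()
        return sorted(
            addr
            for (name, addr), seen in self._last_seen.items()
            if name == service and now - seen <= self.lease_seconds
        )
```

An instance that stops sending heartbeats silently drops out of the answer, which is what lets callers fail over without manual intervention.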


Zuul is an edge service that provides dynamic routing, monitoring, resiliency, security, and more:

https://github.com/Netflix/zuul


Asgard: Web-based Cloud Management and Deployment:

http://techblog.netflix.com/2012/06/asgard-web-based-cloud-management-and.html

https://github.com/Netflix/asgard


Cloud Services Powered by IBM SoftLayer and NetflixOSS:

http://www.slideshare.net/aspyker/cloud-servicespoweredbyib-mandnetflixoss


Cloud Services Fabric (and NetflixOSS) on Docker – Demo at IBM Impact 2014:

http://ispyker.blogspot.com/2014/05/cloud-services-fabric-and-netflixoss-on.html


A sample reference architecture application demonstrating the use of many Netflix Open Source projects:

https://github.com/cfregly/fluxcapacitor


http://nirmata.com/2014/08/getting-started-with-microservices-using-netflix-oss-docker/

Micro Services Architecture versus SOA

posted Mar 30, 2015, 11:33 AM by Chris G   [ updated Mar 30, 2015, 11:34 AM ]

 
  • MSA = micro-service architecture (MS = micro-service)
  • SOA = service-oriented architecture
  • EDA = event-driven architecture

    Micro-service architecture and the human body

    Let’s assume I am being chased by a lion. If he scratches my leg, will that result in sudden death, i.e. a collapse of my whole body?

    No, it will not. It might hurt big-time, and I might need some help in the long term, but for the first few seconds, minutes or maybe even hours, my brain will probably ignore the signals sent by the nerve receptors, and I should probably try to limit the blood loss.

    My adrenaline levels will rise, I will ignore the pain signals, and I might be able to run faster and longer than I ever imagined.

    The information highway

    Think about the way our body communicates with our brain: the brain has two main paths for communicating with the body: the nerves and the blood vessels. They both have a distinct function:

    • Nerves control muscles and report back status using nerve receptors.
    • Blood vessels report back information to your brain using hormones generated by your organs; for example the leptin hormone reports satiety to the brain. They are also used to transport essential nutrients to your cells.

    This is a gross simplification, but is in essence how your body works.

    Some remarks

    • The brain responds to both hormones and nerve receptors, with a distinction: nerves are about the state of an organ or muscle (e.g. lift that arm), while hormones are more about meta-information: how can I survive/multiply/…
    • An organ might fail, but that does not imply that you die. The body can adapt itself in numerous ways by regulating hormones first, and finding new ways after that.
    • The brain is a mystery; we know some things about it, but we don’t know everything. If we had unlimited bodies, computing power and engineering skills, we could probably attach all nerves and hormone detectors to a machine, put them all in a room, and use survival of the fittest to figure out the best path to enlightenment. The problem is in defining what “survival” actually means, i.e. live the longest in the room, grow the oldest, have the fullest life, …
    • Some organs are essential (e.g. your heart).
    • The body is adaptive, e.g. nerve receptors in the brain have been retrained to allow blind people to “see” (= a perception of awareness).

    This brings us to the MSA

    Does your brain know how your nerves and blood vessels are organised? I don’t think so; what matters to the brain is cause and consequence. MSAs work just like that:

    • Your nerves report the state of your system: do all my organs function, or is one missing?
      • does my service still run
      • when was the last time it emitted an event (i.e. is it actually used)
      • how many events were emitted in a time-frame
      • does the system reply to a heart-beat
    • While your nerves are important, the value is in the well-being (i.e. the hormones flowing through your blood stream)
      • do we have visitors on my web site
      • are they ordering stuff
      • do we get payments
      • but also other, very short-term goals
        • do people buy our new product X
        • does offering the 2nd item for free help
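
The "nerve" signals listed above (is the service running, when did it last emit an event, how many events in a time-frame, does it answer a heart-beat) can be tracked with very little code. This is a minimal sketch; the class, its method names and the 10-second timeout are assumptions for illustration, and old timestamps are never pruned here.

```python
import time
from collections import deque


class ServiceVitals:
    """Hypothetical tracker for the "nerve" signals of one service."""

    def __init__(self, heartbeat_timeout=10.0, clock=time.monotonic):
        self.heartbeat_timeout = heartbeat_timeout  # assumed value
        self.clock = clock                          # injectable for testing
        self.last_heartbeat = None
        self.events = deque()  # timestamps of emitted events

    def record_heartbeat(self):
        self.last_heartbeat = self.clock()

    def record_event(self):
        self.events.append(self.clock())

    def is_alive(self):
        # Does the service still answer its heart-beat?
        if self.last_heartbeat is None:
            return False
        return self.clock() - self.last_heartbeat <= self.heartbeat_timeout

    def last_event_age(self):
        # Seconds since the last emitted event, or None if there was none.
        return self.clock() - self.events[-1] if self.events else None

    def events_in_window(self, seconds):
        # How many events were emitted in the last `seconds` seconds?
        cutoff = self.clock() - seconds
        return sum(1 for stamp in self.events if stamp >= cutoff)
```

The business metrics (the "hormones") would be collected separately; this sketch only covers the state signals you fall back on when those metrics show a problem.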

    The gist of it

    Here’s the difference between SOA and MSA: as long as the “hormones” (i.e. the signals used to generate business metrics) do not signal a problem, you don’t care about the organs (i.e. individual services). For example, if you don’t know what a service is doing, just shut it down. If the business metrics don’t show it, it was useless. If people start making a fuss about it, add the proper business metric, and reboot the service.

    Have an extra feature or a customer-specific thing? Just let both services run, and let the one processing those responses decide which answer to take.

    Does a service fail? As long as your business metrics don’t show it, you don’t have a problem… In fact, because your business rules might change every once in a while, services will become redundant.

    Embrace failure and shut down services at random during office hours. If your business metrics show failure, make sure that there are always at least 2 instances of that service running, so the next time one of them fails, you will no longer have that issue… Or just write another service (after all, it’s just a couple of hundred lines) that does what it is supposed to do, and maybe even way better than before.
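
Shutting down a random service and keeping at least two running can be sketched as a single function. The function name, the instance names and the return convention are assumptions for illustration; actually stopping and starting services is left to the caller.

```python
import random


def chaos_round(running, minimum=2, rng=random):
    """Stop one random instance, then report how many replacements to start.

    `running` is a list of instance names and is modified in place.
    Returns (stopped instance or None, number of instances to launch).
    """
    victim = None
    if running:
        victim = rng.choice(running)
        running.remove(victim)
    # Keep at least `minimum` instances so the next failure is a non-event.
    to_launch = max(0, minimum - len(running))
    return victim, to_launch
```

Running this on a schedule during office hours means the people who can fix a problem are at their desks when it shows up in the business metrics.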

    If a system fails, you go back to the nerves to check what might be the cause, or just decide to fix it by launching another service. Experience will tell you.

    Where all of this is coming from

    As Nassim Taleb explains in Antifragile: Things That Gain from Disorder, the risk of a lot of small things all failing simultaneously is way smaller than the risk of one big thing failing at once. To get good at handling failure, it is better to fail all the time. This will allow you to have confidence in the system as a whole, while individual components might fail.
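
A bit of arithmetic makes the claim concrete. The sketch below assumes the small services fail independently and each with the same probability; both are simplifying assumptions of this example, not something taken from the book.

```python
def all_fail_probability(p, n):
    """Probability that n independent components, each failing with
    probability p, are all down at the same time."""
    return p ** n


def any_fail_probability(p, n):
    """Probability that at least one of the n independent components is down."""
    return 1 - (1 - p) ** n
```

With illustrative numbers, five services that are each down 1% of the time are all down together with probability 0.01^5 = 1e-10, while a single big system with the same 1% failure rate is completely down 1% of the time. The flip side is that at least one of the five small services is down about 4.9% of the time, which is why small failures have to be routine.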

    The essence of MSA is monitoring the “what”, while EDA and SOA typically monitor the “how”. You don’t care about how the system works, just that it works… If it fails, you fix it by adding another service or adjusting the existing service. By making failure a non-event, you become essentially resistant to it.

    CentOS Encrypted Incremental Backup for AWS

    posted Nov 14, 2014, 5:51 PM by Chris G

    There are many documented solutions available on the internet for achieving full and incremental backups of CentOS servers to AWS S3 or even to AWS Glacier. Most of them rely on a tool called Duplicity, which is written in Python and uses the boto library.

    Backups can also be encrypted for added security of your data. Check it out!

    Tutorial - High Availability for AWS

    posted Sep 10, 2014, 12:26 PM by Chris G


    High Availability for Amazon VPC NAT Instances: An Example


    This article provides all the necessary resources, including an easy-to-use script, and instructions on how you can leverage bidirectional monitoring between two NAT instances to implement a high availability (HA) failover solution for network address translation (NAT).
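
The bidirectional monitoring idea is that each NAT instance probes the other and takes over when the peer stops answering. The sketch below shows only that decision logic; it is not the article's script, and the three callables (`ping_peer`, `take_over_routes`, `recover_peer`) are hypothetical hooks that a real setup would implement against the AWS API.

```python
def monitor_peer(ping_peer, take_over_routes, recover_peer, attempts=3):
    """Run one monitoring pass against the peer NAT instance.

    All three callables are hypothetical hooks supplied by the caller.
    Returns True if this instance took over the peer's routes.
    """
    for _ in range(attempts):
        if ping_peer():
            return False  # peer answered; nothing to do
    # Peer missed every probe: route its traffic through this instance,
    # then try to bring the peer back.
    take_over_routes()
    recover_peer()
    return True
```

Both instances run the same loop against each other, which is what makes the monitoring bidirectional.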


    Create an OpenBSD 5.5 HAproxy HTTPS TCP proxy

    posted Sep 5, 2014, 12:32 PM by Chris G

    Assuming you already have a working OpenBSD 5.5 system, this task is very straightforward.

    First, install HAproxy from your favorite repository:
    pkg_add http://mirror.internode.on.net/pub/OpenBSD/5.5/packages/amd64/haproxy-1.4.24.tgz
    
    pkg_add http://mirror.internode.on.net/pub/OpenBSD/5.5/packages/amd64/nano-2.2.6.tgz
    You can then edit the HAproxy configuration file:

    nano /etc/haproxy/haproxy.cfg
    

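    frontend  TEST-http-in
        # Plain HTTP listener: redirect requests for "/" to the HTTPS application URL.
        # These rules need HTTP mode (set here or in the defaults section).
        bind *:80
        acl root_url url /
        redirect code 301 location https://correct.domain.com/context/ drop-query append-slash if root_url

    frontend https-c-in
        # TLS is passed through untouched in TCP mode, so HAProxy cannot see the URL here.
        # The "acl ... url" and "redirect" rules only work in HTTP mode and are left out.
        bind *:443
        mode tcp
        default_backend app_servers

    backend app_servers
        # Keep each client IP on the same server; probe with an SSL hello every 2000 ms
        # (marked up after 2 successful checks, down after 5 failed ones).
        balance source
        mode tcp
        option ssl-hello-chk
        server  app 10.10.10.10:8443 check inter 2000 rise 2 fall 5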
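
The `rise 2 fall 5` part of the server line means a server has to pass 2 consecutive checks before it counts as up, and fail 5 consecutive checks before it counts as down. The class below is a simplified model of that counting, written for illustration only; it is not HAProxy's actual implementation.

```python
class HealthState:
    """Hypothetical model of rise/fall health-check counting for one server."""

    def __init__(self, rise=2, fall=5, up=True):
        self.rise = rise  # consecutive successes needed to go up
        self.fall = fall  # consecutive failures needed to go down
        self.up = up
        self.streak = 0   # consecutive results contradicting the current state

    def record(self, check_passed):
        """Feed one check result; return whether the server is considered up."""
        if check_passed == self.up:
            self.streak = 0  # result agrees with the current state
            return self.up
        self.streak += 1
        needed = self.fall if self.up else self.rise
        if self.streak >= needed:
            self.up = not self.up
            self.streak = 0
        return self.up
```

With checks every 2000 ms (`inter 2000`), a server in this model is taken out of rotation about 10 seconds after it stops answering, and a single lost check does not cause a flap.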


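    As a last step, make sure that the service starts when the server is rebooted. OpenBSD recommends keeping local overrides in /etc/rc.conf.local rather than editing /etc/rc.conf itself:
    nano /etc/rc.conf.local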
    

    # rc.d(8) packages scripts

    # started in the specified order and stopped in reverse order

    pkg_scripts="haproxy"

    Nutanix "Cluster in a box" solution and more!

    posted Feb 19, 2014, 5:28 PM by Chris G

    How Nutanix Works

    Watch this animation and learn how the Nutanix Virtual Computing Platform works.


    The Nutanix Solution

    The Nutanix Virtual Computing Platform is a converged infrastructure solution that consolidates the compute (server) tier and the storage tier into a single, integrated appliance.

    Nutanix uses the same design principles and technologies that power IT innovators such as Google, Facebook, and Amazon. It tailors these for mainstream enterprises and government agencies.

    The Nutanix solution is radically simple compared to traditional datacenter infrastructures.

    • Rapid time to value: deployment in under 30 minutes
    • No disruption to ongoing operations
    • Easily scales
    • Powerful off-the-shelf, non-proprietary hardware
    • Reduces the cost and complexity of storage
    • Works with legacy components, protecting investments you’ve already made
    • Delivers advanced, enterprise-class storage capabilities

    http://www.nutanix.com/how-nutanix-works/


    vSphere Storage Appliance

    posted Sep 20, 2013, 8:18 PM by Chris G

    http://www.vmware.com/products/vsphere-storage-appliance/



    VMware vSphere® Storage Appliance™ is a software-based shared storage solution that enables high availability and automation in vSphere without shared storage hardware.


    Oracle High Availability

    posted Aug 19, 2013, 2:04 PM by Chris G

    This excellent document by Oracle defines some of the High Availability concepts and terminology.
