NginX and Riak

Storing and delivering static content is quite a relevant problem nowadays. Lots of people need big, reliable storage for static images and many other static files, and a way to deliver them to end users. The most popular solution is still NFS-mounted storage that is accessible from all front-ends, but this solution has big bottlenecks:

  1. Hard to backup.
  2. Everything relies on NAS.
  3. Statically mounted external storage is needed.

Now let's dig deeper:

Hard to back up: Some of you will say that this is not so! But let's imagine that you have 10TB of small images which your application uses regularly, and these images are very critical. A standard rsync and/or tar run could take lots of time and system resources, which is definitely not what we want.

Everything relies on NAS: So what? We can buy a reliable NAS/SAN with cool RAID (1-10) storage and use it. But if we take a closer look, we will see that to get 10TB of space with, for example, 10x1TB 15k RPM SAS drives, we will need at least 11 drives plus a RAID controller. Everything is good so far, but wait, what is the price for that? After digging through internet shops and price lists you will see that this is quite expensive, especially if your data is very critical and you need a hot backup, aka a second NAS/SAN. Another bottleneck is that with this solution you can only scale vertically, which is expensive and hard to achieve. And last but not least, you will have to share the same IO device between all clients. This is truly a problem for large-scale deployments.

Statically mounted external storage is needed: This means that your whole system will rely on an externally mounted device, and regardless of how reliable it is, it is some kind of SPOF (single point of failure).

Combining all this shows that a classical shared storage architecture is hard to implement, expensive and slow for large deployments. This may not be a big deal if you are the IT department of a bank and your management has lots of money and very little “imagination”. In that case this article is not for you.

So for everyone else: 

Let's summarize what we need:

  1. Reliable storage.
  2. Low latency to access file.
  3. Easy management and backup.
  4. Reliability and fault tolerance.
  5. Easy access and less programming overhead.

After spending lots of time looking for a solution to the problems mentioned above, we found what seems to be an ideal one:

  1. Riak (will act as storage and deliver files). For our needs the free community edition is much more than enough.
  2. NginX (will act as reverse proxy and URL filter).

Before starting let’s summarize what these two tools will give us:

Riak: A wonderful, fully clustered NoSQL server written in Erlang. It works asynchronously, has great performance and is easy to access via REST, protobuf and lots of other interfaces. It also has a built-in realtime search index and a MapReduce implementation. But for now we will use only a small part of Riak, namely storage for static files. In this scenario we should look at several benefits over the shared storage solution:

  1. Low latency to access files. (Riak's Bitcask backend needs a single disk seek to retrieve any value.)
  2. Horizontally scalable. (Just add more and more cheap servers to the cluster.)
  3. Much more throughput. (For example, 10 servers with one 1 Gbit port each will have 10 Gbit in total, minus about 10% for internal utilization.)
  4. No need for expensive RAIDs, SANs etc.
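
The throughput figure in point 3 is simple arithmetic. Here is a small sketch of it, assuming the numbers from the example (10 nodes, 1 Gbit/s per node, roughly 10% spent on intra-cluster traffic); the function name is made up for illustration:

```shell
# Rough aggregate-throughput estimate for a cluster.
# cluster_throughput_mbit is a hypothetical helper, not part of Riak.
cluster_throughput_mbit() {
    nodes=$1          # number of Riak nodes
    port_mbit=$2      # per-node link speed in Mbit/s
    overhead_pct=$3   # share of bandwidth used for internal cluster traffic
    echo $(( nodes * port_mbit * (100 - overhead_pct) / 100 ))
}

cluster_throughput_mbit 10 1000 10   # prints 9000 (Mbit/s, i.e. about 9 Gbit/s)
```

Real numbers will depend on your network and on how much traffic the cluster generates between nodes, so treat this only as a back-of-the-envelope estimate.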

So let's start my favorite part: installation and configuration of the tools mentioned above. As I'm a Debian fan, I will do this on the current stable release, Debian 6.0 Squeeze.

First you need to download and install Riak. At the time of writing this article, the version used below was the latest one, but before just copy-pasting, check for the latest version here: http://basho.com/resources/downloads/.

Download and Install Riak:

# cd /usr/local/src
# wget http://s3.amazonaws.com/downloads.basho.com/riak/CURRENT/debian/6/riak_1.2.1-1_amd64.deb
# dpkg -i riak_1.2.1-1_amd64.deb

Done! Riak is installed. Do not start it for now. Just in case, make sure it is stopped:

# /etc/init.d/riak stop

Now we need to cluster it and make some configuration changes. By default Riak binds to 127.0.0.1, which is not a good idea for clusters, so change it to the internal IP address of the server. Do not bind Riak to the server's public IP if one exists.

Edit /etc/riak/app.config and change:

{pb_ip,   "192.168.235.111" },

and

{http, [ {"192.168.235.111", 8098 } ]},

Also make sure you have configured the /etc/hosts file and the system hostname. A correct /etc/hosts should look something like this:

127.0.0.1 localhost
192.168.235.111 riak1.your-domain.com riak1

Also, if you do not have your own internal DNS, you will have to add the other nodes to /etc/hosts as well, but it is better to have DNS.

192.168.235.112 riak2.your-domain.com riak2
192.168.235.113 riak3.your-domain.com riak3
192.168.235.11N riakN.your-domain.com riakN
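
If you have more than a few nodes, you can generate these lines instead of typing them. This is only a sketch: it assumes the addressing scheme used in this article (node riakK has IP 192.168.235.(110+K)), and riak_hosts is a made-up helper name.

```shell
# Print /etc/hosts lines for nodes riak1..riakN.
# Assumption: riakK lives at 192.168.235.(110+K), as in the examples above.
riak_hosts() {
    count=$1                        # number of nodes
    domain=${2:-your-domain.com}    # placeholder domain from this article
    i=1
    while [ "$i" -le "$count" ]; do
        printf '192.168.235.%d riak%d.%s riak%d\n' "$((110 + i))" "$i" "$domain" "$i"
        i=$((i + 1))
    done
}

riak_hosts 3
```

Review the output first, then append it with something like riak_hosts 3 >> /etc/hosts on each node.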

Also make some changes to the storage configuration.

Format and mount your big disk at /var/lib/riak:

# mkfs.xfs /dev/sdb1
# mount /dev/sdb1 /mnt
# mv /var/lib/riak/* /mnt/
# umount /mnt
# mount /dev/sdb1 /var/lib/riak
# chown -R riak:riak /var/lib/riak

Or better, just create another mount point and reconfigure Riak to use it:

# mount /dev/sdb1 /opt
# mkdir /opt/riak
# chown riak:riak /opt/riak
# mv /var/lib/riak/* /opt/riak

Change paths in /etc/riak/app.config:

{riak_core, [
{ring_state_dir, "/opt/riak/ring"},
...--------...
{bitcask, [{data_root, "/opt/riak/bitcask"} ]},
{eleveldb, [{data_root, "/opt/riak/leveldb"}]},

It would also be nice to enable the Riak console (Riak Control) to get a nice web UI.

It is installed by default, so all you need to do is change the userlist from {userlist, [{"user", "pass"}]} to actual values and make sure {enabled, true} and {admin, true} are set in the riak_control section.

It is also highly recommended to enable HTTPS and use a secure link to administer Riak. For that you will need to enable HTTPS in the riak_core section and add the certificate and private key files:

{https, [ {"192.168.235.111", 8069 } ]},
    {ssl, [
        {certfile, "/etc/riak/ssl/riak.crt"},
        {keyfile, "/etc/riak/ssl/riak.pem"}
]},

That’s all! Now restart Riak and log in to Riak admin via https://192.168.235.111:8069/admin.

Also edit /etc/riak/vm.args and change

-name riak@127.0.0.1
to
-name riak@riak1.your-domain.com

Now restart Riak:

# /etc/init.d/riak restart
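
If you are going to repeat this on several nodes, the vm.args edit can be scripted. Below is a minimal sketch that assumes the file contains a line starting with "-name " as shown above; set_riak_node_name is a hypothetical helper, not something shipped with Riak. It keeps a .bak copy of the original file.

```shell
# Rewrite the "-name" line of a vm.args file to use the node's FQDN.
# set_riak_node_name is a made-up helper for illustration.
set_riak_node_name() {
    vm_args=$1     # path to vm.args, e.g. /etc/riak/vm.args
    fqdn=$2        # e.g. riak1.your-domain.com
    # -i.bak edits in place and leaves the original as <file>.bak
    sed -i.bak "s/^-name .*/-name riak@${fqdn}/" "$vm_args"
}

# Example (as root, before restarting Riak):
# set_riak_node_name /etc/riak/vm.args riak1.your-domain.com
```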

Now we need more nodes for redundancy. Let's imagine that we have a 3-node cluster for now.

So repeat all the steps above on riak2.your-domain.com and riak3.your-domain.com, using each node's own IP address and hostname.

Now you have 3 separate nodes and need to join them all into a single cluster. The Riak documentation has a very nice guide for this.

In short, after making the appropriate configs, you need to do the following.

On nodes 2 and 3:

riak-admin cluster join riak@riak1.your-domain.com

And only on the riak1 node:

riak-admin cluster plan
riak-admin cluster commit

Now you have a fully clustered and working Riak installation.

To test it, do the following:

curl -v -X PUT http://192.168.235.111:8098/riak/images/foo.jpg -H "Content-Type: image/jpeg" --data-binary @./foo.jpg

A full reference for talking to Riak with curl can be found in the Riak HTTP API documentation.
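
One file is fine for a test, but you will probably want to load a whole directory. Here is a rough sketch built on the same PUT request; upload_images is a made-up helper, it only handles *.jpg files, and it uses the file name as the Riak key.

```shell
# Upload every .jpg in a directory to a Riak bucket over HTTP.
# upload_images is a hypothetical helper; the /riak/<bucket>/<key> URL layout
# is the one used in the curl example above.
upload_images() {
    dir=$1          # local directory with images
    base_url=$2     # e.g. http://192.168.235.111:8098/riak/images
    for f in "$dir"/*.jpg; do
        [ -e "$f" ] || continue    # the glob matched nothing
        curl -fsS -X PUT "$base_url/$(basename "$f")" \
             -H "Content-Type: image/jpeg" \
             --data-binary "@$f" || return 1
    done
}

# Example:
# upload_images ./photos http://192.168.235.111:8098/riak/images
```

The function stops at the first failed upload, so a non-zero exit status means some files may still be missing from the bucket.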

Now open http://192.168.235.111:8098/riak/images/foo.jpg with your favorite browser. Just for fun, open it on all 3 nodes and see the same picture.

Great, we are done with the Riak installation. Now let's install and configure the NginX front-end.
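
If you prefer the command line to a browser, the same check can be scripted. The sketch below fetches one key from every node and compares checksums; check_replicas is a made-up helper, and port 8098 is the HTTP port configured earlier in this article.

```shell
# Fetch the same object from each node and compare checksums.
# check_replicas is a hypothetical helper for illustration.
check_replicas() {
    path=$1; shift              # e.g. /riak/images/foo.jpg, then the node IPs
    tmp=$(mktemp) || return 1
    first=""
    status=0
    for node in "$@"; do
        if ! curl -fsS -o "$tmp" "http://$node:8098$path"; then
            echo "$node: fetch failed"
            status=1
            continue
        fi
        sum=$(cksum < "$tmp")   # CRC and size of the downloaded body
        echo "$node: $sum"
        if [ -z "$first" ]; then first=$sum; fi
        if [ "$sum" != "$first" ]; then status=1; fi
    done
    rm -f "$tmp"
    return "$status"
}

# Example:
# check_replicas /riak/images/foo.jpg 192.168.235.111 192.168.235.112 192.168.235.113
```

A zero exit status means every node returned the object and all copies were identical.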

apt-get install nginx

Now edit /etc/nginx/sites-enabled/default and replace its content with this:

upstream riak {
    server 192.168.235.111:8098 fail_timeout=30s;
    server 192.168.235.112:8098 fail_timeout=30s;
    server 192.168.235.113:8098 fail_timeout=30s;
}

server {
        listen 80;
        server_name  your.public.domain;

        if ( $uri !~ \. ) { return 403; }       # Require URI with file extension 
        if ( $uri !~ ^/.*/.* ) { return 403; }  # Disable access to Riak / 
        if ( $uri ~ ^/.*/.*/.* ) { return 403;} # Disable Link walk MR etc  

        location / {
                if ($request_method = GET) {
                        proxy_pass http://riak;
                        rewrite ^/(.*) /riak/$1 break;  # Prepend /riak internally so the external URL does not contain it (hide Riak)
                }

                proxy_redirect          off;
                proxy_next_upstream     error timeout invalid_header http_500;
                proxy_connect_timeout   2;
                proxy_set_header        Host            $host;
                proxy_set_header        X-Real-IP       $remote_addr;
                proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header        Referer         ""; # Zero out the referer or Riak will 403 all requests
                proxy_hide_header       X-Riak-Vclock;      # Remove Riak-specific headers
                proxy_hide_header       Link;               # Remove Riak-specific headers
                proxy_hide_header       ETag;               # Remove another Riak header
                proxy_hide_header       Server;
        }
}

After all this is done, you can get your image via http://your.public.domain/images/foo.jpg without any Riak-specific tags and links.
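
To get a feeling for what the three "if" checks in the server block let through, you can replay them outside of NginX. This is only an approximation: NginX evaluates the patterns with PCRE against $uri, while the sketch uses grep -E, which behaves the same for these simple expressions. uri_allowed is a made-up helper name.

```shell
# Mimic the three URI checks from the server block above.
# uri_allowed is a hypothetical helper; it returns 0 if the request would be
# proxied and 1 if NginX would answer 403.
uri_allowed() {
    uri=$1
    printf '%s\n' "$uri" | grep -Eq '\.'         || return 1   # needs a file extension
    printf '%s\n' "$uri" | grep -Eq '^/.*/.*'    || return 1   # needs /bucket/key, not the Riak root
    printf '%s\n' "$uri" | grep -Eq '^/.*/.*/.*' && return 1   # no third segment (link walking, MapReduce)
    return 0
}

for u in /images/foo.jpg /foo.jpg /images/ /images/foo.jpg/_,_,1; do
    if uri_allowed "$u"; then echo "$u -> proxied"; else echo "$u -> 403"; fi
done
```

Only /images/foo.jpg passes all three checks; the other three example paths are rejected.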

Original article: http://netangels.net/knowledge-base/nginx-and-riak/#.Uo3xN9JkP-Z

From: http://www.ttlsa.com/database/nginx-and-riak/