Questions tagged [high-availability]

High availability is an architectural consideration, often involving degrees of redundancy, to ensure availability in case of system or component failure.

80 votes
11 answers
27k views

Multiple data centers and HTTP traffic: DNS Round Robin is the ONLY way to assure instant fail-over?

Multiple A records pointing to the same domain seem to be used almost exclusively to implement DNS Round Robin as a cheap load balancing technique. The usual warning against DNS RR is that it is not ...
Valentino Miazzo's user avatar
38 votes
6 answers
10k views

Windows 2008 ignores Gratuitous ARP requests

We recently saw an issue after a failover of our router where our Windows 2008 boxes didn't start talking to the primary router after fail-back. When we did some digging they still had the ARP ...
Zypher's user avatar
  • 37.6k
27 votes
9 answers
43k views

Alternatives to Heartbeat, Pacemaker and CoroSync?

Are there any major alternatives for automatic failover on Linux besides the typical Heartbeat/Pacemaker/CoroSync combinations? In particular, I'm setting up failover on EC2 instances, which only ...
organicveggie's user avatar
23 votes
3 answers
14k views

What is the difference between Anycast and GeoDNS / GeoIP wrt HA?

Based on the Wikipedia description of Anycast, it includes both the distribution of a domain-name-to-many-IP-mapping across many DNS servers as well as replying to clients with the most geographically ...
Riyad Kalla's user avatar
22 votes
6 answers
21k views

DNS Round Robin: Do browsers stick to one IP as long as it is online?

How do most browsers behave if they get multiple A records from the DNS server? Do they stick to one IP as long as it is reachable (and only use another if the IP is down)? Or do they switch all the ...
HiPerFreak's user avatar
22 votes
8 answers
33k views

Avoiding DNS timeouts when a DNS server fails

We have a small datacenter with about a hundred hosts pointing to 3 internal DNS servers (bind 9). Our problem comes when one of the internal DNS servers becomes unavailable. At that point all the ...
Neil Katin's user avatar
21 votes
2 answers
34k views

What is the difference between keepalive and heartbeat?

I want to build a highly available server cluster. I would like to know the details of keepalive and heartbeat: what is the difference between the two, and how do I choose one?
aboutstudy's user avatar
17 votes
1 answer
1k views

Highly-available, Web-accessible and scalable deployment of statsd and graphite

I'd like to set up statsd/graphite so that I can log JS apps running on HTML devices (i.e. not in a contained LAN environment, and possibly with a large volume of incoming data that I don't directly ...
David142's user avatar
  • 353
16 votes
8 answers
2k views

When is the right time to introduce high availability for a web site?

When is the right time to introduce high availability for a web site? There are many articles on high availability options. It is not that obvious, however, WHEN is the right time to switch from single ...
Dennis Gorelik's user avatar
16 votes
3 answers
8k views

Multi-site high availability

We have a SaaS application that we need to be highly available. We already have an expensive, well-maintained Hyper-V failover cluster, but today the datacenter where we host that cluster had a five-...
Mike's user avatar
  • 1,293
15 votes
2 answers
8k views

Options for Multisite High Availability with Puppet

I maintain two datacenters, and as more of our important infrastructure starts to get controlled via puppet, it is important that the puppet master works at the second site should our primary site fail. ...
Kyle Brandt's user avatar
  • 84.6k
15 votes
5 answers
3k views

When my A web server gets unplugged, how do I automatically redirect all the users to my B web server in another city, and vice versa?

When my A web server gets unplugged, how do I automatically redirect all the users to my B web server in another city, and vice versa? A load-balancing switch does what I want, except I can't figure ...
David Cary's user avatar
15 votes
1 answer
6k views

Replicating beanstalkd for High Availability

Title says it all. Does anyone know of a way to replicate beanstalkd such that if a beanstalk server went down, other slaves could take over? Here's one approach I thought of: I could make ...
Josh Nankin's user avatar
14 votes
5 answers
41k views

How to set up STONITH in a 2-node active/passive Linux HA pacemaker cluster?

I am trying to set up an active/passive (2 nodes) Linux-HA cluster with corosync and pacemaker to keep a PostgreSQL database up and running. It works via DRBD and a service IP. If node1 fails, node2 ...
MMore's user avatar
  • 543
13 votes
8 answers
4k views

Load balancing Apache on a budget?

I am trying to get my head around the concept of load balancing to ensure availability and redundancy to keep users happy when things go wrong, rather than load balancing for the sake of offering ...
Industrial's user avatar
  • 1,599
13 votes
2 answers
29k views

Keepalived for more than 20 virtual addresses

I have set up keepalived on two Debian machines for high availability, but I've run into the maximum number of virtual IPs I can assign to my vrrp_instance. How would I go about configuring and ...
cvaldemar's user avatar
  • 1,136
12 votes
7 answers
13k views

Mathematically, how to calculate an uptime percentage based on a number of nodes and their respective uptime percentage?

This question is more of a math question than a server question, but it is strongly server related. If I have a server that I would be able to guarantee 95% uptime and I would put that server in a ...
Jeroen Landheer's user avatar
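
The excerpt above is cut off, so the asker's exact setup is unknown. As a minimal sketch of the usual arithmetic, assuming node failures are independent (an assumption that shared power, network, or software faults often break in practice), a redundant group is down only when every node is down at once. The function names below are illustrative and do not come from the question.

```python
from math import prod


def parallel_availability(uptimes):
    """Availability of a redundant group that is up while at least one node is up.

    Assumes node failures are statistically independent.
    """
    # The group is down only if every node is down at the same time.
    return 1.0 - prod(1.0 - u for u in uptimes)


def serial_availability(uptimes):
    """Availability of a chain in which every component must be up."""
    return prod(uptimes)


if __name__ == "__main__":
    # Two nodes at 95% each: the group is down 0.05 * 0.05 = 0.25% of the time.
    print(f"{parallel_availability([0.95, 0.95]):.4%}")
    # A load balancer at 99.9% in front of that pair lowers the total again.
    print(f"{serial_availability([0.999, parallel_availability([0.95, 0.95])]):.4%}")
```

Under these assumptions two 95% nodes give about 99.75%, and each component placed in series in front of the pair multiplies the result down again.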
12 votes
2 answers
24k views

What's the difference between keepalived and corosync, others?

I'm building a failover firewall for a server cluster and started looking at the various options. I'm more familiar with CARP on FreeBSD, but need to use Linux for this project. Searching Google has ...
hookenz's user avatar
  • 14.7k
12 votes
3 answers
2k views

The downsides of using nginx as a primary web server?

I've seen millions of websites using nginx as a proxifying webserver working together with Apache. But I've seen very few servers running nginx only as their default webserver. What are the main ...
Vladislav Rastrusny's user avatar
12 votes
4 answers
14k views

What exactly does Gluster do?

I've been playing with gluster for the last 2 days and been asking questions here and on their questions system. I really don't understand some of the stuff. I see people saying stuff like Set up ...
cbaltatescu's user avatar
12 votes
2 answers
16k views

How to do client side NFS failover in Linux?

I have a CentOS 6.3 client that needs to access NFS storage. There are two NFS servers that serve up the same content stored on a SAN with a clustered filesystem. How do I set up CentOS to failover ...
Doug's user avatar
  • 361
12 votes
2 answers
9k views

Avoiding SPOFS with GlusterFS and Windows

We have a GlusterFS cluster we use for our processing function. We want to get Windows integrated into it, but are having some trouble figuring out how to avoid the single-point-of-failure that is a ...
sysadmin1138's user avatar
  • 134k
12 votes
3 answers
2k views

How can I load-balance a load balancer?

I'm about to convert a single-server, single-database web application into a physically distributed, highly available configuration with servers in two physical locations (for now). Now, obviously, I need ...
Nikolai Prokoschenko's user avatar
12 votes
3 answers
2k views

RabbitMQ - How do I configure servers for zero-downtime upgrades?

Having read through the docs and RabbitMQ in Action, creating a RabbitMQ cluster seems straightforward enough, but upgrading or patching an existing RabbitMQ cluster seems to require the whole cluster ...
Terence Johnson's user avatar
12 votes
5 answers
5k views

what cluster management software to use for linux?

I have found the following cluster management software tools: pacemaker (clusterlabs.org): originally a Heartbeat project, focused on high availability, will be in the next Debian version; openqrm (...
yvess's user avatar
  • 413
11 votes
8 answers
8k views

High Availability DNS Hosting Strategy?

I'm trying to find a few options of ways to do high availability DNS hosting for a few existing websites. This morning, the company I work for was brought to its knees because the DNS hosting we have ...
Jack M.'s user avatar
  • 803
11 votes
1 answer
2k views

How to design/ensure high-availability of web servers?

I have been given a dedicated server by 1&1 internet, which has two hard drives in a RAID1 configuration. I expected this would be good enough as if one disk fails, the other can take over until ...
volume one's user avatar
11 votes
5 answers
679 views

High server availability for a small business

After having a bit of a scare with a server that wouldn't come up one morning, the higher-ups have decided that the business needs a high availability / failover setup. We have 5 main servers (4x ...
Matthew's user avatar
  • 175
11 votes
3 answers
27k views

Can not switch drbd to secondary

I'm running drbd83 with ocfs2 on CentOS 5 and planning to use pacemaker with them. After some time, I'm facing a drbd split-brain problem. version: 8.3.13 (api:88/proto:86-96) GIT-hash: ...
favadi's user avatar
  • 537
11 votes
7 answers
1k views

Looking for a recommendation on measuring a high availability app that is using a CDN

I work for a Fortune 500 company that struggles with accurately measuring performance and availability for high availability applications (i.e., apps that are up 99.5% with 5 seconds page to page ...
Tim Reddy's user avatar
  • 213
10 votes
3 answers
13k views

Is round-robin DNS a possible solution for high availability?

Let's say I have 2 IPs for a given domain (round-robin DNS). If one of the IPs becomes unresponsive, will clients try to connect to the other IP? Or will they just fail to establish communication with the ...
GetFree's user avatar
  • 1,552
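
Whether a particular client retries the next address is up to that client, and the excerpt above does not settle it. As a rough sketch of the behaviour being asked about, a client that wants failover across several A records can try each resolved address in turn. The helper names `resolve_ipv4` and `connect_first_reachable` are hypothetical, and the `connect` parameter exists only so the logic can be exercised without a network.

```python
import socket


def resolve_ipv4(host, port):
    """Return every (ip, port) pair that DNS reports for host over IPv4."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr).
    return [info[4] for info in infos]


def connect_first_reachable(addresses, timeout=3.0,
                            connect=socket.create_connection):
    """Try each address in order and return the first successful connection."""
    last_error = None
    for address in addresses:
        try:
            return connect(address, timeout)
        except OSError as exc:
            # Unreachable or timed out: remember why, then try the next record.
            last_error = exc
    raise ConnectionError(f"no address reachable: {last_error}")
```

A caller would combine the two, for example `connect_first_reachable(resolve_ipv4("example.com", 80))`. Note that each dead address costs up to `timeout` seconds before the next one is tried, which is the delay users notice when one record in the set is down.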
10 votes
5 answers
1k views

Setup for high availability virtualized environment

For a project I have the task of planning a high availability setup for a web shop and CMS system. However, of course the project is on a tight budget. So a high end solution might not be in the ...
spa's user avatar
  • 303
10 votes
1 answer
23k views

Keepalived send gratuitous ARP periodically

Is there a way for keepalived to send gratuitous ARP periodically? We had the following situation: a switch failure (VLAN setup); keepalived failed over to the backup instance; the backup instance sent gratuitous ...
user373333's user avatar
10 votes
3 answers
8k views

High Availability Cron Jobs

Information: We are currently in the process of creating a high availability cluster for NGINX (on CentOS 7) running PHP. Most of the configuration has been mapped and it should work nicely in a ...
ctwheels's user avatar
  • 211
9 votes
4 answers
8k views

Ganeti vs Proxmox [closed]

I'm a system administrator in a small software house. I'm going to virtualise our servers. The main reason for doing this is to provide the highest possible uptime, but it will probably also increase resources ...
Maciek Sawicki's user avatar
9 votes
11 answers
18k views

good failover / high availability solutions for linux? [closed]

I have several cases where I need applications to be migrated from one server to another in the event of a failure (server hang or crash). On solaris we do this with VCS (Veritas Cluster Server). ...
ericslaw's user avatar
  • 1,572
9 votes
9 answers
2k views

Questions about single point of failure for small operations

If you can't afford or don't need a cluster or spare server waiting to come online in the event of a failure, it seems like you might split the services provided by one beefy server onto two less ...
Boden's user avatar
  • 5,008
9 votes
1 answer
29k views

Keepalived's virtual_router_id - should it be unique per node?

I have two nodes running keepalived, and two VIPs, e.g. Node 1: VIP1, Node 2: VIP2. So on each node I have two vrrp_instance definitions, so I assume the two vrrp_instance in my keepalived....
Ryan's user avatar
  • 6,101
9 votes
3 answers
1k views

Global high availability setup question

I own and operate visualwebsiteoptimizer.com/. The app provides a code snippet which my customers insert in their websites to track certain metrics. Since the code snippet is external JavaScript (at ...
Paras Chopra's user avatar
9 votes
3 answers
5k views

SQL Server - Cluster vs Mirror for high availability?

I have been doing research into various high availability options for SQL Server 2005. With regard to high availability, what circumstances would favor clustering over mirroring as an option? From ...
Jeremy's user avatar
  • 661
9 votes
1 answer
941 views

Using ZFS head node as database server?

I'm using a dual-head ZFS-backed NAS for high availability cluster shared storage, based on Nexenta's recommended architecture as seen here: The disks in 1 JBOD will store the database files for a ...
elleciel's user avatar
  • 389
8 votes
3 answers
7k views

Shared storage options for ESXi HA cluster

I am seeking recommendations for shared storage options to support ESXi HA cluster (note I'm NOT asking for product/brand/model recommendation - I know this is against the rules here). I am asking ...
Arthur's user avatar
  • 223
8 votes
1 answer
10k views

How to lower Gluster FS down peer timeout / reduce down peer impact?

The setting: two fresh CentOS 6.5 servers with the latest updates. Both have a fresh install of Gluster 3.5.2. What I did (from the perspective of server 2; shared1 and shared2 are logical volumes): ...
Robert's user avatar
  • 105
8 votes
2 answers
4k views

DNS issue with Failover IP from Hetzner

Assume we have two servers A and B with 'real' and external IPs and we can switch the so called 'failover ip' (W.X.Y.Z) to point to a specific external IP of A or B. This works from the 'outside' and ...
Karussell's user avatar
  • 181
8 votes
1 answer
11k views

RDS snapshot: how long does I/O suspension occur?

As we're relying on RDS Postgresql manual backup for our backup strategy, we encountered the issue with the possible downtime of the RDS instance (single AZ) during snapshot creation. According to AWS:...
Arcobaleno's user avatar
8 votes
7 answers
11k views

How to perform cron jobs failover?

Using two Debian servers, I need to set up a robust failover environment for cron jobs that may only run on one server at a time. Moving a file in /etc/cron.d should do the trick, but is there ...
Falken's user avatar
  • 1,742
8 votes
1 answer
17k views

OpenVPN timeout to reconnect on fail takes a long time

I'm trying to create a high-availability environment for my OpenVPN servers. I do this by having two identical VPN servers and specifying multiple remote entries in my client config: # The hostname/IP and ...
Luke's user avatar
  • 3,856
8 votes
3 answers
7k views

Is there a way to force heartbeat to add new ip addresses to the system without a full restart?

We utilize heartbeat for High Availability. I'd like to add an additional ip address to the heartbeat cluster, but I don't want to do a full restart of the cluster in the process. Is there a signal ...
Peter Grace's user avatar
  • 3,486
8 votes
3 answers
14k views

Any problems with having an active/active HAProxy setup with Keepalived

Apologies if this has been asked before, but I can't seem to find much on it. We're going to be using HAProxy to load balance our MariaDB Galera Cluster. All the articles/tutorials I have seen on ...
Luke Cousins's user avatar
8 votes
2 answers
11k views

How to set up Traefik for HA? Need a reverse-proxy in front of Traefik?

I am trying to set up Traefik on a production site, and I'm struggling with some high availability issues. I think we still need a reverse-proxy in front of the Traefik cluster. Here are the potential ...
Mark Grimes's user avatar
