SQL DBA Internals: HA

Tuesday, June 5, 2018

WSFC Lost Quorum votes due to network problem

Problem:

Early in the morning I received emails requesting support because the applications were failing to connect to one of the production servers configured with Always On. I asked for the specific error, and the customer sent me this:

---------------------------

Error while updating the backup server list: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)

---------------------------

After checking with one of the IT managers, he confirmed that one of the database servers was going through some networking issues. However, that didn't explain why SQL Server didn't automatically fail over to the other node, which was up and running. Both servers are part of the same WSFC, as required for synchronous-commit availability groups with automatic failover.

So after getting into the available replica, the troubleshooting began.

The first thing I noticed was:

After checking the availability group dashboard, which confirmed the error state, I went to Failover Cluster Manager and noticed the following errors:

I also noticed that the node with the network issues showed its failover cluster state as stopped. Obviously, if the server could not be reached over the network, the cluster couldn't show it as healthy or running. Because of that, and because there was no third vote from a witness, the cluster lost quorum and its entire state was failed, which took the availability group and listener down as well. So what was needed? Getting the cluster up and running even with one node missing. How do you do that? You start the cluster without the quorum validation. With that, the WSFC is running again, and in turn the availability group and listener can come back up as well.

But wait, I'm a DBA, which means my experience managing clusters at the Windows level is minimal. If that is your case too, you have two options: call IT to get the assistance you need, or do it yourself. In my case it was the second. If you choose the same, the required steps are at the following URL:
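
As a rough aid for this kind of diagnosis, the quorum situation can also be inspected from inside SQL Server through the cluster DMVs. This is only a sketch, run on the replica you can still reach. As far as I understand the documentation, `sys.dm_hadr_cluster` returns no rows when the local node has no quorum, so an empty result is itself a hint.

```sql
-- Cluster-level view: quorum model and current quorum state.
-- An empty result here suggests the local node currently has no quorum.
SELECT cluster_name,
       quorum_type_desc,
       quorum_state_desc
FROM sys.dm_hadr_cluster;

-- Member-level view: every node or witness, its state, and how many votes it holds.
-- With two nodes and no witness there are only two votes, so losing one node loses quorum.
SELECT member_name,
       member_type_desc,
       member_state_desc,
       number_of_quorum_votes
FROM sys.dm_hadr_cluster_members;
```

Checking the vote count on a healthy cluster is a cheap way to spot a missing witness before an outage does it for you.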
https://docs.microsoft.com/en-us/sql/sql-server/failover-clusters/windows/force-a-wsfc-cluster-to-start-without-a-quorum?view=sql-server-2017

In my particular case the steps with the Windows interface didn't work, but the PowerShell ones did.
This got my cluster running. However, my availability group still had the failed server as primary, so I still had to fail the group over to the working replica. I achieved that by forcing the failover with this command:

ALTER AVAILABILITY GROUP <DAG NAME> FORCE_FAILOVER_ALLOW_DATA_LOSS

I know, it is scary when you see "Allow data loss". However, I had synchronous commit set, so as long as the secondary was synchronized when the primary went offline, the worst-case scenario was losing the transactions that weren't committed at that moment. Those were never confirmed as written in the primary database, so it was safe to roll them back if needed.
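
Before forcing a failover, it may be worth checking how far the surviving replica actually got. The query below is a minimal sketch, run on the replica you intend to fail over to. It relies on the standard Always On DMVs, and reading `is_failover_ready = 1` as "no data loss expected for this database" is my interpretation of the documentation, so treat it as a guide rather than a guarantee.

```sql
-- Run on the secondary you plan to make primary.
-- SYNCHRONIZED plus is_failover_ready = 1 suggests the database had
-- hardened everything the primary committed before it went away.
SELECT ag.name                        AS availability_group,
       drcs.database_name,
       drs.synchronization_state_desc,
       drcs.is_failover_ready
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.dm_hadr_database_replica_cluster_states AS drcs
  ON drcs.replica_id = drs.replica_id
 AND drcs.group_database_id = drs.group_database_id
JOIN sys.availability_groups AS ag
  ON ag.group_id = drs.group_id
WHERE drs.is_local = 1;
```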

That brought my databases online and the applications could resume working.

After fixing all the network issues and confirming that you are able to connect to your secondary replica (the former primary), execute this command there to resume the synchronization. Make sure you run it for each of your databases.
ALTER DATABASE database_name SET HADR RESUME
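
If the availability group holds many databases, typing the statement for each one gets tedious. A small sketch that generates the commands for every suspended database on the local replica is shown below. It only prints the statements, so you can review them before running them.

```sql
-- Build one RESUME statement per locally suspended availability database.
-- Nothing is executed here; copy the output into a new query window.
SELECT N'ALTER DATABASE ' + QUOTENAME(DB_NAME(drs.database_id))
       + N' SET HADR RESUME;' AS resume_command
FROM sys.dm_hadr_database_replica_states AS drs
WHERE drs.is_local = 1
  AND drs.is_suspended = 1;
```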

And bam, you are all set: your cluster and availability group are back to a healthy state, as they were before the failure.

As a lesson learned, add a file share witness on a file server so that you don't lose your quorum when one node drops, and prevent this issue from happening again.

Thanks for reading!

Monday, December 4, 2017

Configure Always on AG with SQL 2017

One of the new features in SQL 2017 is that you no longer need a Windows cluster to enable the Always On feature in SQL Server (remember, we discussed the requirements to configure that in an earlier post; those are valid for versions 2012 to 2016). If you want to review what's new in SQL 2017, check the official documentation.

In this post I'll show you how to configure an Always On availability group with SQL Server 2017. Most of the steps are the same for the older versions, and I'll point out the differences at each step, so let's get started.
First, I'll assume you already have SQL Server installed on two servers as stand-alone instances; that is a requirement for this post.

1. Enabling AlwaysOn: Go to SQL Server Configuration Manager and, under SQL Server Services, open the properties of the MSSQLSERVER service.

2. In the properties window, look for the "AlwaysOn High Availability" tab. If you are running an older version, the "Enable AlwaysOn Availability Groups" checkbox won't be available until the machine where SQL Server is installed is part of a Windows cluster. Only the machine joins the cluster; the SQL Server instance remains a stand-alone instance. Once the checkbox is available, check it and click Apply and OK. Enabling this feature requires a service restart, so take that into account. Also, you need to do this on all the SQL instances that will be part of the availability group.
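
After the service restart, a quick way to confirm that the setting took effect on each instance is to query the server properties. This is a small sketch using two documented `SERVERPROPERTY` names.

```sql
-- IsHadrEnabled: 1 means Always On availability groups is enabled on this instance.
-- HadrManagerStatus: 1 means the Always On manager has started and is running.
SELECT SERVERPROPERTY('IsHadrEnabled')     AS is_hadr_enabled,
       SERVERPROPERTY('HadrManagerStatus') AS hadr_manager_status;
```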

3. Go to SQL Server Management Studio and connect to any of the SQL instances that will be part of the group. Look for the "Always On High Availability" folder and expand it. You will see a folder named "Availability Groups"; right-click it and select "New Availability Group Wizard…"

4. In the availability group wizard, configure the name you want for your availability group.

a. Cluster Type (new with SQL 2017): the wizard gave me 2 options here, External or None (the official documentation shows 3; the third one is WSFC). External means the availability group is managed by a cluster manager other than a Windows cluster, e.g. when you want to add it to a Linux cluster.
b. Database level health detection (new with SQL 2017): this enables constant validation of the databases that are part of the availability group, so that if anything goes wrong with any of the databases it triggers an automatic failover.
c. Per Database DTC support (new with 2016 SP1): allows Distributed Transaction Coordinator transactions through the availability group, a feature that was not available in older versions of SQL (2012 and 2014).

If you don't have a cluster created, as in this scenario, choose NONE.

5. Select the databases that will be part of the availability group. Prerequisites for a database to be eligible:
a. Be in the Full recovery model
b. Have a full backup
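
Both prerequisites can be met from a query window. The sketch below uses a hypothetical database name (`SalesDb`) and a hypothetical backup path; adjust both to your environment.

```sql
-- Check the current recovery model (SalesDb is a placeholder name).
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = N'SalesDb';

-- a. Switch to the Full recovery model if needed.
ALTER DATABASE [SalesDb] SET RECOVERY FULL;

-- b. Take a full backup (the path is an assumption; use a folder that exists).
BACKUP DATABASE [SalesDb]
TO DISK = N'D:\Backup\SalesDb_full.bak'
WITH INIT, CHECKSUM;
```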

6. Here you choose the replicas that you want to add to your group. Important settings:

Initial Role: The role that each replica will have once the group becomes available.
Failover Mode: Manual or Automatic.
Availability Mode: Synchronous commit or Asynchronous commit.
Readable Secondary: No, Yes - Read Intent Only, and Yes. With No, you won't be able to query your replicas. With Yes - Read Intent Only, the replica accepts only connections that declare read intent (ApplicationIntent=ReadOnly), which is what the feature that routes read operations to your available replicas relies on. With Yes, any connection can query the databases in your replica for read access.
Endpoints tab: Shows the URL and port each replica is set to.
Backup Preferences tab: Here you can configure whether you want to take the backups from the primary or the secondary servers.
Listener tab: This lets you set the parameters for the availability group listener. However, my advice is to configure your group first and set up the listener afterwards.
Read-Only Routing tab: This lets you configure read-only routing for the read-intent setup. It allows you to load balance the queries so that only the queries that Insert, Delete or Update run on the primary and the Select queries are routed to your secondaries, so they don't consume resources the primary will use (I'll explain it in another post).

7. Select the initial data synchronization option:

Automatic seeding (starting with SQL 2016): With this type, SQL Server does everything for you. It uses the default folders configured, so make sure you have enough space available in them.
Full database and log backup: Same as automatic seeding, but in this case it lets you choose where you want to generate the backups.
Join Only: With this one, you prepare everything on your replica yourself. Important: you do a restore WITH NORECOVERY on your replica and apply at least one log backup. This option is useful with really large databases.
Skip initial data synchronization: Same as before, but in this case you need to restore everything right before initializing the synchronization; otherwise it will tell you that there are items pending to be restored.

8. Verify that the validation runs successfully. The warning shown is because I didn't set up the listener in step 6, but that is fine. Every time I've tried to configure it from the wizard it has failed, so my advice is to configure it later (covered further down in this post).

9. After hitting Finish, and if none of the steps have failed, you will see a results screen confirming that every step succeeded.
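
For reference, the wizard choices from steps 4 to 7 correspond roughly to the T-SQL below. This is a sketch, not the script the wizard generates: the group, server, database and endpoint names are all hypothetical, and it assumes the database mirroring endpoints already exist on port 5022 on both instances. With CLUSTER_TYPE = NONE the failover mode has to be MANUAL.

```sql
-- Run on the primary instance (placeholder name SQLNODE1).
CREATE AVAILABILITY GROUP [AG_Demo]
    WITH (CLUSTER_TYPE = NONE)
    FOR DATABASE [SalesDb]
    REPLICA ON
        N'SQLNODE1' WITH (
            ENDPOINT_URL      = N'TCP://sqlnode1.example.local:5022',
            AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
            FAILOVER_MODE     = MANUAL,
            SEEDING_MODE      = AUTOMATIC,
            SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL)
        ),
        N'SQLNODE2' WITH (
            ENDPOINT_URL      = N'TCP://sqlnode2.example.local:5022',
            AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
            FAILOVER_MODE     = MANUAL,
            SEEDING_MODE      = AUTOMATIC,
            SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL)
        );

-- Run on the secondary instance (placeholder name SQLNODE2):
-- join the group and let automatic seeding create the database.
ALTER AVAILABILITY GROUP [AG_Demo] JOIN WITH (CLUSTER_TYPE = NONE);
ALTER AVAILABILITY GROUP [AG_Demo] GRANT CREATE ANY DATABASE;
```

The two sections need separate connections, one per instance, so run them as separate batches.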

Setup of the listener

The listener will be the single point of entry for your availability group. No matter which server is primary and which is secondary, you only need to configure your applications to use the listener FQDN or IP, and that gives you the high availability you are looking for with Always On.
1. Go to the "Always On High Availability" folder, expand your availability group, look for "Availability Group Listeners" and right-click it.
2. In the configuration screen, enter the domain name you want your listener to respond to. It will behave like another computer in the domain, so be sure that you have permissions to create objects in Active Directory. If you don't, ask your domain admin to provision that name and grant you permissions over it so you can enable it.
You are all set. You can start pointing your applications to the listener and taking advantage of what Always On provides.
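
The same listener can be created with T-SQL instead of the dialog. The sketch below uses a hypothetical group name, listener name, IP address and subnet mask, and it assumes a group whose cluster can host the listener's network name, which in my understanding means a WSFC-backed availability group.

```sql
-- Run on the primary replica. All names and addresses are placeholders.
ALTER AVAILABILITY GROUP [AG_Demo]
    ADD LISTENER N'AGDemoListener' (
        WITH IP ((N'10.0.0.50', N'255.255.255.0')),
        PORT = 1433
    );
```

Applications then connect to `AGDemoListener` (or its IP) instead of a specific server name.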

Wednesday, November 8, 2017

Contained Databases setup

Last week I wrote about what a contained database is, along with its benefits and challenges. Today, as promised, I'll show you how to set one up, how to configure a user and how to connect to it, so let's get started.

Step 1

After logging in to your SQL Instance, go to the instance properties.

Step 2

In the properties page, select Advanced and in the first section choose "Enable Contained Databases = True". This enables the feature at the instance level so that you can set up your databases with this type.
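
The same instance-level setting can be changed with `sp_configure`, which is handy when you need to script it across servers. A minimal sketch:

```sql
-- Enable contained database authentication at the instance level.
EXEC sp_configure 'contained database authentication', 1;
RECONFIGURE;

-- Confirm the running value (1 = enabled).
SELECT name, value_in_use
FROM sys.configurations
WHERE name = N'contained database authentication';
```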

Step 3

Go to the database that you want to convert to a contained database and click Properties, then go to Options and set Containment Type to Partial.

With the above steps you now have your first contained database. However, at this point nothing looks different: if you connect to the database with the accounts or users that are already granted at the instance level, you will see no change. To see the difference you need to configure a user or account at the database level, as follows.
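
In T-SQL, this step is a single `ALTER DATABASE`. The database name below is a placeholder.

```sql
USE master;

-- Convert the database to partial containment (SalesDb is a placeholder name).
ALTER DATABASE [SalesDb] SET CONTAINMENT = PARTIAL;

-- Confirm: containment_desc should now read PARTIAL.
SELECT name, containment_desc
FROM sys.databases
WHERE name = N'SalesDb';
```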

Step 4

Go to the Security folder inside the contained database, go to the Users folder and choose "Create new User". It will show a window where you create your user. Remember to configure the privileges that you will allow it to have, and always remember that the best practice is to follow the principle of least privilege.

OK, so now you have your user created. Let's try to connect.
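
A contained user with a password can also be created in a query window. This is a sketch with a hypothetical database, user name and password; the read-only role is just one example of a least-privilege grant.

```sql
USE [SalesDb];

-- A contained SQL user: it has its own password and no server-level login.
CREATE USER [app_reader] WITH PASSWORD = N'Replace-With-A-Strong-Passw0rd!';

-- Least privilege: read-only access in this example.
ALTER ROLE db_datareader ADD MEMBER [app_reader];
```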

Step 5.1 connecting to SQL Server as usual

Wait, what happened? We created the user, and now it is not able to connect?

Step 5.2 To take advantage of the containment, you need to connect directly to the database. Click the "Options" button at the bottom right corner of the logon window, go to the Connection Properties tab and type the database name in the "Connect to Database" section. It offers a lookup option, but since you are not logged in yet it will tell you that you need to log in first, and you end up in a loop where you can't authenticate. So write down the database name beforehand and type it in here.

Step 6 Welcome, you are now connected directly to your database. Can you see the difference? With an account that is granted at the server level you can see all the databases in the SQL instance. When you are connected to your contained database, it shows you only the database you have permissions on, which isolates the environment.

Tests:

Now let's do some tests. Imagine that you are sharing the account with someone from the production control team, and that person has a little knowledge of SQL and tries to discover whether there are more databases in the instance. The query will only return 3: master, tempdb and the contained database the user has permissions on. Why these 3? Well, master has the metadata that it needs to work with, and if you create temp objects you rely on tempdb, but that's it.

For the second test, I created another contained database and tried to run queries against it. However, it doesn't let me. Why? It was not mentioned in last week's post, but another limitation is that you can't run queries across different databases unless the guest account is enabled, which is not the case here and is strongly discouraged.

Summary: Contained databases is a feature that provides isolation between environments if you are running databases that require this kind of separation. By using this feature you also don't have to worry about migrating users, roles, etc. when you move your database from one server to another, whether you are migrating, upgrading or recovering your server. This portability also eases the configuration and security administration when you have Always On availability groups configured for your HA/DR environments.
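
The discovery attempt in the first test is most likely a query against `sys.databases`. A sketch of what that person might run while connected as the contained user:

```sql
-- Which databases can this connection see?
SELECT name, database_id
FROM sys.databases
ORDER BY database_id;

-- How is the current user authenticated? A contained user with a
-- password should report DATABASE as its authentication type.
SELECT name, type_desc, authentication_type_desc
FROM sys.database_principals
WHERE name = USER_NAME();
```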

Wednesday, November 1, 2017

Contained Databases

Today's post is about a feature that was released with SQL Server 2012: contained databases. It hasn't gotten the attention or usage it deserves, even though it helps solve a lot of the problems you face when upgrading or migrating database servers.

What is a Contained database?

A contained database is a database that is isolated from other databases and from the instance of SQL Server that hosts the database. A contained database basically includes all database settings and the metadata within itself thereby resulting in no configuration dependencies on the instance of the SQL Server Database Engine where the database is actually installed. Users will be able to connect to a contained database without authenticating a login at the Database Engine level. This feature really helps to isolate the database from the Database Engine thereby making it possible to easily move the database from one instance of SQL Server to another, this is incredibly useful when you have HA environments. The Contained databases feature is available at the instance level and is not enabled by default.

What types are available?

The contained database feature provides two containment modes:

None – By default each database has its mode set as NONE. This means there is no contained database feature being used.

Partial – With partially contained databases, we can define boundaries between databases and the server, so the metadata will exist inside the databases. It makes SQL Server databases more portable and less dependent on underlying hosts.

Advantages of contained databases :-

1. User authentication can be done at database level, so you only need to be sure to grant the users permissions in your database.

2. They have less dependency on the instance than conventional databases. The objects and features of each database can be managed within the database itself, which reduces the workload on the system databases and the SQL instance.

3. Easier & Faster to migrate databases from one server to another. Errors related to missing users and orphan users are no longer an issue.

4. Contained database users can be Windows and SQL Server authentication users.

5. Contained database user can access only contained database objects. They cannot access system databases and cannot access server objects.

6. This is the preferred mode to be used with HADR (Always On)

7. Maintaining database settings in the database, instead of in the master database, increases security and flexibility. Each database owner has more control over their database without being given sysadmin permissions.

8. Collation issues are reduced in contained databases. A new feature, catalog collation, was introduced with contained databases. The database collation now applies to user objects and the catalog collation applies to system objects in the database. The catalog collation is the same for all contained databases on all SQL instances, and it cannot be changed.
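
To benefit from the migration advantage in point 3, existing users that are mapped to SQL Server logins have to be converted into contained users. The sketch below lists the candidates and converts one of them with the system procedure `sp_migrate_user_to_contained`. The database and user names are hypothetical, and the `principal_id > 4` filter is my shortcut for skipping the built-in principals.

```sql
USE [SalesDb];

-- SQL users still authenticated through an instance-level login.
SELECT name, type_desc, authentication_type_desc
FROM sys.database_principals
WHERE type = 'S'                 -- SQL user
  AND authentication_type = 1    -- INSTANCE
  AND principal_id > 4;          -- skip dbo, guest, INFORMATION_SCHEMA, sys

-- Convert one user (app_user is a placeholder) into a contained user
-- with password, keeping its name and leaving the server login enabled.
EXEC sp_migrate_user_to_contained
     @username      = N'app_user',
     @rename        = N'keep_name',
     @disable_login = N'do_not_disable_login';
```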

Disadvantages and Limitations of contained databases :-

1. A database owner has more control over a contained database. The owner can create contained database users without the permission of a DBA, which can lead to security issues and the threat of data theft.

2. Contained databases cannot use replication, change data capture, change tracking, numbered procedures, schema-bound objects that depend on built-in functions with collation changes

3. Before changing containment settings at database level from NONE to PARTIAL , contained databases feature needs to be enabled at instance level, so make sure to enable it before releasing your server and database to production.

4. To connect to a contained database, you need to specify the database name in the default database option tab.

5. Temporary stored procedures are currently permitted, but they may be removed from future versions of contained databases.

6. Contained database user can access other databases on the Database Engine, if the other databases have enabled the guest account, which can be a security issue.
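
Before switching a database to PARTIAL, it can help to see which objects or features cross the containment boundary. A minimal sketch using the `sys.dm_db_uncontained_entities` view, with a placeholder database name:

```sql
USE [SalesDb];

-- Entities that reference something outside the database boundary.
-- OBJECT_NAME is only meaningful for rows whose class_desc refers to
-- an object or column; for other classes it may return NULL.
SELECT ue.class_desc,
       OBJECT_NAME(ue.major_id) AS object_name,
       ue.statement_type,
       ue.feature_name,
       ue.feature_type_name
FROM sys.dm_db_uncontained_entities AS ue
ORDER BY ue.class_desc, object_name;
```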

Changes across versions:

This feature hasn't changed since it was released with SQL 2012, mostly because the new versions have focused on improving performance and security and on integrating Artificial Intelligence and Machine Learning features. Personally, I think that if you have environments where you need high availability, and every time you fail over your groups you need to resync your users and principals, this might work for you. However, you will need to remove your database from your availability group if you want to enable this on it. We will work through that in next week's post.

Thanks for reading!

Resources:

http://www.sqlservercentral.com/blogs/mssqlfun/2014/06/23/sql-server-2012-contained-database/

https://www.sqlshack.com/contained-databases-in-sql-server/

https://docs.microsoft.com/en-us/sql/relational-databases/databases/contained-databases

Friday, October 6, 2017

Introduction to SQL Server Always On

Today I'll write about one of the features I like the most: Always On. There will be two posts on this subject. This first one is an overview of the changes it has gone through, its benefits and the scenarios where you can implement it; the second will detail the steps to implement it.

So I'll try to answer these questions: What is Always On? When was it released? And how can it be used?

What is Always On?

Always On or Always On availability groups (Microsoft) is an integrated high availability (HA) and disaster recovery (DR) solution that provides redundancy within a datacenter and across datacenters to help enable fast application failover during planned and unplanned downtime. Specifically, Always On delivers failover cluster instances, multiple secondary, active secondary, and connection director capabilities (multi-subnet and read-only intent).

In plain English, this means Always On provides a high availability and disaster recovery solution for your database environment. It is the next evolution of database mirroring. Database mirroring is relatively easy to set up, and it provides an almost instant failover when it’s used in High Safety mode. However, database mirroring has several important limitations. First, database mirroring is limited to a single failover partner. In addition, database mirroring is capable of failing over only a single database at a time. Database mirroring must be implemented either synchronously or asynchronously; you have to choose one or the other. Finally, the databases on the mirroring server are in a state of recovery, so they can’t be directly accessed. To use the data, you must take point-in-time snapshots of the databases.

This feature was introduced with SQL Server 2012 and has gone through a lot of changes with the different releases of SQL Server.

These are some of the enhancements:

Secondary replicas: With 2012 you could only have 4; with 2014 it was increased to 8.
In 2012 you were not allowed to create a replica in Azure; with 2014 that option was introduced.
In 2012 and 2014 it was limited to the Enterprise edition; in 2016 SP1 you can use "Basic Availability Groups" with the Standard edition.
In the previous versions you were limited to servers that belong to the same domain; with 2016 you can combine groups from different domains.
In 2016, databases with Transparent Data Encryption (TDE) are now supported.
If you were using Group Managed Service Accounts (gMSA) you were not able to configure the availability groups; this is now supported with 2016.
In 2016 Direct Seeding was introduced. Direct seeding takes care of all the initialization steps for you: it does a full backup and synchronizes the database in the background.
In 2014 read intent was introduced, meaning that your read operations can be routed to your replicas so that the primary server is left to run I/U/D (Insert, Update, Delete). In 2016 the engine automatically load balances this across your replicas.

How can it be used?

Always On needs some components to run:

Two sets of IPs: You will need one IP for the Windows failover cluster and one for your listener.
Windows failover cluster: You need to configure (create) a Windows failover cluster (WSFC) so that you can enable the Always On functionality in the SQL Server service. Why? Because the failover cluster validates whether the servers that will be part of the availability groups are running without problems.
Availability groups: After you create your cluster and install SQL Server as stand-alone instances, you have to configure an availability group and add both SQL Server instances. This is where Always On starts to come to life.
Listener: This will be your single point of entry to your availability group (AOG), either by the domain name or by the IP you assign to it in the setup section. It will always route you to your primary server, unless you configure read-intent routing, in which case it routes your read operations to your replicas.
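
Once these components are in place, the catalog views give a quick inventory of them. The sketch below lists each replica with its role and modes, then the listener for each group. It uses only documented Always On catalog views and DMVs and makes no assumption about names.

```sql
-- Replicas per availability group, with current role and configured modes.
SELECT ag.name                 AS availability_group,
       ar.replica_server_name,
       ars.role_desc,
       ar.availability_mode_desc,
       ar.failover_mode_desc,
       ars.synchronization_health_desc
FROM sys.availability_groups AS ag
JOIN sys.availability_replicas AS ar
  ON ar.group_id = ag.group_id
LEFT JOIN sys.dm_hadr_availability_replica_states AS ars
  ON ars.replica_id = ar.replica_id
ORDER BY ag.name, ar.replica_server_name;

-- Listener name and port for each availability group.
SELECT ag.name AS availability_group,
       agl.dns_name,
       agl.port
FROM sys.availability_group_listeners AS agl
JOIN sys.availability_groups AS ag
  ON ag.group_id = agl.group_id;
```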

Think of an Always On environment as a "share almost nothing" environment. Why? You are only sharing the failover clustering part; other than that, you have independent hardware for each of your SQL Servers. Also, when you set up your environment this way, it reduces the changes (or steps) that you need to take when doing a planned DR, because everything is already set up (at least the database part) and you only need to check that the synchronization is up to date.
